What are the responsibilities and job description for the Senior Platform Engineer position at TuSimple, Inc.?
This position is available as a remote opportunity.
You can apply to the job with this email address:
swe___build_and_release__software_engineer__general_2e01b65f2us@ivy.greenhouse.io
Company Overview
Come join a higher calling and find a deeper purpose!
As a multi-national Artificial Intelligence Technology Company, we are at the epicenter of the Autonomous Vehicle Universe. Our breakthroughs are leading the industry in autonomous trucking.
While inventing the framework of Autonomous Driving, our current fleet of autonomous trucks is helping communities receive much-needed supplies and medical equipment around the clock. Our people are some of the most talented engineers and contributors who are leaving behind a historic legacy.
TuSimple was founded half a decade ago with the goal of bringing the top minds in the world together to achieve the dream of a driverless truck solution. With a foundation in computer vision, algorithms, mapping, and Artificial Intelligence, TuSimple is working to create the first global commercially viable Autonomous Freight Network.
Opportunity Overview
Embedded within the TuSimple Service Infrastructure group, the Senior Platform Engineer is a specialized and adept engineer who designs, builds, and operates/maintains infrastructure services, tools, and libraries using cutting-edge technology. They are oriented towards eliminating risks and building reliable, scalable, and performant systems.
Others depend on the Senior Platform Engineer II to help accelerate the development cycle of machine learning products. As such, the incumbent uses their deep and broad technical experience to standardize deployments, ensure the auditability of infrastructure, automate various deployment processes, write documentation, and facilitate training events.
From diving into technical details for critical decisions, to holding workshops to impart knowledge and drive adoption, the Senior Platform Engineer II has exceptional communication skills and understands how their work impacts others, including end users. They employ a global mindset and excel at collaborating with global teams to resolve obstacles. In addition, they build strong relationships within the Platform Engineering team and beyond, by demonstrating appreciation and regard for others’ ideas and work product, and by helping the team consistently deliver results that meet and/or exceed expectations.
What You'll Do
- Participates in designing, building, and operating TuSimple’s foundational software services and platforms.
- Skillfully implements new features and evolves existing infrastructure.
- Imparts knowledge and drives adoption via user guides, Application Programming Interface (API) references, and workshops.
- Develops both cloud platform (AWS and/or alternatives) and on-premises solutions.
- Supports and maintains core Artificial Intelligence (AI) infrastructure at TuSimple.
- Designs and implements tooling and automation for clustering, scaling, monitoring, and alerting.
- Standardizes Kubernetes deployments and ensures that infrastructure is auditable.
- Assists with recruiting and training initiatives; helps select candidates with strong skills and great potential, mentors junior engineers, and grows the teams’ technical capabilities and capacity.
- Ensures infrastructure security compliance; implements security, permissions, and authentication.
- Leads containerization and deployment of microservices on Kubernetes.
- Establishes and monitors application access and connectivity.
- Autoscales and monitors the performance of applications running on Kubernetes, using Prometheus and Grafana or similar tools.
- Performs Site Reliability Engineering (SRE) activities such as availability and reliability monitoring and reporting.
- Sets up infrastructure as a service (IaaS) using Terraform.
- Establishes and contributes to the continuous integration/continuous delivery (CI/CD) processes.
- Establishes and operates code repositories with GitHub Enterprise.
- Builds and maintains strong relationships across the organization.
What You'll Bring
- Advanced capability to design and implement reliable, scalable, and performant distributed systems and data pipelines.
- Proficiency in modern container technologies including Docker and Kubernetes.
- Proficiency in infrastructure as code (IaC) tools like Terraform, Vagrant, Chef, Puppet, or Amazon Web Services (AWS) CloudFormation.
- Proficiency in Go (Golang) and Python.
- Ability to resolve ambiguity and collect feature requirements and feedback from users.
- Working knowledge of, and experience with, large scale distributed software and systems.
- Ability to actively collaborate with global teams and resolve obstacles by evaluating all possible solutions and using informed judgment to select the best path forward for the project.
- Demonstrated experience with security and access management.
- Ability to clearly and succinctly communicate verbally and in writing, in both technical and non-technical English.
- Ability to identify, troubleshoot, and resolve issues quickly and effectively.
- Strong attention to detail and documentation skills.
- Experience designing or maintaining a machine learning platform is considered an asset.
- Knowledge of, or experience with, Agile or Scrum project management environments/methodologies is considered an asset.
- Experience with supporting and maintaining networking infrastructure is considered an asset.
- Experience with system administration on AWS is considered an asset.
- Experience with large-scale backend systems and infrastructure is considered an asset.
- Experience with high availability and fault-tolerant systems is considered an asset.
- Previous experience in any of the following areas is considered an asset: handling infrastructure-level outages, writing blameless postmortems, and GPU/CPU scheduling.
- Global mindset; excels at collaborating with global teams and resolving obstacles by evaluating all possible solutions and selecting the best path forward for the project.
- Oriented towards eliminating risks and building reliable, scalable and performant systems.
- Strong interpersonal skills, underpinned by genuine and transparent communication, as well as appreciation and regard for others’ ideas and work product.
- High sense of urgency; self-starter, highly responsive, and able to work and deliver in a fast-paced stream-alignment environment.
- Driven to learn. Committed to keeping current with best practices and emerging industry trends in a quickly evolving sector.
- Intellectually curious with a strong bias to action.
- Driven to understand and collaborate with multiple stakeholders; able to understand and interpret stakeholder needs and translate them into clear objectives.
- Strong analytical, judgment, persuasion, and consensus-building abilities where there are competing interests.
- Strong oral and written communication skills. Capable of listening and obtaining clarification, changing approach or method to best fit the situation. Able to effectively partner with cross-functional teams to coordinate activities and accomplish goals.
- Demonstrated experience building and improving processes and promoting quality.
- Ability to work independently with limited required direction and guidance.
- Confident in making technical recommendations to senior management.
- Strong organizational skills, ability to coordinate multiple tasks and support projects of varying complexity in parallel within tight deadlines.
- Proven ability to work independently in a matrix organization; tech start-up experience preferred.
- Ability to maintain resilience throughout aggressive deadlines, changing priorities, and evolving operations, as is common in progressive start-up environments.
Perks
- Visa sponsorship is available for this position
- Opportunity for professional growth and career advancement
- Competitive salary and benefits
- Up to a 30% discretionary bonus
- Daily breakfast, lunch, and dinner
- Shape the landscape of autonomous driving
- 100% Company-paid Medical, Vision, and Dental insurance plans
- Company 401(K) program
- Company-paid life insurance
- Company-paid education/training
- Company-paid gym membership
TuSimple is an Equal Opportunity Employer. This company does not discriminate in employment and personnel practices on the basis of race, sex, age, handicap, religion, national origin or any other basis prohibited by applicable law. Hiring, transferring and promotion practices are performed without regard to the above listed items.