Senior Data Engineer
Company: Toyota Research Institute
Location: Los Altos
Posted on: April 2, 2026
Job Description:
At Toyota Research Institute (TRI), we’re on a mission to
improve the quality of human life. We’re developing new tools and
capabilities to amplify the human experience. To lead this
transformative shift in mobility, we’ve built a world-class team
advancing the state of the art in AI, robotics, driving, and
material sciences.

The Automated Driving Advanced Development
division at TRI will focus on enabling innovation and
transformation at Toyota by building a bridge between TRI research
and Toyota products, services, and needs. We achieve this through
partnership, collaboration, and shared commitment. This new
division is leading a new cross-organizational project between TRI
and Woven by Toyota to conduct research and develop a fully
end-to-end learned driving stack. This cross-org collaborative
project aligns with the efforts of TRI’s robotics divisions in
Diffusion Policy and Large Behavior Models.

We are looking for a
Senior Data Engineer to design and build the foundational data
infrastructure and tools that power our autonomy research and
development workflows. This includes large-scale ingestion
pipelines, structured feature stores, labeling infrastructure,
scene search and data discovery tools, and performance diagnostics
for machine learning and simulation workflows.

Responsibilities:
- Design and implement scalable, production-grade pipelines for data
  ingestion, transformation, storage, and retrieval from vehicle
  fleets and simulation environments.
- Build internal tools and services for data labeling, curation,
  indexing, and cataloging across large and diverse datasets.
- Collaborate with ML researchers, autonomy engineers, and data
  scientists to design schemas and APIs that power model training,
  evaluation, and debugging.
- Develop and maintain feature stores, metadata systems, and
  versioning infrastructure for structured and unstructured data.
- Support the generation and integration of synthetic datasets with
  real-world logs to enable hybrid training and simulation workflows.
- Optimize pipelines for cost, latency, and traceability, ensuring
  reproducibility and consistency across environments.
- Partner with simulation and cloud platform teams to automate
  workflows for closed-loop testing, scenario mining, and performance
  analytics.

Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Data
  Engineering, or a related field.
- 8 years of experience building data-intensive software systems,
  ideally in robotics, autonomous driving, or large-scale ML
  environments.
- Proficiency in Python and SQL, and familiarity with C++.
- Experience designing ETL pipelines using modern frameworks (e.g.,
  Apache Spark, Flyte, Union).
- Strong knowledge of cloud-native architectures, including AWS
  services (e.g., S3) or equivalents (e.g., Google Cloud Platform).
- Familiarity with sensor data types (camera, lidar, radar, GPS/IMU)
  and common data serialization formats (e.g., protobuf, ROS2 bag,
  MCAP).
- Deep understanding of data quality, observability, and lineage in
  high-volume systems.
- Track record of building reliable and performant infrastructure
  that supports both ad-hoc exploration and repeatable production
  workflows.

Bonus Qualifications:
- Experience in AD/ADAS, robotics, or autonomous systems, especially
  handling perception or planning datasets.
- Familiarity with ML pipeline orchestration frameworks (e.g.,
  Kubeflow, SageMaker).
- Experience working with temporal or spatial data, including
  geospatial indexing and time-series alignment.
- Exposure to synthetic data generation, simulation logging, or
  scenario replay pipelines.
- Strong software engineering fundamentals, including CI/CD, testing,
  code review, and service deployment best practices.
- Experience collaborating with cross-functional, distributed teams
  across research and production orgs.

Please include links to any relevant open-source contributions or
technical project write-ups with your application.

The pay range for this position at
commencement of employment is expected to be between $180,000 and
$258,750/year for California-based roles. Base pay offered will
depend on multiple individualized factors, including, but not
limited to, a candidate's experience, skills, job-related
knowledge, and market location. TRI offers a generous benefits
package including medical, dental, and vision insurance, 401(k)
eligibility, paid time off benefits (including vacation, sick time,
and parental leave), and an annual cash bonus structure. Additional
details regarding these benefit plans will be provided if an
employee receives an offer of employment.

Please refer to this
Candidate Privacy Notice for the categories of
personal information that we collect from individuals who inquire
about and/or apply to work for Toyota Research Institute, Inc. or
its subsidiaries, including Toyota A.I. Ventures GP, L.P., and the
purposes for which we use such personal information.

TRI is fueled
by a diverse and inclusive community of people with unique
backgrounds, education and life experiences. We are dedicated to
fostering an innovative and collaborative environment by living the
values that are an essential part of our culture. We believe
diversity makes us stronger and are proud to provide Equal
Employment Opportunity for all, without regard to an applicant’s
race, color, creed, gender, gender identity or expression, sexual
orientation, national origin, age, physical or mental disability,
medical condition, religion, marital status, genetic information,
veteran status, or any other status protected under federal, state
or local laws.

It is unlawful in Massachusetts to require or
administer a lie detector test as a condition of employment or
continued employment. An employer who violates this law shall be
subject to criminal penalties and civil liability.

Pursuant to the
San Francisco Fair Chance Ordinance, we will consider qualified
applicants with arrest and conviction records for employment.

We
may use artificial intelligence (AI) tools to support parts of the
hiring process, such as reviewing applications, analyzing resumes,
or assessing responses. These tools assist our recruitment team but
do not replace human judgment. Final hiring decisions are
ultimately made by humans. If you would like more information about
how your data is processed, please contact us.