Senior Data Engineer
About the job
Recent progress in artificial intelligence, coupled with a growing resource of longitudinal patient data from historical clinical trials and real-world sources, provides an unprecedented opportunity to transform clinical trials and speed the development of new medicines.
Unlearn is a science-first company that has invented the first machine learning platform for creating Digital Twins of patients in clinical trials: comprehensive simulations that answer “what would likely happen to this patient if he/she were randomized to the control group?”
Using information from Digital Twins to estimate treatment effects reduces required sample sizes, increases statistical power, and provides patient-level treatment response information.
It’s about increasing confidence in trial results.
It’s about bringing new medicines to patients faster.
It’s about time.
Your Role at Unlearn
You will be collaborating with the incredibly talented folks who are working to productionize and scale Unlearn’s technology. Your primary responsibility will be to lead the development of Unlearn’s ETL software infrastructure, making it possible to scale our production workflows by an order of magnitude. As a leader, you will be relied upon to guide the software organization in the highest standards of software development, and in particular in the best practices related to building and automating complex data processing and ML pipelines. You will have an outsized impact throughout the technical organization, laying the groundwork for scale. You are an experienced data engineer who wants to help build a new, computationally focused approach to important problems in healthcare.
- Lead the development of Unlearn’s ETL software infrastructure, making it possible to scale our production workflows by 10x while improving quality.
- Closely collaborate with data scientists and ML engineers to architect software solutions that meet the needs of the broader organization.
- Scope work, manage the backlog, and deliver projects predictably.
- Provide input into long-range platform requirements and operational guidelines.
- Develop and own best practices and methods for reproducible, traceable, and validatable data processing and ML pipelines.
- Develop expertise in relevant data standards (e.g., CDISC), and collaborate with Unlearn’s regulatory team to ensure compliance with regulatory guidance related to ETL activities and software.
- 5+ years work experience as a data engineer or similar role.
- Extensive expertise in custom ETL design, implementation, and maintenance.
- Fluent in Python and the tools of the Python data processing ecosystem.
- Experience with AWS services.
- Knowledge of scheduling, logging, monitoring, and task orchestration platforms like Airflow or Nextflow.
- Demonstrated proficiency in writing high-quality and scalable code and integrating with version control systems.
- Experience leading successful data engineering projects and operationalizing machine learning algorithms.
- The ability to lead, collaborate, communicate, and mentor.
- A strong background in open source technology.
- Proven experience deploying machine learning algorithms to production.
- Experience with data processing pipelines in a highly regulated context.
- Eventually (after the pandemic) able to work in the downtown SF office at least 2 days per week.
Compensation & benefits
Unlearn offers compensation commensurate with experience as well as a competitive benefits package, including:
Generous equity participation.
Unlimited PTO plus company holidays.
Annual company-wide shutdown between the Christmas Eve and New Year's holidays.
Professional development budget to attend conferences or other events.
401k plan with generous matching.
Company-subsidized medical, dental, & vision insurance plans.
Commuter benefits plan.
Paid parental leave.
- Unlearn is not currently offering visa sponsorships for any position. Please only apply if authorized to work in the U.S.
- Address: San Francisco, CA