Staff / Senior Staff Data Engineer, Real-World Data

Posted 24 Days Ago
Remote
Senior level
Artificial Intelligence • Machine Learning • Software • Biotech • Pharmaceutical
The Role
As a Staff/Senior Staff Data Engineer at Valo Health, you will lead data engineering initiatives to transform real-world data into analysis-ready data products. Responsibilities include building and maintaining data transformation pipelines, integrating EHR data, ensuring data quality, and providing technical leadership within the data engineering team.
Summary Generated by Built In

About Us

Valo Health is a technology company that is integrating human-centric data and AI-powered technology to accelerate the creation of life-changing drugs for more patients faster. Valo was created with the belief that the drug discovery and development process can and should be faster and less expensive, with a much higher probability of success. We are using models early to fail less often, executing clinical trials to add valuation to the company, and generating fit-for-purpose data to feed back into Valo’s Opal Computational Platform™ as we reinvent drug discovery and development from the ground up. Disease doesn’t wait, so neither can we.

We are a multi-disciplinary team of experts in science, technology, and pharmaceuticals united in our mission to achieve better drugs for patients faster. Valo is committed to hiring diverse talent, prioritizing growth and development, fostering an inclusive environment, and creating opportunities to bring together a group of different experiences, backgrounds, and voices to work together. We achieve the widest-ranging impact when we leverage our broad backgrounds and perspectives to accelerate a new frontier in health. Valo seeks to become the catalyst for the pharmaceutical industry and drive the digital transformation of the industry. Are you ready to join us?

About the Role

As a Staff / Senior Staff Data Engineer, you will join the data engineering core in the Translational Data Sciences group, working with data scientists and engineers building powerful computational tools and answering critical scientific questions about patients, diseases, and drug development.

In this role, you will lead the development, roadmapping, and execution of complex initiatives to transform real-world data (e.g., electronic medical records, biomarkers, biomedical imaging, and text notes) into analysis-ready data products for internal teams. To do so, you will partner with a diverse set of scientists, engineers, and domain experts across traditional industry boundaries. The primary downstream use cases for these data are longitudinal deep learning models of patient trajectories and knowledge graph integration for target identification, statistical genetics, and multi-omics modeling.

What You'll Do...

  • Build, maintain, and extend data transformation pipelines and systems to ingest and harmonize third-party EHR data into Valo’s data ecosystems
    • Define Valo’s EHR data models and pipelines (Spark, SQL) in a centralized data ecosystem and semi-isolated cloud environments
    • Work closely with data providers and in-house data users to integrate third-party EHR data with Valo’s standardized data
      • Maintain and extend data integration (standardization & harmonization) & data quality processes to improve quality, reliability, and FAIRness
      • Ensure conceptual accuracy and generalizability of data: do standardized derived features represent clinical concepts in repeatable ways?
  • Simplify how data scientists access, transform, and use their data
    • Promote consistent data usage patterns, including version management, shared ontologies & data dictionaries
    • Support internal data users both directly and by composing demos, how-tos, and reference documentation
  • Provide technical leadership within the translational data engineering team
    • Simplify how data engineers build, maintain, and extend their data pipelines
    • Advise colleagues on data transformations and database design
    • Provide critical feedback and encourage best practices within the data engineering team
    • Participate in the creation and maintenance of technical documentation

What You Bring...

  • Bachelor’s degree + 8 (Staff) / 10 (Senior Staff) years of experience, MS + 6/8 years, or PhD + 5/7 years, in computer science, information systems, or data science
  • 5+ yrs experience in a technical role in:
    • SWE / DE: data ingestion, streaming technologies, troubleshooting data pipelines (e.g., Prefect, Airflow), and implementing CI/CD practices
    • Production programming experience in Python & SQL; cloud compute and big data tools, e.g., Spark
  • 3+ yrs experience in a professional role gathering requirements and understanding customers' / data users' goals
    • Demonstrated experience scoping projects, determining timelines and milestones, delivering end-to-end projects
    • Technical project management experience (scoping, defining milestones & timelines) a plus
  • Experience with EHR/EMR data and medical coding ontologies (e.g., ICD, ATC, LOINC, SNOMED)
    • Nice to have: experience with sparse longitudinal records, e.g., customer / log data with historical ontologies – about the concepts, distinct from data provenance & qualitative data and coding structures
  • Experience with data engineering best practices and testing methodologies (data provenance, collaborative development using source control management (Git), code versioning, reproducibility, etc.)

More on Valo

Valo Health, LLC (“Valo”) is a technology company built to transform the drug discovery and development process using human-centric data and artificial intelligence-driven computation. As a digitally native company, Valo aims to fully integrate human-centric data across the entire drug development life cycle into a single unified architecture, thereby accelerating the discovery and development of life-changing drugs while simultaneously reducing costs, time, and failure rates. The company’s Opal Computational Platform™ is an integrated set of capabilities designed to transform data into valuable insights that may accelerate discoveries and enable Valo to advance a robust pipeline of programs across cardiovascular metabolic renal, oncology, and neurodegenerative diseases. Founded by Flagship Pioneering and headquartered in Lexington, MA, Valo also has offices in New York, NY. To learn more, visit www.valohealth.com.


Top Skills

Spark
SQL
The Company
New York, NY
221 Employees
Hybrid Workplace
Year Founded: 2019

What We Do

Valo is a technology company built to transform the drug discovery and development process using human-centric data and AI-powered computation. Valo is fully integrating human-centric data across the entire drug development lifecycle into a single unified architecture, thereby accelerating the discovery and development of life-changing drugs while simultaneously reducing the cost, time, and failure rate. The company’s Opal Computational Platform™ consists of an integrated set of capabilities designed to transform data into valuable insights that may accelerate discoveries and enable Valo to advance a robust pipeline of programs across cardiovascular metabolic renal, oncology, and neurodegenerative disease.

Why Work With Us

We’re focused on bringing together brilliant minds to play a critical role in shaping and executing our mission. Be part of a culture that emphasizes trust and diversity, where people are generous with their ideas, support their colleagues, and feel free to voice their opinions. Be part of Valo.

