
Insight Global

Data Engineer

Published 21 Mar 2026
New York, NY, USA
Remote

Role Highlights

Languages used

Scala
Python

Key skills

Machine Learning
Data Science
Data Engineer
Data Infrastructure
ML Ops
Cloud Infrastructure
Data Quality
Operations
ETL
Storage
AI
Inference
Architecture
Research

Tools, Libraries and Frameworks

Azure Data Factory
Spark SQL
Databricks
Kubernetes
Apache Spark
PySpark

Description

Job Description

Insight Global is looking for a Data Engineer to join a dedicated team building and evolving a clinical data platform serving the clinical operations space. You will architect and build the large-scale data pipelines that power clinical insights, processing billions of records across medical claims, clinical trials, publications, and provider data.

This is a core infrastructure role. You will be responsible for designing, building, and maintaining ETL frameworks that feed into analytics, machine learning, and product surfaces. You should be deeply comfortable with distributed computing at scale and experienced working alongside ML and data science teams in production environments.

Responsibilities Include:

· Design, build, and maintain large-scale ETL pipelines and data frameworks using Apache Spark (PySpark/Scala) on cloud infrastructure
· Architect scalable data models and pipeline patterns to process structured and unstructured healthcare data at volume
· Build and optimize data layers on Azure cloud services, including Databricks, Delta Lake, and supporting compute and storage infrastructure
· Ensure data quality, lineage, and governance across the platform, implementing validation, monitoring, and alerting at scale
· Collaborate with AI Scientists and MLOps teams to build data pipelines that serve model training, inference, and retraining workflows
· Work with data analysts and product teams to ensure curated, reliable data is available for downstream insights and reporting
· Contribute to platform architecture decisions and help define best practices for data engineering within the team

We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances.

If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce :

Skills and Requirements

· 5+ years of experience in data engineering with a focus on large-scale distributed data systems
· Strong proficiency in Python, SQL, and Scala
· Deep hands-on experience with Apache Spark (PySpark, Spark SQL) for building ETL pipelines and data transformations at scale
· Experience with Azure cloud services including Databricks, Delta Lake, and Azure Data Factory
· Understanding of MLOps practices and experience building data infrastructure that supports machine learning workflows
· Experience with data quality frameworks, data lineage, and governance tooling
· Comfortable working independently in a remote setting with a distributed, cross-time zone team
· Familiarity with Kubernetes and container orchestration for data workloads
· Background in healthcare, life sciences, pharma, or clinical research is a strong plus

Required Qualifications and Skills

The role requires at least 5 years of experience in data engineering, with a focus on large-scale distributed data systems. Strong proficiency in Python, SQL, and Scala is necessary. Deep hands-on experience with Apache Spark, including PySpark and Spark SQL, for building ETL pipelines and data transformations at scale is essential. Experience with Azure cloud services such as Databricks, Delta Lake, and Azure Data Factory is required. Familiarity with MLOps practices and building data infrastructure to support machine learning workflows is also needed. Experience with data quality frameworks, data lineage, and governance tooling is expected.

Disclaimer

Disclaimer: Job and company description information and some of the data fields may have been generated via GPT-4 summarisation and could contain inaccuracies. The full external job listing link should always be relied on for authoritative information.

About the company

Insight Global

Size

13144

Founded

HQ

Atlanta, US

Description

Insight Global is an international staffing and services company specializing in sourcing IT, accounting, finance, healthcare, and engineering professionals and delivering service-based solutions to Fortune 1000 clients, with more than 70 office locations throughout the U.S., Canada, and U.K. In addition to staffing services, Insight Global provides culture consulting, DEI training, specialized healthcare staffing and resources, and an array of client programs through its professional services division, Evergreen. To find out more, visit www.insightglobal.com.
