Lead Data Engineer
About the Project
The client is launching a large-scale migration of existing ETL pipelines from Cloudera to Databricks. The role combines hands-on technical ownership, migration architecture, and coordination of a cross-functional team. Strong communication skills are essential, as the position requires active interaction with both business and technical stakeholders.
Main Responsibilities:
- Lead and coordinate the migration of ETL pipelines from Cloudera to Databricks.
- Analyze existing pipelines, dependencies, and pain points, and design the target architecture on Databricks.
- Support the team in troubleshooting and performance optimization.
- Communicate with business and technical stakeholders to align priorities and remove blockers.
- Ensure delivery timelines and quality standards are met.
Requirements:
- Strong hands-on experience with Databricks (Delta Lake, Databricks SQL, Unity Catalog, MLflow, Workspace management).
- Deep understanding of ETL/ELT concepts, data pipeline design, and data orchestration.
- Experience with Cloudera, Spark, and migrations between Hadoop-based and cloud-based platforms.
- Good knowledge of the Azure ecosystem (Data Factory, Synapse, DevOps CI/CD).
- Strong programming skills in Python and PySpark.
- Experience with Git, CI/CD, and deployment automation in Databricks.
- Advantage: prior experience as a technical lead or migration architect on similar projects.
Soft Skills:
- Strong communication and stakeholder management.
- Ability to bridge business and technical teams.
- Proactive, structured, and solution-oriented.
- English at C1 level.
What We Offer:
- Long-term employment with competitive compensation based on experience.
- Possibility to work remotely.
- An open, transparent, and fun work culture.
- A multi-national team and collaborative work environment.
- Continuous knowledge sharing with engaged co-workers.
- Career and professional growth opportunities.