Data Engineer
We are looking for a Data Engineer for our client.
Founded in 2003, the company is a global technology consultancy specializing in digital transformation, data management, and AI-driven solutions. They focus on enhancing customer experiences and providing secure IT infrastructure. The company also offers expertise in Salesforce, UX/UI design, and loyalty programs, aiming to accelerate business value through integrated technology and industry insights.
Key Responsibilities:
- Design, implement, and optimize data pipelines using AWS services, including RDS (Relational Database Service), EKS (Elastic Kubernetes Service), and Glue, to support scalable and efficient data workflows.
- Develop and maintain Change Data Capture (CDC) mechanisms to ensure accurate and timely data updates across systems (see the illustrative sketch after this list).
- Leverage Amazon MSK (Managed Streaming for Apache Kafka) to enable real-time data synchronization and processing between distributed systems.
- Build and manage streaming and batch data processing pipelines using tools such as AWS Glue, Apache Kafka, Apache Spark, and Amazon Kinesis.
- Apply relational data modeling best practices, maintaining data integrity and consistency during migrations and transformations.
- Write clean, efficient code in Python, Scala, or Java to support data pipeline development and automation.
- Design and monitor resilient data pipelines, minimizing downtime and data loss through proactive troubleshooting and performance tuning.
- Collaborate with cross-functional teams to understand data requirements and deliver solutions that align with business objectives.
- Document technical designs, processes, and operational procedures to ensure maintainability and knowledge sharing.
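By way of illustration (not part of the formal job description), here is a minimal Python sketch of the kind of CDC consumer the role involves, using the kafka-python client. The topic name, broker address, and consumer group are hypothetical placeholders, and the Debezium-style "op"/"after" payload shape is an assumption about the event format, not a detail of the client's stack.

```python
# Illustrative sketch only: all names below are hypothetical placeholders.
import json
from kafka import KafkaConsumer  # kafka-python client

# Connect to an MSK-style Kafka cluster (broker endpoint is a placeholder).
consumer = KafkaConsumer(
    "orders.cdc",                                 # hypothetical CDC topic
    bootstrap_servers=["broker-1.example:9092"],  # placeholder MSK endpoint
    group_id="orders-sync",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # Assumed Debezium-style payload: "op" is c=create, u=update, d=delete,
    # with the row state carried in "after" (or "before" for deletes).
    op = event.get("op")
    row = event.get("after") or event.get("before")
    if op in ("c", "u"):
        print(f"upsert: {row}")  # in practice: write to the target data store
    elif op == "d":
        print(f"delete: {row}")
```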
Requirements:
- 4+ years of experience in data engineering, with a strong focus on cloud-based data solutions.
- Advanced expertise in AWS services, particularly RDS, EKS, and Glue, with hands-on experience in deployment and optimization.
- Proven experience designing and implementing Change Data Capture (CDC) mechanisms for real-time data updates.
- In-depth knowledge of Amazon MSK and Apache Kafka for real-time data synchronization and streaming applications.
- Hands-on experience building streaming and batch data pipelines using AWS Glue, Apache Spark, Amazon Kinesis, or similar tools.
- Strong understanding of relational data modeling, data integrity, and consistency principles, especially in the context of data migrations.
- Proficiency in at least one programming language (Python, Scala, or Java) for data pipeline development and scripting.
- Demonstrated ability to design and monitor resilient, fault-tolerant data pipelines with minimal downtime and data loss.
- Excellent problem-solving skills and attention to detail in a fast-paced environment.
- Upper-intermediate English level.
Will be a plus:
- Experience with containerization and orchestration (e.g., Docker, Kubernetes) in conjunction with EKS.
- Familiarity with infrastructure-as-code tools (e.g., Terraform, CloudFormation) for AWS deployments.
- Exposure to data governance, security, and compliance standards in a cloud environment.
- AWS certification (e.g., AWS Certified Data Analytics – Specialty or AWS Certified Solutions Architect).
What we offer:
- Competitive compensation based on experience
- Option to work remotely
- An open, transparent, and fun work culture
- Multinational team and collaborative work environment
- Continuous knowledge sharing with engaged co-workers
- Career and professional growth opportunities