Technologies Involved: Apache Spark, AWS EMR
Project Description

The DWP EMR Data Reprocessing System is a data engineering initiative built for large-scale data transformation and reprocessing. Oodles supported the client by running Spark-based ETL jobs reliably on AWS EMR, with automated CI/CD pipelines that keep deployments secure, repeatable, and compliant.

Scope Of Work

The client engaged Oodles to design, support, and optimize distributed ETL workflows running on AWS EMR. The scope included managing Spark-based data reprocessing jobs, implementing CI/CD pipelines using GitLab, enforcing security checks through automated scans, and ensuring consistent environments for testing and deployment via containerization.

Our Solution

Oodles delivered a resilient EMR-based data reprocessing setup with Spark-driven ETL pipelines tailored for large datasets. We implemented GitLab CI/CD workflows with automated testing, dependency scanning, and secret detection to strengthen security and compliance. Dockerized environments and infrastructure support ensured stable deployments, efficient troubleshooting, and uninterrupted data operations.
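
As a rough illustration of how a Spark reprocessing job is typically handed to an EMR cluster, the sketch below builds a step definition that runs `spark-submit` through EMR's `command-runner.jar`. It is not taken from the project itself: the function name `build_spark_step`, the bucket and script path, and the `--date` argument are all hypothetical placeholders.

```python
from typing import Dict, List, Optional


def build_spark_step(
    name: str,
    script_uri: str,
    script_args: Optional[List[str]] = None,
    action_on_failure: str = "CONTINUE",
) -> Dict[str, object]:
    """Return an EMR step definition that runs a PySpark script via spark-submit.

    The returned dict follows the shape EMR expects for a step, so it could be
    passed to boto3's EMR client, e.g.
    add_job_flow_steps(JobFlowId=..., Steps=[step]).
    """
    # command-runner.jar lets an EMR step invoke spark-submit on the cluster.
    args = ["spark-submit", "--deploy-mode", "cluster", script_uri]
    args.extend(script_args or [])
    return {
        "Name": name,
        "ActionOnFailure": action_on_failure,
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }


# Hypothetical example: reprocess one day of data.
# The S3 path and the --date flag are assumptions for illustration only.
step = build_spark_step(
    name="reprocess-2024-01-01",
    script_uri="s3://example-bucket/jobs/reprocess.py",
    script_args=["--date", "2024-01-01"],
)
```

Keeping the step definition as a plain function makes it easy to unit-test in a CI pipeline without touching AWS; only the final submission call needs cluster credentials.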