Evermos
Data Engineer
Jan. 2022 - Mar. 2022
Remote
• Migrated the PipelineWise DAG in Apache Airflow from BashOperator to KubernetesPodOperator, improving pipeline performance and reliability.
• Maintained and optimized the data infrastructure (Apache Airflow, Redshift, RudderStack, and Kubernetes) to keep data processing and storage efficient.
• Implemented a string-matching process on top of Apache Spark, improving the accuracy of data analysis and reporting.
• Developed a Grafana dashboard for monitoring Apache Airflow, enabling quick identification and resolution of issues.
• Deployed AWS Database Migration Service (DMS) for data ingestion and integrated it with Apache Airflow, improving data management and integration.
• Created Slack alerting for AWS Database Migration Service using an AWS Lambda function with Amazon SNS, ensuring timely notification of potential data issues.
• Automated the termination of locking queries in Redshift using a PythonOperator in Apache Airflow, reducing manual intervention.