Lead Data Engineer with 9+ years of experience building and scaling cloud data platforms on GCP and AWS. I design and implement robust ETL pipelines using tools like PySpark, Airflow, Databricks, and Snowflake, supporting both batch and real-time data processing.
Recent highlights:
• Built and scaled a GCP data platform supporting a large analytics team using a medallion architecture (Bronze/Silver/Gold).
• Improved performance and reliability of data pipelines through optimized PySpark processing.
• Led the development of data infrastructure for audit and analytics use cases, enabling better visibility into business operations.
• Implemented real-time data streaming to support operational dashboards and time-sensitive insights.
Experienced in collaborating with global, cross-functional teams (USA, UK, India, China), with a strong ability to translate business needs into scalable data solutions.