Freelancer
Big Data / Hadoop / Spark consultant
May 2015
Würzburg Area, Germany

I run customer engagements to build or review new Big Data architectures, on premises or in the cloud. I also program on top of Apache Spark for machine learning, ETL jobs, and Spark Streaming.

- Apache Hadoop / YARN (HDP and Ambari)
- Scala
- Java
- Maven, SBT and Gradle
- Docker
- Amazon EMR, Amazon Redshift
- Writing geospatial applications in Scala on Spark
- Custom Spark data sources for HBase and aggregation for data exploration
- Spark cluster optimizations
- Sizing for Hadoop / Spark clusters
- Big Data on Amazon Web Services
- Reviewing Lambda Architecture (BI)
- Data migration into Hadoop with Sqoop
- Spark upgrades
- Architecting a framework for alerting and a compute service based on Spark
- Writing Spark applications (example: geospatial operations on Spark data sources)
- Writing custom Spark data sources for HBase
- Machine learning with Spark decision trees and k-means