Senior Data Engineer CV Sample

Justin Lauren

Senior Data Engineer


Senior Data Engineer with 6+ years of experience building data-intensive applications, tackling challenging architectural and scalability problems, and managing data repositories for efficient visualization across a wide range of products.

Highly analytical team player with an aptitude for prioritizing needs and risks. Constantly strives to streamline processes and experiments with optimizing and benchmarking solutions. Creative troubleshooter and problem-solver who loves a challenge.

Experience implementing ML algorithms in production using the distributed paradigms of Spark/Flink on Azure Databricks/AWS SageMaker. Additionally, NLP and recommender system products (POC).

Experience shaping and implementing Big Data architecture in the medical devices, retail, banking, games, and transport logistics (IoT) domains.


  • Frameworks: Spark [Structured Streaming, MLlib, SQL], Flink, Kafka Streams
  • Databases: PostgreSQL/AWS Aurora, Neo4j/Azure Cosmos DB (Graph), Cassandra, MongoDB/Azure Cosmos DB (Document), Redshift, ClickHouse
  • Schedulers/Workflow: Airflow, Luigi, AWS Step Functions, Oozie
  • Visualization: Looker, SQL Analytics, Tableau
  • Programming Languages: Python, Scala, Java
  • Data Lakes/Blob Storage: Azure Data Lake, Databricks Delta, S3
  • Cloud Platforms: AWS, Azure, GCP, Databricks
  • DevOps: Docker, Kubernetes, Terraform
  • ML Frameworks: scikit-learn, TensorFlow

Work Experience

Senior Data Engineer

StrongArmTech, NY


  • Creating streaming pipelines to ingest sensor data and process it in real time to populate dashboards and the data warehouse.
  • Created pipelines for sensor data published to Kinesis (and S3 for fail-safe reprocessing), ingested by a Databricks job and written into Azure Delta tables and ClickHouse (previously GCP).
  • Built Looker and SQL Analytics dashboards over ClickHouse/GCP (benchmarking/production).
  • Built pipelines as part of a SOLID-principled Python codebase, including ad hoc time-bound backruns, CDC jobs for metadata entities, and production-optimized MLlib code.
  • Designed and integrated product entities using Azure Delta and ClickHouse tables, exposed via Python web-service APIs on AWS Lambda.

Senior Data Engineer

Jones Lang LaSalle Technologies (JLL), India

Dec 2020

Web-Based Property Product

  • Worked on multi-source API ingestion, dump schema creation, and entity modelling using Cosmos DB and Scala Azure Functions.
  • Worked on global multi-region sources and the associated rule-based implementation of region-specific ETL pipelines driven by Spark notebooks on Azure Databricks, using Databricks Delta.
  • Integrated entities in the property domain using Azure Cosmos Graph and Azure Databricks notebooks, followed by Scala web-service APIs deployed on Azure HDInsight for quick search.
  • Worked on the streaming-data element of the pipeline, detecting refreshes.

Competitive analytics platform

  • Designed the schema handling, ingestion, and implementation of a full-fledged reporting data warehouse for KPI tracking, including its individual table-based components.
  • Created Spark jobs to handle daily data from Mongo, MySQL, Postgres, and folder dumps to update the data warehouses, scheduled with Airflow.
  • Managed scaled ingestion from public competitor APIs to track relevant parameters in the Redshift analytics warehouse.
  • Built complex custom reporting logic in Spark, driving insightful marketing strategy.
  • Benchmarked the real-time elements of the solution with Kafka Streams.

Senior Data Engineer

Robert Bosch Engineering Solutions, Germany

Dec 2018

Kiosk Monitoring Product

  • Created Spark batch jobs derived from the incoming data model via a productionised ML model with associated business logic.
  • Implemented the Flask API layer and a simulator for the application.
  • Tested the end-to-end pipeline and handled DevOps for log monitoring of individual components.
  • Led the overall design and development of a cloud-agnostic, MQTT-based lambda architecture with Kafka and Spark for data ingestion and alert detection.

Flink Scala Akka Complex Event Processing Product

  • Created a Scala/Flink complex event processing and detection pipeline from the incoming data model with business logic.
  • Implemented the API layer in Akka and a data simulator (ongoing).
  • Tested the end-to-end pipeline and handled DevOps for component log monitoring on AWS (ongoing).
  • Designed and developed, based on the data format, an MQTT-based pipeline with Kafka, Flink, an RDBMS, and Cassandra for data ingestion and event/milestone detection.

Software Developer

General Electric Corp, India

Oct 2017

GE Healthcare’s Device Monitoring Product

  • Deployed and maintained the Azure cloud-based cluster (DevOps), along with pipeline design and data-handling constraints using a data virtualization tool.
  • Implemented detection algorithms for various respiration and lung parameters, and accumulation algorithms for case-end aggregation requirements.
  • Modelled data in Cassandra for real-time storage and case-end aggregation.
  • Data modeling for data warehousing and UI-based consumption.

Company Log Data Analytics 

  • Wrote Pig scripts against the Hive database to move data to a staging layer for processing before loading into the final Hadoop table.
  • Built Oozie workflows executing Java, Pig, and Hive actions based on decision nodes; scheduled Oozie workflow and coordinator jobs.


Bachelor of Engineering

San Jose State University

Jul 2014


  • English
  • French
  • Arabic
  • German

Career Expert Tips:

  • Always choose the resume format that best suits your professional experience.
  • Make sure you know how to write a resume in a way that highlights your competencies.
  • Check the expert-curated selection of popular CV and resume examples.