Role: Data Engineer
Location: Emeryville, CA
Type: Contract
Job Description:
· Build and optimize performance of Scala-Spark batch jobs using AWS EMR
· Build and optimize performance of Kafka, NiFi and other components of
real-time pipelines
· Participate in and contribute to the design, architecture and
development of high-quality data-lake data models
· Build data pipeline orchestration
· Ensure code quality and conformance to applicable rules, norms and
relevant best practices
· Coach, mentor and develop new hires and less experienced developers
working on Big Data projects
Skills:
· Strong understanding of distributed systems and distributed
computation.
· Strong working knowledge of Scala, Spark and Kafka
· Good knowledge of AWS services such as S3, EMR, Glue, SageMaker,
Lambda, ECS, DMW and Athena
· Demonstrated working knowledge of Spark, Kafka and NiFi
· Demonstrated working knowledge of data modeling
· Experience designing and running unit and integration tests
· Hands-on exposure to data-store technologies such as MongoDB,
DynamoDB, Postgres and Redshift
· Knowledge of TDD & BDD methodologies, tools and practices
Reference : Data Engineer jobs
Source: http://jobrealtime.com/jobs/technology/data-engineer_i9847