Job Description:
We are seeking a skilled and motivated Data Engineer with 8+ years of experience to join our team. The ideal candidate should have hands-on experience in data engineering using Databricks, with a strong focus on developing, maintaining, and optimizing large-scale data pipelines and solutions. You will work closely with cross-functional teams to deliver high-quality data solutions aligned with business requirements.
Key Responsibilities:
Design, develop, and maintain data pipelines using Databricks and Apache Spark
Build ETL processes to ingest structured and unstructured data into the Databricks platform
Optimize data pipelines for performance and scalability
Collaborate with data analysts and data scientists to gather and understand data requirements
Implement data quality checks and monitoring processes
Troubleshoot and resolve data pipeline and processing issues
Stay updated with modern data engineering tools, technologies, and best practices
Participate in code reviews and maintain proper documentation
Required Skills and Experience:
Bachelor’s degree in Computer Science, Data Science, Engineering, or a related field
8+ years of experience in data engineering or a related domain
Strong proficiency in Python or Scala
Hands-on experience with Databricks and Apache Spark
Strong knowledge of SQL
Experience working with cloud platforms (AWS, Azure, or GCP)
Good understanding of data modeling and data warehousing principles
Strong problem-solving and analytical skills
Excellent communication and teamwork skills
Bonus Skills:
Exposure to CI/CD practices in Databricks
Basic understanding of the machine learning lifecycle and model deployment
Hands-on experience with orchestration tools such as Apache Airflow