About the Role
We are seeking a highly skilled Senior Software Data Engineer (SSDE) with strong expertise in Azure Databricks, PySpark, ADF, ADLS Gen2, Azure SQL, and Python. The ideal candidate will design, build, and optimize scalable data pipelines and cloud-based data engineering solutions to enable enterprise-grade products in the healthcare sector.
The role requires both technical depth in data engineering on Azure Cloud and strong interpersonal skills to collaborate effectively with cross-functional teams and customer stakeholders.
Key Responsibilities
1. Data Engineering & Development
- Design and build scalable data pipelines using Azure Databricks with PySpark for batch and near-real-time ingestion, transformation, and processing.
- Optimize Spark jobs and manage large-scale distributed data processing using RDD/DataFrame APIs.
- Implement modular, testable, and reusable Python code for data transformations and utility functions.
- Develop Python packages and manage dependencies using pip/Poetry/Conda.
2. Orchestration & Integration
- Design and implement data orchestration workflows using Azure Data Factory (ADF).
- Build parameterized, reusable pipelines using activities, triggers, and dependencies.
- Handle pipeline performance tuning, error-handling strategies, and integration runtime configurations.
- Design and support IDoc-based interfaces, RFCs, and BAPI wrappers (if applicable).
3. Data Storage & Management
- Implement secure, hierarchical namespace-based storage using ADLS Gen2 aligned to bronze-silver-gold architecture layers.
- Apply best practices in file partitioning, retention management, RBAC/ACL-based access control, and storage performance optimization.
- Manage lifecycle policies and folder-level security.
- Develop T-SQL queries, stored procedures, and metadata layers on Azure SQL Database.
4. Documentation & Quality Assurance
- Write clear, concise documentation following industry-standard practices.
- Adopt code analyzers and unit-testing frameworks to ensure high-quality deliverables.
- Contribute to design reviews and code reviews, and provide constructive feedback.
Required Technical Skills
- 2+ years of hands-on experience in Azure Databricks with PySpark.
- 2+ years of experience in Azure Data Factory (ADF).
- 2+ years of experience in ADLS Gen2.
- 2+ years of experience in Azure SQL (T-SQL, stored procedures, metadata management).
- 1+ year of experience in Python programming and package development.
Soft Skills & Other Requirements
- Strong communication skills (verbal, email, instant messaging) for stakeholder collaboration.
- Strong interpersonal skills to build and maintain productive team relationships.
- Problem-solving mindset with the ability to troubleshoot and resolve issues efficiently.
- Experience working in Agile/Scrum environments with tools like Jira or Azure DevOps.
- Proactive, detail-oriented, and diligent in task execution and progress reporting.
Expected Outcomes
- Deliver robust, scalable, and optimized data engineering solutions on Azure Cloud.
- Collaborate with internal teams and customers to align solutions with design documents and coding standards.
- Contribute to the growth of data engineering capabilities within the healthcare domain.
Preferred Qualifications (Good to Have)
- Experience in healthcare domain projects.
- Knowledge of advanced data modeling and architecture patterns.
- Exposure to AI/ML workflows for data engineering.
