Project description

We are hiring for projects in the Middle East, where there are many opportunities in the region. Our team consists of frontend and backend developers, data analysts and data scientists, architects, analysts and project managers, and Databricks data engineers. As a Lead Data Engineer, you will manage the development, implementation, and maintenance of the data infrastructure for a modern data platform.

Responsibilities
1. Platform Design: Collaborate with cross-functional teams to refine and design the architecture of our data platform.
2. Platform Optimization: Continuously enhance the performance, scalability, and reliability of the data platform to meet evolving business requirements.
3. Platform Maintenance: Implement monitoring, alerting, and maintenance processes to ensure the availability and health of the data platform.
4. Data Governance: Establish and enforce data governance best practices to uphold data quality, security, and compliance standards.
5. Platform Integration: Integrate new technologies seamlessly into the data platform ecosystem, ensuring compatibility and interoperability.
6. Collaboration: Work closely with data engineers, data scientists, analysts, and business stakeholders to understand requirements and offer technical expertise and support.
7. Documentation: Create and maintain comprehensive documentation for the data platform architecture, processes, and workflows.
8. Mentorship: Provide guidance and mentorship to junior members of the data engineering team, nurturing their growth and development.
Design, develop, and maintain data pipelines to ingest and transform raw data into Silver and Gold tables using Databricks (an illustrative pipeline sketch follows this list).
Build reusable data pipelines that can be used by the application development team.
Collaborate with data consumers and other stakeholders to understand data requirements and deliver high-quality data solutions.
Implement declarative ETL pipelines using Databricks, Delta Lake, and other relevant tools to ensure data is accurate, timely, and accessible.
Optimize data models and pipelines for performance, scalability, and reliability.
Model data from various sources, ensuring consistency and quality in the Silver layer.
Curate data assets in the Silver layer to support data science and machine learning applications.
Build and maintain the Gold layer to provide clean, aggregated, and business-ready data for reporting and analytics.
Ensure data governance and security best practices are followed.
Well versed in Databricks performance-tuning techniques for very large data sets.
Document data processes, models, and pipelines for transparency and knowledge sharing.
Well versed in DevOps practices for continuous integration and deployment.
Write code, review pull requests, provide constructive feedback, and maintain high code quality standards.
Maintain a deep understanding of data engineering principles, best practices and emerging technologies to drive engineering initiatives
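To make the Silver/Gold responsibilities above concrete, here is a minimal, non-authoritative sketch of this kind of pipeline using PySpark and Delta Lake. It assumes a Databricks environment where a SparkSession is available, and the table and column names (bronze.raw_orders, silver.orders, gold.daily_order_totals, order_id, order_ts, amount) are hypothetical placeholders.

```python
# Minimal Silver/Gold pipeline sketch for Databricks (PySpark + Delta Lake).
# Table and column names are hypothetical; adjust to the actual catalog/schema.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # provided automatically in a Databricks notebook

# Bronze -> Silver: clean and conform raw data.
raw = spark.read.table("bronze.raw_orders")
silver = (
    raw.dropDuplicates(["order_id"])                        # remove duplicate ingests
       .filter(F.col("order_id").isNotNull())               # drop records missing the key
       .withColumn("order_ts", F.to_timestamp("order_ts"))  # enforce a consistent type
)
silver.write.format("delta").mode("overwrite").saveAsTable("silver.orders")

# Silver -> Gold: business-ready aggregate for reporting and analytics.
gold = (
    spark.read.table("silver.orders")
         .groupBy(F.to_date("order_ts").alias("order_date"))
         .agg(F.sum("amount").alias("total_amount"),
              F.count("order_id").alias("order_count"))
)
gold.write.format("delta").mode("overwrite").saveAsTable("gold.daily_order_totals")
```

In a production setting the same transformations would typically be wrapped in scheduled Databricks jobs or declarative pipelines, with data quality checks between layers.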
Skills

Must have
5-7 years of overall experience in data engineering and data transformation in the cloud.
3+ years of very strong experience with Azure data engineering and Databricks.
Expertise in supporting and developing lakehouse workloads at the enterprise level.
Experience with PySpark is required, including developing and deploying workloads to run on Spark's distributed compute (an illustrative tuning sketch follows this list).
Candidates must hold at least a graduate or bachelor's degree in Computer Science, Information Technology, Engineering (Computer/Telecommunication), or equivalent.
Cloud deployment: preferably Microsoft Azure.
Experience implementing platform and application monitoring using cloud-native tools.
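As a rough illustration of the PySpark and Databricks tuning skills listed above, the sketch below shows a broadcast join to avoid a shuffle, a partition-aware Delta write, and file compaction with OPTIMIZE/ZORDER. The table and column names (silver.orders, silver.customers, gold.enriched_orders, customer_id, order_date) are hypothetical examples only.

```python
# Illustrative PySpark tuning sketch for large Delta tables on Databricks.
from pyspark.sql import SparkSession
from pyspark.sql.functions import broadcast

spark = SparkSession.builder.getOrCreate()

orders = spark.read.table("silver.orders")        # large fact table
customers = spark.read.table("silver.customers")  # small dimension table

# Broadcast the small dimension table to avoid an expensive shuffle join.
enriched = orders.join(broadcast(customers), "customer_id", "left")

# Repartition on the write key so output files are evenly sized across the cluster.
(enriched.repartition("order_date")
         .write.format("delta")
         .mode("overwrite")
         .partitionBy("order_date")
         .saveAsTable("gold.enriched_orders"))

# Compact small files and co-locate data on a common filter column.
spark.sql("OPTIMIZE gold.enriched_orders ZORDER BY (customer_id)")
```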