Solve challenging problems using Python coding skills.
Design, build and launch new data extraction, transformation & loading processes in production.
Web crawling, data cleaning, data annotation, data ingestion and data processing.
Reading and collating complex data sets.
Creating and maintaining data pipelines.
Continual focus on process improvement to drive efficiency and productivity within the team.
Use Python, SQL, Elasticsearch (ES), Shell, etc. to build the infrastructure required for optimal extraction, transformation, and loading of data.
Provide insights into key business performance metrics by building analytical tools that utilize the data pipeline.
Support the wider business with their data needs on an ad hoc basis.
Comply with QHSE (Quality, Health, Safety and Environment), Business Continuity, Information Security, Privacy, Risk, Compliance Management and Governance of Organizations policies, procedures, plans and related risk assessments.
Qualifications and Requirements:
Bachelor's degree in Computer Engineering, Computer Science, or Electrical Engineering and Computer Sciences.
3+ years of programming experience, with solid coding skills in Python, Shell, and Java.
Good collaboration and communication skills.
Experience with web crawling and data cleaning.
Experience with solution architecture, data ingestion, query optimization, data segregation, ETL, ELT, AWS (EC2, S3, SQS, Lambda), Elasticsearch, Redshift, and CI/CD frameworks and workflows.
Working knowledge of data platform concepts: data lake, data warehouse, ETL, big data processing (designing for and supporting variety, velocity and volume), real-time processing architecture for data platforms, and scheduling and monitoring of ETL/ELT jobs.
Experience with PostgreSQL and programming (preferably Java or Python); proficiency in understanding data, entity relationships, structured & unstructured data, and SQL and NoSQL databases.
Knowledge of best practices in optimizing columnar and distributed data processing systems and infrastructure.
Experience in designing and implementing dimensional modelling.
Knowledge of machine learning and data mining techniques in one or more areas of statistical modelling, text mining and information retrieval.
Develop and maintain scalable data pipelines and systems on Azure.
Extensive experience with Azure cloud services.
Ideally, you'll also need:
In-depth market and domain knowledge
A passion for constant improvement
An innovative and creative approach to problem-solving