Data Engineer
Location: Montreal, QC, Canada
Salary: 100,000 - 200,000
Job Description:
RESPONSIBILITIES
Data-Related Responsibilities
- Architect, engineer, deploy, and maintain data pipelines (Airflow DAGs) that are fault-tolerant, temporally consistent, idempotent, replayable, and generally awe-inspiring (see the sketch after this list)
- Engineer tested and automated data transformations using PySpark, SQL, and Pandas
- Ensure the highest standard of data governance by crafting data contracts and service-level agreements and by automating data lineage tracking, data cataloging, and runtime validation
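
To give a flavour of the pipeline work described above, here is a minimal sketch of an idempotent, replayable Airflow DAG (assuming Airflow 2.x); the DAG id, schedule, and S3 path are illustrative assumptions, not details from this posting:

# Minimal sketch of an idempotent, replayable Airflow DAG (Airflow 2.x assumed).
# All names (dag_id, bucket, paths) are hypothetical.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def load_partition(ds: str, **_) -> None:
    """Rebuild exactly one day's partition, keyed on the logical date (ds).

    Writing to a deterministic, date-scoped destination keeps the task
    idempotent: re-running or back-filling a date overwrites that date's
    output instead of appending duplicates.
    """
    destination = f"s3://example-data-lake/events/dt={ds}/"  # hypothetical path
    print(f"Would (re)build partition at {destination}")


with DAG(
    dag_id="example_daily_events",  # hypothetical DAG id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=True,  # enables historical replays and backfills
    default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
) as dag:
    PythonOperator(task_id="load_partition", python_callable=load_partition)

Because every run is scoped to its logical date, backfilling a month is just a replay of that month's runs, with no duplicated rows.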
Technical Responsibilities
- Ensure high code quality and engineering standards
- Work with Jenkins for continuous integration and deployment, Docker for containerization, Git for version control, and Kubernetes for container orchestration
- Write infrastructure-as-code scripts with Terraform to support and improve our data lake's AWS infrastructure
- Engage with technical challenges in the domains of storage, pipelining and schema management
- Work on problems related to data access and security
- Collaborate with other teams and contribute code to other technical projects when necessary
- Provide rigorous code reviews and help manage our repositories
- Write comprehensive tests and resolve errors in a timely manner (a minimal example of a tested transformation follows this list)
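
To illustrate the testing standard described above, a minimal sketch of a unit-tested PySpark transformation might look like the following; the function, column names, and deduplication rule are hypothetical, chosen only to show the pattern:

# Minimal sketch of a tested PySpark transformation, assuming PySpark and
# pytest are available; column names and the dedup rule are illustrative.
import pytest
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window


def deduplicate_events(df: DataFrame) -> DataFrame:
    """Keep the latest record per event_id, based on the updated_at column."""
    w = Window.partitionBy("event_id").orderBy(F.col("updated_at").desc())
    return (
        df.withColumn("_rn", F.row_number().over(w))
        .filter(F.col("_rn") == 1)
        .drop("_rn")
    )


@pytest.fixture(scope="module")
def spark():
    return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()


def test_deduplicate_keeps_latest(spark):
    rows = [("a", "2024-01-01"), ("a", "2024-01-02"), ("b", "2024-01-01")]
    df = spark.createDataFrame(rows, ["event_id", "updated_at"])
    result = {r["event_id"]: r["updated_at"] for r in deduplicate_events(df).collect()}
    assert result == {"a": "2024-01-02", "b": "2024-01-01"}

Tests like this run against a local SparkSession, so they slot directly into the Jenkins CI pipeline mentioned above.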
Non-Technical Responsibilities
- Take end-to-end ownership of data pipelines, ensuring that every stakeholder's business needs are well understood and delivered accordingly
- Support peers as necessary, both within and outside of your team
- Act as a subject-matter expert for all Data Engineering-related matters within the company
- Mentor peers and contribute meaningfully to the technical culture at the client
REQUIREMENTS
- Relevant academic background and/or verifiable domain expertise in Data Engineering
- A minimum of 2 years of programming experience, preferably in high-level object-oriented or functional languages; fluency in Python is a major asset
- Experience working with cloud-based infrastructure and DevOps; AWS-based work experience an asset
- Extensive experience working with batch data pipelining frameworks such as Airflow or Luigi; experience with stream-processing frameworks an asset
- Deep understanding of data lakes, data warehouses or other analytics solutions
- Deep understanding of data transformation techniques and ETL scripting; knowledge of Spark and Pandas a strong asset
- Extensive experience writing and optimizing SQL queries
- Domain expertise in architecting and maintaining distributed data systems
- Knowledge of source control with Git, CI/CD pipelining, testing, containerization, and orchestrated deployment
- Experience working in an Agile environment is an asset
- Strong written and verbal communication skills in English; French an asset
SKILLS
- Highly analytical and detail-oriented
- Creative thinker with excellent problem-solving abilities
- Ability to thrive in a fast-paced, performance-driven environment
- Team player with solid interpersonal skills