Location: NYC, NY - Onsite
Role Overview:
As a Data Engineer at Blockboard, you will be responsible for designing, building, and maintaining our data infrastructure, ensuring its scalability, reliability, and efficiency. You will work closely with cross-functional teams to understand data requirements, develop ETL pipelines, and implement solutions for data processing, storage, and analysis.
Key Responsibilities:
Design, develop, and maintain ETL pipelines for processing large volumes of data from various sources, ensuring data quality and integrity.
Implement and optimize data models and schemas using Delta Tables on the Databricks platform.
Develop and maintain data warehousing solutions, including data modeling, indexing, and optimization.
Troubleshoot and resolve data-related issues, ensuring data pipelines are running smoothly and efficiently.
Stay updated with the latest technologies and trends in data engineering and recommend innovative solutions to improve our data infrastructure.
Qualifications:
Bachelor's degree or higher in Computer Science, Engineering, or a related field.
5+ years of experience working as a Data Engineer or in a similar role.
Strong proficiency in Python and Spark for scripting and programming.
Hands-on experience with the Databricks platform and Delta Tables for data processing and storage.
Proficiency in SQL and experience with relational databases such as MySQL.
Experience with cloud platforms, particularly AWS (Amazon Web Services), and services such as S3, RDS, and Glue.
Solid understanding of data warehousing concepts, ETL processes, and data modeling techniques.
Excellent problem-solving skills and ability to work independently or in a team environment.
Desired Skill Set:
Experience with streaming data processing frameworks such as Apache Kafka or Apache Spark Streaming.
Familiarity with OpenAI models or other large language models (LLMs) for natural language processing tasks.
Experience with data visualization tools such as Tableau, Power BI, or similar.