Data Engineer Loc: Waukegan, IL
100,000 - 200,000
Job Description:
EDO Data Engineer Loc: Waukegan, IL
For large enterprise datasets, the data engineer:
- Curates content to support key business initiatives, working primarily with data scientists and data analysts across functional disciplines.
- Participates in the acquisition, cataloging, and harmonization of information aligned with the needs of business stakeholders.
- Supports data consumers in understanding information context, generating fit-for-purpose datasets, and effectively using advanced analytic tools.
Key Responsibilities Include
- Planning, building, and running enterprise-class information management solutions across a variety of technologies (e.g., big data, master data, data profiling, batch processing, and data indexing technologies)
- Establishing advanced search solutions that include synonym, inference, and faceted searching
- Ensuring appropriate security and compliance policies are followed for information access and dissemination
- Defining and applying information quality and consistency business rules throughout the data processing lifecycle
- Collaborating with information providers to ensure quality data updates are processed in a timely fashion
- Enforcing and expanding use of the client's Common Data Model and industry-standard information descriptions (ontologies, taxonomies, vocabularies, lexicons, dictionaries, thesauri, glossaries, etc.)
- Managing the information portal and its customer-facing resources (data catalog, data portal, etc.)
Basic:
- Bachelor's degree with 10+ years of related work experience and a strong understanding of the specified functional area.
- Degree in Computer Science or a related discipline preferred.
Advanced:
- At least 10 years of experience in several data processing roles, such as database developer/administrator, ETL developer, data analyst, BI analytics developer, and/or solution developer of contextual search applications
- Experience with Informatica tools (PowerCenter, Big Data Management, Master Data Management), Cloudera CDH and ecosystem tools (SOLR, Spark, Impala, Hive, Hue, etc.), MarkLogic, SAS Analytics, Python, R, and Amazon Web Services preferred.