Install, configure, and upgrade Hadoop components on the Hortonworks HDP 2.x platform.
Troubleshoot cluster issues and support developers running MapReduce or Tez jobs in Pig and Hive.
Proactively optimize and tune cluster configuration for performance.
Organize, document, and improve processes around data inventory management for 100+ TB of data.
Monitor the Hadoop cluster with Ambari, Nagios, and Ganglia.
Manage cluster resources and multiple concurrent workloads to maintain availability and SLAs.
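As one illustration of the monitoring duty, a cron-style health check might parse `hdfs dfsadmin -report` output and alert on dead DataNodes. This is a minimal sketch; the embedded sample report is hypothetical stand-in text so the script runs without a live cluster (in production, `report` would be populated from the real command).

```shell
#!/bin/sh
# Hypothetical sample; on a real cluster use: report=$(hdfs dfsadmin -report)
report='Live datanodes (3):
Dead datanodes (1):'

# Extract the dead-node count from the report's summary line.
dead=$(printf '%s\n' "$report" | sed -n 's/^Dead datanodes (\([0-9]*\)).*/\1/p')

if [ "${dead:-0}" -gt 0 ]; then
    echo "ALERT: $dead dead datanode(s)"   # hook for Nagios/email notification
else
    echo "OK: all datanodes live"
fi
```

With the embedded sample, the check raises an alert for the one dead node; the echo lines are the natural place to hook in Nagios passive checks or email.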
Write Linux shell scripts to automate tasks.
Create scripted workflows and work with Pig/Hive developers to build automated production ETL processes.
Implement cluster backups and manage production data deployments between data centers.
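A cross-data-center backup of the kind described above is typically scripted around DistCp. A minimal sketch follows; the NameNode hosts and paths are hypothetical placeholders, and a dry-run guard lets the script be exercised without a live cluster.

```shell
#!/bin/sh
# Hypothetical source and destination clusters in two data centers.
SRC="hdfs://dc1-namenode:8020/data/warehouse"
DST="hdfs://dc2-namenode:8020/backup/warehouse"

# -update copies only files that changed; -p preserves file status.
CMD="hadoop distcp -update -p $SRC $DST"

# Dry-run guard (default on) so the sketch runs without a cluster.
if [ "${DRY_RUN:-1}" -eq 1 ]; then
    echo "would run: $CMD"
else
    $CMD
fi
```

Scheduled from cron with `DRY_RUN=0`, this gives an incremental nightly sync; DistCp runs as a MapReduce job, so copy bandwidth scales with the cluster.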
Implement and maintain security policies in the Hadoop and Linux environments.
Research the latest developments in the Hadoop open-source platform and recommend solutions and improvements.
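One concrete shape the security-policy duty can take is locking down HDFS directories with permissions and ACLs. A minimal sketch, with a hypothetical path and group name; the commands are echoed as a dry run rather than executed.

```shell
#!/bin/sh
# Hypothetical directory and group for illustration.
DIR="/data/finance"
GROUP="analysts"

# Restrict the directory to owner and owning group, then grant a second
# group read/traverse access via an HDFS ACL entry.
CHMOD_CMD="hdfs dfs -chmod 750 $DIR"
ACL_CMD="hdfs dfs -setfacl -m group:$GROUP:r-x $DIR"

# Dry run: print the commands; remove the echos (or pipe to sh) to apply.
echo "$CHMOD_CMD"
echo "$ACL_CMD"
```

Note that ACL support requires `dfs.namenode.acls.enabled=true` in hdfs-site.xml.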
Evaluate tools and technologies to improve cluster performance and ETL processing.
Install, configure, and administer Linux cloud-based servers running in Amazon Web Services.
Create documentation on cluster architecture, configuration, and best practices.
Work with Hortonworks support team to investigate and resolve tickets for cluster or developer issues.
Opportunity to contribute to Hadoop architecture, Hive ETL development, and data QA, depending on candidate background.
Qualifications & Skills
BS in Computer Science or related field, or equivalent experience.
Must have strong hands-on experience with Hadoop administration using Ambari and other tools.
Must be intimately familiar with the many configuration options available to Hadoop administrators, and their impact on cluster performance and stability.
Strong Linux administration and shell scripting experience.
Understanding of HDFS, YARN, and MapReduce concepts, and of the distributed Hadoop 2.x architecture.
Development experience with Pig, Hive, or Spark is a huge plus.
Experience with real-time streaming workflows using Kafka, Spark, or Storm is a plus.
Development experience with Java, Python, or other general-purpose programming languages is a plus.
Experience with AWS or other cloud service providers is a plus.
Responsible team member able to take ownership, independently driving projects and resolving issues without micromanagement.
Ability to learn quickly and a passion for experimenting and staying current with the latest technology trends and best practices.
Possess strong communication and collaboration skills.