Design, develop, construct, install, test, and maintain the complete data management & processing systems.
Contribute to the continuous optimization of the Big Data Platform and related infrastructure, network, database, and middleware capabilities to support and enable the development and operations of Data modules and solutions.
Discover opportunities for data acquisition and explore new ways of using existing data.
Contribute to improving data quality, reliability & efficiency of the whole system.
Create data models to reduce system complexity and hence increase efficiency & reduce cost.
Monitor, maintain, and support the operational capacity, availability, and performance of the Big Data Platform solutions against SLAs from a level-two and level-three support perspective.
Contribute to technical discussions with Teradata Platform service providers (on-/off-prem) to understand forecast and right-sizing impacts for short-, mid-, and long-term capacity and performance requirements, iterating regularly.
Experience with Hadoop Platform & Ecosystem at Enterprise scale.
Proven experience with:
o HDFS
o YARN
o MapReduce
o Analytical techniques/models incl. Machine Learning, Data Modelling, and Visualisation
At least two analytical programming and scripting languages, e.g. SQL, Python, Linux shell scripting, Java.
Data-driven mindset, evidenced through strong creativity, analysis, and problem-solving skills. Utilise data/visualisation techniques to illustrate issues and challenges.
Excellent knowledge of administering Big Data systems in a production environment.
Cloudera Certified Hadoop Administrator/Developer or comparable qualifications
Experience working closely in a team/squad/tribe using agile methodologies (Scrum and/or Kanban), practising DevOps and Continuous Delivery / Integration.