DT One is currently seeking a talented Cloud Infrastructure Engineer based in Dubai with experience supporting a global company. A career with DT One provides invaluable experience in an exciting and rapidly expanding market and an opportunity to be part of a truly global company with various offices worldwide and a workforce that includes dozens of different nationalities.
Reporting to the Head of Infrastructure, Network & Security, the ideal candidate is a highly driven, self-motivated, technically hands-on individual who is truly excited about creating meaningful impact and willing to build and lead a small team of engineers. In this role you will combine a startup mindset with the scale of an industry leader, gaining hands-on exposure to how key organizational decisions are made and to the challenges of operating and securing critical cloud infrastructure and services.
The Infrastructure, Network & Security team is part of the Engineering division and is in charge of the overall infrastructure (provisioning, cost, reliability, business continuity, disaster recovery, backup strategy, … hosted mainly on AWS Cloud), databases, network (VPCs, IPSec VPN, Layer 3 & 4 routing), monitoring (SRE), incident management and security (all layers of the stack, patching, vulnerability scanning, IDS/IPS/SIEM). Working closely with the entire Engineering team, the team identifies needs, weaknesses and risks and defines an action plan to bring the platform to the next level using the latest tools and technologies.
Key Role Responsibilities
Manage, maintain, upgrade and monitor the critical infrastructure of DT One in a highly available environment to achieve an SLA of 99.99%+ availability
Deploy and manage Cloud infrastructure to serve business needs, optimizing performance and cost in a highly available environment
Simplify and automate the provisioning of the platform to support the engineering team with their requirements and needs
Work closely with the rest of the Engineering team to design and architect the platform
Perform maintenance and system upgrades including patches, hot fixes, configuration updates, backups, … to keep resources current and secure
Employ multiple patching strategies: patch and build new AMIs for cloud-aware applications that can be easily restarted, and resort to in-place patching for the rest
Design the backup and restoration strategy, and the business continuity plan in the event of a failure to protect the business
Build and maintain both Unix and Linux systems to provide critical infrastructure services such as FTP/SFTP, NFS, DNS, SMTP and proxy services
Implement relevant KPIs and metrics to assess and track the performance of the platform and systems (Infrastructure Reliability Engineering)
Identify risks and weaknesses in the infrastructure early on and ensure they are addressed before they become actual problems
Work within established configuration and change management policies to ensure awareness, approval and success of changes made to the infrastructure
Maintain and support all enterprise monitoring technologies and establish associated policies governing both advanced notifications and escalation procedures
Create and maintain clear and accurate system and process documentation
Configure logging and monitoring based on best practices
Set up, monitor, correlate and investigate alerts to detect and resolve incidents
Keep up to date with trends and innovation in engineering, including containers and orchestration, serverless and other programming paradigms, microservices, DevSecOps/DevOps/SRE, etc.
Key Requirements
Degree in Computer Science or equivalent
5+ years of experience in a similar role
3+ years of experience supporting and securing large scale and critical systems and APIs in production
Strong experience with AWS Cloud infrastructure management and related services
Experience installing, configuring and maintaining services such as BIND, Squid, Apache, MySQL and HAProxy in a Linux/Unix environment
Ability to utilize a scripting language (e.g. Bash, Perl, Python) to automate regular tasks and processes
Experience designing networks, systems and application architectures
Strong hands-on understanding and experience of Linux administration, the command-line interface and shell scripting
Strong understanding of application protocols (e.g. DNS, SSH, HTTPS, SFTP, SMTP) and their behaviors across network environments
Experience supporting the following technology stack and services: AWS, Terraform, Ansible, Docker, HAProxy, Nginx, ELB/ALB, ELK, Prometheus, Grafana, ECS/EKS/Kubernetes, Fluentd, Elasticsearch
Experience in designing, integrating and developing web services and APIs in the cloud
A strong multi-tasker with a keen eye for detail
Strong analytical and problem-solving skills and willingness to investigate