We are looking for a Site Reliability Engineer (SRE) to join the IoT Program of a major Telco company. This person will help build, support, and manage the activities needed to implement best practices.
Key Responsibilities
- Participate in the solution definition to ensure its operability
- Ensure the solution's resilience, acting as the single point of contact (SPOC) within the team
- Review monitoring KPIs and logging efficiency, and introduce new tools to make the solution more reliable
- Work with developers during the software development lifecycle to ensure that developed services are operationalized
Skills and Experience
- Experience in application reliability practices for client-facing (internal and external) experiences
- Experience with Environments & Infrastructure (Unix/Linux)
- Experience with Cloud (AWS, Oracle, Azure)
- Experience with Containers (Docker, Kubernetes)
- Solid experience in the following activities:
- Experience working in Operations
- Experience in business/technical assessments of solution life-cycle asset management processes