Pure Storage acquired Portworx in October 2020, creating the industry's most complete Kubernetes data services platform for cloud native applications. This acquisition is Pure's largest to date and deepens our expansion into the fast-growing market for multi-cloud data services that support Kubernetes and containers.
SHOULD YOU ACCEPT THIS CHALLENGE...
Combine software and systems engineering to work on the design, construction, and optimization of large-scale infrastructure for Portworx's cloud services. SRE is responsible for building and running large-scale, distributed, fault-tolerant systems, ensuring that our systems have the reliability and uptime appropriate to users' needs and a fast rate of improvement.
Responsibilities
Review product design strategies to ensure that products prioritize scalability and reliability from conception.
Build relationships and work with our Software Engineer partners to build reliability into the products.
Engage with product engineering teams to gain a comprehensive understanding of their infrastructure use cases. Communicate design trade-offs effectively and construct scalable systems to meet their unique needs.
Develop advanced tooling to automate the build and deployment of microservices and infrastructure components, enhancing efficiency and productivity.
Proactively identify bottlenecks in the daily usage of core infrastructure and implement robust solutions to resolve them.
Keep an ever-watchful eye on our systems' capacity and performance.
Reduce manual labor and increase operational efficiency through automation.
Monitor the infrastructure to alert on significant events, ensuring the highest level of system performance and reliability.
Provide feedback and suggestions to help improve processes around DevOps, agile, and CI/CD.
Actively participate in retrospectives to help the team improve.
Write high-quality code along with the unit tests and automation tests to exercise that code.
Be an integral part of the team's on-call rotations.
Requirements
[Must have] Hands-on experience in designing and building infrastructure to support large-scale, fault-tolerant distributed services.
[Must have] Strong experience with cloud infrastructure platforms like AWS, Azure, or Google Cloud.
[Must have] Expertise in administering, operating, and configuring Kubernetes.
[Must have] Proficiency in a programming language (Golang preferred).
[Must have] High level of proficiency in Infrastructure as Code and Configuration Management tools like Terraform.
[Good to have] Proficiency in various monitoring tools such as Prometheus, Grafana, CloudWatch, and Thanos.
[Good to have] Strong background in cloud security, Kubernetes security, and application security.
[Good to have] Proficiency in debugging issues involving networks, DNS, HTTP, Linux, and containers.
[Good to have] Experience in algorithms, data structures, analysis, and software design, and/or in Unix/Linux systems, IP networking, and performance and application issues.