What are the responsibilities and job description for the DevOps/SRE - Allen Park, MI position at N2 Services Inc?
Job Description
Position: DevOps/SRE
Location: Allen Park, MI
Job Type: W2
Need: Automotive industry experience
Job Description:
We’re passionate about building software that solves problems. We count on our Site Reliability Engineers (SREs) to empower our users with a rich feature set, high availability, and stellar performance as they pursue their missions. We are currently seeking an engineer with public cloud experience to plan, design, and implement next-generation cloud infrastructure solutions. The Cloud Engineer will be part of the Engineering team, and the role requires strong knowledge of application monitoring, infrastructure monitoring, automation, maintenance, and service reliability improvements. Specifically, we are searching for someone who brings fresh ideas, demonstrates a unique and informed viewpoint, and enjoys collaborating with a cross-functional team to develop real-world solutions and positive user experiences at every interaction.
Skills Required:
• Design, automate and manage a highly available and scalable cloud deployment that allows development teams to deploy and run their services.
• Collaborate with engineering and architecture teams to evaluate and identify optimal cloud solutions, with attention to scalability, high performance, and security.
• Modernize existing on-prem solutions and improve existing systems.
• Extensively automate deployments and manage applications in Google Cloud Platform.
• Develop and maintain cloud solutions in accordance with best practices.
• Ensure efficient functioning of data storage and processing in accordance with company security policies and best practices in cloud security.
• Collaborate with Engineering teams to identify optimization strategies and help develop self-healing capabilities. Experience in developing strong observability capabilities.
• Identify, analyze, and resolve infrastructure vulnerabilities and application deployment issues. Regularly review existing systems and make recommendations for improvements.
Experience Required:
• Proven work experience in designing, deploying, and operating mid- to large-scale public cloud environments.
• Professional Certification is an advantage.
• Public cloud: Google Cloud Platform experience is good to have.
• Proven work experience in Docker/Kubernetes (image building, k8s scheduling).
• Experience in package, configuration, and deployment management via Helm, Kustomize, and ArgoCD.
• Proven work experience in provisioning Infrastructure as Code (IaC) using Terraform Enterprise or community edition.
• Proven work experience in writing custom Terraform providers/plug-ins with Sentinel Policy as Code.
• Strong knowledge of GitHub and DevOps (Tekton is an advantage).
• Should be proficient in scripting and coding, including traditional languages like Python, Golang, Java, JS, and Node.js.
• Extensive knowledge and hands-on experience in Grafana and Prometheus micro libraries.
• Exposure to Cloud Monitoring and logging.
• Experience with distributed storage technologies like NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks (Mesos, Kubernetes, YARN).
• Experience with automation tools is a priority.