Job Description
Description:
We are seeking a Senior DevOps Engineer to join the Salesforce AI Research Incubation Team. In this role, you will be responsible for designing, implementing, and maintaining cloud infrastructure and CI/CD pipelines to support AI research and development. You will ensure the reliability, scalability, and security of our AI-driven applications through automation, containerization, and infrastructure as code (IaC).
The ideal candidate has extensive experience with AWS and GCP (DNS, VMs, Kubernetes, networking, firewalls), as well as strong expertise in CI/CD, Docker, Kubernetes, Helm, Terraform, Python, and shell scripting.
Key Responsibilities
· Design, implement, and manage cloud infrastructure (AWS, GCP) including networking, security, and compute resources.
· Develop and maintain CI/CD pipelines to automate deployment and testing of AI models and applications.
· Build, manage, and optimize Kubernetes clusters for deploying AI services and research applications.
· Implement infrastructure as code (IaC) using Terraform and Helm to ensure repeatable and scalable deployments.
· Automate system operations and monitoring using Python and shell scripting.
· Ensure security best practices across cloud environments, including firewall and access control management.
· Troubleshoot infrastructure issues and optimize system performance.
· Collaborate with AI researchers and software engineers to streamline model deployment and integration.
· Manage databases (SQL and NoSQL), including provisioning, performance tuning, and backup strategies.
· Ensure database security, replication, and high availability across cloud environments.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Required Skills & Experience
Required Qualifications
· Bachelor’s degree in Computer Science, Software Engineering, or a related field.
· Experience with AI/ML model deployment and pipeline automation.
· 3+ years of experience in DevOps, cloud infrastructure, or site reliability engineering.
· Strong experience with AWS and GCP, including DNS, VM management, networking, Kubernetes, and firewall security.
· Proficiency in CI/CD pipeline development and automation (GitHub Actions, Jenkins, GitLab CI/CD, etc.).
· Expertise in Docker, Kubernetes, and Helm for container orchestration and deployment.
· Hands-on experience with Terraform for infrastructure provisioning and management.
· Strong Python and shell scripting skills for automation.
· Solid understanding of networking, security best practices, and cloud monitoring tools.
· Excellent troubleshooting and problem-solving skills.
Nice to Have Skills & Experience
Preferred Qualifications
· Knowledge of logging and monitoring tools (Prometheus, Grafana, ELK stack, etc.).
· Familiarity with serverless computing and cloud-native application design.
· Contributions to open-source DevOps tools or frameworks.
· Experience with Salesforce Falcon is a plus.
Benefit packages for this role will start on the 1st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401(k) retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.