An employer is looking for two AI Infrastructure Engineers. You will help this enterprise telecom company with a companywide AI infrastructure deployment effort for a new cloud platform. You will work cross-functionally with engineering teams to integrate and deploy AI functions as they build new data centers, virtualized environments, and cloud services.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to
[email protected].
To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy:
https://insightglobal.com/workforce-privacy-policy/ .
- Expert understanding of AI/ML infrastructure components or GPU-based systems, preferably in a high-availability, large-scale environment.
- Hands-on experience with NVIDIA DGX servers, BasePOD architectures, and advanced GPU technologies.
- Proficient in Linux/UNIX environments, including scripting/automation tools (Bash, Python, Ansible, Terraform).
- Understanding of AI infrastructure security best practices.
- Experience with container orchestration (Kubernetes, Docker) and GPU workload management tools.
- Strong knowledge of networking (InfiniBand/Ethernet) and storage solutions in AI/ML contexts.
- Understanding of CI/CD pipelines using tools such as Git, Artifactory, Jenkins, etc.
- Experience with AI/ML pipelines (PyTorch, TensorFlow, RAPIDS AI, or other deep learning frameworks).
- Experience with configuring and using monitoring tools (e.g., Prometheus, Grafana, NVIDIA DCGM).
Benefit packages for this role will start on the 31st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.