Job Description
We are looking for a mid-level AI Engineer with hands-on experience in Retrieval-Augmented Generation (RAG) systems, Small Language Models (SLMs), and distributed databases such as Google Cloud Spanner.
You will work closely with senior engineers and product teams to build scalable AI systems that integrate retrieval pipelines, language models, and distributed transactional infrastructure. This role is ideal for someone who has already built AI features in production and wants to deepen their expertise in applied GenAI systems.
Contract: Through the end of the year
What You’ll Be Working On
• Production RAG features.
• Distributed knowledge storage backed by Spanner.
• AI-powered APIs and services.
• Retrieval optimization and evaluation.
• Model cost/latency optimization.
________________________________________
Technical Skills Snapshot
Category: Skills
AI: RAG pipelines, embeddings, prompt engineering
Models: SLM/LLM integration
Database: Spanner schema design, SQL optimization
Backend: Python, APIs
Cloud: GCP
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Required Skills & Experience
• 3–5 years of software engineering experience.
• 1–2 years working with LLM or RAG-based systems.
• Strong proficiency in Python.
• Experience with:
o Embedding models and vector search
o LangChain, LlamaIndex, or similar frameworks
o API development (FastAPI/Flask)
• Experience working with Google Cloud Spanner or similar distributed SQL databases.
• Solid understanding of distributed systems fundamentals.
• Comfortable working in cloud environments (GCP preferred).
Nice to Have Skills & Experience
• Experience fine-tuning or quantizing small language models.
• Familiarity with evaluation metrics for retrieval systems (Recall@K, etc.).
• Knowledge of:
o Vertex AI
o Pub/Sub
o Dataflow
• Experience optimizing AI inference for cost and latency.
• Exposure to CI/CD pipelines.
Benefit packages for this role will start on the 1st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401(k) retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.