Job Description
We’re looking for a junior full stack engineer who is excited about building data-driven and AI-powered applications end-to-end. You’ll work across frontend + backend + data + ML integration, helping ship real features such as search, enrichment, classification, and analytics—primarily on Google Cloud Platform (GCP) using a mix of open-source and cloud-native tools.
Day-to-Day Responsibilities
Build and maintain AI-enabled web apps (UI + APIs) that use data and ML/LLM capabilities.
Develop backend services (e.g., FastAPI/Flask/Node) to serve AI predictions, search, and enrichment workflows.
Work with structured and semi-structured data (CSV/JSON/Parquet), write clean SQL, and support data pipelines.
Integrate models and libraries from Hugging Face (Transformers, Datasets, Tokenizers) and common ML tooling.
Implement and test ML inference pipelines (classification, similarity search, reranking, embeddings, evaluation metrics).
Connect systems with GCP services (e.g., BigQuery, GCS, Pub/Sub, Cloud Run), following best practices.
Support vector search / hybrid search integrations where needed (embeddings + keyword search).
Add logging, monitoring, and basic performance improvements (latency, batching, retries).
Write unit tests, document APIs, and collaborate via Git PRs and code reviews.
Typical Tech Stack (What you may touch)
Frontend: React + TypeScript (or similar)
Backend: Python (FastAPI/Flask), REST, background workers
Data: BigQuery, GCS, pandas, SQL, Parquet
AI/ML: scikit-learn, Hugging Face Transformers, embeddings, evaluation tooling
Cloud (GCP): Cloud Run, Pub/Sub, Secret Manager, IAM basics
DevOps: Docker, GitHub Actions/Cloud Build (nice to have)
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Required Skills & Experience
1–3 years of experience.
MS/BE in Computer Science, Data Science, Engineering, or equivalent practical experience.
Demonstrable personal/academic projects in AI + data + web apps are highly valued.
Programming & Fundamentals
Strong Python fundamentals (data structures, OOP, debugging, clean code).
Solid understanding of REST APIs, JSON, authentication basics, and integration patterns.
Good working knowledge of SQL (joins, aggregations; window functions are a plus).
Data & ML Basics
Hands-on experience with scikit-learn (training, evaluation, feature basics, metrics like precision/recall/F1).
Familiarity with common ML workflows: train/validate/test, leakage awareness, basic tuning concepts.
Experience working with data using pandas and NumPy.
Hugging Face / LLM & Embeddings Awareness
Basic familiarity with Hugging Face Transformers (loading a model, tokenization, inference).
Awareness of embeddings, semantic similarity, and vector search concepts (what they are, why they’re used).
Cloud & Deployment (GCP-first)
Exposure to GCP services such as GCS and BigQuery (or willingness to learn quickly).
Ability to containerize and run services using Docker; familiarity with Cloud Run is a strong advantage.
Full Stack / Product Engineering
Experience with at least one modern frontend stack: React (preferred) or Angular/Vue.
Ability to build simple, clean UIs to interact with APIs (forms, tables, filters, pagination).
Engineering Practices
Git workflow (branching, PRs), basic testing mindset, and clear documentation habits.
Nice to Have Skills & Experience
Experience with Vertex AI (model endpoints, embeddings, pipelines) or any managed ML platform.
Familiarity with vector databases and search stacks (FAISS, Elasticsearch/OpenSearch, LanceDB, pgvector, etc.).
Knowledge of hybrid search / reranking (BM25 + embeddings, Cross-Encoder rerankers, evaluation approaches).
Experience with Pub/Sub event-driven patterns and batch/stream processing.
Understanding of MLOps basics: model versioning, reproducibility, experiment tracking (MLflow, W&B).
Basic monitoring/observability: logs/metrics/traces (Datadog, Cloud Monitoring).
Exposure to CI/CD (GitHub Actions, Cloud Build) and infrastructure basics (IAM, service accounts).
Comfort working with Parquet, partitioning strategies, and performance-minded data handling.
Familiarity with LangChain or similar orchestration frameworks (optional).
Benefit packages for this role will start on the 1st day of employment and include medical, dental, and vision insurance, as well as HSA, FSA, and DCFSA account options, and 401k retirement account access with employer matching. Employees in this role are also entitled to paid sick leave and/or other paid time off as provided by applicable law.