About Quantiphi
Quantiphi is an award-winning, AI-First digital engineering and consulting company focused on delivering high-impact services and solutions that help organizations solve what truly matters. We partner with enterprises to reimagine their businesses through intelligent, scalable, and transformative AI that drives measurable outcomes at the very core of their operations.
Since our founding in 2013, Quantiphi has tackled some of the world’s most complex business challenges by combining deep industry expertise, disciplined cloud and data engineering practices, and cutting-edge applied AI research. Our work is rooted in delivering accelerated, quantifiable business value, not just technology for technology’s sake.
Headquartered in Boston, Quantiphi is a global organization with 4,000+ professionals serving clients across key industry verticals including BFSI, Healthcare & Life Sciences, Consumer Goods, Manufacturing, and Technology, Media & Entertainment.
As an Elite and Premier partner to leading cloud and AI platforms such as NVIDIA, Google Cloud, AWS, and Snowflake, we build and deliver enterprise-grade AI services and solutions that create real-world impact.
We’ve been recognized with:
● 17x Google Cloud Partner of the Year awards in the last 8 years
● 3x AWS AI/ML award wins
● 3x NVIDIA Partner of the Year titles
● 2x Snowflake Partner of the Year awards
● Top analyst recognitions from Gartner, ISG, and Everest Group
● Industry-leading AI accelerators across Generative AI, Agentic AI, Data, and Cloud
We have also been certified as a Great Place to Work for multiple consecutive years.
Be part of a trailblazing team that’s shaping the future of AI, ML, and cloud innovation.
Your next big opportunity starts here!
Role: ML / AI Engineer – AI Platforms and Agentic Systems
Experience Level: 7+ Years
Work Location: Open / Hybrid
Job Summary
We are looking for a highly skilled ML / AI Engineer to work at the intersection of Data Platforms and AI within a large-scale healthcare modernization program.
In this role, you will focus on building and scaling AI model training environments, embedding and annotation pipelines, semantic search platforms, and agentic workflows that support advanced Models & Insights delivery. You will play a key role in taking AI and ML assets from experimentation through to production deployment, primarily on GCP Vertex AI, while also supporting hybrid and multi-cloud environments involving Azure ML and Azure OpenAI.
The ideal candidate will have strong experience in LLM engineering, RAG systems, vector search, model fine-tuning, and healthcare-focused AI use cases.
Roles & Responsibilities
● Design, develop, and deploy AI/ML solutions on GCP Vertex AI including model training, tuning, deployment, and monitoring.
● Build and orchestrate end-to-end ML workflows using Vertex AI Pipelines, Kubeflow Pipelines, and MLOps best practices.
● Fine-tune and optimize LLMs using techniques such as LoRA, QLoRA, Prefix Tuning, supervised fine-tuning, RLHF, and RLAIF.
● Develop and productionize Retrieval-Augmented Generation (RAG) systems including document ingestion, chunking, embeddings, vector storage, retrieval, reranking, and response generation.
● Build embedding pipelines using Vertex AI embedding models and open-source frameworks while managing vector infrastructure at scale.
● Design and optimize semantic search and hybrid search systems using dense retrieval, sparse retrieval, keyword search, reranking, and ranking models.
● Build and orchestrate agentic AI workflows using frameworks such as LangChain, LlamaIndex, LangGraph, CrewAI, or AutoGen.
● Design agent tools and APIs for Text-to-SQL, vector search, knowledge graph traversal, event-driven workflows, and enterprise data access.
● Build memory and state management systems for multi-turn AI interactions using vector stores, Redis, and conversation state persistence strategies.
● Collaborate closely with data engineers to build training datasets, annotation pipelines, feature stores, and model input/output schemas.
● Implement model evaluation and benchmarking frameworks using BLEU, ROUGE, RAGAS, Vertex AI Experiments, MLflow, and custom evaluation harnesses.
● Support production deployment of AI services using CI/CD pipelines, model registries, versioning strategies, and model monitoring solutions.
● Work closely with healthcare stakeholders to build solutions involving clinical NLP, de-identified datasets, FHIR data, and HIPAA-compliant AI workflows.
● Participate in client-facing architecture discussions and clearly explain AI trade-offs, model performance, and implementation approaches to technical and non-technical audiences.
Required Skills & Qualifications
● 7+ years of experience in ML/AI Engineering, with at least 2 years focused on LLMs, Generative AI, RAG systems, or agentic workflows in production environments.
● Strong hands-on expertise with GCP Vertex AI including Model Garden, Vertex AI Training, Pipelines, Endpoints, Feature Store, Vector Search, and Model Monitoring.
● Experience with model serving, autoscaling endpoints, ONNX, TensorRT, quantization, and latency optimization strategies.
● Proven experience fine-tuning LLMs such as Gemini, PaLM, Llama, Mistral, or similar open-source models.
● Strong understanding of embeddings, vector databases, semantic search, reranking, and context management techniques.
● Hands-on experience with agentic AI frameworks such as LangChain, LlamaIndex, LangGraph, CrewAI, or AutoGen.
● Strong knowledge of prompt engineering, prompt versioning, structured output prompting, prompt injection defense, and prompt registries.
● Proficiency in Python and AI/ML frameworks including PyTorch, TensorFlow, Hugging Face, scikit-learn, NumPy, Pandas, and Polars.
● Experience with FastAPI, gRPC, streaming APIs, WebSockets, and real-time AI inference patterns.
● Familiarity with Azure ML, Azure OpenAI, Prompt Flow, and hybrid cloud deployment strategies is preferred.
● Experience with distributed training strategies including multi-GPU, TPUs, DeepSpeed, and large model optimization is a plus.
● Strong understanding of MLOps practices including CI/CD for models, model registries, versioning, promotion gates, and monitoring for drift and degradation.
● Experience with NLP pipelines including named entity recognition (NER), relation extraction, document classification, or clinical NLP use cases.
● Familiarity with healthcare AI, FHIR-based data, HIPAA-compliant workflows, or de-identified clinical datasets is highly preferred.
● GCP Professional ML Engineer certification is preferred.
● Strong communication, collaboration, and problem-solving skills with the ability to work across technical and business teams.
What's in It for You
● Be part of one of the fastest-growing AI-first digital transformation and engineering companies in the world.
● Work on cutting-edge AI initiatives involving LLMs, RAG, Agentic AI, search, and healthcare modernization.
● Collaborate with world-class engineers, data scientists, architects, and AI leaders across global teams.
● Gain exposure to advanced technologies across Vertex AI, Azure OpenAI, semantic search, MLOps, and distributed systems.
● Work with Fortune 500 healthcare organizations to deliver transformative AI solutions with measurable impact.