Prasad Kavuri

Head of AI Engineering · AI/ML Executive Leader

Agentic AI · LLM Platforms · Applied AI Strategy · Global Engineering Leadership

Visionary AI/ML executive with 20+ years driving transformative technology strategies across enterprise platforms and AI ecosystems. Currently leading India's first Agentic AI platform (Kruti.ai) at Krutrim, achieving 50% latency reduction and 40% cost savings.

20+ Years Experience
200+ Engineers Led
13K+ B2B Customers
70% Cost Reduction

AI-Powered Tools

Live demos showcasing agentic AI, semantic search, and open-source LLM orchestration.

🚀 Live
RAG Pipeline

Real retrieval-augmented generation with Transformers.js embeddings and ChromaDB, running entirely in your browser.

Transformers.js · ChromaDB · nomic-embed-text

👥 Upgrading
Multi-Agent System

CrewAI-powered agents with real LLM calls via Groq: Analyzer, Researcher, and Strategist collaborating in real time.

CrewAI · Groq · Llama 3.3

🔄 Live
LLM Router

Real multi-model routing across Llama 3.1 8B, 70B, and Mixtral. See live latency, cost, and quality trade-offs.

Groq · Multi-model · Live latency

🔎 Live
Vector Search

Semantic search with real sentence-BERT embeddings and UMAP visualisation of the embedding space.

all-MiniLM-L6-v2 · UMAP · Cosine similarity

🤖 Live
AI Portfolio Assistant

Streaming RAG-powered assistant over my experience. Ask anything about my background and see the retrieved context.

Vercel AI SDK · Streaming · RAG

📄 Live
Resume Generator

Paste a job description and get a tailored resume with skill-matching scores and selection reasoning.

JD parsing · Skill matching · PDF export

🎭 Live
Multimodal Assistant

Florence-2 image captioning and OCR running in-browser via Transformers.js, with no server and no API key.

Florence-2 · WebGPU · In-browser

⚡ Live
Model Quantization

Live ONNX benchmark comparing INT8 vs FP32 inference: real file sizes, real latency, real quality diff.

ONNX · INT8 vs FP32 · Transformers.js

🔌 Live
MCP Tool Demo

Model Context Protocol in action: watch an LLM discover and call tools to answer questions about Prasad's background.

MCP · Tool Use · Groq API

Core Expertise

Deep expertise across AI/ML leadership, cloud infrastructure, executive leadership, and industry-specific domains.

🤖 AI/ML Leadership
Agentic AI Architecture · Multi-Model LLM Orchestration · Retrieval-Augmented Generation (RAG) · Vector Search & Embeddings · Transformer Models & BERT · Real-time Personalization · AI Agent Development · CrewAI · LangGraph · LLM Ops & MLOps
โ˜๏ธ
Cloud & Infrastructure
AWSAzureGCPKubernetes & MicroservicesCI/CD & DevOpsInfrastructure as CodeCloud-Native ArchitectureAPI Platform DevelopmentPaaS & Platform EcosystemsSDK/API Integration24/7 Production Systems
👔 Executive Leadership
Global Team Leadership (200+ engineers) · P&L Management · Executive Stakeholder Management · Strategic Vendor Partnerships · Cost Optimization · Cross-functional Collaboration · AI Governance & Ethics · Digital Transformation
๐Ÿข
Industry Expertise
Autonomous SystemsComputer VisionMobility & TransportationMapping & GISAutomotive TechnologyFleet ManagementEnterprise AI IntegrationB2B SaaS Platforms

Experience Highlights

20+ years building AI platforms, leading global engineering teams, and driving transformative technology strategies.

Head of AI Engineering
March 2025 - Present

Krutrim

Naperville, IL

  • Architected India's first Agentic AI platform (Kruti.ai) with 200+ engineers
  • Delivered 50% latency reduction and 40% cost savings through multi-model LLM orchestration
  • Built RAG pipelines, vector search, and real-time personalization capabilities
  • Launched domain-specific AI agents for cab booking, food ordering, bill payments, image generation
  • Built enterprise-grade 24/7 PaaS capabilities with SDK/API integration
  • Leading enterprise adoption of agentic AI across engineering and product workflows
Agentic AI · LLM Orchestration · RAG · Vector Search · PaaS

Senior Director of Engineering
September 2023 - February 2025

Ola

Naperville, IL

  • Launched Ola Maps B2B platform acquiring 13,000+ enterprise customers
  • Reduced infrastructure costs by 70% via cloud-native roadmap
  • Scaled to millions of daily API calls
  • Introduced AI-powered real-time route optimization boosting ETA accuracy
  • Led 150+ engineers across US and India
  • Negotiated strategic vendor partnerships for electric mobility adoption
B2B Platform · Cloud-Native · AI Route Optimization · Fleet Management

Head of Infrastructure and Services
May 2023 - September 2023

HERE Technologies

Chicago, IL

  • Led large-scale engineering programs in safety-critical regulated environment
  • Directed global engineering for AI/ML infrastructure enabling autonomous driving
  • Led global team building core infrastructure for ML/AI products
Infrastructure · AI/ML · Autonomous Driving

Director of Engineering - Highly Automated Driving
July 2021 - June 2023

HERE Technologies

Chicago, IL

  • Delivered AI-enhanced HD mapping and lane-level automation systems
  • Managed global team of 85+ engineers across North America, Europe, APAC
  • Championed AI/ML advancements in automated driving improving map precision
  • Supported major OEM autonomous driving platforms
Autonomous Driving · HD Mapping · Global Teams · OEM

Let's Connect

Interested in exploring AI strategies, collaborating on engineering challenges, or discussing how agentic AI can transform your business?

Portfolio

https://prasadkavuri.com