Voice AI Framework
Twilio
Deepgram
vLLM
AWS
Built a high-concurrency telephony agent achieving sub-500ms latency. Integrates interruptibility, tool-use for scheduling, and custom fine-tuned Qwen-3 models for domain accuracy.
Transforming Ideas into Reality: One Line of Code at a Time. Let's Collaborate, Innovate, and Shape the Future of AI Together.
Years of Experience
LeetCode Problems
YouTube Videos
Hi, I'm Mohammed Warish, an AI Engineer specializing in building sub-500ms voice agents and multi-agent systems. Currently at Quanteon, I focus on the hard engineering problems of Generative AI: reducing latency, ensuring context retention, and deploying scalable Dockerized inference pipelines.
Name
Warish
Age
23
Experience
1+ years
Phone
9991463786
State
Haryana
Designed and developed a Voice AI Agent Framework, deploying specialized agents including Appointment Schedulers, Prior Auth Agents, and Interviewer Agents to automate healthcare and recruitment workflows.
Deployed on AWS EC2 with Docker, achieving sub-500ms overall latency for vLLM-based inference and enabling auto-scaling worker pools.
Engineered a HIPAA-compliant STT–LLM–TTS pipeline utilizing Deepgram, Azure OpenAI, and ElevenLabs for natural conversational experiences.
Fine-tuned Qwen 3 on 1.2K call transcripts for domain adaptation, significantly improving tool calling and information extraction.
Engineered advanced time series forecasting and anomaly detection models to identify and diagnose system irregularities, and provided root cause analysis for better operational transparency and proactive issue resolution.
Developed multi-agent AI systems leveraging LangGraph, DeepSeek, and RAG to deliver real-time summarization and analytics from anomaly data, transforming LLMs into powerful analytical copilots for business intelligence.
Developed a video-based quality assurance system for a chip manufacturing process using YOLO, integrated with a case management system to automatically raise alerts on startkit/material delays. Event summaries were stored and processed to enable text-to-query analysis of video incidents using OpenAI.
Implemented "known-unknown" activity detection systems for surveillance video using unsupervised techniques, enabling real-time alerts on anomalous behaviors in dynamic environments.
Developed an AI Therapist Agent using LLaMA2 and OpenAI models to assist individuals experiencing trauma, offering contextual emotional support and shifting perspectives from negative to positive through guided conversations.
Developed a comprehensive customer and lab partner application using React, Zustand, Cloudinary, Firebase, Node.js, TypeScript, MongoDB, and REST, enhancing user experience by 10%.
Integrated Firebase reCAPTCHA and Google login, improving security and reducing unauthorized access.
Reduced server load by 20% using Cloudinary for efficient file uploads. Developed over 40 REST APIs for functionalities including Order Booking, Wallet Management, Patient Report Card, and Reviews.
Implemented Google Maps APIs and a secure Razorpay payment gateway, streamlining transactions.
Designing and deploying end-to-end AI/ML systems for real-world problems, including anomaly detection, predictive analytics, and intelligent automation using state-of-the-art models and frameworks.
Building advanced multi-agent applications using Large Language Models (LLMs) like GPT, LLaMA, and DeepSeek, including multi-agent chatbots, RAG pipelines, and text-to-query analytics for knowledge extraction and automation.
Crafting robust computer vision pipelines for video anomaly detection, action recognition, quality assurance, object tracking, and visual prompting using YOLO, 3D CNNs, ResNet, VGG16, triplet networks, and OpenCV.
GROQ
RAG
FAISS
LangChain
AI-powered project management system that automates task generation, prioritization, context-aware task suggestions, optimal team configurations, and dynamic report generation with actionable insights.
Python
NumPy
Pandas
scikit-learn
Flask
Power BI
Developed a machine learning model to analyze and predict churn in the Telecom Industry, aiming to improve customer retention and boost revenue. Implemented using Python, with a Power BI dashboard for visualization and deployment via Flask.
React
Redux
MongoDB
GraphQL
Swagger
Local Job Employment Portal helps local people find daily-wage work. Anyone can register as an electrician, plumber, or labourer, and anyone in the area can hire them to get their work done.
React
Node
Razorpay
Charts
Redux
LearnWiz is a subscription-based e-learning platform that offers innovative, expert-led courses to help students learn faster, delivering premium courses at affordable prices.
JS
Lodash
HTML
CSS
Weather app designed to provide real-time weather information for a specified location. The app lets users switch between temperature units and offers an optional geolocation feature.
For any query, feel free to contact me. I will get back to you as soon as possible.
9991463786
khanwarish483@gmail.com
Nuh, Haryana 122104