Live Demo · Medical Clinic AI Receptionist

Talk to the
AI Phone Agent

A live call to a production AI voice agent — custom Java SIP server, VoIP media engine, Aliyun ASR/TTS, and a local RAG pipeline. Ask it anything a clinic patient would ask.

Try asking (QUERY):

Hours
"What are your clinic hours?"
Walk-in
"Do you accept walk-ins?"
Prescription
"Can I get a refill over the phone?"
Sick Note
"I need a doctor's note for work."
Late
"I'm 15 minutes late for my appointment."
Referral
"Checking on my specialist referral."
Urgent
"My chest hurts and I can't breathe."
New Patient
"Can I register as a new patient?"
Clinic AI Line
sip:20008001@192.168.240.1

Requires microphone · Use localhost or HTTPS


More intents to try:

Greeting
"Hello, good morning!"
Command
"Transfer me to a human agent."
Command
"Can you repeat that?"
Feedback
"That's very helpful, thank you."
Inform
"My name is John, I'm a registered patient."
Chitchat
"How are you today?"

About This System

Intelligent Voice Agent for Medical Clinics

A production-grade AI phone assistant built from the ground up — custom Java SIP signalling server, VoIP media engine, Aliyun ASR/TTS, local RAG pipeline with BGE embeddings and pgvector, real-time 7-class intent classification, and sub-2s end-to-end latency.

Python / FastAPI · pgvector RAG · BGE Embeddings · Qwen LLM · Intent Classifier · Rerank Pipeline · Java SIP Server · VoIP Java Media · Aliyun ASR / TTS · Janus WebRTC

System Architecture

Phone / WebRTC
Aliyun ASR/TTS (speech ↔ text)
SIP Server (Java · custom)
VoIP Java (RTP media)
FastAPI Agent (Python)
Intent Classifier (qwen-turbo)
RAG Pipeline (BGE + pgvector)
LLM Answer (qwen-plus)

Key Technical Highlights

What Makes It Work

🧠
Intent Classification
7-class classifier (QUERY, COMMAND, INFORM, FEEDBACK, ACK, GREETING, CHITCHAT) with entity normalization and category routing.
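
As a rough illustration of the category routing described above, the sketch below models the seven intent classes as an enum and maps each one to a handler name. The handler names, the `parse_intent` fallback to QUERY, and the `route` function are hypothetical and only show the shape of the idea; they are not taken from the project's code.

```python
from enum import Enum


class Intent(Enum):
    QUERY = "QUERY"
    COMMAND = "COMMAND"
    INFORM = "INFORM"
    FEEDBACK = "FEEDBACK"
    ACK = "ACK"
    GREETING = "GREETING"
    CHITCHAT = "CHITCHAT"


# Hypothetical handler names; only QUERY is assumed to need retrieval.
ROUTES = {
    Intent.QUERY: "rag_answer",
    Intent.COMMAND: "execute_command",
    Intent.INFORM: "store_slot",
    Intent.FEEDBACK: "acknowledge",
    Intent.ACK: "acknowledge",
    Intent.GREETING: "canned_reply",
    Intent.CHITCHAT: "canned_reply",
}


def parse_intent(label: str) -> Intent:
    """Normalize a raw classifier label; fall back to QUERY if it is unknown."""
    try:
        return Intent(label.strip().upper())
    except ValueError:
        return Intent.QUERY


def route(label: str) -> str:
    """Return the (hypothetical) handler name for a classifier label."""
    return ROUTES[parse_intent(label)]
```

Falling back to QUERY for an unrecognized label is one possible design choice: it sends ambiguous turns through retrieval rather than dropping them.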
🔍
Two-Stage RAG
BGE-large embedding → pgvector ANN search → BGE-reranker cross-encoder. Fast-track for high-confidence hits. Sub-400ms retrieval.
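
The two-stage flow above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions: brute-force cosine similarity stands in for the pgvector ANN search, a toy word-overlap score (`overlap`) stands in for the BGE-reranker cross-encoder, and the `fast_track` threshold of 0.95 is an illustrative value, not the system's real setting.

```python
import math
from typing import Callable

Doc = tuple[str, list[float]]  # (text, embedding)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def overlap(query: str, text: str) -> float:
    """Toy stand-in for a cross-encoder: fraction of query words found in text."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q) if q else 0.0


def two_stage_search(
    query: str,
    query_vec: list[float],
    docs: list[Doc],
    rerank_score: Callable[[str, str], float] = overlap,
    k: int = 3,
    fast_track: float = 0.95,  # assumed threshold
) -> str:
    if not docs:
        raise ValueError("docs must not be empty")
    # Stage 1: vector similarity (brute force here, ANN index in a real setup).
    scored = sorted(((cosine(query_vec, v), t) for t, v in docs), reverse=True)
    candidates = scored[:k]
    # Fast-track: skip reranking when the best hit is already confident.
    if candidates[0][0] >= fast_track:
        return candidates[0][1]
    # Stage 2: rerank the shortlist with the (stand-in) cross-encoder score.
    return max(candidates, key=lambda c: rerank_score(query, c[1]))[1]
```

Reranking only the top `k` candidates keeps the expensive second stage small, and the fast-track check skips it entirely when the first stage is already confident.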
Dynamic LLM Routing
Simple queries auto-downgrade to qwen-turbo; complex answers use qwen-plus. Saves 500–800 ms on each downgraded turn.
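
One way to picture this routing is a small function that picks a model per turn. The model names come from the description above; the heuristic itself (query length plus retrieval confidence) and both cutoffs are assumptions made for illustration, since the actual routing criteria are not documented here.

```python
def pick_model(
    query: str,
    retrieval_score: float,
    max_simple_words: int = 12,   # assumed cutoff
    min_confidence: float = 0.85,  # assumed cutoff
) -> str:
    """Return 'qwen-turbo' for simple turns and 'qwen-plus' otherwise."""
    is_short = len(query.split()) <= max_simple_words
    is_confident = retrieval_score >= min_confidence
    return "qwen-turbo" if is_short and is_confident else "qwen-plus"
```

For example, `pick_model("What are your clinic hours?", 0.93)` selects the faster model, while the same question with a weak retrieval score falls through to qwen-plus.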
📞
Custom SIP Stack
Self-developed Java SIP signalling server + VoIP media engine. WebRTC bridged via Janus gateway.
🚨
Urgent Fast-Track
Emergency keywords bypass RAG entirely — 911 directive delivered in under 1 second.
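
A minimal sketch of such a bypass, assuming a simple regex check that runs before any retrieval or LLM call. The keyword patterns, the reply wording, and the `handle_turn` signature are all hypothetical; a production keyword list would need clinical review.

```python
import re

# Assumed keyword patterns, for illustration only.
URGENT_PATTERNS = [
    r"\bchest (hurts|pain)\b",
    r"\bcan'?t breathe\b",
    r"\bunconscious\b",
    r"\bsevere bleeding\b",
]
URGENT_RE = re.compile("|".join(URGENT_PATTERNS), re.IGNORECASE)

# Hypothetical wording of the emergency directive.
EMERGENCY_REPLY = "This may be an emergency. Please hang up and call 911 now."


def handle_turn(transcript: str, answer_with_rag) -> str:
    # Check for emergencies first so no retrieval or LLM call delays the reply.
    if URGENT_RE.search(transcript):
        return EMERGENCY_REPLY
    return answer_with_rag(transcript)
```

Because the check is a single compiled regex over the transcript, its cost is negligible next to the ASR and TTS steps on either side of it.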
🌐
Multilingual Ready
Same pipeline runs Chinese education (bge-zh) and English clinic (bge-en). Config-driven, no code changes.
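
To illustrate what "config-driven" could look like, the sketch below selects a deployment profile by name. The `DomainProfile` fields, the profile keys, and the knowledge-base names are hypothetical; only the `bge-zh` and `bge-en` shorthand comes from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainProfile:
    language: str
    embedding_model: str  # shorthand names as used in the text above
    knowledge_base: str   # hypothetical collection name


# Hypothetical profiles: switching deployments means changing one key.
PROFILES = {
    "clinic_en": DomainProfile("en", "bge-en", "clinic_faq"),
    "education_zh": DomainProfile("zh", "bge-zh", "education_faq"),
}


def load_profile(name: str) -> DomainProfile:
    try:
        return PROFILES[name]
    except KeyError:
        raise ValueError(f"unknown profile: {name!r}") from None
```

With this shape, the pipeline code reads `profile.embedding_model` and `profile.knowledge_base` and never branches on the language itself.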

Copyright & Attribution

Intellectual Property Notice

RAG Pipeline · AI Agent · Intent Classifier
Designed and developed under Harvey Liu's technical leadership. Originally built in Java, then rewritten in Python (2026). Released as open source; source code is available at github.com/lhw300.
SIP Server (Java)
Designed and developed under Harvey Liu's technical leadership at LCall Technology. The compiled software is freely available for download and use. Source code is proprietary.
VoIP / PBX Media Engine (Java)
Designed and developed under Harvey Liu's technical leadership at LCall Technology. Copyright © LCall Technology. Used with permission for personal portfolio demonstration, job interviews, and technical discussion purposes only. Not for commercial use.