Semantic MCQ Search Engine from a 16,000-Page PDF - RAG / ChromaDB
Project type
RAG, ChromaDB, FAISS, Semantic Search
The client needed a desktop application for quickly retrieving multiple-choice questions (MCQs) from large medical textbooks totaling roughly 16,000 PDF pages. At that size and complexity, keyword search was impractical. I designed and built a Python-based semantic search system backed by a vector database. Rule-based parsing extracts the MCQs from the PDFs and handles the multiple question formats found across the books; each question is then converted into an embedding for semantic retrieval. The finished app returns highly relevant MCQs almost instantly, which significantly reduces the manual effort of finding questions and makes studying more efficient.
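To give a feel for the two stages described above, here is a minimal, self-contained sketch rather than the project's actual code. The question, option, and answer formats in the regular expressions are hypothetical examples, and the names `parse_mcqs`, `embed`, `cosine`, and `search` are invented for illustration. To keep the sketch runnable with the standard library alone, a toy word-count vector stands in for a real embedding model and a plain Python list stands in for the vector database, so it matches on shared words, not on meaning.

```python
import math
import re
from collections import Counter

# Hypothetical formats: "12. stem" or "Q12) stem"; options "A. text" or "(a) text";
# answer lines "Answer: B" or "Ans. b". Real textbooks would need more rules.
QUESTION_RE = re.compile(r"^(?:Q\s*)?(\d+)[.)]\s+(.*)$", re.IGNORECASE)
OPTION_RE = re.compile(r"^\(?([A-Da-d])[.)]\s+(.*)$")
ANSWER_RE = re.compile(r"^Ans(?:wer)?\s*[.:]\s*\(?([A-Da-d])\)?", re.IGNORECASE)


def parse_mcqs(text):
    """Rule-based parser: turn raw page text into a list of MCQ dicts."""
    mcqs, current, last = [], None, None  # `last` = field that wrapped lines extend
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if m := ANSWER_RE.match(line):
            if current:
                current["answer"] = m.group(1).upper()
            last = None  # ignore anything after the answer (e.g. explanations)
        elif m := QUESTION_RE.match(line):
            current = {"number": int(m.group(1)), "stem": m.group(2),
                       "options": {}, "answer": None}
            mcqs.append(current)
            last = "stem"
        elif (m := OPTION_RE.match(line)) and current:
            last = m.group(1).upper()
            current["options"][last] = m.group(2)
        elif current and last:
            # A wrapped line continues the stem or the most recent option.
            if last == "stem":
                current["stem"] += " " + line
            else:
                current["options"][last] += " " + line
    return mcqs


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(mcqs, query, k=3):
    """Return the k MCQs whose stem and options are most similar to the query."""
    q = embed(query)
    scored = [(cosine(q, embed(m["stem"] + " " + " ".join(m["options"].values()))), m)
              for m in mcqs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [m for _, m in scored[:k]]


SAMPLE = """\
1. Which vitamin deficiency causes scurvy?
A. Vitamin A
B. Vitamin C
C. Vitamin D
D. Vitamin K
Answer: B

Q2) The most common site of peptic ulcer
is the:
(a) Fundus of the stomach
(b) First part of the duodenum
(c) Jejunum
(d) Ileum
Ans. b
"""

if __name__ == "__main__":
    questions = parse_mcqs(SAMPLE)
    for hit in search(questions, "duodenum ulcer", k=1):
        print(hit["number"], hit["stem"], "->", hit["answer"])
```

In a fuller version, `embed` would call a sentence-embedding model and `search` would query a vector store such as ChromaDB or FAISS, which is what lets a query match questions that share meaning without sharing exact words.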

