Traditional keyword-based search often fails to capture the true intent behind a user's query — especially when working with complex, unstructured text like books, documents, or user notes. LibroVault-Python addresses this limitation by implementing a modular, high-performance semantic search API built with FastAPI and Sentence-Transformers.
Designed for seamless integration as a microservice, it exposes two core endpoints:
- POST /api/embedding/make — generates dense vector embeddings from raw text
- POST /api/embedding/compare — compares a query embedding against stored vectors and returns the most semantically similar items

The system makes it easy to switch between models: all-MiniLM-L6-v2 for English and paraphrase-multilingual-MiniLM-L12-v2 for other languages. Internally, the architecture is split into three layers — configuration/model core, service logic, and API orchestration — enabling a clean separation of concerns and easy extensibility.
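As an illustration of how a client might call these endpoints, here is a sketch of the request bodies; the exact field names are assumptions for demonstration, not taken from the project's schema:

```python
import json

# Hypothetical body for POST /api/embedding/make (field names are
# illustrative; check the project's actual request schema).
make_request = {"text": "A story about friendship and loss"}

# Hypothetical body for POST /api/embedding/compare.
compare_request = {
    "text": "books about grief",        # query to embed and compare
    "embedding_files": ["books.json"],  # stored embedding files to search
    "top_k": 5,                         # how many matches to return
}

payload = json.dumps(compare_request)
print(payload)
```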
The project achieves low-latency performance (~50ms per embedding, ~10ms for vector search across thousands of items), with normalization and vector operations powered by NumPy and Hugging Face’s semantic_search. Embeddings are loaded and compared dynamically from JSON files, but the design allows for future integration with vector databases (e.g., FAISS, Qdrant) for scalable deployment.
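The vector math behind this is compact. Here is a minimal NumPy sketch of L2 normalization and cosine ranking, which is what semantic_search does at its core (the real helper adds batching and efficient top-K selection):

```python
import numpy as np

def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)  # avoid division by zero

# Toy corpus of 3 "embeddings" (real MiniLM vectors are 384-dimensional)
corpus = l2_normalize(np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]))
query = l2_normalize(np.array([[0.9, 0.1]]))

# Cosine similarity of the query against every stored vector
scores = corpus @ query.T            # shape: (3, 1)
ranking = np.argsort(-scores[:, 0])  # indices from most to least similar
print(ranking)
```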
LibroVault-Python is ideal for any application requiring context-aware search — from virtual libraries and academic archives to document assistants and recommendation engines.
The architecture of LibroVault-Python is designed around clarity, modularity, and extensibility. It follows a three-layered structure:
Core (Configuration & Model Initialization)
- config.py loads environment variables and defines normalization behavior for each supported model.
- model.py instantiates a single SentenceTransformer object based on MODEL_NAME, prints model load info, and configures optional L2 normalization.
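A minimal sketch of this environment-driven setup: MODEL_NAME matches the variable named above, while the per-model normalization table is an assumption for illustration, not copied from the project.

```python
import os

# Model is chosen at startup from the environment (.env), not hard-coded.
MODEL_NAME = os.getenv("MODEL_NAME", "all-MiniLM-L6-v2")

# Per-model normalization behavior: with L2-normalized outputs, cosine
# similarity reduces to a plain dot product. (Illustrative mapping only.)
NORMALIZE_BY_MODEL = {
    "all-MiniLM-L6-v2": True,
    "paraphrase-multilingual-MiniLM-L12-v2": True,
}
NORMALIZE = NORMALIZE_BY_MODEL.get(MODEL_NAME, False)

print(f"Loading model: {MODEL_NAME} (normalize={NORMALIZE})")
```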
Embedding Generation

- generate_embedding(text: str) in embedding_service.py encodes input text into a dense vector using the preloaded model.
- The /api/embedding/make endpoint receives raw text, delegates processing to the service layer, and returns a vector of floats.
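A runnable sketch of this service function, with a stub standing in for the real SentenceTransformer so the example works without downloading model weights (the 384-dimension size matches MiniLM models; the stub's internals are made up):

```python
import numpy as np

class StubModel:
    """Stands in for SentenceTransformer; encode() returns a dense vector."""
    def encode(self, text: str, normalize_embeddings: bool = True) -> np.ndarray:
        # Deterministic fake embedding derived from the text, for demo only.
        rng = np.random.default_rng(abs(hash(text)) % (2**32))
        vec = rng.standard_normal(384)  # MiniLM models emit 384-dim vectors
        return vec / np.linalg.norm(vec) if normalize_embeddings else vec

model = StubModel()  # the real code would load SentenceTransformer(MODEL_NAME)

def generate_embedding(text: str) -> list[float]:
    """Encode text into a dense vector; the endpoint returns this as JSON."""
    return model.encode(text).tolist()

vector = generate_embedding("semantic search for virtual libraries")
print(len(vector))  # 384
```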
Semantic Comparison

- The /api/embedding/compare endpoint accepts a query text and paths to embedding files.
- search_service.py uses Hugging Face’s semantic_search function on NumPy arrays to return the top-K most similar vectors.
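The comparison flow (load stored vectors from JSON, rank by cosine similarity, return the top-K) can be sketched end to end; the JSON layout and field names here are assumptions about the storage format:

```python
import json
import os
import tempfile
import numpy as np

# Write a toy embedding file in an assumed layout: one record per item.
records = [
    {"book_id": 1, "page_number": 10, "embedding": [1.0, 0.0]},
    {"book_id": 2, "page_number": 3,  "embedding": [0.0, 1.0]},
    {"book_id": 3, "page_number": 7,  "embedding": [0.7, 0.7]},
]
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(records, f)
    path = f.name

def compare(query_vec: list[float], path: str, top_k: int = 2) -> list[dict]:
    """Rank stored embeddings against the query by cosine similarity."""
    with open(path) as f:
        items = json.load(f)
    corpus = np.array([it["embedding"] for it in items], dtype=float)
    corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)
    q = np.asarray(query_vec, dtype=float)
    q /= np.linalg.norm(q)
    scores = corpus @ q
    best = np.argsort(-scores)[:top_k]
    return [{**items[i], "score": float(scores[i])} for i in best]

results = compare([0.9, 0.1], path)
print([r["book_id"] for r in results])  # [1, 3]
os.unlink(path)  # clean up the demo file
```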
Best Practices

- Model selection and other settings live in .env, enabling runtime changes with no code edits.
- File loading is wrapped in try/except blocks to gracefully handle missing or malformed data.

This methodology ensures that LibroVault-Python remains easy to test, adapt, and scale as project requirements evolve.
LibroVault-Python delivers low-latency, high-accuracy semantic search suitable for real-world NLP applications. Performance testing has shown:
- ~50 ms per embedding and ~10 ms per vector search across thousands of items with all-MiniLM-L6-v2

Key advantages demonstrated:
- Models can be swapped via .env, enabling experimentation and multilingual support without changing code

An example response from /api/embedding/compare includes book_id, page_number, file path, and cosine similarity scores — allowing developers to rank and retrieve documents based on conceptual relevance, not just keyword overlap.
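Putting those fields together, a response might look like the following (values and exact key names are hypothetical, shaped after the fields listed above):

```python
import json

# Hypothetical /api/embedding/compare response; field names mirror the
# ones described above, values are invented for illustration.
response = {
    "results": [
        {"book_id": 42, "page_number": 17,
         "file": "embeddings/book_42.json", "score": 0.87},
        {"book_id": 42, "page_number": 18,
         "file": "embeddings/book_42.json", "score": 0.81},
    ]
}
print(json.dumps(response, indent=2))
```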
If you're interested in exploring the project further, you're welcome to contribute or review the source code on GitHub:
🔗 https://github.com/Hugobsan/LibroVault-Python
You can also see the system in action through the user-facing interface available at:
🌐 https://librovault.hugobsan.tech
Note: the interface is available only in Portuguese 🇧🇷
To request demo access, feel free to send an email to hbsantos36@gmail.com, and I’ll provide a login and password — no cost involved.