This paper presents an educational Retrieval-Augmented Generation (RAG) system designed to provide factual information about abortion awareness, reproductive health, and related women's health topics. The system leverages OpenAI's embedding and language models to retrieve relevant information from curated educational documents and generate accurate, educational responses. This approach addresses the need for accessible, factual information on sensitive health topics while maintaining educational integrity and medical accuracy.
Access to accurate information about reproductive health is crucial for personal decision-making and education. However, finding reliable, factual information on sensitive topics like abortion can be challenging due to misinformation, politicization, and varied cultural perspectives. This project aims to address this challenge by developing a specialized educational resource that provides factual information on abortion awareness and related topics.
The system is built as a Retrieval-Augmented Generation (RAG) application, which combines the strengths of information retrieval systems with the natural language generation capabilities of large language models. This approach grounds the responses in verified educational content, reducing hallucinations and ensuring factual accuracy.
The system architecture consists of several key components:
Document Processing Pipeline: Processes educational documents on abortion and reproductive health, breaking them into manageable chunks for embedding.
Vector Database: Stores document chunks as vector embeddings using OpenAI's text-embedding-3-small model, enabling semantic search capabilities.
Retrieval Component: Identifies the most relevant document chunks based on user queries using semantic similarity.
Generation Component: Utilizes OpenAI's gpt-4o-mini model to generate responses based on retrieved context, user query, and carefully designed prompt templates.
Web Interface: Provides a user-friendly interface for interacting with the system, including options to select different teaching approaches and communication styles.
graph TD A[Start: User Adds Documents] --> B[Document Ingestion ingest.py] B --> C[Chunk & Embed Documents text-embedding-3-small] C --> D[Store Embeddings in Vector DB] subgraph QueryFlow ["Query Flow"] E[User Query via Web or API] --> F[Retrieve Relevant Chunks] F --> G[Generate Response gpt-4o-mini] G --> H[Return Answer to User] end subgraph InterfaceOptions ["Interface Options"] I[Web Interface] J[REST API Endpoints] end D --> F I --> E J --> E
The system uses carefully selected educational documents focused on abortion awareness, reproductive health, and women's health topics. These documents are curated to ensure medical accuracy, educational value, and factual content free from political bias.
A significant contribution of this work is the development of specialized prompt templates for abortion education. The system implements different reasoning strategies, including:
To maintain focus on the specialized domain, the system implements topic filtering mechanisms that politely decline to answer questions unrelated to abortion and reproductive health.
The system successfully provides factual, educational information about abortion and reproductive health topics. The RAG approach ensures responses are grounded in verified educational content rather than generated solely from the language model's parameters.
The configurable prompt templates and reasoning strategies allow adaptation to different educational needs, from basic factual information to more detailed explanations with reasoning steps. Source attribution provides transparency about the origins of information.
This educational RAG system represents a specialized approach to providing factual information on abortion awareness and reproductive health. By combining document retrieval with contextual generation, the system ensures educational integrity while maintaining conversational interaction. The project demonstrates how AI can be responsibly applied to sensitive educational domains when properly designed with factual grounding and ethical considerations.
Retrieval-Augmented Generation, Abortion Education, Reproductive Health, Educational AI, Medical Information Systems