Smart Search for Free Courses on Analytics Vidhya
Project Overview
Smart Search adds semantic search over the free courses available on the Analytics Vidhya platform. Users describe what they want to learn in a natural language query, and the system returns the most relevant courses by comparing sentence embeddings with a vector similarity search. The project also serves as a working example of modern search techniques applied to an educational platform.
Objective
The primary objective of this project is to develop a Smart Search feature that lets users find the most relevant courses on the Analytics Vidhya platform from natural language queries. It uses LangChain, FAISS, and Sentence Transformers to deliver a search experience that is efficient, scalable, and accurate.
Key Features
- Advanced Search Algorithms: Implementing cutting-edge search algorithms to provide accurate and relevant results.
- User-Friendly Interface: Designing an intuitive and easy-to-use interface for seamless user interaction.
- Real-Time Search Results: Delivering instantaneous search results to enhance user experience.
- Customizable Search Parameters: Allowing users to tailor search parameters to better match their needs.
Technologies Used
- LangChain: Facilitating the integration of various language models for natural language processing.
- FAISS: Ensuring efficient similarity search and clustering of dense vectors.
- Sentence Transformers (all-MiniLM-L6-v2): Generating high-quality embeddings for textual data.
- Streamlit: Providing an interactive and visually appealing web application framework.
Architecture
- Data Collection & Preprocessing: Scraping and preprocessing course data from Analytics Vidhya to ensure clean and structured information.
- Embedding Generation: Converting textual course data into numerical vectors using Sentence Transformers, enabling efficient similarity computations.
- FAISS Indexing: Building an efficient and scalable FAISS index to facilitate rapid similarity searches.
- Integration with Streamlit: Developing a user-friendly search interface using Streamlit for real-time interaction.
- Ranking and Result Presentation: Utilizing cosine similarity to rank courses and present the most relevant results to users.
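The embedding and indexing steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual index_courses.py: the function names (course_to_text, build_index, search), the record keys ("title", "description", "curriculum"), and the choice of a flat inner-product index are all illustrative. It calls Sentence Transformers and FAISS directly and leaves LangChain out.

```python
from typing import Dict, List, Tuple


def course_to_text(course: Dict[str, str]) -> str:
    """Join the text fields of one course record into a single string to embed.

    The keys below are assumed; the real schema of course_details.json may differ.
    """
    fields = ("title", "description", "curriculum")
    parts = [str(course.get(key) or "").strip() for key in fields]
    return ". ".join(part for part in parts if part)


def build_index(courses: List[Dict[str, str]], index_path: str = "course_faiss.index"):
    """Embed every course and store the vectors in a flat inner-product index."""
    # Imported lazily so course_to_text stays usable without these packages.
    import faiss
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    texts = [course_to_text(course) for course in courses]
    # Unit-length vectors make the inner product equal to cosine similarity.
    embeddings = model.encode(texts, normalize_embeddings=True, convert_to_numpy=True)
    index = faiss.IndexFlatIP(embeddings.shape[1])
    index.add(embeddings)
    faiss.write_index(index, index_path)
    return model, index


def search(model, index, query: str, top_k: int = 5) -> List[Tuple[int, float]]:
    """Return (course position, cosine score) pairs for the best matches."""
    query_vec = model.encode([query], normalize_embeddings=True, convert_to_numpy=True)
    scores, positions = index.search(query_vec, top_k)
    # FAISS pads with -1 when fewer than top_k vectors are available.
    return [(int(i), float(s)) for i, s in zip(positions[0], scores[0]) if i != -1]
```

A flat index compares the query against every stored vector, which is usually adequate for a catalogue of a few hundred courses. An approximate index type would only matter at a much larger scale.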
Directory Structure
AnalyticsVidhya-SmartSearch/
│
├── course_details.json # Scraped course data
├── scrape_courses.py # Data scraping script
├── index_courses.py # Embedding and indexing script
├── app.py # Streamlit application for smart search
├── course_faiss.index # FAISS index for efficient search
├── requirements.txt # Dependencies for the project
├── .gitignore # Git ignore file to exclude unnecessary files
└── README.md # Project documentation and instructions
Implementation Details
- Data Collection: Scraping comprehensive course details including titles, descriptions, curriculum, and additional information to create a rich dataset.
- Embedding Generation: Using Sentence Transformers to generate meaningful embeddings that capture the semantic essence of the course data.
- FAISS Indexing: Creating a highly efficient FAISS index to perform rapid and scalable similarity searches.
- Streamlit Integration: Implementing a dynamic search interface with Streamlit to allow users to interact with the search engine in real-time.
- Ranking Mechanism: Utilizing cosine similarity to rank the search results based on their relevance to the user's query, ensuring the most pertinent courses are displayed first.
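To make the ranking step concrete, here is a small pure-Python sketch of cosine-similarity ranking. It is an illustration only: the toy three-dimensional vectors stand in for real embeddings, and the helper names (cosine_similarity, rank_courses) are hypothetical. In the application itself, the FAISS index would do this comparison instead of a Python loop.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_courses(
    query_vec: Sequence[float],
    course_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Score every course against the query and keep the top_k highest scores."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(course_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for course embeddings (illustrative values only).
toy_courses = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
ranking = rank_courses([1.0, 0.0, 0.0], toy_courses, top_k=2)
print(ranking)  # course 0 ranks first, course 1 second
```

Cosine similarity depends only on the direction of the vectors, not their length, so a long course description is not favoured over a short one purely because of its size.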
Prerequisites
- Python 3.7 or higher
- pip (Python package installer)
Getting Started
- Clone the repository:
git clone https://github.com/Satyamkumarnavneet/Analyticsvidhya-SmartSearch.git
- Navigate to the project directory:
cd Analyticsvidhya-SmartSearch
- Install dependencies:
pip install -r requirements.txt
- Run the application:
streamlit run app.py
Performance
- Fast Search Results: FAISS-based indexing ensures rapid search results, even with large datasets, providing a smooth user experience.
- Scalability: The system is designed to handle a large number of courses efficiently, making it suitable for growing educational platforms.
Usage
- Open your web browser and navigate to http://localhost:8501.
- Enter your search query in the search bar.
- Customize search parameters if needed to refine your search.
- View the search results in real-time, with courses ranked by relevance.
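The README does not say which search parameters the app exposes. As one possible reading, the sketch below assumes two hypothetical ones, a result count (top_k) and a minimum similarity (min_score), and shows how they might be wired to Streamlit widgets. The select_results helper, the widget labels, and the placeholder scores are all illustrative and are not taken from app.py.

```python
from typing import List, Tuple


def select_results(
    scored: List[Tuple[str, float]],
    top_k: int = 5,
    min_score: float = 0.0,
) -> List[Tuple[str, float]]:
    """Drop results below min_score, then keep the top_k best by score."""
    kept = [(title, score) for title, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


def main() -> None:
    # Imported lazily so select_results can be used without Streamlit installed.
    import streamlit as st

    st.title("Smart Search")
    query = st.text_input("Search free courses")
    top_k = st.slider("Number of results", min_value=1, max_value=20, value=5)
    min_score = st.slider("Minimum similarity", min_value=0.0, max_value=1.0, value=0.3)
    if query:
        # Placeholder scores; a real app would query the FAISS index here.
        scored = [("Example course A", 0.82), ("Example course B", 0.41)]
        for title, score in select_results(scored, top_k, min_score):
            st.write(f"{title} (similarity {score:.2f})")
```

To try it with streamlit run, add a call to main() at the bottom of the script.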
Benefits
- Enhanced User Experience: By providing accurate and relevant search results quickly, users can find the courses they need without hassle.
- Educational Impact: This project showcases the application of advanced machine learning techniques in the educational domain, potentially inspiring further innovations.
Contributing
We welcome contributions to enhance the Smart Search project! To contribute:
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Support
For support, please email navneetsatyamkumar@gmail.com. We are here to help!
Screenshots
Home Page
