Natural Language to SQL using LLM

A simple web application that lets you ask natural-language questions about your PostgreSQL and MySQL database and have them converted into SQL queries by an LLM. It includes schema browsing, query confirmation for destructive statements, and result display with pagination and hover-tooltips.

Github: https://github.com/ah-naf/nlsql/tree/origin
Live Link: https://nl-sql-gme3hme5fpfdg2ez.canadacentral-01.azurewebsites.net/

Features

Natural Language → SQL: Describe what you want in plain English, and the app generates a SQL statement.
Schema Browser: View tables, columns, data types, primary/foreign key badges, and search/filter tables.
Confirmation Flow: Destructive operations (INSERT, UPDATE, DELETE, etc.) require confirmation.
Result Rendering: Paginated, responsive table with hover popovers for long content.
Database Connection: Connect to or create databases directly from the UI.

Backend Workflow

🔧 Environment Variables

Place a .env file at the project root with the following:

# For Together AI (default)
LLM_API_KEY=your-together-api-key
LLM_API_URL=https://api.together.xyz/v1/chat/completions
LLM_API_MODEL_NAME=meta-llama/Llama-3-70B-Instruct

# OR for Gemini (optional alternative)
GEMINI_API_URL=https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=your-gemini-api-key

🚀 Local Development

1. Clone the repository

git clone https://github.com/ah-naf/nlsql.git
cd nlsql

2. Backend setup

cd backend
go mod tidy

3. Frontend setup

cd ../frontend
npm install
npm run build

4. Run the application

cd ../backend
go run cmd/main.go

Open http://localhost:8080 in your browser.

🐳 Run with Docker

This project supports multi-stage Docker builds (frontend + backend). Here's how to build and run:

1. Build the image

docker build -t nlsql-app .

2. Run the container

docker run -p 8080:8080 nlsql-app

Make sure your PostgreSQL instance is accessible from the container.

3. Open the app

Visit: http://localhost

💡 Example Prompts

Show all orders placed in the last 24 hours
Add a new user named Alice with email alice@example.com
Delete all rows from temp_sessions table

Problems Solved

No SQL Expertise Required: Users can obtain data insights without writing or understanding SQL syntax.
Faster Querying: Business analysts or non-technical stakeholders can run ad-hoc queries by typing plain English, speeding up report generation.
Reduced Human Error: Converting natural language directly into SQL minimizes typos or syntax mistakes that often occur when writing SQL manually.
Interactive Schema Exploration: The schema browser allows users to discover table and column structures without switching to a separate database client.
Safe Destructive Operations: Built-in confirmation flow prevents accidental data loss from INSERT, UPDATE, or DELETE statements.
Transaction Support: Destructive queries can be wrapped in transactions; users receive confirmation prompts before committing changes.
Uniform Interface for Multiple Databases: Supports both PostgreSQL and MySQL backends, making it easy to switch or scale across different environments.

Future Work

Support Additional SQL/NoSQL Engines:
- Add connectors for other relational databases (e.g., SQL Server, SQLite) and popular NoSQL stores (e.g., MongoDB, DynamoDB) to broaden applicability.
Enhanced Prompt History:
- Maintain a conversation context so that follow-up questions (e.g., “Show only those hired after 2023”) are interpreted correctly relative to previous queries.
User Authentication & Roles:
- Introduce role-based access control (RBAC) so that certain users can only view data while others can perform mutations.
Natural Language Refinement:
- Implement pre-processing techniques (spell correction, stop-word removal) to handle more ambiguous or complex user inputs.
Visual Query Builder Integration:
- Combine the LLM-powered approach with a drag-and-drop query builder for users who want both natural language and GUI-based query construction.
Performance Monitoring & Caching:
- Track query execution times and cache frequently requested results to improve responsiveness.
Multi-Language Support:
- Extend natural language parsing to support languages beyond English (e.g., Spanish, Chinese) by leveraging multilingual LLMs.

Limitations

LLM Context Window Constraints:
- The free or low-cost LLM endpoints have a limited context window (e.g., 4K–8K tokens). As schemas grow (many tables/columns), prompts may exceed the allowable token limit, leading to truncated or degraded SQL generation.
Stateless Query Processing:
- Without explicit conversational context, each query is evaluated independently. Follow-up questions that reference previous results may produce incorrect SQL unless the context is manually provided in the prompt.
Model Cost & Rate Limits:
- Using free LLM tiers means stricter rate limits and potential throttling during peak usage. Upgrading to a paid plan may be required for production-level performance.
Synthetic Schema Exposure:
- Embedding an entire database schema in the prompt could leak sensitive structure information if the LLM provider’s security controls are not robust.
Edge-Case SQL Parsing:
- For very complex queries (window functions, subqueries in nested transactions), the LLM may generate syntactically valid SQL that does not align with user intent or may not optimize performance for large datasets.
Lack of Offline Mode:
- Since the architecture depends on external LLM APIs, the application cannot generate queries when offline or if the LLM service is unavailable.

Acknowledgements

LLM Service Providers – Together AI, Gemini API, and meta-llama’s open models for powering SQL generation.
PostgreSQL & MySQL – Foundations for reliable, ACID-compliant relational storage.
Go (Golang) & React – For a performant backend and a responsive frontend experience.
Docker & Docker Compose – Streamlining deployment and environment consistency.