Resume Parser with OpenAI
A production-ready resume parser built with Go and OpenAI that extracts structured data from PDF, DOCX, and TXT files. Features a modern web interface with drag-and-drop upload and real-time parsing results.
Features
- Multiple File Formats: Supports PDF, DOCX, and TXT files
- Advanced PDF Processing: Uses MuPDF (via go-fitz) for superior text extraction
- OpenAI Integration: Leverages GPT-3.5-turbo for intelligent parsing
- Multiple Prompt Types: Basic, detailed, and skills-focused parsing with UI selection
- Interactive Frontend: Modern Alpine.js UI with drag-and-drop upload and real-time results
- Comprehensive Logging: Detailed logging with structured output
- Production Ready: Docker containerization with Go backend
- Security Features: Input validation, secure file handling, environment variable configuration
- Responsive Design: Mobile-friendly interface with modern styling and improved touch targets
Prerequisites
- Go 1.21 or higher
- OpenAI API key
- Docker (optional, for containerized deployment)
Local Development Setup
1. Clone the Repository
git clone <repository-url>
cd resume-parser
2. Backend Setup
cd backend
go mod tidy
3. Environment Configuration
Create a .env file in the backend directory:
OPENAI_API_KEY=your_openai_api_key_here
GIN_MODE=debug
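Go does not read .env files on its own, so the backend has to load these values into the process environment somehow. The sketch below assumes the variables are already exported and only shows the validation step. `Config` and `loadConfig` are hypothetical names, not taken from the project.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config holds the two settings named in the .env example.
// The type and field names are illustrative assumptions.
type Config struct {
	OpenAIAPIKey string
	GinMode      string
}

// loadConfig reads settings through a lookup function (os.Getenv in
// normal use) so it can be exercised without touching the real environment.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		OpenAIAPIKey: getenv("OPENAI_API_KEY"),
		GinMode:      getenv("GIN_MODE"),
	}
	if cfg.OpenAIAPIKey == "" {
		return Config{}, errors.New("OPENAI_API_KEY is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("GIN_MODE:", cfg.GinMode)
}
```

Failing fast on a missing key surfaces the problem at startup rather than on the first parse request.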
4. Run the Backend
cd backend
go run main.go
The server will start on http://localhost:8080
5. Test the API
# Health check
curl http://localhost:8080/health
# Parse a resume
curl -X POST -F "file=@resume.pdf" "http://localhost:8080/api/parse-resume?promptType=detailed"
6. Frontend Development
For local development, you can serve the frontend with:
cd frontend
python -m http.server 3000
Then visit http://localhost:3000 and update the API calls to point to http://localhost:8080
Docker Deployment (Recommended)
Quick Start with Docker Compose
- Set your OpenAI API key:
export OPENAI_API_KEY=your_openai_api_key_here
- Run the application:
docker-compose up --build
- Access the application:
Using Docker Directly
docker build -t resume-parser .
docker run -p 8080:8080 -e OPENAI_API_KEY=your_key resume-parser
API Endpoints
Health Check
GET /health
Parse Resume
POST /api/parse-resume?promptType={type}
Parameters:
- file: Resume file (PDF, DOCX, or TXT)
- promptType: basic, detailed, or skills-focused
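A server-side check of the promptType value might look like the sketch below. The function name, the fallback to basic for an empty value, and the error for unknown values are all assumptions. The document does not say how the backend treats a missing or invalid promptType.

```go
package main

import (
	"fmt"
	"strings"
)

// validPromptTypes lists the three values documented for promptType.
var validPromptTypes = map[string]bool{
	"basic":          true,
	"detailed":       true,
	"skills-focused": true,
}

// normalizePromptType trims and lowercases the raw query value.
// Assumption: an empty value falls back to "basic".
func normalizePromptType(raw string) (string, error) {
	pt := strings.ToLower(strings.TrimSpace(raw))
	if pt == "" {
		return "basic", nil
	}
	if !validPromptTypes[pt] {
		return "", fmt.Errorf("unknown promptType %q", raw)
	}
	return pt, nil
}

func main() {
	for _, in := range []string{"detailed", "", "fancy"} {
		pt, err := normalizePromptType(in)
		fmt.Printf("%q -> %q, err=%v\n", in, pt, err)
	}
}
```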
Response:
{
"filename": "resume.pdf",
"filetype": ".pdf",
"text": "extracted text...",
"parsed": {
"name": "John Doe",
"email": "john.doe@example.com",
"phone": "(555) 123-4567",
"skills": ["Python", "JavaScript", "React"],
"experience": [...],
"education": [...],
"certifications": [...]
}
}
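A Go client could decode this response into typed structs. The sketch below mirrors only the fields shown in the sample. Because the shapes of experience, education, and certifications are elided there, it keeps them as raw JSON rather than guessing. The type names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ParsedResume mirrors the "parsed" object from the sample response.
// Elided arrays are kept as raw JSON because their element shape is not documented.
type ParsedResume struct {
	Name           string            `json:"name"`
	Email          string            `json:"email"`
	Phone          string            `json:"phone"`
	Skills         []string          `json:"skills"`
	Experience     []json.RawMessage `json:"experience"`
	Education      []json.RawMessage `json:"education"`
	Certifications []json.RawMessage `json:"certifications"`
}

// ParseResponse mirrors the top-level response object.
type ParseResponse struct {
	Filename string       `json:"filename"`
	Filetype string       `json:"filetype"`
	Text     string       `json:"text"`
	Parsed   ParsedResume `json:"parsed"`
}

// sampleResponse is the documented example with the elided arrays left empty.
const sampleResponse = `{
  "filename": "resume.pdf",
  "filetype": ".pdf",
  "text": "extracted text...",
  "parsed": {
    "name": "John Doe",
    "email": "john.doe@example.com",
    "phone": "(555) 123-4567",
    "skills": ["Python", "JavaScript", "React"],
    "experience": [],
    "education": [],
    "certifications": []
  }
}`

// decodeResponse unmarshals a response body into ParseResponse.
func decodeResponse(data []byte) (ParseResponse, error) {
	var r ParseResponse
	err := json.Unmarshal(data, &r)
	return r, err
}

func main() {
	resp, err := decodeResponse([]byte(sampleResponse))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(resp.Parsed.Name, resp.Parsed.Skills)
}
```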
Prompt Types
The frontend provides an interactive prompt type selector with the following options:
Basic
Essential information extraction
- Name, email, phone
- Skills list
- Education history
- Work experience
Detailed
Comprehensive extraction with additional fields
- All basic fields
- Detailed job descriptions
- Education details (school, degree, year)
- Certifications
- Languages
- Extended formatting
Skills-Focused
Specialized skill categorization
- Technical skills (programming, tools, frameworks)
- Soft skills (leadership, communication)
- Categorized skill sets
- Skills-optimized parsing
File Processing
- PDF: Processed using MuPDF (go-fitz) for accurate text extraction
- DOCX: Extracted with the standard library by reading the ZIP archive and parsing its XML
- TXT: Direct text processing
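To illustrate the DOCX path, here is a minimal sketch of ZIP-plus-XML extraction using only `archive/zip` and `encoding/xml`. It is not the project's actual code. It reads word/document.xml, collects the text inside `t` elements, and emits a newline at the end of each `p` element. The function names are hypothetical, and the sketch matches elements by local name only.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/xml"
	"errors"
	"fmt"
	"io"
	"strings"
)

// extractDocxText opens the DOCX (a ZIP archive) and extracts text
// from its main document part, word/document.xml.
func extractDocxText(data []byte) (string, error) {
	zr, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return "", err
	}
	for _, f := range zr.File {
		if f.Name != "word/document.xml" {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return "", err
		}
		defer rc.Close()
		return textFromDocumentXML(rc)
	}
	return "", errors.New("word/document.xml not found")
}

// textFromDocumentXML streams XML tokens, keeping character data found
// inside <w:t> elements and adding a newline when a <w:p> paragraph ends.
func textFromDocumentXML(r io.Reader) (string, error) {
	dec := xml.NewDecoder(r)
	var sb strings.Builder
	inText := false
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Local == "t" {
				inText = true
			}
		case xml.EndElement:
			switch t.Name.Local {
			case "t":
				inText = false
			case "p":
				sb.WriteString("\n")
			}
		case xml.CharData:
			if inText {
				sb.Write(t)
			}
		}
	}
	return strings.TrimSpace(sb.String()), nil
}

// buildSampleDocx creates a tiny in-memory archive containing only the
// document part, enough to exercise the extractor.
func buildSampleDocx() ([]byte, error) {
	const doc = `<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"><w:body>` +
		`<w:p><w:r><w:t>Jane Doe</w:t></w:r></w:p>` +
		`<w:p><w:r><w:t>Go developer</w:t></w:r></w:p>` +
		`</w:body></w:document>`
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create("word/document.xml")
	if err != nil {
		return nil, err
	}
	if _, err := io.WriteString(w, doc); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := buildSampleDocx()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	text, err := extractDocxText(data)
	fmt.Printf("%q %v\n", text, err)
}
```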
File Size Limits
- Maximum file size: 50MB
- Supported formats: PDF, DOCX, TXT
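The limits above could be enforced with a check like the following sketch. It assumes "50MB" means 50 MiB and that the file type is decided by extension. The document does not confirm either detail, and `validateUpload` is a hypothetical name.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// maxUploadBytes assumes the documented 50MB limit means 50 MiB.
const maxUploadBytes int64 = 50 << 20

// validateUpload rejects files with an unsupported extension or a size
// above the limit. Matching on extension alone is an assumption.
func validateUpload(filename string, size int64) error {
	ext := strings.ToLower(filepath.Ext(filename))
	switch ext {
	case ".pdf", ".docx", ".txt":
		// supported
	default:
		return fmt.Errorf("unsupported file type %q", ext)
	}
	if size > maxUploadBytes {
		return fmt.Errorf("file is %d bytes, limit is %d", size, maxUploadBytes)
	}
	return nil
}

func main() {
	fmt.Println(validateUpload("resume.PDF", 1024))
	fmt.Println(validateUpload("resume.exe", 1024))
	fmt.Println(validateUpload("resume.txt", maxUploadBytes+1))
}
```

In an HTTP handler, `http.MaxBytesReader` from the standard library can additionally cap how much of the request body is read at all.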
Logging
The application provides comprehensive logging:
- Startup: Configuration and server status
- Request Processing: File details and processing steps
- PDF Extraction: Page-by-page processing with MuPDF
- OpenAI Integration: API calls, timing, and usage statistics
- Error Handling: Detailed error context and debugging info
Security Features
- Input validation and sanitization
- Secure file handling with temporary files
- Environment variable configuration
- Health checks and monitoring
- File type validation
Performance
- Efficient PDF Processing: MuPDF for fast, accurate text extraction
- Streaming File Handling: Memory-efficient file processing
- Connection Pooling: Optimized HTTP client for OpenAI API
- Concurrent Processing: Designed for high throughput
Monitoring
- Health check endpoint at /health
- Structured logging with logrus
- Request timing and metrics
- Error tracking and reporting
Error Handling
- Graceful degradation for unsupported files
- Detailed error messages for debugging
- Fallback mechanisms for PDF processing
- Timeout handling for OpenAI API calls
- Safe null handling in frontend expressions
Frontend Features
- Modern UI: Clean, professional interface with improved touch targets
- Drag-and-Drop Upload: Intuitive file upload with visual feedback
- Real-time Processing: Live status updates during parsing
- Responsive Design: Mobile-optimized with larger touch targets
- Error Handling: Comprehensive error display and recovery
- Results Display: Structured presentation of parsed data
- Raw Text View: Collapsible section showing extracted text
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Troubleshooting
Common Issues
- PDF Processing Errors: Ensure MuPDF libraries are installed
- OpenAI API Errors: Check API key and rate limits
- File Upload Errors: Verify file format and size limits
- Docker Build Issues: Ensure all dependencies are available
- Frontend Errors: Check browser console for Alpine.js expression errors
Environment Setup
- Ensure OpenAI API key is properly set in the .env file
- Verify Go version is 1.21 or higher
- Check Docker installation for containerized deployment
- Ensure proper file permissions for uploads