AAIDC-M3: Production-Ready ML System Documentation
This document describes the design, implementation, and deployment strategy for the Text Analysis and Triage System.
1.1 Problem Solved
This system addresses the challenge of manually reviewing large volumes of unstructured text data, such as customer feedback or support logs. The core function is to provide real-time classification and priority triage, leading to automated routing and faster resolution of critical issues.
1.2 System Architecture and Data Flow
The architecture is a three-tier model designed for resilience and low latency. The process begins at the UI Layer (Streamlit), which gathers user input (text and priority level). This input is sent to the API Layer (FastAPI), which serves as the ingress point, managing request validation and routing. The API then communicates with the ML Engine Layer (src/model.py), which loads the pre-trained model and executes the prediction logic. The resulting prediction and confidence score are then passed back through the layers to be displayed to the user.
2.1 Deployment Choices (Containerization)
The entire application is built for containerization using Docker. This approach ensures a consistent runtime environment across development and production, effectively isolating dependencies and eliminating common environment-specific errors. The containerized design simplifies deployment to orchestration and cloud platforms such as Kubernetes or AWS ECS, demonstrating production readiness.
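A typical Dockerfile for this kind of service might look like the following. This is a hypothetical sketch: the Python version, port, and the assumption that the FastAPI instance in main.py is named app are not stated in the project.

```dockerfile
# Hypothetical Dockerfile sketch; file names, port, and app object are assumptions.
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```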
2.2 Testing Suite and Safety Features
The system employs a comprehensive testing strategy using the pytest framework to ensure reliability and security.
• Unit Tests: Located in tests/test_unit.py, these tests verify the functionality of low-level code components, such as the input cleaning functions within src/preprocess.py and utility functions in the ML engine.
• Integration Tests: Found in tests/test_integration.py, these tests confirm that the FastAPI endpoints (/predict and /health) communicate correctly with the ML engine, verifying the integrity of the full data pipeline under various conditions.
• Security Guardrails (Input Validation): Implemented using Pydantic schemas within main.py, the system strictly validates all incoming data (e.g., ensuring text meets minimum length requirements and priority is within the expected numeric range). This prevents unexpected crashes and ensures data quality.
• Security Guardrails (Input Sanitization): The src/preprocess.py module includes logic to strip potentially malicious or extraneous characters, normalizing the input before it is passed to the ML model, thereby reducing security risks.
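The two guardrails above can be sketched together as a Pydantic model with a sanitizing validator. The exact length and range bounds, and the sanitization rules, are assumptions for illustration; the real constraints live in main.py and src/preprocess.py.

```python
import re
from pydantic import BaseModel, Field, field_validator

class PredictRequest(BaseModel):
    # Validation guardrails: bounds here are illustrative assumptions.
    text: str = Field(min_length=3, max_length=5000)
    priority: int = Field(ge=1, le=5)

    @field_validator("text")
    @classmethod
    def sanitize(cls, value: str) -> str:
        # Sanitization guardrail (in the spirit of src/preprocess.py):
        # strip control characters, then collapse runs of whitespace.
        value = re.sub(r"[\x00-\x1f\x7f]", " ", value)
        return re.sub(r"\s+", " ", value).strip()
```

With this schema, FastAPI rejects out-of-range input with an HTTP 422 response automatically, while valid text is normalized before it ever reaches the model.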
2.3 Failure Handling and Monitoring
Robust failure handling ensures the system remains operational even when errors occur.
• API Resilience: Critical operations, most importantly the model loading process, are wrapped in try...except blocks. If the model file is missing or corrupt, the API enters a fallback "dummy mode" (returning "Unknown" and 0.00%) rather than crashing, maintaining service availability.
• Error Reporting: When invalid user input is detected, the API returns standardized HTTP error codes (e.g., HTTP 422 Unprocessable Entity), which the Streamlit frontend is programmed to catch and display gracefully to the user, enhancing the user experience.
• Monitoring Endpoint: A dedicated /health endpoint is available in main.py specifically for automated liveness and readiness checks by production orchestration and monitoring systems such as Kubernetes or Prometheus, ensuring operational visibility.
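The model-loading fallback described above can be sketched as follows. The function names, the model path, and the use of joblib are illustrative assumptions; the pattern is what matters: a failed load leaves the service running in dummy mode instead of crashing.

```python
import logging

logger = logging.getLogger(__name__)
MODEL = None  # None means "dummy mode": the service stays up without a model.

def load_model(path: str = "model.joblib") -> None:
    """Try to load the trained model; fall back to dummy mode on any failure."""
    global MODEL
    try:
        import joblib
        MODEL = joblib.load(path)
    except Exception as exc:
        logger.warning("Model unavailable (%s); entering dummy mode", exc)
        MODEL = None

def predict(text: str) -> tuple[str, float]:
    if MODEL is None:
        # Dummy mode: return the safe default rather than raising an error.
        return "Unknown", 0.0
    label = MODEL.predict([text])[0]
    confidence = float(MODEL.predict_proba([text]).max())
    return label, confidence
```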
3.2 Running the Application
The system requires two components to be running simultaneously, in separate terminals.

Terminal 1 — start the FastAPI backend (this assumes the FastAPI instance in main.py is named app):

python -m uvicorn main:app --reload

Terminal 2 — start the Streamlit frontend (substitute the actual UI entry-point script):

streamlit run <ui_script>.py