
The Module 2 Capstone Project is your final major milestone toward certification.
By completing it, you'll demonstrate that you understand how to take an LLM from selection to production: planning a deployment strategy, selecting appropriate infrastructure, and designing a monitoring approach.
You'll submit a comprehensive deployment plan that covers your use case, model selection, deployment strategy, cost analysis, and observability design.
In this lesson, you'll review the objectives, deliverables, and evaluation criteria for your final project, the last step before earning your LLM Engineering & Deployment Certification.
In this project, you'll create a detailed deployment plan for an LLM of your choice. This written publication demonstrates your understanding of production LLM systems without requiring actual cloud infrastructure costs.
Your model can be:
Your goal is to demonstrate that you can:
Your project should be a comprehensive written publication covering the following sections:
Describe the problem you're solving:
Document your model's technical details:
| Aspect | Your Choice |
|---|---|
| Model | (e.g., Llama 3.2 1B, Mistral 7B, GPT-4o-mini) |
| Model Source | (e.g., Hugging Face, OpenAI API, custom fine-tuned) |
| Parameter Count | (fill in) |
| Quantization | (e.g., None, INT8, INT4) |
| Context Length | (fill in) |
| Max Output Tokens | (fill in) |
Explain your choices:
Choose a deployment approach and justify your decision:
Select a deployment platform and justify your choice. Here are some common options:
| Platform | Best For |
|---|---|
| Hugging Face Inference API | Quick setup, managed infrastructure |
| Modal | Serverless, pay-per-use, easy scaling |
| AWS Bedrock | Enterprise integration, imported models |
| AWS SageMaker | Full control, custom containers |
| vLLM on Cloud VM | High throughput, advanced batching |
| Ollama on EC2 | Simple self-hosted deployment |
You're not limited to these options; choose any platform that fits your use case.
Document your planned configuration:
Explain your reasoning:
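As a concrete illustration, a planned configuration can be captured in a small YAML file like the one below. All names and values are illustrative placeholders, not requirements:

```yaml
# Illustrative placeholder values - substitute your own choices.
model:
  name: llama-3.2-1b-instruct
  source: huggingface
  quantization: int8
  context_length: 8192
  max_output_tokens: 512
serving:
  platform: modal          # or sagemaker, vllm-on-vm, ollama-on-ec2, ...
  gpu: A10G
  min_replicas: 0          # scale to zero when idle to save cost
  max_replicas: 3
```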
Provide a detailed cost breakdown for your deployment:
| Cost Component | Monthly Estimate |
|---|---|
| Compute (GPU/CPU hours) | $ |
| Storage (model weights, logs) | $ |
| Network (data transfer) | $ |
| Monitoring tools | $ |
| Total Estimated | $ |
Describe at least two strategies you would implement:
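One strategy frequently used in practice is caching responses for repeated prompts, since identical requests are common in many applications. Below is a minimal in-process sketch; the `fake_llm` function is a hypothetical stand-in for a real model call, and a production system would typically use an external cache such as Redis:

```python
import hashlib

# Toy in-process cache keyed by a hash of the prompt. A real deployment
# would use an external store and normalize prompts before hashing.
_cache = {}

def cached_generate(prompt, generate):
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = generate(prompt)  # only call the model on a miss
    return _cache[key]

# Hypothetical stand-in for an expensive LLM call, counting invocations.
calls = 0
def fake_llm(prompt):
    global calls
    calls += 1
    return prompt.upper()

print(cached_generate("hello", fake_llm))  # model is called
print(cached_generate("hello", fake_llm))  # served from cache, no call
print("model calls:", calls)               # model calls: 1
```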
Calculate the estimated cost per 1,000 requests based on your configuration.
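The arithmetic is straightforward once you have a throughput estimate. The sketch below uses made-up numbers (an always-on GPU at $1.10/hour sustaining 4 requests/second) purely to show the calculation:

```python
# All figures below are illustrative placeholders - substitute your own.
GPU_HOURLY_RATE_USD = 1.10   # e.g., a single mid-range GPU instance
REQUESTS_PER_SECOND = 4      # sustained throughput of your deployment

def cost_per_1k_requests(gpu_hourly_rate, rps):
    """Serving cost of 1,000 requests on an always-on instance."""
    requests_per_hour = rps * 3600
    return gpu_hourly_rate / requests_per_hour * 1000

print(f"${cost_per_1k_requests(GPU_HOURLY_RATE_USD, REQUESTS_PER_SECOND):.4f} per 1K requests")
# -> $0.0764 per 1K requests
```

Note that this simple model assumes the instance runs (and bills) continuously; serverless or scale-to-zero platforms change the math considerably at low traffic.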
Design a monitoring strategy for your deployment:
| Metric | Why It Matters | Alert Threshold |
|---|---|---|
| Latency (p50, p99) | User experience | > X ms |
| Error Rate | Reliability | > X% |
| Throughput (RPS) | Capacity | < X RPS |
| Token Usage | Cost control | > X tokens/request |
| GPU Utilization | Resource efficiency | > 95% or < 20% |
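To make the latency row concrete, p50 and p99 can be computed from raw request timings with the standard library alone. The sketch below uses simulated latencies; in practice these would come from your serving logs or tracing tool:

```python
import random
import statistics

# Simulated per-request latencies in ms; real data would come from logs.
random.seed(0)
latencies_ms = [random.gauss(250, 60) for _ in range(1000)]

# statistics.quantiles with n=100 returns the 1st..99th percentile cut points.
pcts = statistics.quantiles(latencies_ms, n=100)
p50, p99 = pcts[49], pcts[98]
print(f"p50={p50:.0f} ms  p99={p99:.0f} ms")

# A simple threshold check of the kind an alerting rule would encode:
P99_ALERT_MS = 500
if p99 > P99_ALERT_MS:
    print("ALERT: p99 latency above threshold")
```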
Choose your monitoring stack and explain why:
| Tool | Purpose |
|---|---|
| (e.g., Langfuse, LangSmith) | LLM tracing and observability |
| (e.g., CloudWatch, Datadog) | Infrastructure monitoring |
| (e.g., LiteLLM) | Cost tracking and budgets |
Address security for your deployment:
Module 2 projects follow the same monthly review schedule as Module 1.
To be included in a given month's review cycle, make sure to submit your project by one of the following dates:
🔴 January 05, 2026, 11:59 PM UTC
🔴 February 02, 2026, 11:59 PM UTC
🔴 March 02, 2026, 11:59 PM UTC
🔴 April 06, 2026, 11:59 PM UTC
🔴 May 04, 2026, 11:59 PM UTC
If you miss a listed date, your project will simply roll over to the next month's review.
Reviews typically take about two weeks, during which you'll receive feedback and, if needed, an opportunity to make improvements before final evaluation.
Plan ahead so you can complete your submission comfortably within your preferred review window.
Create a publication on Ready Tensor that:
📄 Publication Evaluation Rubric
Submit a repository that demonstrates your deployment knowledge with working code. Your repo should include:
Deployment Scripts (at least one):
Client Code:
Deployment Documentation:
Example repository structure:
```
my-llm-deployment/
├── README.md                # Deployment steps and usage
├── requirements.txt         # Dependencies
├── deploy/
│   ├── modal_deploy.py      # Modal deployment script
│   └── docker-compose.yaml  # Or alternative deployment config
├── client/
│   ├── client.py            # Request client
│   └── test_requests.py     # Example requests
└── config/
    └── model_config.yaml    # Model configuration
```
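To illustrate what the request client might contain, here is a standard-library-only sketch. The `/v1/chat/completions` path and payload shape assume an OpenAI-compatible endpoint (as exposed by vLLM and many other servers); treat the URL and field names as placeholders for whatever your platform actually exposes:

```python
import json
import urllib.request

def build_chat_payload(model, prompt, max_tokens=256):
    """Assemble a chat-completions request body (OpenAI-compatible shape)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def send_request(base_url, payload, api_key=""):
    """POST the payload to an OpenAI-compatible chat-completions endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Example payload (no network call made here):
payload = build_chat_payload("my-model", "Summarize this ticket.")
print(json.dumps(payload, indent=2))
```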
Your repository should meet at least 70% of the "Essential"-level criteria in the repository evaluation rubric.
📄 Repository Evaluation Rubric
You may optionally include:
These are not required but can strengthen your submission.
Your publication will be evaluated on:
| Criteria | What We Look For |
|---|---|
| Use Case Clarity | Clear problem definition, realistic requirements |
| Technical Accuracy | Correct understanding of architectures, platforms, trade-offs |
| Deployment Strategy | Justified platform choice, appropriate infrastructure |
| Cost Analysis | Realistic estimates, optimization strategies |
| Monitoring Plan | Relevant metrics, appropriate tooling |
| Communication | Clear writing, logical organization, professional presentation |
Successfully completing this project earns you the LLM Deployment Engineer credential, recognizing your understanding of deploying, monitoring, and managing large language models in production environments.
If you've also earned the LLM Fine-Tuning Specialist credential from Module 1, you'll be awarded the LLM Engineering & Deployment Certification, representing full completion of the program.
Once you've submitted your project, it will be reviewed by the evaluation team.
If it meets the certification standards, you'll receive your credential, and if you've completed both modules, your full program certificate as well.
This marks your official recognition as a certified LLM Engineer, capable of planning and designing production-ready LLM systems.