
The Module 1 Capstone Project is your first major milestone toward certification.
By completing it, you’ll demonstrate that you can fine-tune an open-weights model end-to-end — from dataset preprocessing to model optimization — using frameworks like Hugging Face Transformers, PEFT, and Accelerate.
You’ll also practice evaluation, quantization, and reproducibility, essential skills for real-world model development.
In this lesson, you will review the objectives, deliverables, and evaluation criteria for the Module 1 Capstone Project, the main deliverable for this part of the certification.
As you go through these requirements, don't worry if some parts feel unfamiliar. You're not expected to know how to do these steps yet.
The lessons in Weeks 1 through 4 of this module will teach you the knowledge and skills needed to complete this project successfully.
Use this page as a preview of what you'll be ready to build by the end of Module 1.
In this project, you’ll perform parameter-efficient fine-tuning (PEFT) on a small or medium-sized open-weights LLM such as Mistral 7B, Phi-3 Mini, or Qwen-1.5B, adapting it for a specific task of your choice.
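To make this concrete before the lessons dive in, here is a minimal sketch, assuming the Phi-3 Mini checkpoint, of what a LoRA-based PEFT setup looks like with Hugging Face Transformers and PEFT. The checkpoint name, target modules, and hyperparameters are illustrative placeholders, and upcoming lessons cover how to choose them.

```python
# Minimal sketch: attach LoRA adapters to an open-weights base model.
# Checkpoint, target modules, and hyperparameters are assumed examples.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "microsoft/Phi-3-mini-4k-instruct"  # assumed example checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA trains small low-rank adapter matrices instead of all model weights.
lora_config = LoraConfig(
    r=16,                                 # adapter rank
    lora_alpha=32,                        # adapter scaling factor
    lora_dropout=0.05,                    # regularization on adapter layers
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```

From here, training proceeds with the usual Transformers Trainer (or Accelerate) loop, which later lessons walk through step by step.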
You’ll apply your knowledge of data preparation, model configuration, and optimization techniques to build and document a fine-tuned model that adapts an open-weights base LLM for a specific task or dataset.
Your project should include:
- Dataset Selection and Preparation
- Fine-Tuning Implementation
- Evaluation and Optimization: evaluate your fine-tuned model with lm-evaluation-harness or another benchmark tool, and quantize it (e.g., with bitsandbytes or GGUF) for efficiency; see the sketch after this list.
- Documentation
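The quantization half of that step can be surprisingly small. Here is a minimal sketch, assuming a Mistral 7B checkpoint, of loading a model in 4-bit precision with bitsandbytes through Transformers; the checkpoint name and settings are illustrative, and GGUF export follows a different route (typically llama.cpp tooling).

```python
# Minimal sketch: load a model with 4-bit bitsandbytes quantization.
# The checkpoint name is an assumed example -- substitute your own model.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",            # assumed example checkpoint
    quantization_config=bnb_config,
    device_map="auto",                      # place layers on available GPUs
)
```

For the evaluation half, recent versions of lm-evaluation-harness provide a CLI; an invocation typically looks like `lm_eval --model hf --model_args pretrained=<your-model> --tasks <task>` (check the harness documentation for the exact flags your version supports).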
Not sure what to fine-tune your model for? Here are a few directions to help you choose a dataset and define your project goal.
All of these can be done using existing public datasets available on Hugging Face Datasets.
You don’t need to collect or clean your own data — simply pick a relevant dataset and focus on fine-tuning, evaluation, and optimization.
- Fine-tune your model to handle domain-specific queries using a dataset specialized in a particular field such as finance, healthcare, or legal text.
- Use a dataset of technical commands or documentation (e.g., Docker, SQL, Git, or Linux) to train your model to respond to natural-language queries.
- Work with benchmark datasets such as GSM8K (grade-school math problems) or ARC (AI2 Reasoning Challenge) to improve reasoning or step-by-step problem-solving ability.
You’re not limited to these examples — explore any dataset or task that interests you.
Just ensure the dataset is publicly available, appropriately licensed, and small enough to fine-tune efficiently.
If you’re unsure where to start, browse Hugging Face Datasets for inspiration — many datasets include sample scripts and documentation that make setup easy.
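To illustrate how little setup a public dataset requires, here is a minimal sketch that pulls GSM8K with the datasets library. The dataset name, config, and field access are specific to GSM8K, so adapt them to whichever dataset you pick.

```python
# Minimal sketch: load a public benchmark from Hugging Face Datasets.
# GSM8K is used purely as an example; any hosted dataset loads the same way.
from datasets import load_dataset

dataset = load_dataset("gsm8k", "main")  # "main" is GSM8K's standard config

print(dataset)                           # shows the train/test splits
print(dataset["train"][0]["question"])   # inspect a single example
```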
Module 1 projects are evaluated in the review cycle for the month in which they are submitted.
To be included in that month’s review, send in your project no later than one of these dates:
If you don’t meet a listed deadline, you can still submit before the next month’s date to be considered in that cycle.
The review process usually takes up to 2 weeks after the deadline, which includes receiving reviewer feedback and making any necessary updates.
To complete this project, submit the following deliverables:
Create a short publication on Ready Tensor that:
📄 Publication Evaluation Rubric
Submit a repo that:
📄 Repository Evaluation Rubric
Successfully completing this project earns you the LLM Fine-Tuning Specialist credential — recognizing your ability to fine-tune and optimize large language models for production use.
Move on to the upcoming lessons in Module 1.
They’ll give you the knowledge and code examples you need to complete this project successfully and earn your first certification credential.