In this publication, I share the development and deployment of a custom fine-tuned large language model, Llama-3.2-1B-ComputerEngineeringLLM, based on Meta's LLaMA 3.2 1B. I have tailored this model specifically for computer engineering applications, leveraging techniques such as LoRA (Low-Rank Adaptation) and quantization to deliver efficient and practical solutions in my field of study.
Llama-3.2-1B-ComputerEngineeringLLM is a derivative work that adapts the powerful capabilities of Meta’s LLaMA 3.2 1B to meet the unique demands of computer engineering. For fine-tuning, I used datasets such as Wikitext-2-raw-v1 along with a specialized computer engineering and computer science dataset. As a result, the model is well-equipped to handle technical documentation, code generation, everyday inquiries, and more.
Base Model: Meta’s LLaMA 3.2 1B
Architecture: LlamaForCausalLM
Follow these steps to set up and run the model on your local machine. The instructions assume you have Git LFS installed, as well as Python and pip.
First, install the required dependencies:
```bash
pip install -U bitsandbytes
pip install transformers torch accelerate
```
Clone the repository and pull the large files:
```bash
git clone https://github.com/IrfanUruchi/Llama-3.2-1B-ComputerEngineeringLLM.git
cd Llama-3.2-1B-ComputerEngineeringLLM
git lfs pull
```
Then load the model and tokenizer, and run a quick test generation:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

local_path = "./Llama-3.2-1B-ComputerEngineeringLLM"

# Load the model and tokenizer from the local checkout
model = AutoModelForCausalLM.from_pretrained(local_path, device_map="auto", local_files_only=True)
tokenizer = AutoTokenizer.from_pretrained(local_path, use_fast=False, local_files_only=True)

# Example prompt to test the model
prompt = "Explain how computers process data."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    do_sample=True,  # required for temperature/top_k/top_p to take effect
    temperature=0.8,
    top_k=50,
    top_p=0.92,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
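Because bitsandbytes is among the dependencies, you can also load the model with 4-bit quantization to reduce memory use. The sketch below is a minimal example assuming a CUDA-capable GPU; the BitsAndBytesConfig values shown are illustrative defaults, not the configuration used during fine-tuning:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

local_path = "./Llama-3.2-1B-ComputerEngineeringLLM"

# Illustrative 4-bit quantization settings (an assumption, not the
# exact configuration used to produce this model)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the weights in 4-bit precision; requires a CUDA GPU and bitsandbytes
model = AutoModelForCausalLM.from_pretrained(
    local_path,
    quantization_config=bnb_config,
    device_map="auto",
    local_files_only=True,
)
tokenizer = AutoTokenizer.from_pretrained(local_path, use_fast=False, local_files_only=True)
```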
After loading the model, you can customize your prompts and experiment with different generation parameters such as max_new_tokens, temperature, top_k, and top_p to suit your needs. This makes the model a versatile tool for technical documentation, code generation, and answering everyday inquiries.
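For instance, lowering the temperature and tightening top_p tends to produce more focused, deterministic answers, while higher values encourage more varied output. The values below are illustrative starting points, not tuned recommendations, and reuse the model, tokenizer, and inputs from the loading snippet above:

```python
# More deterministic output: lower temperature, tighter nucleus sampling
focused = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.3,
    top_k=20,
    top_p=0.85,
)

# More exploratory output: higher temperature, wider sampling pool
creative = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=True,
    temperature=1.0,
    top_k=100,
    top_p=0.95,
)

print(tokenizer.decode(focused[0], skip_special_tokens=True))
```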
This model is a derivative work based on Meta’s LLaMA 3.2 1B and is distributed under the LLaMA 3.2 Community License. For full details, please see the LICENSE file in the repository.
Attribution:
“Llama 3.2 is licensed under the Llama 3.2 Community License, Copyright © Meta Platforms, Inc. All Rights Reserved. Built with Llama.”
For more information on the base model, visit Meta’s LLaMA repository (https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE)
While this model is fine-tuned for computer engineering applications, it may not perform optimally on subjects outside of this domain. Users might need to apply further prompt engineering for specialized tasks, and some outputs could exhibit minor artifacts due to fine-tuning constraints.
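As one example of such prompt engineering, giving the model an explicit role and a concrete, scoped question often helps; the prompt wording below is purely illustrative, not a documented template:

```python
# Hypothetical domain-anchored prompt (illustrative wording only)
prompt = (
    "You are a computer engineering assistant. "
    "Explain the role of cache memory in a CPU's memory hierarchy, "
    "and give a short example of a cache miss."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```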
Working on the Llama-3.2-1B-ComputerEngineeringLLM project has been an incredibly rewarding journey. This effort represents my passion for blending advanced language model techniques with the practical challenges of computer engineering. I invite you to dive into the repository, try out the model for your own projects, and share your thoughts and feedback; each comment helps me refine and improve this work. If you have any questions, ideas, or suggestions, please don't hesitate to reach out through the GitHub issues page. Let's build the future of engineering together!