
Fine-Tuning Llama 4 on Wollnut Labs: A Step-by-Step Guide

Wollnut Labs Team · March 20, 2025 · 7 min

Prerequisites

  • A Wollnut Labs account with at least $10 in credits
  • Your training dataset in JSONL format (see the example after this list)
  • Basic familiarity with Python and Hugging Face
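
If you're unsure what the JSONL should look like: with recent trl versions, SFTTrainer reads a text column (named `text` by default, configurable via `dataset_text_field`). A minimal `dataset.jsonl` might look like this, though the exact schema depends on how you configure the trainer:

    {"text": "### Instruction:\nSummarize this support ticket.\n### Response:\nCustomer reports a billing error."}
    {"text": "### Instruction:\nTranslate 'good morning' to French.\n### Response:\nBonjour."}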

Step 1: Launch a GPU Instance

  • Go to Dashboard → New Instance
  • Select the **PyTorch 2.2 + CUDA 12.4** template
  • Choose **H100 1x** for 7B models or **H100 2x** for larger variants
  • Add your SSH key and click Deploy

Your instance will be ready in about 60 seconds.

Step 2: Connect via SSH

    ssh -i ~/.ssh/your_key root@YOUR_INSTANCE_IP
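
Once connected, it's worth confirming the GPUs are visible before going further:

    nvidia-smi

You should see one H100 listed (or two, if you chose the 2x option).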

Step 3: Install Fine-Tuning Dependencies

    pip install trl peft datasets bitsandbytes accelerate
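
One step the template doesn't handle for you: the Llama weights on Hugging Face are gated, so accept the license on the model page and authenticate on the instance before Step 5 tries to download them (if the CLI isn't present, `pip install -U huggingface_hub` first):

    huggingface-cli login   # paste a read token from https://huggingface.co/settings/tokens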

Step 4: Upload Your Dataset

    scp -i ~/.ssh/your_key dataset.jsonl root@YOUR_IP:/workspace/
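
Before burning GPU time, a quick check on the instance that the file parses cleanly:

    python -c "from datasets import load_dataset; print(load_dataset('json', data_files='/workspace/dataset.jsonl'))"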

Step 5: Run Fine-Tuning

Create a training script:

    from trl import SFTTrainer
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig
    from datasets import load_dataset

    # The repo is gated: make sure you've accepted the license on the
    # model page and logged in (see Step 3).
    model_name = "meta-llama/Llama-4-Scout-17B"

    # Load the dataset uploaded in Step 4.
    dataset = load_dataset("json", data_files="/workspace/dataset.jsonl")

    # Load the base model across all available GPUs in its native precision.
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    # LoRA: train small adapter matrices on the attention projections
    # instead of updating the full weights.
    lora_config = LoraConfig(
        r=16,                                 # adapter rank
        lora_alpha=32,                        # adapter scaling factor
        target_modules=["q_proj", "v_proj"],  # which layers get adapters
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )

    # Note: newer trl releases move max_seq_length into SFTConfig and rename
    # tokenizer to processing_class; pin an older trl if this signature
    # doesn't match your installed version.
    trainer = SFTTrainer(
        model=model,
        train_dataset=dataset["train"],
        peft_config=lora_config,
        max_seq_length=2048,
        tokenizer=tokenizer,
    )

    trainer.train()
    trainer.save_model("./llama-finetuned")  # saves the LoRA adapter, not full weights
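
Save this as `train.py` (any name works) and run it under `nohup` or `tmux` so training survives an SSH disconnect:

    nohup python train.py > train.log 2>&1 &
    tail -f train.log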

Step 6: Export Your Model

    scp -r -i ~/.ssh/your_key root@YOUR_IP:/workspace/llama-finetuned ./
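
To sanity-check the result, you can load the adapter on top of the base model with peft. This is a minimal sketch, assuming enough GPU memory wherever you run it (for a 17B-class model that's a big ask locally, so consider running it on the instance before stopping it):

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_name = "meta-llama/Llama-4-Scout-17B"  # same base model as in Step 5

    # Load the frozen base weights, then attach the trained LoRA adapter.
    base = AutoModelForCausalLM.from_pretrained(base_name, torch_dtype="auto", device_map="auto")
    model = PeftModel.from_pretrained(base, "./llama-finetuned")
    tokenizer = AutoTokenizer.from_pretrained(base_name)

    # Quick generation smoke test.
    inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)
    print(tokenizer.decode(model.generate(**inputs, max_new_tokens=50)[0], skip_special_tokens=True))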

Step 7: Stop Your Instance

Don't forget to stop your instance when training is complete! Billing stops immediately.

Cost Estimate

Fine-tuning Llama 4 Scout 17B on a typical dataset (10k examples, 3 epochs) takes about 2–4 hours on an H100. At $2.25/hr, that's roughly $4.50–$9.00 total.