  • Gemma 3 4B

    Gemma vision

    Google's small instruct model with vision input.

    Params: 4B
    Quant: Q4_K_M
    Size: 2.5 GB
  • Gemma 3 1B

    Gemma

    Smallest Gemma 3 variant; at 806 MB it fits on most hardware.

    Params: 1B
    Quant: Q4_K_M
    Size: 806 MB
  • Llama 3.2 3B Instruct

    Llama

    Meta's compact instruct model.

    Params: 3B
    Quant: Q4_K_M
    Size: 2.0 GB
  • Qwen 2.5 7B Instruct

    Qwen

    Strong general-purpose mid-size model.

    Params: 7B
    Quant: Q4_K_M
    Size: 4.7 GB
  • Qwen 2.5 Coder 7B

    Qwen

    Code-tuned Qwen 2.5, well suited to editor integrations.

    Params: 7B
    Quant: Q4_K_M
    Size: 4.7 GB
  • Qwen 3 30B A3B

    Qwen

    Mixture-of-experts model: only about 3B of its 30B parameters are active per token.

    Params: 30B (MoE)
    Quant: Q4_K_M
    Size: 18 GB
  • Mistral Nemo Instruct

    Mistral

    12B model from Mistral AI and NVIDIA with a 128k-token context window.

    Params: 12B
    Quant: Q4_K_M
    Size: 7.5 GB
  • SmolLM3 3B

    SmolLM

    Hugging Face's small, fast, multilingual model.

    Params: 3B
    Quant: Q4_K_M
    Size: 1.9 GB