Qwodel

Command-line tools provided by qwodel.


Installation

Install with the backend extras you need:

pip install qwodel[all]     # Everything

pip install qwodel[awq]     # AWQ (GPU)
pip install qwodel[gguf]    # GGUF (CPU)
pip install qwodel[coreml]  # CoreML (Apple)

qwodel — Main CLI

qwodel [COMMAND] [OPTIONS]

quantize

Quantize a model from the command line.

qwodel quantize MODEL_PATH [OPTIONS]

Arguments

Argument    Required  Description
MODEL_PATH  Yes       Path to the input model (local directory, .gguf file, or Hugging Face model ID).

Options

Option        Short  Type  Default             Description
--backend     -b     str   Required            Backend to use: awq, gguf, coreml.
--format      -f     str   Required            Quantization format (e.g., int4, Q4_K_M, float16).
--output-dir  -o     str   ./quantized_models  Directory to save the output.
--verbose     -v     flag  False               Enable verbose/debug logging.

Examples

# AWQ INT4 quantization
qwodel quantize ./gemma -b awq -f int4 -o ./output

# GGUF Q4_K_M quantization
qwodel quantize ./llama-3 -b gguf -f Q4_K_M -o ./output

# CoreML INT8 quantization
qwodel quantize ./my-model -b coreml -f int8_linear -o ./output
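
To compare several GGUF variants of one model, the GGUF example above can be repeated once per format. The sketch below is a dry run that only prints the commands, using the documented -b, -f, and -o options. The MODEL and OUT_ROOT paths, the build_cmd helper, and the per-format output subdirectories are illustrative assumptions, not something qwodel requires.

```shell
#!/bin/sh
# Dry-run sketch: print one qwodel command per GGUF format.
# MODEL and OUT_ROOT are hypothetical paths; adjust them to your setup.
MODEL=./llama-3
OUT_ROOT=./output

# Build the command line for one format name taken from the GGUF table.
# build_cmd is an illustrative helper, not part of qwodel.
build_cmd() {
  printf 'qwodel quantize %s -b gguf -f %s -o %s/%s\n' \
    "$MODEL" "$1" "$OUT_ROOT" "$1"
}

# Print the commands instead of running them.
for fmt in Q4_K_M Q5_K_M Q8_0; do
  build_cmd "$fmt"
done
```

Once the printed commands look right, piping the script's output to sh runs them one after another.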

list-formats

List all available quantization formats for a backend.

qwodel list-formats [BACKEND]

Arguments

Argument  Required  Description
BACKEND   No        Backend name: awq, gguf, coreml. Omit to list all.

Examples

# List GGUF formats
qwodel list-formats gguf

# List all formats across all backends
qwodel list-formats

check

Verify your installation and check which backends and dependencies are available.

qwodel check

No arguments or options required.
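
In a script, it can help to confirm that the qwodel executable is on PATH before calling check. The sketch below relies only on the shell's own command -v lookup. The run_check helper and its 127 return code are illustrative choices, and this page does not document the exit status of qwodel check, so the sketch simply passes through whatever status the command returns.

```shell
#!/bin/sh
# run_check is a hypothetical helper, not part of qwodel.
# $1 (optional) = executable name; defaults to qwodel.
run_check() {
  bin=${1:-qwodel}
  if command -v "$bin" >/dev/null 2>&1; then
    # Executable found: run the documented check command.
    "$bin" check
  else
    echo "$bin not found on PATH; install it with pip first" >&2
    return 127
  fi
}

# Report a failure without aborting the calling script.
run_check || echo "qwodel check did not succeed; see the messages above" >&2
```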


Format Quick Reference

AWQ Formats

Format  Description
int4    4-bit weight quantization (W4A16). GPU inference.

GGUF Formats

Format  Description
Q4_K_M  Best balance of speed and quality. Recommended.
Q8_0    Near-lossless quality.
Q2_K    Maximum compression.
Q3_K_M  3-bit medium quality.
Q4_0    Compact 4-bit.
Q4_K_S  Small 4-bit K-quant.
Q5_K_M  Better quality than Q4_K_M.
Q5_K_S  Small 5-bit K-quant.
Q6_K    High quality.
IQ4_NL  4.5 bits per weight (bpw), importance-based.
IQ3_M   3.66 bits per weight (bpw), compact.

CoreML Formats

Format          Compression  Notes
float16         ~2x          Half-precision. Universal.
int8_linear     ~4x          8-bit linear.
int8_symmetric  ~4x          8-bit symmetric.
int4            ~8x          iOS 18+ only.
int6            ~5x          Balance between int4 and int8.
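
The compression column can be turned into a rough size estimate by dividing a baseline size by the ratio. The ratios are consistent with a 32-bit float baseline (32/16 = 2, 32/8 = 4, 32/4 = 8), but the table does not state its baseline, so treat that as an assumption. The 4000 MB model size and the estimate_mb helper are made up for illustration.

```shell
#!/bin/sh
# estimate_mb BASELINE_MB RATIO
# Prints the approximate output size in MB using integer division.
# Illustrative helper, not part of qwodel.
estimate_mb() {
  echo $(( $1 / $2 ))
}

BASELINE_MB=4000   # hypothetical float32 model size

echo "float16:      ~$(estimate_mb "$BASELINE_MB" 2) MB"
echo "int8_linear:  ~$(estimate_mb "$BASELINE_MB" 4) MB"
echo "int6:         ~$(estimate_mb "$BASELINE_MB" 5) MB"
echo "int4:         ~$(estimate_mb "$BASELINE_MB" 8) MB"
```

Real output files also carry metadata and may keep some layers at higher precision, so actual sizes can differ from these estimates.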
