Qwodel
Backends

The CoreML backend quantizes models for Apple devices (iOS, macOS, iPadOS) using coremltools.

Install: pip install "qwodel[coreml]"

Requires macOS. Xcode Command Line Tools must be installed.
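
Both requirements can be checked before installing. The sketch below is one way to do that from Python. The helper name `missing_requirements` is hypothetical and not part of Qwodel. It relies on `xcode-select -p`, which prints the active developer directory and exits with a non-zero status when none is configured.

```python
import platform
import shutil
import subprocess


def _has_command_line_tools():
    # `xcode-select -p` exits with status 0 only when a developer directory is set.
    if shutil.which("xcode-select") is None:
        return False
    result = subprocess.run(["xcode-select", "-p"], capture_output=True, text=True)
    return result.returncode == 0


def missing_requirements(system=None, clt_installed=None):
    """Return a list of unmet prerequisites (empty when everything is in place).

    Both arguments are optional overrides so the logic can be exercised
    on any machine; by default the current host is inspected.
    """
    if system is None:
        system = platform.system()  # "Darwin" on macOS
    if system != "Darwin":
        # xcode-select does not exist on other platforms, so stop here.
        return ["macOS is required"]
    if clt_installed is None:
        clt_installed = _has_command_line_tools()
    if not clt_installed:
        return ["Xcode Command Line Tools are missing (run: xcode-select --install)"]
    return []
```

Calling `missing_requirements()` with no arguments inspects the current machine; an empty list means both prerequisites are met.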


Supported Formats

Format          Compression  Notes
float16         ~2x          Half-precision. Minimal accuracy loss. Recommended start.
int8_linear     ~4x          8-bit linear quantization. Good accuracy.
int8_symmetric  ~4x          8-bit symmetric quantization. Faster ANE ops.
int6            ~5x          6-bit palettization. Balance between int4 and int8.
int4            ~8x          4-bit palettization. Maximum compression. iOS 18+ only.
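
The compression figures appear to follow from the number of bits stored per weight, assuming they are measured against a float32 baseline (an assumption; the table does not state the baseline). A quick check of that arithmetic:

```python
# Bits per weight for each format, read off the table above.
BITS_PER_WEIGHT = {
    "float16": 16,
    "int8_linear": 8,
    "int8_symmetric": 8,
    "int6": 6,
    "int4": 4,
}

BASELINE_BITS = 32  # assumption: source weights are float32


def ideal_compression(fmt):
    """Upper bound on weight compression, ignoring scales and lookup tables."""
    return BASELINE_BITS / BITS_PER_WEIGHT[fmt]


if __name__ == "__main__":
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt}: ~{ideal_compression(fmt):.1f}x")
```

For int6 this gives 32 / 6, roughly 5.3, which the table rounds to ~5x. Real files will compress somewhat less than this bound, because quantization scales and palettization lookup tables are stored alongside the weights.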

Parameters

Quantizer(...) — Initialization

Parameter      Type   Default   Description
input_shape    tuple  (1, 512)  (batch_size, seq_length) for model tracing.
compute_units  str    "ALL"     CoreML compute units: "ALL", "CPU_ONLY", "CPU_AND_GPU".
seq_length     int    512       Maximum sequence length for dynamic shape range.
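
Because these arguments are plain strings and tuples, a mistake such as a misspelled compute unit may only surface once conversion starts. A minimal pre-flight check might look like the following. The function `check_init_args` is hypothetical, and the rule that the traced sequence length should not exceed `seq_length` is an assumption about how the two parameters relate, not documented behavior.

```python
ALLOWED_COMPUTE_UNITS = {"ALL", "CPU_ONLY", "CPU_AND_GPU"}  # values listed above


def check_init_args(input_shape=(1, 512), compute_units="ALL", seq_length=512):
    """Raise ValueError for argument combinations that look inconsistent."""
    if compute_units not in ALLOWED_COMPUTE_UNITS:
        raise ValueError(f"unknown compute_units: {compute_units!r}")
    if len(input_shape) != 2:
        raise ValueError("input_shape must be (batch_size, seq_length)")
    batch_size, traced_len = input_shape
    if batch_size < 1 or traced_len < 1:
        raise ValueError("input_shape dimensions must be positive")
    # Assumption: the tracing length has to fit inside the dynamic range.
    if traced_len > seq_length:
        raise ValueError(
            f"input_shape sequence length {traced_len} exceeds seq_length {seq_length}"
        )
    return {
        "input_shape": input_shape,
        "compute_units": compute_units,
        "seq_length": seq_length,
    }
```

The returned dictionary uses the same keyword names as the table, so it can be unpacked into the constructor alongside the other arguments.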

quantize(format) — Runtime

Parameter  Type  Required  Description
format     str   Yes       One of the formats listed above.

Example

from qwodel import Quantizer

quantizer = Quantizer(
    backend="coreml",
    model_path="./my-model",
    output_dir="./output",
    compute_units="ALL",
    seq_length=512
)
output = quantizer.quantize(format="float16")
print(f"Output: {output}")

CLI:

qwodel quantize ./my-model --backend coreml --format float16 --output ./output
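
To produce several formats in one go, the CLI call above can be scripted. The sketch below uses only the flags shown in that command. The helper names and the one-subdirectory-per-format layout are arbitrary choices for illustration.

```python
import subprocess
from pathlib import Path


def build_command(model_path, fmt, output_dir):
    """Assemble the argv list for one `qwodel quantize` run."""
    return [
        "qwodel", "quantize", str(model_path),
        "--backend", "coreml",
        "--format", fmt,
        "--output", str(output_dir),
    ]


def quantize_all(model_path, formats, output_root, dry_run=False):
    """Run the CLI once per format, writing each result to its own folder."""
    commands = []
    for fmt in formats:
        cmd = build_command(model_path, fmt, Path(output_root) / fmt)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises if the CLI exits non-zero
    return commands
```

For example, `quantize_all("./my-model", ["float16", "int8_linear"], "./output")` would run the command twice. Passing `dry_run=True` returns the commands without executing them, which is handy for checking the arguments first.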

After Quantization

Your output is a .mlpackage directory. To load it into an iOS or macOS app, see iOS App Integration.
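
Because a .mlpackage is a directory rather than a single file, its size on disk is the sum of the files inside it. A small helper for comparing the results of different formats (the function name is hypothetical, and the actual output path depends on your model):

```python
from pathlib import Path


def package_size_mb(package_path):
    """Total size of all files under a .mlpackage directory, in megabytes."""
    root = Path(package_path)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # Walk the whole tree; the weights live in nested subdirectories.
    total_bytes = sum(f.stat().st_size for f in root.rglob("*") if f.is_file())
    return total_bytes / 1_000_000
```

Pass it the path that `quantize` returned or printed, once for each format you produced, to see how the compression figures hold up for your model.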