Important
The main branch is a new major version number: 3.x. In order
to decouple from tokenizer and downloader packages some breaking
changes were introduced. See Breaking Changes for more information.
MLX Swift LM is a Swift package to build tools and applications with large language models (LLMs) and vision language models (VLMs) in MLX Swift.
Some key features include:
- Model loading with integrations for a variety of tokenizer and model downloading packages.
- Low-rank (LoRA) and full model fine-tuning with support for quantized models.
- Many model architectures for both LLMs and VLMs.
For some example applications and tools that use MLX Swift LM, check out MLX Swift Examples.
Add the core package to your Package.swift:
.package(url: "https://github.com/ml-explore/mlx-swift-lm", branch: "main"),Then add your preferred tokenizer and downloader integrations:
.package(url: "https://github.com/DePasqualeOrg/swift-tokenizers-mlx", from: "0.1.0"),
.package(url: "https://github.com/DePasqualeOrg/swift-hf-api-mlx", from: "0.1.0"),And add the libraries to your target:
.target(
name: "YourTargetName",
dependencies: [
.product(name: "MLXLLM", package: "mlx-swift-lm"),
.product(name: "MLXLMTokenizers", package: "swift-tokenizers-mlx"),
.product(name: "MLXLMHuggingFace", package: "swift-hf-api-mlx"),
]),MLX Swift LM focuses on model implementations. Tokenization and model downloading are handled by separate packages. Adapters make it easy to use your preferred downloader and tokenizer packages.
| Downloader package | Adapter |
|---|---|
| huggingface/swift-huggingface | DePasqualeOrg/swift-huggingface-mlx |
| DePasqualeOrg/swift-hf-api | DePasqualeOrg/swift-hf-api-mlx |
| Tokenizer package | Adapter |
|---|---|
| DePasqualeOrg/swift-tokenizers | DePasqualeOrg/swift-tokenizers-mlx |
| huggingface/swift-transformers | DePasqualeOrg/swift-transformers-mlx |
Note: The adapters are offered for convenience and are not required. You can also use tokenizer and downloader packages directly by setting up the required protocol conformance for MLX Swift LM. See the integration packages for examples of how to do this.
You can get started with a wide variety of open-weights LLMs and VLMs using this simplified API (for more details, see MLXLMCommon):
import MLXLLM
import MLXLMHuggingFace
import MLXLMTokenizers
let model = try await loadModel(
from: HubClient.default,
using: TokenizersLoader(),
id: "mlx-community/Qwen3-4B-4bit"
)
let session = ChatSession(model)
print(try await session.respond(to: "What are two things to see in San Francisco?"))
print(try await session.respond(to: "How about a great place to eat?"))Loading from a local directory:
import MLXLLM
import MLXLMTokenizers
let modelDirectory = URL(filePath: "/path/to/model")
let container = try await loadModelContainer(
from: modelDirectory,
using: TokenizersLoader()
)Use a custom Hugging Face client:
import MLXLLM
import MLXLMHuggingFace
import MLXLMTokenizers
let hub = HubClient(token: "hf_...")
let container = try await loadModelContainer(
from: hub,
using: TokenizersLoader(),
id: "mlx-community/Qwen3-4B-4bit"
)Use a custom downloader:
import MLXLLM
import MLXLMCommon
import MLXLMTokenizers
struct S3Downloader: Downloader {
func download(
id: String,
revision: String?,
matching patterns: [String],
useLatest: Bool,
progressHandler: @Sendable @escaping (Progress) -> Void
) async throws -> URL {
// Download files and return a local directory URL.
return URL(filePath: "/tmp/model")
}
}
let container = try await loadModelContainer(
from: S3Downloader(),
using: TokenizersLoader(),
id: "my-bucket/my-model"
)Or use the underlying API to control every aspect of the evaluation.
Version 3 of MLX Swift LM decouples the tokenizer and downloader implementations. See the [integrations](#Tokenizer and Downloader Integrations) section for details.
Add your preferred tokenizer and downloader adapters:
// Before (2.x) – single dependency
.package(url: "https://github.com/ml-explore/mlx-swift-lm/", from: "2.30.0"),
// After (3.x) – core + adapters
.package(url: "https://github.com/ml-explore/mlx-swift-lm/", from: "3.0.0"),
.package(url: "https://github.com/DePasqualeOrg/swift-tokenizers-mlx/", from: "0.1.0"),
.package(url: "https://github.com/DePasqualeOrg/swift-hf-api-mlx/", from: "0.1.0"),And add their products to your target:
.product(name: "MLXLMTokenizers", package: "swift-tokenizers-mlx"),
.product(name: "MLXLMHFAPI", package: "swift-hf-api-mlx"),
// If you use MLXEmbedders:
.product(name: "MLXEmbeddersTokenizers", package: "swift-tokenizers-mlx"),
.product(name: "MLXEmbeddersHFAPI", package: "swift-hf-api-mlx"),// Before (2.x)
import MLXLLM
// After (3.x)
import MLXLLM
import MLXLMHFAPI // Downloader adapter
import MLXLMTokenizers // Tokenizer adapterIf you use MLXEmbedders:
import MLXEmbedders
import MLXEmbeddersHFAPI // Downloader adapter
import MLXEmbeddersTokenizers // Tokenizer adapterThe core APIs now include a from: parameter of type URL or any Downloader as well as a using: parameter for the tokenizer loader. Tokenizer integration packages may supply convenience methods with a default tokenizer loader, allowing you to omit the using: parameter.
The most visible call-site changes are:
hub:→from:: Models are now loaded from a directoryURLorDownloader.HubApi→HubClient: A new implementation of the Hugging Face Hub client is used.
Example when downloading from Hugging Face:
// Before (2.x) – hub defaulted to HubApi()
let container = try await loadModelContainer(
id: "mlx-community/Qwen3-4B-4bit"
)
// After (3.x) – Using Swift Hugging Face + Swift Tokenizers
let container = try await loadModelContainer(
from: HubClient.default,
id: "mlx-community/Qwen3-4B-4bit"
)At the lower-level core API, you can still pass any Downloader and any TokenizerLoader explicitly.
Loading from a local directory:
// Before (2.x)
let container = try await loadModelContainer(directory: modelDirectory)
// After (3.x)
let container = try await loadModelContainer(from: modelDirectory)Loading with a model factory:
let container = try await LLMModelFactory.shared.loadContainer(
from: HubClient.default,
configuration: modelConfiguration
)Loading an embedder:
import MLXEmbedders
import MLXEmbeddersHFAPI
import MLXEmbeddersTokenizers
let container = try await loadModelContainer(
from: HubClient.default,
configuration: .configuration(id: "sentence-transformers/all-MiniLM-L6-v2")
)decode(tokens:) is renamed to decode(tokenIds:) to align with the transformers library in Python:
// Before (2.x)
let text = tokenizer.decode(tokens: ids)
// After (3.0)
let text = tokenizer.decode(tokenIds: ids)Developers can use these examples in their own programs -- just import the swift package!
- Porting and implementing models
- MLXLLMCommon: Common API for LLM and VLM
- MLXLLM: Large language model example implementations
- MLXVLM: Vision language model example implementations
- MLXEmbedders: Popular encoders and embedding models example implementations
The hub parameter (previously HubApi) has been replaced with from (any Downloader or URL for a local directory). Functions that previously defaulted to defaultHubApi no longer have a default – callers must either pass a Downloader explicitly or use the convenience methods in MLXLMHuggingFace / MLXEmbeddersHuggingFace, which default to HubClient.default.
For most users who were using the default Hub client, adding import MLXLMHuggingFace or import MLXEmbeddersHuggingFace and using the convenience overloads is sufficient.
Users who were passing a custom HubApi instance should create a HubClient instead and pass it as the from parameter. HubClient conforms to Downloader via MLXLMHuggingFace.
tokenizerIdandoverrideTokenizerhave been replaced bytokenizerSource: TokenizerSource?, which supports.id(String)for remote sources and.directory(URL)for local paths.preparePrompthas been removed. This shouldn't be used anyway, since support for chat templates is available.modelDirectory(hub:)has been removed. For local directories, pass theURLdirectly to the loading functions. For remote models, theDownloaderprotocol handles resolution.
loadTokenizer(configuration:hub:) has been removed. Tokenizer loading now uses AutoTokenizer.from(directory:) from Swift Tokenizers directly.
replacementTokenizers (the TokenizerReplacementRegistry) has been removed. Use AutoTokenizer.register(_:for:) from Swift Tokenizers instead.
The defaultHubApi global has been removed. Hugging Face Hub access is now provided by HubClient.default from the HuggingFace module.
downloadModel(hub:configuration:progressHandler:)→Downloader.download(id:revision:matching:useLatest:progressHandler:)loadTokenizerConfig(configuration:hub:)→AutoTokenizer.from(directory:)ModelFactory._load(hub:configuration:progressHandler:)→_load(configuration: ResolvedModelConfiguration)ModelFactory._loadContainer: removed (baseloadContainernow builds the container from_load)