Skip to content

Latest commit

 

History

History
10 lines (7 loc) · 742 Bytes

faq.rst

File metadata and controls

10 lines (7 loc) · 742 Bytes

Frequently Asked Questions (FAQ)

This is a list of Frequently Asked Questions (FAQ) about the MLC-LLM. Feel free to suggest new entries!

... How can I customize the temperature, repetition penalty of models?
There is a mlc-chat-config.json file under your model directory, where you can modify the parameters.
... What's the quantization algorithm MLC-LLM using?
MLC-LLM does not impose any restrictions on the choice of the quantization algorithm, and the design should be compatible with all quantization algorithms. By default, we utilize the grouping quantization method discussed in the paper The case for 4-bit precision: k-bit Inference Scaling Laws.