Generate Hugging Face model card #58
Hi, not sure if this is already implemented, but FYI: Hugging Face confirmed that models with the “merge” tag will be marked as merges.
@davanstrien Happy new year! Whenever you get back into things, I have a first pass at card generation about ready. I'd appreciate your thoughts on it before I merge it in. Here's what gets generated for
---
base_model:
- TheBloke/Llama-2-13B-fp16
- garage-bAInd/Platypus2-13B
- psmathur/orca_mini_v3_13b
- WizardLM/WizardMath-13B-V1.0
tags:
- mergekit
- merge
---
# ties-example
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method using [TheBloke/Llama-2-13B-fp16](https://huggingface.co/TheBloke/Llama-2-13B-fp16) as a base.
### Models Merged
The following models were included in the merge:
* [garage-bAInd/Platypus2-13B](https://huggingface.co/garage-bAInd/Platypus2-13B)
* [psmathur/orca_mini_v3_13b](https://huggingface.co/psmathur/orca_mini_v3_13b)
* [WizardLM/WizardMath-13B-V1.0](https://huggingface.co/WizardLM/WizardMath-13B-V1.0)
### Configuration
The following YAML configuration was used to produce this model:
```yaml
base_model: TheBloke/Llama-2-13B-fp16
dtype: float16
merge_method: ties
models:
- model: TheBloke/Llama-2-13B-fp16
- model: psmathur/orca_mini_v3_13b
  parameters:
    density: [1.0, 0.7, 0.1]
    weight: 1.0
- model: garage-bAInd/Platypus2-13B
  parameters:
    density: 0.5
    weight: [0.0, 0.3, 0.7, 1.0]
- model: WizardLM/WizardMath-13B-V1.0
  parameters:
    density: 0.33
    weight:
    - filter: mlp
      value: 0.5
    - value: 0.0
parameters:
  int8_mask: 1.0
  normalize: 1.0
```
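To make the relationship between the config and the card's metadata concrete, here is a minimal sketch of how the front matter above could be derived from a merge config. It is not the code in this PR: `card_front_matter` and `example_config` are hypothetical names, the config is passed as an already-parsed dict, and models are listed in config order, which may differ from the order the real generator emits.

```python
from typing import Any


def card_front_matter(config: dict[str, Any]) -> str:
    """Build YAML front matter for a model card from a parsed merge config."""
    # Collect every referenced model exactly once, base model first.
    # Assumption: config order is kept; the actual tool may order entries differently.
    candidates = [config.get("base_model")]
    candidates += [entry.get("model") for entry in config.get("models", [])]
    names: list[str] = []
    for name in candidates:
        if name and name not in names:
            names.append(name)

    lines = ["---", "base_model:"]
    lines += [f"- {name}" for name in names]
    # The "merge" tag is what the Hub uses to mark a model as a merge.
    lines += ["tags:", "- mergekit", "- merge", "---"]
    return "\n".join(lines)


example_config = {
    "base_model": "TheBloke/Llama-2-13B-fp16",
    "merge_method": "ties",
    "models": [
        {"model": "TheBloke/Llama-2-13B-fp16"},
        {"model": "psmathur/orca_mini_v3_13b"},
        {"model": "garage-bAInd/Platypus2-13B"},
        {"model": "WizardLM/WizardMath-13B-V1.0"},
    ],
}

print(card_front_matter(example_config))
```

The base model appears in both `base_model` and `models` in the config, so the sketch deduplicates names before writing the list.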
This is looking super nice; thanks for working on this! I think most of the obvious metadata fields are included in the template already. It might be possible to also infer some additional fields from the base models being merged, but I think it's probably better to keep it a bit simpler to start, especially as some merges may include many models.
Does it also make sense to add a |
This is super cool @cg123! Generated model card looks quite nice already.
I've used this branch to push my first merge: https://huggingface.co/julien-c/Mistral-7B-Neural-Story-mix 🔥
Note that the Hub model page UI element is quite ugly for now (will improve next week) but we already link to the merged models in the UI:

A few questions/suggestions (with a wider scope than this PR):
- could you also copy the input yml file to the output folder, using a conventional filename (maybe something like `mergekit_config.yml`?). That way people would consistently upload it and the Hub could parse them down the road and provide some cool features based on this metadata (stats, UI, etc)
- (nit) if you wanted to let users programmatically upload their models like @davanstrien was suggesting you might want to use some model card helpers from `huggingface_hub`, for instance to "merge" with the remote model card rather than overwrite it (in my model i had set a license at repo creation and it was overwritten). It's just a detail at this point though.
- we should do a nice icon for mergekit on the Hub so the tag stands out more, do you already have a logo or icon in mind? otherwise we'll come up with something
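The "merge rather than overwrite" idea in the second point can be sketched on plain dictionaries. This is only an illustration of one possible policy, not what mergekit or `huggingface_hub` does: `merge_card_metadata` is a hypothetical helper, the dicts stand in for card metadata loaded from the remote repo and from the generator, and the license value is a placeholder.

```python
from typing import Any


def merge_card_metadata(remote: dict[str, Any], generated: dict[str, Any]) -> dict[str, Any]:
    """Combine metadata so fields already set on the remote card survive."""
    merged = dict(remote)
    for key, value in generated.items():
        if key not in merged:
            # New field from the generator: just add it.
            merged[key] = value
        elif isinstance(merged[key], list) and isinstance(value, list):
            # Union list-valued fields such as tags, keeping first-seen order.
            merged[key] = merged[key] + [v for v in value if v not in merged[key]]
        # Otherwise keep the remote value (e.g. a license set at repo creation).
    return merged


# Placeholder values for illustration only.
remote_meta = {"license": "apache-2.0", "tags": ["story"]}
generated_meta = {
    "tags": ["mergekit", "merge"],
    "base_model": ["TheBloke/Llama-2-13B-fp16"],
}

print(merge_card_metadata(remote_meta, generated_meta))
```

Under this policy a license chosen at repo creation is preserved, while the generated tags and `base_model` entries are still added.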

more generally any feature we could build that'd be useful to you, just let us know!
Thanks for the comments @davanstrien @julien-c! I went ahead and added a copy of the original config YAML to the output directory as well. I think this is good to go for the first pass and will merge it shortly. Next step will be a separate PR for

As for an icon, I've been half-thinking about

Again thanks for the input!
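For reference, copying the config next to the merged weights can be as small as the sketch below. It assumes the filename suggested earlier in the thread (`mergekit_config.yml`); `copy_config` is a hypothetical helper and may not match how the merged change names or writes the file.

```python
import shutil
from pathlib import Path

# Assumption: the conventional filename proposed in this thread.
CONFIG_FILENAME = "mergekit_config.yml"


def copy_config(config_path: str, out_dir: str) -> Path:
    """Copy the input config into the output folder under a fixed name."""
    target_dir = Path(out_dir)
    # Create the output folder if the merge has not created it yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / CONFIG_FILENAME
    shutil.copyfile(config_path, target)
    return target
```

A fixed filename means anything that later scans uploaded repos only has to look for one path, whatever the user's original config was called.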
cool idea, we'll try something with @gary149 and if you like it you can use it! |
WDYT @cg123? we're thinking of using that icon for mergekit on HF (in model filters, etc): |
@gary149 Sorry for not seeing this sooner! Thanks for making this icon - I think it looks great. I'd be happy to have it used for mergekit on HF. |
That's great to hear! here is a svg of it:
|
FYI @gary149 the icon doesn't render correctly on dark mode: |
Thanks, we are going to fix it. |
WIP implementation for #41.