DRAFT: Daniel #404

Draft
wants to merge 92 commits into main
Changes from 1 commit
Commits (92)
5a750e9
added build folder
Oct 16, 2024
8d3d151
script for building mac app
Oct 16, 2024
dcd3264
added ci/cd
Oct 16, 2024
eda1ffd
added nuitka to deps
Oct 16, 2024
96b8d85
fixes to PR
Oct 16, 2024
a54d178
win icon
Oct 16, 2024
5ad45f4
fixes to static linking
Oct 16, 2024
17b85bd
fixed mlx error
Oct 17, 2024
39c260f
mlx library fix
Oct 17, 2024
885c663
remove .DS_Store
AlexCheema Oct 17, 2024
358c58d
changes to replace error
Oct 17, 2024
768413f
use resolve_tokenizer consistently
AlexCheema Oct 17, 2024
c333b2c
os.path.dirname of sys.executable
AlexCheema Oct 17, 2024
8f6cb76
changes to win jobs
Oct 17, 2024
d2805dc
added linux icon
Oct 17, 2024
b4ae3c2
linux build fix
Oct 17, 2024
3b64471
linux message
Oct 17, 2024
d767713
linux build
Oct 17, 2024
edd9483
renamed folder
Oct 17, 2024
3a40d21
linux packaging code
Oct 18, 2024
4bb9a1b
fixes to linux dist
Oct 18, 2024
98442e4
changes to linux nuitka
Oct 19, 2024
29fe3cf
added linux icon
Oct 19, 2024
3b9dbc8
changed to linux ico
Oct 19, 2024
b598947
changes to linux build ci/cd
Oct 19, 2024
bd414b5
macOS build
Oct 20, 2024
bcd0a35
Merge branch 'exo-explore:main' into package-exo-as-installable
Oct 20, 2024
4823cc5
testing wfs
Oct 20, 2024
23b1ddc
fix to config
Oct 20, 2024
21bafa7
fix config issue
Oct 20, 2024
c40b90a
circle ci issue fix
Oct 20, 2024
18d0726
changes to config
Oct 20, 2024
68db82f
config.yml
Oct 20, 2024
cbe4dec
fix to config.yml
Oct 20, 2024
6432c2b
changes to resource class
Oct 20, 2024
a9bde46
change to image
Oct 20, 2024
4645292
build fix error
Oct 20, 2024
0281f4f
fixes to build errors
Oct 20, 2024
ada61f6
fix to build error
Oct 20, 2024
4583c28
fix to circle ci/cd
Oct 20, 2024
bc6bb44
build fixes
Oct 20, 2024
1a2a576
circle ci/cd fixes error
Oct 20, 2024
4da2529
circle ci/cd errors fix
Oct 20, 2024
7b69a89
ci/cd error fix
Oct 20, 2024
8dd099b
error fix
Oct 20, 2024
d9b780d
ci/cd error fix
Oct 20, 2024
4908a2e
circle ci/cd errors
Oct 20, 2024
ecdf113
indent fix
Oct 20, 2024
5d68850
error fix
Oct 20, 2024
ae06a95
fix to conda
Oct 20, 2024
34938c5
conda error fix
Oct 20, 2024
e24dc63
circle ci error
Oct 20, 2024
b2a77a1
circle ci error
Oct 20, 2024
ef35cfd
circle ci fix
Oct 20, 2024
4b6a6bc
cci fix
Oct 20, 2024
f8599a4
circle ci/cd fix
Oct 20, 2024
b5e6c18
circle ci/cd fix
Oct 20, 2024
aa11281
circle ci/cd erro fix
Oct 20, 2024
b7c17b6
circle ci/cd error
Oct 20, 2024
67612f5
circle ci/cd error
Oct 20, 2024
399fe36
circle error
Oct 20, 2024
3d13c49
ci/cd fix
Oct 20, 2024
dcaf20e
ci/cd fix
Oct 20, 2024
48dac95
build error fix
Oct 20, 2024
a384c8c
build error fix
Oct 20, 2024
8512263
improv to build ci/cd
Oct 21, 2024
2859a62
added jobs
Oct 21, 2024
55150b4
fix to linux job
Oct 21, 2024
48eaf85
fix to clang
Oct 24, 2024
054e444
float fix
Oct 24, 2024
eadfdb3
float fix
Oct 24, 2024
993217c
float fix
Oct 24, 2024
08d8b75
change to model
Oct 24, 2024
fd3436d
smaller model
Oct 24, 2024
761800b
fix to clang error
Oct 24, 2024
e1887ea
model change
Oct 24, 2024
f708c4b
fixing layer missing
Oct 24, 2024
fd45f3b
seg fault error
Oct 24, 2024
e0a2e9a
llvir
Oct 24, 2024
fb0ef4e
changed to fl32
Oct 24, 2024
96e3ed1
added 3.2-3b
Oct 24, 2024
67b12ab
float 16
Oct 25, 2024
9c4bf27
fix to float16
Oct 25, 2024
65d747e
fix to float16
Oct 25, 2024
7b2b7ec
fix to float16
Oct 25, 2024
5b7ee2e
fix to float16
Oct 25, 2024
682e7e4
using torch:
Oct 26, 2024
5fc85d3
fix conflicts
Oct 31, 2024
91395d9
daniel
Nov 2, 2024
24ca9bd
a few updates on things pulled from previous PR but not tested
dtnewman Nov 3, 2024
c913aa6
add disable browser launch option
dtnewman Nov 3, 2024
314c786
make the tinychat lines in chatgpt_api.py conditional on disable brow…
dtnewman Nov 3, 2024
float 16
josh authored and josh committed Oct 25, 2024
commit 67b12ab31a0abb2a9c71d47be5aabeef6adcccd3
13 changes: 11 additions & 2 deletions exo/inference/tinygrad/models/llama.py
@@ -252,8 +252,6 @@ def permute(v: Tensor, n_heads: int):


def fix_bf16(weights: Dict[Any, Tensor]):
-  for k, v in weights.items():
-    print(f"Key: {k}, Device: {v.device}, Dtype: {v.dtype}")

  if Device.DEFAULT == "CLANG":
    return {
@@ -263,3 +261,14 @@ def fix_bf16(weights: Dict[Any, Tensor]):
      k: v.to(dtypes.float16).to(v.device) if v.dtype == dtypes.bfloat16 else v for k, v in weights.items()
    }

+def fix_bf16(weights):
+  converted = {}
+  for k, v in weights.items():
+    if v.dtype == dtypes.bfloat16:
+      curr_device = v.device
+      converted_tensor = v.to(dtypes.float16)
+      device_str = getattr(curr_device, 'name', str(curr_device))
+      converted[k] = converted_tensor.to(device_str)
+    else:
+      converted[k] = v
+  return converted
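
Note: the intent of the fix_bf16 added above is to walk the weight dict and downcast any bfloat16 tensor to float16 on its original device, since some tinygrad backends (notably the CLANG path checked earlier in the file) do not handle bfloat16. A minimal standalone sketch of that pattern follows; it uses Tensor.cast rather than the commit's .to() call, assumes the active tinygrad backend can create bfloat16 tensors, and is an illustration of the idea rather than the PR's exact code.

# Sketch only: downcast bfloat16 weights to float16, leaving other dtypes untouched.
# Assumes the active tinygrad backend can materialize bfloat16 tensors.
from typing import Any, Dict
from tinygrad import Tensor, dtypes

def downcast_bf16(weights: Dict[Any, Tensor]) -> Dict[Any, Tensor]:
  return {k: v.cast(dtypes.float16) if v.dtype == dtypes.bfloat16 else v for k, v in weights.items()}

weights = {"w": Tensor.ones(2, 2, dtype=dtypes.bfloat16), "b": Tensor.zeros(2)}
converted = downcast_bf16(weights)
assert converted["w"].dtype == dtypes.float16
assert converted["b"].dtype == dtypes.float32  # non-bf16 tensors pass through unchanged
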
3 changes: 1 addition & 2 deletions exo/models.py
@@ -6,8 +6,7 @@
    "MLXDynamicShardInferenceEngine": Shard(model_id="mlx-community/Llama-3.2-1B-Instruct-4bit", start_layer=0, end_layer=0, n_layers=16),
  },
  "llama-3.2-3b": {
-    "MLXDynamicShardInferenceEngine": Shard(model_id="mlx-community/Llama-3.2-3B-Instruct-4bit", start_layer=0, end_layer=0, n_layers=28),
-    "TinygradDynamicShardInferenceEngine":Shard(model_id="unsloth/Llama-3.2-3B-Instruct", start_layer=0, end_layer=0, n_layers=28)
+    "MLXDynamicShardInferenceEngine": Shard(model_id="mlx-community/Llama-3.2-3B-Instruct-4bit", start_layer=0, end_layer=0, n_layers=28)
  },
  "llama-3.1-8b": {
    "MLXDynamicShardInferenceEngine": Shard(model_id="mlx-community/Meta-Llama-3.1-8B-Instruct-4bit", start_layer=0, end_layer=0, n_layers=32),