When I add the following code to `train.py` (in the `init_model` function) to count the LLM parameters:
```python
params_llm = 0
for k, v in llm_model.named_parameters():
    print(k, v.shape, v.numel())
    params_llm += v.numel()
# report the count in millions of parameters
print('llm total params: {:.2f}M'.format(params_llm / 1e6))
```
I always get:

```
transformer.wte.weight torch.Size([0]) 0
transformer.h.0.ln_1.weight torch.Size([0]) 0
transformer.h.0.attn.c_attn.weight torch.Size([0]) 0
transformer.h.0.attn.c_attn.bias torch.Size([0]) 0
transformer.h.0.attn.c_proj.weight torch.Size([0]) 0
transformer.h.0.ln_2.weight torch.Size([0]) 0
transformer.h.0.mlp.w1.weight torch.Size([0]) 0
transformer.h.0.mlp.w2.weight torch.Size([0]) 0
transformer.h.0.mlp.c_proj.weight torch.Size([0]) 0
transformer.h.1.ln_1.weight torch.Size([0]) 0
transformer.h.1.attn.c_attn.weight torch.Size([0]) 0
transformer.h.1.attn.c_attn.bias torch.Size([0]) 0
transformer.h.1.attn.c_proj.weight torch.Size([0]) 0
... ...
```
Is this normal, or will it affect model training?
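
For reference, the zero shapes look like what DeepSpeed ZeRO stage 3 produces when parameters are partitioned across ranks; I am only guessing that this repo trains with ZeRO-3, so please correct me if not. Under ZeRO-3, `named_parameters()` yields shape-`[0]` placeholder tensors on each rank, while the partitioned parameters carry a DeepSpeed-attached `ds_numel` attribute holding the full element count. A sketch of the counting workaround I would try, under that assumption:

```python
# Hedged sketch, assuming the zero shapes come from DeepSpeed ZeRO-3
# partitioning (my assumption, not confirmed from train.py).
params_llm = 0
for k, v in llm_model.named_parameters():
    if hasattr(v, 'ds_numel'):
        # ZeRO-3 placeholder: the full, un-sharded element count is
        # stored in the DeepSpeed-attached ds_numel attribute.
        params_llm += v.ds_numel
    else:
        # Ordinary parameter: numel() is already the full count.
        params_llm += v.numel()
print('llm total params: {:.2f}M'.format(params_llm / 1e6))
```

If partitioning is indeed the cause, I would expect training itself to be unaffected, since the shards are gathered on the fly during forward and backward, but confirmation would be appreciated.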