Commit fc5bf85

fix: fix a bug in flashinfer_struct (#966)
Authored-by: blueswhen and niushengxiao
Co-authored-by: niushengxiao <[email protected]>
Parent: e7f41ca

File tree: 2 files changed, +2 −2 lines


lightllm/models/llama/flashinfer_struct.py

Lines changed: 1 addition & 1 deletion

@@ -81,7 +81,7 @@ def init_some_extra_state(self, model, input_ids: torch.Tensor):
             self.req_manager.req_to_token_indexs,
             self.b_req_idx,
             self.b_seq_len,
-            self.b_start_loc,
+            kv_starts,
             self.max_len_in_batch,
             kv_indices,
         )
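The fix swaps `self.b_start_loc` (per-request start offsets of query tokens in the batch) for `kv_starts` at this position of the plan call. Assuming `kv_starts` is the exclusive prefix sum of `b_seq_len`, i.e. the indptr-style array over per-request KV lengths commonly expected by flashinfer-style paged-attention planning (an assumption; the source diff does not show how lightllm builds it), a minimal sketch of deriving it:

```python
import torch

# Hypothetical per-request KV-cache lengths for illustration; in lightllm
# these come from the request manager for the current batch.
b_seq_len = torch.tensor([3, 5, 2], dtype=torch.int32)

# kv_starts as an exclusive prefix sum over KV lengths: entry i is where
# request i's KV entries begin, and the last entry is the total count.
kv_starts = torch.zeros(b_seq_len.numel() + 1, dtype=torch.int32)
kv_starts[1:] = torch.cumsum(b_seq_len, dim=0)

print(kv_starts.tolist())  # [0, 3, 8, 10]
```

Passing `b_start_loc` here would index by query-token offsets rather than KV lengths, which is consistent with this one-line change being labeled a bug fix.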

lightllm/models/llama/model.py

Lines changed: 1 addition & 1 deletion

@@ -30,7 +30,7 @@ def __init__(self, model):
         self.tp_kv_head_num = max(model.config["num_key_value_heads"] // tp_world_size, 1)
         head_dim = model.config["hidden_size"] // model.config["num_attention_heads"]
         self.head_dim = model.config.get("head_dim", head_dim)
-        self.workspace_buffer = torch.empty(256 * 1024 * 1024, dtype=torch.int8, device=get_current_device_id())
+        self.workspace_buffer = torch.empty(512 * 1024 * 1024, dtype=torch.int8, device=get_current_device_id())
         self.max_seq_length = model.max_seq_length
         self.kv_indices_buffer = [
             torch.empty(
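The second change doubles the raw byte workspace handed to flashinfer's attention wrappers from 256 MiB to 512 MiB. A standalone sketch of the patched allocation, using `device="cpu"` for portability in place of lightllm's `get_current_device_id()` GPU helper:

```python
import torch

# 512 MiB of uninitialized int8 storage, matching the patched line. The
# workspace a flashinfer wrapper needs grows with batch size and sequence
# length, which is presumably why 256 MiB proved too small here.
WORKSPACE_BYTES = 512 * 1024 * 1024
workspace_buffer = torch.empty(WORKSPACE_BYTES, dtype=torch.int8, device="cpu")

print(workspace_buffer.numel())  # 536870912
```

Since `torch.empty` does not zero-fill, the allocation itself is cheap; the buffer is simply scratch space the attention kernels write into.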

Comments (0)