[Issue] Chapter 2.1: The attention function itself is fine, but the arguments passed to attention in the one line of code in Section 2.1.4 look wrong #120

@anliu2465-png

Description

1. Affected Chapter

Chapter 2.1

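2. Problem Description

In Section 2.1.4 (Self-Attention), why does the code call attention(x, x, x)? My understanding is that although Q, K, and V are all computed from the same input X, each uses a different weight matrix: W_Q, W_K, and W_V. Also, aren't the arguments of the attention function supposed to be the Q, K, and V matrices? Why is x passed in directly here?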

3. Reproduction Materials

# attention is the attention function defined earlier in the chapter
attention(x, x, x)
attention(XW_Q, XW_K, XW_V)
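To make the question concrete, here is a minimal NumPy sketch comparing the two calls above. It is not the book's code: the `attention` and `softmax` definitions below are my own stand-ins for scaled dot-product attention, and the shapes and random values of `x`, `W_Q`, `W_K`, and `W_V` are assumptions chosen only for illustration.

```python
import numpy as np


def softmax(scores):
    # Numerically stable softmax over the last axis.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def attention(query, key, value):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = query.shape[-1]
    scores = query @ key.swapaxes(-2, -1) / np.sqrt(d_k)
    return softmax(scores) @ value


rng = np.random.default_rng(0)
seq_len, d_model = 4, 8  # assumed toy sizes
x = rng.normal(size=(seq_len, d_model))

# Hypothetical projection weights (random here; learned in a real model).
W_Q = rng.normal(size=(d_model, d_model))
W_K = rng.normal(size=(d_model, d_model))
W_V = rng.normal(size=(d_model, d_model))

# Call as written in Section 2.1.4: the raw input is used as Q, K, and V.
out_raw = attention(x, x, x)

# Call I expected: the same input, projected by three different matrices.
out_proj = attention(x @ W_Q, x @ W_K, x @ W_V)

print(out_raw.shape, out_proj.shape)
print(np.allclose(out_raw, out_proj))
```

Both calls are self-attention in the sense that Q, K, and V come from the same x, but they generally give different results. If the intent is that the projections are applied elsewhere (for example inside a multi-head attention module) before attention is called, it might help for the text to say so explicitly.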

### Verification

- [x] This issue hasn't been reported before

Labels: documentation
