Llama-vision Attentions. Find top attended patch ids #1

Description

@claudiopisa

I have been analysing the Llama-3.2InferanceAnalysis.ipynb code you posted. At some point in the code, the top attended patches are drawn over the image:

# Assume 14x14 patch size → 16×16 grid → 256 total patches
patch_size = 14
top_attended_patch_ids = [85, 86, 69, 70, 53, 54, 101, 102, 117, 118]  # Replace with your actual patch indices

# Draw rectangles around top patches
patch_id = 0
for row in range(0, 224, patch_size):
    for col in range(0, 224, patch_size):
        if patch_id in top_attended_patch_ids:
            draw.rectangle([col, row, col + patch_size, row + patch_size], outline="red", width=2)
            draw.text((col + 2, row + 2), str(patch_id), fill="red")
        patch_id += 1
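
As far as I can tell, the loop above numbers patches in row-major order (left to right, then top to bottom), so a patch id maps directly to its pixel box. Below is a minimal sketch of that mapping, assuming the same 224×224 image and 14-pixel patches as in the snippet. The helper name `patch_box` is my own and does not come from the notebook.

```python
def patch_box(patch_id, patch_size=14, image_size=224):
    """Return (left, top, right, bottom) for a row-major patch id.

    Hypothetical helper: assumes a square image split into square patches,
    numbered left to right and then top to bottom, as in the loop above.
    """
    per_row = image_size // patch_size  # 16 patches per row for 224 / 14
    if not 0 <= patch_id < per_row * per_row:
        raise ValueError(f"patch_id must be in [0, {per_row * per_row - 1}]")
    # Row-major numbering: id = row * per_row + col
    row, col = divmod(patch_id, per_row)
    left, top = col * patch_size, row * patch_size
    return (left, top, left + patch_size, top + patch_size)


if __name__ == "__main__":
    for pid in [85, 86, 69]:
        print(pid, patch_box(pid))
```

With this, each id in `top_attended_patch_ids` could be passed straight to `draw.rectangle(patch_box(pid), ...)` instead of scanning the whole grid.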

Could you please help me understand how you found the values of the top attended patch ids? The plotted heatmaps show the embedding values of the patches on the x-axis, so I don't understand how to recover the original patch ids from them.
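
To make the question concrete: if one attention weight per patch were available as a flat sequence, I would expect the ids to come from a top-k selection over it. Below is a minimal sketch of that idea, assuming a hypothetical list `patch_attention` of 256 weights in the same row-major order as the drawing loop. Neither the name nor the layout is taken from the notebook, and the actual method may differ.

```python
def top_patch_ids(patch_attention, k=10):
    """Return the indices of the k largest attention weights, highest first.

    Hypothetical helper: assumes `patch_attention` holds one weight per patch,
    indexed by patch id. Ties keep the lower patch id first.
    """
    scores = list(patch_attention)
    k = min(k, len(scores))
    # Sort patch ids by their weight, largest weight first
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Toy weights for a 16x16 grid: three patches stand out
    weights = [0.0] * 256
    weights[85], weights[86], weights[69] = 0.9, 0.8, 0.7
    print(top_patch_ids(weights, k=3))
```

The open point is how to get from the attention tensors in the notebook to such a per-patch vector in the first place.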
