@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
high

This configuration file for bfloat16 is identical to its float16 counterpart. Based on the tuning script test/kernel/llama_gqa_diverse_decode_stage1_tuning.py, tuning appears to have been run only for torch.half (float16), with the results then copied to both the float16 and bfloat16 files. The bfloat16 configurations are therefore likely not optimal and could degrade performance. Either run the tuning specifically for bfloat16 and update these files, or remove the bfloat16 configuration files for now to avoid confusion and potential performance issues. This applies to all bfloat16 config files in this PR.
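
One way to confirm which bfloat16 files are plain copies is to compare each one against its float16 counterpart after parsing. The sketch below is a minimal illustration, not part of the PR: the function name `find_copied_configs` is hypothetical, and it assumes the dtype appears in each file name as `torch.float16` or `torch.bfloat16`, which this diff view does not show.

```python
import json
from pathlib import Path


def find_copied_configs(config_dir, src_tag="torch.float16", dst_tag="torch.bfloat16"):
    """Return names of dst_tag config files whose parsed JSON equals the src_tag counterpart.

    Assumption: the dtype is encoded in the file name, so swapping dst_tag for
    src_tag in the name yields the counterpart file.
    """
    config_dir = Path(config_dir)
    copied = []
    for dst_path in sorted(config_dir.glob(f"*{dst_tag}*.json")):
        src_path = dst_path.with_name(dst_path.name.replace(dst_tag, src_tag))
        if not src_path.exists():
            continue  # no counterpart to compare against
        with open(dst_path) as f_dst, open(src_path) as f_src:
            # Compare parsed objects so whitespace differences do not matter.
            if json.load(f_dst) == json.load(f_src):
                copied.append(dst_path.name)
    return copied
```

Identical content does not prove that a file was copied, since two separate tuning runs could land on the same values, so any file this reports is a candidate for re-tuning rather than a confirmed copy.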

@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
medium

This JSON configuration file is not formatted, which makes it difficult to read and review. For better maintainability, please format the JSON content with indentation and add a newline at the end of the file. This should be applied to all new JSON files in this pull request.

{
  "4096": {
    "8": {
      "BLOCK_N": 16,
      "num_warps": 4,
      "num_stages": 1
    },
    "32": {
      "BLOCK_N": 16,
      "num_warps": 4,
      "num_stages": 1
    },
    "128": {
      "BLOCK_N": 16,
      "num_warps": 2,
      "num_stages": 1
    },
    "256": {
      "BLOCK_N": 16,
      "num_warps": 2,
      "num_stages": 1
    }
  },
  "8192": {
    "8": {
      "BLOCK_N": 16,
      "num_warps": 2,
      "num_stages": 1
    },
    "32": {
      "BLOCK_N": 16,
      "num_warps": 2,
      "num_stages": 1
    },
    "128": {
      "BLOCK_N": 16,
      "num_warps": 2,
      "num_stages": 1
    },
    "256": {
      "BLOCK_N": 16,
      "num_warps": 2,
      "num_stages": 1
    }
  }
}
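
The formatting requested above can be applied mechanically. The following is a minimal sketch using only the Python standard library; the helper names `format_json_text` and `format_json_file` are hypothetical and are not part of the tuning script.

```python
import json
from pathlib import Path


def format_json_text(raw, indent=2):
    """Re-serialize JSON text with indentation and a single trailing newline."""
    # json.loads keeps key order, so the formatted file lists keys as before.
    return json.dumps(json.loads(raw), indent=indent) + "\n"


def format_json_file(path, indent=2):
    """Rewrite one JSON file in place; return True if its content changed."""
    path = Path(path)
    raw = path.read_text()
    formatted = format_json_text(raw, indent)
    if formatted == raw:
        return False
    path.write_text(formatted)
    return True
```

With `indent=2` this produces the same layout as the suggested content above. Calling `format_json_file` on each new config file in the PR would cover the "all new JSON files" part of the request, and running it a second time leaves already-formatted files untouched.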

@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 8, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 8, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 10}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 7}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 10}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 7}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 11}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 9}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 11}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 32, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 32, "num_warps": 4, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 11}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 2}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 11}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 3}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 2}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 9}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 9}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 4}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 8, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 3}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 5}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 64, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}
@@ -0,0 +1 @@
{"4096": {"8": {"BLOCK_N": 16, "num_warps": 4, "num_stages": 4}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}, "8192": {"8": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "32": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "128": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}, "256": {"BLOCK_N": 16, "num_warps": 2, "num_stages": 1}}}