
Commit a56c623

change memory model from gpu to npu

1 parent ff5270e · commit a56c623


50 files changed: +225 −291 lines

README.md

Lines changed: 6 additions & 6 deletions

@@ -7,7 +7,7 @@ Descend is a safe systems programming language that adapts and extends Rust's ty
 - **Extended Borrow Checking**: Prevents data races by tracking unique (`uniq`) and shared (`shrd`) references across thousands of parallel threads
 - **Memory Views**: Safe parallel access patterns that replace raw pointer indexing, statically verified to be race-free
 - **Execution Resource Tracking**: Types enforce that memory is only accessed in correct execution contexts (`cpu.thread`, `gpu.grid`, `gpu.block`, `gpu.thread`)
-- **Explicit Memory Spaces**: References track physical memory locations (`cpu.mem`, `gpu.global`, `gpu.shared`) preventing invalid cross-device accesses
+- **Explicit Memory Spaces**: References track physical memory locations (`cpu.mem`, `npu.global`, `gpu.shared`) preventing invalid cross-device accesses
 - **Safe Synchronization**: The type system enforces correct placement and usage of synchronization primitives
 
 **Design Philosophy:**
@@ -59,7 +59,7 @@ AscendNPU-IR defines custom MLIR dialects tailored to Ascend NPU capabilities:
 
 **Descend + AscendNPU-IR Integration:**
 
-The MLIR backend maps Descend's execution contexts (`gpu.grid`/`gpu.block`/`gpu.thread`) and memory hierarchies (`gpu.global`/`gpu.local`) to corresponding Ascend NPU constructs through AscendNPU-IR's HIVM dialect. This integration:
+The MLIR backend maps Descend's execution contexts (`gpu.grid`/`gpu.block`/`gpu.thread`) and memory hierarchies (`npu.global`/`gpu.local`) to corresponding Ascend NPU constructs through AscendNPU-IR's HIVM dialect. This integration:
 
 - Preserves Descend's compile-time safety guarantees (race freedom, memory safety, synchronization correctness)
 - Generates efficient code optimized for Ascend NPU hardware
@@ -97,9 +97,9 @@ Descend:
 
 ```rust
 fn add<n: nat, r: prv>(
-    a: &r shrd gpu.global [i16; 16],
-    b: &r shrd gpu.global [i16; 16],
-    c: &r uniq gpu.global [i16; 16]
+    a: &r shrd npu.global [i16; 16],
+    b: &r shrd npu.global [i16; 16],
+    c: &r uniq npu.global [i16; 16]
 ) -[grid: gpu.grid<X<1>, X<16>>]-> () {
     // Vector addition with GPU memory spaces
     ()
@@ -274,7 +274,7 @@ The CUDA backend generates C++ code for NVIDIA GPUs:
 ###### ✅ Phase 2: Ascend-Specific Lowering (Completed)
 
 - [x] Map execution contexts (`gpu.grid`/`gpu.block`/`gpu.thread`) to HIVM parallel constructs
-- [x] Map memory hierarchies (`gpu.global` → HIVM global, `gpu.local` → HIVM shared)
+- [x] Map memory hierarchies (`npu.global` → HIVM global, `gpu.local` → HIVM shared)
 - [x] HIVM dialect integration with proper address spaces
 - [x] HACC entry point and device function generation
 - [x] Comprehensive test suite (14 passing tests)
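The README above describes references that carry their physical memory space in the type, so a value in one memory space cannot be passed where another is expected. A minimal sketch of that idea in plain Rust, using zero-sized marker types (this is an illustration only, not Descend's implementation; the names `NpuGlobal`, `CpuMem`, `Ref`, and `npu_add` are hypothetical):

```rust
use std::marker::PhantomData;

struct NpuGlobal; // stands in for Descend's `npu.global` memory space
#[allow(dead_code)]
struct CpuMem;    // stands in for `cpu.mem`

// A reference that carries its memory space as a zero-sized type parameter.
struct Ref<'a, Space, T> {
    data: &'a T,
    _space: PhantomData<Space>,
}

// Accepts only references into NPU global memory; passing a `Ref<CpuMem, _>`
// here is a compile-time error, mirroring Descend's cross-device check.
fn npu_add(a: Ref<NpuGlobal, [i16; 16]>, b: Ref<NpuGlobal, [i16; 16]>) -> [i16; 16] {
    let mut out = [0i16; 16];
    for i in 0..16 {
        out[i] = a.data[i] + b.data[i];
    }
    out
}

fn main() {
    let xs = [1i16; 16];
    let a = Ref { data: &xs, _space: PhantomData::<NpuGlobal> };
    let b = Ref { data: &xs, _space: PhantomData::<NpuGlobal> };
    println!("{}", npu_add(a, b)[0]); // prints 2
}
```

Unlike this sketch, Descend additionally tracks `shrd`/`uniq` ownership and execution contexts, but the memory-space dimension of the change in this commit is the same kind of type-level tag.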

examples/core/assign.desc.off

Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@
 fn assign<n: nat, r: prv>(
-    a: &r shrd gpu.global [i16; 16],
-    b: &r uniq gpu.global [i16; 16]
+    a: &r shrd npu.global [i16; 16],
+    b: &r uniq npu.global [i16; 16]
 ) -[grid: gpu.grid<X<1>, X<16>>]-> () {
     b = a;
     ()

examples/core/func_params.desc

Lines changed: 0 additions & 2 deletions

@@ -5,5 +5,3 @@ fn add(a: i32, b: i32) -[t: cpu.thread]-> i32 {
 fn main() -[t: cpu.thread]-> i32 {
     add(10, 32)
 }
-
-

examples/core/gpu_mem.desc.off

Lines changed: 3 additions & 3 deletions

@@ -1,7 +1,7 @@
 fn add<n: nat, r: prv>(
-    a: &r shrd gpu.global [i16; 16],
-    b: &r shrd gpu.global [i16; 16],
-    c: &r uniq gpu.global [i16; 16]
+    a: &r shrd npu.global [i16; 16],
+    b: &r shrd npu.global [i16; 16],
+    c: &r uniq npu.global [i16; 16]
 ) -[grid: gpu.grid<X<1>, X<16>>]-> () {
     // a = b + c;
     ()
Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@
 fn memory_load<n: nat, r: prv>(
-    a: &r shrd gpu.global [i16; 16],
-    b: &r uniq gpu.global [i16; 16]
+    a: &r shrd npu.global [i16; 16],
+    b: &r uniq npu.global [i16; 16]
 ) -[grid: gpu.grid<X<1>, X<16>>]-> () {
     ()
 }

examples/core/load_ub.desc.off

Lines changed: 1 addition & 1 deletion

@@ -1,5 +1,5 @@
 fn memory_load<n: nat, r: prv>(
-    a: &r shrd gpu.global [i16; 16]
+    a: &r shrd npu.global [i16; 16]
 ) -[grid: gpu.grid<X<1>, X<16>>]-> () {
     a;
     ()

examples/core/load_ub_twice.desc.off

Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@
 fn memory_load<n: nat, r: prv>(
-    a: &r shrd gpu.global [i16; 16],
-    b: &r uniq gpu.global [i16; 16]
+    a: &r shrd npu.global [i16; 16],
+    b: &r uniq npu.global [i16; 16]
 ) -[grid: gpu.grid<X<1>, X<16>>]-> () {
     a;
     b;

examples/core/vadd.desc.off

Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@
 fn add<n: nat, r: prv>(
-    a: &r shrd gpu.global [i16; 16],
-    b: &r shrd gpu.global [i16; 16]
+    a: &r shrd npu.global [i16; 16],
+    b: &r shrd npu.global [i16; 16]
 ) -[grid: gpu.grid<X<1>, X<16>>]-> () {
     a + b;
     ()

examples/core/vdiv.desc.off

Lines changed: 2 additions & 2 deletions

@@ -1,6 +1,6 @@
 fn div<n: nat, r: prv>(
-    a: &r shrd gpu.global [i16; 16],
-    b: &r shrd gpu.global [i16; 16]
+    a: &r shrd npu.global [i16; 16],
+    b: &r shrd npu.global [i16; 16]
 ) -[grid: gpu.grid<X<1>, X<16>>]-> () {
     a / b;
     ()

examples/core/vec_add.desc.off

Lines changed: 4 additions & 4 deletions

@@ -6,19 +6,19 @@
 // - r: prv - Provenance parameter tracking memory region/lifetime for all references
 fn add<n: nat, r: prv>(
     // Shared reference to first input vector - multiple threads can read simultaneously
-    // Memory space: gpu.global (GPU global memory)
+    // Memory space: npu.global (GPU global memory)
     // Ownership: shrd (shared) - prevents write-after-read data races
     // Type: 16-element array of 16-bit signed integers
-    a: &r shrd gpu.global [i16; 16],
+    a: &r shrd npu.global [i16; 16],
 
     // Shared reference to second input vector - multiple threads can read simultaneously
     // Same memory space and ownership constraints as 'a'
-    b: &r shrd gpu.global [i16; 16],
+    b: &r shrd npu.global [i16; 16],
 
     // Unique reference to output vector - only one thread can write at a time
     // Ownership: uniq (unique) - prevents write-after-write data races
     // The compiler statically ensures no conflicting borrows exist
-    c: &r uniq gpu.global [i16; 16]
+    c: &r uniq npu.global [i16; 16]
 
     // Execution context specification - defines how this function runs on GPU hardware
     // - grid: gpu.grid<X<1>, X<16>> - GPU execution grid with 1 block containing 16 threads
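The bulk of this commit is a mechanical rename of the `gpu.global` memory space to `npu.global` across the README and example files, while the other dotted names (`gpu.grid`, `gpu.block`, `gpu.thread`, `gpu.local`, `gpu.shared`, `cpu.mem`) are deliberately left unchanged. A rename with that property can be sketched as follows (a hypothetical script, not the tooling actually used for this commit):

```python
import re

def rename_memory_space(source: str) -> str:
    """Rewrite only the `gpu.global` memory space to `npu.global`.

    Word boundaries and the escaped dot keep other dotted names such as
    `gpu.grid`, `gpu.local`, and `gpu.shared` untouched, matching the
    scope of this commit's change.
    """
    return re.sub(r"\bgpu\.global\b", "npu.global", source)

sig = "a: &r shrd gpu.global [i16; 16]) -[grid: gpu.grid<X<1>, X<16>>]-> ()"
print(rename_memory_space(sig))
# → a: &r shrd npu.global [i16; 16]) -[grid: gpu.grid<X<1>, X<16>>]-> ()
```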

0 commit comments
