Address-Space ID Table (AS0–AS225/501/502)

Every ID, MemorySpace number, and pool name on this page was decoded byte-exactly from the SparseCore LLVM-lowering functions in libtpu.so from the libtpu-0.0.40-cp314 wheel (BuildID md5 89edbbe81c5b328a958fe628a9f2207d). Other versions differ.

Abstract

A SparseCore pointer in the LLVM backend carries a numeric address-space ID — the integer N in !llvm.ptr<N>. Unlike a typical GPU backend that uses a compact AS0..AS5 numbering, the SparseCore LLVM dialect uses a sparse, banded ID space: 0 for the inherited scalar memory, 201..225 (base 0xc9) for the SparseCore-specific pools and their alias groups, and 501/502 (0x1f5/0x1f6) for the two circular-buffer windows. Each ID maps 1:1 onto a 1-based mlir::sparse_core::MemorySpace enum value, which in turn names a physical (or virtual/alias) memory pool — smem, tile_spmem, spmem, hbm, sflag, vmem, dreg, timem, simem, iova, mar, the per-tile/per-SCS variants, and the may-alias *_any supersets.

The ID is how the LowerToSparseCoreLlvm pass routes a ScDialect memref operand to a leaf tpu_* intrinsic: the DMA and stream lowerings dispatch on the (srcMemSpaceID, dstMemSpaceID) pair, and the addrspacecast lowering elides or emits an llvm.addrspacecast by comparing the LLVM pointer types the two IDs convert to. The table is recovered three independent ways that agree to the ID — the forward AddressSpaceDescription(ID)→string switch, the AddressSpaceToMemorySpace(ID)→MemorySpace jump table, and the inverse MemorySpaceToAddressSpace(MemorySpace)→ID reverse table.

The contract for a reimplementer is: tag every SparseCore pointer with the ID for its pool; classify on-tile vs off-tile with the one-line IsOffTileMemory mask; canonicalise to the *Any superset for alias analysis when the exact tile/core is statically unknown.

| Item | Value |
| --- | --- |
| ID bands | 0 (base Smem); 201..225 = 0xc9..0xe1 (SC pools + alias groups); 501/502 = 0x1f5/0x1f6 (CB windows) |
| Named IDs | 22 (19 in the 201-band + ID 0 + 501 + 502); 6 reserved/gap slots (206, 207, 209, 210, 221, 222) |
| MemorySpace enum | 22 enum slots (1-based, max 0x16); 21 valid, value 8 is an unused gap (empty name, no AS) |
| Forward (ID→name) | mlir::sparse_core::LlvmTpuDialect::AddressSpaceDescription(int) @ 0x135462c0 |
| Forward (ID→MS) | mlir::sparse_core::AddressSpaceToMemorySpace(uint) @ 0x14b78800 |
| Inverse (MS→ID) | mlir::sparse_core::MemorySpaceToAddressSpace(MemorySpace) @ 0x14b78780 |
| Pool name (MS→str) | mlir::sparse_core::stringifyMemorySpace(MemorySpace) @ 0x14b78240 |
| On-/off-tile | IsOffTileMemory(MemorySpace) @ 0x13d7ac00 = (ms & ~0x10) != 2 |
| Alias canonicalise | GetAnyTypeFromAddressSpace(int) @ 0x1357b400 |
| Confidence | CONFIRMED unless a cell is annotated otherwise |
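
As a rough illustration of the banding summarised above, the following C++ sketch classifies a raw address-space integer into its band. The names AsBand and ClassifyBand are hypothetical helpers for this page, not symbols recovered from libtpu.so. Band membership alone does not make an ID named: 206, 207, 209, 210, 221, and 222 sit inside the 201..225 band but are reserved.

```cpp
#include <cstdint>

// Hypothetical illustration; these names do not come from libtpu.so.
enum class AsBand { kBaseSmem, kSparseCore, kCircularBuffer, kOutOfBand };

// Classify an address-space integer by the three bands listed on this page.
constexpr AsBand ClassifyBand(std::uint32_t as_id) {
  if (as_id == 0) return AsBand::kBaseSmem;                       // inherited scalar memory
  if (as_id >= 0xc9 && as_id <= 0xe1) return AsBand::kSparseCore;  // 201..225
  if (as_id == 0x1f5 || as_id == 0x1f6) {
    return AsBand::kCircularBuffer;                                // 501 / 502
  }
  return AsBand::kOutOfBand;
}

static_assert(ClassifyBand(0) == AsBand::kBaseSmem, "ID 0 is the base Smem space");
static_assert(ClassifyBand(201) == AsBand::kSparseCore, "0xc9 opens the SC band");
static_assert(ClassifyBand(225) == AsBand::kSparseCore, "0xe1 closes the SC band");
static_assert(ClassifyBand(501) == AsBand::kCircularBuffer, "0x1f5 is a CB window");
static_assert(ClassifyBand(226) == AsBand::kOutOfBand, "226 is outside every band");
```

A caller would still need the master table to decide whether an in-band ID is a named pool, an alias group, or a reserved gap.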

The Master AS-ID Table

AS# is the LLVM address-space integer (the N in !llvm.ptr<N>). Region/pool is stringifyMemorySpace(MS#). MS# is the MemorySpace enum value (1-based; 0 = no canonical pool). Width is the addressing scale the pool covers: KB per-tile, MB chip-shared, GB global. tile? is ON when IsOffTileMemory returns false, which happens only for MS 2 and MS 18. A ✓ in the Meaning column means the ID↔MS mapping is confirmed by the inverse MemorySpaceToAddressSpace table.

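| AS# | hex | Region / pool | MS# | Width | tile? | Meaning · Confidence |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 0x0 | smem | 1 | KB | off | inherited base TPU scalar memory ✓ |
| 201 | 0xc9 | tile_spmem | 2 | KB | ON | per-tile SparseCore SRAM ✓ |
| 202 | 0xca | spmem | 3 | MB | off | chip-shared SparseCore SRAM ✓ |
| 203 | 0xcb | hbm | 4 | GB | off | global HBM (embedding tables) ✓ |
| 204 | 0xcc | sflag | 5 | | off | sync-flag memory ✓ (MS 22 sflag_tc also maps here) |
| 205 | 0xcd | vmem | 6 | MB | off | TensorCore vector memory (TC↔SC handoff) ✓ |
| 206 | 0xce | | 0 | | | reserved / gap |
| 207 | 0xcf | | 0 | | | reserved / gap |
| 208 | 0xd0 | dreg | 7 | | off | data-register window ✓ |
| 209 | 0xd1 | | 0 | | | reserved / gap |
| 210 | 0xd2 | | 0 | | | reserved / gap |
| 211 | 0xd3 | (alias) | 0 | | off | SflagAny may-alias superset (no pool) |
| 212 | 0xd4 | smem_any | 9 | | off | SmemAny may-alias superset ✓ |
| 213 | 0xd5 | hbm_any | 10 | | off | HBMAny may-alias superset ✓ |
| 214 | 0xd6 | timem | 11 | | off | per-tile instruction memory ✓ |
| 215 | 0xd7 | simem | 12 | | off | SC instruction memory ✓ † |
| 216 | 0xd8 | iova | 13 | GB | off | I/O virtual address ✓ |
| 217 | 0xd9 | sflag_tile | 14 | | off | per-tile sflag bank ✓ |
| 218 | 0xda | spmem_any | 15 | | off | SpmemAny may-alias superset ✓ |
| 219 | 0xdb | smem_tile | 16 | KB | off | per-tile SMEM (TileSmem) ✓ |
| 220 | 0xdc | mar | 17 | | off | memory-access-region ✓ † |
| 221 | 0xdd | | 0 | | | reserved / gap |
| 222 | 0xde | | 0 | | | reserved / gap |
| 223 | 0xdf | sflag_scs | 20 | | off | per-SCS sflag bank (SflagScs) ✓ |
| 224 | 0xe0 | smem_scs | 21 | KB | off | per-SCS SMEM (SmemScs) ✓ |
| 225 | 0xe1 | (alias) | 0 | | off | SflagAnySynctile (no pool) |
| 501 | 0x1f5 | tile_spmem_cb | 18 | KB | ON | CBREG-windowed TILE_SPMEM |
| 502 | 0x1f6 | smem_cb | 19 | KB | off | CBREG-windowed SMEM |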

A † on a row marks an ID whose pool name comes only from stringifyMemorySpace, not from AddressSpaceDescription (see the GOTCHA below).

The desc (AddressSpaceDescription) strings for the named IDs are, in case order: TileSpmem, Spmem, HBM, Sflag, Vmem, Dreg, SflagAny, SmemAny, HBMAny, Timem, IOVA, SflagTile, SpmemAny, TileSmem, SflagScs, SmemScs, SflagAnySynctile; ID 0 returns Smem; 501/502 return "TileSpmem Circular Buffer" / "Smem Circular Buffer"; everything else returns "Unknown".

GOTCHA — IDs 215 (simem) and 220 (mar) carry a real MemorySpace (12 and 17) but AddressSpaceDescription returns the empty default for them — they fall into the same case 206/207/209/210/215/220/221/222: return result arm as the true reserved gaps. The pool names simem/mar come from stringifyMemorySpace, not from the description switch. A reader that derives names only from AddressSpaceDescription will wrongly treat 215/220 as reserved.
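
The pitfall can be made concrete with a small C++ sketch. DescriptionOf and PoolNameOf are hypothetical stand-ins rebuilt from the strings listed on this page; they are not the recovered AddressSpaceDescription and stringifyMemorySpace bodies, and PoolNameOf deliberately covers only the two IDs the description switch misses.

```cpp
// Hypothetical stand-in for the description switch, using the strings above.
constexpr const char* DescriptionOf(int id) {
  switch (id) {
    case 0:   return "Smem";
    case 201: return "TileSpmem";
    case 202: return "Spmem";
    case 203: return "HBM";
    case 204: return "Sflag";
    case 205: return "Vmem";
    case 208: return "Dreg";
    case 211: return "SflagAny";
    case 212: return "SmemAny";
    case 213: return "HBMAny";
    case 214: return "Timem";
    case 216: return "IOVA";
    case 217: return "SflagTile";
    case 218: return "SpmemAny";
    case 219: return "TileSmem";
    case 223: return "SflagScs";
    case 224: return "SmemScs";
    case 225: return "SflagAnySynctile";
    case 206: case 207: case 209: case 210:
    case 215: case 220: case 221: case 222:
      return "";  // shared empty-default arm: true gaps plus simem/mar
    case 501: return "TileSpmem Circular Buffer";
    case 502: return "Smem Circular Buffer";
    default:  return "Unknown";
  }
}

// Hypothetical fallback: only the two pool names the switch above omits.
constexpr const char* PoolNameOf(int id) {
  return id == 215 ? "simem" : id == 220 ? "mar" : "";
}

constexpr bool IsEmpty(const char* s) { return s[0] == '\0'; }

// Naive rule: trusts the description switch alone.
constexpr bool LooksReservedByDescription(int id) {
  return IsEmpty(DescriptionOf(id));
}

// Corrected rule: a gap has neither a description nor a pool name.
constexpr bool IsReservedGap(int id) {
  return IsEmpty(DescriptionOf(id)) && IsEmpty(PoolNameOf(id));
}

static_assert(LooksReservedByDescription(215) && !IsReservedGap(215),
              "215 (simem) only looks reserved");
static_assert(IsReservedGap(206), "206 is a true gap");
```

Consulting both sources before declaring an ID reserved yields exactly the six gap slots 206, 207, 209, 210, 221, and 222.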


How the Backend Tags Pointers

The LlvmTpuDialect declares no separate "pointer type" per pool. Instead the address-space integer above is the N in the LLVM pointer type !llvm.ptr<N>, and a ScDialect memref carries its MemorySpace as a memref attribute. The lowering converts that to the LLVM AS number and uses it as the dispatch key:

ScDialect op (memref with MemorySpace attr)
  → AddressSpaceToMemorySpace / MemorySpaceToAddressSpace  (ID ↔ MS, 1:1)
  → getStridedElementPtr → !llvm.ptr<AS#>                  (raw element pointer)
  → DMA/stream lowering dispatch on (srcAS, dstAS)         (selects tpu_* intrinsic)

AddressSpaceToMemorySpace(uint) is a jump table over IDs 201..224 plus explicit 501→18 / 502→19 arms; the low 32 bits of its 0x1_0000000N return value are the MemorySpace enum. MemorySpaceToAddressSpace(MemorySpace) is the inverse (apart from MS 22, sflag_tc, which shares ID 204 with MS 5): it indexes dword_AF36CE8[ms-1], gated by the range check ms-1 > 0x15 (rejects ms > 22) and the validity mask 0x3fff7f, tested as (0x3fff7f >> (ms-1)) & 1. The mask 0x3fff7f has 21 bits set (the 21 valid MemorySpace values; bit 7 is clear, i.e. value 8 is rejected).

stringifyMemorySpace and TpuVersionToString are both pointer-table lookups (off_219AF590[ms-1] and off_22011BF0[ver], both 1-based) whose string pointers live in .data.rel.ro and are filled by R_X86_64_RELATIVE relocations at load, so they read as zero in the on-disk image (confirmed: every slot of off_219AF590 is 0x0 on disk; resolving the RELATIVE relocs yields smem, tile_spmem, … sflag_tc).
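
The two gates on MemorySpaceToAddressSpace can be checked with a few lines of C++. This is a sketch of the arithmetic only: IsValidMemorySpace and PopCount are hypothetical names, and the table lookup itself (dword_AF36CE8) is not reproduced.

```cpp
#include <cstdint>

constexpr std::uint32_t kValidMask = 0x3fff7f;  // validity mask quoted in the text

// Range check first (ms - 1 > 0x15 rejects), then the per-value mask bit.
// The unsigned subtraction wraps for ms == 0, so 0 fails the range check.
constexpr bool IsValidMemorySpace(std::uint32_t ms) {
  return (ms - 1u) <= 0x15u && ((kValidMask >> (ms - 1u)) & 1u) != 0u;
}

// Hypothetical helper: count the set bits of a 32-bit value.
constexpr int PopCount(std::uint32_t v) {
  int n = 0;
  while (v != 0u) {
    n += static_cast<int>(v & 1u);
    v >>= 1;
  }
  return n;
}

static_assert(PopCount(kValidMask) == 21, "21 valid MemorySpace values");
static_assert(!IsValidMemorySpace(8), "bit 7 is clear, so value 8 is rejected");
static_assert(IsValidMemorySpace(1) && IsValidMemorySpace(22), "ends of the range");
static_assert(!IsValidMemorySpace(0) && !IsValidMemorySpace(23), "out of range");
```

The short-circuit on the range check matters: without it, an out-of-range ms would shift the mask by 32 bits or more, which is undefined in C++.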


On-Tile vs Off-Tile (the access-semantics gate)

IsOffTileMemory(MemorySpace) is a single masked compare:

bool IsOffTileMemory(int ms) { return (ms & 0xFFFFFFEF) != 2; }   // (ms & ~0x10) != 2

Clearing bit 4 (0x10) folds MS 2 (tile_spmem) and MS 18 = 0x12 (tile_spmem_cb) together, so only those two are on-tile. Every other pool — hbm, spmem, smem, sflag, vmem, dreg, timem, simem, iova, mar, all the *_tile/*_scs/*_any variants — is off-tile and requires a DMA, stream, or sync to reach. This is the predicate the DMA and stream lowerings consult before selecting a data-movement intrinsic.
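
A quick way to convince oneself of the folding is to run the predicate over the whole enum range. The sketch below restates the masked compare and counts the on-tile values; CountOnTile is a hypothetical helper, and the 1..22 range is taken from the MemorySpace enum size given on this page.

```cpp
// Same masked compare as above: clearing bit 4 folds MS 18 (0x12) onto MS 2.
constexpr bool IsOffTileMemory(int ms) { return (ms & ~0x10) != 2; }

// Hypothetical helper: how many enum values in 1..max_ms are on-tile.
constexpr int CountOnTile(int max_ms) {
  int n = 0;
  for (int ms = 1; ms <= max_ms; ++ms) {
    if (!IsOffTileMemory(ms)) ++n;
  }
  return n;
}

static_assert(!IsOffTileMemory(2), "tile_spmem is on-tile");
static_assert(!IsOffTileMemory(18), "tile_spmem_cb is on-tile");
static_assert(IsOffTileMemory(16), "smem_tile is per-tile but still off-tile");
static_assert(CountOnTile(22) == 2, "only MS 2 and MS 18 pass");
```

Note that the per-tile pools other than tile_spmem (for example smem_tile, MS 16) are still classified off-tile by this predicate.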


The *Any May-Alias Canonicalisation

When a pointer's exact tile or core is statically unknown, the SparseCore LLVM backend widens it to a wildcard *Any space for alias analysis. GetAnyTypeFromAddressSpace(int) is the canonicaliser:

| Concrete ID (name) | Canonical ID (name) |
| --- | --- |
| 201 TileSpmem, 202 Spmem | 218 SpmemAny |
| 203 HBM | 213 HBMAny |
| 204 Sflag | 211 SflagAny |
| 205 Vmem | 205 Vmem (self; no separate wildcard) |
| 219 TileSmem, 0 Smem | 212 SmemAny |

The *Any IDs (211/212/213/218) are alias-analysis groupings, not physical pools: each carries a description, and per the master table 212/213/218 also carry a MemorySpace value (9/10/15) while 211 has none (MS 0). Calling GetAnyTypeFromAddressSpace on an already-wildcard or leaf space (Dreg, Timem, IOVA, SflagTile, the *Any IDs themselves) hits the LogFatal("Unsupported address space: ") arm (llvm_tpu_dialect_only.h:100), so the canonicaliser is defined only over the concrete spaces above.

NOTE — the *Any widening is the SparseCore answer to the fat-pointer problem: a pointer into HBM/SPMEM whose owning tile is a runtime value cannot be proven disjoint from another such pointer, so the backend assigns both the HBMAny/SpmemAny superset and lets alias analysis treat them as may-alias. The concrete-vs-Any distinction is what keeps statically-resolved tile-local accesses from being pessimised.
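
The canonicalisation mapping in this section can be written as a plain switch. The C++ sketch below is an assumption-laden stand-in for GetAnyTypeFromAddressSpace: AnyTypeFor and kUnsupported are hypothetical names, and a sentinel return replaces the LogFatal arm so the unsupported cases can be observed without aborting.

```cpp
// Hypothetical sentinel used here in place of the LogFatal arm.
constexpr int kUnsupported = -1;

// Concrete address-space ID -> wildcard (*Any) ID, per the mapping above.
constexpr int AnyTypeFor(int as_id) {
  switch (as_id) {
    case 201:            // TileSpmem
    case 202:            // Spmem
      return 218;        // SpmemAny
    case 203:            // HBM
      return 213;        // HBMAny
    case 204:            // Sflag
      return 211;        // SflagAny
    case 205:            // Vmem maps to itself (no separate wildcard)
      return 205;
    case 0:              // Smem
    case 219:            // TileSmem
      return 212;        // SmemAny
    default:             // leaf spaces and the *Any IDs themselves
      return kUnsupported;
  }
}

static_assert(AnyTypeFor(201) == AnyTypeFor(202), "both SPMEM pools widen to SpmemAny");
static_assert(AnyTypeFor(0) == 212 && AnyTypeFor(219) == 212, "both SMEM pools widen to SmemAny");
static_assert(AnyTypeFor(218) == kUnsupported, "a wildcard cannot be widened again");
static_assert(AnyTypeFor(208) == kUnsupported, "Dreg is a leaf space");
```

Because the mapping is not idempotent (a wildcard input is rejected rather than returned unchanged), a caller has to track whether a pointer has already been widened.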


Cross-Validation and CheckAddressSpaces

The four accessors form a closed, self-checking system:

AddressSpaceDescription(ID)   : ID → human string      @0x135462c0
AddressSpaceToMemorySpace(ID) : ID → MemorySpace        @0x14b78800
MemorySpaceToAddressSpace(MS) : MemorySpace → ID (inv)  @0x14b78780
stringifyMemorySpace(MS)      : MemorySpace → pool name  @0x14b78240

The forward and inverse ID↔MS maps are exact inverses for all 21 named IDs (verified arm-by-arm against the decompiled switches). CheckAddressSpaces(SparseCoreTarget&, Operation*, int, int) @ 0x135b8e00 is the verifier the lowering calls to validate a (src, dst) ID pair against the target before emitting a data-movement intrinsic; its full legality matrix (which pairs are valid per primitive) is not enumerated here — only its existence and signature are confirmed.
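
The round-trip property can be sketched in C++ by transcribing both maps from the master table. AsToMs, MsToAs, and the counting helpers are hypothetical names; the tables are this page's decoded values, not bytes read from the binary, and the slot for the unused MS 8 is filled with a made-up sentinel.

```cpp
// Forward map (AS ID -> MS), transcribed from the master table; 0 = no MemorySpace.
constexpr int AsToMs(int id) {
  switch (id) {
    case 0:   return 1;   case 201: return 2;   case 202: return 3;
    case 203: return 4;   case 204: return 5;   case 205: return 6;
    case 208: return 7;   case 212: return 9;   case 213: return 10;
    case 214: return 11;  case 215: return 12;  case 216: return 13;
    case 217: return 14;  case 218: return 15;  case 219: return 16;
    case 220: return 17;  case 223: return 20;  case 224: return 21;
    case 501: return 18;  case 502: return 19;
    default:  return 0;
  }
}

constexpr int kNoAs = -1;  // hypothetical sentinel for the unused MS 8 slot

// Inverse map indexed by ms - 1; MS 22 (sflag_tc) shares AS 204 with MS 5.
constexpr int kMsToAs[22] = {0,   201, 202, 203, 204, 205, 208, kNoAs,
                             212, 213, 214, 215, 216, 217, 218, 219,
                             220, 501, 502, 223, 224, 204};

constexpr int MsToAs(int ms) {
  return (ms >= 1 && ms <= 22) ? kMsToAs[ms - 1] : kNoAs;
}

// MS -> AS -> MS over the 21 valid enum values (8 is skipped).
constexpr int CountMsRoundTripMismatches() {
  int bad = 0;
  for (int ms = 1; ms <= 22; ++ms) {
    if (ms != 8 && AsToMs(MsToAs(ms)) != ms) ++bad;
  }
  return bad;
}

// AS -> MS -> AS over every ID that carries a MemorySpace.
constexpr int CountAsRoundTripMismatches() {
  int bad = 0;
  for (int id = 0; id <= 502; ++id) {
    if (AsToMs(id) != 0 && MsToAs(AsToMs(id)) != id) ++bad;
  }
  return bad;
}

static_assert(CountAsRoundTripMismatches() == 0, "every ID with an MS round-trips");
static_assert(CountMsRoundTripMismatches() == 1, "only MS 22 fails: 22 -> 204 -> 5");
```

Under this transcription the ID→MS→ID direction is exact, while MS→ID→MS is exact except for MS 22, which the master table maps onto the same ID as MS 5.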


| Name | Relationship |
| --- | --- |
| LowerToSparseCoreLlvm pass | reads these IDs to route memref operands to tpu_* intrinsics |
| MemorySpaceCastOpLowering @ 0x135a5c20 | elides/emits llvm.addrspacecast by comparing converted pointer types |
| DmaSimpleStartOpLowering / LinearStreamStartOpLowering | dispatch on (srcAS, dstAS) / (dtype, off-tile MS, verb) |
| getStridedElementPtr | turns a memref+index into a raw !llvm.ptr<AS#> |

Cross-References