TypeID Sentinel Address Table

An MLIR TypeID is the runtime identity tag MLIR uses for fast structural RTTI: every concrete C++ class the framework compares at runtime (a Dialect, an Op, a Type, an Attribute, an Interface, even a Trait) has exactly one mlir::TypeID value associated with it through mlir::TypeID::get<T>(). The implementation produces that value by reading the address of a per-class static storage object; the address itself is the identity. A typical compiled MLIR binary therefore contains hundreds of one-byte (or eight-byte Meyers-cached) anonymous globals in .bss / .rodata whose sole role is to be compared against each other by pointer equality. In tileiras (an 88 MB Blackwell-era CUDA 13.1 MLIR-based optimizing assembler) those sentinels cluster densely in the 0x5B37B90 .. 0x5BE6138 band of the static-data segment, and a single sentinel address is enough to trace back from a stripped function to the exact Op/Type/Attr class it dispatches on. This page is the canonical reverse-direction lookup table: address in, dialect-and-class out.

The binary uses two sentinel idioms in parallel. First, static pointer-identity sentinels: one-byte .bss slots whose address is the TypeID. No code ever writes the byte; the pointer is the value. These dominate the cute / cute_nvgpu / nv_tileas / NVVM op slabs. Second, Meyers-cached sentinels: a {guard byte, 64-bit qword} pair whose qword is filled in on first use by interning a C++-mangled mlir::TypeID::getFullName() string through a process-wide pool (sub_44A6CA0 in this binary; upstream MLIR ships the same RTTI-string-to-pointer interner under llvm::ManagedStatic). After init, the qword holds the TypeID. These dominate the cute interface anchors and a few standalone singletons. Both forms reach the per-op dispatcher in exactly the same way: through a two-step load, *(qword*)(*(qword*)(op + 48) + 16) (the OperationName::TypeID slot), followed by a pointer-equality test against a sentinel address baked into the dispatcher arm.

A third special case is the "shared no-properties guard" &unk_5BE6138, the global OperationName::TypeID reserved for the sentinel class mlir::detail::UnregisteredOpProperties. Every NVVM-to-LLVM and TileAS layout-classifier dispatcher tests against it first, either to short-circuit the no-properties path or to detect an op that is mid-rewrite. Because every arm references it, it is the single most-cited sentinel in the binary.
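The interning step behind the Meyers-cached form can be pictured with a small sketch. It assumes a fixed-capacity, single-threaded pool and uses the address of the pool entry as the identity. The capacity, the entry layout, and the name intern_typeid_string are illustrative choices, not the layout recovered from sub_44A6CA0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pool limits, chosen only for this sketch. */
#define POOL_CAPACITY 64
#define POOL_NAME_LEN 128

struct pool_entry {
    char name[POOL_NAME_LEN];   /* interned full type name */
};

static struct pool_entry g_pool[POOL_CAPACITY];
static size_t g_pool_used;

/* Return a stable pointer for a type name: the same name always yields
 * the same pointer, and different names yield different pointers.
 * Not thread-safe; a real pool would take a lock or rely on the guard. */
const void *intern_typeid_string(const char *type_full_name) {
    assert(type_full_name != NULL);
    assert(strlen(type_full_name) < POOL_NAME_LEN);

    /* Linear scan: return the existing entry if the name is known. */
    for (size_t i = 0; i < g_pool_used; ++i) {
        if (strcmp(g_pool[i].name, type_full_name) == 0)
            return &g_pool[i];
    }

    /* First sighting: claim the next free entry. */
    assert(g_pool_used < POOL_CAPACITY);
    strcpy(g_pool[g_pool_used].name, type_full_name);
    return &g_pool[g_pool_used++];
}
```

Once the first call for a given name has run, later callers only ever compare the returned pointer, which is why the Meyers-cached qword behaves like a pointer-identity sentinel after initialization.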

How sentinels are consumed at runtime

Pointer-identity and Meyers-cached sentinels reach the dispatcher through the same OperationName::TypeID slot; only the lazy-init step differs. The minimum-cost lookup that a reimplementation must reproduce is:

#include <stdbool.h>
#include <stdint.h>

/* Itanium C++ ABI guard helpers. The ABI guard object is 64 bits wide;
 * only its first byte is the "initialized" flag tested inline. */
extern int  __cxa_guard_acquire(int64_t *guard);
extern void __cxa_guard_release(int64_t *guard);

/* Process-wide name interner (sub_44A6CA0 in this binary). */
extern const void *intern_typeid_string(const char *type_full_name);

/* Pointer-identity sentinel: the address is the TypeID. */
const void *type_id_pointer_identity(const void *sentinel_byte_slot) {
    return sentinel_byte_slot;            /* no load; pointer is the value */
}

/* Meyers-cached sentinel: the first call interns the C++ mangled
 * mlir::TypeID::getFullName() string through the process-wide pool
 * (sub_44A6CA0 in this binary); races are resolved by the Itanium ABI
 * guard. After init, the qword holds the TypeID. */
const void *type_id_meyers_cached(int64_t *guard, const void **qword,
                                  const char *type_full_name) {
    /* Fast path: acquire-load the flag byte. Slow path: take the guard. */
    if (__atomic_load_n((const uint8_t *)guard, __ATOMIC_ACQUIRE) == 0) {
        if (__cxa_guard_acquire(guard)) {
            *qword = intern_typeid_string(type_full_name);
            __cxa_guard_release(guard);
        }
    }
    return *qword;
}

/* Dispatch is pointer-equality on the resolved TypeID, applied against
 * the OperationName::TypeID slot reached through Operation+0x30 ->
 * OperationName::Impl+0x10. */
static inline bool op_is_sentinel(const void *op, const void *sentinel) {
    const void *opname_impl = *(const void *const *)((const uint8_t *)op + 0x30);
    const void *type_id     = *(const void *const *)((const uint8_t *)opname_impl + 0x10);
    return type_id == sentinel;
}

Allocating fresh TypeID storage on every call, instead of going through one static slot, produces a new identity each time, which makes pointer-equality dispatch impossible. The address-band discipline below (every sentinel of a kind lives in one contiguous slab emitted by one translation unit) is what guarantees one address per kind.
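A minimal sketch of that difference, assuming two made-up kinds (view and load) that are not taken from the binary: the static-slot accessors return the same address on every call, while the per-call allocator hands out a different address each time and so can never match a baked-in sentinel.

```c
#include <assert.h>
#include <stdlib.h>

/* One static byte per kind; the byte is never written, only addressed.
 * The kind names are hypothetical. */
static char view_kind_slot;
static char load_kind_slot;

/* Correct pattern: every call for a kind returns the same address. */
const void *type_id_view(void) { return &view_kind_slot; }
const void *type_id_load(void) { return &load_kind_slot; }

/* Anti-pattern: fresh storage per call yields a new identity per call.
 * The allocation is deliberately leaked to keep the sketch short. */
const void *type_id_fresh_per_call(void) {
    void *storage = malloc(1);
    assert(storage != NULL);
    return storage;
}
```

Comparing type_id_view() with itself succeeds on every call, whereas two results of type_id_fresh_per_call() never compare equal while both allocations are live.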

Address-band index

The table partitions the sentinel space into the contiguous bands the linker emitted for each dialect / category. Numbers under "Count" are the distinct sentinels inside that band referenced elsewhere in the binary; the rest is padding.

| Band | Count | Owner | Form |
|---|---|---|---|
| 0x5B37B90 .. 0x5B37C28 | 5 | Upstream MLIR Op/DialectInterface anchors | Meyers (8-byte qword) |
| 0x5B37BE8 .. 0x5B37BF0 | 2 | Dialect one-shot init guards | Guard byte |
| 0x5B37F20 .. 0x5B38170 | 4 | cuda_tile AbstractOperation singletons (.data.rel.ro) | Pointer-identity |
| 0x5B38080, 0x5B381A8 | 2 | cuda_tile misc AttributeConcept / OperationState | Pointer-identity |
| 0x5B38BB0 .. 0x5B38BC8 | 4 | cuda_tile dialect Type TypeIDs | Pointer-identity |
| 0x5B38C40 .. 0x5B38C68 | 2 | nv_tile_ir::as Op-interface anchors | Meyers |
| 0x5B38F80 | 1 | TmaDescriptorTypeInterface anchor | Meyers |
| 0x5B445F8 .. 0x5B44890 | 3 | cutlass_ir::cute Layout / View / CopyAtom interfaces | Meyers |
| 0x5B44EB8 .. 0x5B44FD8 | 21 | nv_tileas op-info kindPtr singletons | Pointer-identity |
| 0x5B44F08 | 1 | nv_tileas op-ctor descriptor block tag | Pointer-identity |
| 0x5B452B0 .. 0x5B45970 | 6 | nv_tileas per-op attribute-vector sentinels | Pointer-identity |
| 0x5B45370 | 1 | nv_tileas pragma ocg* attr-vector | Pointer-identity |
| 0x5B46980 .. 0x5B469A0 | 2 | nv_tileaa NamedAttr-vector slots | Pointer-identity |
| 0x5B46D28 .. 0x5B46F68 | 33 | nv_tileaa per-op FoldRecord descriptors | Pointer-identity |
| 0x5B46E08, 0x5B46E80, 0x5B46E88, 0x5B46F30, 0x5B46FA0, 0x5B46FA8 | 6 | nv_tileaa producer-side / element-type sentinels | Pointer-identity |
| 0x5B46FF0 .. 0x5B470D0 | 8 | cutlass_ir::cute core type-interface anchors | Meyers |
| 0x5B47490 .. 0x5B476A0 | ~20 | cutlass dialect per-op OpInfoBlock | Pointer-identity |
| 0x5B47FF8 .. 0x5B481A8 | 49 | cute_nvgpu Op TypeIDs (slab) | Pointer-identity |
| 0x5B482C8 | 1 | cute_nvgpu dialect TypeID | Pointer-identity |
| 0x5B48580 .. 0x5B48B20 | 12 | cute_nvgpu per-op attribute-table sentinels | Pointer-identity |
| 0x5B48D88 .. 0x5B48E58 | 27 | cute_nvgpu concrete Type TypeIDs | Pointer-identity |
| 0x5B496B8 | 1 | cute dialect TypeID | Pointer-identity |
| 0x5B49A98 .. 0x5B49B18 | 17 | cute dialect concrete Type TypeIDs | Pointer-identity |
| 0x5B8D610 .. 0x5B8DCB8 | 213 (197 referenced) | NVVM Op TypeID slab | Pointer-identity (8-byte slot stride) |
| 0x5BAADB8 | 1 | IntegerType variant (i32 / blocked layout id 1) | Pointer-identity |
| 0x5BA8F60 | 1 | LLVM dialect TypeID | Pointer-identity |
| 0x5BE3FF8 | 1 | scf.if AbstractOperation kindPtr | Pointer-identity |
| 0x5BE4008 | 1 | nv_tileas.convert_layout AbstractOperation kindPtr | Pointer-identity |
| 0x5BE5858 | 1 | arith.constant AbstractOperation kindPtr | Pointer-identity |
| 0x5BE5908 | 1 | arith dialect TypeID | Pointer-identity |
| 0x5BE5C40 | 1 | nv_tileas.async.pipeline.consume_one (paired form) | Pointer-identity |
| 0x5BE5FC0 .. 0x5BE6138 | ~10 | MLIR builtin FloatType / FloatVariant table | Pointer-identity |
| 0x5BE6138 | 1 | Shared no-properties / null-OperationName guard | Pointer-identity |

The runtime invariant this layout captures: a sentinel address in 0x5B44E* / 0x5B44F* is an OperationName::opInfo slot (the descriptor passed at registration time), whereas one in 0x5BE3F* / 0x5BE4* / 0x5BE5* is the paired kindPtr slot (AbstractOperation::TypeID) that ends up in op->getName().getTypeID() after uniquing. The two ranges contain duplicates of each op identity at two different indirection levels; resolvers and rewriters generally compare against the kindPtr form, op-builders and registrars against the opInfo form.
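The prefix rule stated above can be written as a coarse classifier. This is a sketch under the assumption that the hex prefixes quoted in the text are the whole rule. It does not consult the per-row tables, so an address such as a FloatType entry in the 0x5BE5* range is also reported as kindPtr-range. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum slot_form {
    SLOT_OTHER = 0,   /* outside both prefix ranges */
    SLOT_OPINFO,      /* 0x5B44E* / 0x5B44F*: registration-time descriptor */
    SLOT_KINDPTR      /* 0x5BE3F* / 0x5BE4* / 0x5BE5*: post-uniquing TypeID */
};

/* Classify a sentinel address by its leading hex digits only. */
enum slot_form classify_slot_form(uint64_t addr) {
    /* All addresses on this page fit in seven hex digits. */
    assert(addr < (UINT64_C(1) << 28));

    uint64_t first5 = addr >> 8;    /* e.g. 0x5B44F for 0x5B44FD8 */
    uint64_t first4 = addr >> 12;   /* e.g. 0x5BE4  for 0x5BE4008 */

    if (first5 == 0x5B44E || first5 == 0x5B44F)
        return SLOT_OPINFO;
    if (first5 == 0x5BE3F || first4 == 0x5BE4 || first4 == 0x5BE5)
        return SLOT_KINDPTR;
    return SLOT_OTHER;
}
```

For example, the two forms of nv_tileas.convert_layout land on opposite sides: 0x5B44FD8 classifies as SLOT_OPINFO and 0x5BE4008 as SLOT_KINDPTR.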

Master sentinel table

Sorted by sentinel address, ascending. For each row: dialect, the C++ class or op/type/attr name, byte length of the sentinel's storage (1 for pointer-identity, 8 for the qword half of a Meyers pair, 9 for the guard+qword combined), and the wiki page that documents the matching op / type / interface in detail.

| Sentinel | Dialect | Class / op / attr name | Bytes | First-cited page |
|---|---|---|---|---|
| 0x5B37B90 | upstream MLIR | RegionBranchTerminatorOpInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B37B98 | upstream MLIR | RegionBranchTerminatorOpInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B37BE8 | upstream MLIR | RegionBranchOpInterface (cache slot) | 8 | dialects/cute/interfaces.md |
| 0x5B37BF0 | nv_tileaa | dialect one-shot init guard | 1 | dialects/nv_tileaa/index.md |
| 0x5B37C20 | upstream MLIR | OpAsmDialectInterface (guard) | 1 | dialects/index.md |
| 0x5B37C28 | upstream MLIR | OpAsmDialectInterface (TypeID qword) | 8 | dialects/index.md |
| 0x5B37F20 | cuda_tile | cuda_tile.return AbstractOperation (primary) | 1 | dialects/cuda_tile/return.md |
| 0x5B37FA8 | cuda_tile | cuda_tile.return AbstractOperation (secondary interface) | 1 | dialects/cuda_tile/return.md |
| 0x5B38080 | cuda_tile | ArrayAttr element AttributeConcept | 1 | dialects/cuda_tile/attrs.md |
| 0x5B380C0 | cuda_tile | cuda_tile.if AbstractOperation | 1 | dialects/cuda_tile/if.md |
| 0x5B38170 | cuda_tile | cuda_tile.continue AbstractOperation | 1 | dialects/cuda_tile/continue.md |
| 0x5B381A8 | cuda_tile | OperationState concept (sub_669F80) | 1 | dialects/cuda_tile/index.md |
| 0x5B38BB0 | cuda_tile | cuda_tile.partition_view (TypeID) | 1 | dialects/cuda_tile/types.md |
| 0x5B38BB8 | cuda_tile | cuda_tile.tensor_view (TypeID) | 1 | dialects/cuda_tile/types.md |
| 0x5B38BC0 | cuda_tile | cuda_tile.tile (TileType TypeID) | 1 | dialects/cuda_tile/types.md |
| 0x5B38BC8 | cuda_tile | cuda_tile.ptr (PointerType TypeID) | 1 | dialects/cuda_tile/types.md |
| 0x5B38C40 | nv_tile_ir::as | ProducerOpInterface (guard) | 1 | dialects/nv_tileas/interfaces.md |
| 0x5B38C48 | nv_tile_ir::as | ProducerOpInterface (TypeID qword) | 8 | dialects/nv_tileas/interfaces.md |
| 0x5B38C60 | nv_tile_ir::as | AgentLikeOpInterface (guard) | 1 | dialects/nv_tileas/interfaces.md |
| 0x5B38C68 | nv_tile_ir::as | AgentLikeOpInterface (TypeID qword) | 8 | dialects/nv_tileas/interfaces.md |
| 0x5B38F80 | cutlass_ir::cute | TmaDescriptorTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B445F8 | cutlass_ir::cute | LayoutTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B44600 | cutlass_ir::cute | LayoutTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B44610 | cutlass_ir::cute | ViewTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B44618 | cutlass_ir::cute | ViewTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B44888 | cutlass_ir::cute | CopyAtomTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B44890 | cutlass_ir::cute | CopyAtomTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B44EB8 | nv_tileas | nv_tileas.view (opInfo) | 1 | dialects/nv_tileas/view.md |
| 0x5B44EC8 | nv_tileas | nv_tileas.tiled_store (opInfo) | 1 | dialects/nv_tileas/tiled-store.md |
| 0x5B44ED0 | nv_tileas | nv_tileas.tiled_load (opInfo) | 1 | dialects/nv_tileas/tiled-load.md |
| 0x5B44ED8 | nv_tileas | nv_tileas.tiled_atomic_rmw (opInfo) | 1 | dialects/nv_tileas/tiled-atomic-rmw.md |
| 0x5B44EE0 | nv_tileas | nv_tileas.store (opInfo) | 1 | dialects/nv_tileas/store.md |
| 0x5B44EF0 | nv_tileas | nv_tileas.scatter_store (opInfo) | 1 | dialects/nv_tileas/scatter-store.md |
| 0x5B44EF8 | nv_tileas | nv_tileas.async.pipeline.consumer_release (opInfo) | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B44F08 | nv_tileas | op-ctor descriptor block tag | 1 | dialects/nv_tileas/index.md |
| 0x5B44F10 | nv_tileas | nv_tileas.pragma (paired opInfo) | 1 | dialects/nv_tileas/pragma.md |
| 0x5B44F18 | nv_tileas | nv_tileas.async.pipeline.consumer_yield | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B44F20 | nv_tileas | nv_tileas.producer_write | 1 | dialects/nv_tileas/producer-write.md |
| 0x5B44F38 | nv_tileas | nv_tileas.async.pipeline.produce_one | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B44F58 | nv_tileas | nv_tileas.produce_one_async | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B44F68 | nv_tileas | nv_tileas.consumer_read | 1 | dialects/nv_tileas/consumer-read.md |
| 0x5B44F70 | nv_tileas | nv_tileas.async.pipeline.consume_one | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B44F78 | nv_tileas | nv_tileas.consume_one_async | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B44F90 | nv_tileas | nv_tileas.load (opInfo) | 1 | dialects/nv_tileas/load.md |
| 0x5B44FA8 | nv_tileas | nv_tileas.gather_load (opInfo) | 1 | dialects/nv_tileas/gather-load.md |
| 0x5B44FB8 | nv_tileas | nv_tileas.async.pipeline.consumer_release-family (paired) | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B44FD8 | nv_tileas | nv_tileas.convert_layout (opInfo) | 1 | dialects/nv_tileas/convert-layout.md |
| 0x5B44FF0 | nv_tileas | nv_tileas.async.pipeline.acquire (positional) | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B45070 | nv_tileas | nv_tileas.alloc_tensor | 1 | dialects/nv_tileas/alloc-tensor.md |
| 0x5B452B0 | nv_tileas | nv_tileas.scatter_store attr-vec ("atom") | 1 | dialects/nv_tileas/scatter-store.md |
| 0x5B45370 | nv_tileas | nv_tileas.pragma attr-vec (ocgEnter/LeaveDirectives) | 1 | dialects/nv_tileas/pragma.md |
| 0x5B453E0 | nv_tileas | nv_tileas.async.pipeline.consumer_wait attr-vec | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B45600 | nv_tileas | nv_tileas.gather_load attr-vec | 1 | dialects/nv_tileas/gather-load.md |
| 0x5B458C0 | nv_tileas | nv_tileas.async.pipeline.create_iterator attr-vec | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B45970 | nv_tileas | nv_tileas.async.gather_tma_load attr-vec | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5B46980 | nv_tileaa | NamedAttr-vector slot (2-slot pattern) | 8 | dialects/nv_tileaa/index.md |
| 0x5B469A0 | nv_tileaa | NamedAttr-vector slot (head) | 8 | dialects/nv_tileaa/index.md |
| 0x5B46D28 | nv_tileaa | nv_tileaa.yield FoldRecord | 1 | dialects/nv_tileaa/yield.md |
| 0x5B46D30 | nv_tileaa | nv_tileaa.view FoldRecord | 1 | dialects/nv_tileaa/view.md |
| 0x5B46D68 | nv_tileaa | nv_tileaa.splat FoldRecord | 1 | dialects/nv_tileaa/splat.md |
| 0x5B46D70 | nv_tileaa | nv_tileaa.scatter FoldRecord | 1 | dialects/nv_tileaa/scatter.md |
| 0x5B46D88 | nv_tileaa | nv_tileaa.return FoldRecord | 1 | dialects/nv_tileaa/return.md |
| 0x5B46D98 | nv_tileaa | nv_tileaa.queue.yield FoldRecord | 1 | dialects/nv_tileaa/queue.md |
| 0x5B46DA0 | nv_tileaa | nv_tileaa.queue.put FoldRecord | 1 | dialects/nv_tileaa/queue.md |
| 0x5B46DA8 | nv_tileaa | nv_tileaa.queue.get FoldRecord | 1 | dialects/nv_tileaa/queue.md |
| 0x5B46DB0 | nv_tileaa | nv_tileaa.ptr_to_int FoldRecord | 1 | dialects/nv_tileaa/ptr-to-int.md |
| 0x5B46DC0 | nv_tileaa | nv_tileaa.pragma FoldRecord | 1 | dialects/nv_tileaa/pragma.md |
| 0x5B46DD8 | nv_tileaa | nv_tileaa.opt_barrier FoldRecord | 1 | dialects/nv_tileaa/opt-barrier.md |
| 0x5B46DE0 | nv_tileaa | nv_tileaa.mulhiui FoldRecord | 1 | dialects/nv_tileaa/mulhiui.md |
| 0x5B46DF0 | nv_tileaa | nv_tileaa.message FoldRecord | 1 | dialects/nv_tileaa/message.md |
| 0x5B46DF8 | nv_tileaa | nv_tileaa.mark_for_reuse FoldRecord | 1 | dialects/nv_tileaa/mark-for-reuse.md |
| 0x5B46E08 | nv_tileaa | nv_tileaa.make_memref (opInfo) | 1 | dialects/nv_tileaa/make-memref.md |
| 0x5B46E18 | nv_tileaa | nv_tileaa.launch_func FoldRecord | 1 | dialects/nv_tileaa/launch-func.md |
| 0x5B46E20 | nv_tileaa | nv_tileaa.join_mem_token FoldRecord | 1 | dialects/nv_tileaa/queue.md |
| 0x5B46E28 | nv_tileaa | nv_tileaa.is_valid_program_id FoldRecord | 1 | dialects/nv_tileaa/program-id.md |
| 0x5B46E30 | nv_tileaa | nv_tileaa.int_to_ptr FoldRecord | 1 | dialects/nv_tileaa/ptr-to-int.md |
| 0x5B46E38 | nv_tileaa | nv_tileaa.inject_ir FoldRecord | 1 | dialects/nv_tileaa/inject-ir.md |
| 0x5B46E40 | nv_tileaa | nv_tileaa.histogram FoldRecord | 1 | dialects/nv_tileaa/histogram.md |
| 0x5B46E70 | nv_tileaa | nv_tileaa.generate FoldRecord | 1 | dialects/nv_tileaa/generate.md |
| 0x5B46E78 | nv_tileaa | nv_tileaa.gather_load FoldRecord | 1 | dialects/nv_tileaa/gather-load.md |
| 0x5B46E80 | nv_tileaa | nv_tileaa.func (opInfo) | 1 | dialects/nv_tileaa/func.md |
| 0x5B46E88 | nv_tileaa | nv_tileaa.fp_to_fp (opInfo) | 1 | dialects/nv_tileaa/fp-to-fp.md |
| 0x5B46E98 | nv_tileaa | nv_tileaa.extract_slice FoldRecord | 1 | dialects/nv_tileaa/extract-slice.md |
| 0x5B46EA8 | nv_tileaa | nv_tileaa.extern_ew FoldRecord | 1 | dialects/nv_tileaa/extern-ew.md |
| 0x5B46EC8 | nv_tileaa | nv_tileaa.ew_inline_asm FoldRecord | 1 | dialects/nv_tileaa/ew-inline-asm.md |
| 0x5B46EE0 | nv_tileaa | nv_tileaa.create_queue FoldRecord | 1 | dialects/nv_tileaa/queue.md |
| 0x5B46EE8 | nv_tileaa | nv_tileaa.create_mem_token FoldRecord | 1 | dialects/nv_tileaa/queue.md |
| 0x5B46F10 | nv_tileaa | nv_tileaa.cancel_next_program_id FoldRecord | 1 | dialects/nv_tileaa/program-id.md |
| 0x5B46F28 | nv_tileaa | nv_tileaa.broadcast FoldRecord | 1 | dialects/nv_tileaa/broadcast.md |
| 0x5B46F30 | nv_tileaa | nv_tileaa.block_tile (opInfo) | 1 | dialects/nv_tileaa/block-tile.md |
| 0x5B46F38 | nv_tileaa | nv_tileaa.bitcast FoldRecord | 1 | dialects/nv_tileaa/bitcast.md |
| 0x5B46F58 | nv_tileaa | nv_tileaa.assert FoldRecord | 1 | dialects/nv_tileaa/assert.md |
| 0x5B46F60 | nv_tileaa | nv_tileaa.addptr FoldRecord | 1 | dialects/nv_tileaa/addptr.md |
| 0x5B46F68 | nv_tileaa | nv_tileaa.addf FoldRecord | 1 | dialects/nv_tileaa/addf.md |
| 0x5B46FA0 | upstream MLIR | IntegerType variant (dot-operand layout id 2) | 1 | dialects/index.md |
| 0x5B46FA8 | upstream MLIR | IntegerType TypeID model (i1 / shared variant) | 1 | dialects/index.md |
| 0x5B46FF0 | cutlass_ir::cute | MmaAtomTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B46FF8 | cutlass_ir::cute | MmaAtomTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B47000 | cutlass_ir::cute | PrefetchAtomTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B47008 | cutlass_ir::cute | PrefetchAtomTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B47020 | cutlass_ir::cute | PrintableTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B47028 | cutlass_ir::cute | PrintableTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B47030 | cutlass_ir::cute | IteratorTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B47038 | cutlass_ir::cute | IteratorTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B47058 | cutlass_ir::cute | PointerTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B47060 | cutlass_ir::cute | PointerTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B47068 | cutlass_ir::cute | AtomTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B47070 | cutlass_ir::cute | AtomTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B47080 | cutlass_ir::cute | DescriptorIteratorTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B47088 | cutlass_ir::cute | DescriptorIteratorTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B470C8 | cutlass_ir::cute | MaybeStaticTypeInterface (guard) | 1 | dialects/cute/interfaces.md |
| 0x5B470D0 | cutlass_ir::cute | MaybeStaticTypeInterface (TypeID qword) | 8 | dialects/cute/interfaces.md |
| 0x5B47490 .. 0x5B476A0 | cutlass | Per-op OpInfoBlock band (~20 slots) | varies | dialects/cutlass/index.md |
| 0x5B47FF8 .. 0x5B481A8 | cute_nvgpu | Op TypeID slab (49 slots, 8-byte stride) | 1 each | dialects/cute_nvgpu/index.md |
| 0x5B482C8 | cute_nvgpu | dialect TypeID | 1 | dialects/cute_nvgpu/index.md |
| 0x5B48580 | cute_nvgpu | relinquish_tmem_alloc_permit attr-table | 8 | dialects/cute_nvgpu/relinquish-tmem-alloc-permit.md |
| 0x5B485A0 | cute_nvgpu | arch.sm100.dealloc_tmem attr-table | 8 | dialects/cute_nvgpu/dealloc-tmem.md |
| 0x5B485C0 | cute_nvgpu | arch.sm100.alloc_tmem attr-table | 8 | dialects/cute_nvgpu/alloc-tmem.md |
| 0x5B486A0 | cute_nvgpu | sm89.mma attr-table | 8 | dialects/cute_nvgpu/sm89-mma.md |
| 0x5B48700 | cute_nvgpu | sm90.mma attr-table | 8 | dialects/cute_nvgpu/sm90-mma.md |
| 0x5B48780 | cute_nvgpu | sm100.mma attr-table | 8 | dialects/cute_nvgpu/sm100-mma.md |
| 0x5B48800 | cute_nvgpu | SM120.block_scaled attr-table (17 entries) | 8 | dialects/cute_nvgpu/sm120-block-scaled.md |
| 0x5B488E0 | cute_nvgpu | sm100.umma attr-table | 8 | dialects/cute_nvgpu/sm100-umma.md |
| 0x5B489E0 | cute_nvgpu | stsm attr-table | 8 | dialects/cute_nvgpu/stsm.md |
| 0x5B48A20 | cute_nvgpu | sm80.cp_async attr-table | 8 | dialects/cute_nvgpu/sm80-cp-async.md |
| 0x5B48AF0 | cute_nvgpu | SM100.tma_store attr-table | 8 | dialects/cute_nvgpu/tma-store.md |
| 0x5B48B20 | cute_nvgpu | SM100.tma_reduce attr-table | 8 | dialects/cute_nvgpu/tma-reduce.md |
| 0x5B48D88 | cute_nvgpu | atom.non_exec_tiled_tma_reduce / SmemDescType | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48D90 | cute_nvgpu | atom.non_exec_tiled_tma_store / TmaDescriptorTiledType | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48D98 | cute_nvgpu | atom.non_exec_tiled_tma_load / TmaDescriptorIm2colType | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DA0 | cute_nvgpu | atom.stsm | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DA8 | cute_nvgpu | atom.ldsm | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DB0 | cute_nvgpu | atom.simt_async_copy | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DB8 | cute_nvgpu | atom.universal_copy | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DC0 | cute_nvgpu | atom.tma_reduce | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DC8 | cute_nvgpu | atom.tma_store | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DD0 | cute_nvgpu | atom.tma_load | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DD8 | cute_nvgpu | tma_descriptor_im2col | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DE0 | cute_nvgpu | tma_descriptor_tiled | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DE8 | cute_nvgpu | atom.s2t_copy | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DF0 | cute_nvgpu | atom.tmem_store | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48DF8 | cute_nvgpu | atom.tmem_load | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48E00 | cute_nvgpu | SM120.mma_bs (block-scaled) | 1 | dialects/cute_nvgpu/sm120-block-scaled.md |
| 0x5B48E08 | cute_nvgpu | sm100.mma_bs_sp | 1 | dialects/cute_nvgpu/sm100-mma.md |
| 0x5B48E10 | cute_nvgpu | sm100.mma_bs | 1 | dialects/cute_nvgpu/sm100-mma.md |
| 0x5B48E18 | cute_nvgpu | sm100.mma_sp | 1 | dialects/cute_nvgpu/sm100-mma.md |
| 0x5B48E20 | cute_nvgpu | sm100.mma | 1 | dialects/cute_nvgpu/sm100-mma.md |
| 0x5B48E28 | cute_nvgpu | sm90.mma (WGMMA) | 1 | dialects/cute_nvgpu/sm90-mma.md |
| 0x5B48E30 | cute_nvgpu | smem_desc_view | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48E38 | cute_nvgpu | smem_desc | 1 | dialects/cute_nvgpu/types.md |
| 0x5B48E40 | cute_nvgpu | sm89.mma (FP8 e4m3/e5m2) | 1 | dialects/cute_nvgpu/sm89-mma.md |
| 0x5B48E48 | cute_nvgpu | sm80.sparse_mma | 1 | dialects/cute_nvgpu/sm80-mma.md |
| 0x5B48E50 | cute_nvgpu | sm80.mma | 1 | dialects/cute_nvgpu/sm80-mma.md |
| 0x5B48E58 | cute_nvgpu | atom.universal_fma (SM70 path) | 1 | dialects/cute_nvgpu/types.md |
| 0x5B496B8 | cute | dialect TypeID | 1 | dialects/cute/index.md |
| 0x5B49A98 | cute | cute.tuple | 1 | dialects/cute/types.md |
| 0x5B49AA0 | cute | cute.fast_divmod_divisor | 1 | dialects/cute/types.md |
| 0x5B49AA8 | cute | cute.tiled_mma | 1 | dialects/cute/types.md |
| 0x5B49AB0 | cute | cute.tiled_copy | 1 | dialects/cute/types.md |
| 0x5B49AB8 | cute | cute.coord_tensor | 1 | dialects/cute/types.md |
| 0x5B49AC0 | cute | cute.memref (CuteMemRefType) | 1 | dialects/cute/types.md |
| 0x5B49AC8 | cute | cute.ptr (CutePtrType) | 1 | dialects/cute/types.md |
| 0x5B49AD0 | cute | cute.sparse_elem | 1 | dialects/cute/types.md |
| 0x5B49AD8 | cute | cute.composed_layout (ComposedLayoutType) | 1 | dialects/cute/types.md |
| 0x5B49AE0 | cute | cute.layout (LayoutType) | 1 | dialects/cute/types.md |
| 0x5B49AE8 | cute | cute.swizzle (SwizzleType) | 1 | dialects/cute/types.md |
| 0x5B49AF0 | cute | cute.tile (CuteTileType) | 1 | dialects/cute/types.md |
| 0x5B49AF8 | cute | cute.shape (CuteShapeType) | 1 | dialects/cute/types.md |
| 0x5B49B00 | cute | cute.stride | 1 | dialects/cute/types.md |
| 0x5B49B08 | cute | cute.coord (CuteCoordType) | 1 | dialects/cute/types.md |
| 0x5B49B10 | cute | cute.int_tuple (IntTupleType) | 1 | dialects/cute/types.md |
| 0x5B49B18 | cute / cute_nvgpu | ConstrainedInt + AtomIType (shared) | 1 | dialects/cute/types.md |
| 0x5B8D610 .. 0x5B8DCB8 | NVVM | Op TypeID slab: 213 slots, 197 referenced (see slab close-up) | 8 each | dialects/nvvm/index.md |
| 0x5BA8F60 | LLVM | dialect TypeID | 1 | dialects/index.md |
| 0x5BAADB8 | upstream MLIR | IntegerType variant (i32 / blocked layout id 1) | 1 | dialects/index.md |
| 0x5BE3FF8 | scf | scf.if AbstractOperation kindPtr | 1 | dialects/index.md |
| 0x5BE4008 | nv_tileas | nv_tileas.convert_layout AbstractOperation kindPtr | 1 | dialects/nv_tileas/convert-layout.md |
| 0x5BE5858 | arith | arith.constant AbstractOperation kindPtr | 1 | dialects/index.md |
| 0x5BE5908 | arith | dialect TypeID | 1 | dialects/index.md |
| 0x5BE5C40 | nv_tileas | nv_tileas.async.pipeline.consume_one (paired) | 1 | dialects/nv_tileas/async-pipeline.md |
| 0x5BE5FC0 | upstream MLIR | FloatType singleton (F16 entry, MED) | 1 | dialects/index.md |
| 0x5BE5FE0 | upstream MLIR | MemRefType TypeID model | 1 | dialects/index.md |
| 0x5BE6000 | upstream MLIR | FloatType singleton (F32 entry, MED) | 1 | dialects/index.md |
| 0x5BE6028 | upstream MLIR | FloatType singleton (F64 entry, MED) | 1 | dialects/index.md |
| 0x5BE6030 | upstream MLIR | FloatType singleton (slot between F64 and TF32, MED) | 1 | dialects/index.md |
| 0x5BE6038 | nv_tile_ir | tf32 (nv_tf32) storage sentinel | 1 | dialects/index.md |
| 0x5BE6040 | upstream MLIR | FloatType singleton (MED) | 1 | dialects/index.md |
| 0x5BE6048 | upstream MLIR | bf16 storage sentinel | 1 | dialects/index.md |
| 0x5BE6090 | upstream MLIR | f8E5M2 storage sentinel | 1 | dialects/index.md |
| 0x5BE60A0 | upstream MLIR | f8E4M3FN storage sentinel | 1 | dialects/index.md |
| 0x5BE6138 | MLIR detail | UnregisteredOpProperties / no-properties guard (shared) | 1 | dialects/index.md |
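Because the master table is sorted by address, the "address in, dialect-and-class out" lookup can be sketched as a binary search. The rows below are a small excerpt copied from the table. The struct and function names are hypothetical, and a full tool would load every row instead of this sample.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sentinel_row {
    uint64_t    addr;      /* sentinel address */
    const char *dialect;   /* owning dialect */
    const char *name;      /* class / op / attr name */
};

/* Excerpt of the master table, ascending by address. */
static const struct sentinel_row k_rows[] = {
    {0x5B38BC0, "cuda_tile",        "cuda_tile.tile (TileType TypeID)"},
    {0x5B44FD8, "nv_tileas",        "nv_tileas.convert_layout (opInfo)"},
    {0x5B47028, "cutlass_ir::cute", "PrintableTypeInterface (TypeID qword)"},
    {0x5B49AE0, "cute",             "cute.layout (LayoutType)"},
    {0x5BE4008, "nv_tileas",        "nv_tileas.convert_layout AbstractOperation kindPtr"},
    {0x5BE6138, "MLIR detail",      "UnregisteredOpProperties / no-properties guard (shared)"},
};
static const size_t k_row_count = sizeof k_rows / sizeof k_rows[0];

/* bsearch requires ascending order; verify it in debug builds. */
static bool sentinel_table_is_sorted(void) {
    for (size_t i = 1; i < k_row_count; ++i)
        if (k_rows[i - 1].addr >= k_rows[i].addr)
            return false;
    return true;
}

/* Comparator: key is a uint64_t address, element is a table row. */
static int cmp_addr(const void *key, const void *elem) {
    uint64_t a = *(const uint64_t *)key;
    uint64_t b = ((const struct sentinel_row *)elem)->addr;
    return (a > b) - (a < b);
}

/* Exact-match lookup; returns NULL for an address not in the excerpt. */
const struct sentinel_row *lookup_sentinel(uint64_t addr) {
    assert(sentinel_table_is_sorted());
    return bsearch(&addr, k_rows, k_row_count, sizeof k_rows[0], cmp_addr);
}
```

An exact match is the right test for single-slot rows. The range rows (such as the NVVM slab) would need a separate containment check, which this sketch leaves out.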

NVVM op TypeID slab close-up: 0x5B8D610 .. 0x5B8DCB8

The largest sentinel cluster in the binary is the contiguous NVVM-op slab at 0x5B8D610 .. 0x5B8DCB8. It is 1704 bytes long (0x6A8), holds 213 8-byte slots at uniform 8-byte stride, and the NVVMToLLVM lowering dispatcher (sub_2D67A80, 92 KB) tests 197 of those slots as per-op TypeID sentinels in a folded dyn_cast cascade walking the slab from top-of-range (0x5B8DCB8) down. The remaining 16 slots correspond to NVVM op classes handled exclusively by the SelectionDAG MatcherTable path (sub_1A833C0) and never appear as explicit dispatcher arms.

Why it is contiguous: the linker emits one mlir::TypeID::Storage-array initialization per dialect, where every op-class registered through the TableGen-generated registerNVVMDialect() entry point produces one 8-byte slot containing the address of the class's static thread_local TypeID::UniqueIdHolder. All 213 slots come from one translation unit's static data, so they land in a single .rodata section with no padding between slots — exactly the pattern observed.

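How to read offset → op name: index i = (slab_address - 0x5B8D610) / 8. The dispatcher walks arms in slab-descending order, so the first arm reached at line ~2067 of sub_2D67A80 matches 0x5B8DCB8 (NVVM::CpAsyncCommitGroupOp). Each subsequent arm decrements the slot by 8. By this formula the top slot 0x5B8DCB8 has i = 0x6A8 / 8 = 213, so slot 0x5B8D610 + 8*i sits at position (213 - i) in descending walk order, counting the top slot as position 0. A range with slots at both 0x5B8D610 and 0x5B8DCB8 covers 214 slot positions, one more than the 213-slot count quoted for the slab, so one of the two endpoints is presumably a boundary address rather than a counted slot.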
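The index arithmetic can be checked with a short sketch that derives everything from the two endpoint addresses and the 8-byte stride quoted above. The macro and function names are hypothetical. The position helper counts slab positions in descending address order, which is not the same as counting dispatcher arms, since not every slot has an arm.

```c
#include <assert.h>
#include <stdint.h>

#define NVVM_SLAB_FIRST  UINT32_C(0x5B8D610)  /* lowest slot address */
#define NVVM_SLAB_LAST   UINT32_C(0x5B8DCB8)  /* highest slot address */
#define NVVM_SLOT_STRIDE UINT32_C(8)          /* bytes per slot */

/* Index of a slot counted upward from the bottom of the slab. */
uint32_t nvvm_slab_index(uint32_t slot_addr) {
    assert(slot_addr >= NVVM_SLAB_FIRST && slot_addr <= NVVM_SLAB_LAST);
    assert((slot_addr - NVVM_SLAB_FIRST) % NVVM_SLOT_STRIDE == 0);
    return (slot_addr - NVVM_SLAB_FIRST) / NVVM_SLOT_STRIDE;
}

/* Zero-based position when walking the slab from the top address down. */
uint32_t nvvm_descending_position(uint32_t slot_addr) {
    return nvvm_slab_index(NVVM_SLAB_LAST) - nvvm_slab_index(slot_addr);
}
```

With these definitions the top slot 0x5B8DCB8 is position 0 in the descending walk, and 0x5B8DCA8 (NVVM::CpAsyncWaitGroupOp in the table that follows) is two slots below it.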

Selected anchor sentinels from inside the slab, with their op classes:

| Sentinel | NVVM Op class | Intrinsic-ID family |
|---|---|---|
| 0x5B8DCB8 | NVVM::CpAsyncCommitGroupOp | (top of dispatcher) |
| 0x5B8DCA8 | NVVM::CpAsyncWaitGroupOp | 8397 |
| 0x5B8DC90 | NVVM::Tcgen05DeallocOp | 8381, 0x20CD |
| 0x5B8DB58 | NVVM::AtomicRMWOp | (variant via sub_4261FA) |
| 0x5B8DB50 | NVVM::ReduceOp (variant 1) | (via sub_2E657E0) |
| 0x5B8DB48 | NVVM::ReduceOp (variant 2) | (via sub_2E657C0) |
| 0x5B8DB40 | NVVM::ReduceOp (variant 3, vec) | (via sub_2E65720) |
| 0x5B8DB38 | NVVM::AtomicCAS / nvvm.red.b128 | (via sub_2E65750) |
| 0x5B8DAF8 | NVVM::CpAsyncBulkTensorReduceOp | 8974-9011 |
| 0x5B8DAF0 | NVVM::CpAsyncBulkTensorPrefetchOp | 9150 |
| 0x5B8DAE8 | NVVM::CpAsyncBulkTensorSharedCTAToGlobalOp | 8956 |
| 0x5B8DAE0 | NVVM::CpAsyncBulkTensorSharedCTAToGlobalExtOp | 8956 |
| 0x5B8DAD8 | NVVM::CpAsyncBulkTensorSharedClusterToGlobalOp | 8951 |
| 0x5B8DAB8 | NVVM::Tcgen05FenceOp (fence pair v0) | 8609 |
| 0x5B8DAB0 | NVVM::Tcgen05FenceOp (fence pair v1) | 8610 |
| 0x5B8DAA8 | NVVM::CvtPackfloatF32Op | 0x21B3 = 8627 |
| 0x5B8DAA0 | NVVM::ElectSyncOp | 0x21A5 = 8613 |
| 0x5B8DA98 | NVVM::PrefetchOp | 0x21F7 = 8695 |
| 0x5B8DA90 | NVVM::CpAsyncShared.*.GlobalOp | 0x210F |
| 0x5B8D928 | NVVM::CvtFloatToFp8 / CvtPackedOp | 8305-8308 |
| 0x5B8D920 | NVVM::WgmmaCommitGroupSyncAlignedOp | 0x226A = 8810 |
| 0x5B8D918 | NVVM::WgmmaCommitGroup / WaitGroup | 8797-8799 |
| 0x5B8D910 | NVVM::WgmmaMmaAsync (block-variant 0x245C) | 0x245C = 9308 |
| 0x5B8D8F8 | NVVM::MmaBlockScaleOp | 9398 = 0x24B6 |
| 0x5B8D8F0 | NVVM::MmaSync sibling | 9035 |
| 0x5B8D8E8 | NVVM::MmaSync sibling | 9036 |
| 0x5B8D8D8 | NVVM::WgmmaMmaAsyncOp (full) | 0x226A = 8810 |
| 0x5B8D8D0 | NVVM::WgmmaMmaAsync sibling (operand-walked) | -- |
| 0x5B8D898 | NVVM::LdmatrixOp | 9153-9170 |
| 0x5B8D7F8 | NVVM::CpAsyncShared.*.GlobalOp variant | 9259 / 9263 |
| 0x5B8D7F0 | NVVM::CpAsyncBulkSharedClusterToSharedCTAOp | 9217 |
| 0x5B8D7E8 | NVVM::CpAsyncCommitGroupOp / CpAsyncShared | 9220 / 9222 |
| 0x5B8D7E0 | NVVM::CpAsyncBulkTensorBaseOp | 8919-8966 |
| 0x5B8D7D0 | NVVM::MmaOp (mma.sync) | (MatcherTable) |
| 0x5B8D7C8 | NVVM::WmmaOp (load/store/mma) | (MatcherTable) |
| 0x5B8D768 | NVVM::StmatrixOp | 9858-9866 |
| 0x5B8D700 | NVVM::Tcgen05MMAOp (full) | 10521-10525 |
| 0x5B8D6F8 | NVVM::Tcgen05MMABlockScaleOp | 10524-30 |
| 0x5B8D6F0 | NVVM::Tcgen05MMASparseOp | 10522-23 |
| 0x5B8D6E8 | NVVM::Tcgen05MMAWsOp | 10522-23 |
| 0x5B8D6E0 | NVVM::Tcgen05MMAWsSpOp | 10534 (gated) |
| 0x5B8D6D8 | NVVM::Tcgen05MMASpBlockScaleOp | 10522-30 |
| 0x5B8D6D0 | NVVM::Tcgen05ShiftOp | 10540 |
| 0x5B8D6C8 | NVVM::Tcgen05CommitOp | 9669-70, 10447 |
| 0x5B8D6C0 | NVVM::Tcgen05CommitArriveOp | 9671 = 0x25C7 |
| 0x5B8D6B8 | NVVM::Tcgen05CpOp | 9136 |
| 0x5B8D6B0 | NVVM::Tcgen05AllocOp | 8376, 0x20B7 |
| 0x5B8D6A8 | NVVM::Tcgen05DeallocOp | 8381, 0x20CD |
| 0x5B8D6A0 | NVVM::Tcgen05RelinquishAllocPermitOp | 8390-91 |
| 0x5B8D698 | NVVM::Tcgen05WaitOp | 9399 |
| 0x5B8D690 | NVVM::Tcgen05FenceOp | 8609 sibling |
| 0x5B8D688 | NVVM::Tcgen05LdmatrixOp | 9674-83 |
| 0x5B8D680 | NVVM::Tcgen05StmatrixOp | 9684-89 |
| 0x5B8D610 .. 0x5B8D670 | NVVM::Mbar / barrier / cluster / setmaxnreg / fence band (~25) | varies |

Block-anchor band assignments (within the slab, from the dispatcher walk order):

| Slab band | Op-class family |
|---|---|
| 0x5B8DCB8 .. 0x5B8DC90 | cp.async commit/wait + tensormap descriptor builder + Tcgen05Dealloc |
| 0x5B8DC88 .. 0x5B8DC28 | 16 cp.async.bulk commit/wait fence-band siblings |
| 0x5B8DC20 .. 0x5B8DC00 | 3 cp.async.bulk commit/wait variants |
| 0x5B8DBF8 .. 0x5B8DB70 | 17 cp.async.bulk.tensor TMA store/load fan-out (1D-5D × im2col × multicast × L2hint) |
| 0x5B8DB68 .. 0x5B8DB58 | 3 atomic / red sibs |
| 0x5B8DB50 .. 0x5B8DB38 | 3 nvvm.red ops (variants by red_op × scope × type) |
| 0x5B8DB28 .. 0x5B8DB00 | 4 cp.async.commit / wait band |
| 0x5B8DAF8 .. 0x5B8DAE0 | 3 cp.async.bulk.tensor.reduce variants (S2G / G2S / prefetch) |
| 0x5B8DAD8 .. 0x5B8DAC0 | 3 ldmatrix-cluster siblings |
| 0x5B8DAB8 .. 0x5B8DAB0 | 2 nvvm.tcgen05.fence variants |
| 0x5B8DAA8 .. 0x5B8DA90 | cvt.packfloat / elect.sync / prefetch / cp.async.shared.global |
| 0x5B8DA88 .. 0x5B8DA78 | 3 cp.async-cluster-bulk siblings |
| 0x5B8DA70 .. 0x5B8DA18 | 6 mbarrier-init/inval/arrive variants |
| 0x5B8D9C0 .. 0x5B8D9D0 | 9 fence.{proxy,sc,acq_rel} cluster fan-out (0x2200 family) |
| 0x5B8D9B8 .. 0x5B8D978 | 9 mbarrier.test_wait/parity/timelimit fan-out |
| 0x5B8D928 .. 0x5B8D8F8 | cvt.float.to.fp8 / wgmma fence/commit/wait / mma.block_scale |
| 0x5B8D8F0 .. 0x5B8D8E8 | 2 mma.sync siblings (9035, 9036) |
| 0x5B8D8D8 .. 0x5B8D8D0 | wgmma.mma_async (full + sibling) |
| 0x5B8D8C8 .. 0x5B8D898 | ldmatrix-shape fan-out (m8n8 / m8n16 / m16n16) |
| 0x5B8D8A8 .. 0x5B8D898 | 3 stmatrix × num × trans variants (9637-38, 9858+) |
| 0x5B8D880 .. 0x5B8D7F8 | 4 cp.async.bulk.tensor.shared::cluster.global variants |
| 0x5B8D7F0 .. 0x5B8D7E8 | 2 nvvm.cp.async.shared (8463 / 9220) |
| 0x5B8D7E0 | nvvm.cp.async.bulk.tensor rank fan-out (8919-8966) |
| 0x5B8D7D8 .. 0x5B8D7C8 | 4 mma.sync / wmma siblings (9434-9505 dword table) |
| 0x5B8D7C0 .. 0x5B8D6F8 | 16 tcgen05.mma {full, sp, ws, ws.sp, block_scale, ...} |
| 0x5B8D6F0 .. 0x5B8D680 | 16 tcgen05 misc (ld/st/cp/commit/alloc/dealloc/wait) |
| 0x5B8D670 .. 0x5B8D610 | ~25 generic ops / cluster / setmaxnreg / lazy-tail siblings |

Slot stride and storage rationale: each slot is exactly 8 bytes because the slab stores raw void* pointers, and the System V x86-64 psABI specifies _Alignof(void*) == sizeof(void*) == 8. The address of slot i is 0x5B8D610 + 8*i, with no per-slot padding. The dispatcher reads each sentinel address as an immediate operand baked into the per-arm cmp instruction, so any reimplementation must keep the slab contiguous and 8-byte aligned for the folded cascade to remain a single cmp/je chain.

The shared &unk_5BE6138 no-properties guard sits 0x58480 bytes (roughly 353 KiB) above the top of the slab, in a different translation unit. Upstream MLIR intends this: UnregisteredOpProperties::TypeID lives in mlir/lib/IR/OperationSupport.cpp, separate from the dialect's generated registerNVVMDialect() translation unit. Placing the no-properties sentinel outside the slab guards against a pointer-equality false positive when an arm tests op.getName().getTypeID() == &slab[i] against an op whose properties record was never built.
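The shape of the dispatcher cascade described above can be sketched as a chain of equality tests against constant addresses: the shared guard first, then NVVM arms in descending slab order. Only three arms from the anchor table are shown. The enum, its member names, and the function name are hypothetical, and the real dispatcher (sub_2D67A80) has 197 arms.

```c
#include <assert.h>
#include <stdint.h>

/* Sentinel addresses taken from the tables on this page. */
#define NO_PROPERTIES_GUARD      UINT64_C(0x5BE6138)
#define SLOT_CP_ASYNC_COMMIT_GRP UINT64_C(0x5B8DCB8)
#define SLOT_CP_ASYNC_WAIT_GRP   UINT64_C(0x5B8DCA8)
#define SLOT_ELECT_SYNC          UINT64_C(0x5B8DAA0)

enum nvvm_arm {
    ARM_NONE = 0,               /* no arm matched */
    ARM_NO_PROPERTIES,          /* shared guard: short-circuit path */
    ARM_CP_ASYNC_COMMIT_GROUP,
    ARM_CP_ASYNC_WAIT_GROUP,
    ARM_ELECT_SYNC
};

/* Each test compares against a constant, mirroring a cmp/je chain. */
enum nvvm_arm dispatch_by_type_id(uint64_t type_id) {
    assert(type_id != 0);   /* a resolved TypeID is never null here */

    if (type_id == NO_PROPERTIES_GUARD)      return ARM_NO_PROPERTIES;
    if (type_id == SLOT_CP_ASYNC_COMMIT_GRP) return ARM_CP_ASYNC_COMMIT_GROUP;
    if (type_id == SLOT_CP_ASYNC_WAIT_GRP)   return ARM_CP_ASYNC_WAIT_GROUP;
    if (type_id == SLOT_ELECT_SYNC)          return ARM_ELECT_SYNC;
    return ARM_NONE;
}
```

Because the guard address lies outside the slab range, no slab slot can alias it, so the first test cannot swallow a real NVVM op.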

Cross-references

The companion table Op Mnemonic Master Table indexes the same sentinels by op-name rather than by address, with verbatim mnemonics, length bytes, and one-clause semantics for every registered op.

The First-cited page column in the master table points to the canonical wiki page for each sentinel's op or type. Conventions:

  • dialects/<dialect>/<op-mnemonic>.md for op-info / op-class sentinels
  • dialects/<dialect>/types.md for concrete Type TypeIDs
  • dialects/<dialect>/interfaces.md for type-interface anchors (Meyers pairs)
  • dialects/<dialect>/index.md for dialect-level TypeIDs and ranges whose per-op decomposition is documented separately
  • dialects/index.md for upstream MLIR / cross-dialect anchors

Two cross-dialect sharing patterns are worth highlighting:

  1. 0x5B49B18 is reused by both cute.ConstrainedInt and cute_nvgpu.AtomIType. The two share pointer identity because the inline printer emits the same i<N>(<divby M>)? surface syntax for both, and the underlying AbstractType class is parameterised on the same set of attributes — TableGen emits a single TypeID.
  2. The PrintableTypeInterface qword 0x5B47028 is attached to every cute and almost every cute_nvgpu concrete type (27+ installs). When you trace a sentinel comparison against 0x5B47028, you are inside the PrintableTypeInterface dispatch, not a per-type check.

Pairing convention: nv_tileas.convert_layout exemplifies the two-form encoding. Its OperationName opInfo slot (the descriptor passed to sub_4461CA0 at op registration) is 0x5B44FD8, while its AbstractOperation::TypeID slot (the kindPtr reachable via *(qword*)(*(qword*)(op + 48) + 16) after uniquing) is 0x5BE4008. Resolvers compare against the kindPtr; op-builders compare against the opInfo. Treat them as the same op identity at two different indirection levels.
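A tool that wants to treat both forms as one identity can normalize either address to the op name. The sketch below assumes a small hand-built pair table. The convert_layout pair is the one stated above, while the consume_one pair is only inferred from the "(paired)" annotation in the master table. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct op_identity_pair {
    const char *op_name;
    uint64_t    op_info;    /* registration-time descriptor slot */
    uint64_t    kind_ptr;   /* AbstractOperation::TypeID slot */
};

static const struct op_identity_pair k_pairs[] = {
    /* Stated explicitly in the pairing convention. */
    {"nv_tileas.convert_layout",             0x5B44FD8, 0x5BE4008},
    /* Inferred from the "(paired)" note in the master table. */
    {"nv_tileas.async.pipeline.consume_one", 0x5B44F70, 0x5BE5C40},
};

/* Map either form of a sentinel to its op name; NULL if unknown. */
const char *op_name_for_sentinel(uint64_t addr) {
    assert(addr != 0);
    for (size_t i = 0; i < sizeof k_pairs / sizeof k_pairs[0]; ++i) {
        if (k_pairs[i].op_info == addr || k_pairs[i].kind_ptr == addr)
            return k_pairs[i].op_name;
    }
    return NULL;
}
```

Looking up 0x5B44FD8 and 0x5BE4008 returns the same name, which is the "same op identity at two indirection levels" reading in executable form.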