cl::opt Full Catalog

Every command-line option the tileiras binary registers — through LLVM cl::opt / cl::list / cl::alias, the NVIDIA-private dialect-bag, and MLIR PassOption registrars that share its textual surface. An option counts if a static-storage symbol calls llvm::cl::Option::setArgStr(name, len) (sub_4534CC0, 1174 B) at static-construction time, if a tileiras dialect-bag helper (sub_5FE350 / sub_5FE910 / sub_5FED40) runs from a per-invocation builder, or if mlir::detail::PassOptions::Option<T> is constructed from sub_6D3460. The binary contains 689 distinct caller addresses for sub_4534CC0; this catalog covers the 77 user-visible options across registrars 1–7. The 478-row PassBuilder name registry is summarized in PassBuilder Mega-Registry.

Reading guide

Option families surface to the user through different CLI prefixes. The first stop depends on which prefix appears on the command line:

If you see... | Find it in... | Examples
bare --opt-level, -O, -g | Layer 1 (Driver Globals) | --opt-level, --gpu-name, --sanitize
bare --compute-capability, --debuginfo-level | Layer 2 (Dialect Options Bag) | targets pre-driver loop
--pass-pipeline="tileir{key=value ...}" | Layer 3 (TileIR PassOptions) | compute-capability=sm_90, num-warps=8, opt-level=2
-mllvm -nvptx-... | Layer 4 (NVPTX Backend) | upstream LLVM NVPTX target options
bare -Om, -Osize, -w, -Werror | Layer 4b (NVPTX-CL Options) | host-driver-mode compatibility flags
-mllvm -nvvm-reflect-* | Layer 5 (NVVM Reflect) | nvvm-reflect-add KEY=VAL, -R KEY=VAL
--mlir-... | Layer 6 (MLIR Framework) | --mlir-print-ir-after-all, --mlir-timing
--passes="..." pass-name strings | PassBuilder mega-registry | not user-visible cl::opts; see PassBuilder Mega-Registry
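
The lookup in the reading guide can be sketched as a small routing function. This is a minimal illustration, assuming the option names listed in this catalog; find_layer and its after_mllvm parameter are hypothetical names, and the rules only mirror the table, not the driver's actual parser.

```python
# Hypothetical helper: maps one command-line token to the catalog layer that
# documents it. The name sets are copied from this page, not from the binary.
LAYER1 = {"--opt-level", "-O", "-g", "--lineinfo", "--device-debug",
          "--gpu-name", "--host-arch", "--host-os", "--sanitize"}
LAYER2 = {"--compute-capability", "--debuginfo-level", "--is-optimized"}
LAYER4B = {"-Om", "-Osize", "-w", "-Werror"}


def find_layer(token, after_mllvm=False):
    """Return the catalog layer for a token; after_mllvm marks '-mllvm <token>'."""
    name = token.split("=", 1)[0]
    if after_mllvm:
        if name.startswith("-nvvm-reflect-") or name == "-R":
            return "Layer 5 (NVVM Reflect)"
        if name.startswith("-nvptx-"):
            return "Layer 4 (NVPTX Backend)"
        return "other -mllvm option"
    if name == "--pass-pipeline":
        return "Layer 3 (TileIR PassOptions)"
    if name == "--passes":
        return "PassBuilder mega-registry"
    if name.startswith("--mlir-"):
        return "Layer 6 (MLIR Framework)"
    if name in LAYER4B:
        return "Layer 4b (NVPTX-CL Options)"
    if name in LAYER2:
        return "Layer 2 (Dialect Options Bag)"
    # Accept the joined alias form such as -O3 as well as the bare names.
    if name in LAYER1 or (name.startswith("-O") and name[2:].isdigit()):
        return "Layer 1 (Driver Globals)"
    return "unknown"
```

Layer 4b is tested before Layer 1 so that -Om and -Osize are not mistaken for the -O alias.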

When the same name appears in two layers (the documented --compute-capability collision between Layer 2 and Layer 3 is the canonical example), the driver propagates the Layer-2 value down into Layer 3; the Layer-3 default fires only when the MLIR pass library is loaded outside the tileiras driver.

Registrar Tiers

The static-init-time CLI surface partitions into five disjoint registrars plus two LLVM-inherited bulk registrars. Each tier owns its builder body, its storage scheme, and its help-text rodata cluster.

Tier | Count | Registrar function | Storage scheme | End-user prefix
Driver-tier (Layer 1) | 5 | sub_579270 (3834 B), called by thunk sub_57A170 from main | Heap-allocated 864-byte TileirasDriverCLOpts aggregate | bare (--opt-level, -O, --lineinfo, --device-debug, -g)
Dialect-bag (Layer 2) | 3 | sub_602440 (1599 B), called from sub_602A80 per invocation | Heap-allocated ~1488-byte dialect options bag | bare (parsed alongside Layer 1)
TileIR PassOptions (Layer 3) | 20 | sub_6D3460 (13726 B); helpers sub_6D3140 (int), sub_6D2E20 (uint/size_t), sub_5FED40 (bool), sub_4534CC0 (string/enum), sub_44E10F0 (string default setter) | Caller-owned TileIRPipelineOptions struct, ≥5616 B | --pass-pipeline="tileir{...}"
NVPTX backend (Layer 4) | 26 | LLVM static ctors, per TU within NVPTXTargetMachine | Per-TU BSS globals, 160–200 B per option | -mllvm -nvptx-*
NVVM reflect (Layer 5) | 4 | ctor_238 (0x463A70); cl::list backing at 0x5B4F380 | Static cl::opt<bool> at 0x5B4F400, cl::list<string> at 0x5B4F300 | -mllvm -nvvm-reflect-*
MLIR framework (Layer 6) | 18 | MLIR support-library static ctors (AsmPrinter / Diagnostics / MLIRContext / PassTiming) | Per-TU BSS globals | --mlir-*
NVPTX-CL options (Layer 4b) | 13 | sub_45BA4C0 (8524 B), ManagedStatic-guarded; __cxa_atexit per opt | Static globals, 13 __cxa_atexit registrations | bare (no -mllvm prefix)
Misc LLVM (Layer 7) | 1 | LLVM static ctor in ValueTracking.cpp | Per-TU BSS global | -mllvm
PassBuilder mega-registry | 478 | sub_1CCB7D0 (35948 B) | StringMap at *(this+8) | name-registry only; not a CLI option
Driver-tier total (1+2) | 8
Pass-options surface (3) | 47 | (counts include the helper-distinguished int/uint/bool/string/enum sub-types)
Total user-visible | 77
Total setArgStr xrefs | 689 | (LLVM/MLIR/Target full link graph)
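
The totals rows can be cross-checked with simple arithmetic. The sketch below assumes that the 77 user-visible figure is the sum of Layers 1 through 7 with Layer 4b left out; the table does not state this grouping, but it is the one that reproduces the number. The dictionary keys are labels chosen for this example.

```python
# Per-tier counts copied from the table above.
tier_counts = {
    "Layer 1": 5,
    "Layer 2": 3,
    "Layer 3": 20,
    "Layer 4": 26,
    "Layer 4b": 13,
    "Layer 5": 4,
    "Layer 6": 18,
    "Layer 7": 1,
}

# Driver-tier total (Layers 1 + 2).
driver_total = tier_counts["Layer 1"] + tier_counts["Layer 2"]

# Assumption: "user-visible" excludes the Layer 4b compatibility flags.
user_visible = sum(n for tier, n in tier_counts.items() if tier != "Layer 4b")

# Every row of the table, Layer 4b included.
all_tiers = sum(tier_counts.values())
```

Under that assumption driver_total is 8 and user_visible is 77, matching the table; including Layer 4b would give 90.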

Static-init order follows ELF .init_array; each ctor invokes setArgStr(opt, name, len) then done() (sub_4534420, 199 B), inserting into cl::GlobalParser (sub_4530050). Categories attach via sub_452D690 (OptionCategory::addOption, 455 B). The atomic counter NumOccurrences (sub_452D580, 88 B) uses InterlockedExchangeAdd64. Standard cl::alias diagnostics live at 0x45E7218/0x45E7258/0x45E7288/0x45E72C0 (verbatim upstream) and route through sub_452D9F0 → sub_459CCA0; the duplicate-registration error " registered more than once!" is emitted from setArgStr itself.
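
The registration sequence and the duplicate-name check can be pictured with a toy registry. This is a minimal sketch, not the binary's data structure: ToyRegistry, register, and note_occurrence are hypothetical names, the two-step setArgStr/done sequence is collapsed into one call, and only the message suffix is taken from the text above.

```python
DUPLICATE_SUFFIX = " registered more than once!"  # suffix quoted in the text above


class ToyRegistry:
    """Hypothetical stand-in for cl::GlobalParser."""

    def __init__(self):
        self.options = {}      # option name -> occurrence counter
        self.init_order = []   # names in registration (.init_array) order

    def register(self, name):
        # Collapses setArgStr(opt, name, len) and done() into a single step.
        if name in self.options:
            # The wording before the suffix is an assumption of this sketch.
            raise ValueError(f"Option '{name}'{DUPLICATE_SUFFIX}")
        self.options[name] = 0
        self.init_order.append(name)

    def note_occurrence(self, name):
        # Plays the role of the NumOccurrences counter, without atomicity.
        self.options[name] += 1
        return self.options[name]
```

Registering the same name from two static constructors fails on the second call, which is why every name in this catalog appears in exactly one registrar of the global parser.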

Master Table

Columns: option | type | default | help text (verbatim) | storage / agg offset | defining pass / TU | wiki page. Sorted alphabetically within each registrar. Where the per-option BSS storage was not individually extracted, the rodata address of the name string is given as name@<addr>.

Layer 1 — Tileiras Driver Globals (registrar sub_579270)

Aggregate: heap-allocated 864 B TileirasDriverCLOpts, owner pointer returned to main via sub_57A170. cl::opt slots are 192 B; the two cl::alias slots are 136 B. Applicator trampolines are sub_578C40 (int) and sub_578C50 (bool), paired with nullsub_10 / nullsub_11.

option | type | default | help text (verbatim) | storage | defining pass | wiki
--device-debug | cl::opt<bool> | false | Generate debug information (if present in the input bytecode) | name@(len 12), agg+0x218..0x2D7 | sub_579270 | driver/cli-options
-g | cl::alias | device-debug | Alias for --device-debug | name@0x45E74F9 (len 1), agg+0x2D8..0x35F | sub_579270 | driver/cli-options
--lineinfo | cl::opt<bool> | false | Generate line-number information (if present in the input bytecode) | name@0x45E74F0 (len 8), agg+0x158..0x217 | sub_579270 | driver/cli-options
-O | cl::alias | opt-level | Alias for --opt-level | name (len 1), agg+0x0D0..0x157 | sub_579270 | driver/cli-options
--opt-level | cl::opt<int> | 3 | Specify optimization level. Default Value: 3. | name@0x45E74xx (len 9), agg+0x000..0x0C7 | sub_579270 | driver/cli-options

--opt-level ValueStr metavar = "N" (renders as --opt-level=<N>). Aggregate heap-allocated by sub_44A8C20(0x360).

Layer 1 — cl::ValuesClass int32 enum options

Four additional Layer-1 options are wired by byte-equivalent template instantiations of cl::opt<cl::ValuesClass>::opt, each differing only in its string-pair table, parser vtable, and the int32 target slot it writes into the aggregate. The parsed result is always a single int32; downstream code consults the integer, never the string.

option | builder | parser vtable | default | int32 codes
--gpu-name | sub_577620 (5-pair table) | &unk_59A7378 | 100 | "sm_100"=100, "sm_103"=103, "sm_110"=110, "sm_120"=120, "sm_121"=121
--host-arch | sub_577950 (3-pair table) | &unk_59A7468 | 0 | "x86_64"=0, "aarch64"=1, "arm64ec"=2
--host-os | sub_577C80 (2-pair table) | &unk_59A7558 | 0 | "linux"=0, "windows"=1
--sanitize | sub_577FB0 (1-pair table) | &unk_59A7648 | 0 | (unset)=0, "memcheck"=1

GPU-name codes correspond to compute-capability families: 100 is Datacenter Blackwell (default), 103 the Blackwell variant, 110 Jetson Thor, 120 Consumer RTX 50** / Pro, 121 DGX Spark. --sanitize is the toggle that activates the -sanitize=memcheck -g-tmem-access-check nvdisasm tail.

Host-triple resolution reads these ints downstream. sub_40FD330 keys off the host-arch int with stride 39 for x86_64 (code 0), stride 36 for aarch64 (code 1), and stride 36 for arm64ec (code 2 — which uses a sub-entry of the aarch64 record); this is the only place arm64ec diverges from aarch64. sub_40FD7E0 keys off the host-os int with OS-index 7 for linux (code 0) and OS-index 15 for windows (code 1).

The four parser vtables &unk_59A7378 / &unk_59A7468 / &unk_59A7558 / &unk_59A7648 share an 8-slot layout: vtable+0 typeinfo helper, +8 destructor, +16 parse (string → int32 map probe), +24 print (int32 → string lookup), +32 valuesDefault (initialise from a cl::values(...) builder), +40 reserved, +48 reserved, +56 reserved. The parse slot is the only operation invoked at command-line-parse time; the print slot fires only when --help is requested.
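
The parse and print slots behave like a forward and a reverse lookup over the same string-pair table. The sketch below models that with the pair tables listed above; ValuesParser and its method names are hypothetical and say nothing about the real vtable layout.

```python
# String-pair tables copied from the Layer-1 ValuesClass table.
GPU_NAME = {"sm_100": 100, "sm_103": 103, "sm_110": 110, "sm_120": 120, "sm_121": 121}
HOST_ARCH = {"x86_64": 0, "aarch64": 1, "arm64ec": 2}
HOST_OS = {"linux": 0, "windows": 1}
SANITIZE = {"memcheck": 1}


class ValuesParser:
    """Hypothetical model of one cl::ValuesClass option."""

    def __init__(self, pairs, default):
        self.pairs = dict(pairs)
        self.value = default  # the int32 slot written into the aggregate

    def parse(self, text):
        # Command-line time: string -> int32 map probe.
        if text not in self.pairs:
            raise ValueError(f"unknown value: {text!r}")
        self.value = self.pairs[text]
        return self.value

    def print_name(self, code):
        # --help time: int32 -> string lookup; None when no pair matches.
        for name, value in self.pairs.items():
            if value == code:
                return name
        return None
```

Downstream code would read only the integer held in value, which matches the statement that the string is never consulted after parsing.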

Layer 2 — Tileiras Dialect Options Bag (registrar sub_602440)

Aggregate: ~1488 B heap-allocated dialect bag attached to a per-invocation mlir::DialectRegistry. Not visible through the global LLVM parser; consumed by tileiras's pre-driver loop and forwarded to Layer 3. Helpers: sub_5FE350 (enum), sub_5FE910 (string), sub_5FED40 (bool).

option | type | default | help text (verbatim) | storage | defining pass | wiki
--compute-capability | cl::opt<string> | sm_100 | (metavar compute capability, no description body) | name@0x4Exxxxx (len 18), bag+0x4D8..0x5C7; default literal "sm_100"@0x45E7185 | sub_602440 | driver/cli-options
--debuginfo-level | cl::opt<enum> (4-value) | none (=0) | The level of debug info to emit. | name@0x45EF0xx (len 15), bag+0x070..0x1EF | sub_602440 | lowering/target-and-debuginfo
--is-optimized | cl::opt<bool> | false | Encode in the debug info whether the program is optimized or not. | name@0x45EF0xx (len 12), bag+0x400..0x4D7 | sub_602440 | lowering/target-and-debuginfo

Enum table for --debuginfo-level (constructed inline by sub_602440):

value | string | help (verbatim)
0 | none | None.
1 | full | Full.
2 | line-tables | Line Tables Only.
3 | debug-directives | Debug Directives Only.

Note: --compute-capability collides with the Layer-3 PassOption of the same name (Layer 2 defaults to sm_100, Layer 3 to sm_80). The driver propagates Layer 2 into Layer 3; the Layer-3 default fires only when the MLIR passes load without the tileiras driver.
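
The effect of the collision can be summarized in a few lines. This is a minimal sketch of the rule as described in the note, assuming the only two entry points are "through the tileiras driver" and "MLIR passes loaded standalone"; resolve_compute_capability and its parameters are hypothetical names.

```python
LAYER2_DEFAULT = "sm_100"  # dialect-bag default
LAYER3_DEFAULT = "sm_80"   # PassOption default


def resolve_compute_capability(via_driver, cli_value=None):
    """Return the value the Layer-3 option ends up holding.

    via_driver: True when the tileiras driver runs and forwards Layer 2.
    cli_value:  explicit user value, or None when the flag was not given.
    """
    if cli_value is not None:
        return cli_value
    # Without an explicit value, the default depends on who owns the option.
    return LAYER2_DEFAULT if via_driver else LAYER3_DEFAULT
```

With no flag on the command line, a driver invocation therefore targets sm_100, while the same passes loaded without the driver fall back to sm_80.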

Layer 3 — TileIR MLIR PassOptions (registrar sub_6D3460)

20 mlir::Pass::Option<T> registrations on a single TileIRPipelineOptions (≥5616 B). Not in the global LLVM cl::opt registry; parsed from --pass-pipeline="tileir{key=value ...}". reg+N is the registration slot (208 B stride); val+N is the resolved-value offset in the a2 value-struct read by sub_6D6A00. For per-option consumer mapping see Options Mapping.

option | type | default | help text (verbatim) | storage / agg+slot | consuming pass(es) | wiki
approx | bool | 0 | Approximate calculation. | reg+2104, val~+? | NVVMReflect path (sub_14FE980) | libdevice/nvvm-reflect-mechanism
compute-capability | string | sm_80 | compute capability | reg+600, val+728/736 | sub_738810 (Frontend→TileAA), sub_6D0E90 | driver/cli-options
dump-host | string | "" | Print the generated host code to the provided path. | reg+4912, val+5040/5048, presence+5072 | sub_879B50 (EmitHostWrapper) | driver/tileir-callbacks-abi
dynamic-persistent | bool | 0 | Enable dynamic persistent transformation | reg+2936, val+3064 | TileASDynamicPersistent (driver gate) | passes/tileas/cta-cluster-family
emit-line-info | enum (5-value) | none | Emit debug line info from existing or snapshot IR (snapshot saved to ./snapshot.mlir). | reg+3408, val+3536 | driver (snapshot insertion); SynthesizeDebugInfoScopes | lowering/target-and-debuginfo
enable-debug-logging | bool | 0 | Enable debug logging in TileIR host callbacks. | reg+4024, val+4152 | sub_879B50 EmitHostWrapper | driver/tileir-callbacks-abi
enable-random-delay | bool | 0 | enable random delay | reg+2520 | TileAS scheduler family (LOW conf) | scheduler/overview
ftz | bool | 0 | Flush denormal to zero. | reg+2312, val+2232 | NVVMReflect path (nvvm-reflect-ftz ModuleFlag) | libdevice/nvvm-reflect-mechanism
host-triple | string | native | Specify the target triple for TileIR host callbacks. | reg+4232, val+4360 | sub_879B50 EmitHostWrapper | driver/tileir-callbacks-abi
index-bitwidth | int | 32 | Bitwidth of the index type, 0 to use size of machine word. | reg+1688 | ConvertTileASToLLVM, TileAS→NVGPU, ConvertToLLVM, ConvertMemRefToLLVM, ConvertControlFlowToLLVM, UnspecializedPipeline | lowering/tileas-to-llvm
max-constraint-iterations | uint | 10 | Maximum number of iterations for resource constraint generation. Higher values allow more optimization attempts but increase compilation time. Lower values may result in fallback to serial execution when resource constraints are tight. | reg+4704, val+4832 | sub_8A25E0 TileASPrepareForScheduling | passes/tileas/cta-cluster-family
num-ctas | int | 1 | number of ctas in a cga | reg+392, val+520 | sub_738810 (Frontend→TileAA) | lowering/cuda-tile-to-tileaa
num-warps | int | 4 | number of warps | reg+184, val+312 | sub_738810 (Frontend→TileAA) | lowering/cuda-tile-to-tileaa
opt-level | int | 2 | Optimization level for NVVM compilation. Please notice that the default value is 2 and can be set from 0 to 3. | reg+864, val+992 | driver sub_6D6A00 shape switch; ConvertTargetToNVVM (sub_14FE980) | pipeline/driver-and-opt-levels
pipeline-strategy | enum (3-value) | none | Select the strategy of pipelining optimization. | reg+1072, val+1200 | driver sub_6D6A00 / sub_6D0E90 / sub_6D18D0 (warp-specialize selector); SpecializeAgents | passes/tileas/async-pipeline-family
rrt-size-threshold | uint | 4096 | RRT size threshold for quantization (in time slots). Applies quantization when RRT exceeds this size to reduce compilation time at the cost of schedule accuracy. Smaller thresholds enable more compression for faster compilation but with reduced scheduling precision. If threshold is 0, then no quantization will be applied. | reg+4496, val+4624 | sub_8A25E0 TileASPrepareForScheduling | passes/tileas/cta-cluster-family
schedule-trace-file | string | "" | Generate a chrome timeline trace if not empty for the visualizationof the scheduling result for TileASv2 | reg+3144, val+3272 | sub_825050 TileASScheduleRewriteEnable | scheduler/overview
unspecialized-pipeline-num-stages | int | 4 | numStages for unspecialized pipeline pass. | reg+1896, val+1816 | UnspecializedPipeline (sub_1A24770), ConvertTileASToLLVM, TileAS→NVGPU | passes/tileas/async-pipeline-family
use-nvgpucomp-libnvvm | bool | 0 | Use NVGpuComp to compile NVVM IR. If false, use default libnvvm path. | reg+5488, val+5616 | ConvertTargetToNVVM (sub_14FE980) | lowering/nvgpu-and-gpu-to-nvvm
v2-opt-level | int | 0 | Optimization level for tile_ir V2 pass pipeline. | reg+2728, val+2856 | driver sub_6D6A00 second-axis shape gate | pipeline/driver-and-opt-levels

Enum tables for Layer-3 enums:

option | value | string | help (verbatim)
pipeline-strategy | 0 | none | no pipelining optimization
pipeline-strategy | 1 | unspecialize | do pipelining for unspecialized flow
pipeline-strategy | 2 | warp-specialize | do pipelining for warp specialized flow
emit-line-info | 0 | none | Do not emit line info.
emit-line-info | 1 | inputIR | Emit line info from the existing input IR.
emit-line-info | 2 | tileaa | Emit line info from TileAA IR snapshot before lowering to TileAS.
emit-line-info | 3 | tileas | Emit line info from TileAS IR snapshot before lowering to LLVM.
emit-line-info | 4 | post-tileas | Emit line info from Cute and LLVM IR snapshot.
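
How a --pass-pipeline="tileir{key=value ...}" string turns into option values can be sketched as follows. This is an illustration only, assuming space-separated key=value pairs and a small subset of the Layer-3 options; parse_tileir_options is a hypothetical name, the real MLIR pipeline grammar is richer, and whether the 0 to 3 range of opt-level is enforced at parse time is an assumption.

```python
import re

# Defaults copied from the Layer-3 table (subset only).
DEFAULTS = {
    "compute-capability": "sm_80",
    "num-warps": 4,
    "num-ctas": 1,
    "opt-level": 2,
    "emit-line-info": "none",
}
# Accepted strings for the emit-line-info enum, from the enum table above.
EMIT_LINE_INFO = ("none", "inputIR", "tileaa", "tileas", "post-tileas")


def parse_tileir_options(pipeline):
    """Parse 'tileir{key=value ...}' into a dict seeded with the defaults."""
    match = re.fullmatch(r"\s*tileir\{(.*)\}\s*", pipeline)
    if match is None:
        raise ValueError("expected tileir{key=value ...}")
    options = dict(DEFAULTS)
    for item in match.group(1).split():
        key, sep, value = item.partition("=")
        if not sep or key not in DEFAULTS:
            raise ValueError(f"unknown or malformed option: {item!r}")
        # Integer options are converted; string and enum options stay text.
        options[key] = int(value) if isinstance(DEFAULTS[key], int) else value
    if options["emit-line-info"] not in EMIT_LINE_INFO:
        raise ValueError("emit-line-info: unknown value")
    if not 0 <= options["opt-level"] <= 3:  # range taken from the help text
        raise ValueError("opt-level must be between 0 and 3")
    return options
```

For example, parse_tileir_options("tileir{compute-capability=sm_90 num-warps=8 opt-level=2}") overrides two defaults and leaves num-ctas at 1.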

Layer 4 — NVPTX Backend cl::opt (LLVM static ctors)

Per-TU global ctors invoking sub_4534CC0(&optObj, name, len) at static-init. Seven rows below are INITIALIZE_PASS markers (pass-name registrations, not cl::opt). The set is verbatim upstream LLVM NVPTXTargetMachine, except that the default for nvptx-force-min-byval-param-align is patched to false (upstream = true).

option | type | default | help text (verbatim) | storage (string addr) | defining pass | wiki
alloca-hoisting | INITIALIZE_PASS | | NVPTX specific alloca hoisting | name@0x4D12164 | NVPTXAllocaHoisting | nvptx-passes/overview
disable-nvptx-load-store-vectorizer | cl::opt<bool> | false | Disable load/store vectorizer | name@0x4D0EF60 | LSV gate | nvptx-passes/overview
disable-nvptx-require-structured-cfg | cl::opt<bool> | false | Transitional flag to turn off NVPTX's requirement on preserving structured CFG. The requirement should be disabled only when unexpected regressions happen. | name@0x4D0EF88 | NVPTXTargetMachine | codegen/nvptx-bring-up-and-target-init
nvptx-aa-wrapper | INITIALIZE_PASS | | NVPTX Address space based Alias Analysis Wrapper | name@0x4D11FB1 | NVPTXAliasAnalysisWrapper | nvptx-passes/memory-space-opt-and-process-restrict
nvptx-approx-log2f32 | cl::opt<bool> | false | NVPTX Specific: whether to use lg2.approx for log2 | name@0x4D0DA2D | NVPTXISelLowering | libdevice/math-pass-pipeline-and-crosswalk
nvptx-asm-printer | INITIALIZE_PASS | | NVPTX Assembly Printer | name@0x4D07A97 | NVPTXAsmPrinter | codegen/asm-printer-monster-and-windows
nvptx-assign-valid-global-names | INITIALIZE_PASS | | Assign valid PTX names to globals | name@0x4D121A0 | NVPTXAssignValidGlobalNames | nvptx-passes/overview
nvptx-atomic-lower | INITIALIZE_PASS | | NVPTX lower atomics of local memory | name@0x4D1221C | NVPTXAtomicLower | codegen/atomic-warp-sreg-fence
nvptx-early-byval-copy | cl::opt<bool> | false | Create a copy of byval function arguments early. | name@0x4D0F141 | NVPTXLowerArgs | nvptx-passes/lower-args-and-aggr-and-struct
nvptx-emit-init-fini-kernel | cl::opt<bool> | false | Emit kernels to call ctor/dtor globals. | name@0x4D1262C | NVPTXCtorDtorLowering | nvptx-passes/kernel-cdp-inline-pretreat
nvptx-exit-on-unreachable | cl::opt<bool> | false | Lower 'unreachable' as 'exit' instruction. | name@0x4D0F127 | NVPTXISelLowering | codegen/nvptx-target-lowering-call-and-args
nvptx-fma-level | cl::opt<uint> | (default per LLVM) | NVPTX Specific: FMA contraction (0: don't do it 1: do it 2: do it aggressively | name@0x4D0D9DC | NVPTXTargetMachine | libdevice/math-pass-pipeline-and-crosswalk
nvptx-force-min-byval-param-align | cl::opt<bool> | false (NVIDIA-patched; upstream default = true) | NVPTX Specific: force 4-byte minimal alignment for byval params of device functions. | name@0x4D0DBF0 | NVPTXLowerArgs | nvptx-passes/lower-args-and-aggr-and-struct
nvptx-forward-params | INITIALIZE_PASS | | NVPTX Forward Params | name@0x4D12715 | NVPTXForwardParams | nvptx-passes/overview
nvptx-isel | INITIALIZE_PASS | | NVPTX DAG->DAG Pattern Instruction Selection | name@0x4D1293D | NVPTXISelDAGToDAG | codegen/iseldag-and-matchertable
nvptx-libcall-callee | cl::opt<bool> | (default per LLVM) | (controls direct libcall lowering; help colocated near 0x4D08070) | name@0x4D0805A | NVPTXTargetLowering | codegen/nvptx-target-lowering-call-and-args
nvptx-lower-global-ctor-dtor | cl::opt<bool> | false | Lower GPU ctor / dtors to globals on the device. | name@0x4D083A1 | NVPTXCtorDtorLowering | nvptx-passes/kernel-cdp-inline-pretreat
nvptx-lower-global-ctor-dtor-id | cl::opt<string> | "" | Override unique ID of ctor/dtor globals. | name@0x4D12678 | NVPTXCtorDtorLowering | nvptx-passes/kernel-cdp-inline-pretreat
nvptx-no-f16-math | cl::opt<bool> | false | NVPTX Specific: Disable generation of f16 math ops. | name@0x4D0E6C4 | NVPTXISelLowering | libdevice/math-pass-pipeline-and-crosswalk
nvptx-prec-divf32 | cl::opt<uint> | (default per LLVM) | NVPTX Specific: Override the precision of the lowering for f32 fdiv | name@0x4D0DA08 | NVPTXISelLowering (also reads __CUDA_PREC_DIV reflect key) | libdevice/math-pass-pipeline-and-crosswalk
nvptx-prec-sqrtf32 | cl::opt<bool> | false | NVPTX Specific: 0 use sqrt.approx, 1 use sqrt.rn. | name@0x4D0DA1A | NVPTXISelLowering (also reads __CUDA_PREC_SQRT reflect key) | libdevice/math-pass-pipeline-and-crosswalk
nvptx-rsqrt-approx-opt | cl::opt<bool> | false | Enable reciprocal sqrt optimization | name@0x4D15BF5 | NVPTXTargetLowering | libdevice/math-pass-pipeline-and-crosswalk
nvptx-sched4reg | cl::opt<bool> | false | NVPTX Specific: schedule for register pressue | name@0x4D0D9CC | NVPTXSubtarget scheduler choice | codegen/nvptx-subtarget-and-feature-matrix
nvptx-short-ptr | cl::opt<bool> | false | Use 32-bit pointers for accessing const/local/shared address spaces. | name@0x4D0F117 | NVPTXTargetMachine | codegen/nvptx-subtarget-and-feature-matrix
nvptx-traverse-address-aliasing-limit | cl::opt<uint> | (default per LLVM) | Depth limit for finding address space through traversal | name@0x4D12070 | NVPTXAA | nvptx-passes/memory-space-opt-and-process-restrict
nvptx-use-max-local-array-alignment | cl::opt<bool> | false | Use maximum alignment for local memory | name@0x4D11F00 | NVPTXLowerArgs | nvptx-passes/lower-args-and-aggr-and-struct

Layer 4 also carries nvptx-prec-divf32 enum value-strings: Use div.approx, Use div.full, Use IEEE Compliant F32 div.rnd if available (default), Use IEEE Compliant F32 div.rnd if available, no FTZ.

Layer 4b — NVPTX-CL Options Registrar (sub_45BA4C0)

A ManagedStatic-guarded static initializer at sub_45BA4C0 (8524 B) registers exactly 13 llvm::cl::opt instances against the global registry. Each branch ends with __cxa_atexit(dtor, &opt, &__dso_handle). These appear bare on the CLI (no -mllvm prefix).

option | type | default | help text (verbatim) | storage / line | defining pass | wiki
debug-compile | flag | false | Compile for debugging | sub_45BA4C0:708 | tileiras CLI | driver/cli-options
generate-line-info | flag | false | Emit line info even without -G | sub_45BA4C0:774 | tileiras CLI | driver/cli-options
ignore-bad-fp | flag | false | Workaround Gdb problem in dumping floating-point constants | sub_45BA4C0:390 | tileiras CLI | driver/cli-options
line-info-inlined-at | flag | false | Emit line with inlined-at enhancement | sub_45BA4C0:840 | tileiras CLI | driver/cli-options
maxreg | int (with cl::value_desc) | (none) | max regcount | sub_45BA4C0:583 | tileiras CLI | driver/cli-options
nvptx-f32ftz | flag | false | (no description) | sub_45BA4C0:198 | tileiras CLI | driver/cli-options
nvptx-nan | flag | false | (no description) | sub_45BA4C0:134 | tileiras CLI | driver/cli-options
Om | flag | false | Perform maximum optimization | sub_45BA4C0:518 | tileiras CLI | driver/cli-options
Osize | flag | false | Optimize for code size | sub_45BA4C0:454 | tileiras CLI | driver/cli-options
register-usage-level | int | (none) | (no description) | sub_45BA4C0:902 | tileiras CLI | driver/cli-options
value-tracking-max-depth | int | (none) | (no description) | sub_45BA4C0:646 | tileiras CLI | driver/cli-options
w | cl::alias | | disable warnings | sub_45BA4C0:262 | tileiras CLI | driver/cli-options
Werror | flag | false | Treat all warnings as errors | sub_45BA4C0:326 | tileiras CLI | driver/cli-options

Layer 5 — NVVM Reflect cl::opt (ctor_238 at 0x463A70)

Three registrations bundled into one TU ctor: a cl::opt<bool>, a cl::list<std::string>, and a cl::alias. List backing std::vector<std::string> at 0x5B4F380 (begin/end) + 0x5B4F390 (capacity); 32 B/entry. Atexit dtors: sub_9C31D0 (opt), sub_9C3B10 (list), sub_9C3120 (alias).

option | type | default | help text (verbatim) | storage | defining pass | wiki
nvvm-reflect-add | cl::list<string> | (empty) | A key=value pair. Replace __nvvm_reflect(name) with value. | obj@0x5B4F300, name@0x4D3C77A; metavar name=<int>@0x4D3C78B; list backing@0x5B4F380/0x5B4F390 | NVVMReflectPass (sub_1BD0910 / sub_1BD0C50 / sub_1BD1280) | libdevice/nvvm-reflect-mechanism
nvvm-reflect-enable | cl::opt<bool> | true | NVVM reflection, enabled by default | obj@0x5B4F400, name@0x4D3C766, help@0x5B4F428 (len 35) | NVVMReflectPass | libdevice/nvvm-reflect-mechanism
R | cl::alias | nvvm-reflect-add | (alias) | obj@0x5B4F260, name@(len 1); standard 4-check cl::alias validation | NVVMReflectPass | libdevice/nvvm-reflect-mechanism

Two pass-name strings co-locate in Layer 5 but register via INITIALIZE_PASS, not cl::opt:

pass-arg | help (verbatim) | storage
nvvm-intr-range | Add !range metadata to NVVM intrinsics. | name@0x4D0ED8E, help@0x4D3C3B0
nvvm-reflect | Replace occurrences of __nvvm_reflect() calls with 0/1 | name@0x4D0ED5D, help@0x4D3C518

The legacy-PM create-fn for nvvm-reflect (sub_1BD0880, 21 B) is a report_fatal_error("target-specific codegen-only pass") stub; the real pass body reaches through the new-PM path in sub_1A92780. Three soft-CLI parser errors live at 0x4D3C6B8 / 0x4D3C6E0 / 0x4D3C710 (Empty name, Missing value, integer value expected in nvvm-reflect-add option ').
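
The three parser errors suggest a name=<int> split with checks at each step. The sketch below is one plausible reading, not the recovered control flow: parse_reflect_add is a hypothetical name, the condition that triggers each message is an assumption, and the text appended after the trailing quote of the third message is also an assumption.

```python
def parse_reflect_add(arg):
    """Split one nvvm-reflect-add argument of the form name=<int>."""
    name, sep, value = arg.partition("=")
    if not name:
        raise ValueError("Empty name")
    if not sep or not value:
        raise ValueError("Missing value")
    try:
        return name, int(value)
    except ValueError:
        # The source string ends at the opening quote; echoing the argument
        # and closing the quote is an assumption of this sketch.
        raise ValueError(
            f"integer value expected in nvvm-reflect-add option '{arg}'"
        ) from None
```

A well-formed argument such as __CUDA_FTZ=1 yields the pair ("__CUDA_FTZ", 1), which the pass would then use when replacing __nvvm_reflect(name) with the value.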

The 13 B string nvvm-reflect- at 0x4D3C6A0 has ftz\0 at offset +13 (0x4D3C6AD), so a 16 B StringRef into Module::getModuleFlag materializes the key "nvvm-reflect-ftz" from concatenated rodata. The same ftz substring is shared with the Layer-3 ftz PassOption.
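
The address arithmetic behind this rodata overlap is easy to check. The sketch below rebuilds the 17 bytes described above in a Python bytes object and slices it the way a (pointer, length) StringRef would; string_ref is a hypothetical helper, and the buffer contains only the bytes named in the text.

```python
RODATA_BASE = 0x4D3C6A0  # address of "nvvm-reflect-" given above

# "nvvm-reflect-" has no terminator of its own; "ftz\0" follows immediately.
rodata = b"nvvm-reflect-" + b"ftz\0"


def string_ref(address, length):
    """Model a StringRef: 'length' bytes starting at 'address', no NUL needed."""
    start = address - RODATA_BASE
    return rodata[start:start + length].decode("ascii")


prefix = string_ref(0x4D3C6A0, 13)       # the 13 B prefix
full_key = string_ref(0x4D3C6A0, 16)     # prefix plus the adjacent "ftz"
suffix = string_ref(0x4D3C6A0 + 13, 3)   # same bytes seen from 0x4D3C6AD
```

Because a StringRef carries its own length, the 16-byte view reads straight across the boundary between the two literals and yields the full key without any copy or concatenation at run time.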

Layer 6 — MLIR Framework cl::opt (AsmPrinter / Diagnostics / PassManager / Timing)

18 static-ctor cl::opts in MLIR support libs (lib/IR/AsmPrinter.cpp, lib/IR/Diagnostics.cpp, lib/IR/MLIRContext.cpp, lib/Pass/PassTiming.cpp). Exposed unmodified through the global LLVM cl::opt parser.

option | type | default | help text (verbatim) | storage | defining pass | wiki
mlir-disable-threading | cl::opt<bool> | false | Disable multi-threading within MLIR, overrides any further call to MLIRContext::enableMultiThreading() | name@0x502E5E2, help@0x502E618 | MLIRContext | infra/threading-and-synchronization
mlir-elide-elementsattrs-if-larger | cl::opt<uint> | (per MLIR) | Elide ElementsAttrs with "..." that have more elements than the given upper limit | name@0x502CB20 | AsmPrinter | bytecode/asm-printer-status
mlir-elide-resource-strings-if-larger | cl::opt<uint> | (per MLIR) | Elide printing value of resources if string is too long in chars. | name@0x502CBA0 | AsmPrinter | bytecode/asm-printer-status
mlir-output-format | cl::opt<enum> | text | Display method for timing data | name@0x502F5BB, help@0x502F5D0 | PassTiming | mlir-infra/overview
mlir-pretty-debuginfo | cl::opt<bool> | false | Prints out debug info using the pretty forms ignoring raw loc forms | name@0x502CDAE | AsmPrinter | bytecode/asm-printer-status
mlir-print-assume-verified | cl::opt<bool> | false | Skip op verification when using custom printers | name@0x502CDDA, help@0x502CC10 | AsmPrinter | bytecode/asm-printer-status
mlir-print-debuginfo | cl::opt<bool> | false | Print debug info in pretty form | name@0x502CD99 | AsmPrinter | bytecode/asm-printer-status
mlir-print-elementsattrs-with-hex-if-larger | cl::opt<int> | -1 (disabled) | Print DenseElementsAttrs with a hex string that have more elements than the given upper limit (use -1 to disable) | name@0x502CA78 | AsmPrinter | bytecode/asm-printer-status
mlir-print-local-scope | cl::opt<bool> | false | Print with local scope and inline information (eliding aliases for attributes, types, and locations) | name@0x502CDF5, help@0x502CC40 | AsmPrinter | bytecode/asm-printer-status
mlir-print-op-generic | cl::opt<bool> | false | Print all operations using the generic assembly form | name@0x502CDC4 | AsmPrinter | bytecode/asm-printer-status
mlir-print-op-on-diagnostic | cl::opt<bool> | true | When a diagnostic is emitted on an operation, also print the operation as an attached note | name@0x502E5F9, help@0x502E680 | Diagnostics | mlir-infra/diagnostic-abi-and-helpers
mlir-print-skip-regions | cl::opt<bool> | false | Skip regions when printing ops. | name@0x502CE0C, help@0x502CCA8 | AsmPrinter | bytecode/asm-printer-status
mlir-print-stacktrace-on-diagnostic | cl::opt<bool> | false | When a diagnostic is emitted, also print the stack trace as an attached note | name@0x502E6E0, help@0x502E708 | Diagnostics | mlir-infra/diagnostic-abi-and-helpers
mlir-print-unique-ssa-ids | cl::opt<bool> | false | Print unique SSA ID numbers for values, block arguments and naming conflicts across all regions | name@0x502CE3B, help@0x502CD10 | AsmPrinter | bytecode/asm-printer-status
mlir-print-value-users | cl::opt<bool> | false | Print users of operation results and block arguments as a comment | name@0x502CE24, help@0x502CCC8 | AsmPrinter | bytecode/asm-printer-status
mlir-timing | cl::opt<bool> | false | Display execution times | name@0x502F565, help@0x502F571 | PassTiming | mlir-infra/overview
mlir-timing-display | cl::opt<enum> | list | Output format for timing data | name@0x502F589, help@0x502F59D | PassTiming | mlir-infra/overview
mlir-use-nameloc-as-prefix | cl::opt<bool> | false | Print SSA IDs using NameLocs as prefixes | name@0x502CE55, help@0x502CD70 | AsmPrinter | bytecode/asm-printer-status

Enum values — mlir-timing-display: list = display the results in a list sorted by total time (0x502F5F0); tree = display the results ina with a nested tree view (0x502F628, verbatim typo ina preserved from upstream). mlir-output-format: text = display the results in text format (0x502F658); json = display the results in JSON format (0x502F680).

Layer 7 — Misc LLVM cl::opt

option | type | default | help text (verbatim) | storage | defining pass | wiki
disable-i2p-p2i-opt | cl::opt<bool> | false | Disables inttoptr/ptrtoint roundtrip optimization | name@0x4FF26BA, help@0x4FF2688 | llvm/Analysis/ValueTracking.cpp | upstream LLVM ValueTracking

PassBuilder Mega-Registry Note

sub_1CCB7D0 (35948 B) inserts three groups of entries into the PassBuilder StringMap<PassInfo>: 478 pretty-name-keyed ("llvm::TPass]") entries, 66 naked-class entries, and 7 special-form entries (551 total). All of these are downstream LLVM. Twenty NVIDIA-private interleavings (check-gep-index, check-kernel-functions, cnp-launch-check, ipmsp, nv-early-inliner, nv-inline-must, nvvm-pretreat, nvvm-verify, printf-lowering, select-kernels, nvvm-aa, kernel-info, nvvm-reflect-pp, nvvm-peephole-optimizer, propagate-alignment, reuse-local-memory, memory-space-opt, lower-aggr-copies, lower-struct-args, process-restrict) are pipeline-text-parser keys for --passes=..., not freestanding cl::opts. Their factory functors live in parseModulePass / parseCGSCCPass / parseFunctionPass / parseLoopPass / parseMachinePass. Full table: pipeline/passbuilder-mega-registry.