xla.DebugOptions Proto
All symbols, addresses, and field numbers on this page apply to libtpu.so from the libtpu-0.0.40-cp314 wheel (build libtpu_lts_20260413_b_RC00, build-id md5 89edbbe81c5b328a958fe628a9f2207d, 781,691,048 bytes, ELF x86-64 DYN, not stripped). Field names and enum type names are quoted verbatim from the descriptor pool embedded in .rodata. Other versions will differ.
Abstract
xla::DebugOptions is the shared XLA compiler/runtime configuration message — the one proto XLA carries whole across CPU, GPU, and TPU backends. It travels with an HLO module across the JAX/PJRT front-end → backend boundary: it is field 14 of HloModuleConfigProto and field 4 of ExecutionOptions. In a JAX-on-GPU build it is the proto behind the public --xla_* flag surface (--xla_dump_to, --xla_step_marker_location, …). In this libtpu build it is mostly inert: the TPU compiler's real knobs live in the 1121-field TpuCompilationEnvironment, and only two DebugOptions fields are wired to a registered absl::Flag.
This page owns the field-by-field schema: field number → name → wire type → C++/proto type, grouped by area and ordered by field number. It is the authoritative DebugOptions field dictionary that the rest of this wiki cites. The descriptor pool carries 290 live fields (max field number 501, 211 numbering gaps), 17 nested enums, 2 nested map-entry messages, and zero real (user-declared) oneofs. Every scalar is a proto3 optional wrapped in a synthetic single-member oneof, giving explicit has-bit presence — a DebugOptions on the wire records exactly which knobs the front-end touched.
The reference frame for a reimplementer is OpenXLA's own xla.proto. The two TPU-specific facts that bend it: first, the TPU build does not prune the GPU/CPU fields from the proto — 183 xla_gpu_* + 31 xla_cpu_* + 5 xla_llvm_* fields survive in the descriptor so a GPU-built XLA can round-trip them, but no TPU code reads them and no flag sets them. Second, the field numbers are sparse and tombstoned by deletion, not by declared reserved ranges: the descriptor carries no reserved_range / reserved_name, so the 211 gaps are deleted-flag holes guarded only by review discipline.
NOTE — DebugOptions carries 290 live fields, decoded in full from the descriptor at pool index 403 (290 field names cross-matched against the binary). This is the authoritative schema count for the field roster; a partial 111-entry tag sample elsewhere undercounts it.
For reimplementation, the contract is:
- The field schema — 290 field numbers, names, and proto types, ordered by field number, so a reimplementer can rebuild the descriptor and round-trip the wire format.
- The presence model — every scalar is proto3-optional (synthetic oneof, has-bit presence); 12 fields are repeated/map. This is what makes "the user set this knob" distinguishable from "left at default."
- The TPU-inert split — which fields are flag-wired (2), which are PJRT debug_options-only (generic/dump/hlo/llvm), and which are GPU/CPU proto-only carryovers (214) that no TPU code reads.
| Property | Value |
|---|---|
| Message | xla.DebugOptions (package xla, proto3) |
| Descriptor | FileDescriptorProto index 403 = …/compiler/xla/xla.proto (VA 0xc021470) |
| Live fields | 290 (max field# 501, 211 gaps) |
| All-default baseline | xla::DefaultDebugOptionsIgnoringFlags() @ 0x1e66a860 (mangled _ZN3xla32DefaultDebugOptionsIgnoringFlagsEv) |
| Default-diff helpers | GetNonDefaultDebugOptions @ 0x1c920540 · DumpNonDefaultDebugOptions @ 0x1c920d80 |
| Nested enums | 17 (+ 1 field references external xla.autotuner.Backend) |
| Nested messages | 2 map-entry types (auto-generated by map<K,V>) |
| Real oneofs | 0 (278 synthetic proto3-optional markers) |
| Flag-wired fields | 2 — xla_tpu_detect_nan (135), xla_tpu_detect_inf (136) |
| Confidence | CONFIRMED (field names + enum names + baseline symbol byte-anchored) unless a row says otherwise |
1. Type and Prefix Census
Type distribution
The 290 fields decompose by proto type as follows (re-derivable from the descriptor's per-field type cards):
| Proto type | Count | C++ accessor type |
|---|---|---|
| bool | 185 | bool |
| int32 | 30 | int32_t |
| string | 26 | std::string |
| enum | 23 | nested-enum / external enum |
| int64 | 21 | int64_t |
| message | 3 | 1 sub-message + 2 map-entry |
| float | 2 | float |
QUIRK — DebugOptions has zero double and zero unsigned fields, unlike the TCE (which carries 14 doubles and 15 unsigned). DebugOptions is a flat config proto built almost entirely from bool toggles (185 of 290, 64%). A reimplementer can assume signed-or-bool for every numeric scalar except the 2 float fields.
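The census can be re-checked mechanically. The sketch below hard-codes the counts from the type table above and confirms that they sum to the 290 live fields and that bool accounts for roughly 64%. The names TYPE_COUNTS, census_total and share are illustrative only; they are not symbols from the binary.

```python
# Per-type field counts copied from the census table above.
TYPE_COUNTS = {
    "bool": 185,
    "int32": 30,
    "string": 26,
    "enum": 23,
    "int64": 21,
    "message": 3,  # 1 sub-message + 2 map-entry types
    "float": 2,
}


def census_total(counts):
    """Sum the per-type counts; this should equal the live-field total."""
    return sum(counts.values())


def share(counts, proto_type):
    """Fraction of all fields that carry the given proto type."""
    return counts[proto_type] / census_total(counts)


total = census_total(TYPE_COUNTS)
bool_pct = round(100 * share(TYPE_COUNTS, "bool"))
print(total, bool_pct)
```

If a regenerated descriptor yields a different total, the type table and the 290-field roster have drifted apart and one of them needs re-deriving.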
Prefix distribution and TPU build status
The single most important structural fact: the prefix family of a field determines whether it does anything in this TPU build.
| Prefix | Count | TPU build status |
|---|---|---|
| xla_gpu_* | 183 | proto-only — no flag, no TPU consumer (GPU carryover) |
| generic xla_* | 39 | proto-only — settable only via PJRT debug_options |
| xla_cpu_* | 31 | proto-only — no flag, no TPU consumer (CPU carryover) |
| xla_dump_* | 24 | proto-only — settable only via PJRT debug_options |
| xla_hlo_* | 6 | proto-only — settable only via PJRT debug_options |
| xla_llvm_* | 5 | proto-only — settable only via PJRT debug_options |
| xla_tpu_* | 2 | FLAG-WIRED (135 detect_nan, 136 detect_inf) |
GOTCHA — a reimplementer enumerating DebugOptions fields will see 183 xla_gpu_* fields (e.g. xla_gpu_enable_triton_gemm, xla_gpu_autotune_level) that look like live knobs. They are not — there is no xla_gpu_* flag registered anywhere in this binary (zero AbslFlagHelpGenForxla_gpu_* symbols, confirmed). The GPU/CPU fields exist in the shared descriptor purely so a GPU-built XLA can serialize/deserialize the same message; on TPU they are dead weight. Do not wire them to anything. (CONFIRMED — symbol scan returned 0.)
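Because the prefix family decides the build status, a reimplementer can classify any DebugOptions field from its name alone. The following is a minimal sketch of that lookup; the function name, the status strings and the table layout are assumptions made for illustration, and the rule that every xla_tpu_* field is flag-wired holds only because this build's DebugOptions contains exactly the two fields listed above.

```python
# Prefix families from the table above. Anything that matches none of
# these specific prefixes belongs to the generic xla_* family.
PREFIX_STATUS = [
    ("xla_tpu_", "flag-wired"),
    ("xla_gpu_", "proto-only carryover"),
    ("xla_cpu_", "proto-only carryover"),
    ("xla_dump_", "proto-only via PJRT debug_options"),
    ("xla_hlo_", "proto-only via PJRT debug_options"),
    ("xla_llvm_", "proto-only via PJRT debug_options"),
]
GENERIC_STATUS = "proto-only via PJRT debug_options"


def tpu_build_status(field_name):
    """Return the TPU build status implied by a field's prefix family."""
    if not field_name.startswith("xla_"):
        raise ValueError(f"not a DebugOptions field name: {field_name!r}")
    for prefix, status in PREFIX_STATUS:
        if field_name.startswith(prefix):
            return status
    return GENERIC_STATUS  # generic xla_* family


for name in ("xla_tpu_detect_nan", "xla_gpu_enable_triton_gemm",
             "xla_step_marker_location"):
    print(name, "->", tpu_build_status(name))
```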
2. The Presence Model
Proto3-optional, not real oneofs
The descriptor declares 278 "oneofs", but every one is a synthetic single-member proto3-optional marker (_<fieldname>), one per scalar field. There are zero real multi-member user-declared oneofs. Each scalar therefore carries an explicit has-bit: the front-end can distinguish "set to false" from "left at default false."
field 113 xla_dump_hlo_as_proto : bool
└─ synthetic oneof _xla_dump_hlo_as_proto (1 member, proto3-optional)
→ has_xla_dump_hlo_as_proto() reads the has-bit
→ xla_dump_hlo_as_proto() reads the value (zero if unset)
The 12 non-optional fields are 10 repeated-scalar/enum (4 repeated string, 6 repeated enum) and 2 map fields (§4) — there is no standalone (non-map) repeated message field. Repeated/map fields have list/map presence (size), not a has-bit.
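The practical effect of explicit presence shows up on the wire: a proto3-optional field that was set is serialized even when its value is the zero value, while an unset field emits nothing. The sketch below hand-encodes field 113 (xla_dump_hlo_as_proto) under standard protobuf varint rules. The helper names are hypothetical, and modelling "unset" as Python None is an assumption of this example, not how the generated C++ code represents it.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_optional_bool(field_number, value):
    """Encode a bool with explicit presence; value=None means 'unset'."""
    if value is None:
        return b""  # has-bit clear: nothing is written
    tag = (field_number << 3) | 0  # wire type 0 = varint
    return encode_varint(tag) + encode_varint(1 if value else 0)


unset = encode_optional_bool(113, None)
explicit_false = encode_optional_bool(113, False)
explicit_true = encode_optional_bool(113, True)
print(unset.hex(), explicit_false.hex(), explicit_true.hex())
```

The explicit-false case is three bytes long rather than empty, which is exactly the information a receiver needs to tell "set to false" from "never touched".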
NOTE — this presence model is why GetNonDefaultDebugOptions @ 0x1c920540 and DumpNonDefaultDebugOptions @ 0x1c920d80 exist: they diff a populated DebugOptions against the all-default baseline (DefaultDebugOptionsIgnoringFlags @ 0x1e66a860) using the has-bits to emit only the fields the user actually changed. A reimplementer that drops proto3-optional presence loses the "what did the user touch" signal these functions rely on.
Defaults live in .text, not the descriptor
proto3 carries no descriptor-level field defaults — every scalar's wire default is the zero value (false/0/""/enum-0). The effective runtime default is whatever DefaultDebugOptionsIgnoringFlags @ 0x1e66a860 writes before the front-end overrides it. That function is a single large constructor in .text; the recovered per-field default values are owned by default-debugoptions.md. This page owns the schema; that page owns the values.
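The split between wire default and effective default can be modelled with two dictionaries: a baseline (standing in for whatever DefaultDebugOptionsIgnoringFlags writes) and the set of fields the front-end explicitly set. This is a sketch of the idea only, not the recovered logic of GetNonDefaultDebugOptions. The function names are hypothetical, and the baseline values below are placeholders chosen for the example, not the real defaults (those are owned by default-debugoptions.md).

```python
def effective_options(baseline, overrides):
    """Baseline defaults with the explicitly set fields layered on top."""
    merged = dict(baseline)
    merged.update(overrides)
    return merged


def non_default_fields(baseline, overrides):
    """Fields that were set AND differ from the baseline value.

    Presence is modelled by membership in `overrides`: a key that is
    absent corresponds to a cleared has-bit.
    """
    return {name: value for name, value in overrides.items()
            if baseline.get(name) != value}


# Placeholder baseline values for illustration only.
baseline = {"xla_dump_hlo_as_text": False,
            "xla_backend_optimization_level": 3}
# The front-end touched both fields, but only one differs from baseline.
overrides = {"xla_dump_hlo_as_text": False,
             "xla_backend_optimization_level": 1}

touched = sorted(overrides)
changed = non_default_fields(baseline, overrides)
effective = effective_options(baseline, overrides)
print(touched, changed, effective)
```

Keeping `touched` and `changed` separate is the point: without presence, the explicit false on xla_dump_hlo_as_text would be indistinguishable from a field the user never set.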
NOTE — only two DebugOptions fields are flag-wired: a cross-match of all 290 field names against the binary's registered AbslFlagHelpGenForxla_* symbols finds exactly two intersections, xla_tpu_detect_nan (135) and xla_tpu_detect_inf (136) (both detect flag-gen symbols present). The classic dump/HLO knobs (xla_dump_to, xla_hlo_profile, …) are not standalone absl flags in this build — they reach DebugOptions only through the PJRT CompileOptions.debug_options proto path. The other 1328 registered xla_* flags land in the TCE, not here.
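The cross-match described in the note is a set intersection between descriptor field names and the flag names recovered from AbslFlagHelpGenFor symbols. A minimal sketch follows, assuming the symbol list has already been reduced to strings that begin with that prefix (real symbol-table entries may be mangled and need preprocessing first). The symbol xla_tpu_hypothetical_tce_flag is invented here to stand in for a flag that lands in the TCE; it is not a name from the binary.

```python
FLAG_SYMBOL_PREFIX = "AbslFlagHelpGenFor"


def flag_wired_fields(field_names, symbols):
    """Intersect proto field names with flag names recovered from symbols."""
    flag_names = {s[len(FLAG_SYMBOL_PREFIX):]
                  for s in symbols if s.startswith(FLAG_SYMBOL_PREFIX)}
    return sorted(set(field_names) & flag_names)


# A small sample of DebugOptions field names taken from this page.
field_names = ["xla_dump_to", "xla_hlo_profile", "xla_gpu_autotune_level",
               "xla_tpu_detect_nan", "xla_tpu_detect_inf"]

symbols = [
    "AbslFlagHelpGenForxla_tpu_detect_nan",
    "AbslFlagHelpGenForxla_tpu_detect_inf",
    "AbslFlagHelpGenForxla_tpu_hypothetical_tce_flag",  # invented example
    "some_unrelated_symbol",
]

wired = flag_wired_fields(field_names, symbols)
print(wired)
```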
3. Field Schema by Area
The schema follows, grouped by functional area. Groups labeled "selected" or described as representative show a subset of their fields rather than every row; within a group, rows are ordered by field number, with closely related fields kept together as sub-clusters. Column key: # = field number; type carries the proto label (repeated) and the enum/message type name; conf = confidence the name/type is byte-exact (C = confirmed against descriptor strings, H = high). The flag-wiring / dump-trigger note appears in the right column where relevant. Defaults are NOT shown here — see default-debugoptions.md.
3.1 HLO debug / profile (xla_hlo_*) — 6 fields
| # | name | type | conf | note |
|---|---|---|---|---|
| 2 | xla_hlo_graph_addresses | bool | C | message starts at field 2 (field 1 deleted) |
| 9 | xla_hlo_profile | bool | C | |
| 92 | xla_hlo_graph_sharding_color | bool | C | |
| 106 | xla_hlo_evaluator_use_fast_path | bool | C | |
| 370 | xla_hlo_pass_fix_detect_cycles | bool | C | |
| 447 | xla_hlo_print_inline_stack_frames | bool | C | |
3.2 HLO pass control (generic xla_*) — selected
| # | name | type | conf | note |
|---|---|---|---|---|
| 30 | xla_disable_hlo_passes | repeated string | C | per-pass disable list |
| 104 | xla_disable_all_hlo_passes | bool | C | |
| 124 | xla_enable_hlo_passes_only | repeated string | C | allow-list |
| 31 | xla_backend_optimization_level | int32 | C | -O level |
| 33 | xla_embed_ir_in_executable | bool | C | |
| 35 | xla_eliminate_hlo_implicit_broadcast | bool | C | |
| 107 | xla_allow_scalar_index_dynamic_ops | bool | C | |
| 122 | xla_allow_excess_precision | bool | C | |
| 142 | xla_multiheap_size_constraint_per_heap | int32 | C | |
| 251 | xla_debug_buffer_assignment_show_max | int64 | C | |
| 293 | xla_reduce_window_rewrite_base_length | int64 | C | |
| 315 | xla_syntax_sugar_async_ops | bool | C | |
| 335 | xla_enable_fast_math | bool | C | |
| 363 | xla_unsupported_crash_on_hlo_pass_fix_max_iterations | bool | C | |
| 379 | xla_unsupported_crash_on_hlo_pass_noop_change | bool | C | |
| 380 | xla_unsupported_crash_on_hlo_pass_silent_hlo_change | bool | C | |
| 419 | xla_keep_shardings_after_spmd | bool | C | |
| 452 | xla_enable_hlo_sharding_v3 | bool | C | |
| 463 | xla_recognize_reduction_optimization_level | int32 | C | |
| 187 | xla_partitioning_algorithm | enum PartitioningAlgorithm | C | SPMD partitioner select |
| 344 | xla_pjrt_allow_auto_layout_in_hlo | bool | C | |
| 397 | xla_early_exit_with_layouts | bool | C | |
3.3 Layout / test layout (generic xla_*)
| # | name | type | conf | note |
|---|---|---|---|---|
| 90 | xla_test_all_output_layouts | bool | C | |
| 91 | xla_test_all_input_layouts | bool | C | |
| 373 | xla_test_add_command_buffer_mode | bool | C | |
3.4 Numerics / NaN-Inf detection — the TPU-meaningful core
| # | name | type | conf | note |
|---|---|---|---|---|
| 135 | xla_tpu_detect_nan | bool | C | FLAG-WIRED (AbslFlagHelpGenForxla_tpu_detect_nan present) |
| 136 | xla_tpu_detect_inf | bool | C | FLAG-WIRED (AbslFlagHelpGenForxla_tpu_detect_inf present) |
| 403 | xla_detect_unstable_reductions | enum DetectionMode | C | |
| 432 | xla_detect_unstable_reductions_post_optimizations | enum DetectionMode | C | |
| 426 | xla_gpu_detect_nan | enum DetectionMode | C | GPU carryover (note: enum, not bool) |
| 428 | xla_gpu_detect_inf | enum DetectionMode | C | GPU carryover (note: enum, not bool) |
QUIRK — xla_tpu_detect_nan (135) and xla_tpu_detect_inf (136) are bool, but the later-added xla_gpu_detect_nan (426) / xla_gpu_detect_inf (428) are tri-state DetectionMode enums (NONE/WARNING/FAIL). The two TPU bool fields are the only DebugOptions fields with a registered absl::Flag in libtpu — they are the live numerics knobs a JAX-on-TPU user can actually flip via the flag surface. A reimplementer must not assume the _detect_nan family is type-uniform across backends.
3.5 TPU step marker — the one TPU-meaningful enum field
| # | name | type | conf | note |
|---|---|---|---|---|
| 108 | xla_step_marker_location | enum StepMarkerLocation | C | controls TPU step-marker placement |
StepMarkerLocation is enumerated in §5. This is the single DebugOptions enum field that materially affects TPU codegen (where the profiler step marker is inserted relative to while loops).
3.6 Dump / debug emission (xla_dump_*) — 24 fields
The dump family is proto-only (PJRT debug_options path). Six of these are dump triggers: a bool whose set state gates serialization of a specific HLO-family proto (cross-referenced to the proto each emits).
| # | name | type | conf | note |
|---|---|---|---|---|
| 109 | xla_dump_to | string | C | output directory |
| 110 | xla_dump_hlo_module_re | string | C | module-name regex |
| 111 | xla_dump_hlo_pass_re | string | C | pass-name regex |
| 154 | xla_dump_hlo_pipeline_re | string | C | pipeline-name regex |
| 433 | xla_dump_emitter_re | string | C | emitter regex |
| 112 | xla_dump_hlo_as_text | bool | C | |
| 113 | xla_dump_hlo_as_proto | bool | C | → HloProto |
| 114 | xla_dump_hlo_as_dot | bool | C | |
| 115 | xla_dump_hlo_as_url | bool | C | |
| 116 | xla_dump_hlo_as_html | bool | C | |
| 164 | xla_dump_hlo_as_long_text | bool | C | |
| 118 | xla_dump_hlo_snapshots | bool | C | → HloSnapshot |
| 144 | xla_dump_module_metadata | bool | C | → HloModuleMetadataProto |
| 182 | xla_dump_latency_hiding_schedule | bool | C | → ScheduleProto |
| 381 | xla_dump_full_hlo_config | bool | C | → HloModuleConfigProto |
| 405 | xla_dump_hlo_unoptimized_snapshots | bool | C | → HloUnoptimizedSnapshot |
| 131 | xla_dump_include_timestamp | bool | C | |
| 132 | xla_dump_max_hlo_modules | int32 | C | |
| 149 | xla_dump_fusion_visualization | bool | C | |
| 151 | xla_dump_compress_protos | bool | C | |
| 153 | xla_dump_disable_metadata | bool | C | |
| 185 | xla_dump_enable_mlir_pretty_form | bool | C | |
| 290 | xla_dump_large_constants | bool | C | |
| 466 | xla_dump_buffer_assignment_analysis | bool | C | |
Adjacent dump-related generic fields: 252 xla_detailed_logging (bool), 253 xla_enable_dumping (bool), 436 xla_enable_scoped_logging_timers (bool).
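The six dump triggers in the table form a simple field-to-proto mapping. The sketch below records that mapping and reports which protos a given set of options would ask for. The function name and the dict-of-set-fields input are assumptions for illustration; any further gating (for example, whether xla_dump_to must also be set) is not modelled, and the directory path is a made-up value.

```python
# Dump-trigger fields and the proto each one emits, from the table above.
DUMP_TRIGGERS = {
    "xla_dump_hlo_as_proto": "HloProto",
    "xla_dump_hlo_snapshots": "HloSnapshot",
    "xla_dump_module_metadata": "HloModuleMetadataProto",
    "xla_dump_latency_hiding_schedule": "ScheduleProto",
    "xla_dump_full_hlo_config": "HloModuleConfigProto",
    "xla_dump_hlo_unoptimized_snapshots": "HloUnoptimizedSnapshot",
}


def protos_to_dump(options):
    """List the protos requested by the trigger fields that are set to True.

    `options` maps field name -> value for the fields that were set.
    """
    return [proto for field, proto in DUMP_TRIGGERS.items()
            if options.get(field) is True]


options = {
    "xla_dump_to": "/tmp/xla_dump",  # made-up path for the example
    "xla_dump_hlo_as_proto": True,
    "xla_dump_hlo_snapshots": False,
    "xla_dump_full_hlo_config": True,
}
requested = protos_to_dump(options)
print(requested)
```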
3.7 Command buffers / runtime (generic xla_*)
| # | name | type | conf | note |
|---|---|---|---|---|
| 102 | xla_force_host_platform_device_count | int32 | C | |
| 311 | xla_cmd_buffer_trace_cache_size | int64 | C | |
| 317 | xla_enable_command_buffers_during_profiling | bool | C | |
| 364 | xla_flags_reset | bool | C | |
| 408 | xla_disable_automatic_host_compute_offload | bool | C | |
| 429 | xla_enable_enzyme_comms_opt | bool | C | |
| 439 | xla_allow_h2h_copy_when_automatic_host_compute_offload_disabled | bool | C | |
3.8 GPU-inherited-but-inert (xla_gpu_*) — 183 fields, proto-only
The full 183 are present in the descriptor but read by no TPU code. They are best described by their axes rather than dumped row-by-row; the representative anchors below confirm the family is intact and span its functional clusters. Every row is an xla_gpu_* proto-only field.
| cluster | representative fields (#) | types |
|---|---|---|
| autotuning | xla_gpu_autotune_level(123), xla_gpu_autotune_max_solutions(288), xla_gpu_experimental_autotune_cache_mode(324, enum AutotuneCacheMode), xla_gpu_experimental_autotune_backends(442, repeated external enum xla.autotuner.Backend), xla_gpu_autotune_gemm_rtol(316, float) | int32/int64/enum/float |
| Triton GEMM | xla_gpu_enable_triton_gemm(188), xla_gpu_triton_gemm_any(190), xla_gpu_unsupported_enable_triton_gemm(322), xla_gpu_experimental_enable_triton_heroless_priority_fusion(340) | bool |
| collectives / NCCL | xla_gpu_all_reduce_combine_threshold_bytes(157, int64), xla_gpu_nccl_termination_timeout_seconds(163, int64), xla_gpu_enable_pipelined_all_reduce(217), xla_gpu_disable_async_collectives(289, repeated enum CollectiveOpType) | int64/bool/enum |
| command buffer | xla_gpu_enable_command_buffer(258, repeated enum CommandBufferCmdType), xla_gpu_command_buffer_scheduling_mode(404, enum CommandBufferSchedulingMode), xla_gpu_command_buffer_update_mode(469, enum CommandBufferUpdateMode), xla_gpu_command_buffer_unroll_loops(411) | enum/bool |
| latency / scheduling | xla_gpu_enable_latency_hiding_scheduler(186), xla_gpu_enable_analytical_latency_estimator(255), xla_gpu_enable_analytical_sol_latency_estimator(356), xla_gpu_analytical_latency_estimator_options(357, map) | bool/map |
| codegen / cuDNN / cuBLAS | xla_gpu_enable_cublaslt(166), xla_gpu_cudnn_gemm_fusion_level(285, int32), xla_gpu_enable_cudnn_layer_norm(262), xla_gpu_libnvjitlink_mode(343, enum LibNvJitLinkMode) | bool/int32/enum |
| while-loop | xla_gpu_enable_while_loop_unrolling(294, enum WhileLoopUnrolling), xla_gpu_enable_while_loop_double_buffering(248) | enum/bool |
| diagnostics | xla_gpu_shape_checks(170, enum ShapeChecks), xla_gpu_pgle_accuracy_checker(341, enum PGLEStrictnessLevel), xla_gpu_detect_nan(426)/detect_inf(428, enum DetectionMode) | enum |
| paths / files | xla_gpu_cuda_data_dir(61), xla_gpu_dump_autotune_results_to(222), xla_gpu_kernel_cache_file(306), xla_gpu_collectives_implementation(468) | string |
NOTE — the 183 xla_gpu_* fields are not individually documented here because every one is inert on TPU — documenting each "meaning" would restate the field name with no reimplementation value (the §1 GOTCHA). The cluster table proves the family is present and intact in the descriptor (so the wire format round-trips), which is all a TPU reimplementer needs. Field 451 xla_gpu_execution_terminate_timeout is a string, not an int — a deliberate descriptor quirk worth noting if you regenerate the proto.
3.9 CPU-inherited-but-inert (xla_cpu_*) — 31 fields, proto-only
Also proto-only. Representative anchors confirming the family:
| # | name | type |
|---|---|---|
| 60 | xla_cpu_multi_thread_eigen | bool |
| 97 | xla_cpu_use_onednn | bool |
| 99 | xla_cpu_enable_fast_math | bool |
| 120 | xla_cpu_fast_math_honor_nans | bool |
| 308 | xla_cpu_prefer_vector_width | int32 |
| 333 | xla_cpu_max_isa | string |
| 365 | xla_cpu_experimental_xnn_graph_fusion_mode | enum XnnGraphFusionMode |
| 399 | xla_cpu_experimental_onednn_fusion_type | repeated enum LibraryFusionType |
| 448 | xla_cpu_scheduler_type | enum CpuSchedulerType |
| 467 | xla_cpu_opt_preset | enum CpuOptPreset |
3.10 LLVM IR metadata (xla_llvm_*) — 5 fields, proto-only
| # | name | type |
|---|---|---|
| 70 | xla_llvm_enable_alias_scope_metadata | bool |
| 71 | xla_llvm_enable_noalias_metadata | bool |
| 72 | xla_llvm_enable_invariant_load_metadata | bool |
| 73 | xla_llvm_disable_expensive_passes | bool |
| 300 | xla_llvm_force_inline_before_split | bool |
3.11 Escape hatch — map<string,string>
| # | name | type | conf | note |
|---|---|---|---|---|
| 500 | xla_backend_extra_options | map<string,string> | C | the canonical typed-flag escape hatch |
| 357 | xla_gpu_analytical_latency_estimator_options | map<string,string> | C | GPU carryover |
xla_backend_extra_options (500) is the only field that survives unknown-key additions: arbitrary string→string options the backend reads without a typed field. It is deliberately parked at field 500 to leave 471–499 free for in-sequence growth (§6).
4. Nested Messages and Maps
The descriptor carries exactly 2 nested messages, both auto-generated map-entry types, plus one external sub-message referenced by field 424.
DebugOptions.XlaBackendExtraOptionsEntry { string key = 1; string value = 2; }
└─ field 500 xla_backend_extra_options : map<string,string>
DebugOptions.XlaGpuAnalyticalLatencyEstimatorOptionsEntry { string key = 1; string value = 2; }
└─ field 357 xla_gpu_analytical_latency_estimator_options : map<string,string> (gpu carryover)
External (defined elsewhere in xla.proto, NOT nested in DebugOptions):
field 424 xla_gpu_experimental_thunk_buffer_debug_filter : message xla.ThunkBufferDebugFilter
xla.ThunkBufferDebugFilter {
repeated xla.IntRangeInclusive thunk_id_ranges = 1;
repeated string profile_annotation_regexes = 2;
}
xla.IntRangeInclusive { int64 first = 1; int64 last = 2; }
QUIRK — field 424 is the only real sub-message field in DebugOptions; the other two "messages" in the type census (§1) are the auto-generated map-entry types. A reimplementer counting "3 message fields" should not expect 3 hand-written sub-messages — there is one (ThunkBufferDebugFilter, a GPU carryover) plus two map entries. (CONFIRMED — ThunkBufferDebugFilter and XlaBackendExtraOptionsEntry names byte-anchored in the descriptor.)
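On the wire, each map pair is one length-delimited occurrence of the map field, whose payload is an entry message with key = 1 and value = 2, matching the entry types shown above. The following sketch hand-encodes a single pair for field 500 under standard protobuf encoding rules. The helper names and the "k"/"v" pair are illustrative; no real backend option key is implied.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_len_delimited(field_number, payload):
    """Tag (wire type 2) + length + payload."""
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(payload)) + payload


def encode_string_map_entry(map_field_number, key, value):
    """Encode one map<string,string> pair as a nested entry message."""
    entry = (encode_len_delimited(1, key.encode("utf-8"))      # key = 1
             + encode_len_delimited(2, value.encode("utf-8")))  # value = 2
    return encode_len_delimited(map_field_number, entry)


encoded = encode_string_map_entry(500, "k", "v")
print(encoded.hex())
```

A map with several pairs is just this record repeated, which is why map fields have size-based presence rather than a has-bit.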
5. Nested Enums
All 17 nested enums, value-by-value, from the descriptor. The 23 enum-typed fields use one of these except field 442, which references the external xla.autotuner.Backend (defined in a separate descriptor-pool file, value set not recovered here).
StepMarkerLocation (field 108 — the one TPU-meaningful enum):
STEP_MARK_AT_ENTRY=0, STEP_MARK_AT_TOP_LEVEL_WHILE_LOOP=1,
STEP_MARK_NONE=2, STEP_MARK_AT_SECOND_LEVEL_WHILE_LOOP=3
-- non-sequential: NONE=2 sits BETWEEN the two while-loop values
ShapeChecks (170): IGNORE=0, RUNTIME=1, COMPILE_TIME=2
PartitioningAlgorithm (187): NOOP=0, _EXP0=1, _EXP1=2, _EXP2=3
CommandBufferCmdType (258, repeated):
INVALID=0, FUSION=1, CUBLAS=2, CUDNN=3, COLLECTIVES=4, CONDITIONAL=5,
WHILE=6, CUSTOM_CALL=7, CUBLASLT=8, DYNAMIC_SLICE_FUSION=9,
DYNAMIC_SLICE_COPY_FUSION=10
CollectiveOpType (289, repeated):
NOOP=0, ALLREDUCE=1, ALLGATHER=2, REDUCESCATTER=3, COLLECTIVEBROADCAST=4,
ALLTOALL=5, COLLECTIVEPERMUTE=6, RAGGEDALLTOALL=7, ALLCOLLECTIVES=8
WhileLoopUnrolling (294):
NO_UNROLL=0, DOUBLE_BUFFER=1, FULL_UNROLL=2, AUTO_UNROLL=3
AutotuneCacheMode (324): UNSPECIFIED=0, UPDATE=1, READ=2
PGLEStrictnessLevel (341): OFF=0, WARN=1, ERROR=2
LibNvJitLinkMode (343): AUTO=0, DISABLED=1, ENABLED=2
PipelineParallelismOptLevel (351): DISABLE=0, ENABLE=1
XnnGraphFusionMode (365): DISABLED=0, GREEDY=1, GREEDY_SLINKY=2, BYPASS_COST_MODEL=3
LibraryFusionType (399,400,422, all repeated):
INVALID=0, DOT=1, ELTWISE=2, REDUCE=3, INDIVIDUAL_DOT=4, INDIVIDUAL_CONVOLUTION=5
DetectionMode (403,426,428,432): NONE=0, WARNING=1, FAIL=2
CommandBufferSchedulingMode (404): SERIALIZE=0, CONCURRENT=1, LHS=2, CONCURRENT_REGIONS=3
CpuSchedulerType (448): DEFAULT=0, MEMORY_OPTIMIZED=1, CONCURRENCY_OPTIMIZED=2
CpuOptPreset (467): DEFAULT=0, FAST_RUNTIME=1, FAST_COMPILE=2
CommandBufferUpdateMode (469): ALWAYS_UPDATE=0, NEVER_UPDATE=1, CAPTURE_CMD_NEVER_UPDATE=2
GOTCHA — StepMarkerLocation (108) is the enum a TPU reimplementer must get right, and its numbering is a trap: STEP_MARK_NONE=2 sits between the two while-loop placements (AT_TOP_LEVEL_WHILE_LOOP=1, AT_SECOND_LEVEL_WHILE_LOOP=3). Driving step-marker logic off ordinal order ("higher = deeper") is wrong — NONE is not "deepest." Read the enum value by name. The effective TPU default for this field (often overridden away from AT_ENTRY=0) is owned by default-debugoptions.md.
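The numbering trap is easy to demonstrate. The sketch below mirrors the four StepMarkerLocation values and resolves placement by name through an explicit table instead of by ordinal comparison. The depth values and the marker_depth helper are assumptions made for this example; they are an interpretation of the value names, not recovered from the binary.

```python
from enum import IntEnum


class StepMarkerLocation(IntEnum):
    """Values as listed in the descriptor for field 108."""
    STEP_MARK_AT_ENTRY = 0
    STEP_MARK_AT_TOP_LEVEL_WHILE_LOOP = 1
    STEP_MARK_NONE = 2
    STEP_MARK_AT_SECOND_LEVEL_WHILE_LOOP = 3


# Hypothetical lookup: while-loop nesting depth keyed by NAME.
# STEP_MARK_NONE is deliberately absent because no marker is placed.
WHILE_LOOP_DEPTH = {
    StepMarkerLocation.STEP_MARK_AT_ENTRY: 0,
    StepMarkerLocation.STEP_MARK_AT_TOP_LEVEL_WHILE_LOOP: 1,
    StepMarkerLocation.STEP_MARK_AT_SECOND_LEVEL_WHILE_LOOP: 2,
}


def marker_depth(location):
    """Return the nesting depth for the marker, or None if no marker."""
    return WHILE_LOOP_DEPTH.get(StepMarkerLocation(location))


# The ordinal comparison a naive "higher = deeper" rule would rely on:
none_outranks_top_level = (StepMarkerLocation.STEP_MARK_NONE
                           > StepMarkerLocation.STEP_MARK_AT_TOP_LEVEL_WHILE_LOOP)
print(none_outranks_top_level, marker_depth(2), marker_depth(3))
```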
6. Field-Number Gaps — Deletions, Not Reserved Ranges
The message is sparse: 290 used numbers across 1–501, with 211 gaps. The descriptor carries no reserved_range and no reserved_name — positive evidence that every gap is a deleted field (its line removed from xla.proto), not a declared reserved range. Upstream XLA removes a flag by deleting the field declaration; the number is never re-used (protobuf hygiene), but only review discipline guards it, not the reserved keyword.
Gaps (field numbers absent, 1..501):
1, 3-8, 10-29, 32, 34, 36-59, 63-69, 74-89, 93-96, 98, 117, 119, 130,
133-134, 139, 141, 143, 145, 152, 156, 158, 160-162, 167-169, 171-173,
176-180, 183-184, 191-202, 204, 206-207, 211, 214, 218, 220-221, 226,
229-230, 233-234, 238, 242-243, 249, 263-264, 266, 270-271, 275-276,
278-279, 281-282, 286, 298-299, 302-303, 309, 313-314, 319-320,
325-326, 330, 332, 346, 352, 354-355, 358, 361, 367, 369, 371, 385,
394, 396, 398, 402, 423, 430, 443, 446, 464, 471-499
NOTE — field 1 is deleted (the live message starts at field 2, xla_hlo_graph_addresses). The dense low-range gaps (3–8, 10–29, 36–59, 74–89) are early-XLA churn. The 471–499 block (29 numbers) is the open allocation window: fields run sequentially to 470, then jump to the out-of-band 500/501 pair. A reimplementer regenerating this proto for a newer XLA than libtpu_lts_20260413_b_RC00 should expect 471–499 to start filling. (CONFIRMED — descriptor has zero reserved entries; HIGH that 471–499 is the intentional growth window.)
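The gap list and the headline counts can be checked against each other. The sketch below parses the gap list exactly as printed above, derives the live field numbers as its complement in 1..501, and confirms the 211/290 split. The names GAP_SPEC and parse_ranges are illustrative.

```python
# Gap list copied verbatim from the listing above.
GAP_SPEC = """
1, 3-8, 10-29, 32, 34, 36-59, 63-69, 74-89, 93-96, 98, 117, 119, 130,
133-134, 139, 141, 143, 145, 152, 156, 158, 160-162, 167-169, 171-173,
176-180, 183-184, 191-202, 204, 206-207, 211, 214, 218, 220-221, 226,
229-230, 233-234, 238, 242-243, 249, 263-264, 266, 270-271, 275-276,
278-279, 281-282, 286, 298-299, 302-303, 309, 313-314, 319-320,
325-326, 330, 332, 346, 352, 354-355, 358, 361, 367, 369, 371, 385,
394, 396, 398, 402, 423, 430, 443, 446, 464, 471-499
"""


def parse_ranges(spec):
    """Expand a comma-separated list of numbers and lo-hi ranges."""
    numbers = set()
    for token in spec.replace("\n", " ").split(","):
        token = token.strip()
        if not token:
            continue
        if "-" in token:
            lo, hi = token.split("-")
            numbers.update(range(int(lo), int(hi) + 1))
        else:
            numbers.add(int(token))
    return numbers


gaps = parse_ranges(GAP_SPEC)
live = set(range(1, 502)) - gaps  # field numbers 1..501 inclusive
print(len(gaps), len(live), min(live), max(live))
```

A reimplementer adding a field can use the same set to make sure a new number comes from the 471–499 window rather than from a deleted hole.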
7. Reimplementation Notes
- Round-trip the whole proto, even the inert fields. To deserialize a DebugOptions handed in via PJRT CompileOptions.debug_options, a TPU reimplementation must accept all 290 fields (including the 214 GPU/CPU carryovers) or it will reject messages a GPU-built front-end serialized. Read only the ~70 generic/dump/hlo/tpu fields; accept all 290.
- Wire only 2 flags. Only xla_tpu_detect_nan (135) and xla_tpu_detect_inf (136) need an absl::Flag binding. Everything else arrives via the proto, never via the flag surface — see xla-flag-atlas.md for the flag-name roster.
- Preserve proto3-optional presence. The has-bits are what GetNonDefaultDebugOptions / DumpNonDefaultDebugOptions diff against the baseline; dropping presence breaks the "what did the user touch" logging.
- Defaults are a separate concern. This page is field# → name → type only. The effective default of each field (from DefaultDebugOptionsIgnoringFlags @ 0x1e66a860) is owned by default-debugoptions.md. The TCE that wraps DebugOptions for the TPU compile path is owned by tpu-compilation-environment.md.
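The "accept all, read few" rule can be sketched as a wire-format walker that decodes the fields it cares about and steps over everything else using only the wire type. This is a minimal illustration under standard protobuf encoding rules, not the parser libtpu uses. READ_FIELDS is a hypothetical allow-list limited to the two flag-wired bools, and truncated input is not handled.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


# Hypothetical allow-list: fields this sketch decodes (both are bools).
READ_FIELDS = {135: "xla_tpu_detect_nan", 136: "xla_tpu_detect_inf"}


def scan_debug_options(buf):
    """Walk every field; decode allow-listed ones, skip all others."""
    read, skipped = {}, []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 7
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:  # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        if field_number in READ_FIELDS:
            read[READ_FIELDS[field_number]] = bool(value)
        else:
            skipped.append(field_number)  # accepted but not interpreted
    return read, skipped


# Field 135 = true, followed by field 123 (xla_gpu_autotune_level) = 4.
message = bytes.fromhex("b80801" "d80704")
read, skipped = scan_debug_options(message)
print(read, skipped)
```

The GPU field is consumed without error and without being interpreted, which is the behavior the first bullet asks for.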
Related Components
| Component | Relationship |
|---|---|
| xla::DefaultDebugOptionsIgnoringFlags() @ 0x1e66a860 | the all-default baseline constructor; owns effective default values |
| xla::GetNonDefaultDebugOptions @ 0x1c920540 | diffs a populated message against the baseline via has-bits |
| xla::DumpNonDefaultDebugOptions @ 0x1c920d80 | logging path emitting only user-changed fields |
| TpuCompilationEnvironment (_table_ @ 0x21cfa9e0) | the TPU-private master config; DebugOptions is the GPU/CPU-shared sibling |
| HloModuleConfigProto (field 14) / ExecutionOptions (field 4) | the protos that carry DebugOptions across the PJRT boundary |
Cross-References
- overview.md — the three-layer flag→proto→effective-value pipeline; where DebugOptions sits as Stage 2a
- xla-flag-atlas.md — the ~2107-name flag surface; which flags (the 2) set DebugOptions fields vs the 1328 that land in the TCE
- default-debugoptions.md — the DefaultDebugOptionsIgnoringFlags @ 0x1e66a860 effective-default values (this page owns the schema; that page owns the defaults)
- tpu-compilation-environment.md — the 1121-field TCE that wraps DebugOptions for the TPU compile path
- autoproto-autoor-resolution.md — the AutoOr<T> tri-state resolver used by the TCE (DebugOptions itself has no AutoProto fields — all 278 singular fields are plain proto3-optional)