LLVM/MLIR Manifest
All addresses, offsets, and version strings on this page apply to libtpu.so from the libtpu-0.0.40-cp314 wheel (release libtpu_lts_20260413_b_RC00, build-id md5 89edbbe81c5b328a958fe628a9f2207d, 781,691,048 bytes on disk). Other wheels will differ.
Abstract
libtpu.so is not a thin runtime shim; it is a whole compiler statically linked into one shared object. Roughly 37% of its code-and-rodata symbol bytes are upstream LLVM and MLIR, vendored from a single Google-internal LLVM monorepo snapshot and linked together with an out-of-tree TPU LLVM backend, the full upstream MLIR dialect set, and a stack of TPU-specific MLIR dialects (tpu, llo, sparse_core, mosaic_sc, xtile). This page establishes the authoritative manifest of what is compiled in: the exact LLVM/MLIR version evidence, which LLVM core/CodeGen/MC components are present, which MLIR dialects and infrastructure are linked, and the headline versions of the third-party libraries that ride alongside the toolchain.
The version frame is the central complication. Google's build system rewrites the upstream LLVM_VERSION_MAJOR to a rolling sentinel, so the binary carries no LLVM 23.0.0 banner. What it does carry — pinned to byte offsets in .rodata — are two monorepo commit SHAs (clang and LLVM heads), a g3_____-trunk revision tag, the 9999.0.0 sentinel, and a build epoch. From those the upstream major is bounded to LLVM 23-dev (tip-of-trunk, ~April 2026) by the release-branch calendar, not read directly. Every component claim below is anchored to a defined symbol or a .rodata literal recovered from the binary; the version window is the one inference, and it is marked as such.
NOTE: the binary is un-stripped in this wheel.
.symtab carries 1,233,710 entries (sh_size 0x1c3cc50 ÷ 24; the 1232970 that readelf prints in the section's Inf column is the first-global index, not the entry count) with full Itanium mangling. FLIRT pattern-matching is therefore moot: every component below is confirmed by an exact demangled symbol or rodata string, not a fuzzy signature. A production strip removes .symtab/.strtab but leaves all the code bytes identified here.
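The entry count in the note can be re-derived from the two numbers it quotes. The sketch below assumes the standard 24-byte Elf64_Sym record; the variable names are illustrative and not taken from any tool.

```python
# Figures quoted in the note above.
SYMTAB_SH_SIZE = 0x1C3CC50      # .symtab sh_size in bytes
ELF64_SYM_SIZE = 24             # sizeof(Elf64_Sym) on a 64-bit ELF
FIRST_GLOBAL_INDEX = 1_232_970  # sh_info, shown in readelf's "Inf" column

# sh_size must be a whole number of symbol records.
entries, remainder = divmod(SYMTAB_SH_SIZE, ELF64_SYM_SIZE)

# sh_info is the index of the first non-local symbol, so everything from
# that index to the end of the table is global or weak.
non_local = entries - FIRST_GLOBAL_INDEX

print(f"entries={entries:,} remainder={remainder} non_local={non_local}")
```

The zero remainder is a quick sanity check that 24 is the right record size for this section.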
For reimplementation, the contract is:
- The version pin — which LLVM/MLIR commit the toolchain tracks, and how to bound the upstream major from a sentinel-masked build.
- The LLVM component set — Core IR, both ISel paths, MachineCodeGen, the MC layer, the seven registered target backends, the analysis/transform pass pipeline, and the two AOT MLGO advisor models.
- The MLIR component set — Core IR, the three-tier dialect registry, pass infra, bytecode reader/writer, the conversion framework, and the LLVM-IR translation path (but no ExecutionEngine/JIT).
- The embedded-library versions — the key third-party pins (Abseil, protobuf, Eigen, DNNL, ICU, libc++, …) that the toolchain is compiled against, deferring the exhaustive tree to the Embedded-Library Atlas.
| Property | Value |
|---|---|
| LLVM/MLIR version | LLVM 23-dev (trunk), monorepo commit 8918319853fbdf9e6f6cb69e96848f913a22bc31 |
| clang version literal | g3 clang version 9999.0.0 (a70419505471bd8240ef3451dcdd541f8676477c) @ 0xaf591d4 |
| LLVM version literal | LLVM version g3_____-trunk 8918319853fbdf… @ 0xb1fa070 |
| llvm-mc literal | llvm-mc (based on LLVM g3_____-trunk 8918319853fbdf…) @ 0xa12d48e |
| Build epoch | Built on Apr 13 2026 14:17:21 (1776115041) @ 0x84a1d90 |
| Code/data model | Large code model (.lrodata 113.3 MB, .lbss present), x86-64, clang/lld |
| MLIR provenance | Same monorepo commit as LLVM — no separate MLIR version |
| Registered target backends | X86, AArch64, ARM, AMDGPU (+R600), PowerPC, NVPTX, TPU (seven, by LLVMInitialize*TargetInfo) |
| LLVM + MLIR footprint | ~84.1 MB LLVM + ~72.2 MB MLIR = ~156 MB of code+rodata symbol bytes (~37% of T+R) |
LLVM/MLIR Version Pin
Purpose
Fix the exact toolchain version so every llvm::/mlir:: address on the forensics pages is unambiguous, and explain why the upstream major cannot be read directly from a string.
Version Evidence
Three version literals sit in .rodata, each recoverable with strings -t x. They are the primary anchors for the whole toolchain:
0xa12d48e llvm-mc (based on LLVM g3_____-trunk 8918319853fbdf9e6f6cb69e96848f913a22bc31)
0xaf591d4 …PIC LevelCode ModelLarge Data Threshold…g3 clang version 9999.0.0 (a70419505471bd8240ef3451dcdd541f8676477c)
0xb1fa070 LLVM version g3_____-trunk 8918319853fbdf9e6f6cb69e96848f913a22bc31
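The clang and LLVM SHAs are two heads of the same monorepo (clang's last-touched commit vs LLVM core's last-touched commit at the snapshot point), not two repositories. MLIR ships from that same monorepo at that same commit, so there is no independent MLIR version number to find. The clang literal sits in the same string run as the PIC Level / Code Model / Large Data Threshold names, which is consistent with a large-code-model build (see the .lrodata/.lbss sections in ELF Anatomy). One caveat: if both figures are in the same address space, 0xaf591d4 falls inside the 16,384-byte kEigenUnaryLlIr bitcode blob that starts at 0xaf58000, so this string run may be that module's own metadata rather than a driver string pool. In that case the .lrodata/.lbss sections are the stronger evidence for the code model.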
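For readers without `strings` to hand, a minimal Python sketch of the same scan follows. The regular expressions are written from the three literals quoted above; the function name and the idea of reading the whole file into memory are illustrative choices, not the method used to produce this page.

```python
import re

# One pattern per literal family; the 40-hex-digit group is the commit SHA.
VERSION_PATTERNS = {
    "llvm-mc": re.compile(rb"llvm-mc \(based on LLVM (g3_+-trunk) ([0-9a-f]{40})\)"),
    "clang": re.compile(rb"g3 clang version (9999\.0\.0) \(([0-9a-f]{40})\)"),
    "llvm": re.compile(rb"LLVM version (g3_+-trunk) ([0-9a-f]{40})"),
}

def find_version_literals(blob: bytes):
    """Return (file_offset, family, tag_or_version, sha) tuples sorted by offset."""
    hits = []
    for family, pattern in VERSION_PATTERNS.items():
        for m in pattern.finditer(blob):
            hits.append((m.start(), family, m.group(1).decode(), m.group(2).decode()))
    return sorted(hits)

# Usage sketch (the path is a placeholder; the real file is ~780 MB, so
# mmap may be preferable to a plain read):
#   with open("libtpu.so", "rb") as f:
#       for offset, family, tag, sha in find_version_literals(f.read()):
#           print(hex(offset), family, tag, sha)
```

The offsets returned are file offsets, matching what `strings -t x` reports.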
| Field | Value | Anchor |
|---|---|---|
| LLVM trunk SHA | 8918319853fbdf9e6f6cb69e96848f913a22bc31 | .rodata 0xb1fa070 |
| clang trunk SHA | a70419505471bd8240ef3451dcdd541f8676477c | .rodata 0xaf591d4 |
| Revision tag | g3_____-trunk (Google rolling-trunk sync) | both literals |
| clang version string | 9999.0.0 (Google sentinel for tip-of-trunk) | clang literal |
| Build toolchain | Bazel, release r4rca-2026.04.04-1 (mainline @894239244) @ 0x84a12a0 | .rodata |
| Build epoch (UTC) | 2026-04-13 21:17:21 (epoch 1776115041) | .rodata 0x84a1d90 |
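The embedded literal reads 14:17:21 while the table gives 21:17:21 UTC; both come from the same epoch value. The sketch below shows the conversion, assuming the literal was formatted in US Pacific daylight time (UTC-7). That time zone is an inference from the seven-hour gap, not something the binary states.

```python
from datetime import datetime, timedelta, timezone

BUILD_EPOCH = 1776115041  # integer embedded next to the "Built on" literal

# Interpret the epoch as UTC.
built_utc = datetime.fromtimestamp(BUILD_EPOCH, tz=timezone.utc)

# Assumption: the build host formatted local time at UTC-7 (Pacific daylight time).
built_local = built_utc.astimezone(timezone(timedelta(hours=-7)))

print(built_utc.strftime("%Y-%m-%d %H:%M:%S"))    # 2026-04-13 21:17:21
print(built_local.strftime("%b %d %Y %H:%M:%S"))  # Apr 13 2026 14:17:21
```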
Bounding the Upstream Major
The 9999.0.0 sentinel and g3_____-trunk tag deliberately mask LLVM_VERSION_MAJOR; no LLVM 23.0.0-style literal survives. The major is pinned from the release-branch calendar, not a string:
Build epoch : 2026-04-13 (RC cut from piper @894239244; Bazel r4rca-2026.04.04-1)
google3 sync lag : days-to-low-weeks behind upstream main
=> upstream main window: ~late-March to mid-April 2026
LLVM release cadence (6-month major; branch ~6-8 wk before .1.0):
21.x branched ~Jul 2025 (main was 21.0.0git in 2025)
22.x branched ~late-Jan/Feb 2026, 22.1.0 ~Mar 2026
23.x branches ~Jul/Aug 2026
Once 22.x branches (Jan/Feb 2026), main's LLVM_VERSION_MAJOR bumps to 23.
By April 2026 upstream main reports 23.0.0git.
=> embedded LLVM = LLVM 23-dev (trunk), post-22.x-branch, pre-23.x-branch.
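The calendar argument can be written as a small lookup. This is a sketch only: the branch dates are day-level stand-ins for the approximate months listed above (assumed values, not upstream records), and `main_major_on` is a hypothetical helper name.

```python
from datetime import date

# (release branch major, approximate branch date). When release/N.x branches,
# upstream main's LLVM_VERSION_MAJOR bumps to N + 1.
BRANCH_DATES = [
    (21, date(2025, 7, 15)),  # "~Jul 2025"
    (22, date(2026, 1, 31)),  # "~late-Jan/Feb 2026"
    (23, date(2026, 7, 31)),  # "~Jul/Aug 2026" (future relative to the build)
]

def main_major_on(day: date) -> int:
    """Major version that upstream main would report on the given day."""
    major = BRANCH_DATES[0][0]  # before the first listed branch, main is still that major
    for branched_major, branch_day in BRANCH_DATES:
        if day >= branch_day:
            major = branched_major + 1
    return major

# Both ends of the assumed sync window resolve to the same major.
window = (date(2026, 3, 20), date(2026, 4, 13))
print([main_major_on(d) for d in window])
```

Because the window sits months away from either neighbouring branch point, shifting the assumed dates by a few weeks does not change the answer.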
GOTCHA: do not treat 9999.0.0 as a real version, and do not assume a stable release ABI. This is tip-of-trunk: it can contain post-22-branch IR/pass changes that no tagged release ships, and it predates the 23.0.0 release. A reimplementer targeting "LLVM 22" or "LLVM 23" tagged sources will see API drift; the only exact reference is the monorepo commit 8918319853fbdf….
NOTE: the major-version window (23-dev) is HIGH confidence, bounded by the build epoch and the deterministic branch/version-bump mechanic. It is the single inferred datum on this page; everything else is a direct symbol/string hit. The remaining gap is the exact upstream commit date for 8918319853fbdf…, which is not embedded (only the SHA), so converting "23-dev window" to "23-dev as of YYYY-MM-DD" requires an external monorepo lookup.
LLVM Core
Purpose
Enumerate the LLVM components statically linked in — this is a full code-generation toolchain, not a stub. The headline is that the entire SelectionDAG + MachineCodeGen + MC stack is present, several upstream target backends are linked alongside the out-of-tree TPU target, and two MLGO advisor models are AOT-baked into rodata.
Component Manifest
Every row is a defined-symbol hit (nm -C libtpu.so | rg …). Confidence is CERTAIN where a concrete class symbol is present.
| LLVM component | Present | Primary evidence (defined symbol) |
|---|---|---|
| Core IR | YES | llvm::Module, llvm::Function, llvm::BasicBlock, llvm::Instruction, llvm::LLVMContext |
| Bitcode reader/writer | YES | llvm::BitcodeReader, llvm::parseBitcodeFile, llvm::WriteBitcodeToFile |
| SelectionDAG ISel | YES | llvm::SelectionDAG, llvm::SelectionDAGISel, llvm::TargetLowering |
| GlobalISel infra | YES (linked) | llvm::InstructionSelect, llvm::LegalizerInfo, llvm::RegisterBankInfo |
| MachineCodeGen | YES | llvm::MachineFunction, llvm::MachineInstr, llvm::LiveIntervals |
| MC layer | YES | llvm::MCStreamer, llvm::MCInst, llvm::MCCodeEmitter (+ TPUMCCodeEmitter) |
| TPU target backend | YES | llvm::TPUTargetMachine + the llvm::TPU* family (below) |
| Analysis passes | YES | llvm::ScalarEvolution, llvm::PassBuilder (NewPM) |
| MLGO advisor models | YES (2) | RegAllocEvictModel, InlinerSizeModel (AOT, see below) |
| Embedded LLVM bitcode | YES | kEigenUnaryLlIr_constant_buffer_contents @ 0xaf58000 |
| MCJIT / ORC | YES | llvm::MCJIT, llvm::orc::* (LLVM ExecutionEngine — note: distinct from mlir::ExecutionEngine, which is absent); XLA CPU backend JITs llvm::Modules |
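A presence check of this kind can be sketched over captured `nm -C` output. The helpers below are illustrative (hypothetical names), and the parsing assumes the common three-column `address type name` layout for defined symbols, with undefined symbols lacking an address.

```python
import re

# Defined symbols start with a hex address; undefined ones ("U name") do not.
NM_DEFINED = re.compile(r"^([0-9a-fA-F]+)\s+(\S)\s+(.*)$")

def defined_symbols(nm_lines):
    """Yield the demangled names of symbols that carry an address."""
    for line in nm_lines:
        m = NM_DEFINED.match(line.strip())
        if m:
            yield m.group(3)

def has_class(symbols, qualified_name):
    """True if any symbol mentions exactly this qualified class name."""
    # The lookarounds stop llvm::Foo from matching x::llvm::Foo or llvm::FooBar.
    pattern = re.compile(r"(?<![\w:])" + re.escape(qualified_name) + r"(?!\w)")
    return any(pattern.search(s) for s in symbols)

# Usage sketch:
#   syms = list(defined_symbols(open("nm_output.txt")))
#   has_class(syms, "llvm::SelectionDAGISel")
```

Requiring an address is what makes a hit a defined-symbol hit rather than a mere reference.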
QUIRK: both ISel infrastructures are linked. GlobalISel (InstructionSelect/LegalizerInfo/RegisterBankInfo) is present, but the TPU path's MC emitter (TPUMCCodeEmitter::getBinaryCodeForInstr @ 0x13c74da0, a 5,667-case switch over InstBits) is downstream of MachineInstr, consistent with a SelectionDAG-primary backend. Whether the TPU target also has a GlobalISel path for some opcodes is not resolved from the symbol surface alone; it needs a disassembly of the pass-pipeline constructor.
The Linked Target Backends
The inventory of LLVMInitialize*TargetInfo registrations proves this binary registers seven LLVM target backends, not just the custom TPU one:
LLVMInitializeX86TargetInfo ── host backend (the binary runs on x86-64)
LLVMInitializeAArch64TargetInfo
LLVMInitializeARMTargetInfo
LLVMInitializeAMDGPUTargetInfo ── (+ R600MCCodeEmitter, the legacy AMDGPU sub-target)
LLVMInitializePowerPCTargetInfo
LLVMInitializeNVPTXTargetInfo
LLVMInitializeTPUTargetInfo ── the out-of-tree Google backend (this page's headline)
The set of LLVMInitialize*Target (codegen) initializers is the same seven, and each carries an instantiated *TargetMachine class (llvm::X86TargetMachine, llvm::AArch64TargetMachine, llvm::ARMBaseTargetMachine, llvm::AMDGPUTargetMachine/GCNTargetMachine/R600TargetMachine, llvm::PPCTargetMachine, llvm::NVPTXTargetMachine, llvm::TPUTargetMachine). Each in-tree target carries its own TableGen InstBits encoder table, e.g. AMDGPUMCCodeEmitter::…::InstBits @ 0x29d8910, AArch64MCCodeEmitter::…::InstBits @ 0x397e980, PPCMCCodeEmitter::…::InstBits @ 0x3c0d770. The non-host in-tree targets are registerAllTargets()-style fallout (upstream's equivalent entry point is llvm::InitializeAllTargets()): the same build-system over-linking that drags in the unused MLIR target dialects, below.
QUIRK: Hexagon source TUs are partially linked (≈140 _GLOBAL__sub_I_Hexagon*.cpp static-init thunks and a handful of llvm::HexagonSubtarget::*Mutation symbols survive), but Hexagon is not a usable backend: there is no LLVMInitializeHexagon* initializer, no llvm::HexagonTargetMachine vtable or constructor, and no HexagonMCCodeEmitter. It is dead static-init residue from over-linking, not a registered target, so do not count it among the seven.
Note: the TPU target is the only custom backend, but it is one of seven registered backends. X86, AArch64, ARM, AMDGPU+R600, PowerPC, and NVPTX are all statically linked and registered alongside TPU. X86 is the host backend, confirmed by llvm::X86TargetMachine/llvm::X86Subtarget/llvm::X86InstrInfo and createX86MCCodeEmitter. (A separate Hexagon static-init residue is present but is not a registered backend; see the QUIRK above.)
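The registered-backend count reduces to matching one symbol-name pattern. A minimal sketch follows, assuming a plain list of symbol names as input; `registered_targets` is a hypothetical helper, and the Hexagon file name in the usage note is an invented instance of the `_GLOBAL__sub_I_Hexagon*.cpp` pattern.

```python
import re

# A backend counts as registered only if its TargetInfo initializer is defined.
TARGET_INFO_RE = re.compile(r"^LLVMInitialize(\w+?)TargetInfo$")

def registered_targets(symbol_names):
    """Return the sorted target names that have an LLVMInitialize*TargetInfo symbol."""
    found = set()
    for name in symbol_names:
        m = TARGET_INFO_RE.match(name)
        if m:
            found.add(m.group(1))
    return sorted(found)
```

Fed the seven initializer names listed above plus a stray `_GLOBAL__sub_I_HexagonSubtarget.cpp` thunk, this returns seven targets and no Hexagon, which is the distinction the QUIRK draws between linked residue and a registered backend.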
The TPU Target Backend
The out-of-tree TPU target is the single most distinctive LLVM component — a complete llvm::Target named "TPU" that does not exist upstream. Its TableGen tables are located and sized at fixed addresses:
| Table / function | Address | Role |
|---|---|---|
| TPUMCCodeEmitter::…::InstBits | 0x3366d90 | Per-opcode instruction encoding bits (TensorCore) |
| …::InstBits_BarnaCorePxcHwMode | 0x33931f0 | BarnaCore (Pufferfish PXC) HwMode encoding variant |
| llvm::TPUDescs | 0x33bf650 | Per-instr MCInstrDesc table |
| llvm::TPUInstrNameData | 0x33f2be0 | Instr mnemonic string pool |
| llvm::TPUFeatureKV | 0x21934550 | SubtargetFeatureKV key/value (16 features; 1,152 B / 72 B-stride) |
| llvm::TPUSubTypeKV | 0x21934ca0 | Subtype/CPU key/value (9 CPU variants; 1,008 B / 112 B-stride) |
| TPUMCCodeEmitter::getBinaryCodeForInstr | 0x13c74da0 | 5,667-case encoder switch (selects InstBits per HwMode) |
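The addresses in the table also bound the table sizes. The sketch below assumes the two InstBits tables and TPUDescs are laid out back to back with no padding (an assumption; alignment gaps would make the deltas upper bounds) and cross-checks the key/value table sizes against entry count × stride.

```python
# Addresses from the table above.
INSTBITS = 0x3366D90
INSTBITS_BARNACORE = 0x33931F0
TPU_DESCS = 0x33BF650

# Distance to the next table bounds each table's size.
instbits_bytes = INSTBITS_BARNACORE - INSTBITS
barnacore_bytes = TPU_DESCS - INSTBITS_BARNACORE

# If the encoder switch has one case per InstBits row, the row width follows.
ENCODER_CASES = 5667
bytes_per_case, leftover = divmod(instbits_bytes, ENCODER_CASES)

# Key/value tables: entries x stride should equal the quoted byte sizes.
feature_kv_bytes = 16 * 72
subtype_kv_bytes = 9 * 112

print(instbits_bytes, barnacore_bytes, bytes_per_case, leftover)
print(feature_kv_bytes, subtype_kv_bytes)
```

Both deltas come out identical and divide evenly by the case count, which suggests a fixed-width encoding row per opcode in each HwMode variant. That reading is an inference from the arithmetic, not something a symbol states.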
Five silicon subtarget classes (plus a generated llvm::TPUGenMCSubtarget) cover the TensorCore generations across the HAL families (the abbreviations map to TPU codenames; the SparseCore sequencer split is documented in the tpu dialect and lowering pages):
llvm::TPUSubtarget ── base (jellyfish/dragonfish/pufferfish TensorCore)
llvm::TPUBcSubtarget ── BarnaCore PXC
llvm::TPUVfcSubtarget ── vfc = viperfish (v5)
llvm::TPUGlcSubtarget ── glc = ghostlite (gen 4; mktg "Trillium")
llvm::TPUGfcSubtarget ── gfc = 6acc60406 / TPU7x (gen 5; mktg "Ironwood")
The per-generation scheduling models are present as nine *SchedModelSchedClasses tables, split by sequencer type (SCS / TAC / TEC) and generation (VF / GL / GF):
BarnaCorePFSchedModelSchedClasses ── Pufferfish BarnaCore
SparseCoreScs{GF,GL,VF}SchedModelSchedClasses ── SCS sequencer (3 gens)
SparseCoreTac{GL,VF}SchedModelSchedClasses ── TAC sequencer (2 gens — NO GF)
SparseCoreTec{GF,GL,VF}SchedModelSchedClasses ── TEC sequencer (3 gens)
QUIRK: there is no SparseCoreTacGF table. The GF generation (gfc = 6acc60406 / TPU7x, mktg "Ironwood") ships an SCS and a TEC sequencer but no TAC sequencer, so its TAC scheduling model is absent by design, not missing by extraction error. A reimplementer iterating generations × sequencer-types and assuming a full 3×3 grid will allocate a table the hardware does not have.
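One way to avoid the full-grid mistake is to enumerate the grid with an explicit exclusion set. A minimal sketch, with the table names built from the pattern shown above and the helper name chosen for illustration:

```python
from itertools import product

SEQUENCERS = ("Scs", "Tac", "Tec")
GENERATIONS = ("VF", "GL", "GF")

# Combinations the hardware does not have: GF ships no TAC sequencer.
ABSENT = {("Tac", "GF")}

def sparse_core_sched_tables():
    """Names of the SparseCore scheduling-class tables that actually exist."""
    return [
        f"SparseCore{seq}{gen}SchedModelSchedClasses"
        for seq, gen in product(SEQUENCERS, GENERATIONS)
        if (seq, gen) not in ABSENT
    ]

# The BarnaCore table sits outside the sequencer x generation grid.
ALL_SCHED_TABLES = ["BarnaCorePFSchedModelSchedClasses", *sparse_core_sched_tables()]
print(len(ALL_SCHED_TABLES))  # 9
```

Keeping the exclusion as data rather than a special case in the loop makes it easy to extend if a later generation drops or adds a sequencer.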
MLGO Advisor Models
Two ML-guided-optimization models are AOT-compiled (via tfcompile) directly into native code plus constant-buffer rodata — there is no TF runtime or interpreter for them. They are the LLVM "release-mode" MLGO models baked in through the upstream LLVM_RAEVICT_MODEL_PATH / LLVM_INLINER_MODEL_PATH build mechanism:
| Model | Consumer | Evidence |
|---|---|---|
| RegAllocEvictModel | LLVM greedy RA MLRegAllocEvictAdvisor | …RegAllocEvictModel… symbol family |
| InlinerSizeModel | LLVM inliner MLInlineAdvisor | _llvm__InlinerSizeModel_*_fusion_ constant-buffer symbols |
These are LLVM-backend MLGO advisors, not an XLA learned cost model. The InlinerSizeModel constant buffers are even named per fusion shape (dot_add_fusion, iota_reduce_fusion, compare_convert_fusion, …), confirming they are the trained inliner-for-size weights.
Embedded LLVM Bitcode
The binary ships at least one precompiled LLVM bitcode module as a rodata blob, which is direct proof the bitcode reader is live:
0x0af58000 kEigenUnaryLlIr_constant_buffer_contents (LLVM bitcode Module; vectorised tanh + FMA/min/max intrinsics)
llvm_ir::kEigenUnaryLlIr (guard/storage symbol @ .bss 0x224ee910)
It is linked into JIT'd CPU llvm::Modules at compile time via llvm::parseBitcodeFile, so Eigen-lowered tanh resolves to a vector loop rather than a libm call. This blob is LLVM IR bitcode (LLVM-23-dev format), distinct from the MLIR bytecode surface (below) — both readers are compiled in.
MLIR Dialects
Purpose
Enumerate the MLIR components. MLIR ships from the same monorepo commit as LLVM core, so it is the LLVM-23-dev in-tree MLIR, layered with TPU-specific out-of-tree dialects authored against that API.
Core IR and Infrastructure
| MLIR component | Present | Primary evidence |
|---|---|---|
| Core IR | YES | mlir::MLIRContext, mlir::OpBuilder, mlir::Operation (+ Block/Region/Value/Type) |
| Dialect registry | YES | 67 concrete *Dialect classes (unique vtable for …Dialect, excl. base mlir::Dialect) |
| Pass infrastructure | YES | mlir::PassManager, mlir::Pass |
| Bytecode reader/writer | YES | mlir::BytecodeReader, mlir::writeBytecodeToFile |
| Conversion framework | YES | mlir::ConversionTarget, mlir::TypeConverter, mlir::RewritePatternSet |
| LLVM-IR translation | YES | mlir::translateModuleToLLVMIR |
| ExecutionEngine / JIT | NO | no mlir::ExecutionEngine symbol |
NOTE: the bytecode reader/writer and the LLVM-IR translation path are both present, but mlir::ExecutionEngine is absent. MLIR is used purely as an AOT translate path: the SparseCore lowering chain runs translateModuleToLLVMIR and hands the resulting llvm::Module to the LLVM TPU backend (LowerToSparseCoreLlvm). There is no MLIR JIT, which resolves a standing question about whether an MLIR ExecutionEngine is linked: it is not.
NOTE: the directly-verifiable count of instantiated dialect classes is 67: one "vtable for …Dialect" symbol per concrete dialect, deduped, excluding the base mlir::Dialect (counting the base gives 68). Looser regexes over the raw symbol surface inflate this: matching every bare …Dialect token double-counts namespaced vs. unqualified spellings (e.g. xla::xtile::XTileDialect and XTileDialect) and pulls in constructors/typeinfo, yielding figures north of 200. The vtable count is the conservative, one-class-one-hit number. The Embedded-Library Atlas owns the full dialect tree.
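The counting rule in the note can be sketched as a filter over demangled symbol names. The regex and helper name below are illustrative; the sketch assumes each input string is one demangled name such as `vtable for mlir::tpu::TPUDialect`.

```python
import re

# Match only vtable symbols whose class name ends in "Dialect".
VTABLE_DIALECT_RE = re.compile(r"vtable for ((?:\w+::)*\w*Dialect)$")

def concrete_dialects(demangled_symbols):
    """Set of concrete dialect classes: one vtable each, base class excluded."""
    names = set()
    for sym in demangled_symbols:
        m = VTABLE_DIALECT_RE.search(sym.strip())
        if m and m.group(1) != "mlir::Dialect":
            names.add(m.group(1))
    return names

# Usage sketch: len(concrete_dialects(symbols)) gives the conservative count;
# constructors, typeinfo, and unqualified spellings never match "vtable for".
```

Anchoring on the vtable and on the end of the name is what keeps constructors, typeinfo records, and classes like DialectInterface out of the tally.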
The Three-Tier Dialect Set
The dialects fall into three tiers (A with its A' sub-tier, B, and C), and the distinction matters for a reimplementer: most of Groups A and A' is linked but unused by the TPU lowering chain. It is registerAllDialects() over-linking, the MLIR analogue of the over-linked LLVM target backends.
| Group | Dialects (verified samples) | Status on TPU path |
|---|---|---|
| A — upstream MLIR core | func, arith, scf, cf, vector, memref, tensor, linalg, affine, math, complex, index | Registered; some used in early lowering |
| A' — LLVM/target dialects | llvm, gpu, spirv (and the X86/NVVM/ROCDL/AMDGPU family) | Linked, unused (over-linking fallout) |
| B — HLO / input | stablehlo, chlo, mhlo, vhlo | Input + early-lowering |
| C — TPU-specific (Google) | tpu, llo, sparse_core (ScDialect, LlvmTpuDialect), mosaic_sc, xtile | The TPU compilation chain |
The Group-C dialects are confirmed by their concrete Dialect classes:
mlir::tpu::TPUDialect ── the TPU dialect (tpu/Mosaic ops)
mlir::llo::LLODialect ── low-level ops (LLO)
mlir::sparse_core::ScDialect ── SparseCore ops
mlir::sparse_core::LlvmTpuDialect ── SparseCore -> LLVM-TPU bridge dialect
mlir::mosaic_sc::MosaicSCDialect ── Mosaic-SparseCore
xla::xtile::XTileDialect ── XLA XTile tiling dialect
QUIRK: the presence of mlir::gpu::GPUDialect, mlir::spirv::SPIRVDialect, and the X86/NVVM target dialects in a TPU plugin is not evidence of a GPU code path. It is registerAllDialects() pulling the whole upstream dialect set into the static link. The same over-linking explains the AArch64/ARM/AMDGPU/PowerPC LLVM backends. A reimplementer should treat Group A' as dead weight on the TPU path, not as a capability.
Embedded Third-Party Libraries
Purpose
Pin the headline versions of the third-party libraries the toolchain is compiled against. This page gives the LLVM/MLIR-adjacent manifest with the determinable versions; the Embedded-Library Atlas owns the exhaustive ~60-library tree with byte accounting.
Version Manifest
Versions are pinned three ways: a versioned path literal (decisive), a version-tagged namespace (decisive), or a feature-floor (a symbol that first appeared in a known release — pins a floor, not the exact version).
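The two decisive pin types are simple pattern matches. A minimal sketch, assuming the input is a list of strings and demangled symbols already extracted from the binary; the two regexes cover only the oneDNN path form and the ICU namespace form cited in the table below, and `pin_versions` is a hypothetical helper.

```python
import re

# Versioned path literal, e.g. third_party/intel_dnnl/v3_3/...
PATH_PIN = re.compile(r"third_party/intel_dnnl/v(\d+)_(\d+)/")
# Version-tagged namespace, e.g. icu_78::...
NAMESPACE_PIN = re.compile(r"\bicu_(\d+)::")

def pin_versions(strings_and_symbols):
    """Collect exact versions from path literals and versioned namespaces."""
    pins = {}
    for text in strings_and_symbols:
        m = PATH_PIN.search(text)
        if m:
            pins["oneDNN"] = f"{m.group(1)}.{m.group(2)}"
        m = NAMESPACE_PIN.search(text)
        if m:
            pins["ICU"] = m.group(1)
    return pins
```

Feature-floor pins do not fit this shape: they need a per-library table mapping a symbol to the release that introduced it, which is why they yield a lower bound rather than an exact version.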
| Library | Version | Version evidence |
|---|---|---|
| LLVM | 23-dev (trunk) | LLVM version g3_____-trunk 8918319853fbdf… @ 0xb1fa070 |
| MLIR | = LLVM (same commit) | same monorepo commit; no separate version |
| libc++ | = LLVM commit | std::__u:: ABI tag (Google libc++ inline-namespace build) |
| Intel oneDNN | v3.3 | path literal third_party/intel_dnnl/v3_3/ |
| ICU | icu_78 | versioned namespace icu_78:: |
| Protobuf | v32+ (EDITION_2024 production) | edition enums EDITION_2023, EDITION_2024, EDITION_2026, EDITION_UNSTABLE (no EDITION_2025) |
| Abseil | >= LTS 20240722 (floor) | absl::AnyInvocable, absl::log_internal::SetTimeZone, absl::CordBuilder, absl::StatusOr |
| Eigen | 3.4.x branch | Eigen::bfloat16 (modern), Eigen::half; no EIGEN_*_VERSION literal survives |
| gRPC | google3-tip (>= 1.66-dev) | chaotic_good transport, filter_fusion (trunk-only) |
| tcmalloc | 2024+ (rseq per-CPU) | google_malloc ELF section, __rseq_cs, rseq cmpxchg family |
NOTE: the path-literal and versioned-namespace pins (oneDNN v3_3, ICU icu_78) are the only CERTAIN exact versions; they survive as literal strings the build embeds. Everything else is a floor: the binary exhibits features added in a known release but no *_VERSION_MAJOR macro literal survives, so the exact point release cannot be read from this binary alone. The Atlas page documents the per-library reproduction recipe for tightening each floor.
NOTE: the std::__u:: inline-namespace ABI tag on every libc++ symbol is a Google build customization (the same monorepo as LLVM/MLIR). Because the llvm::/mlir:: template instantiations are templated on std::__u:: types throughout, the libc++ build is inseparable from the LLVM/MLIR version pin: they are one toolchain.
Google-Specific Customizations
No behavioural patch to LLVM/MLIR core algorithms is visible in the symbol surface. The Google customizations over upstream are all standard vendoring patterns, not core forks:
- Version sentinel: 9999.0.0 / g3_____-trunk are build-identity overrides of the upstream version macros (masks LLVM_VERSION_MAJOR; not a behavioural change).
- Out-of-tree TPU target: the entire llvm::TPU* backend (5 silicon subtargets, 6,166 MC opcodes per InitMCInstrInfo, the InstBits encoder) is Google-private; upstream has no TPU target.
- AOT MLGO models: RegAllocEvictModel + InlinerSizeModel baked in through the upstream release-model mechanism (build-time, not a source patch).
- libc++ __u ABI tag: a Google libc++ build customization.
- TPU MLIR dialects: tpu/llo/sparse_core/mosaic_sc/xtile authored against the in-tree MLIR API; out-of-tree dialects layered on stock MLIR, not core modifications.
Byte Footprint
LLVM and MLIR together are the largest single category of code in the binary. The figures below are symbol-bucketed (summed nm -S sizes by namespace); the underlying ELF section sizes that bound them are confirmed directly: .text = 0x12bdb484 (314,422,404 B), .lrodata = 0x6c0e7d0 (113,305,552 B), .rodata = 0x39eaf28 (60,731,176 B).
| Bucket | Combined bytes | % of code+rodata symbol bytes |
|---|---|---|
| LLVM (llvm:: + TableGen tables) | ~84.1 MB | ~19.8% |
| MLIR (mlir:: + dialect templates) | ~72.2 MB | ~17.0% |
| MLGO models (RegAllocEvict + InlinerSize) | ~0.59 MB | ~0.14% |
| kEigenUnaryLlIr bitcode blob | 16,384 B | ~0.004% |
| LLVM + MLIR + MLGO + bitcode | ~156.9 MB | ~36.9% |
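The totals row and its relation to the section sizes can be checked with a few lines of arithmetic. The sketch assumes decimal megabytes (10^6 bytes), which matches the 113.3 MB figure quoted for .lrodata; the implied denominator is derived here and is not a number stated on this page.

```python
# Bucket sizes (MB) and shares (%) from the table above.
BUCKETS_MB = {"LLVM": 84.1, "MLIR": 72.2, "MLGO": 0.59, "bitcode": 16_384 / 1e6}
BUCKETS_PCT = {"LLVM": 19.8, "MLIR": 17.0, "MLGO": 0.14, "bitcode": 0.004}

total_mb = sum(BUCKETS_MB.values())
total_pct = sum(BUCKETS_PCT.values())

# Symbol-byte total implied by "total_mb is total_pct of code+rodata".
implied_denominator_mb = total_mb / (total_pct / 100)

# ELF section sizes quoted in the text; these bound the symbol-bucketed total.
SECTION_BYTES = {".text": 0x12BDB484, ".lrodata": 0x6C0E7D0, ".rodata": 0x39EAF28}
section_total_mb = sum(SECTION_BYTES.values()) / 1e6

print(f"{total_mb:.1f} MB, {total_pct:.1f}%")
print(f"implied {implied_denominator_mb:.0f} MB <= sections {section_total_mb:.0f} MB")
```

The implied denominator landing below the summed section sizes is what one would expect if some section bytes are not covered by any sized symbol.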
NOTE: MLIR's .rodata share is tiny relative to its .text because MLIR is template-heavy code with few large constant tables; LLVM's .rodata is large because the TPU TableGen tables (InstBits 181,344 B, TPUDescs, TPUInstrNameData) and the in-tree-target InstBits tables live in rodata/lrodata. These are symbol-bucketed sizes: they exclude anonymous-namespace TU-local residue, an estimated 70-80% of which belongs to LLVM/MLIR/XLA, so the true footprint is modestly higher than 156.9 MB.
Cross-References
- Embedded-Library Atlas: owns the exhaustive ~60-library static-link tree and per-library byte accounting; this page is the LLVM/MLIR-centric slice.
- Binary Forensics Overview: the top of the forensics tier; build identity and the analysis methodology.
- ELF Anatomy: section sizes, the large-code-model .lrodata/.lbss, and the build-id note backing every offset here.
- Custom Sections: google_malloc, protodesc_cold, __rseq_cs and the other compiler-emitted sections that bound the library buckets.
- The libtpu.so / sdk.so Two-Binary Split: which of the two shipped objects carries the LLVM/MLIR toolchain.
- The TPU Compiler: the compilation pipeline that drives the LLVM/MLIR components catalogued here.
- LlvmTpu Intrinsic Catalog: the intrinsics the LLVM TPU backend lowers.
- "The tpu MLIR Dialect: Ops and the Op-Model Contract": the Group-C tpu dialect in depth.
- LowerToSparseCoreLlvm: the translateModuleToLLVMIR AOT path that hands MLIR to the LLVM TPU backend.