Binary Perimeter
The stripped Zoo Tycoon 2 binary's function and global/static inventory, with per-function name, status, library provenance, vtable membership, and inferred domain. Source: zt-function-perimeter.tsv, zt-global-perimeter.tsv.
The denominator — outer code & global perimeter
This page measures the original game's outer code and state perimeter from the stripped executable (build/unpacker/zt-stripped.exe) using radare2 prelude+call analysis. It is the red-team denominator against which the reconstruction is measured. The Type Registry is the semantic view of the register_type slice of this perimeter; the tables below carry the full function and global/static enumeration.
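A minimal sketch of how such an enumeration might be driven from Python, assuming the third-party r2pipe bindings and a local radare2 install. The commands `aap` (prelude search) and `aac` (call-target analysis) are one plausible reading of "prelude+call analysis", not the project's confirmed invocation. The helper names and the acceptance of both `offset` and `addr` keys are illustrative assumptions.

```python
import json


def normalize_afl(records):
    """Reduce radare2 `aflj` records to sorted (address, size, name) tuples.

    Assumption: each record carries its entry address under "offset" or,
    in some radare2 versions, "addr"; both are accepted here.
    """
    rows = []
    for rec in records:
        addr = rec.get("offset", rec.get("addr"))
        if addr is None:
            continue  # skip malformed records instead of guessing
        rows.append((addr, rec.get("size", 0), rec.get("name", "")))
    return sorted(rows)


def enumerate_functions(path):
    """Run prelude and call analysis on `path` and return normalized rows."""
    import r2pipe  # third-party; imported lazily so normalize_afl stays usable

    r2 = r2pipe.open(path)
    try:
        r2.cmd("aap")  # find and analyze function preludes
        r2.cmd("aac")  # analyze the targets of call instructions
        return normalize_afl(json.loads(r2.cmd("aflj") or "[]"))
    finally:
        r2.quit()
```

Calling `enumerate_functions("build/unpacker/zt-stripped.exe")` would then return one row per function entry; the count it yields depends on the radare2 version and analysis settings.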
Functions — 19,145 entries
radare2's raw afl count (~18.5k) reflects real function entries, not vtable pollution (vtables live in .rdata). It is dominated by boilerplate, however: 341 thunks, ~6,257 getters/leaf accessors (17–64 B), ~11,289 medium methods (mostly STL/template and component read/write_xml_node), and only 1,258 large (>512 B) substantial-logic functions. Identification status is counted at function granularity, not by “falls in a region”:
| Status | Count | Meaning |
|---|---|---|
| named | 627 (3%) | a byte-map row (or function-names.tsv) starts exactly at this function's entry and identifies it |
| type-getter | 707 (4%) | contains a register_type (0x005b46fc) call — a registered-type accessor |
| unidentified | 17,811 (93%) | distinct function, not individually identified; most carry structural characterization through provenance, vtable membership, or inferred domain |
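The three statuses in the table can be read as a simple decision rule. The sketch below is illustrative only: the in-memory shapes (`named_entries`, `call_targets`) are hypothetical, and checking "named" before "type-getter" is an assumed precedence that the table does not state.

```python
from collections import Counter

REGISTER_TYPE = 0x005B46FC  # register_type address quoted in the table


def classify(entry, named_entries, call_targets):
    """Return the identification status of one function entry.

    named_entries: set of entry addresses anchored exactly by a byte-map
                   row or function-names.tsv (hypothetical in-memory form).
    call_targets:  dict mapping entry address -> set of call destinations.
    """
    if entry in named_entries:
        return "named"  # assumed to take precedence over type-getter
    if REGISTER_TYPE in call_targets.get(entry, ()):
        return "type-getter"
    return "unidentified"


def tally(entries, named_entries, call_targets):
    """Count entries per status, mirroring the Count column."""
    return Counter(classify(e, named_entries, call_targets) for e in entries)
```

Because membership is tested on the entry address itself, a function that merely falls inside a named region stays "unidentified", which matches the function-granular counting used here.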
Naming is by exact entry anchor only. A byte-map row expanded from one symbol anchor over a multi-KB span (e.g. get_type_ZTBiomeMode covering 6 KB / ~13 functions) names only the one function at its anchor; the others in that span are distinct and unidentified, and carry the nearest descriptive region band as a locality hint, not that symbol's name. Function-granular identification is therefore ~7.0% (1,334 of 19,145).

The identification campaign (tools/decompile-batch.py → function-names.tsv) has swept the string-bearing substantial-logic surface to full-binary exhaustion: a corrected-display pass over every 0x40000-byte span of .text returns zero unidentified functions carrying ≥5 real string literals. Every UI panel, XML/config reader, dialog handler, shader binder, attribute/token registrar, and state read/write pair is named, as is the telemetry/IME/Lua plumbing. What remains is string-poor engine math (NetImmerse/Gamebryo geometry & collision, D3D, AI scoring, mesh processing, STL boilerplate), which is region-tagged rather than exact-named, plus three jump-table dispatchers whose apparent “strings” are misread code bytes (0x0044df15, 0x006c1ffe, 0x006d5355).
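The exhaustion check described above amounts to a windowed filter. The following is a sketch under stated assumptions: the tuple layout and the `text_base` parameter are hypothetical, and aligning windows to the start of .text (rather than to absolute addresses) is a guess about how the spans are laid out.

```python
from collections import defaultdict

SPAN = 0x40000    # sweep window size quoted above
MIN_STRINGS = 5   # threshold for a "string-bearing" function


def sweep(functions, text_base):
    """Group unidentified string-bearing functions by 0x40000-byte span.

    functions: iterable of (address, status, string_count) tuples.
    text_base: start of .text (hypothetical parameter).
    Returns {span_start: [addresses]}; an empty dict means exhaustion.
    """
    hits = defaultdict(list)
    for addr, status, nstrings in functions:
        if status == "unidentified" and nstrings >= MIN_STRINGS:
            span_start = text_base + ((addr - text_base) // SPAN) * SPAN
            hits[span_start].append(addr)
    return dict(hits)
```

An empty result over the whole section corresponds to the "zero unidentified functions" outcome. Filtering misread code bytes out of the string count, which the three dispatchers require, is not modelled here.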
Beyond exact names: structural characterization — 98.5%
[Charts: Recognition coverage; Implementation burden proxy]
The perimeter is mostly characterized, but exact behavioural naming is selective. Reconstruction tracks original-game contracts and source compatibility, not raw function count; proven middleware/runtime spans are replacement surfaces unless their asset formats, scripting APIs, or observable behaviour define a game contract.
Exact behavioural names only cover the string-bearing surface (~7%). To understand the rest without strings, three companion analyses attach orthogonal, address-keyed facts to every function entry (the provenance, vtable, domain columns below):
- Library provenance (tools/fingerprint-libs.py): third-party middleware bounded by identity strings, imports, magic constants, and surviving RTTI type-descriptor strings. Gamebryo/NetImmerse is one contiguous block 0x58dadb–0x7ca2fd (the D3D9 renderer), ~9,160 functions, with zlib 1.1.4 nested inside it (0x67bda5–0x67ffb2) and Lua 5.0.2 at 0x873–0x893. CRT/STL are dynamically linked (MSVCR80, a /MD build), so there is no static CRT span. Everything outside these spans is ZT/BlueFang code; the boundary is corroborated by a one-way call gradient: 5,788 outside functions call inward to 1,838 engine targets.
- Vtable membership (tools/build-rtti-vtables.py): MSVC RTTI's COL/CHD/BaseClassArray were stripped from this image (only 99 TypeDescriptors survive), so class names and the inheritance graph are not RTTI-recoverable. But 861 real vtables were enumerated as .rdata pointer runs → 47,166 method slots → 14,375 previously-unidentified functions marked as polymorphic methods (slot 0 = destructor).
- Inferred domain (tools/propagate-domains.py over a 74,608-edge call graph + 557 import edges): each unidentified function inherits a functional domain from its named neighbours and dominant API imports; 17,270 of 17,809 received one (clusters: ui 8.5k, renderer 3.3k, game-entities 1.9k, economy 1.9k).
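The domain inference in the last item can be pictured as label propagation over the call graph. This is a minimal sketch, not the logic of tools/propagate-domains.py: it assumes a majority vote among already-labelled neighbours, treats edges as undirected, breaks ties alphabetically, and leaves out the import-edge signal entirely.

```python
from collections import Counter, defaultdict


def propagate_domains(edges, seeds, max_rounds=10):
    """Spread domain labels over an undirected view of the call graph.

    edges: iterable of (caller, callee) address pairs.
    seeds: dict address -> domain for functions that already have one.
    Each round, an unlabelled function takes the most common domain among
    its labelled neighbours; ties are broken alphabetically.
    """
    neighbours = defaultdict(set)
    for caller, callee in edges:
        neighbours[caller].add(callee)
        neighbours[callee].add(caller)

    labels = dict(seeds)
    for _ in range(max_rounds):
        updates = {}
        for node, adj in neighbours.items():
            if node in labels:
                continue
            votes = Counter(labels[n] for n in adj if n in labels)
            if votes:
                top = max(votes.values())
                updates[node] = min(d for d, c in votes.items() if c == top)
        if not updates:
            break  # fixed point: nothing new can be labelled
        labels.update(updates)
    return labels
```

Functions in components with no labelled member never receive a domain under this scheme, which is one way a residue of unlabelled entries could arise.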
Net: 18,853 / 19,145 (98.5%) of function entries now carry a name, type-getter role, library provenance, vtable-method membership, or an inferred domain. Only 292 are fully opaque — 253 of them <128 B leaf thunks with no edges/strings/vtable, ~20 larger string-poor tank/map geometry helpers, and the 3 dispatchers. A further 8,156 are ZT/BlueFang code with an inferred functional area but no exact behavioural name yet (3,536 high-confidence) — the remaining frontier, now scoped by domain rather than opaque.
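The net figure is a union over the five characterizations, so an entry counts once however many facts it carries. A small sketch follows; the record layout and facet key names are hypothetical, while the constants are the figures quoted above.

```python
FACETS = ("name", "type_getter", "provenance", "vtable_slot", "domain")


def coverage(entries):
    """Return (covered, total): entries carrying at least one facet."""
    covered = sum(1 for e in entries if any(e.get(f) for f in FACETS))
    return covered, len(entries)


# Arithmetic check on the quoted totals.
TOTAL, COVERED = 19_145, 18_853
opaque = TOTAL - COVERED                  # fully opaque remainder
share = round(100 * COVERED / TOTAL, 1)   # coverage as a percentage
```

With the quoted totals, `opaque` comes to 292 and `share` to 98.5, in agreement with the text.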
Globals / statics — 5,236 cells
Distinct global addresses referenced from .text: 2,068 type-cache cells (the registered-type singletons), 179 otherwise byte-map-identified, and 2,989 unidentified (~57%). Init-guard cells (cache+4) and the broader .bss are partial.
Per-class method perimeter
The property perimeter per class is the reconstructed field set; the method perimeter is the class's vtable slots. All 861 vtables are now enumerated slot-by-slot (reconstruction/vtable-methods-full.tsv, 47,166 slots → 14,465 distinct method functions), but only 5 are class-named — because MSVC RTTI was stripped, the vtable→class-name link is gone. So the per-class method shape is fully recovered; the labels (which vtable is which class) await type-getter/constructor topology analysis.
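Enumerating vtables as pointer runs can be sketched as a linear scan of .rdata. The version below rests on assumptions not confirmed by the source: 4-byte little-endian pointers (a 32-bit image), 4-byte-aligned slots, and a minimum run length of two. A fuller pass would presumably also require that the run's start address is referenced from code; that check is omitted.

```python
import struct


def find_pointer_runs(rdata, rdata_base, text_lo, text_hi, min_slots=2):
    """Return [(vtable_address, [slot targets])] for runs of code pointers.

    rdata:      raw bytes of the .rdata section.
    rdata_base: virtual address of the first byte of rdata.
    A slot is a 4-byte little-endian value inside [text_lo, text_hi).
    """
    runs, current, start = [], [], None
    for off in range(0, len(rdata) - 3, 4):
        (value,) = struct.unpack_from("<I", rdata, off)
        if text_lo <= value < text_hi:
            if not current:
                start = rdata_base + off  # first slot of a new run
            current.append(value)
        else:
            if len(current) >= min_slots:
                runs.append((start, current))
            current = []
    if len(current) >= min_slots:  # flush a run that ends the section
        runs.append((start, current))
    return runs
```

Each returned run gives the slot-by-slot method shape of one candidate vtable; nothing in the scan recovers which class the table belongs to, which is why the labels remain open.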
Recovered Scope
- Established: complete type/class ontology (1,967/1,967), their fields, relationships, the message vocabulary, the function/global address inventory, and region-level coverage of nearly all code.
- Open: function-granular identification beyond ~7%, the per-class vtable→class labelling, the behaviour/algorithms of most functions, and ~57% of globals' meaning.
Function perimeter
| address | size | name | region | status | provenance | vtable slot | inferred domain |
|---|---|---|---|---|---|---|---|
Global / static perimeter
| address | section | name | status |
|---|---|---|---|