Tooling · binary triage

Metis: a five-stage binary-triage toolchain

An open-source pipeline maintained by the practice for triaging compiled binaries against a library of vulnerability templates. SAT-hardness path pruning, Random-Matrix-Theory anomaly screening, SSA dataflow matching, symbolic taint analysis, and on-device validation — in that order, with each stage’s output sized to be the next stage’s input.

24 May 2026 · Toolset reference · github.com/jetnoir/metis
Status: research-grade · Languages: Python 98.3%, C 1.6% · Licence: MIT-style

What it is and what it is for

Metis is the binary-vulnerability triage pipeline that the practice uses to systematise the early stages of a finding. It takes a compiled binary — macOS, Linux, or Windows — and applies five staged techniques in order of increasing cost, throwing the cheap statistical methods at the problem first and reserving symbolic execution and on-device validation for the candidates that survive the earlier filters.

The intended user is a researcher who already has a hypothesis about where a defect might live, and who needs to know quickly whether that hypothesis is worth pursuing further on this particular binary. It is not a turnkey vulnerability scanner. It is research infrastructure that compresses the steps a careful researcher would otherwise do by hand.

The five stages

StageTechniqueWhat it produces
C1 Backbone-fraction path prioritisation by SAT constraint hardness Ranked candidate paths through the binary, hardest constraints first
C2 Spectral anomaly detection on the call graph using Random Matrix Theory Functions whose connectivity pattern deviates from the surrounding code
C3 SSA-level memory taint matched against a library of dataflow templates Candidate defect sites tagged by template (missing length check, signed/unsigned confusion, unchecked allocator return, etc.)
C6 Symbolic taint analysis through angr, with proof-of-concept synthesis via Z3 Concrete attacker-controlled inputs that reach the candidate sink
C7 On-device dynamic validation under subprocess, LLDB, or DTrace A yes/no on whether the source-level finding crashes the shipped binary

The staging matters. Each step narrows the working set so that the next step’s budget — especially the symbolic engine’s — is spent on candidates that are worth it. The reported figure in the repository is a 60% reduction in peak active states from the C1 pruning alone.

Why the design looks like this

Three things, briefly.

One. Naive symbolic execution does not scale on production binaries. Every commercial tool that claims otherwise either ships a heavily-engineered pruning layer or relies on a researcher to set up the entry conditions correctly. Metis makes the pruning layer explicit and configurable.

Two. The cheap stages contain real signal. RMT spectral analysis of a call graph reliably surfaces the small set of functions that are statistically unusual. Most known-bad functions look unusual; this is not a coincidence. C2 was added when an audit of historical disclosures showed that the affected functions were, in retrospect, all over the spectral outlier list of the binaries they lived in.

Three. The gap between source-level defect and live exploitability is real and is the practice’s ongoing methodology concern. C7’s validation step exists specifically to close that gap — to convert a solver-blessed proof-of-concept into a verdict against the shipped binary on the shipped operating system, which is the only verdict that ultimately matters.

Note on the broader methodology. Several of the practice’s 2026 OpenBSD disclosures demonstrate why C7 exists. Source-level UB was real and reproducible; the shipped binary did not crash, owing to compiler-emitted stack-protection mixing with the architecture’s calling convention. Methodology built on source review alone would have over-claimed exploitability in those cases.

Pipeline invocation

A minimal end-to-end use of the pipeline, from Python:

from metis.c2_rmt import C2RMTAnalysis
from metis.c6_taint import C6Analysis
import angr

proj = angr.Project(BINARY, auto_load_libs=False)
c2_result = C2RMTAnalysis.from_project(proj).run()
# Candidates from C2 are passed through C3 and into C6 for symbolic
# analysis; survivors are handed to C7 for on-device validation.

Operational manuals, worked case studies, and per-stage documentation are in the repository. The Python requirement is 3.11 or 3.13; 3.12 is currently unsupported because of an upstream dependency issue.

Honest scope

Metis is research-grade software, not industrial-grade triage infrastructure. The components are individually robust and have been used in anger across several of the disclosures listed in the practice’s Publications section. The pipeline is not packaged as a turnkey tool that a non-author can drop onto an unfamiliar binary and trust the output of without judgement.

The release is published openly because the underlying methodology — stage cheap techniques in front of expensive ones, validate symbolic claims on real binaries — is worth being part of the public conversation about how small labs can credibly contribute to binary security research. The exact code is secondary to the layout.

Engagement

Patches and issues are welcome through the GitHub project. Methodology discussion is welcome by written email to the practice. Commercial support is not offered.

Repository & documentation

Source: github.com/jetnoir/metis

The repository ships operational manuals, per-stage references, worked case studies, and third-party notices. Licence is MIT-style; the exact text is in the repository.

Dependencies and credit

Metis sits on top of angr and Z3; the spectral and dataflow stages use NumPy and SciPy; the validation stage uses LLDB and DTrace where available. All of these are the work of their respective authors, used under their own licences. Their inclusion as dependencies should not be read as endorsement of Metis by their maintainers.

Legal note

Metis is published under an MIT-style licence. No warranty is given as to fitness for any particular purpose. Users are responsible for ensuring that their use of the tool complies with applicable law, including the Computer Misuse Act 1990 (England and Wales) and equivalent legislation in their own jurisdiction. The tool is intended for use against systems the user owns or has explicit written authorisation to test.