stax
A live profiler for macOS and Linux — CPU stacks, off-CPU waits, cooperating target spans, and annotated disassembly while your program runs.
# record a program (or attach to a running one with --pid)
stax record -- ./target/release/mybench
# from another shell — or to an AI agent — query the live run
stax wait --for-samples 10000 # block until data lands
stax top -n 10 --sort self # hottest leaf functions or target spans
stax flame -d 6 # active flamegraph, as a tree
stax annotate 'mycrate::hot_fn' # per-instruction sample counts

stax records on-CPU stacks, off-CPU waits, and cooperating target spans, then turns them into flamegraphs, top-N functions/spans, per-thread and per-lane breakdowns, and annotated disassembly, all queryable while the recording is still running.
Every view is a plain CLI subcommand: text output, meaningful exit codes, no GUI required. That puts stax exactly where a graphical profiler can't go — over an SSH session to a remote machine, inside a CI job, or driven end-to-end by an AI agent. There is a browser UI when you want one, but nothing depends on it.
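Because every view is a subcommand with an exit code, a script or agent can drive a run without a GUI. The sketch below strings together only the commands shown above. The function name profile_snapshot and the STAX override variable are hypothetical, and treating a nonzero exit from stax wait as "no usable data" is an assumption; see the exit-code reference for the actual meanings.

```shell
#!/bin/sh
# Sketch only: profile_snapshot and STAX are illustrative names, not part of stax.
# STAX can be pointed at a different binary (or a stub) for dry runs.
STAX="${STAX:-stax}"

# profile_snapshot TARGET [SAMPLES]
# Records TARGET in the background, waits for SAMPLES samples, prints the hot list.
profile_snapshot() {
    target="$1"            # program to profile
    samples="${2:-10000}"  # sample count to wait for before querying

    # Start the recording in the background so this shell can query it.
    "$STAX" record -- "$target" &

    # Block until data lands. ASSUMPTION: nonzero exit means the wait did not succeed.
    if ! "$STAX" wait --for-samples "$samples"; then
        echo "stax wait failed; skipping report" >&2
        # The recording is left running; stopping it is up to the caller.
        return 1
    fi

    # Query the live run; the function's exit status is that of stax top.
    "$STAX" top -n 10 --sort self
}
```

A CI step or an agent could call profile_snapshot ./target/release/mybench and branch on the return status, with no terminal interaction required.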
Choose your path
Guide
Learn stax step by step
Install the daemons, record your first run, read flamegraphs, profile JIT'd code, and troubleshoot when something goes wrong.
Concepts
Understand how it works
The three-process architecture, what each platform can capture, how stacks get unwound, and what sampling actually measures.
Reference
Look it up fast
Every subcommand and flag, the RPC services for programmatic clients, environment variables, and exit codes.
Why stax
- Live first, saveable when needed. The aggregator updates continuously; stax top, stax flame, and the web UI all read the current state of a run that is still going. Use stax select-run to restore stopped in-memory history, per-command --run to query it without changing state, and stax save, stax open, stax compare, and stax compare --json when you need durable artifacts, before/after notes, or CI-readable deltas. Saved archives can be a directory or one .stax package; v2 archives replay typed event records and rehydrate code-byte blobs when present. The compare threshold flags, such as --fail-target-delta-ms and --fail-unlinked-origins-delta, turn saved runs into direct regression gates.
- Built for agents as much as humans. Every query is a subcommand with plain-text output and meaningful exit codes. stax wait --for-samples N lets a script block until there's enough data to look at.
- On-CPU and off-CPU. stax doesn't just show where the CPU time goes; it correlates scheduler events to show why a thread was blocked: lock, sleep, I/O, IPC.
- Cooperating target spans. GPU, accelerator, and executor work reported through stax-target lands on the same timeline as synthetic lanes, with explicit target time/span counts in threads, top, flame, and the web UI.
- Down to the instruction. stax annotate disassembles a hot function and attributes samples to individual instructions, interleaved with source.
- Symbolicates stripped binaries. On Linux, stax pulls symbols from local debug packages and debuginfod; on macOS, from the dyld shared cache, so system-library frames get real names.
- JIT-aware. A JIT that emits a perf jitdump file gets its compiled functions symbolicated and disassembled like any other code.
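The regression-gate idea from the first bullet might look like the following in a CI script. This is a sketch under stated assumptions: the text above names stax compare and --fail-target-delta-ms, but the two positional archive paths, the numeric millisecond argument, and the default of 5 are guesses made for illustration, and gate_regression and STAX are hypothetical names. Check the CLI reference for the real invocation.

```shell
#!/bin/sh
# Sketch only: gate_regression and STAX are illustrative names, not part of stax.
STAX="${STAX:-stax}"

# gate_regression BASELINE CANDIDATE [MAX_DELTA_MS]
gate_regression() {
    baseline="$1"           # saved run to compare against
    candidate="$2"          # saved run from the change under test
    max_delta_ms="${3:-5}"  # hypothetical default threshold

    # ASSUMPTION: compare takes two archive paths and the flag takes a value in ms.
    # ASSUMPTION: compare exits nonzero when the threshold is exceeded.
    if "$STAX" compare "$baseline" "$candidate" \
        --fail-target-delta-ms "$max_delta_ms"; then
        echo "profile gate: pass"
    else
        echo "profile gate: FAIL (over ${max_delta_ms} ms, or compare error)" >&2
        return 1
    fi
}
```

Returning nonzero lets the CI job fail on its own terms; the script does not try to distinguish a threshold breach from any other compare failure.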
Quick links
- Getting Started — install the daemons and verify
- Recording a Run — launch a target or attach to a PID
- Architecture — stax, stax-server, staxd
- Stack Unwinding — frame pointers, DWARF, and what your build needs
- CLI Reference — every subcommand and flag
- GitHub — source and issues