Reference
Server and agent flags, HTTP API endpoints, supported perf events, and the usual things to check when something looks off.
Server CLI
| Flag | Default | Description |
|---|---|---|
| --port PORT | 9999 | TCP port the agent connects to. |
| --http-port PORT | 8080 | HTTP port for the web UI. |
| --source-dir DIR | . | Root of the source tree for line annotation. |
| --binary PATH | — | Unstripped binary (enables addr2line). |
| --map PATH | — | GNU ld linker map file (optional symbol fallback). |
| --path-map FROM=TO | — | Rewrite compile-time paths (e.g. /build/src=/home/user/src). |
| --addr2line PATH | — | Custom addr2line binary (overrides bundled and PATH). |
| --readelf PATH | — | Custom readelf binary. |
| --toolchain-prefix PREFIX | — | Cross-compile prefix (e.g. arm-linux-gnueabihf-). Derives addr2line + readelf. |
| --sysroot DIR | — | Sysroot for resolving shared-library modules and source files. |
| --max-samples N | 500000 | Ring buffer cap before oldest samples drop. |
| --inline / --no-inline | on | Enable/disable inline-function resolution via addr2line -i. |
| --import FILE | — | Import a perf.data file at startup as a session. |
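The --max-samples cap behaves like a bounded buffer that discards the oldest entries first. The following is a minimal sketch of that behavior using Python's collections.deque. The SampleBuffer class and its dropped counter are hypothetical names for illustration, not the server's actual implementation.

```python
from collections import deque


class SampleBuffer:
    """Hypothetical bounded buffer: keeps at most max_samples entries."""

    def __init__(self, max_samples=500000):
        # A deque with maxlen evicts from the left (oldest) once it is full.
        self._samples = deque(maxlen=max_samples)
        self.dropped = 0

    def add(self, sample):
        if len(self._samples) == self._samples.maxlen:
            self.dropped += 1  # the oldest sample is about to be evicted
        self._samples.append(sample)

    def snapshot(self):
        return list(self._samples)


buf = SampleBuffer(max_samples=3)
for i in range(5):
    buf.add(i)
print(buf.snapshot(), buf.dropped)  # [2, 3, 4] 2
```

Tracking the number of evicted samples is one way to make it visible when a long session has outgrown the cap.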
Agent CLI
The agent runs in one of three modes — pick one.
| Mode | Description |
|---|---|
| --listen | Daemon: bind --port, wait for the server to connect in via the UI wizard. |
| --server HOST | Daemon: dial out to the server. Reconnects with exponential backoff. |
| --output FILE | Headless: collect once, write to file (- for stdout). Requires --pid. |
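As an illustration of the exponential backoff mentioned for --server mode, here is a minimal sketch of a capped delay schedule. The base, factor, and cap values are assumptions chosen for the example; the agent's real timings are not documented here.

```python
import itertools


def backoff_delays(base=1.0, factor=2.0, cap=30.0):
    """Yield reconnect delays: base, base*factor, ... never exceeding cap seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# Assumed values: start at 1 s, double each attempt, stop growing at 30 s.
first_six = list(itertools.islice(backoff_delays(), 6))
print(first_six)  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

A reconnect loop would sleep for each yielded delay between attempts and start a fresh generator after a successful connection.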
Common options:
| Flag | Default | Description |
|---|---|---|
| --pid PID | — | Process to profile (required for --output; set via UI wizard in daemon modes). |
| --port PORT | 9999 | TCP port (listen or connect). |
| --frequency HZ | 99 | perf record -F sampling frequency. |
| --duration SECS | 8 | Length of each collection round. |
| --rounds N | 1 | Number of rounds (--output mode, Python agent only). |
HTTP API
| Endpoint | Method | Description |
|---|---|---|
| /api/status | GET | Server + agent connection state, sample totals. |
| /api/stream | GET | Server-Sent Events: status, event_types, per_event, perf_stat. |
| /api/sessions | GET | List saved sessions (metadata only). |
| /api/sessions/<id> | GET | Lazy-replay a session (parses raw chunks on demand). |
| /api/source?file=&event=&tid= | GET | Annotated source for a single file (optionally per-thread). |
| /api/thread-view?event=&tid= | GET | Per-thread flame graph + function summary. |
| /api/thread-summary?event= | GET | Thread overview with sample counts and top functions. |
| /api/export/flamegraph?event=&session= | GET | Download SVG flame graph. |
| /api/export/session/<id>?format= | GET | Export session as collapsed-stack text or JSON. |
| /api/metrics/current | GET | Latest device-health snapshot. |
| /api/metrics/history?type=&start=&end= | GET | Time-windowed health history. |
| /api/agent/command | POST | Relay a command to the managed agent (start, stop, pause, resume, configure, list_processes, reprobe). |
| /api/config/toolchain | POST | Set --toolchain-prefix / --sysroot at runtime. |
| /api/stop | GET | Disconnect the active agent (triggers a normal session save). |
| /* | GET | Static files from ui/. |
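The GET endpoints above can be scripted with the Python standard library alone. This sketch builds query URLs and fetches a response. It assumes the server runs on localhost with the default --http-port of 8080 and that the endpoint being queried returns JSON. The api_url and get_json helpers are hypothetical names, not part of the project.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def api_url(endpoint, base="http://localhost:8080", **params):
    """Build a URL for an endpoint, skipping parameters left as None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return base + endpoint + ("?" + query if query else "")


def get_json(endpoint, **params):
    """GET an endpoint and decode the body (assumes a JSON response)."""
    with urlopen(api_url(endpoint, **params), timeout=5) as resp:
        return json.load(resp)


print(api_url("/api/status"))
print(api_url("/api/thread-view", event="cycles", tid=1234))
```

With a server running, get_json("/api/status") would return the decoded status document. Its field names are not specified in this reference, so inspect the response before relying on any of them.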
Supported perf events
The agent probes each event before use and only emits the ones the kernel
actually supports. On hybrid CPUs, events are reported per-cluster —
e.g. cpu_core/cycles/ and cpu_atom/cycles/.
| Event | Typical use | Mode |
|---|---|---|
| cycles | CPU time / hot paths | record + stat |
| instructions | IPC, retired instruction count | record + stat |
| cache-misses | Last-level cache misses | record + stat |
| cache-references | LLC accesses | record + stat |
| branch-misses | Branch prediction misses | record + stat |
| branch-instructions | Total branches | record + stat |
| page-faults | Minor/major page faults | stat only |
| context-switches | Scheduling pressure | stat only |
| cpu-migrations | Inter-CPU movement | stat only |
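When consuming event names from a hybrid CPU, it can help to separate the cluster prefix from the base event so that cpu_core/cycles/ and cpu_atom/cycles/ are grouped together. This is a small sketch of that idea. The split_event and group_by_event helpers are hypothetical, and the sketch assumes names follow the pmu/event/ form shown above.

```python
def split_event(name):
    """Split 'cpu_core/cycles/' into ('cpu_core', 'cycles'); plain names get pmu=None."""
    if "/" in name:
        pmu, _, rest = name.partition("/")
        return pmu, rest.rstrip("/")
    return None, name


def group_by_event(names):
    """Group per-cluster event names under their base event."""
    groups = {}
    for name in names:
        pmu, event = split_event(name)
        groups.setdefault(event, []).append(pmu)
    return groups


print(group_by_event(["cpu_core/cycles/", "cpu_atom/cycles/", "page-faults"]))
# {'cycles': ['cpu_core', 'cpu_atom'], 'page-faults': [None]}
```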
Building release packages
./build_package.sh # frozen server + agent (PyInstaller)
./build_package.sh --server # server only
./build_package.sh --agent # agent only
./build_package.sh --no-freeze # skip PyInstaller, ship raw Python
Outputs land in dist/:
dist/
├── perflens-server-<ver>.tar.gz
└── perflens-agent-<ver>.tar.gz
For fully self-contained tarballs, drop cross-compiled binaries into server/bin/ and agent/bin/<arch>/ before building. The agent launcher auto-prepends the correct arch directory to $PATH based on uname -m; if a bundled binary is missing, the agent and server fall back to system tools.
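The arch-directory lookup can be pictured as follows. This is an illustrative Python sketch rather than the launcher's actual code, and path_with_arch_dir is a hypothetical helper. It relies on platform.machine(), which on Linux reports the same value as uname -m.

```python
import os
import platform


def path_with_arch_dir(bin_root, path_env, machine=None):
    """Return path_env with bin_root/<machine> prepended."""
    machine = machine or platform.machine()  # e.g. 'x86_64' or 'aarch64'
    arch_dir = os.path.join(bin_root, machine)
    return arch_dir + os.pathsep + path_env if path_env else arch_dir


# Bundled tools are found first; system tools remain reachable as a fallback
# because the original PATH entries are kept after the arch directory.
print(path_with_arch_dir("agent/bin", "/usr/bin:/bin", machine="aarch64"))
```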
CI
.github/workflows/build.yml builds the server on three runners
(Linux, macOS arm64, Windows), the Python agent once on Linux, and the C
agent for five architectures (x86_64, aarch64, aarch64_be, armv7l, armeb).
Big-endian targets use musl toolchains from musl.cc since
Ubuntu only ships little-endian sysroots. Tagged pushes (v*)
create a GitHub Release and attach every tarball.
Troubleshooting
perf_event_paranoid too high
The agent warns at startup if /proc/sys/kernel/perf_event_paranoid is greater than 1; in that case the UI may show a limited event set. To lower it:
sudo sysctl -w kernel.perf_event_paranoid=1
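To check the current level before starting the agent, a script can read the same procfs file. This is a minimal sketch. The read_paranoid and is_restricted helpers are hypothetical, and the threshold of 1 mirrors the warning condition described above.

```python
def read_paranoid(path="/proc/sys/kernel/perf_event_paranoid"):
    """Return the current perf_event_paranoid level, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_restricted(level, threshold=1):
    """True when the level is above the threshold the agent warns about."""
    return level is not None and level > threshold


level = read_paranoid()
if is_restricted(level):
    print("perf_event_paranoid is %d; expect a limited event set" % level)
```

Note that a value set with sysctl -w does not persist across reboots.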
No function names
Compile with -g and don't strip. file ./myprogram should say not stripped and with debug_info.
No source line mapping
Double-check --binary points at the exact unstripped binary running on the target and --source-dir contains the source files. If the build root differs from your checkout, rewrite paths with --path-map /build/src=/home/me/src.
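The effect of a --path-map FROM=TO rule can be sketched as a prefix rewrite. The parse_path_map and remap helpers below are hypothetical and only illustrate the idea; the server's actual remapping logic may differ.

```python
def parse_path_map(spec):
    """Split a 'FROM=TO' string into a (from_prefix, to_prefix) pair."""
    src, sep, dst = spec.partition("=")
    if not sep or not src:
        raise ValueError("expected FROM=TO, got %r" % spec)
    return src.rstrip("/"), dst.rstrip("/")


def remap(path, mapping):
    """Rewrite path if it sits at or under the FROM prefix; otherwise return it unchanged."""
    src, dst = mapping
    if path == src or path.startswith(src + "/"):
        return dst + path[len(src):]
    return path


mapping = parse_path_map("/build/src=/home/me/src")
print(remap("/build/src/net/socket.c", mapping))  # /home/me/src/net/socket.c
print(remap("/build/src2/other.c", mapping))      # /build/src2/other.c (no match)
```

Matching on a whole path component, rather than a raw string prefix, keeps a rule for /build/src from accidentally rewriting /build/src2.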
Agent can't connect
The server must be reachable on --port. Sanity-check with nc -zv <server-ip> 9999.
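On targets where nc is not installed, the same reachability check can be done from Python. This is a minimal sketch. The can_connect helper is hypothetical, and the 3-second timeout is an arbitrary choice.

```python
import socket


def can_connect(host, port=9999, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name lookup failed
        return False
```

Calling can_connect("<server-ip>") from the device gives a quick yes/no answer. A False result points at a firewall, a wrong address, or a server that is not listening on --port.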
LXC / container: perf record -p <pid> returns empty
Some container environments strip the perf capability set. A system-wide perf record -a usually works; the agent's per-PID mode does not in that case.
Call-graph probing hangs / slow startup
Capability probing tests fp, dwarf, then lbr in sequence, which adds roughly 6–12 seconds on first connection. This is a one-time cost; subsequent rounds skip probing.
Project layout
perflens/
├── agent/
│ └── perflens_agent.py # Python 3.5+ device agent
├── agent-c/
│ ├── perflens_agent.c # C agent, static binary, zero deps
│ ├── Makefile # native + cross-compile targets
│ └── vendor/zstd/ # vendored zstd amalgamation
├── server/
│ ├── perflens_server.py # TCP listener + ThreadingHTTPServer
│ ├── parser.py # perf script / perf stat parser
│ ├── source_mapper.py # addr2line pipeline + path remap
│ └── bin/ # bundled zstd / addr2line / readelf
├── ui/
│ ├── index.html # single-page app
│ ├── app.js # all UI logic (vanilla JS)
│ └── style.css # dark + light themes
├── test/
│ ├── sample_workload.c
│ └── Makefile
├── build_package.sh
└── .github/workflows/build.yml