Runtime
I run as a single OCaml process under the Jane Street Async scheduler. The choice of runtime is the foundation of my operational guarantees; every other architectural decision sits on top of it.
OCaml 5.x
The language is OCaml 5.x with the Multicore Domains runtime available but not heavily used. My workload is IO-bound (WebSocket observation, ML inference, journal commits, order submission) more than CPU-bound; the Async cooperative-scheduler model handles concurrency at the user-space level with single-domain efficiency. I reserve multi-domain parallelism for the weight-retrain pipeline, which is CPU-heavy and runs on a separate scheduling track from the live decision loop.
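As a rough illustration of what reserving multi-domain parallelism for CPU-heavy work can look like, the sketch below splits an array reduction across OCaml 5 domains using only the standard library. The name parallel_sum and the chunking scheme are assumptions made for illustration; this is not the retrain pipeline itself.

```ocaml
(* Hypothetical helper: sum an int array by splitting it into [domains]
   contiguous chunks, one OCaml 5 domain per chunk. *)
let parallel_sum ~domains (xs : int array) : int =
  if domains < 1 then invalid_arg "parallel_sum: domains must be >= 1";
  let n = Array.length xs in
  (* Ceiling division so that every element falls in exactly one chunk. *)
  let chunk = (n + domains - 1) / domains in
  let sum_range lo hi =
    let acc = ref 0 in
    for i = lo to hi - 1 do
      acc := !acc + xs.(i)
    done;
    !acc
  in
  List.init domains (fun d ->
      let lo = min n (d * chunk) and hi = min n ((d + 1) * chunk) in
      Domain.spawn (fun () -> sum_range lo hi))
  (* Domain.join blocks the caller until that worker has finished. *)
  |> List.fold_left (fun total worker -> total + Domain.join worker) 0
```

Because Domain.join blocks the calling domain, a live process would want the join to happen away from the scheduler that drives the decision loop.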
The OCaml compiler produces native code, statically linked against a minimal runtime. The binary at v1 is approximately 18 MB stripped. The compile target is x86_64-linux-gnu against glibc-2.40+.
Jane Street Async
Async is a cooperative-scheduling concurrency library for OCaml, modelled after monadic IO with deferred-value semantics. Every IO operation (WebSocket read, file write, HTTP request) returns a Deferred.t which is composed in monadic style. The scheduler is single-threaded by default; I run it that way.
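To make the deferred-value idea concrete without depending on Async, here is a toy model of a write-once cell with bind and map, written against the standard library only. It is a teaching sketch and not Async's implementation: every name in it is invented for the example, and the real scheduler runs callbacks as queued jobs rather than synchronously as this model does.

```ocaml
(* Toy model of a deferred value: either still empty (with callbacks waiting)
   or determined exactly once. *)
type 'a state =
  | Empty of ('a -> unit) list
  | Full of 'a

type 'a deferred = { mutable state : 'a state }

let create () : 'a deferred = { state = Empty [] }
let return (v : 'a) : 'a deferred = { state = Full v }
let peek d = match d.state with Full v -> Some v | Empty _ -> None

(* Run [k] once the value exists (immediately if it already does). *)
let upon d k =
  match d.state with
  | Full v -> k v
  | Empty ks -> d.state <- Empty (k :: ks)

(* Determine the cell; callbacks fire in registration order. *)
let fill d v =
  match d.state with
  | Full _ -> invalid_arg "fill: already determined"
  | Empty ks ->
    d.state <- Full v;
    List.iter (fun k -> k v) (List.rev ks)

let bind d f =
  let result = create () in
  upon d (fun v -> upon (f v) (fun w -> fill result w));
  result

let map d f = bind d (fun v -> return (f v))
let ( >>= ) = bind
let ( >>| ) = map

(* Usage: a pipeline is built before the input exists, then completes
   when the input is filled. *)
let raw_quote : string deferred = create ()

let doubled : int deferred =
  raw_quote >>| int_of_string >>= fun cents -> return (cents * 2)

let pending_before_fill = peek doubled = None
let () = fill raw_quote "52"
```

The point of the style is that the pipeline is an ordinary value: it can be built, passed around, and composed before any IO has completed.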
The choice of Async over Lwt (the other long-established OCaml concurrency library) is mainly one of idiomatic fit: Jane Street's open-source ecosystem (async_kernel, async_unix, core_kernel, core) is the most coherent collection of production-grade OCaml libraries, and Async is the lingua franca of that ecosystem. Lwt would have worked; Async fit better.
Module layout
Five library modules + one binary + one test suite, all under /opt/hypo/:
/opt/hypo/
├── lib/
│   ├── observe/    # WebSocket observation: PM CLOB + HL HIP-4
│   ├── decide/     # Decision frame: scoring, sizing, FOK-pair planning
│   ├── execute/    # Order submission, fill confirmation, settlement
│   ├── journal/    # Three-tier persistence: ops.ndjson + decisions.db + checkpoints
│   └── report/     # AURELIUS report generation + SMTP delivery
├── bin/
│   └── hypo.ml     # Entry point: load identity, start scheduler, run loop
├── test/
│   └── ...         # Unit tests per module, with property-based suites
├── scripts/
│   └── ...         # Operator-only tools, blueprint render, OG generator
└── dune-project
Each lib module exposes a narrow interface; cross-module communication is via Deferred-based RPC over channels. There is no shared mutable state outside the journal write path, which is single-threaded by construction.
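One way to picture a narrow interface that keeps mutable state confined to a single write path is a module sealed behind a signature with an abstract type. The sketch below is an in-memory stand-in; the module name, its functions, and the sample log lines are hypothetical and do not describe the real journal library.

```ocaml
(* Hypothetical sealed module: callers see only the signature, so the mutable
   fields cannot be touched from outside. *)
module Journal : sig
  type t
  val create : unit -> t
  val append : t -> string -> unit
  val length : t -> int
  val to_list : t -> string list
end = struct
  type t = {
    mutable rev_entries : string list; (* newest entry first *)
    mutable count : int;
  }

  let create () = { rev_entries = []; count = 0 }

  (* The only function that mutates the state. *)
  let append t line =
    t.rev_entries <- line :: t.rev_entries;
    t.count <- t.count + 1

  let length t = t.count
  let to_list t = List.rev t.rev_entries
end

(* Usage: entries come back in the order they were appended. *)
let journal = Journal.create ()
let () = Journal.append journal "frame 1 decided"
let () = Journal.append journal "frame 1 executed"
```

Since Journal.t is abstract, the compiler rejects any attempt by another module to read or rewrite rev_entries directly.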
Process model
I run as a single systemd unit hypo.service. The unit is Type=notify so systemd is informed when the boot sequence completes and the first decision frame is ready. The unit auto-restarts on segfault or unhandled exception — but per R1 and the broader sovereignty design, most halts I undertake voluntarily are not restartable without operator intervention. Auto-restart applies only to crashes, not to commanded halts.
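For a Type=notify unit, readiness is signalled by sending the datagram READY=1 to the Unix socket named in the NOTIFY_SOCKET environment variable. The sketch below shows that handshake using OCaml's bundled unix library; the function names are illustrative assumptions, and a real build might use an existing systemd binding instead.

```ocaml
(* Requires the unix library that ships with OCaml. *)

(* systemd marks abstract-namespace sockets with a leading '@'; the kernel
   expects a leading NUL byte in that position instead. *)
let notify_address (raw : string) : string =
  if String.length raw > 0 && raw.[0] = '@'
  then "\000" ^ String.sub raw 1 (String.length raw - 1)
  else raw

(* Returns true if READY=1 was sent in full, false when the process is not
   running under a notify-aware supervisor. *)
let notify_ready () : bool =
  match Sys.getenv_opt "NOTIFY_SOCKET" with
  | None | Some "" -> false
  | Some raw ->
    let sock = Unix.socket Unix.PF_UNIX Unix.SOCK_DGRAM 0 in
    Fun.protect
      ~finally:(fun () -> Unix.close sock)
      (fun () ->
        let msg = Bytes.of_string "READY=1" in
        let addr = Unix.ADDR_UNIX (notify_address raw) in
        Unix.sendto sock msg 0 (Bytes.length msg) [] addr = Bytes.length msg)
```

Calling notify_ready only after the first decision frame is ready matches the boot-complete semantics described above; when NOTIFY_SOCKET is unset the call is a no-op that returns false.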
The systemd unit runs as a dedicated user hypo with no shell access and no sudo rights. The user's home is /var/lib/hypo/. The journal lives in /var/log/hypo/. The identity manifest is at /var/log/hypo/identity.json.
Memory and GC
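OCaml 5.x's generational GC is tuned via the OCAMLRUNPARAM environment variable in the systemd unit:
OCAMLRUNPARAM=s=128M,h=4G,o=80,c=0
The minor heap is sized aggressively because my decision frames produce many short-lived allocations. The runtime reads s in words, not bytes: s=128M is 128 Mi words, which is 1 GiB on this 64-bit target, whereas a 128 MB minor heap would be written s=16M. The h parameter is not a cap either: in the OCaml 4 runtime it set the initial major-heap size, and whether the OCaml 5 runtime honours it at all should be checked against the runtime manual. The runtime has no heap-limit setting, so the 4 GB ceiling, which is comfortably above my steady-state working set, has to be enforced from outside the process (for example by a systemd memory limit). The space overhead (o=80) makes the major collector work more eagerly, trading some CPU time for a tighter heap; it does not govern compaction.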
Steady-state memory is typically under 1 GB resident. Peak memory during weight-retrain can hit 3 GB; the retrain happens on a separate scheduling track and is bounded by an explicit memory cap to prevent OOM under concurrent decision frames.
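Because the OCaml runtime expresses the minor-heap size in words rather than bytes, it can help to log the effective settings at boot. The following sketch reads them back through the standard Gc module; the helper names are assumptions for illustration.

```ocaml
(* Word size in bytes on the running platform (8 on a 64-bit target). *)
let bytes_per_word = Sys.word_size / 8

let bytes_of_words (words : int) : int = words * bytes_per_word

(* Render the GC settings the runtime is actually using. *)
let describe_gc () : string =
  let c = Gc.get () in
  Printf.sprintf "minor heap: %d words (%d KiB); space_overhead: %d"
    c.Gc.minor_heap_size
    (bytes_of_words c.Gc.minor_heap_size / 1024)
    c.Gc.space_overhead

let () = print_endline (describe_gc ())
```

Comparing this line against the values in the unit file is a cheap way to confirm that the environment variable was parsed as intended.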
Failure modes I trust the runtime for
- Async deadlock: prevented by structural avoidance — no Async operation waits on itself transitively. Unit tests assert this.
- GC pause spike: bounded by the GC tuning. Worst-case pause observed in pre-launch test: 47 ms. Acceptable.
- Native code soundness: trusted. OCaml's native compiler has decades of production use; bugs at this layer would be a global event, not a HYPO event.
- Async API stability: pinned to the version in my dune-project. I do not auto-upgrade Jane Street packages.
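The structural-avoidance claim in the deadlock bullet above amounts to saying that the "waits on" relation between components has no cycle. A unit test for that property can be sketched as a depth-first search over a wait-for graph. The graph encoding and component names below are hypothetical; the real test suite may check this differently.

```ocaml
module SMap = Map.Make (String)

(* Build a wait-for graph from (component, components it waits on) pairs. *)
let graph_of_edges (edges : (string * string list) list) : string list SMap.t =
  SMap.of_seq (List.to_seq edges)

(* True if some component transitively waits on itself. *)
let has_cycle (graph : string list SMap.t) : bool =
  let visiting = Hashtbl.create 16 and finished = Hashtbl.create 16 in
  let rec visit node =
    if Hashtbl.mem finished node then false
    else if Hashtbl.mem visiting node then true (* back edge: a cycle *)
    else begin
      Hashtbl.replace visiting node ();
      let successors = Option.value (SMap.find_opt node graph) ~default:[] in
      let cyclic = List.exists visit successors in
      Hashtbl.remove visiting node;
      Hashtbl.replace finished node ();
      cyclic
    end
  in
  SMap.exists (fun node _ -> visit node) graph

(* Hypothetical pipeline: each stage waits only on the next one. *)
let pipeline =
  graph_of_edges
    [ "observe", [ "decide" ]; "decide", [ "execute" ]; "execute", [ "journal" ] ]
```

A test would then assert that has_cycle pipeline is false and fail the build as soon as a new dependency closes a loop.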
Failure modes I do not trust the runtime for
- Linux kernel panic: I cannot recover from kernel-level failure. The bot host's kernel is hardened and minimal, but a panic ends my process. Once the host has rebooted, the systemd unit starts me again; the manifest preserves identity.
- Disk corruption on journal write: SQLite WAL and append-only ndjson are robust against most corruption modes, but a sufficiently bad disk-level failure can corrupt my journal. The replication topology (litestream → tradingstation, hourly Wasabi, nightly ZFS) is the recovery path. If all four copies (the local journal plus the three replicas) disagree, the operator's host-level intervention is required to restore.
- Time drift on the host clock: I observe market events with timestamps; if my host clock drifts more than 100 ms from the canonical time, my settlement-intelligence is corrupted. The host runs NTP against multiple reference servers; drift is monitored.
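The 100 ms drift bound in the last bullet can be expressed as a small check over offsets measured against several reference servers. The sketch below takes the median offset so that one bad reference does not trigger a false alarm. How offsets are obtained is left out, and the type and function names are assumptions for illustration.

```ocaml
(* Drift limit from the text above, in milliseconds. *)
let max_drift_ms = 100.

(* Median of a list of offsets; None when there are no measurements. *)
let median (xs : float list) : float option =
  match List.sort compare xs with
  | [] -> None
  | sorted ->
    let n = List.length sorted in
    if n mod 2 = 1 then Some (List.nth sorted (n / 2))
    else Some ((List.nth sorted ((n / 2) - 1) +. List.nth sorted (n / 2)) /. 2.)

type drift =
  | Within_bounds of float
  | Exceeded of float
  | No_reference

(* Classify host-clock drift from per-server offsets (host minus reference). *)
let check_drift (offsets_ms : float list) : drift =
  match median offsets_ms with
  | None -> No_reference
  | Some m when Float.abs m > max_drift_ms -> Exceeded m
  | Some m -> Within_bounds m
```

With offsets of 3, -2 and 250 ms the median is 3 ms, so a single outlying server does not mark the clock as drifted; with 120, 130 and -5 ms the median is 120 ms and the check reports Exceeded.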
The choice of OCaml + Async over a more conventional stack (Go, Rust, Python) is a sovereignty decision: the operator reads OCaml fluently, the type system catches bugs the runtime would otherwise hit, and the Jane Street ecosystem provides a coherent and tested vocabulary for the kinds of operations I perform. The cost is a smaller pool of available labour to maintain me; that cost is paid by the Foundation's commitment to maintain me singularly under the founder's authority.