Building a Personal Intelligence Hub on a $600 Mac Mini
I have a Mac Mini M4 sitting in a closet in Panama City. It runs 9 services on different ports, handles voice transcription, semantic search, link intelligence, and a full operations agent — all locally. The total hardware cost was $599.
This is not a homelab flex. It's a practical answer to a question I kept running into: where does my data live, and who can search it?

The Problem
I record voice notes throughout the day. Some are task reminders, some are ideas, some are client context I need to capture fast. For months, these sat as audio files on a server — transcribed, but unsearchable. If I needed to find "what did I say about the invoice structure last Tuesday," I was grepping through text files.
My saved links — articles, repos, references — lived in a JSON store. Also unsearchable beyond exact text match.
My operations agent (an LLM that manages my morning briefings, system health, and message routing) had no way to query any of this. It could check if services were running, but couldn't answer "what have I been working on this week?"
Three data sources. Zero connections between them. Everything isolated.

The Pipeline
Here's what happens now when I record a voice note from my phone:
1. Capture — A PWA on my phone hits the recorder service. The audio file lands on the Mac Mini.
2. Transcription — A local speech-to-text model runs on the M4's neural engine. No cloud API, no network call. The audio never leaves the machine. A 2-minute recording transcribes in about 4 seconds.
3. Cleanup — The raw transcript goes to a small, fast AI model for cleanup. It strips filler words and fixes punctuation and structure. But I added something else to this step: the cleanup prompt also extracts commitments.

```
After the cleaned transcript, output COMMITMENTS_JSON: followed by
a JSON array of action items. Format: [{"text": "what", "context": "why"}]
```

The model does both jobs in a single pass. Cost per transcript: a fraction of a cent.
4. Commitment Extraction — The response gets split on the COMMITMENTS_JSON: marker. The clean transcript gets saved. The commitments get appended to a structured file that my operations agent reads every morning. Stuff I said I'd do actually surfaces in my daily briefing now.
5. Semantic Indexing — The transcript gets POST'd to an embeddings service running on the same machine. It uses a small transformer model (~480MB in memory) to convert text into vectors. Now that voice note is findable by meaning, not just keywords.
6. Relay — The transcript and original audio get sent to a WhatsApp group via an operations gateway. I see them on my phone within seconds of recording.
End to end: record → transcribe → clean → extract commitments → index → relay. All local except the cleanup call and the WhatsApp send. Total latency under 10 seconds for a typical recording.
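The glue between steps 3 through 5 is small enough to sketch. Here's roughly what the split-and-index logic looks like in Python. The COMMITMENTS_JSON: marker and array format come from the prompt shown above; the /index route and its payload shape are my guesses at the embeddings service's write side, since only its /search side appears in this post.

```python
import json
import urllib.request

MARKER = "COMMITMENTS_JSON:"

def split_cleanup_response(raw: str):
    """Split the cleanup model's response into (transcript, commitments)."""
    if MARKER not in raw:
        return raw.strip(), []
    transcript, _, tail = raw.partition(MARKER)
    try:
        commitments = json.loads(tail.strip())
    except json.JSONDecodeError:
        commitments = []  # model ignored the format; keep the transcript anyway
    return transcript.strip(), commitments

def index_transcript(text: str, endpoint: str = "http://127.0.0.1:8092/index"):
    """POST the cleaned transcript to the local embeddings service.

    The /index route and {"text": ...} body are assumptions for illustration.
    """
    req = urllib.request.Request(
        endpoint,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

The nice property of the marker approach: if the model forgets the format, you still get a clean transcript and an empty commitments list, not a crash.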

The Same Pattern, Three Times
Once I had the embeddings service running, I wired it into everything:
Voice recordings → auto-indexed after transcription. 12 recordings on day one, growing daily.
Saved links → when I share a URL from my phone, a link intelligence service fetches the page, sends it through an AI model for summary and tagging, then indexes the result. Now I can search my saved links by concept: "that article about database migrations" finds it even if the title said nothing about databases.
Operations logs → my AI agent generates session logs in JSONL format. 186 sessions of context — decisions made, questions answered, system changes logged. I wrote a parser that extracts the first 10 substantive messages from each session and indexes them. Now I can semantically search my own operational history.
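A parser like that can stay short. This is a sketch rather than the actual code: the JSONL field names (role, content) and my definition of "substantive" (user/assistant messages over a minimum length) are assumptions.

```python
import json

def first_substantive(lines, limit=10, min_chars=40):
    """Pull the first `limit` substantive messages from one JSONL session.

    "Substantive" is a judgment call: here, any user/assistant message
    whose text is at least `min_chars` long, which skips acks and noise.
    """
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            msg = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate malformed lines in old logs
        text = (msg.get("content") or "").strip()
        if msg.get("role") in ("user", "assistant") and len(text) >= min_chars:
            out.append(text)
            if len(out) >= limit:
                break
    return out
```

Feed it `open(path)` per session file, join the results, and POST them to the same indexing endpoint the recordings use.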
Same endpoint, same embedding model, three completely different data sources. The entire search API is one curl command:

```shell
curl -s -X POST http://127.0.0.1:8092/search \
  -H 'Content-Type: application/json' \
  -d '{"query": "invoice structure for client", "top_k": 5}'
```
Returns results ranked by meaning, across recordings, links, and agent sessions. On a $600 machine.
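Under the hood, "ranked by meaning" is just nearest-neighbor search over those vectors. Here's a toy version of what a /search endpoint like this presumably does internally, using cosine similarity over an in-memory index; the real service's internals aren't shown in this post, so treat this as an illustration of the idea, not its code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vec, index, top_k=5):
    """Rank (doc_id, vector) pairs by similarity to the query vector.

    `index` can mix entries from any source: recordings, links, sessions.
    """
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

This is why one endpoint serves three data sources: once everything is a vector from the same model, the index doesn't care where the text came from.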

Right-Sizing
The hardest lesson wasn't building services. It was learning when to kill them.
The Mac Mini has 16GB of RAM. After the OS, background processes, and 9 services, RAM is the bottleneck — not CPU, not disk, not network. Every megabyte matters.
I had a local language model running — a 3-billion parameter model taking 2.2GB of RAM. It sat there for weeks. Nothing called it. My operations agent uses a cloud model. My transcript cleanup uses a different model. The local one was speculative — "maybe I'll need it someday."
I killed it. RAM went from 117MB free to 2.3GB free. Every other service got faster. The model files stay cached on disk; I can restart it in 30 seconds if a real use case appears.
The right-sizing rule: if nothing calls a service, it's waste. Don't run things speculatively on a constrained box. You can always bring them back. You can't get RAM back from a process that's doing nothing.
I also found two orphaned tail -f processes from weeks earlier, following a log file that no longer existed. A photos analysis daemon using 2.8GB for a machine that has no display. An entire creative suite's background services loaded at boot for a headless server. All killed. All waste.
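Finding that kind of waste is mostly a matter of sorting processes by resident memory. A small audit script along these lines works on both macOS and Linux, since `ps axo rss,comm` is portable across the two; this is a sketch of the approach, not the exact tooling I used.

```python
import subprocess

def top_rss(ps_output: str, n: int = 5):
    """Parse `ps axo rss,comm` output; return the n biggest residents in MB."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        rss, _, comm = line.strip().partition(" ")
        if rss.isdigit():
            rows.append((int(rss) // 1024, comm.strip()))  # RSS is in KB
    rows.sort(reverse=True)
    return rows[:n]

def audit(n: int = 5):
    """Run ps and print the biggest memory residents."""
    out = subprocess.run(["ps", "axo", "rss,comm"],
                        capture_output=True, text=True).stdout
    for mb, comm in top_rss(out, n):
        print(f"{mb:>6} MB  {comm}")
```

Run `audit()` once a week on a constrained box and the photoanalysisd-style surprises show up before they cost you anything.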

The Multi-Machine Angle
The Mac Mini isn't alone. It's one node in a three-machine setup connected over a mesh VPN:
- Venus (the Mac Mini) — intelligence hub, services, local AI
- Moon (a Linux laptop) — portable development, on-the-go sessions
- Mercury (a Linux desktop) — heavy compute, large builds
Config files sync between machines automatically. I can SSH from Moon into Venus and run any service command. The tweet queue I'm building lives on Moon but posts through a script that calls the X API directly. The voice recordings live on Venus but the transcripts sync everywhere.
The key insight: no single machine does everything. Each one does what it's good at. The Mac Mini's neural engine makes it the best transcription box. The Linux desktop has more RAM for big builds. The laptop goes where I go. The mesh VPN makes them feel like one system.

What It Actually Cost
| Component | Cost |
|---|---|
| Mac Mini M4 (16GB) | $599 |
| Cloudflare tunnel + access | Free tier |
| Mesh VPN | Free tier |
| Domain | ~$10/year |
| AI cleanup calls | ~$2/month |
| Electricity | ~$5/month |
No cloud VMs. No managed databases. No $50/month subscriptions. The speech-to-text model runs locally. The embeddings model runs locally. The operations agent uses a cloud model, but that's a choice — the local option exists if the cost math changes.
Total monthly cost to run a personal intelligence hub: under $10.

What I'd Do Differently
Start with the embeddings service. I built it last but it should have been first. Once you have semantic search, every other service becomes more valuable because its output becomes findable.
Add health endpoints from day one. I spent an afternoon adding /health routes to 9 services after the fact. If I'd done it when building each service, it would have been 3 lines per service and zero retrofit time.
Don't install services you don't have a caller for. The local language model taught me this. Cool technology sitting idle is just warm RAM.

The Repos
Everything described here is open source:
- ormus-recorder — Voice recording PWA + local transcription pipeline
- ormus-links — Link intelligence with AI extraction
- hermetic-claude — The skill/command system that ties it together
The code is real, running, and actively maintained. Not a demo. Not a proof of concept. The system I actually use every day to run my work.

The best infrastructure is the kind you forget exists until you need it. Then it's already there, already indexed, already searchable. That's the goal: a system that catches everything and surfaces what matters, without asking you to change how you work.


