
The architecture behind Paagos

I’ve been building Paagos for a few weeks now, and the architecture has stabilized enough to write about. This post is a technical deep dive into the decisions I’ve made and the reasoning behind them.

The constraints

Before picking any technology, I defined the constraints:

  1. Cost-efficient at low scale. This is a product for Philippine SMEs. Margins are thin. The infrastructure can’t cost more than the revenue it supports.
  2. Fast everywhere in the Philippines. Not just Manila. Davao, Cebu, Iloilo. Edge compute matters here.
  3. AI-native, not AI-bolted. The AI isn’t a feature — it’s part of the infrastructure.
  4. One engineer. I’m building this alone. The stack needs to be simple enough for one person to operate.

The stack

Compute: Hono on Cloudflare Workers

The entire backend runs on Cloudflare Workers with Hono as the framework. No containers. No VMs. No cold starts.

Why? Because Workers are cheap at low scale (the free tier is generous), fast everywhere (edge by default), and simple to deploy. Hono gives me a familiar Express-like API without the Node.js baggage.

The tradeoff: Workers have execution limits. CPU time is capped. You can’t run long computations. For heavier tasks, I lean on Cloudflare’s Workflows and Queues to break work into smaller steps.

Data: PlanetScale Postgres

PlanetScale gives me a managed Postgres database with branching — I can test schema changes against production-like data without risk. The connection pooling works well with Workers’ connection model.

Frontend: TanStack Start on Cloudflare Pages

TanStack Start gives me type-safe, server-rendered pages with full-stack type inference. Running on Cloudflare Pages means the frontend is also at the edge.

AI: Two-tier architecture

This is the part I’m most opinionated about.

Tier 1: Invisible AI runs on Workers AI (open-source models). It handles transaction categorization, receipt OCR, anomaly detection — things the user never sees directly. It’s fast, cheap, and runs at the edge.

Tier 2: Active AI uses frontier models through Mastra. This is the conversational layer — natural language queries, financial insights, guided workflows. It’s more expensive per call, so it only runs when the user explicitly invokes it.

The separation matters because it lets me optimize cost and latency independently for each tier.
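The dispatch between tiers can be as simple as a lookup: background tasks go to Tier 1, anything the user explicitly invokes goes to Tier 2. The task names below are illustrative:

```typescript
type Tier = "workers-ai" | "frontier";

// Tasks the user never sees directly run on cheap edge models (Tier 1);
// everything the user explicitly invokes goes to a frontier model (Tier 2).
const INVISIBLE_TASKS = new Set(["categorize", "ocr", "anomaly-detect"]);

export function pickTier(task: string): Tier {
  return INVISIBLE_TASKS.has(task) ? "workers-ai" : "frontier";
}
```

Because the routing is explicit, each tier can be tuned without touching the other: swap the Tier 1 model for a smaller one, or rate-limit Tier 2 calls.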

What I’d change

It’s early. The architecture will evolve. But so far, the edge-first approach is paying off in both cost and performance. The main risk is complexity — Cloudflare’s ecosystem is powerful but has sharp edges (pun intended) that I’m still learning to navigate.

More updates as the build continues.