Case Study — Builder-Architect

AutoPulse: How I Built an Automation Monitoring System in a Weekend

When your automation platforms can't tell you they've stopped working, you need something outside the blast radius watching on your behalf.


The Problem Nobody Talks About

You've invested in automation. Your CRM syncs with your invoicing tool. Your field service platform talks to your database. Client onboarding runs on autopilot. Everything works — until it doesn't.

To be fair, most automation platforms do notify you when something fails. Make.com emails you when a scenario errors out. Airtable flags a broken automation. That part works — when the platform itself is working.

The watchman problem: the platform can't alert you about its own outage. If Airtable goes down, it can't send you a notification saying it went down. If Zapier silently stops executing zaps — and this happens — there's no zap that fires to tell you it stopped running. Who watches the watchman?

And then there's the failure mode that generates no alert at all: gradual degradation. A platform doesn't crash — it just starts responding slower. 200ms becomes 800ms. No error. No notification. But downstream, automations start timing out, syncs fall behind, and small delays compound into missed invoices, late contractor notifications, or compliance checks that never complete.

You find out days later when a client calls — or worse, when a regulator asks questions.

The Situation

I manage automation systems across multiple clients — each running a stack of interconnected tools: Airtable, Make.com, Zoho FSM, QuickBooks, HubSpot, and custom API integrations. These aren't simple one-step workflows. They're multi-system orchestrations where a single failure in one connection can cascade downstream.

The monitoring options available were not great:

Option A: Platform-native logs. Scattered across five different dashboards, each with its own format. Checking them daily was a full-time job nobody was doing.

Option B: Enterprise tools. Datadog, PagerDuty, New Relic. Designed for engineering teams with six-figure budgets. Overkill for operational automation.

Option C: Nothing. Which is what most automation teams actually use: wait for something to break, then react.

None of these fit. I needed something purpose-built: lightweight, centralized, and designed specifically for the kind of automation infrastructure I manage.

The Decision: Build It

Instead of stitching together another set of tools, I designed and built AutoPulse — a centralized automation health monitoring system.

Timeline: one weekend, from architecture to production.
Monthly cost: $0, running entirely within Cloudflare's free tier.
Status: in production, monitoring live client systems.

How It Works

AutoPulse has three layers, each doing one job:

1. The Watchers. Automated scripts run every 5 minutes, checking whether each connected platform is responding and how fast. Think of it as a heartbeat monitor — if a platform stops responding or slows down, AutoPulse knows immediately.

2. The Memory. Every health check result gets stored with a timestamp. This creates a historical record: not just "is it working now?" but "how has it been performing over the past 24 hours, week, or month?" Patterns become visible — a platform that gets slow every Tuesday at 3pm tells you something.

3. The Dashboard. A single screen showing the real-time status of every monitored system. Green means healthy. Yellow means slow. Red means down. Click into any system and you see the performance trend over time — latency graphs that tell the story at a glance.

When something goes red, an alert fires to Slack. No more checking five different dashboards. No more finding out from a client.
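A single watcher check can be sketched in a few lines. This is an illustrative version, not AutoPulse's actual code: the function name, result fields, and the assumption that each platform exposes a pingable endpoint are mine.

```javascript
// One heartbeat check: ping an endpoint, record whether it answered and
// how fast. fetchFn is injectable so the logic can be tested offline.
async function healthCheck(system, url, fetchFn = fetch, timeoutMs = 5000) {
  const start = Date.now();
  try {
    const res = await fetchFn(url, { signal: AbortSignal.timeout(timeoutMs) });
    return {
      system,
      ok: res.ok, // an HTTP 2xx means the platform answered
      latencyMs: Date.now() - start, // raw latency, stored with the status
      checkedAt: new Date().toISOString(),
    };
  } catch {
    // A timeout or network error counts as "down".
    return {
      system,
      ok: false,
      latencyMs: Date.now() - start,
      checkedAt: new Date().toISOString(),
    };
  }
}
```

In a Workers deployment, a scheduled (cron) handler would call this every 5 minutes, once per configured system, and fire the Slack webhook when a result comes back not-ok.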

The Architecture

Skip this section if you don't care about the how — the results section is next.

AutoPulse runs on Cloudflare Workers — serverless functions that execute at the edge, close to the services they're monitoring. The database is Cloudflare D1 (SQLite at the edge). The two workers communicate via Service Bindings — a direct internal connection that bypasses the public internet.
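The storage step can be sketched as follows. The table shape is an assumption based on what the text says gets recorded (raw latency, classification, timestamp) — it is not the actual AutoPulse schema — but `prepare`/`bind`/`run` is D1's standard query API.

```javascript
// Assumed table (illustrative, not the real schema):
//   CREATE TABLE checks (system TEXT, ok INTEGER, latency_ms INTEGER,
//                        tier TEXT, checked_at TEXT);
// db is a Cloudflare D1 binding (env.DB in a Worker).
async function storeResult(db, r) {
  await db
    .prepare(
      "INSERT INTO checks (system, ok, latency_ms, tier, checked_at) VALUES (?, ?, ?, ?, ?)"
    )
    .bind(r.system, r.ok ? 1 : 0, r.latencyMs, r.tier, r.checkedAt)
    .run();
}
```

Because every check is a plain row with a timestamp, the "Memory" layer is just SQL over this table: latency trends, uptime percentages, and the Tuesday-at-3pm patterns all fall out of ordinary queries.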

The architecture is intentionally modular:

The monitoring worker and the dashboard worker are separate deployments. If the monitor breaks, the dashboard still shows historical data. If the dashboard breaks, monitoring and alerting continue.
Adding a new system to monitor means adding an entry to a configuration list — not building new infrastructure.
Each monitored system reports as an independent feed, so one platform's issues don't affect visibility into others.
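The configuration list might look like the sketch below. The entries and URLs are placeholders, not real endpoints; the point is that onboarding a new system is one new entry, not new infrastructure.

```javascript
// Illustrative monitor configuration: one entry per watched platform.
const MONITORED_SYSTEMS = [
  { name: "airtable", url: "https://api.example.com/airtable/ping" },
  { name: "make", url: "https://api.example.com/make/ping" },
  { name: "zoho-fsm", url: "https://api.example.com/zoho/ping" },
];

// Each entry becomes an independent feed: the monitor iterates the list,
// so one system's failure never blocks the others' checks.
function feedNames(config) {
  return config.map((s) => s.name);
}
```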

The health check classifies response times into tiers — Excellent (<200ms), Good (200–499ms), Warning (500–1499ms), Critical (≥1500ms) — and stores both the raw latency and the classification. The dashboard renders latency trends using Chart.js, color-coded by performance tier.
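The tiering is a pure function of the measured latency. The function and tier names below are mine; the millisecond cutoffs are the ones stated above, with exactly 1500ms treated as Critical.

```javascript
// Map a raw latency measurement to its performance tier.
function classifyLatency(ms) {
  if (ms < 200) return "excellent";
  if (ms < 500) return "good";
  if (ms < 1500) return "warning";
  return "critical";
}
```

Storing the tier alongside the raw number means the dashboard can color-code instantly, while the raw latencies remain available for trend graphs.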

Total infrastructure cost: $0. Cloudflare's free tier allows 100,000 Worker invocations per day. AutoPulse uses roughly 300 — a check cycle every 5 minutes is 288 scheduled runs a day, plus a handful of dashboard requests.

The Results

Before AutoPulse:
Platform issues discovered hours or days after they started
No centralized view of automation health across clients
Reactive troubleshooting — always fighting fires after the fact
Zero historical data on platform performance patterns
After AutoPulse:
Real-time visibility into every connected platform, from one screen
Slack alerts within 5 minutes of any platform degradation
Historical latency data revealing performance patterns
Proactive capacity to address issues before client impact

What This Demonstrates

AutoPulse wasn't a client project. Nobody asked me to build it. I built it because the problem existed and no available tool solved it the right way for my context.

This is what I mean by Builder-Architect: I saw a gap in how automation operations were being managed, designed a system to fill it, and built it end-to-end — from database schema to API design to production dashboard — in a weekend.

The same thinking applies to every system I build for clients: understand the real operational problem, design the architecture, build it, and make sure it actually works in production. Not just the happy path — the failure modes, the edge cases, the "what happens at 2am when nobody's watching" scenarios.

If your business runs on automation, someone should be watching whether it's actually running.