Maybe Don't DDoS Gmail
TL;DR: 1.4M Gmail API calls/day for a single user from over-engineering 'reliability.' Throttled my entire Google account. Circuit breakers fought my fixes. Turns out webhooks work if you trust them.
I don't trust things to work.
It's an occupational hazard. Over a decade running operations at Amazon, StockX, and Stripe will do that—you've seen too many "guaranteed" systems silently fail at 3 AM.
So when I built my Gmail integration, I did what any paranoid operator would do: I added safety nets. Then I added safety nets for the safety nets.
Gmail sends webhooks when emails arrive? Great. But what if the webhooks don't arrive? Better poll every minute just in case.
What if emails get stuck? Poll every 5 minutes to check.
What if the webhooks fail silently? Poll every 15 minutes to reconcile.
Belt and suspenders, right?
Turns out I built a straitjacket.
The Problem: 1.4 Million API Calls Per Day
My account got throttled. Not just the app—my entire Google account.
Other Google integrations stopped working. Gmail clients started showing rate limit warnings. The app fell behind on sync.
I checked the logs: 1.4 million Gmail API calls per day.
For context, Gmail's free tier allows 1 billion queries per day across all users. I was burning through 0.14% of that quota for my own development account.
That's not a safety net. That's a DDoS attack.
And it got worse: I had circuit breakers in my code that kept firing "self-healing" recovery processes. I'd turn off email processing, and the circuit breakers would detect the "failure" and restart everything.
My own guardrails were fighting me. (More on that disaster in a future post.)
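One way to picture the conflict: a recovery loop that cannot tell "an operator stopped this on purpose" from "this crashed" will restart both. The sketch below is hypothetical and not the code from this system. `createSupervisor`, `pause`, and `onWorkerStopped` are made-up names, and the only point is that self-healing checks an operator-controlled flag before it acts.

```javascript
// Hypothetical supervisor; every name here is illustrative.
function createSupervisor({ restart }) {
  let pausedByOperator = false;
  return {
    // An operator deliberately turns processing off or back on.
    pause() { pausedByOperator = true; },
    resume() { pausedByOperator = false; },
    // Called by a health check when the worker looks stopped.
    onWorkerStopped() {
      if (pausedByOperator) {
        return 'left-stopped'; // intentional stop: do not "heal" it
      }
      restart(); // genuine failure: recover as before
      return 'restarted';
    },
  };
}

// Usage: a crash triggers a restart, a manual pause does not.
let restartCount = 0;
const supervisor = createSupervisor({ restart: () => { restartCount += 1; } });
supervisor.onWorkerStopped(); // 'restarted'
supervisor.pause();
supervisor.onWorkerStopped(); // 'left-stopped'
```

In a real deployment the pause flag would need to live somewhere durable (a database row or a feature flag) so that every recovery process sees the same value.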
The Root Cause: Polling Madness
I had 6 different schedulers running continuously, all "just in case":
1. Batch Scheduler — Ran every minute (1,440 times/day):
{ cron: '* * * * *' } // Every. Single. Minute.
2. Five Monitoring Jobs — Running every 5-30 minutes:
- Webhook heartbeat monitor (every 5 min)
- Stuck email monitor (every 5 min)
- Stuck jobs recovery (every 5 min)
- Failed email recovery (every 15 min)
- Inbox reconciliation (every 15 min)
Combined: ~4,800 polling executions per day.
Each execution made multiple Gmail API calls to check for changes that the webhooks already told me about.
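To make the arithmetic concrete, here is a small sketch that converts polling intervals into runs per day and works out the average API calls per execution from the figures quoted in this post. The six jobs itemized above account for 2,496 runs; the ~4,800 total presumably also counts schedulers not itemized here, such as the optimizedScheduler and threadMonitor mentioned in the fix (that part is an assumption).

```javascript
const MINUTES_PER_DAY = 24 * 60;

// Runs per day for a job that fires every `intervalMinutes` minutes.
function runsPerDay(intervalMinutes) {
  return MINUTES_PER_DAY / intervalMinutes;
}

// Intervals (in minutes) of the six jobs itemized above.
const listedIntervals = [1, 5, 5, 5, 15, 15];
const listedRuns = listedIntervals.reduce((sum, m) => sum + runsPerDay(m), 0);

// Figures quoted in the post.
const totalExecutionsPerDay = 4800;
const apiCallsPerDay = 1400000;
const avgCallsPerExecution = Math.round(apiCallsPerDay / totalExecutionsPerDay);

console.log(runsPerDay(1));        // 1440
console.log(listedRuns);           // 2496
console.log(avgCallsPerExecution); // 292
```

Roughly 290 API calls per polling run, on average, is what turns "just a cron job" into 1.4 million calls a day.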
I wasn't building reliability. I was building redundancy theater.
The Fix: Trust Your Webhooks
The solution was brutal simplicity—delete the safety nets:
1. Removed redundant schedulers — Deleted batchScheduler, optimizedScheduler, and threadMonitor entirely. If the webhooks work, you don't need them.
2. Expanded webhook coverage — Gmail Watch now tracks INBOX, SENT, and TRASH labels. Webhooks tell me everything I need to know.
3. Used History API properly — Query messageAdded, labelAdded, labelRemoved events reactively (only when webhooks fire), not proactively every minute.
4. Drastically reduced "safety nets":
- Monitoring jobs: 5 min → 30 min intervals
- Inbox reconciliation: Every 15 min → once daily at 4 AM
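The reactive pattern in step 3 can be sketched roughly as follows. This is a minimal illustration, not the code from this app. It assumes the webhook receives a Pub/Sub push body whose `message.data` is base64-encoded JSON with `emailAddress` and `historyId`, and that the app stores the last history ID it processed. `listHistory` is a hypothetical injected function standing in for a wrapper around the Gmail API's `users.history.list`, so the sketch runs without credentials.

```javascript
// Decode the Pub/Sub push body Gmail publishes for a watched mailbox.
function decodeGmailNotification(pushBody) {
  const json = Buffer.from(pushBody.message.data, 'base64').toString('utf8');
  const { emailAddress, historyId } = JSON.parse(json);
  return { emailAddress, historyId: String(historyId) };
}

// Fetch everything that changed since `lastHistoryId`, following pagination.
// `listHistory` is a stand-in for a users.history.list wrapper (assumption).
async function collectChanges(listHistory, lastHistoryId) {
  const changes = { added: [], labelAdded: [], labelRemoved: [] };
  let latestHistoryId = lastHistoryId;
  let pageToken;
  do {
    const page = await listHistory({
      userId: 'me',
      startHistoryId: lastHistoryId,
      historyTypes: ['messageAdded', 'labelAdded', 'labelRemoved'],
      pageToken,
    });
    for (const record of page.history || []) {
      for (const item of record.messagesAdded || []) changes.added.push(item.message.id);
      for (const item of record.labelsAdded || []) changes.labelAdded.push(item.message.id);
      for (const item of record.labelsRemoved || []) changes.labelRemoved.push(item.message.id);
    }
    if (page.historyId) latestHistoryId = String(page.historyId);
    pageToken = page.nextPageToken;
  } while (pageToken);
  return { changes, latestHistoryId };
}

// Webhook entry point: API calls happen only when a notification arrives.
async function handleWebhook(pushBody, state, listHistory) {
  const note = decodeGmailNotification(pushBody);
  const { changes, latestHistoryId } = await collectChanges(listHistory, state.lastHistoryId);
  state.lastHistoryId = latestHistoryId; // persist this in real code
  return { emailAddress: note.emailAddress, changes };
}
```

The design choice worth noting is that the sync starts from the stored history ID rather than the one in the notification, so a dropped or delayed notification is covered by the next one that does arrive.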
The Results
- From ~4,800 to ~100 polling executions per day (roughly a 98% reduction)
- From 1.4M+ to ~40K Gmail API calls per day
- No more rate limiting — My account works again, integrations recovered
- Real-time processing — Webhooks are actually faster than polling
The thing I was most afraid of (webhooks failing silently) never happened. The thing that did happen (global account throttling and self-inflicted recovery loops) was entirely my own doing.
The Lesson: Paranoia Has a Cost
I'm still a closed-loop process person. I still want monitoring, alerting, and self-healing.
But redundancy isn't reliability if it creates the failure mode you're trying to prevent.
Polling "just in case" made the system:
- More fragile (rate limiting broke everything, including unrelated services)
- Slower (polling is inherently delayed)
- More expensive (API quota burns fast)
- Harder to debug (which scheduler caused this error?)
Now I use polling sparingly—one daily reconciliation job at 4 AM to catch anything truly weird. The rest is event-driven.
Webhooks work. If you don't trust them, fix your webhook infrastructure. Don't build a polling fortress around it.
What I'm Watching For
The daily reconciliation job at 4 AM is my canary. If it consistently finds emails the webhooks missed, I'll know something's wrong with the event-driven architecture.
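A canary like this can be sketched as a set difference plus a threshold. The code below is illustrative only. `findMissedMessages` and `canaryStatus` are hypothetical helpers, and the three-day threshold is an arbitrary assumption standing in for "consistently."

```javascript
// Cron expression for "once daily at 4 AM" (minute hour day month weekday).
const RECONCILE_CRON = '0 4 * * *';

// Compare what the mailbox contains against what webhook-driven processing
// already handled. Anything left over is a message the webhooks missed.
function findMissedMessages(inboxIds, processedIds) {
  const processed = new Set(processedIds);
  return inboxIds.filter((id) => !processed.has(id));
}

// Hypothetical rule: one stray miss may be a race with an in-flight webhook,
// but misses on several days suggest the event pipeline needs attention.
function canaryStatus(missedCountsByDay, threshold = 3) {
  const badDays = missedCountsByDay.filter((n) => n > 0).length;
  return badDays >= threshold ? 'investigate-webhooks' : 'ok';
}

console.log(findMissedMessages(['a', 'b', 'c'], ['a', 'c'])); // [ 'b' ]
console.log(canaryStatus([0, 0, 0, 0, 0, 0, 0]));             // ok
console.log(canaryStatus([1, 0, 2, 0, 1, 0, 0]));             // investigate-webhooks
```

The job still costs API calls to list the inbox, but it pays that cost once a day instead of every minute.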
So far? It finds nothing. The webhooks work.
Turns out the paranoia was the problem.