A Dead Man’s Switch for n8n, Without Building One
Most alerts tell you something happened. A dead man’s switch tells you something didn’t.
That is the whole idea. You set up a signal that has to keep arriving on a schedule, and the moment it stops, you get told. It is the only kind of check that fires on silence instead of on noise, which is exactly the failure n8n is worst at showing you: the run that simply never came.
The catch is that the DIY version is more fragile than the problem it solves. So most people either skip it or build one that quietly rots.
Why n8n needs a switch that fires on silence
n8n is good at telling you a run failed. It is bad at telling you a run never started.
A failed execution leaves a record: a red node, an error, something to click. A missed run leaves nothing. The workflow still reads Active, the executions list looks calm, and the report that should have gone out at 09:00 just isn’t there. There is no failed execution because there was no execution at all.
Every normal alert is built around an event that happened. None of them fire on the absence of an event. That is the gap a dead man’s switch is meant to fill, and why it matters more for scheduled work than almost anything else. We went deeper on the missed run itself in “the scheduled run that never fired”; the switch is the other half of that story.
The DIY heartbeat, and why it rots
The do-it-yourself version is well known. You make your workflow send a ping on every run, to a small service or a cron-checker, and you configure that service to alert you if the ping does not arrive in time. When the workflow stops, the ping stops, the alert fires. A heartbeat.
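To make the moving parts concrete, here is a minimal sketch of the checker side of that heartbeat. It assumes the workflow's last node reports in somehow (an HTTP call in practice) and that something outside calls overdue() on a timer. The class name, method names, and the single fixed timeout are illustrative assumptions, not any particular service's API.

```python
from datetime import datetime, timedelta


class HeartbeatChecker:
    """Tracks the last ping per workflow and reports which ones went silent."""

    def __init__(self, timeout: timedelta):
        # One fixed window for every workflow: the usual DIY starting point.
        self.timeout = timeout
        self.last_ping: dict[str, datetime] = {}

    def ping(self, workflow: str, at: datetime) -> None:
        # Hypothetical entry point, called when a workflow run reports in.
        self.last_ping[workflow] = at

    def overdue(self, now: datetime) -> list[str]:
        # A workflow is overdue when its last ping is older than the timeout.
        return sorted(
            name
            for name, seen in self.last_ping.items()
            if now - seen > self.timeout
        )


checker = HeartbeatChecker(timeout=timedelta(hours=25))
checker.ping("daily-report", datetime(2024, 1, 1, 9, 0))
late = checker.overdue(datetime(2024, 1, 2, 11, 0))  # 26 hours of silence
```

Note what this sketch cannot see: a workflow that has never pinged is not in last_ping at all, so it is never reported as overdue.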
It works, right up until it doesn’t, and it usually fails in quiet ways.
The first problem is that the heartbeat rides inside the same workflow it is supposed to watch. If the workflow never starts, the ping never sends, which is the point, but it also means the heartbeat shares every weakness of the thing it watches. If the instance is down, the heartbeat is down too.
The second problem is the interval. You have to tell the checker how long is too long, and a real workflow’s timing drifts. So you either set the window tight and get false alarms on every slightly-late run, or you set it loose and the alert arrives hours after it mattered.
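A small worked example shows the trade-off. The gaps below are made-up numbers for a nominally daily workflow, and the two window sizes are arbitrary choices for illustration.

```python
# Hypothetical gaps, in hours, between consecutive runs of a "daily" workflow.
# The last gap is the real outage: the workflow went quiet for three days.
gaps_hours = [24.0, 24.2, 23.9, 25.5, 24.1, 72.0]


def alerts(gaps: list[float], window_hours: float) -> list[int]:
    """Return the indices of the gaps that would have triggered an alert."""
    return [i for i, gap in enumerate(gaps) if gap > window_hours]


# Tight window: catches the outage early, but also pages on the 25.5 h run.
tight = alerts(gaps_hours, window_hours=24.5)

# Loose window: no false alarm, but the alert only fires 48 h after the
# last ping, which is a full day after the run was actually due.
loose = alerts(gaps_hours, window_hours=48.0)
```

With these numbers the tight window fires twice, once falsely, and the loose window fires once, a day late. No single value fixes both.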
The third problem is the one that actually gets you: it depends on you remembering to add it, correctly, to every workflow that matters, and to maintain it as schedules change. Across one workflow that is fine. Across a few dozen client workflows, “remember to wire and tune a heartbeat on each one” is not a plan. It is a wish.
So the DIY switch is real engineering applied to a problem that keeps shifting under it. For an agency, that is rarely the right place to spend the hours.
What a switch actually has to get right
Strip it down and a dead man’s switch for n8n has to do three things well.
It has to know what “on time” means for each workflow, not as a single global timeout but per workflow, because a once-a-week sync and a near-realtime feed are not late at the same point. It has to tolerate normal jitter, so a run that lands a little behind does not page you, while a run that genuinely vanished does. And it has to watch from outside the workflow, so the thing doing the watching does not die with the thing it watches.
Those three together are the difference between an alert you trust and an alert you learn to ignore.
Watching for the missing run without building it
This is the part NoCrash (n8n reliability) is built to do, and it is worth being specific about how, because “it watches your workflows” is not an answer.
NoCrash connects to n8n through the API and watches from outside. For scheduled workflows it learns each one’s normal cadence from its own history rather than asking you to declare a timeout, and it only flags a missed run when the gap exceeds that learned interval by a real margin, so ordinary lateness is absorbed instead of paged. Event-driven and webhook flows are judged on their outcome rather than on a schedule, so they do not throw false missed-run alarms.
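The general idea of learning a cadence from history can be sketched in a few lines. This is not NoCrash's actual implementation, which the article does not describe at that level. The median gap, the 1.5x margin, the three-run minimum, and the function names are all assumptions chosen for illustration.

```python
from datetime import datetime, timedelta
from statistics import median


def learned_interval(starts: list[datetime]) -> timedelta:
    """Estimate normal cadence as the median gap between past run starts.

    Needs at least two start times; fewer leaves no gap to measure.
    """
    ordered = sorted(starts)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    return timedelta(seconds=median(gaps))


def is_missed(starts: list[datetime], now: datetime, margin: float = 1.5) -> bool:
    """Flag a missed run when the current silence exceeds the learned gap by `margin`."""
    if len(starts) < 3:
        return False  # too little history to judge a cadence
    silence = now - max(starts)
    return silence > learned_interval(starts) * margin


# Five daily runs at 09:00, then silence.
history = [datetime(2024, 1, day, 9, 0) for day in range(1, 6)]
slightly_late = is_missed(history, now=datetime(2024, 1, 6, 10, 0))  # 25 h of silence
gone = is_missed(history, now=datetime(2024, 1, 7, 9, 0))            # 48 h of silence
```

The median is used here because one unusually long or short gap in the history does not move it much, so a single past outage does not stretch the learned interval.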
The result is the switch without the wiring. You do not add a ping node to every workflow, you do not tune a window per job, and the watch does not go dark when your instance does. When an expected run does not appear, that becomes a plain-language event instead of a silence nobody noticed.
That same outside-the-run logic is what catches the other quiet failures too: a green run that did nothing, Continue On Fail hiding an error, and an AI agent reporting success over a failed tool call. A missed run is just the version where there is no run at all.
Why this matters more for agencies
For your own automations, a missing run is an annoyance you will probably catch eventually. For a client’s, it is a trust problem you find out about late.
The client does not know the difference between a failed run and a missed one. They know the leads stopped arriving, the report did not land, the sync went stale. And because nothing turned red, nobody looked, sometimes for days. A switch that fires on silence is what turns “the client told us” into “we told the client,” and for an agency that one reversal is most of the value.
Check your workflow before the next quiet failure
A green execution is not always proof that the workflow did the job.
Run your exported n8n workflow through the free NoCrash Workflow Grader and get a quick read on the spots worth watching first. No access needed. No signup needed for the first look.
If you want ongoing coverage after that, start free and watch up to 3 things continuously.