Published: 2026-01-22, Last updated: 2026-01-22
A single device that automatically recovers a broken Wi-Fi connection sounds trivial at first.
In reality, it becomes a compact case study of how hardware, firmware, network infrastructure, and router behavior intersect, and of how fragility naturally emerges when a system crosses those boundaries.
This article documents the design of a sensor-driven Wi-Fi recovery system and explains why even “simple” real-world systems demand careful thinking.
The original goal was intentionally minimal:
Detect when Wi-Fi connectivity is actually broken (not just slow)
Attempt a single automated recovery action
Leave a traceable signal when recovery fails
Constraints (by design):
No UI, dashboard, or companion app
Only one action mapped to one outcome (recover or record failure)
No persistent cloud state
Why it matters:
Most “self-healing” systems fail quietly because they confuse symptoms with causes. This project forces the system to admit uncertainty instead of masking it.
📋 Key takeaway: If a system can’t explain why it acted, it shouldn’t act automatically.
At a conceptual level, the system looks simple:
Trigger: Wi-Fi health sensor detects loss of connectivity
Local compute: microcontroller / edge script evaluates state
Network action: controlled reset or reconnection attempt
External effect: Wi-Fi restored—or failure logged
However, each step hides assumptions that stay invisible until something fails.
This is where real-world systems diverge from diagrams.
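As a sketch, the four steps above can be expressed as one sense → decide → act → verify loop. Everything here is illustrative: the callables `sense`, `decide`, `act`, and `verify` are placeholders, not the project's actual firmware API.

```python
def run_cycle(sense, decide, act, verify):
    """One pass through the conceptual pipeline.

    sense()   -> observation of network health
    decide(o) -> "recover" or "no_action"
    act()     -> perform the recovery side effect
    verify()  -> True if connectivity is confirmed afterwards

    Returns a string describing the outcome, so a failed recovery
    leaves a traceable signal instead of disappearing silently.
    """
    observation = sense()
    decision = decide(observation)
    if decision != "recover":
        return "no_action"
    act()
    return "recovered" if verify() else "recovery_failed"
```

Keeping each step injectable means the decision logic can be exercised on a desk, without real hardware or a real outage.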
Prototype snapshot: ESP32-based Wi-Fi monitor used to detect connectivity loss and trigger automated network recovery.
System setup: ESP32 Wi-Fi monitor paired with a smart plug used to reset network equipment when connectivity fails.
(Hardware Layer: why a signal is never “just a signal”)
What we assume:
Packet loss means Wi-Fi is down
Signal strength reflects connectivity quality
How it fails in reality:
DNS failures look like total disconnection
Captive portals and transient router hiccups trigger false alarms
What we do about it:
Combine multiple signals (ping + DNS + timeouts)
Require failure persistence over time, not instant triggers
📋 Key takeaway: Connectivity is a state, not a boolean.
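A minimal sketch of the mitigation above, assuming injected ping/DNS probe results rather than real network calls; the window and threshold values are illustrative, not tuned settings.

```python
from collections import deque

class ConnectivityMonitor:
    """Treat connectivity as a state, not a boolean: combine multiple
    signals and require failure persistence before declaring "down"."""

    def __init__(self, window=5, threshold=4):
        self.window = window                  # recent samples considered
        self.threshold = threshold            # failures needed for "down"
        self.samples = deque(maxlen=window)

    def sample(self, ping_ok, dns_ok):
        # A sample fails only if BOTH probes fail; a DNS-only failure
        # should not look like total disconnection.
        self.samples.append(ping_ok or dns_ok)

    def state(self):
        if len(self.samples) < self.window:
            return "unknown"                  # not enough evidence yet
        failures = sum(1 for ok in self.samples if not ok)
        if failures >= self.threshold:
            return "down"
        return "up" if failures == 0 else "degraded"
```

Note the explicit `"unknown"` state: before the window fills, the monitor admits it does not know, rather than guessing.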
(Architecture: where the boundary actually is)
What we assume:
If Wi-Fi is down, recovery is always safe
Rebooting or resetting is harmless
How it fails in reality:
Recovery actions can interrupt valid in-progress traffic
Acting too often creates oscillation loops
What we do about it:
Cool-down timers between actions
Explicit “no-action” states when uncertainty is high
📋 Key takeaway: Not acting is often the most reliable decision.
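The cool-down and no-action rules might look like the sketch below. The 300-second cool-down and the state names are assumptions for illustration, and the clock is injectable so the logic can be tested without waiting.

```python
import time

class RecoveryGovernor:
    """Gate recovery actions behind a cool-down timer, and refuse to
    act when the observed state is anything but a confirmed outage."""

    def __init__(self, cooldown_s=300, now=time.monotonic):
        self.cooldown_s = cooldown_s
        self.now = now               # injectable clock for testing
        self.last_action = None      # monotonic time of last recovery

    def decide(self, state):
        if state != "down":
            return "no_action"       # uncertainty => explicitly do nothing
        t = self.now()
        if self.last_action is not None and t - self.last_action < self.cooldown_s:
            return "cooling_down"    # avoid oscillation loops
        self.last_action = t
        return "recover"
```

Returning distinct `"no_action"` and `"cooling_down"` outcomes keeps the "why didn't it act?" question answerable from the outside.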
(Symptom vs outcome: what users expect vs what happens)
What we assume:
A reset restores normal operation
Success is immediate and visible
How it fails in reality:
Recovery succeeds but connectivity remains partial
No feedback makes failure indistinguishable from success
What we do about it:
Post-action verification checks
Explicit failure markers when recovery doesn’t work
📋 Key takeaway: A recovery without verification is just another guess.
Prototype device: ESP32 Wi-Fi monitor used to detect connectivity loss in the network.
This system spans multiple domains:
Physical environment (interference, power)
Firmware logic and timing
Network infrastructure and ISP state
Router firmware behavior
Each layer is reasonable on its own. Together, they multiply uncertainty.
This fragility isn’t a mistake — it’s a property of cross-layer systems.
📋 Key takeaway: Assume the layer you can’t see is the one that fails.
If this were production, improvements would include:
Feedback mechanisms (LEDs, logs, user-visible signals)
Explicit retry and timeout strategy
Clear offline and degraded-mode handling
State validation before issuing recovery commands
Cross-layer observability (logs, metrics, timestamps)
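The last item, cross-layer observability, could start as simply as one structured, timestamped record per action; the field names below are assumptions for illustration, not an existing log schema.

```python
import json
import time

def log_event(action, state_before, outcome, stream=None):
    """Emit one structured record per recovery decision so that
    behavior is traceable across layers after the fact."""
    record = {
        "ts": time.time(),           # when it happened
        "action": action,            # what the system did
        "state_before": state_before,  # what it believed at the time
        "outcome": outcome,          # what verification concluded
    }
    if stream is not None:
        stream.write(json.dumps(record) + "\n")
    return record
```

A device that can answer "what did you believe, what did you do, and did it work?" for every action is already most of the way to explaining itself.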
⚡Most importantly⚡
"A simple interface does not mean a simple system."
➡️ FIELD NOTE|Automating Router Alerts Inside a Network Boundary
➡️ BUILD LOG|Vibe Coding with AI: Modifying Embedded Media Players
➡️ SYSTEM REVIEW|From SMS Trigger to Email Notification Systems
➡️ FIELD NOTE|Designing Failure-Aware IoT Prototypes
Why not reset the router the moment a failure is detected?
Because many failures aren’t router failures. Acting too early hides root causes and creates unnecessary disruption.
Why did the system sometimes do nothing at all?
Because the system detected ambiguous state. Silent failure is safer than confident wrong action.
What was the hardest part to get right?
The assumptions between sensing and decision logic.
What would a production version need first?
User feedback, persistent state logging, and remote observability across all actions.
Does this generalize beyond Wi-Fi recovery?
Yes. Any system crossing physical and logical boundaries behaves this way.
Shipping real-world systems means designing for failure, not assuming stability.
If you want a second set of eyes on architecture, reliability, or “demo → production” risks, book a session.