Screen health monitoring confirms that all displays are online, showing the correct content, and operating within normal parameters. Without it, a broken screen in a far corridor can stay dark for weeks until someone walks past and notices. With it, you detect offline players within minutes, identify overheating hardware before it fails, and track fleet-wide uptime as a KPI. SpinetiX provides built-in monitoring through Arya and the player APIs.
Monitoring Architecture
Arya Dashboard Monitoring
Arya's fleet dashboard shows all players with real-time status: online (green), offline (red), warning (yellow). Click any player for details: firmware version, content version, last check-in, storage usage, and network info. Email notifications alert operators when a player goes offline.
REST API Polling
For custom monitoring, poll each player's REST API every 2–5 minutes. The status endpoint returns JSON with all health metrics. Parse responses into your monitoring platform (Grafana, Nagios, Zabbix, PRTG) for dashboards and alerting.
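
A polling loop along these lines could be a starting point. This is a minimal sketch, not the player's documented API: the `/status` path and the metric field names are assumptions made for illustration, so check the player's API reference for the real endpoint and payload. The `fetch` argument is injectable so the logic can be exercised without a live player.

```python
import json
from typing import Callable
from urllib.request import urlopen

# Hypothetical endpoint path (assumption); the real path depends on the player API.
STATUS_PATH = "/status"


def http_fetch(url: str, timeout: float = 5.0) -> str:
    """Fetch a URL and return the response body as text."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def poll_player(host: str, fetch: Callable[[str], str] = http_fetch) -> dict:
    """Poll one player; return its metrics, or mark it offline on any failure."""
    offline = {"host": host, "online": False}
    try:
        metrics = json.loads(fetch(f"http://{host}{STATUS_PATH}"))
    except (OSError, ValueError):
        # OSError covers connection errors and timeouts;
        # ValueError covers malformed JSON or undecodable bytes.
        return offline
    if not isinstance(metrics, dict):
        return offline  # a status payload is expected to be a JSON object
    return {**metrics, "host": host, "online": True}
```

Run `poll_player` for each player every 2–5 minutes from a scheduler (cron, a systemd timer, or your monitoring platform's collector) and forward the resulting dictionaries to the dashboard of your choice.
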
Screenshot Verification
Automated screenshot capture provides visual proof that content is rendering correctly. Compare screenshots against expected output to detect rendering issues, data feed failures, or unexpected content changes.
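
One simple way to compare a captured screenshot with the expected output is to count how many pixels differ noticeably. The sketch below assumes both images have already been decoded to flat sequences of grayscale values (0–255) of identical size, for example with an imaging library. The function names, the per-pixel tolerance, and the 2% threshold are illustrative assumptions, not SpinetiX defaults.

```python
from typing import Sequence


def changed_fraction(actual: Sequence[int], expected: Sequence[int],
                     tolerance: int = 8) -> float:
    """Fraction of pixels whose grayscale value differs by more than `tolerance`."""
    if len(actual) != len(expected):
        raise ValueError("screenshots must have the same dimensions")
    if not actual:
        return 0.0
    changed = sum(1 for a, e in zip(actual, expected) if abs(a - e) > tolerance)
    return changed / len(actual)


def screenshot_ok(actual: Sequence[int], expected: Sequence[int],
                  max_changed: float = 0.02) -> bool:
    """True when at most `max_changed` of the pixels differ from the reference."""
    return changed_fraction(actual, expected) <= max_changed
```

A small tolerance absorbs compression noise, while the fraction threshold lets minor dynamic elements (a clock, a ticker) change without raising an alarm. Layouts with large live regions would need those regions masked out before comparison.
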
Health Metrics
| Metric | Source | Warning Threshold | Critical Threshold |
|---|---|---|---|
| Online status | API poll / Arya | Offline > 5 min | Offline > 15 min |
| CPU temperature | API status | > 65°C | > 75°C |
| Storage usage | API status | > 80% full | > 95% full |
| Firmware version | API status | 1 version behind | 2+ versions behind |
| Content version | File hash comparison | Different from expected | No content loaded |
| Uptime | API status / calculated | < 99.5% | < 99.0% |
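
The thresholds in the table above can be encoded as data so that alerting rules live in one place. This is a sketch under assumptions: the metric keys (`cpu_temp_c`, `storage_pct`, and so on) are hypothetical names chosen here, not fields guaranteed by the player API.

```python
from enum import IntEnum


class Severity(IntEnum):
    OK = 0
    WARNING = 1
    CRITICAL = 2


# (warning, critical) limits taken from the table; a value ABOVE the limit triggers.
HIGH_LIMITS = {
    "offline_min": (5.0, 15.0),
    "cpu_temp_c": (65.0, 75.0),
    "storage_pct": (80.0, 95.0),
    "firmware_versions_behind": (0, 1),  # 1 behind = warning, 2+ = critical
}
# For these metrics a value BELOW the limit triggers.
LOW_LIMITS = {"uptime_pct": (99.5, 99.0)}


def classify(metric: str, value: float) -> Severity:
    """Map one metric reading to a severity level."""
    if metric in HIGH_LIMITS:
        warn, crit = HIGH_LIMITS[metric]
        if value > crit:
            return Severity.CRITICAL
        return Severity.WARNING if value > warn else Severity.OK
    if metric in LOW_LIMITS:
        warn, crit = LOW_LIMITS[metric]
        if value < crit:
            return Severity.CRITICAL
        return Severity.WARNING if value < warn else Severity.OK
    raise KeyError(f"no thresholds defined for {metric!r}")


def worst(metrics: dict) -> Severity:
    """Highest severity across all metrics that have thresholds defined."""
    known = (classify(k, v) for k, v in metrics.items()
             if k in HIGH_LIMITS or k in LOW_LIMITS)
    return max(known, default=Severity.OK)
```

For example, `worst({"cpu_temp_c": 70, "storage_pct": 50})` evaluates to `Severity.WARNING`, because the temperature is above 65°C but not above 75°C.
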
Key Parameters
| Parameter | Value | Why It Matters |
|---|---|---|
| Poll interval | 2–5 minutes | Balance detection speed with load |
| Alert channels | Email, Slack, PagerDuty | Reach the right person fast |
| Uptime target | 99.5% standard | Measurable SLA for stakeholders |
| Data retention | 90 days minimum | Trend analysis, SLA reporting |
| Screenshot frequency | Hourly / on-change | Visual verification without overload |
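
To make the uptime target concrete, uptime can be estimated from stored poll results, and the target can be converted into a downtime budget. The helper names below are illustrative, and the estimate assumes evenly spaced polls, so it is only as precise as the poll interval.

```python
def uptime_percent(samples: list[bool]) -> float:
    """Percentage of polls that succeeded; each sample is True if the player answered."""
    if not samples:
        raise ValueError("no samples to evaluate")
    return 100.0 * sum(samples) / len(samples)


def downtime_budget_minutes(target_pct: float, days: int) -> float:
    """Minutes of downtime allowed over `days` while still meeting `target_pct`."""
    return (1 - target_pct / 100) * days * 24 * 60
```

A 99.5% target over 30 days allows roughly 216 minutes (3.6 hours) of downtime per player. At a 5-minute poll interval, 90 days of retention amounts to 25,920 samples per player.
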
Common Mistakes
- No monitoring at all. Learning about offline screens from customer complaints is unacceptable. Implement automated monitoring from day one; even a simple script that pings player IPs every 5 minutes is better than nothing.
- Alert fatigue. Sending email alerts for every 30-second network blip floods inboxes and desensitizes operators. Alert only on sustained outages: warn after 5 minutes offline and escalate to critical after 15 minutes, since transient blips resolve on their own.
- Status-only monitoring. Knowing a player is "online" doesn't confirm content is correct. Add screenshot verification and content version checks to ensure screens show what they should.
- No historical data. Without trend data, you can't calculate uptime SLAs, identify recurring issues, or predict hardware failures. Store monitoring data for at least 90 days for meaningful analysis.
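
The alert-fatigue point can be addressed by alerting on sustained outages and only when the level changes. The tracker below is a minimal sketch of that idea using the 5- and 15-minute thresholds; the class and its method are hypothetical and not part of any SpinetiX tooling. Timestamps are seconds, for example from `time.time()`.

```python
from typing import Optional

WARN_AFTER_S = 5 * 60    # warning once offline for more than 5 minutes
CRIT_AFTER_S = 15 * 60   # critical once offline for more than 15 minutes


class OfflineTracker:
    """Tracks one player's outage and reports only changes in alert level."""

    def __init__(self) -> None:
        self.offline_since: Optional[float] = None
        self.last_level = "ok"

    def update(self, online: bool, now: float) -> Optional[str]:
        """Record one poll result; return the new level if it changed, else None."""
        if online:
            self.offline_since = None
            level = "ok"
        else:
            if self.offline_since is None:
                self.offline_since = now  # first failed poll starts the clock
            elapsed = now - self.offline_since
            if elapsed > CRIT_AFTER_S:
                level = "critical"
            elif elapsed > WARN_AFTER_S:
                level = "warning"
            else:
                level = "ok"
        if level == self.last_level:
            return None  # nothing new to announce
        self.last_level = level
        return level
```

With this design a 30-second blip produces no notification at all, a sustained outage produces at most one warning and one critical alert, and recovery produces a single "ok" message. Appending every poll result to a store kept for at least 90 days covers the historical-data point as well.
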