"Let's move our SCADA to the cloud" is a common executive statement. It's a sound goal, but a pure cloud approach clashes with field reality: does the plant stop when the internet drops? Can an air handling unit (AHU) be controlled with 200 ms of latency? Should a valve-open command really make a round trip through a server in Frankfurt?
This post explains the correct cloud SCADA design pattern — edge + cloud hybrid — based on field experience.
Why pure cloud fails
- Latency: PLC control loops run on 10-100 ms cycles, while cloud round-trips take 100-500 ms. Closed-loop control logic can't tolerate that delay.
- Internet dependency: ADSL drops, fiber cuts, ISP outages — the plant must keep running.
- Data volume: 5,000 tags sampled every second for 30 days is about 13 billion samples. Sending all of it to the cloud drives bandwidth and storage costs up sharply.
- OT security: Exposing field devices directly to the internet is a top attack vector.
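The data-volume figure above is simple arithmetic, and it is worth redoing with your own tag count. A minimal sketch follows; the function name `sample_count` is made up for illustration, and the 1-minute case anticipates the downsampling discussed later in the post.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def sample_count(tags: int, interval_s: int, days: int) -> int:
    """Samples produced by `tags` signals logged once every `interval_s` seconds."""
    return tags * (days * SECONDS_PER_DAY // interval_s)

raw = sample_count(5_000, 1, 30)         # every tag, every second
per_minute = sample_count(5_000, 60, 30)  # same tags, one value per minute

print(f"{raw:,}")         # 12,960,000,000
print(f"{per_minute:,}")  # 216,000,000
```

Going from 1-second to 1-minute resolution cuts the sample count by a factor of 60, which is why the raw stream stays on site.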
The three layers of hybrid architecture
Layer 1: Edge (on-site)
- PLC + I/O
- HMI panels
- Local SCADA / edge runtime — critical control + alarms
- Local time-series buffer (10-30 days)
- Local MQTT broker
This layer survives internet outages. All "life-critical" decisions are made here.
Layer 2: Bridge (edge gateway)
- Reads from local MQTT
- Store & forward during outages
- TLS 1.3 upstream MQTT or HTTPS
- Filtering: downsample 1 s → 1 min, drop non-critical tags
- Edge → cloud one-way only (cloud never writes directly to PLC)
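The store & forward behaviour is the part of the gateway most worth sketching. The example below is one possible shape, not a reference implementation: the class name `StoreAndForward`, the `outbox` table, and the `send` callback are all hypothetical. In a real gateway `send` would wrap the upstream MQTT or HTTPS publish and return `True` only after the cloud acknowledges the message.

```python
import json
import sqlite3
from typing import Callable

class StoreAndForward:
    """Disk-backed FIFO buffer: a message is deleted only after a confirmed send."""

    def __init__(self, path: str = ":memory:") -> None:
        # Assumption: use a file path on the gateway's local disk in production;
        # ":memory:" keeps this sketch self-contained.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, message: dict) -> None:
        self.db.execute(
            "INSERT INTO outbox (payload) VALUES (?)", (json.dumps(message),)
        )
        self.db.commit()

    def pending(self) -> int:
        return self.db.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]

    def flush(self, send: Callable[[dict], bool], batch: int = 500) -> int:
        """Send oldest-first; stop at the first failure so ordering is preserved."""
        rows = self.db.execute(
            "SELECT id, payload FROM outbox ORDER BY id LIMIT ?", (batch,)
        ).fetchall()
        sent = 0
        for row_id, payload in rows:
            if not send(json.loads(payload)):
                break  # link is down; keep this and all later messages
            self.db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            sent += 1
        self.db.commit()
        return sent

# Usage: simulate an outage, then recovery.
buf = StoreAndForward()
for i in range(3):
    buf.enqueue({"tag": "ahu1/supply_temp", "ts": 1_700_000_000 + i, "value": 18.4})

during_outage = buf.flush(lambda msg: False)  # nothing leaves the site
delivered = []
after_recovery = buf.flush(lambda msg: delivered.append(msg) is None)
print(during_outage, after_recovery, buf.pending())
```

Note that the buffer only ever pushes data upstream. Nothing in it accepts commands from the cloud, which matches the one-way rule in the list above.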
Layer 3: Cloud
- Long-term history (Timestream, InfluxDB Cloud, Azure Time-Series)
- Cross-site aggregation
- Analytics (anomaly detection, predictive maintenance)
- Dashboards (Grafana, Power BI)
- Mobile access, multi-user
- ERP / accounting integration
Which data goes where?
| Data type | Edge | Cloud |
|---|---|---|
| Real-time control decisions | ✓ | — |
| Instant alarms | ✓ | copy |
| Second-level trend (30 days) | ✓ | — |
| Minute aggregates (years) | — | ✓ |
| Cross-site comparison | — | ✓ |
| ML training | — | ✓ |
| Management reports | — | ✓ |
Migration roadmap
- Harden the edge. Make sure existing SCADA runs stably. Add Linux runtime + redundancy if needed.
- Install MQTT broker on-site. All telemetry flows through it.
- Deploy an edge gateway with store & forward (Telegraf, Node-RED, or custom).
- Stand up cloud side — time-series DB + dashboard. Read-only initially.
- Run a 2-3 month pilot. Add mobile, reporting, alarm notifications.
- Roll out to other sites. Same gateway template, central config management.
Controlling cloud cost
- Data transfer: inbound traffic (site → cloud) is typically free; outbound traffic (cloud → site or internet) is what's billed.
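- API call count: many services price per request. 5,000 tags each sent as a separate POST every second is about 13 billion requests/month, so batch many values into each request.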
- Storage: raw 1-second data balloons fast. Aggregation + compression are mandatory.
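To make the aggregation and request-count points concrete, here is a small sketch. It assumes mean aggregation for the 1 s → 1 min downsample (the post does not prescribe a method; many sites also keep min and max), and the function names and the 500-values-per-request batch size are illustrative only.

```python
from collections import defaultdict
from statistics import fmean

def downsample_1min(samples):
    """Average (tag, epoch_seconds, value) samples into 1-minute buckets per tag."""
    buckets = defaultdict(list)
    for tag, ts, value in samples:
        buckets[(tag, ts - ts % 60)].append(value)  # floor timestamp to the minute
    return [(tag, minute, fmean(vals)) for (tag, minute), vals in sorted(buckets.items())]

def monthly_requests(tags, interval_s, values_per_request, days=30):
    """Requests per month when each request carries up to `values_per_request` values."""
    requests_per_interval = -(-tags // values_per_request)  # ceiling division
    return requests_per_interval * (days * 86_400 // interval_s)

samples = [("ahu1/supply_temp", 0, 1.0),
           ("ahu1/supply_temp", 30, 3.0),
           ("ahu1/supply_temp", 60, 5.0)]
print(downsample_1min(samples))
# [('ahu1/supply_temp', 0, 2.0), ('ahu1/supply_temp', 60, 5.0)]

print(f"{monthly_requests(5_000, 1, 1):,}")     # 12,960,000,000 (one POST per value)
print(f"{monthly_requests(5_000, 60, 500):,}")  # 432,000 (1-min data, 500 values per POST)
```

Under these assumptions, downsampling plus batching takes the request count from roughly 13 billion to under half a million per month.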
If you're considering cloud migration, share your tag count + site count + retention needs — we can produce a cloud cost projection and hybrid architecture.