Understanding AVEVA CONNECT for PI professionals
Why this matters

Cloud projects are often presented as mandates rather than technical decisions. PI administrators and OT/IT architects need to know how to use AVEVA CONNECT without disrupting reliable data collection, predictable performance, controlled change and clear accountability.
This guide is practitioner‑focused: it explains AVEVA CONNECT concepts relevant to the PI System and outlines realistic hybrid strategies, highlighting trade-offs in latency, cost, security and data ownership. Links point to deeper PIAdmin.com resources.
What AVEVA CONNECT is (and isn’t)

AVEVA CONNECT is AVEVA’s cloud platform for data access, sharing and application integration. For PI practitioners, the relevant service is CONNECT Data Services (CDS), a cloud‑native service for time‑series and contextualised data.
Keep in mind:
- AVEVA CONNECT is a platform; CDS is one data service you may integrate with.
- The PI Data Archive, AF, local interfaces and operational processes often remain the operational system of record in a hybrid model.
- Cloud adoption changes the operating model: identity, access control, network patterns, monitoring and cost governance matter as much as data modelling.
Related reading: security should be a first‑class design constraint. See Securing the AVEVA PI System in Modern Enterprise Environments.
CDS versus PI Data Archive — a practical comparison

Treat CDS and the PI Data Archive as different systems optimised for different roles.
PI Data Archive (on‑prem) — summary

A proven, high‑throughput historian designed for deterministic performance on plant networks and close integration with PI interfaces and AF.

CONNECT Data Services (cloud) — summary

A cloud‑native service intended for scalable access, sharing and application integration across sites and organisations, typically fed from on‑prem or edge sources.
Practical differences
- Ingestion
- PI Data Archive: typically fed directly by local interfaces and buffers near the source.
- CDS: typically populated by publishing/replication from on‑prem/edge through controlled egress.
- Query behaviour
- PI Data Archive: optimised for operational use (shift reports, dashboards) over local networks.
- CDS: supports enterprise and analytics access patterns but must be designed for Internet/cloud latency and tenancy controls.
- Administration and troubleshooting
- On‑prem: focus on interfaces, buffers, Windows services, logs and AD.
- Cloud: focus on identity providers, token‑based access, outbound connectivity, service health and cost monitoring.
- Data model alignment
- If AF is your operational context model, decide how that context is represented and governed in the cloud.
- Agree naming, units and asset identity early — semantics cause most cloud problems.
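The alignment point above can be sketched as a small translation table applied before publication. This is a minimal illustration: the tag names, unit conversions and the `SITE_MAP`/`UNIT_FACTORS` structures are assumptions for the sketch, not part of any AVEVA API.

```python
# Sketch: map site-local tag names and units onto a shared canonical
# identity before publishing. Mapping tables are illustrative.

UNIT_FACTORS = {
    ("degF", "degC"): lambda v: (v - 32) * 5 / 9,
    ("kPa", "bar"): lambda v: v / 100,
}

SITE_MAP = {
    # site-local tag -> (canonical asset.attribute, canonical unit)
    "SITE1.TT-101.PV": ("PlantA.Reactor1.Temperature", "degC"),
}

def to_canonical(tag: str, value: float, unit: str):
    """Translate one site-local sample into the agreed enterprise identity."""
    canonical, target_unit = SITE_MAP[tag]
    if unit != target_unit:
        value = UNIT_FACTORS[(unit, target_unit)](value)
    return canonical, value, target_unit

print(to_canonical("SITE1.TT-101.PV", 212.0, "degF"))
# ('PlantA.Reactor1.Temperature', 100.0, 'degC')
```

Agreeing this table is the governance work; the code itself is trivial, which is exactly why semantics, not mechanics, dominate cloud integration effort.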
Hybrid PI architectures — common patterns

Most estates adopt a hybrid approach: local systems remain the operational record while the cloud supports cross‑site consumption and analytics.
Pattern A — On‑prem historian, cloud consumption
- Archive remains primary; a subset of tags/assets is replicated to CDS for enterprise reporting.
Why it works: minimal OT disruption.
Watch‑out: governance for published scope.
Pattern B — Site historians + central cloud layer
- Each site keeps local PI; CDS provides consolidated views for corporate teams.
Why it works: respects site autonomy and latency.
Watch‑out: asset identity and time alignment across sites.
Pattern C — Edge collection + cloud‑first for new projects
- New projects publish to CDS (directly or via edge); legacy stays on‑prem.
Why it works: incremental adoption.
Watch‑out: two operating models to support.
Pattern D — Cloud analytics, on‑prem operations
- Cloud holds derived/curated datasets (KPIs, aggregates); on‑prem retains full fidelity.
Why it works: cost control and clear purpose.
Watch‑out: maintain traceability from KPIs back to raw signals.
Latency, cost and data ownership — what to decide early

Address these three early and document the decisions.
Latency

Consider:
- Acquisition latency: how quickly values appear after they occur.
- Query latency: how quickly users/apps retrieve data.
- Decision latency: operational impact (alarms, quality release).
Rules:
- Keep operator‑critical use cases local or off the public Internet.
- Use cloud for use cases tolerant of seconds/minutes delay (fleet analytics, weekly reports).
- Classify use cases as real‑time operational, near‑real‑time, or batch and map them to appropriate paths.
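The classification rule above can be sketched as a simple triage function. The staleness thresholds are illustrative defaults for the sketch, not product limits; substitute your own operational criteria.

```python
# Sketch: map a use case's tolerated staleness to a data path.
# Thresholds are illustrative, not product limits.

def classify(tolerated_staleness_s: float) -> str:
    """Return the latency class and the implied data path."""
    if tolerated_staleness_s < 5:
        return "real-time operational -> keep local (PI Data Archive)"
    if tolerated_staleness_s < 300:
        return "near-real-time -> publish via edge/DMZ to CDS"
    return "batch -> scheduled aggregate publish to CDS"

for staleness in (1, 60, 3600):
    print(f"{staleness:>5}s tolerated: {classify(staleness)}")
```

Capturing the rule in one place, however simple, forces each use case to state its tolerance explicitly instead of defaulting to "real time".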
Cost

Major cost drivers:
- Data movement (egress/ingress)
- Storage retention (raw vs aggregated)
- Query volume (who reads what, how often)
- Shadow usage (untracked dashboarding)
Control techniques:
- Publish only justified data.
- Prefer aggregates where raw fidelity isn’t needed.
- Apply retention policies aligned to business value.
- Implement monitoring and chargeback/showback.
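A back-of-envelope model makes these drivers concrete before procurement conversations. All rates below are hypothetical placeholders, not AVEVA pricing; the point is the relative effect of publishing raw data versus aggregates.

```python
# Sketch: rough monthly cost of publishing raw 1 Hz data vs 1-minute
# aggregates. All rates are hypothetical placeholders.

STORAGE_PER_GB = 0.05   # hypothetical $/GB-month
EGRESS_PER_GB = 0.09    # hypothetical $/GB read out

def monthly_cost(tags: int, samples_per_tag_per_day: float,
                 bytes_per_sample: int = 32, reads_ratio: float = 2.0) -> float:
    """Storage plus read-out cost for one month of published data."""
    gb = tags * samples_per_tag_per_day * 30 * bytes_per_sample / 1e9
    return gb * STORAGE_PER_GB + gb * reads_ratio * EGRESS_PER_GB

raw = monthly_cost(tags=50_000, samples_per_tag_per_day=86_400)  # 1 Hz raw
agg = monthly_cost(tags=50_000, samples_per_tag_per_day=1_440)   # 1-min aggregates
print(f"raw ~ ${raw:,.0f}/month, aggregates ~ ${agg:,.0f}/month")
```

Because cost scales linearly with sample volume here, the 60x reduction from 1 Hz to 1-minute aggregates carries straight through to the bill.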
Data ownership

Define roles for:
- Legal ownership (contracts)
- Operational stewardship (signal meaning and quality)
- Access rights (who can read/export/share)
- Accountability (on‑call responsibilities)
Typical pragmatic model:
- Site OT owns signal meaning and quality.
- Central IT/cloud team owns identity, access frameworks and monitoring.
- A named data product owner manages enterprise datasets.
Document which datasets are authoritative, who approves schema/context changes, and how sensor changes are handled.
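One way to document these decisions is a machine-readable record per dataset. The field names and example values below are illustrative assumptions, not a standard schema.

```python
# Sketch: a per-dataset ownership record capturing the decisions above.
# Field names and example values are illustrative.

from dataclasses import dataclass

@dataclass
class DatasetRecord:
    name: str
    authoritative_source: str   # which system is the system of record
    steward: str                # operational stewardship (meaning and quality)
    product_owner: str          # named enterprise data product owner
    access_roles: tuple         # who may read/export/share
    schema_change_approver: str

energy_kpis = DatasetRecord(
    name="fleet-energy-kpis",
    authoritative_source="PI Data Archive (Site A)",
    steward="Site A OT team",
    product_owner="J. Doe (hypothetical)",
    access_roles=("enterprise-analytics", "energy-managers"),
    schema_change_approver="data product owner",
)
print(energy_kpis.name, "->", energy_kpis.steward)
```

Keeping records like this in version control gives you an audit trail for schema and stewardship changes for free.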
Security implications — identity, network and blast radius

Cloud expands the threat surface; treat security as architecture.
Identity and access control

Expect:
- A shift from local AD group patterns to identity providers and token‑based access.
- Stronger lifecycle controls for joiners/movers/leavers.
Establish:
- Role‑based access (operational vs analytics vs vendor)
- Least privilege with time‑bound elevation
- Audit trails for access
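Least privilege with time-bound elevation can be sketched as below. In a real deployment this logic belongs in the identity provider, not application code; the role and permission names are illustrative.

```python
# Sketch: role-based access with time-bound elevation.
# Roles and permissions are illustrative.

import time

BASE_ROLES = {
    "analyst": {"read:aggregates"},
    "ot-admin": {"read:raw", "write:config"},
}

_elevations = {}  # user -> (extra permission, expiry epoch seconds)

def elevate(user: str, permission: str, seconds: int, now: float) -> None:
    """Grant one extra permission that expires automatically."""
    _elevations[user] = (permission, now + seconds)

def allowed(user: str, role: str, permission: str, now: float) -> bool:
    if permission in BASE_ROLES.get(role, set()):
        return True
    extra = _elevations.get(user)
    return bool(extra and extra[0] == permission and now < extra[1])

now = time.time()
elevate("alice", "write:config", seconds=3600, now=now)
assert allowed("alice", "analyst", "write:config", now)            # elevated
assert not allowed("alice", "analyst", "write:config", now + 7200)  # expired
```

The key property is that elevation expires on its own; nobody has to remember to revoke it.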
Start with Securing the AVEVA PI System in Modern Enterprise Environments.
Network and segmentation

Avoid:
- Overly broad outbound rules
- Bypassing DMZ patterns
- Mixing OT/IT trust zones without clear routing and inspection
Good practice:
- Minimise OT exposure; prefer controlled outbound connectivity
- Use layered zones (OT → DMZ/edge → cloud)
- Log and monitor outbound flows
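Monitoring outbound flows largely reduces to comparing observed connections against an explicit allowlist. The hostnames and flow-log shape below are illustrative assumptions.

```python
# Sketch: flag outbound flows that are not on the explicit egress
# allowlist. Hostnames and log format are illustrative.

ALLOWED_EGRESS = {
    ("dmz-edge-01", "cds.example-tenant.net", 443),  # the one sanctioned path
}

flow_log = [
    ("dmz-edge-01", "cds.example-tenant.net", 443),
    ("hmi-station-7", "unknown-host.example", 8080),  # should never leave OT
]

violations = [flow for flow in flow_log if flow not in ALLOWED_EGRESS]
for src, dst, port in violations:
    print(f"ALERT: unexpected outbound flow {src} -> {dst}:{port}")
```

An allowlist this small is the goal: one controlled egress path from the DMZ/edge layer, and everything else is an alert.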
Blast radius and resilience

Ask:
- What fails if cloud connectivity drops?
- What can an attacker access if credentials are compromised?
- How quickly can you revoke access and rotate secrets?
Design for graceful degradation, clear incident runbooks and separation of duties between OT, IT and vendors.
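For publishing, graceful degradation usually means store-and-forward at the edge. A minimal sketch follows, with `send_to_cloud` standing in for whatever publish mechanism you use; it is not an AVEVA API.

```python
# Sketch: buffer events locally while the cloud link is down, drain the
# backlog in order when it returns. `send_to_cloud` is a stand-in.

from collections import deque

class BufferedPublisher:
    def __init__(self, send_to_cloud, max_buffer: int = 100_000):
        self._send = send_to_cloud
        self._buffer = deque(maxlen=max_buffer)  # oldest dropped if full

    def publish(self, event) -> None:
        self._buffer.append(event)
        self.flush()

    def flush(self) -> None:
        while self._buffer:
            try:
                self._send(self._buffer[0])
            except ConnectionError:
                return  # link down: keep buffering, plant keeps running
            self._buffer.popleft()

# Simulated outage: the first send fails, then the link recovers.
sent, link_up = [], [False]
def send_to_cloud(event):
    if not link_up[0]:
        raise ConnectionError
    sent.append(event)

pub = BufferedPublisher(send_to_cloud)
pub.publish({"tag": "TT-101", "value": 98.6})   # buffered during outage
link_up[0] = True
pub.publish({"tag": "TT-101", "value": 99.1})   # backlog drained in order
assert len(sent) == 2
```

The bounded buffer is a deliberate trade-off: it caps local disk/memory use, so the oldest-data-dropped behaviour must be agreed with consumers up front.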
When cloud makes sense — succinct guidance

Use cloud when the use case is explicit.
Strong reasons to use AVEVA CONNECT
- Enterprise consumption of curated datasets across sites.
- Controlled sharing with partners/vendors without opening plant networks.
- Elastic analytics for heavy compute peaks.
- Standardisation of access for internal application teams.
Weak reasons (warning signs)
- “Modernise” with no defined consumers or value stream.
- “Replace PI” because it’s old, ignoring PI’s operational guarantees.
- Using cloud to avoid data modelling — cloud increases reliance on shared meaning.
Use‑case triage checklist

Confirm:
- Consumers and their roles
- Required freshness (seconds/minutes/hours)
- Required fidelity (raw vs aggregates)
- Failure mode (what happens if the link is down)
- End‑to‑end data quality ownership
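The checklist can be enforced as a record that must be complete before a use case is approved for publication. Field names here are illustrative.

```python
# Sketch: the triage checklist as a record that must be fully populated
# before approval. Field names are illustrative.

from dataclasses import dataclass, fields

@dataclass
class UseCase:
    consumers: str      # who reads the data, in what role
    freshness: str      # seconds / minutes / hours
    fidelity: str       # raw vs aggregates
    failure_mode: str   # behaviour when the link is down
    quality_owner: str  # end-to-end data quality ownership

def triage_complete(uc: UseCase) -> bool:
    """Approve only when every checklist field is filled in."""
    return all(getattr(uc, f.name).strip() for f in fields(uc))

fleet_report = UseCase(
    consumers="reliability engineers",
    freshness="hours",
    fidelity="hourly aggregates",
    failure_mode="report delayed, operations unaffected",
    quality_owner="",  # not yet assigned: triage fails
)
print("approved:", triage_complete(fleet_report))
```

Blank fields are the warning signs from the previous section made mechanical: "modernise" with no named consumer simply cannot pass triage.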
Migration considerations — separate workstreams

Treat consumption, data, operations and governance migrations as separate streams.
Start with publishing scope, not platform scope

Begin with a controlled thin slice:
- One site, one process area, a small set of well‑understood tags.
- Publish a dataset with clear consumers and acceptance criteria.
- Validate latency, cost and security in real conditions.
Backfill and history

Backfill is often unnecessary and costly. Options:
- No backfill (forward‑looking only)
- Limited backfill (e.g. last 13 months)
- Curated backfill (selected assets/KPIs)
- Full backfill (rare; requires clear business/regulatory drivers)
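The limited-backfill option is easy to size with simple arithmetic. The 30-day month below is a deliberately coarse approximation for estimation only.

```python
# Sketch: size a limited backfill window (e.g. last 13 months) and
# estimate the event count to be loaded. 30-day months are a coarse
# approximation for estimation purposes.

from datetime import datetime, timedelta

def backfill_start(now: datetime, months: int = 13) -> datetime:
    """Earliest timestamp to load for the chosen window."""
    return now - timedelta(days=months * 30)

def backfill_events(tags: int, samples_per_tag_per_day: float,
                    months: int = 13) -> int:
    """Rough number of events the backfill will move."""
    return int(tags * samples_per_tag_per_day * months * 30)

start = backfill_start(datetime(2025, 1, 1))
count = backfill_events(tags=5_000, samples_per_tag_per_day=1_440)
print(f"load from {start.date()}: ~{count:,} events")
```

Even this modest example (5,000 tags at 1-minute resolution) moves billions of events, which is why backfill deserves its own cost and duration estimate rather than being assumed.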
Operational readiness

Define runbooks for monitoring, alerting, recovery and change control:
- What signals indicate publishing delays or failures?
- Who is paged and under what conditions?
- How is data completeness validated?
- How are schema/context changes deployed?
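Completeness validation usually reduces to comparing expected sample counts against what actually arrived in each interval. The tags, counts and threshold below are illustrative.

```python
# Sketch: per-tag completeness check for one hourly interval.
# Expected counts assume 1-minute data; values are illustrative.

EXPECTED_PER_HOUR = {"TT-101": 60, "FT-202": 60}

def completeness(received_counts: dict, threshold: float = 0.99) -> dict:
    """Return tags whose received/expected ratio falls below threshold."""
    gaps = {}
    for tag, expected in EXPECTED_PER_HOUR.items():
        ratio = received_counts.get(tag, 0) / expected
        if ratio < threshold:
            gaps[tag] = ratio
    return gaps

print(completeness({"TT-101": 60, "FT-202": 41}))  # FT-202 below threshold
```

A check like this, run per interval, gives the paging rule its signal: alert when a tag's completeness ratio stays below threshold for longer than the agreed tolerance.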
Preparing your PI estate — practical prework

Actions you can take now to reduce future disruption.
- Normalise AF
- Standardise naming, units and attribute conventions.
- Reduce duplicate equipment representations and use templates.
- Apply a data‑contract mindset: signal meaning, unit, range, steward.
- Rationalise tags and improve signal quality
- Identify noisy or unused tags.
- Classify tags into tiers: critical, important, long‑tail.
- Fix anti‑patterns (free‑text units, inconsistent descriptors).
- Map network zones and trust boundaries
- Understand firewall ownership and change processes.
- Define an edge/DMZ strategy consistent with your security posture.
See: Securing the AVEVA PI System in Modern Enterprise Environments
- Modernise access patterns gradually
- Keep proven operational clients stable.
- Introduce API‑based access for new consumers.
- Avoid uncontrolled direct access to the Data Archive from corporate networks.
- Establish a cloud operating model
- Define who administers identity and roles, approves published datasets, pays for consumption and supports incidents.
- Document and socialise the model early.
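The tag-tiering step above can be sketched as a scoring rule over simple usage statistics. The rules, thresholds and sample tags are illustrative assumptions.

```python
# Sketch: classify tags into critical / important / long-tail tiers
# from usage statistics. Rules and sample tags are illustrative.

def tier(tag: dict) -> str:
    """Assign a tier from operational use and read frequency."""
    if tag["used_in_alarms"] or tag["used_in_quality_release"]:
        return "critical"
    if tag["reads_per_day"] >= 10:
        return "important"
    return "long-tail"

tags = [
    {"name": "TT-101.PV", "used_in_alarms": True,
     "used_in_quality_release": False, "reads_per_day": 500},
    {"name": "SPARE-017", "used_in_alarms": False,
     "used_in_quality_release": False, "reads_per_day": 0},
]
print({t["name"]: tier(t) for t in tags})
```

The tiers then drive everything downstream: critical tags stay local, important tags are candidates for curated publication, and the long tail is reviewed for retirement rather than replicated by default.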
A concise CONNECT‑aligned strategy statement

Use this five‑point statement with stakeholders:
- Keep OT‑critical acquisition and operations local (PI Data Archive remains the operational historian).
- Publish curated datasets to CONNECT Data Services for enterprise consumption.
- Design for failure so cloud outages do not stop the plant.
- Govern meaning and access (AF consistency, identity and data stewardship).
- Control cost through selective publication, retention rules and aggregates.