Keeping PI fast, stable and predictable at scale
Meta description: Practical guidance to improve PI performance at scale: archives, AF, Event Frames, PI Vision load, ingestion, monitoring, tuning and future-proof design.
Purpose
“PI performance” is rarely a single problem with a single fix. It emerges from many design choices across ingestion, archives, query patterns, AF modelling, visualisation, virtualisation, storage and operational hygiene.
This guide is an evergreen field guide for PI admins, engineers and OT/IT architects who need PI to remain fast, stable and predictable as data volumes, users and integrations grow. It focuses on pragmatic trade-offs and operational realities.
Scope note: PI deployments vary. Use this as a framework: measure, identify the constraint, then change one thing at a time.
What “performance” means
Performance means consistently meeting user and system expectations under real workloads.
Key outcomes:
- Low latency — new values, calculations and rollups appear within predictable windows.
- High throughput — ingest and queries meet demand during peaks.
- Reliability — no “brown-outs”; predictable behaviour during failover, patching and maintenance.
- Operational predictability — evidence-based capacity planning; costs scale in a controlled way.
Performance is end-to-end: interface → PI Data Archive → AF Server → PI Vision/custom app, plus authentication, DNS, load balancers and storage. A bottleneck anywhere makes “PI slow”.
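A quick way to make "end-to-end" concrete is to time the two halves of a simple read from a client machine: resolving a tag and retrieving its current value. The sketch below assumes PI Web API is enabled and uses its standard points and streams routes; the base URL, tag path and authentication are placeholders for your environment.

```python
"""Rough end-to-end latency probe: resolve a tag, then read its current value
through PI Web API. A minimal sketch; base URL, tag path and authentication are
placeholders, and it assumes the standard /points and /streams routes exist."""
import time
import requests

BASE = "https://piwebapi.example.com/piwebapi"   # hypothetical PI Web API host
TAG_PATH = r"\\MyDataArchive\SINUSOID"           # hypothetical tag path
session = requests.Session()                     # add auth (Kerberos/Basic) per your site policy

def probe_once() -> dict:
    timings = {}
    t0 = time.perf_counter()
    point = session.get(f"{BASE}/points", params={"path": TAG_PATH}, timeout=10).json()
    timings["resolve_ms"] = (time.perf_counter() - t0) * 1000

    t1 = time.perf_counter()
    session.get(f"{BASE}/streams/{point['WebId']}/value", timeout=10).json()
    timings["read_ms"] = (time.perf_counter() - t1) * 1000
    timings["total_ms"] = timings["resolve_ms"] + timings["read_ms"]
    return timings

if __name__ == "__main__":
    samples = [probe_once() for _ in range(5)]
    worst = max(s["total_ms"] for s in samples)
    print(f"worst-case end-to-end read: {worst:.0f} ms over {len(samples)} samples")
```

Run it from the locations users actually sit in: a probe that is fast next to the server but slow from a remote site points at the network or authentication path rather than the Data Archive.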
If you are shaping the landscape, start with Designing a Scalable and Resilient PI System Architecture.
Common bottlenecks and symptoms
- Disk and storage latency (a frequent silent limiter)
  - Symptoms: archive reads slow under load, UI trend pauses, query time rises without CPU increase.
  - Causes: IO-heavy archives, virtual storage contention, antivirus/backup scans.
  - First checks: measure host and guest storage latency (see the sampling sketch after this list); exclude archive directories and PI binaries from aggressive scans where policy allows.
  Don’t guess: IO waits can mimic CPU problems.
- CPU saturation (Archive, AF, PI Vision or client tier)
  - Symptoms: queued Archive/AF requests, timed‑out UI requests, single-component CPU spikes.
  - Causes: wide query fan‑out, analysis schedule spikes, high PI Vision concurrency with heavy symbols.
- Network and name-resolution issues
  - Symptoms: intermittent slowness by client location, slow first queries (DNS/Kerberos), cross‑domain auth delays.
  - Causes: DNS misconfiguration/timeouts, MTU/packet loss, undersized proxies/load balancers.
- Chatty/query-pattern workloads
  - Symptoms: many small, individually fast calls that add up to a high total load.
  - Examples: dashboards issuing independent series calls instead of consolidated queries; apps polling too frequently.
- Misaligned scaling strategy
  - Symptoms: adding CPU/RAM has little effect or only temporary benefit.
  - Reason: another bottleneck (storage, single‑threaded work, locking or design) limits scaling.
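To measure storage latency rather than infer it, sample per-disk IO counters over a short window on the host that serves the archives. A minimal sketch using psutil; the device name is an assumption, so list psutil.disk_io_counters(perdisk=True) first to find the disk that holds your archive files.

```python
"""Sample average disk IO latency on the archive volume over a short window.
A minimal sketch using psutil; the device name below is an assumption."""
import time
import psutil

DEVICE = "PhysicalDrive1"   # hypothetical: the disk that hosts the PI archives
INTERVAL_S = 30

before = psutil.disk_io_counters(perdisk=True)[DEVICE]
time.sleep(INTERVAL_S)
after = psutil.disk_io_counters(perdisk=True)[DEVICE]

reads = after.read_count - before.read_count
writes = after.write_count - before.write_count
read_ms = after.read_time - before.read_time     # cumulative ms spent on reads
write_ms = after.write_time - before.write_time

print(f"reads: {reads} ops, avg {read_ms / max(reads, 1):.1f} ms/op")
print(f"writes: {writes} ops, avg {write_ms / max(writes, 1):.1f} ms/op")
```

Compare the averages against your own baseline and against busy periods; latency that climbs under load while CPU stays flat is the storage-bound pattern described above.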
For ingestion design trade-offs, see How Data Gets Into the PI System: Interfaces, Adapters, and MQTT.
Archive sizing and retention
Archive sizing affects search performance, backup/restore time and operability.
What matters:
- Event density (values/day across points).
- Compression and exception settings.
- Access patterns (typical query windows).
- Retention obligations (regulatory, operational, analytics).
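Event density translates directly into disk growth, and a back-of-the-envelope estimate is usually enough to frame retention decisions. The rates and bytes-per-event below are illustrative assumptions; measure your own post-compression figures before committing to numbers.

```python
"""Back-of-the-envelope archive growth estimate. All numbers are illustrative
assumptions; measure your own post-compression event rates and bytes per event."""
TAG_COUNT = 50_000
AVG_EVENTS_PER_TAG_PER_DAY = 1_500      # after exception/compression (assumption)
BYTES_PER_EVENT = 12                    # rough planning figure, varies by point type

events_per_day = TAG_COUNT * AVG_EVENTS_PER_TAG_PER_DAY
bytes_per_day = events_per_day * BYTES_PER_EVENT

gb_per_day = bytes_per_day / 1024**3
print(f"{events_per_day:,} events/day ≈ {gb_per_day:.1f} GB/day "
      f"≈ {gb_per_day * 30:.0f} GB/month ≈ {gb_per_day * 365 / 1024:.1f} TB/year")
```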
Practical retention strategy:
- Hot window — days to weeks for interactive queries; optimise IO and backup windows.
- Warm window — months for ad‑hoc analysis; accept slightly slower performance.
- Cold strategy — long-term retention via online archives, external export or aggregated summaries.
Principles:
- Choose archive file sizes that match backup/restore capabilities.
- Avoid very small files that increase churn and very large files that complicate operations.
- Put archives on predictable, low‑latency disks — latency matters more than raw capacity.
Compression/exception tuning reduces stored events and downstream load but may affect data fidelity. Treat these settings as engineering decisions per measurement type.
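To see why these settings are per-measurement engineering decisions, it helps to simulate how a deviation band thins out a stream. The sketch below is a simplified illustration of exception-style filtering, not a faithful re-implementation of interface exception reporting or archive compression.

```python
"""Simplified simulation of exception-style filtering: keep a value only if it
moves more than a deviation threshold from the last kept value, or if a maximum
time between reports has elapsed. A sketch of the idea only."""
import math

def exception_filter(samples, exc_dev, exc_max_s):
    kept = []
    last_value, last_time = None, None
    for t, v in samples:
        if (last_value is None
                or abs(v - last_value) > exc_dev
                or t - last_time >= exc_max_s):
            kept.append((t, v))
            last_value, last_time = v, t
    return kept

# 1 Hz sine-wave signal for an hour; see how much a small deviation band removes
signal = [(t, 100 * math.sin(t / 300)) for t in range(3600)]
for dev in (0.0, 0.5, 2.0):
    kept = exception_filter(signal, exc_dev=dev, exc_max_s=600)
    print(f"exc_dev={dev:>4}: kept {len(kept)}/{len(signal)} events")
```

Wider bands cut stored events and downstream load, but they also flatten small excursions, which may matter for some measurements and not at all for others.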
AF and Event Frame performance
AF performance depends on database design, attribute references, analysis schedules and search behaviour.
AF pain points and mitigations:
- Complex attribute references: slow loads and high AF CPU. Keep reference logic simple, standardise patterns and minimise expensive lookups.
- Inefficient hierarchy/searches: long searches and AF spikes. Design naming/hierarchy for common searches, use categories/templates, limit wildcards.
- Analysis scheduling spikes: CPU spikes at schedule boundaries. Stagger schedules, prefer event‑triggered analyses where sensible, and retire unused analyses.
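A simple way to stagger periodic work is to give each analysis a stable offset within its scheduling period, for example derived from a hash of its name. The sketch below only computes the offsets; how you apply them depends on how your analyses are scheduled.

```python
"""Spread periodic analyses across their scheduling period instead of firing them
all at the boundary. Sketch only: it derives stable offsets from analysis names;
applying them is left to whatever scheduling mechanism you use."""
import hashlib

def stagger_offset(analysis_name: str, period_s: int) -> int:
    """Deterministic offset in [0, period_s) for a given analysis name."""
    digest = hashlib.sha256(analysis_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % period_s

analyses = ["Pump01.Efficiency", "Pump02.Efficiency", "Unit1.HeatBalance", "Unit2.HeatBalance"]
for name in analyses:
    print(f"{name}: run at boundary + {stagger_offset(name, period_s=300)} s")
```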
Event Frame pitfalls:
- High EF generation rate (many short events), heavy templates (many captured attributes), broad frequent searches, and indefinite retention cause performance and operability issues.
Guidance: be deliberate on EF retention; minimise captured attributes; summarise older EF data when detailed frames are rarely queried.
PI Vision load and concurrency
PI Vision “displays” are workloads: number of symbols × refresh rate × time ranges × concurrent viewers. A few heavy displays can dominate system load.
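That multiplication is worth doing explicitly when a display is reviewed. The estimator below treats each symbol refresh as one back-end request, which is a simplification (PI Vision consolidates some calls), but the proportionality is the point; the numbers are illustrative.

```python
"""Rough request-load estimate for a display: symbols × viewers ÷ refresh interval.
Illustrative numbers; the point is that one busy display viewed by many people
can dominate back-end query load."""
def display_requests_per_second(symbols: int, refresh_s: float, viewers: int) -> float:
    return symbols * viewers / refresh_s

overview = display_requests_per_second(symbols=40, refresh_s=5, viewers=25)   # wallboard-style
deep_dive = display_requests_per_second(symbols=12, refresh_s=30, viewers=3)

print(f"overview display ≈ {overview:.0f} requests/s, deep-dive ≈ {deep_dive:.1f} requests/s")
```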
Common traps and mitigations:
- Too many time series per display: split overview and deep‑dive displays, shorten default ranges, use summary statistics with drill‑down (see the consolidation sketch after this list).
- High refresh rates everywhere: reserve fast refresh for true operational needs; use moderate refresh elsewhere. Wallboards should use fewer elements and calls.
- Overuse of dynamic AF content: standardise templates, reduce one‑off patterns, and load‑test representative displays.
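Where a display or custom app issues one call per series, consolidating them into a single round trip reduces both network chatter and back-end queueing. A minimal sketch assuming PI Web API's batch controller and stream summary routes are available; the base URL and WebIds are placeholders.

```python
"""Consolidate many independent series reads into one round trip. A minimal
sketch assuming PI Web API's batch controller; base URL, WebIds and
authentication are placeholders for your environment."""
import requests

BASE = "https://piwebapi.example.com/piwebapi"       # hypothetical
WEB_IDS = ["P0abc", "P0def", "P0ghi"]                # hypothetical stream WebIds

batch_body = {
    f"series{i}": {
        "Method": "GET",
        "Resource": f"{BASE}/streams/{web_id}/summary?summaryType=Average&startTime=*-8h&endTime=*",
    }
    for i, web_id in enumerate(WEB_IDS)
}

# One HTTP round trip instead of len(WEB_IDS) separate calls
response = requests.post(f"{BASE}/batch", json=batch_body, timeout=30)
for key, result in response.json().items():
    print(key, result.get("Status"))
```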
High‑volume ingestion impacts
High-volume ingestion narrows the margin for error and puts pressure on archives, snapshots, queries, calculations, maintenance and recovery.
Operational impacts:
- Backlog risk: sustained ingest above capacity builds queues, causing missing recent data and delayed analytics.
- More points increase search and attribute resolution overhead, EF/analysis potential, and security object management.
- Misconfigured exception/compression settings scale badly when replicated across many points.
Protections:
- Enforce standard tag configurations per measurement class (see the validation sketch after this list).
- Separate engineering‑truth streams from high‑frequency “nice‑to‑have” streams.
- Validate time synchronisation and scan class settings to avoid bursty loads.
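Standard configurations are easier to enforce when the check is automated. The sketch below compares point attributes against a per-class standard; the classes, attribute names and values are illustrative, and the attribute data would come from your own export or API of choice.

```python
"""Check point configurations against a standard per measurement class.
Classes, attribute names and values are illustrative assumptions."""
MEASUREMENT_CLASSES = {
    "temperature": {"pointtype": "Float32", "excdev": 0.1, "compdev": 0.2, "scan": 1},
    "flow":        {"pointtype": "Float32", "excdev": 0.5, "compdev": 1.0, "scan": 1},
}

def validate_point(tag: str, attrs: dict, class_name: str) -> list[str]:
    """Return a list of deviations from the class standard for one point."""
    expected = MEASUREMENT_CLASSES[class_name]
    return [
        f"{tag}: {key} is {attrs.get(key)!r}, standard is {value!r}"
        for key, value in expected.items()
        if attrs.get(key) != value
    ]

issues = validate_point(
    "FIC-101.PV",
    {"pointtype": "Float32", "excdev": 0.0, "compdev": 1.0, "scan": 1},
    class_name="flow",
)
print("\n".join(issues) or "compliant")
```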
Monitoring and tuning practices
Treat PI as a production service: measure → change → validate → document.
Start by defining SLO‑style targets (for example, PI Vision load times under typical concurrency, 95th percentile trend response, ingestion latency, analysis completion windows). Without targets, “slow” is subjective.
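Targets are only useful if they can be evaluated mechanically. Below is a small sketch that computes a 95th percentile from measured response times and compares it to a target; the measurements and the target are illustrative.

```python
"""Evaluate an SLO-style target: is the 95th percentile trend response within budget?
Response times and the target are illustrative; feed it your own measurements."""
import statistics

trend_response_ms = [420, 380, 610, 450, 1900, 520, 470, 430, 800, 510,
                     460, 440, 950, 480, 530, 410, 620, 505, 475, 3100]
TARGET_P95_MS = 1000

p95 = statistics.quantiles(trend_response_ms, n=20)[18]   # 95th percentile cut point
print(f"p95 = {p95:.0f} ms, target = {TARGET_P95_MS} ms, "
      f"{'within' if p95 <= TARGET_P95_MS else 'outside'} SLO")
```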
Baseline at minimum:
- CPU, memory, disk latency/IOPS/throughput on key servers.
- Network latency between tiers.
- Ingestion rates and peak patterns.
- Request concurrency on the visualisation tier.
- Growth rates (archives, EF counts, AF objects).
Tuning guidance:
- If Archive reads are slow: check storage latency first, reduce wide queries, validate archive organisation.
- If AF is slow: identify attribute resolution, search or scheduling issues; reduce expensive references and wildcard searches.
- If PI Vision is slow: determine whether app server CPU, network or back‑end query time is the limiter; optimise heavy displays before scaling hardware.
Security can cause performance issues (authentication challenges, group resolution). Align hardening and authentication with performance goals: Securing the AVEVA PI System in Modern Enterprise Environments.
Designing for future scale
Plan early without overbuilding.
Capacity planning: express growth as workloads — tags/attributes, events/sec (average and peak), concurrent PI Vision users and worst‑case displays, analysis counts and EF rates — then map to storage IO, compute per tier, network and operational tasks.
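Even a crude projection makes those workload numbers discussable. The starting points and growth rates below are illustrative assumptions; replace them with measured figures and revisit them yearly.

```python
"""Crude multi-year workload projection. Growth rates and starting points are
illustrative assumptions, not recommendations."""
YEARS = 5
start = {"tags": 80_000, "events_per_sec_peak": 25_000, "concurrent_viewers": 60}
annual_growth = {"tags": 0.20, "events_per_sec_peak": 0.30, "concurrent_viewers": 0.15}

for year in range(1, YEARS + 1):
    projected = {k: v * (1 + annual_growth[k]) ** year for k, v in start.items()}
    print(f"year {year}: "
          f"{projected['tags']:,.0f} tags, "
          f"{projected['events_per_sec_peak']:,.0f} ev/s peak, "
          f"{projected['concurrent_viewers']:.0f} viewers")
```

Map each projected figure to the tier it stresses: events per second to archive IO and snapshot throughput, concurrent viewers to visualisation-tier compute, and tag counts to search and backup windows.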
Design patterns that scale:
- Separate concerns by tier: ingestion, storage, AF and visualisation. See Designing a Scalable and Resilient PI System Architecture.
- Standardise and govern: naming, templates, display standards and onboarding review for high‑volume sources.
- Engineer for failure: design failover, test recovery times for largest archives and busiest tiers, and avoid single points of operational fragility.
Getting help
Short, targeted specialist engagement can accelerate diagnosis, workload modelling or architecture review. For roles and skill sets, see Careers in the AVEVA PI System World.
