Date:
29 April 2026
Incident Window:
10:20 - 15:35 AEST
Systems Affected:
Database cluster, downstream application services
Summary
At approximately 10:20 AEST, the system experienced a rapid spike in database connections, leading to resource exhaustion and degraded application performance. A failover at 10:40 temporarily alleviated the issue.
A second, more severe spike occurred at 14:00, again resulting in database connection exhaustion. Investigation identified a recently released query related to Commonwealth Unspent Funds as the root cause. The query was executing at high volume due to backlog processing and contained inefficient subqueries.
A hotfix was deployed at 15:35, resolving the issue.
[Chart: database connection spikes during the incident window]
Impact
  • Intermittent application degradation and timeouts
  • Elevated database connection usage leading to exhaustion
  • Reduced system responsiveness during spike windows
  • Potential delays in provider statement generation
Timeline
  • 10:20: Initial spike in database connections observed
  • 10:40: Database failover performed; connection levels stabilised
  • 10:40–14:00: Investigation into root cause underway
  • 14:00: Second spike; database connections exhausted again
  • ~14:10: Problem query identified
  • ~14:30–15:30: Query analysis and remediation work
  • 15:35: Hotfix deployed; connection usage returns to normal
Root Cause
A query introduced last week for Support At Home contained two unbounded subqueries. This resulted in:
  • High memory and CPU overhead per execution
  • Limited execution plan selection under load
  • Amplified cost when executed concurrently
The issue was triggered by a backlog of statement-generation queries, which drove up database connection usage and led to exhaustion.
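
To make the failure mode concrete, the sketch below contrasts an unbounded correlated subquery with a bounded rewrite. It is illustrative only: the actual Support At Home query is not reproduced in this report, and the table and column names (funds, statements, provider_id, period, unspent) are invented for the example.

# Hypothetical illustration only; schema and query are invented for this sketch.
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE funds (provider_id INTEGER, period TEXT, unspent REAL)")
cur.execute("CREATE TABLE statements (provider_id INTEGER, period TEXT)")
cur.executemany("INSERT INTO funds VALUES (?, ?, ?)",
                [(1, "2026-03", 100.0), (1, "2026-04", 50.0), (2, "2026-04", 75.0)])
cur.executemany("INSERT INTO statements VALUES (?, ?)",
                [(1, "2026-04"), (2, "2026-04")])

# Unbounded pattern: the correlated subquery aggregates every matching row in
# funds for every statement, with no period restriction, so each execution
# scans far more data than it needs.
unbounded = """
SELECT s.provider_id,
       (SELECT SUM(f.unspent) FROM funds f
        WHERE f.provider_id = s.provider_id)
FROM statements s
"""

# Bounded rewrite: the subquery is restricted to the statement's own period,
# shrinking the rows touched per execution and the cost under concurrency.
bounded = """
SELECT s.provider_id,
       (SELECT SUM(f.unspent) FROM funds f
        WHERE f.provider_id = s.provider_id
          AND f.period = s.period)
FROM statements s
"""

for label, sql in (("unbounded", unbounded), ("bounded", bounded)):
    print(label, cur.execute(sql).fetchall())
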
Resolution
  • Identified and analysed the problematic query
  • Implemented a hotfix to optimise/remove unbounded subqueries
  • Reduced per-query load and execution cost
  • Deployment at 15:35 resolved connection exhaustion
Blameless Root Cause Statement
The incident was caused by an inefficient query design executed at scale during backlog processing, which resulted in excessive database load and connection exhaustion. Improvements are required in query design standards, load validation, and bulk processing controls.
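
As one example of a bulk processing control, the minimal sketch below caps how many backlog statement queries can run against the database at once. The worker function, module layout, and concurrency ceiling are hypothetical; the report does not describe the real processing pipeline.

# Minimal sketch of a bulk-processing control; names and limits are assumptions.
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_QUERIES = 8  # assumed ceiling, sized against the DB connection pool

def process_statement(statement_id: int) -> None:
    """Placeholder for the real statement-generation query."""
    ...

def drain_backlog(statement_ids: list[int]) -> None:
    # A bounded worker pool keeps a large backlog from opening an unbounded
    # number of database connections at the same time (assuming one
    # connection per worker).
    with ThreadPoolExecutor(max_workers=MAX_CONCURRENT_QUERIES) as pool:
        list(pool.map(process_statement, statement_ids))

if __name__ == "__main__":
    drain_backlog(list(range(100)))
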
What we are doing about it
Last year Visualcare began an infrastructure upgrade that includes containerisation and database sharding. In recent months we have been piloting these changes with a key partner to assess performance and reliability. The pilot has been successful and goes live in early May; once complete, we will roll the improvements out more broadly to strengthen reliability and performance.
Total incident duration over the last 30 days corresponds to an uptime of 99.2%.
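
As a rough cross-check (assuming this incident accounts for the bulk of the 30-day downtime): the 10:20-15:35 window is 315 minutes, and 30 days is 43,200 minutes, so 315 / 43,200 ≈ 0.73% downtime, or roughly 99.3% uptime before any shorter incidents are counted, broadly consistent with the 99.2% figure.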