Background
PR #4202 fixed #3976 (AppSync subscriptions disconnecting after 2-3 minutes / zombie subscriptions after iOS recycles the TCP route on the same network). However, #4202 introduced the regression reported in #4220: GraphQL subscriptions get stuck in connecting after repeated iOS background/foreground transitions.
@akiramur confirmed v2.56.0 (pre-#4202) does not reproduce #4220. The regression was traced directly to the WebSocket recycle path introduced in #4202.
PR #4228 reverts #4202 to resolve the #4220 regression. That revert also removes the #3976 fix, so #3976 will reappear and must be re-addressed here with a non-regressing approach.
What #4202 did (and why it regressed)
Two layered changes:
1. WebSocketClient.onNetworkStateChange: new (.online, .online) recycle case

    case (.online, .online):
        // A second .satisfied emission while already online ⇒ underlying path
        // was swapped; existing URLSessionWebSocketTask is bound to a stale route.
        guard connection?.state == .running else { break }
        connection?.cancel(with: .invalid, reason: nil)
        subject.send(.disconnected(.invalid, nil))
        await createConnectionAndRead()

2. AppSyncRealTimeSubscription.prepareForResubscribe(), plus a call to it in resumeExistingSubscriptions()

    func prepareForResubscribe() {
        state.send(.none) // reset local state so subscribe() re-sends .start after reconnect
    }

Suspected regression mechanism: On repeated background/foreground transitions, NWPathMonitor emits additional .satisfied events that get mapped to (.online, .online). Each one cancels and re-establishes the connection. Under rapid lifecycle churn these recycles appear to interleave/stack, leaving the client stuck in connecting (never completing the reconnect handshake). See #4220 for the repro repo (feature/multi-subscription-repro branch).
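To make the suspected mechanism concrete, here is a minimal sketch of how consecutive emissions pair up into (previous, current) transitions. The `NetworkState` enum and the `transitions` helper are hypothetical names used for illustration only, not the actual Amplify types, and the sketch assumes every `.satisfied` emission maps to `.online`.

```swift
// Hypothetical stand-in for the client's network state; not the Amplify type.
enum NetworkState: Equatable {
    case online
    case offline
}

/// Pairs each emission with the state that preceded it, mirroring the
/// (previous, current) tuple that the recycle case switches on.
func transitions(_ emissions: [NetworkState],
                 initial: NetworkState = .offline) -> [(NetworkState, NetworkState)] {
    var previous = initial
    var pairs: [(NetworkState, NetworkState)] = []
    for current in emissions {
        pairs.append((previous, current))
        previous = current
    }
    return pairs
}
```

With three `.online` emissions in a row starting from `.offline`, two of the three transitions are (.online, .online), so under this model the #4202 code path would cancel and reconnect twice without any real connectivity change.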
Requirements for the re-fix
Ideas to explore
- Debounce / coalesce (.online, .online) events so lifecycle churn doesn't trigger repeated recycles.
- Liveness probe instead of blind recycle: only recycle when the existing task is actually dead (e.g. ping/pong timeout or detected read/write failure) rather than on every duplicate .satisfied.
- Guard against overlapping recycles: ensure a recycle in flight can't be re-triggered, and that the reconnect completes (or is cancelled cleanly) before another starts.
- Reconcile with app-lifecycle handling so scenePhase transitions don't double-count as path changes.
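As a starting point for discussion, the first three ideas above could be sketched roughly as follows. Everything here is hypothetical: `RecycleGate` and `isAlive` are illustrative names, not existing Amplify APIs. The gate drops requests that arrive within a minimum interval of the last run (a throttle rather than a true trailing-edge debounce), and the probe takes the ping as a closure so the sketch stays independent of the real connection type.

```swift
import Foundation

/// Hypothetical helper (not an Amplify API): lets at most one recycle run at a
/// time and drops requests that arrive within `minimumInterval` of the last run.
actor RecycleGate {
    private let minimumInterval: TimeInterval
    private var inFlight = false
    private var lastRun: Date?

    init(minimumInterval: TimeInterval) {
        self.minimumInterval = minimumInterval
    }

    /// Runs `recycle` and returns true, or returns false without running it
    /// when another recycle is in flight or the last one started too recently.
    func run(now: Date = Date(), _ recycle: @Sendable () async -> Void) async -> Bool {
        if inFlight { return false }
        if let last = lastRun, now.timeIntervalSince(last) < minimumInterval {
            return false
        }
        inFlight = true
        lastRun = now
        // The actor may accept other calls while suspended here;
        // they observe inFlight == true and return false.
        await recycle()
        inFlight = false
        return true
    }
}

/// Hypothetical liveness probe: true only if `ping` returns without throwing
/// before `timeout` seconds pass. `ping` must honor task cancellation, because
/// the task group waits for both child tasks before returning.
func isAlive(timeout: TimeInterval,
             ping: @escaping @Sendable () async throws -> Void) async -> Bool {
    await withTaskGroup(of: Bool.self) { group in
        group.addTask {
            do { try await ping(); return true } catch { return false }
        }
        group.addTask {
            try? await Task.sleep(nanoseconds: UInt64(timeout * 1_000_000_000))
            return false
        }
        let firstResult = await group.next() ?? false
        group.cancelAll()
        return firstResult
    }
}
```

In the (.online, .online) case, a handler could first call `isAlive` with a closure that wraps `URLSessionWebSocketTask.sendPing(pongReceiveHandler:)`, and only when the probe fails pass the cancel-and-reconnect work to `RecycleGate.run`. The interval and timeout values would need tuning against the #4220 repro.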
References
Testing
Validate against the #4220 repro repo (multi-subscription background/foreground churn) and the #3976 scenario (idle subscription surviving a same-network route swap past the ~2-3 min mark) before merging.