
Consider using upgradeMode=savepoint for the cirrus-streaming-updater
Closed, Resolved · Public · 1 Estimated Story Points

Description

While deploying a change for T375821 I encountered a weird situation where the pipeline failed several times while trying to overwrite existing checkpoints.

It is believed that, for some reason, Flink decided to resume operations from an old checkpoint; its attempts to write subsequent checkpoints then failed because those checkpoints already existed.
According to the documentation, upgradeMode=savepoint looks like the better fit here, and I don't see a good reason not to use it.

AC:

  • agree to use upgradeMode=savepoint or document the reason why we prefer last-state
  • update the helm values to start using it
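
For reference, a minimal sketch of what the setting looks like at the Flink Kubernetes Operator level, where `upgradeMode` lives under `spec.job` of a `FlinkDeployment` and accepts `stateless`, `last-state`, or `savepoint`. The resource name and storage paths below are placeholders, and the actual key layout in the cirrus-streaming-updater helm values may differ, so treat this as an illustration rather than the real change.

```yaml
# Hypothetical FlinkDeployment fragment. The name and paths are placeholders,
# not the actual cirrus-streaming-updater configuration.
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example-streaming-updater            # placeholder name
spec:
  flinkConfiguration:
    # Savepoint-based upgrades need a target directory for savepoints.
    state.savepoints.dir: s3://example-bucket/savepoints    # placeholder path
    state.checkpoints.dir: s3://example-bucket/checkpoints  # placeholder path
  job:
    # One of: stateless, last-state, savepoint.
    # With savepoint, the operator takes a savepoint on upgrade and
    # restores the new job from it.
    upgradeMode: savepoint
```

One trade-off to keep in mind when deciding: a savepoint upgrade needs a running, healthy job to take the savepoint from, whereas last-state restores from the latest checkpoint and can therefore also be used when the job is failing.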

Event Timeline

Restricted Application added a subscriber: Aklapper.
Gehel set the point value for this task to 1. · Apr 7 2025, 3:37 PM

Change #1136716 had a related patch set uploaded (by DCausse; author: DCausse):

[operations/deployment-charts@master] cirrus-streaming-updater: set upgradeMode to savepoint

https://gerrit.wikimedia.org/r/1136716

Change #1136716 merged by jenkins-bot:

[operations/deployment-charts@master] cirrus-streaming-updater: set upgradeMode to savepoint

https://gerrit.wikimedia.org/r/1136716