Improve ACLK sync shutdown process #19966

Merged: 4 commits, Mar 26, 2025

stelfrag (Collaborator) commented Mar 26, 2025

Summary

The ACLK synchronization thread now accepts a shutdown opcode instead of monitoring whether the ACLK service should still be running.

  • Request shutdown and block until it completes
  • Mark all running queries as cancelled, then wait until every running request or query completes or is cancelled
  • Force shutdown after 5 seconds if long-running or stuck queries remain
  • Make sure the underlying MQTT connection terminates after the ACLK sync thread
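
The sequence above can be modeled with a small sketch. This is not Netdata's code (the real implementation is in C); it is a toy Python model under stated assumptions: `SyncWorker`, `SHUTDOWN`, and `stop_sync_then_connection` are hypothetical names, queries are assumed to poll a shared cancellation flag, and "force shutdown" is modeled as giving up the wait after the timeout.

```python
import queue
import threading

SHUTDOWN = "shutdown"  # hypothetical opcode name, not taken from Netdata


class SyncWorker:
    """Toy stand-in for a sync thread that is driven by opcodes from a queue."""

    def __init__(self):
        self.commands = queue.Queue()
        self.cancelled = threading.Event()  # set to ask running queries to stop
        self.finished = threading.Event()   # set when the worker loop has exited
        self.thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self.thread.start()

    def submit(self, query):
        # A query is any callable that takes the cancellation flag.
        self.commands.put(("query", query))

    def _run(self):
        while True:
            opcode, query = self.commands.get()
            if opcode == SHUTDOWN:
                break  # leave the loop only when explicitly told to
            query(self.cancelled)
        self.finished.set()

    def shutdown(self, timeout=5.0):
        """Cancel queries, request shutdown, and block for up to `timeout` seconds.

        Returns True on a clean exit, False if the wait timed out and the
        caller has to proceed anyway (the "force" path in this model).
        """
        self.cancelled.set()
        self.commands.put((SHUTDOWN, None))
        return self.finished.wait(timeout)


def stop_sync_then_connection(worker, close_connection, timeout=5.0):
    """Stop the sync worker first, then close the underlying connection."""
    clean = worker.shutdown(timeout)
    close_connection()  # runs only after the worker stopped or timed out
    return clean
```

A caller would submit queries, then call `stop_sync_then_connection(worker, close_fn)`. A query that honors the flag lets the worker exit promptly; one that ignores it causes the wait to expire, after which the connection is still closed.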

stelfrag marked this pull request as ready for review March 26, 2025 14:38
thiagoftsm (Contributor) left a comment

No issues were observed during runtime. The shutdown process also worked as expected. All tests were performed within GDB to identify potential issues. LGTM!

stelfrag merged commit ca6e7cd into netdata:master Mar 26, 2025
101 of 103 checks passed
stelfrag deleted the aclk_sync_shutdown branch March 26, 2025 22:58
stelfrag added a commit to stelfrag/netdata that referenced this pull request Mar 31, 2025
* Refactor ACLK sync shutdown process

* Mark all pending queries as cancelled
Wait at most 5 seconds for queries to timeout before force stopping the ACLK sync thread

* Refactor logging in ACLK synchronization process

* Improve logging message for ACLK request snapshot creation

(cherry picked from commit ca6e7cd)
@stelfrag stelfrag mentioned this pull request Mar 31, 2025
Ferroin pushed a commit that referenced this pull request Apr 2, 2025
(cherry picked from commit ca6e7cd)