Skip to content

Add parse alerting for rules files before loading WAL #16601

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 26, 2025

Conversation

subhramit
Copy link
Contributor

@subhramit subhramit commented May 15, 2025

Fixes #11486

Add functionality (and corresponding test) to parse rules files before loading the WAL.

  • Builds on top of the previous PR to fix issues and add an invalid rules file in rules/fixtures to test the error case of the parse function.

This eliminates infinite systemd crash-restart loops due to invalid rules files that caused disk exhaustion and multi-day outages as mentioned in the issue.

Refs. #7322, #7399

@subhramit subhramit force-pushed the parse-alerting-wal branch 7 times, most recently from 4ef8f8a to 30f913e Compare May 15, 2025 23:51
Builds over prometheus#16462
Addresses comments, adds invalid rules file

Co-authored-by: marcodebba <marcodebonis74@gmail.com>
Co-authored-by: subhramit <subhramit.bb@live.in>
Signed-off-by: subhramit <subhramit.bb@live.in>
@subhramit subhramit force-pushed the parse-alerting-wal branch from 30f913e to fbf01ef Compare May 15, 2025 23:55
@subhramit
Copy link
Contributor Author

Could you please take a look @tjhop @aknuds1 @beorn7

@beorn7 beorn7 requested review from beorn7 and aknuds1 May 16, 2025 20:07
@subhramit subhramit changed the title Parse alerting for rules files before loading WAL Add parse alerting for rules files before loading WAL May 17, 2025
Copy link
Contributor

@tjhop tjhop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for carrying this forward, looks good to me 👍

@subhramit
Copy link
Contributor Author

Hi team, would anything else be needed here? Or could this be merged?

@subhramit
Copy link
Contributor Author

Hi team, would anything else be needed here? Or could this be merged?

@beorn7 @aknuds1

@aknuds1 aknuds1 requested a review from Copilot May 26, 2025 16:21
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds functionality to pre-parse rules files before the WAL is loaded, ensuring that invalid rule files are caught early. Key changes include:

  • Adding a new function (ParseFiles) to parse rule files from glob patterns.
  • Introducing tests in manager_test.go to validate both successful and failing parsing scenarios.
  • Updating the main application logic (in cmd/prometheus/main.go) to invoke the new parsing step.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
rules/manager_test.go Added tests for validating both valid and invalid rules files.
rules/manager.go Introduced the ParseFiles function to pre-parse rule files.
rules/fixtures/invalid_rules.yaml Added a fixture for an invalid rule file to test error handling.
cmd/prometheus/main.go Updated main to call ParseFiles to verify rule files before loading.

Copy link
Contributor

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@aknuds1 aknuds1 merged commit 44e27a8 into prometheus:main May 26, 2025
27 checks passed
@subhramit subhramit deleted the parse-alerting-wal branch May 26, 2025 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

parse alerting before loading WAL
3 participants