Provide AbuseFilter condition for revertrisk threshold
Open, In Progress, Needs Triage (Public)

Description

Context

We'd like to be able to invoke AbuseFilter actions if an edit doesn't pass the "likely to be reverted" revertrisk threshold.

Proposal

Introduce an AbuseFilter condition that utilizes the pre-save revertrisk API (T356102: Allow calling revertrisk language agnostic and revert risk multilingual APIs in a pre-save context) for the language agnostic model.

We'd have to exclude page creation scenarios, because the revert risk model doesn't handle those.
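
A minimal sketch of how the proposed condition could behave, written in Python purely for illustration (the ORES extension itself is PHP). `PendingEdit`, `fetch_presave_score`, and the threshold argument are hypothetical names, not part of ORES, AbuseFilter, or the pre-save API; the only behaviour taken from this task is that page creations are skipped.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PendingEdit:
    """Hypothetical stand-in for an edit that has not been saved yet."""
    wiki: str
    parent_rev_id: Optional[int]  # None means the page does not exist yet
    new_text: str


def presave_revert_risk(
    edit: PendingEdit,
    fetch_presave_score: Callable[[PendingEdit], float],
) -> Optional[float]:
    """Return a score for edits to existing pages, or None for page creations."""
    if edit.parent_rev_id is None:
        # Page creation: the revert risk model doesn't handle these, so skip.
        return None
    # fetch_presave_score is an injected placeholder for the pre-save API call.
    return fetch_presave_score(edit)


def exceeds_threshold(score: Optional[float], threshold: float) -> bool:
    """An edit that was never scored cannot trip the condition."""
    return score is not None and score > threshold
```

Passing the scorer in as a callable keeps the sketch independent of the real request format, which is tracked in T356102.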

Consequences

  • Abuse mitigation tooling can invoke actions before an edit is saved based on a revert risk score

Event Timeline

kostajh changed the task status from Stalled to Open. (Jul 2 2024, 12:30 PM)

Per https://phabricator.wikimedia.org/T356102#9935357, the feature is now usable on ml-staging.

Change #1051837 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[mediawiki/extensions/ORES@master] [WIP] Add AbuseFilter variable for revertrisk score

https://gerrit.wikimedia.org/r/1051837

Change #1051838 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[integration/config@master] zuul: Add AbuseFilter as phan & test dependency for ORES

https://gerrit.wikimedia.org/r/1051838

Change #1051838 merged by jenkins-bot:

[integration/config@master] zuul: Add AbuseFilter as phan & test dependency for ORES

https://gerrit.wikimedia.org/r/1051838

Change #1152267 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[mediawiki/extensions/ORES@master] LiftWingService: Add tests

https://gerrit.wikimedia.org/r/1152267

Change #1152268 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[mediawiki/extensions/ORES@master] LiftWingService: Unify request creation

https://gerrit.wikimedia.org/r/1152268

Change #1152269 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[mediawiki/extensions/ORES@master] LiftWingService: Add method to evaluate pre-save revert risk

https://gerrit.wikimedia.org/r/1152269

Change #1152270 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[mediawiki/extensions/ORES@master] Add revertrisk_score AbuseFilter variable

https://gerrit.wikimedia.org/r/1152270

Change #1152770 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[operations/mediawiki-config@master] ORES: Allow using RRML for pre-save revert risk detection

https://gerrit.wikimedia.org/r/1152770

Change #1152267 merged by jenkins-bot:

[mediawiki/extensions/ORES@master] LiftWingService: Add tests

https://gerrit.wikimedia.org/r/1152267

Change #1152268 merged by jenkins-bot:

[mediawiki/extensions/ORES@master] LiftWingService: Unify request creation

https://gerrit.wikimedia.org/r/1152268

Change #1152269 merged by jenkins-bot:

[mediawiki/extensions/ORES@master] LiftWingService: Add method to evaluate pre-save revert risk

https://gerrit.wikimedia.org/r/1152269

Change #1152270 merged by jenkins-bot:

[mediawiki/extensions/ORES@master] Add revertrisk_score AbuseFilter variable

https://gerrit.wikimedia.org/r/1152270

Change #1155235 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[operations/mediawiki-config@master] ores: Disable AbuseFilter integration by default

https://gerrit.wikimedia.org/r/1155235

Change #1155247 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[mediawiki/extensions/ORES@master] Set ORESDeveloperSetup to false by default

https://gerrit.wikimedia.org/r/1155247

Change #1155247 merged by jenkins-bot:

[mediawiki/extensions/ORES@master] Set ORESDeveloperSetup to false by default

https://gerrit.wikimedia.org/r/1155247

Change #1155276 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[mediawiki/extensions/ORES@wmf/1.45.0-wmf.5] Set ORESDeveloperSetup to false by default

https://gerrit.wikimedia.org/r/1155276

Change #1155235 merged by jenkins-bot:

[operations/mediawiki-config@master] ores: Disable AbuseFilter integration by default

https://gerrit.wikimedia.org/r/1155235

Change #1155276 merged by jenkins-bot:

[mediawiki/extensions/ORES@wmf/1.45.0-wmf.5] Set ORESDeveloperSetup to false by default

https://gerrit.wikimedia.org/r/1155276

Mentioned in SAL (#wikimedia-operations) [2025-06-10T16:50:59Z] <mszabo@deploy1003> Started scap sync-world: Backport for [[gerrit:1155276|Set ORESDeveloperSetup to false by default (T364705)]], [[gerrit:1155235|ores: Disable AbuseFilter integration by default (T364705)]], [[gerrit:1155280|tests: Run only defered updates on LinkRecommendationUpdaterTest]]

Mentioned in SAL (#wikimedia-operations) [2025-06-10T16:55:05Z] <mszabo@deploy1003> Started scap sync-world: Backport for [[gerrit:1155276|Set ORESDeveloperSetup to false by default (T364705)]], [[gerrit:1155235|ores: Disable AbuseFilter integration by default (T364705)]], [[gerrit:1155280|tests: Run only defered updates on LinkRecommendationUpdaterTest]]

Mentioned in SAL (#wikimedia-operations) [2025-06-10T16:59:17Z] <mszabo@deploy1003> mszabo: Backport for [[gerrit:1155276|Set ORESDeveloperSetup to false by default (T364705)]], [[gerrit:1155235|ores: Disable AbuseFilter integration by default (T364705)]], [[gerrit:1155280|tests: Run only defered updates on LinkRecommendationUpdaterTest]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-06-10T17:10:11Z] <mszabo@deploy1003> Finished scap sync-world: Backport for [[gerrit:1155276|Set ORESDeveloperSetup to false by default (T364705)]], [[gerrit:1155235|ores: Disable AbuseFilter integration by default (T364705)]], [[gerrit:1155280|tests: Run only defered updates on LinkRecommendationUpdaterTest]] (duration: 15m 06s)

Change #1051837 abandoned by Kosta Harlan:

[mediawiki/extensions/ORES@master] [WIP] Add AbuseFilter variable for revertrisk score

Reason:

See I0ccf97880001c3d0c81c612bb98f1da5ab9bb452

https://gerrit.wikimedia.org/r/1051837

@mszabo and I discussed making the following changes:

  • rename the variable to revertrisk_level
  • the valid values for the variable are high or null
  • the variable will be high if the revertrisk score is above the threshold defined in wgOresFiltersThresholds['revertrisklanguageagnostic']['min']
  • the variable is only available if wgOresFiltersThresholds['revertrisklanguageagnostic']['min'] is defined (currently, this is the case for ~19 wikis)
  • we will *not* use RRML endpoint as we don't have thresholds defined for RRML
  • we will *not* return low for the revertrisk_level because we don't have the thresholds defined
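
The mapping described in the list above can be sketched roughly as follows. This is an illustration in Python, not the merged PHP change: the `THRESHOLDS` dictionary, the wiki names, and the numeric values are made up, and treating "above the threshold" as a strict comparison is an assumption.

```python
from typing import Optional

# Illustrative config only: wiki names and values are invented, mirroring the
# shape of wgOresFiltersThresholds['revertrisklanguageagnostic']['min'].
THRESHOLDS = {
    "examplewiki": {"revertrisklanguageagnostic": {"min": 0.95}},
    "otherwiki": {},  # no threshold defined, so the variable is unavailable
}

MODEL = "revertrisklanguageagnostic"


def min_threshold(wiki: str) -> Optional[float]:
    """Return the configured 'min' threshold for a wiki, or None if unset."""
    return THRESHOLDS.get(wiki, {}).get(MODEL, {}).get("min")


def revertrisk_level(wiki: str, score: float) -> Optional[str]:
    """Map a raw pre-save score to 'high' or None; 'low' is never returned."""
    minimum = min_threshold(wiki)
    if minimum is None:
        # Variable is only available where a threshold is defined.
        return None
    # Assumption: "above the threshold" means strictly greater than.
    return "high" if score > minimum else None
```

For example, `revertrisk_level("examplewiki", 0.97)` yields `"high"`, while the same score on `"otherwiki"` yields `None` because no threshold is configured there.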

Change #1160196 had a related patch set uploaded (by Máté Szabó; author: Máté Szabó):

[mediawiki/extensions/ORES@master] Map pre-save RR scores to predefined values

https://gerrit.wikimedia.org/r/1160196

Change #1160196 merged by jenkins-bot:

[mediawiki/extensions/ORES@master] Map pre-save RR scores to predefined values

https://gerrit.wikimedia.org/r/1160196

Change #1162998 had a related patch set uploaded (by Kosta Harlan; author: Máté Szabó):

[mediawiki/extensions/ORES@wmf/1.45.0-wmf.6] Map pre-save RR scores to predefined values

https://gerrit.wikimedia.org/r/1162998

Change #1163004 had a related patch set uploaded (by Kosta Harlan; author: Kosta Harlan):

[operations/mediawiki-config@master] Revert "ores: Disable AbuseFilter integration by default"

https://gerrit.wikimedia.org/r/1163004

Change #1162998 merged by jenkins-bot:

[mediawiki/extensions/ORES@wmf/1.45.0-wmf.6] Map pre-save RR scores to predefined values

https://gerrit.wikimedia.org/r/1162998

Mentioned in SAL (#wikimedia-operations) [2025-06-23T20:14:27Z] <kharlan@deploy1003> Started scap sync-world: Backport for [[gerrit:1162998|Map pre-save RR scores to predefined values (T364705)]], [[gerrit:1161950|Fix password handling for non-existent users (T395372 T397262)]]

Mentioned in SAL (#wikimedia-operations) [2025-06-23T20:38:46Z] <kharlan@deploy1003> kharlan, tgr: Backport for [[gerrit:1162998|Map pre-save RR scores to predefined values (T364705)]], [[gerrit:1161950|Fix password handling for non-existent users (T395372 T397262)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-06-23T20:58:57Z] <kharlan@deploy1003> Finished scap sync-world: Backport for [[gerrit:1162998|Map pre-save RR scores to predefined values (T364705)]], [[gerrit:1161950|Fix password handling for non-existent users (T395372 T397262)]] (duration: 44m 29s)

Change #1163004 merged by jenkins-bot:

[operations/mediawiki-config@master] Reapply "ores: Disable AbuseFilter integration by default"

https://gerrit.wikimedia.org/r/1163004

Mentioned in SAL (#wikimedia-operations) [2025-06-23T21:01:54Z] <kharlan@deploy1003> Started scap sync-world: Backport for [[gerrit:1163004|Reapply "ores: Disable AbuseFilter integration by default" (T364705)]], [[gerrit:1155725|Configure event stream for IP auto-reveal instrument (T387600)]], [[gerrit:1160157|Reapply "Use GetSecurityLogContext hook for goodpass/badpass logging" (T395204)]]

Mentioned in SAL (#wikimedia-operations) [2025-06-23T21:04:28Z] <kharlan@deploy1003> kharlan, tgr, tchanders: Backport for [[gerrit:1163004|Reapply "ores: Disable AbuseFilter integration by default" (T364705)]], [[gerrit:1155725|Configure event stream for IP auto-reveal instrument (T387600)]], [[gerrit:1160157|Reapply "Use GetSecurityLogContext hook for goodpass/badpass logging" (T395204)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.

Mentioned in SAL (#wikimedia-operations) [2025-06-23T21:16:46Z] <kharlan@deploy1003> Finished scap sync-world: Backport for [[gerrit:1163004|Reapply "ores: Disable AbuseFilter integration by default" (T364705)]], [[gerrit:1155725|Configure event stream for IP auto-reveal instrument (T387600)]], [[gerrit:1160157|Reapply "Use GetSecurityLogContext hook for goodpass/badpass logging" (T395204)]] (duration: 14m 51s)

Re: Tech News/User-notice - What wording would you suggest for the entry, and when should it be included (I assume the next edition)? Thanks!

Sorry for the delay. Suggested text (cc @mszabo)

The ORES extension adds an AbuseFilter variable if the AbuseFilter extension is installed. This allows AbuseFilters to filter edits based on the RevertRisk score of the edit being attempted. The variable is only available on wikis where the RevertRisk LanguageAgnostic model is configured; see T392144 for a full list. It is only populated if the action being evaluated is an edit. For more information, please see https://www.mediawiki.org/wiki/Extension:ORES/AbuseFilter_variables#What_variables_are_available_for_use