Maniphest T352092

Instrument the multi-check experience
Closed, ResolvedPublic

Assigned To

Authored By

	ppelberg
	Nov 27 2023, 8:27 PM

Description

In T324735, we implemented the initial instrumentation needed to evaluate the impact of the first Reference Check in the analysis T342930 describes.

This task involves the work of building upon this initial "body" of instrumentation to enable us to evaluate the impact of the multi-check experience T347530 will introduce.

Requirements

Borrowed from what @MNeisler documented in T352092#10545377.

feature	action
editCheck-[checkname]	action-accept
editCheck-[checkname]	action-reject
editCheck-[checkname]	action-[any other custom choice we later implement]
editCheck-[checkname]	check-shown-[moment] [1]
editCheckDialog	window-open-from-check-[moment] [1]
[`fixed`/`sidebar`/`gutterSidebar`]EditCheckDialog	window-open-from-check [3]

See full instrumentation Spec.

Done

@DLynch implements new instrumentation
Editing QA verifies new events are being emitted by clients on test2wiki
@MNeisler verifies new events are being logged in DB

[1] I'm guessing on how these two new events (a pre-save show event and sidebar event to distinguish listeners) would be implemented but not tied to proposed names. Feel free to adjust as needed. The main requirements are that I can distinguish these events by edit check type and an event is sent per-check if possible. My logic behind doubling up on the events is that it'll probably be much easier to query for totals + breakdowns that way.

[2] Thinking this through some more, it would not be too difficult to query for edit check totals using something like event.feature LIKE 'editCheck%' if we want to remove that redundancy in events.

[3] We have multiple types of sidebar window now, and this sort of correlates to the "moments" but also doesn't entirely.

Details

	Subject	Repo	Branch	Lines +/-
	Edit check: add instrumentation around the check lifecycle	mediawiki/extensions/VisualEditor	master	+23 -2


Related Objects

Status	Assigned	Task
Open	None	T265163 Create a system to encode best practices into editing experiences
Open	None	T345472 Offer Edit Check(s) within new article creation
Open	None	T366743 [Epic] Enable multiple Edit Checks to be presented within a single edit
Resolved	MNeisler	T379131 [A/B Test] Run an A/B test to evaluate impact of showing multiple Checks within a single edit
Resolved	ppelberg	T352092 Instrument the multi-check experience
Duplicate	None	T328594 Implementing logging to track how many checks people are being presented with on a per-edit-basis
Open	ppelberg	T352095 Publish multi-check measurement plan

Event Timeline

ppelberg created this task.Nov 27 2023, 8:27 PM

VPuffetMichel moved this task from To Triage to Triaged on the VisualEditor board.Nov 30 2023, 4:46 PM

MNeisler subscribed.Dec 1 2023, 6:28 PM

MNeisler claimed this task.Jan 3 2024, 7:41 PM

MNeisler triaged this task as Medium priority.

MNeisler added a project: Product-Analytics.

MNeisler moved this task from Triage to Upcoming Quarter on the Product-Analytics board.

MNeisler moved this task from Upcoming Quarter to Current Quarter on the Product-Analytics board.Jan 3 2024, 7:43 PM

Per what the Editing Team discussed offline during today's planning meeting, at a minimum, we'll need instrumentation to help us answer questions like:
1. How many checks were presented/activated during any given edit session?
2. Of the checks that were activated within a given edit session, to what extent – if any – did the person engage with each check?

ppelberg added a subtask: T328594: Implementing logging to track how many checks people are being presented with on a per-edit-basis.Mar 5 2024, 12:53 AM

ppelberg added a subtask: T352095: Publish multi-check measurement plan.May 29 2024, 8:19 PM

As part of the measurement plan for assessing the impact of introducing multiple checks, we identified metrics that depend on being able to calculate the number of checks presented within a single editing session, such as edit completion rate broken down by the number of checks shown. The average number of edit checks shown per session is also currently identified as one of the guardrail metrics for monitoring the impact of changes to edit check.

Based on discussions with @DLynch, this is not currently instrumented. VisualEditorFeatureUse will currently only log an edit check event once in an editing session rather than firing three times if there are three reference checks.

Adding instrumentation that logs an event for all checks presented to the user requires further discussion and investigation. Current multi-check design includes a lot of regeneration-and-redisplay of checks happening during the mid-edit stage. Pre-save checks are simpler.

Note: This instrumentation would be needed in addition to the approach for revision tags determined in T352120 to track edit checks shown within an edit attempt vs a published edit.

cc @ppelberg

MNeisler edited projects, added Product-Analytics (Kanban); removed Product-Analytics.Jan 23 2025, 5:06 PM

MNeisler moved this task from Next 2 weeks to Doing on the Product-Analytics (Kanban) board.Jan 28 2025, 3:38 PM

The suggestion is to add semi-generic instrumentation to VisualEditorFeatureUse around users acting on a check. (As opposed to being shown a check, which avoids questions around how to handle the in-flux nature of the mid-edit checks.)

feature	action
editCheck	action-accept
editCheck	action-reject
editCheck	action-[any other custom choice we later implement]
editCheck-[checkname]	action-accept
editCheck-[checkname]	action-reject
editCheck-[checkname]	action-[any other custom choice we later implement]

Some of this already exists specific to the AddReference check -- there's editCheckReferences; edit-check-[confirm / reject] events. This would mostly be getting rid of the need for custom setup of these.
My logic behind doubling up on the events is that it'll probably be much easier to query for totals + breakdowns that way.
Distinguishing between mid-edit and pre-save checks can be done because pre-save checks are between an EditAttemptStep saveIntent and an editCheckDialog; dialog-abort. (I might also want to look into making that close event more specific to whether you proceeded or backed out...)
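
As a rough illustration of that last point, here is a minimal sketch of labelling check actions by moment. It assumes one session's events are already merged and sorted chronologically; the `(source, feature, action)` tuple layout and the "EAS"/"VEFU" source labels are hypothetical stand-ins for the two schemas, not the real field names.

```python
def label_moments(events):
    """Label each check action in one session as 'presave' or 'midedit'.

    `events` is a chronologically ordered list of (source, feature, action)
    tuples. `source` is "EAS" for EditAttemptStep rows and "VEFU" for
    VisualEditorFeatureUse rows (hypothetical encoding for this sketch).
    """
    in_presave = False
    labelled = []
    for source, feature, action in events:
        if source == "EAS" and action == "saveIntent":
            # Everything after saveIntent is treated as the pre-save moment...
            in_presave = True
        elif (source == "VEFU" and feature == "editCheckDialog"
              and action == "dialog-abort"):
            # ...until the edit check dialog is closed.
            in_presave = False
        elif (source == "VEFU" and feature.startswith("editCheck")
              and action.startswith("action-")):
            moment = "presave" if in_presave else "midedit"
            labelled.append((feature, action, moment))
    return labelled
```

This only works if event ordering within a session is reliable, which is an assumption here rather than something the schemas are known to guarantee.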

What this won't get us is a count of how many checks were shown to someone. We discussed in the meeting that getting a meaningful count for mid-edit checks is going to be complicated, but for pre-save we could log e.g. a VEFU editCheck; pre-save-shown-17 sort of event, though that'd be a pain to build reports around. (But we could bucket it -- 1, 2-5, 6-10, 11+, etc.)
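
The bucketing idea could look something like the sketch below. The bucket boundaries are the ones floated above; the function names and the exact shape of the resulting action string are hypothetical, not something that has been implemented.

```python
def bucket_check_count(n: int) -> str:
    """Map a raw count of checks shown to a coarse bucket label."""
    if n < 1:
        raise ValueError("expected at least one check shown")
    if n == 1:
        return "1"
    if n <= 5:
        return "2-5"
    if n <= 10:
        return "6-10"
    return "11+"


def presave_shown_action(n: int) -> str:
    """Build a bucketed action name (hypothetical naming)."""
    return f"pre-save-shown-{bucket_check_count(n)}"
```

With this, 17 checks shown would be logged as `pre-save-shown-11+` rather than `pre-save-shown-17`, which keeps the set of distinct action values small enough to report on.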

Next steps

Per today's offline discussion:

1. David: articulate rough proposal for generic logging of Check actions
- Done in T352092#10510483
2. Megan to become clear about ideal instrumentation
3. Megan to evaluate viability of approach David described in T352092#10510483
4. David, Megan, Peter: meet to discuss evaluation ("3.")

ppelberg edited projects, added Editing-team (Tracking), Goal; removed Editing-team.Jan 31 2025, 6:30 AM

ppelberg moved this task from Backlog to Analytics on the Editing-team (Tracking) board.

We decided in a meeting that the generic suggestion is mostly okay. However, we'll also need:

A pre-save show event per-check
Something added to the sidebar show event to distinguish the listeners, so we can tell the difference between mid-edit and pre-save checks

Confirming that the instrumentation proposed in T352092#10510483, in addition to the new pre-save show and edit check moment events, will meet the requirements to measure the impact of Reference Check (to be done in T379131).

We'll need to revisit instrumentation requirements for additional check types (e.g., paste and peacock checks) once those workflows have been finalized.

New instrumentation requirements are summarized below for reference:

feature	action
editCheck-[checkname]	action-accept
editCheck-[checkname]	action-reject
editCheck-[checkname]	action-[any other custom choice we later implement]
editCheck-[checkname]	check-shown-presave [1]
editCheck-[checkname]	window-open-from-[moment]check [1]

A couple notes/questions:
[1] I'm guessing on how these two new events (a pre-save show event and sidebar event to distinguish listeners) would be implemented but not tied to proposed names. Feel free to adjust as needed. The main requirements are that I can distinguish these events by edit check type and an event is sent per-check if possible.

My logic behind doubling up on the events is that it'll probably be much easier to query for totals + breakdowns that way.

[2] Thinking this through some more, it would not be too difficult to query for edit check totals using something like event.feature LIKE 'editCheck%' if we want to remove that redundancy in events.
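
To illustrate note [2], here is a minimal sketch of deriving both totals and per-check breakdowns from the per-check events alone, without a separate generic editCheck event. It assumes events arrive as `(feature, action)` pairs, which is a simplification for this sketch. It also matches on `editCheck-` with the trailing hyphen rather than the bare `editCheck` prefix, so that editCheckDialog rows are excluded; that narrowing is a choice made here, not part of the proposal. The check name `exampleCheck` in the usage is hypothetical.

```python
from collections import Counter

PREFIX = "editCheck-"


def summarize_check_actions(events):
    """Return (totals by action, breakdown by (checkname, action))."""
    totals = Counter()
    breakdown = Counter()
    for feature, action in events:
        # Rough equivalent of: feature LIKE 'editCheck-%' AND action LIKE 'action-%'
        if not feature.startswith(PREFIX) or not action.startswith("action-"):
            continue
        checkname = feature[len(PREFIX):]
        totals[action] += 1
        breakdown[(checkname, action)] += 1
    return totals, breakdown


events = [
    ("editCheck-addReference", "action-accept"),
    ("editCheck-addReference", "action-reject"),
    ("editCheck-exampleCheck", "action-accept"),
    ("editCheckDialog", "window-open-from-check-presave"),
]
totals, breakdown = summarize_check_actions(events)
```

Here `totals` plays the role the doubled-up generic events would have played, which is why dropping them looks feasible.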

@DLynch - Assigning this task to you for final review and implementation but let me know if you have any questions or need any additional info.

Instrumentation Spec

MNeisler reassigned this task from MNeisler to DLynch.Feb 12 2025, 4:38 PM

MNeisler edited projects, added Product-Analytics; removed Product-Analytics (Kanban).

MNeisler moved this task from Current Quarter to Tracking on the Product-Analytics board.

ppelberg edited parent tasks, added: T379131: [A/B Test] Run an A/B test to evaluate impact of showing multiple Checks within a single edit; removed: T351777: [MILESTONE] Offer Multi-Check (References) at partner wikis.Feb 24 2025, 7:32 PM

ppelberg edited projects, added Editing-team (Kanban Board); removed Editing-team (Tracking).Feb 24 2025, 7:35 PM

ppelberg moved this task from Inbox to Ready to Be Worked On on the Editing-team (Kanban Board) board.

ppelberg updated the task description. (Show Details)Feb 24 2025, 7:39 PM

ppelberg mentioned this in T373949: Clarify the meaning of the editcheck-references-activated tag.Feb 28 2025, 12:15 AM

ppelberg moved this task from Ready to Be Worked On to Doing on the Editing-team (Kanban Board) board.Mar 4 2025, 6:15 PM

Change #1124850 had a related patch set uploaded (by DLynch; author: DLynch):

[mediawiki/extensions/VisualEditor@master] Edit check: add instrumentation around the check lifecycle

https://gerrit.wikimedia.org/r/1124850

gerritbot added a project: Patch-For-Review.Mar 5 2025, 5:30 PM

DLynch moved this task from Doing to Code Review on the Editing-team (Kanban Board) board.Mar 5 2025, 9:01 PM

ppelberg merged a task: T328594: Implementing logging to track how many checks people are being presented with on a per-edit-basis.Mar 5 2025, 9:43 PM

ppelberg mentioned this in T328594: Implementing logging to track how many checks people are being presented with on a per-edit-basis.

ppelberg updated the task description. (Show Details)

ppelberg added subscribers: • Geugeor-WMF, Trizek-WMF.

Change #1124850 merged by jenkins-bot:

[mediawiki/extensions/VisualEditor@master] Edit check: add instrumentation around the check lifecycle

https://gerrit.wikimedia.org/r/1124850

Maintenance_bot removed a project: Patch-For-Review.Mar 10 2025, 11:32 AM

ReleaseTaggerBot added a project: MW-1.44-notes (1.44.0-wmf.20; 2025-03-11).Mar 10 2025, 12:00 PM

ppelberg removed DLynch as the assignee of this task.Mar 10 2025, 5:11 PM

ppelberg added a project: Editing QA.

ppelberg updated the task description. (Show Details)

ppelberg moved this task from Inbox to High Priority on the Editing QA board.

ppelberg moved this task from Code Review to QA on the Editing-team (Kanban Board) board.

Ryasmeen claimed this task.Mar 18 2025, 8:05 PM

ppelberg updated the task description. (Show Details)Mar 18 2025, 9:16 PM

Megan and I had talked off-ticket about some changes to the exact events, and I forgot to update the ticket with them despite having implemented them as-discussed.

DLynch updated the task description. (Show Details)Mar 18 2025, 11:27 PM

Completed testing all the events. Everything looks good on my end. The only issue found was that the action window-open-from-[moment]check is associated with feature:editCheck rather than feature:editCheck-[checkname] as specified in the task. This turned out to be expected, and the table in the task description has now been updated.

Screenshot 2025-03-18 at 4.53.05 PM.png (1×3 px, 574 KB)

Screenshot 2025-03-18 at 4.42.57 PM.png (968×3 px, 392 KB)

{F58864157}

Ryasmeen moved this task from QA to Ready for Sign Off on the Editing-team (Kanban Board) board.Mar 19 2025, 3:22 AM

Ryasmeen edited projects, added Verified; removed Editing QA.

Assigning to myself to verify that the new events are logged in VisualEditorFeatureUse as expected.

I've verified that the new events added to track multi-check engagement are being stored as expected in VisualEditorFeatureUse. This is based on a review of events generated on test2wiki between 18 March and 20 March.

See summary of the checks completed and findings below:

There were 6 total editing sessions where multi-check events were logged: 4 control group sessions and 2 test group sessions. Note: Once this is deployed to partner wikis and we have a larger sample of events, I'll QA the bucketing to confirm that we are seeing close to a 50/50 split across the test groups.
Confirmed that the bucket field is populated as expected with either 2025-03-editcheck-multicheck-reference-control or 2025-03-editcheck-multicheck-reference-test depending on user assignment.
Events are sent for both logged-in and logged-out users.
For logged-out users, we are populating the anonymous_user_token field as expected.
Confirmed that all new feature and action values implemented for multi-check (as documented in the task description) are stored correctly.
- editCheck-addReference is the only specific check type feature being logged at the moment as expected.
- Only pre-save moment actions (check-shown-presave, window-open-from-check-presave) are stored (also expected)
Events stored for both platform = desktop and platform = phone.
Events only occur where editor_interface = visualeditor
Confirmed based on the number of check-shown-presave events that the test group receives multiple checks during the pre-save moment while the control only receives one.
Confirmed data joins correctly with EditAttemptStep to be able to effectively calculate edit completion rate and other metrics.
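
For reference, the last two checks can be sketched roughly as follows. This is not the query that was actually run: the row layout `(session_id, bucket, feature, action)` and the idea of passing in a set of session IDs that reached a successful save (taken from EditAttemptStep) are assumptions made for this illustration.

```python
from collections import defaultdict


def shown_per_session(vefu_rows):
    """Count check-shown-presave events per editing session.

    `vefu_rows` is an iterable of (session_id, bucket, feature, action).
    Returns (count per session, bucket per session).
    """
    counts = defaultdict(int)
    bucket_of = {}
    for session_id, bucket, feature, action in vefu_rows:
        if feature.startswith("editCheck-") and action == "check-shown-presave":
            counts[session_id] += 1
            bucket_of[session_id] = bucket
    return dict(counts), bucket_of


def completion_rate_by_shown(counts, saved_sessions):
    """Share of sessions that saved, keyed by number of checks shown."""
    totals = defaultdict(int)
    saved = defaultdict(int)
    for session_id, n in counts.items():
        totals[n] += 1
        if session_id in saved_sessions:
            saved[n] += 1
    return {n: saved[n] / totals[n] for n in totals}
```

Comparing the per-session counts across the two bucket values is what shows the test group receiving multiple checks while the control group receives one.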

Note: I've updated the VEFU data dictionary so these new multi-check events are now reflected there.

MNeisler reassigned this task from MNeisler to ppelberg.Mar 21 2025, 4:00 PM

@Ryasmeen + @MNeisler + @DLynch: this all looks wonderful. Nicely done 👏🏼

ppelberg mentioned this in T352120: Implement methodology for identifying the Edit Check(s) shown within an edit and action(s) people take in response.Mar 25 2025, 6:15 PM

ppelberg mentioned this in T393818: Instrument the Tone Check UX.Wed, Jun 25, 8:20 PM

	F58864124: Screenshot 2025-03-18 at 4.42.57 PM.png
	Mar 18 2025, 11:57 PM

	F58864122: Screenshot 2025-03-18 at 4.53.05 PM.png
	Mar 18 2025, 11:57 PM

	Restricted File
	Mar 19 2025, 12:03 AM
