[AMDGPU] Add scheduling stage to rewrite MFMA from VGPR to AGPR #149367

jrbyrnes · 2025-07-17T17:50:44Z

After #145025 we will always produce the VGPR MFMA form. While this is beneficial in some cases, there are still cases where the AGPR form is preferred: specifically, when we have high per-iteration register pressure (RP) coming from MFMAs and no in-loop VGPR users of MFMAs. In such cases, selecting the VGPR form may cause an explosion in VGPR pressure, which degrades the quality of scheduling. The PostRA MFMA rewriter can help improve RA for some of these cases, but it will not help the scheduler.

This PR does the rewriting during scheduling, as a separate scheduling stage. It will only try to go from VGPR -> AGPR form if ArchVGPR pressure is over the addressable limit and if we find that we will not need to issue any cross-RC (register class) copies in the loop. We could also implement AGPR form -> VGPR form, but the assumption is that we will always produce the VGPR form.
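To make the gating condition concrete, here is a rough standalone sketch of the decision as described above. It is not the code in this PR: `RewriteCandidate`, `shouldRewriteToAGPR`, and their fields are hypothetical names, and the pressure and limit values are assumed to be computed elsewhere by the scheduler.

```cpp
#include <vector>

// Hypothetical summary of one MFMA rewrite candidate (not an LLVM type).
struct RewriteCandidate {
  unsigned ArchVGPRsFreed;      // ArchVGPRs released if the MFMA moves to AGPR form
  unsigned InLoopCrossRCCopies; // cross-RC copies the rewrite would add in the loop
};

// Returns true only when both conditions from the description hold:
//  1. ArchVGPR pressure is over the addressable limit, and
//  2. no candidate would require a cross-RC copy inside the loop.
bool shouldRewriteToAGPR(unsigned ArchVGPRPressure, unsigned AddressableLimit,
                         const std::vector<RewriteCandidate> &Candidates) {
  if (ArchVGPRPressure <= AddressableLimit)
    return false; // Pressure is fine, keep the VGPR form.
  if (Candidates.empty())
    return false; // Nothing to rewrite.
  for (const RewriteCandidate &C : Candidates)
    if (C.InLoopCrossRCCopies != 0)
      return false; // An in-loop copy would be needed, so bail out.
  return true;
}
```

For example, with an illustrative limit of 256, `shouldRewriteToAGPR(300, 256, {{64, 0}})` allows the rewrite, while adding a candidate that needs two in-loop copies, `{{64, 0}, {32, 2}}`, rejects it.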

A WIP:
Needs more testing
Still a bit undecided about the heuristic
Considering making the implementation more generalized for other types of rewriting / transformations, though this may be left as a TODO

Putting up draft for any feedback.

Change-Id: I47b2a4274a35f3cf0a6d064674d1d29526e4dfd2

lucas-rami · 2025-07-18T13:44:14Z

About the heuristic, instead of relying on cycle depth, how about using block frequencies and latency estimates of a cross-class copy vs a spill save/restore to determine how much copying we can afford without increasing latency? This is what I am doing to estimate rematerialization benefit in my upcoming scoring system for remat candidates (branch), so I think the cost of deriving block frequencies could even be factored in among the scheduler's stages.
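A minimal sketch of what such a frequency-weighted comparison could look like, under stated assumptions: `BlockCost`, `estimateLatencyDelta`, and the latency parameters are hypothetical names for illustration only, and they come from neither this PR nor the remat branch. The block frequencies are assumed to be relative values supplied by some frequency analysis.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical per-block summary of what a rewrite would change.
struct BlockCost {
  uint64_t Freq;        // relative block frequency
  unsigned CopiesAdded; // cross-class copies the rewrite would insert here
  unsigned SpillsSaved; // spill save/restore pairs the rewrite would avoid here
};

// Estimated latency change over all blocks, weighted by block frequency.
// A positive result means the avoided spills outweigh the added copies.
int64_t estimateLatencyDelta(const std::vector<BlockCost> &Blocks,
                             unsigned CopyLatency, unsigned SpillLatency) {
  int64_t Saved = 0;
  int64_t Added = 0;
  for (const BlockCost &B : Blocks) {
    Saved += static_cast<int64_t>(B.Freq) * B.SpillsSaved * SpillLatency;
    Added += static_cast<int64_t>(B.Freq) * B.CopiesAdded * CopyLatency;
  }
  return Saved - Added;
}
```

With made-up latencies of 4 for a copy and 40 for a spill save/restore, a hot block with frequency 100 that gains two copies but avoids one spill pair yields a positive delta, so those copies would be affordable. The same block with four added copies and no avoided spills yields a negative delta.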

[AMDGPU] Add scheduling stage to rewrite MFMA from VGPR to AGPR

2e15bfc

Change-Id: I47b2a4274a35f3cf0a6d064674d1d29526e4dfd2

jrbyrnes requested review from arsenm, kerbowa, rampitec, lucas-rami and srpande July 17, 2025 17:50

jrbyrnes mentioned this pull request Jul 18, 2025

[AMDGPU] Add option to preinflate to AVGPR #147413

Open
