-
Notifications
You must be signed in to change notification settings - Fork 14.5k
[RISCV][VLOPT] Add support for vrgather #148249
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
@llvm/pr-subscribers-backend-risc-v Author: Mikhail R. Gadelha (mikhailramalho) ChangesThis PR adds support for the vrgather.vi, vrgather.vx, vrgather.vv, vrgatherei16.vv instructions in the RISC-V VLOptimizer. To support vrgatherei16.vv I also needed to add support for it in getOperandLog2EEW. This is the third PR addressing the list at #147647. Full diff: https://github.com/llvm/llvm-project/pull/148249.diff 2 Files Affected:
diff --git a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
index 2d9f38221d424..e940ab6c62637 100644
--- a/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
+++ b/llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
@@ -747,6 +747,14 @@ getOperandLog2EEW(const MachineOperand &MO, const MachineRegisterInfo *MRI) {
return TwoTimes ? MILog2SEW + 1 : MILog2SEW;
}
+ // Vector Register Gather with 16-bit Index Elements Instruction
+ // Dest and source data EEW=SEW. Index vector EEW=16.
+ case RISCV::VRGATHEREI16_VV: {
+ if (MO.getOperandNo() == 2)
+ return 4;
+ return MILog2SEW;
+ }
+
default:
return std::nullopt;
}
@@ -1051,6 +1059,11 @@ static bool isSupportedInstr(const MachineInstr &MI) {
case RISCV::VSLIDEDOWN_VI:
case RISCV::VSLIDE1UP_VX:
case RISCV::VFSLIDE1UP_VF:
+ // Vector Register Gather Instructions
+ case RISCV::VRGATHER_VI:
+ case RISCV::VRGATHER_VV:
+ case RISCV::VRGATHER_VX:
+ case RISCV::VRGATHEREI16_VV:
// Vector Single-Width Floating-Point Add/Subtract Instructions
case RISCV::VFADD_VF:
case RISCV::VFADD_VV:
diff --git a/llvm/test/CodeGen/RISCV/rvv/vl-opt-instrs.ll b/llvm/test/CodeGen/RISCV/rvv/vl-opt-instrs.ll
index 317ad0c124e73..a8eba6d3db256 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vl-opt-instrs.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vl-opt-instrs.ll
@@ -5476,9 +5476,8 @@ define <vscale x 4 x i32> @vrgather_vi(<vscale x 4 x i32> %a, <vscale x 4 x i32>
;
; VLOPT-LABEL: vrgather_vi:
; VLOPT: # %bb.0:
-; VLOPT-NEXT: vsetvli a1, zero, e32, m2, ta, ma
-; VLOPT-NEXT: vrgather.vi v12, v8, 5
; VLOPT-NEXT: vsetvli zero, a0, e32, m2, ta, ma
+; VLOPT-NEXT: vrgather.vi v12, v8, 5
; VLOPT-NEXT: vadd.vv v8, v12, v10
; VLOPT-NEXT: ret
%1 = call <vscale x 4 x i32> @llvm.riscv.vrgather.vx.nxv4i32.iXLen(<vscale x 4 x i32> poison, <vscale x 4 x i32> %a, iXLen 5, iXLen -1)
@@ -5497,9 +5496,8 @@ define <vscale x 4 x i32> @vrgather_vv(<vscale x 4 x i32> %a, <vscale x 4 x i32>
;
; VLOPT-LABEL: vrgather_vv:
; VLOPT: # %bb.0:
-; VLOPT-NEXT: vsetvli a1, zero, e32, m2, ta, ma
-; VLOPT-NEXT: vrgather.vv v12, v8, v10
; VLOPT-NEXT: vsetvli zero, a0, e32, m2, ta, ma
+; VLOPT-NEXT: vrgather.vv v12, v8, v10
; VLOPT-NEXT: vadd.vv v8, v12, v8
; VLOPT-NEXT: ret
%1 = call <vscale x 4 x i32> @llvm.riscv.vrgather.vv.nxv4i32(<vscale x 4 x i32> poison, <vscale x 4 x i32> %a, <vscale x 4 x i32> %idx, iXLen -1)
@@ -5518,9 +5516,8 @@ define <vscale x 4 x i32> @vrgather_vx(<vscale x 4 x i32> %a, iXLen %idx, <vscal
;
; VLOPT-LABEL: vrgather_vx:
; VLOPT: # %bb.0:
-; VLOPT-NEXT: vsetvli a2, zero, e32, m2, ta, ma
-; VLOPT-NEXT: vrgather.vx v12, v8, a0
; VLOPT-NEXT: vsetvli zero, a1, e32, m2, ta, ma
+; VLOPT-NEXT: vrgather.vx v12, v8, a0
; VLOPT-NEXT: vadd.vv v8, v12, v10
; VLOPT-NEXT: ret
%1 = call <vscale x 4 x i32> @llvm.riscv.vrgather.vx.nxv4i32.iXLen(<vscale x 4 x i32> poison, <vscale x 4 x i32> %a, iXLen %idx, iXLen -1)
@@ -5539,9 +5536,8 @@ define <vscale x 4 x i32> @vrgatherei16_vv(<vscale x 4 x i32> %a, <vscale x 4 x
;
; VLOPT-LABEL: vrgatherei16_vv:
; VLOPT: # %bb.0:
-; VLOPT-NEXT: vsetvli a1, zero, e32, m2, ta, ma
-; VLOPT-NEXT: vrgatherei16.vv v12, v8, v10
; VLOPT-NEXT: vsetvli zero, a0, e32, m2, ta, ma
+; VLOPT-NEXT: vrgatherei16.vv v12, v8, v10
; VLOPT-NEXT: vadd.vv v8, v12, v8
; VLOPT-NEXT: ret
%1 = call <vscale x 4 x i32> @llvm.riscv.vrgatherei16.vv.nxv4i32(<vscale x 4 x i32> poison, <vscale x 4 x i32> %a, <vscale x 4 x i16> %idx, iXLen -1)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
vrgather has a nasty semantic cornercase where we can't reduce the VL of the data operand. We can reduce the VL of the vrgather itself, and of the index operand, but not the data operand because vrgather can read past VL. Do you have a test for that case? I didn't spot it immediately, but might have missed it.
Edit: It looks like the existing code handles this in an overly conservative way in canReadPastVL, so this is probably just locating an existing test.
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Luke Lau <luke_lau@icloud.com>
Signed-off-by: Mikhail R. Gadelha <mikhail@igalia.com>
This PR adds support for the vrgather.vi, vrgather.vx, vrgather.vv, vrgatherei16.vv instructions in the RISC-V VLOptimizer.
To support vrgatherei16.vv I also needed to add support for it in getOperandLog2EEW.
This is the third PR addressing the list at #147647.