Skip to content

[InstCombine] Canoncalize complex boolean expressions into ~((y | z) ^ x) #149530

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

yafet-a
Copy link
Contributor

@yafet-a yafet-a commented Jul 18, 2025

Fixes #97044

Optimizations added:
((x&y&~z) | (x&~y&z) | (~x&~y&~z) | (x&y&z))~((y | z) ^ x)
((~z) & ((x&y) | (~x&~y))) ^ (z & ((x&y) | (x&~y)))~((y | z) ^ x)
((x&y) | (~x&~y)) ^ (z & (((x&y) | (~x&~y)) ^ ((x&y) | (x&~y))))~((y | z) ^ x)

Results :

  • All 4 test expressions now produce identical optimal assembly (2 instructions - orr, eon)
  • Significant improvement in certain cases from 10+ instructions to just the two for complex cases such as test0 (mentioned in the original issue and included in the new test added).
  • Matches/exceeds GCC's optimisation quality on these patterns where it was unable to before - https://godbolt.org/z/bKejnjo7o

@yafet-a yafet-a requested a review from nikic as a code owner July 18, 2025 15:13
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Jul 18, 2025
@llvmbot
Copy link
Member

llvmbot commented Jul 18, 2025

@llvm/pr-subscribers-llvm-transforms

Author: (yafet-a)

Changes

Fixes #97044

Optimizations added:
((x&y&~z) | (x&~y&z) | (~x&~y&~z) | (x&y&z)) → ~((y | z) ^ x)
((~z) & ((x&y) | (~x&~y))) ^ (z & ((x&y) | (x&~y)))~((y | z) ^ x)
((x&y) | (~x&~y)) ^ (z & (((x&y) | (~x&~y)) ^ ((x&y) | (x&~y)))) → ~((y | z) ^ x)

Results :

  • All 4 test expressions now produce identical optimal assembly (3 instructions)
  • Significant improvement from 10+ instructions to just 2 computational instructions (orr, eon) for complex cases such as test0 (mentioned in the original issue and included in the new test added).
  • Matches/exceeds GCC's optimisation quality on these patterns where it was unable to before - https://godbolt.org/z/bKejnjo7o

Full diff: https://github.com/llvm/llvm-project/pull/149530.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp (+50)
  • (added) llvm/test/Transforms/InstCombine/pr97044.ll (+94)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index 3beda6bc5ba38..5d9bee7b30fad 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -2139,6 +2139,44 @@ static Instruction *foldComplexAndOrPatterns(BinaryOperator &I,
           X);
   }
 
+  // ((~Z) & ((X & Y) | (~X & ~Y))) ^ (Z & ((X & Y) | (X & ~Y))) -> ~((Y | Z) ^ X)
+  {
+    {
+      Value *X, *Y, *Z;
+      Value *SomethingOrZ, *ZAndX;
+      
+      if (match(&I, m_c_Xor(m_Value(SomethingOrZ), m_Value(ZAndX))) &&
+          match(ZAndX, m_And(m_Value(Z), m_Value(X))) &&
+          match(SomethingOrZ, m_Or(m_Value(), m_Specific(Z)))) {        
+        Value *Something;
+        if (match(SomethingOrZ, m_Or(m_Value(Something), m_Specific(Z))) &&
+            match(Something, m_Xor(m_Specific(X), m_Value(Y)))) {
+          Value *YOrZ = Builder.CreateOr(Y, Z);
+          Value *YOrZXorX = Builder.CreateXor(YOrZ, X);
+          return BinaryOperator::CreateNot(YOrZXorX);
+        }
+      }
+    }
+    
+    // ((X & Y) | (~X & ~Y)) ^ (Z & (((X & Y) | (~X & ~Y)) ^ ((X & Y) | (X & ~Y)))) -> ~((Y | Z) ^ X)
+    if (match(Op1, m_AllOnes())) {      
+      Value *X, *Y, *Z;
+      Value *XorWithY;
+      if (match(Op0, m_Xor(m_Value(XorWithY), m_Value(Y)))) {        
+        Value *ZAndNotY;
+        if (match(XorWithY, m_Xor(m_Value(X), m_Value(ZAndNotY)))) {          
+          Value *NotY;
+          if (match(ZAndNotY, m_And(m_Value(Z), m_Value(NotY))) &&
+              match(NotY, m_Not(m_Specific(Y)))) {            
+            Value *YOrZ = Builder.CreateOr(Y, Z);
+            Value *YOrZXorX = Builder.CreateXor(YOrZ, X);
+            return BinaryOperator::CreateNot(YOrZXorX);
+          }
+        }
+      }
+    }
+  }
+
   return nullptr;
 }
 
@@ -3780,6 +3818,18 @@ Instruction *InstCombinerImpl::visitOr(BinaryOperator &I) {
     return replaceInstUsesWith(I, V);
 
   Value *Op0 = I.getOperand(0), *Op1 = I.getOperand(1);
+  
+  // ((X & Y & ~Z) | (X & ~Y & Z) | (~X & ~Y &~Z) | (X & Y &Z)) -> ~((Y | Z) ^ X)
+  {
+    Value *X, *Y, *Z;
+    Value *Term1, *Term2, *XAndYAndZ;
+    if (match(&I, m_Or(m_Or(m_Value(Term1), m_Value(Term2)), m_Value(XAndYAndZ))) &&
+        match(XAndYAndZ, m_And(m_And(m_Value(X), m_Value(Y)), m_Value(Z)))) {
+      Value *YOrZ = Builder.CreateOr(Y, Z);
+      Value *YOrZXorX = Builder.CreateXor(YOrZ, X);
+      return BinaryOperator::CreateNot(YOrZXorX);
+    }
+  }
   Type *Ty = I.getType();
   if (Ty->isIntOrIntVectorTy(1)) {
     if (auto *SI0 = dyn_cast<SelectInst>(Op0)) {
diff --git a/llvm/test/Transforms/InstCombine/pr97044.ll b/llvm/test/Transforms/InstCombine/pr97044.ll
new file mode 100644
index 0000000000000..4e45f88956d89
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/pr97044.ll
@@ -0,0 +1,94 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+; Tests for GitHub issue #97044 - Boolean instruction canonicalization
+; All expressions should optimise to the same canonical form: ~((y | z) ^ x)
+
+define i32 @test0_4way_or(i32 %x, i32 %y, i32 %z) {
+; CHECK-LABEL: @test0_4way_or(
+; CHECK-NEXT:    [[TMP1:%.*]] = or i32 [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[TMP2:%.*]] = xor i32 [[TMP1]], [[X:%.*]]
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP2]], -1
+; CHECK-NEXT:    ret i32 [[TMP3]]
+;
+  %not = xor i32 %z, -1
+  %and = and i32 %y, %not
+  %and1 = and i32 %and, %x
+  %not2 = xor i32 %y, -1
+  %and3 = and i32 %x, %not2
+  %and4 = and i32 %and3, %z
+  %or = or i32 %and1, %and4
+  %not5 = xor i32 %x, -1
+  %not6 = xor i32 %y, -1
+  %and7 = and i32 %not5, %not6
+  %not8 = xor i32 %z, -1
+  %and9 = and i32 %and7, %not8
+  %or10 = or i32 %or, %and9
+  %and11 = and i32 %x, %y
+  %and12 = and i32 %and11, %z
+  %or13 = or i32 %or10, %and12
+  ret i32 %or13
+}
+
+define i32 @test1_xor_pattern(i32 %x, i32 %y, i32 %z) {
+; CHECK-LABEL: @test1_xor_pattern(
+; CHECK-NEXT:    [[TMP2:%.*]] = xor i32 [[TMP1:%.*]], [[X:%.*]]
+; CHECK-NEXT:    [[AND4_DEMORGAN:%.*]] = or i32 [[TMP2]], [[Z:%.*]]
+; CHECK-NEXT:    [[AND8:%.*]] = and i32 [[Z]], [[TMP1]]
+; CHECK-NEXT:    [[TMP4:%.*]] = xor i32 [[AND4_DEMORGAN]], -1
+; CHECK-NEXT:    [[TMP3:%.*]] = or i32 [[AND8]], [[TMP4]]
+; CHECK-NEXT:    ret i32 [[TMP3]]
+;
+  %not = xor i32 %z, -1
+  %and = and i32 %x, %y
+  %not1 = xor i32 %x, -1
+  %not2 = xor i32 %y, -1
+  %and3 = and i32 %not1, %not2
+  %or = or i32 %and, %and3
+  %and4 = and i32 %not, %or
+  %and5 = and i32 %x, %y
+  %and6 = and i32 %x, %not2
+  %or7 = or i32 %and5, %and6
+  %and8 = and i32 %z, %or7
+  %xor = xor i32 %and4, %and8
+  ret i32 %xor
+}
+
+define i32 @test2_nested_xor(i32 %x, i32 %y, i32 %z) {
+; CHECK-LABEL: @test2_nested_xor(
+; CHECK-NEXT:    [[TMP3:%.*]] = xor i32 [[TMP2:%.*]], -1
+; CHECK-NEXT:    [[AND8:%.*]] = and i32 [[Z:%.*]], [[TMP3]]
+; CHECK-NEXT:    [[TMP1:%.*]] = xor i32 [[X:%.*]], [[AND8]]
+; CHECK-NEXT:    ret i32 [[TMP1]]
+;
+  %and = and i32 %x, %y
+  %not = xor i32 %x, -1
+  %not1 = xor i32 %y, -1
+  %and2 = and i32 %not, %not1
+  %or = or i32 %and, %and2
+  %and3 = and i32 %x, %y
+  %not4 = xor i32 %y, -1
+  %and5 = and i32 %x, %not4
+  %or6 = or i32 %and3, %and5
+  %xor = xor i32 %or, %or6
+  %not7 = xor i32 %y, -1
+  %and8 = and i32 %z, %not7
+  %and9 = and i32 %xor, %and8
+  %xor10 = xor i32 %or, %and9
+  %xor11 = xor i32 %xor10, %y
+  %xor12 = xor i32 %xor11, -1
+  ret i32 %xor12
+}
+
+define i32 @test3_already_optimal(i32 %x, i32 %y, i32 %z) {
+; CHECK-LABEL: @test3_already_optimal(
+; CHECK-NEXT:    [[OR:%.*]] = or i32 [[Y:%.*]], [[Z:%.*]]
+; CHECK-NEXT:    [[XOR:%.*]] = xor i32 [[OR]], [[X:%.*]]
+; CHECK-NEXT:    [[NOT:%.*]] = xor i32 [[XOR]], -1
+; CHECK-NEXT:    ret i32 [[NOT]]
+;
+  %or = or i32 %y, %z
+  %xor = xor i32 %or, %x
+  %not = xor i32 %xor, -1
+  ret i32 %not
+}

Copy link

github-actions bot commented Jul 18, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@yafet-a yafet-a force-pushed the users/yafet-a/boolean-optimisation branch from 7c960cb to eb53480 Compare July 18, 2025 16:05
Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't accept complex patterns without real-world usefulness: https://llvm.org/docs/InstCombineContributorGuide.html#real-world-usefulness

Currently, the original case can be simplified into a smaller expression: https://godbolt.org/z/En3za8Gjz. Please check if we can fold it into the target pattern through a simpler transformation (2-3 instructions). Then you need to demonstrate its real-world usefulness.

Reducing the size of an expression is not the business of optimizing compilers. Instead, it is widely used by decompilers. See also https://docs.hex-rays.com/user-guide/decompiler/goomba.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[LLVM] Missed optimization on boolean instructions.
3 participants