[instcombine, sha] Add Instruction Fold for Masked Merge #7145

Open
@llvmbot

Description

| Field | Value |
| --- | --- |
| Bugzilla Link | 6773 |
| Version | trunk |
| OS | Linux |
| Blocks | llvm/llvm-bugzilla-archive#49930 |
| Reporter | LLVM Bugzilla Contributor |
| CC | @asl, @d0k, @lattner, @topperc, @efriedma-quic, @LebedevRI, @RKSimon, @sunfishcode, @nlewycky, @rotateright |
| Fixed by commit(s) | 333106 |

Extended Description

When experimenting with SHA, I was pleased to notice that LLVM will fold the "Maj" function (x & y) ^ (x & z) ^ (y & z) down to ((z ^ y) & x) ^ (y & z).

The "Ch" function, though, doesn't fold. (x & y) | (~x & z) should become ((y ^ z) & x) ^ z, as mentioned on http://graphics.stanford.edu/~seander/bithacks.html#MaskedMerge (as should (x & y) ^ (~x & z), the version used in the SHA standard).
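As a sanity check on the requested fold, here is a minimal C sketch comparing the three forms of Ch. The function names (ch_or, ch_xor, ch_folded, ch_forms_agree) are made up for illustration and are not part of LLVM or the SHA standard. Because every operation is bitwise, each bit position is independent, so trying all-zeros and all-ones for each input covers the full truth table.

```c
#include <stdint.h>

/* Ch written with OR: the two masked halves can never overlap. */
uint32_t ch_or(uint32_t x, uint32_t y, uint32_t z) {
    return (x & y) | (~x & z);
}

/* Ch written with XOR, the form used in the SHA standard. */
uint32_t ch_xor(uint32_t x, uint32_t y, uint32_t z) {
    return (x & y) ^ (~x & z);
}

/* The folded form this report asks for: one AND and two XORs. */
uint32_t ch_folded(uint32_t x, uint32_t y, uint32_t z) {
    return ((y ^ z) & x) ^ z;
}

/* Returns 1 if all three forms agree on every row of the truth table. */
int ch_forms_agree(void) {
    const uint32_t v[2] = {0u, 0xFFFFFFFFu};
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            for (int k = 0; k < 2; k++) {
                uint32_t want = ch_folded(v[i], v[j], v[k]);
                if (ch_or(v[i], v[j], v[k]) != want) return 0;
                if (ch_xor(v[i], v[j], v[k]) != want) return 0;
            }
    return 1;
}
```

The OR and XOR spellings agree because (x & y) and (~x & z) never have a set bit in common, so either operator merges them the same way.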

(If you're wondering at the similarity, Maj(x,y,z) is equivalent to Ch(x, y|z, y&z). Using that implementation with the optimized Ch again gives optimal code from LLVM as it knows to fold (y|z)^(y&z) down to y^z.)
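The Maj/Ch relationship can be checked the same way. This is only a sketch with hypothetical helper names (ch_select, maj_direct, maj_via_ch, maj_forms_agree); it also checks the (y|z)^(y&z) to y^z step mentioned above. As before, all-zeros and all-ones inputs are enough because the operations are purely bitwise.

```c
#include <stdint.h>

/* Folded Ch: picks bits of y where x is 1, bits of z where x is 0. */
uint32_t ch_select(uint32_t x, uint32_t y, uint32_t z) {
    return ((y ^ z) & x) ^ z;
}

/* Maj as written in the SHA standard. */
uint32_t maj_direct(uint32_t x, uint32_t y, uint32_t z) {
    return (x & y) ^ (x & z) ^ (y & z);
}

/* Maj expressed through Ch: where x is 1 the majority is y|z,
   where x is 0 it is y&z. */
uint32_t maj_via_ch(uint32_t x, uint32_t y, uint32_t z) {
    return ch_select(x, y | z, y & z);
}

/* Returns 1 if both Maj forms agree and (y|z)^(y&z) == y^z
   on every row of the truth table. */
int maj_forms_agree(void) {
    const uint32_t v[2] = {0u, 0xFFFFFFFFu};
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            for (int k = 0; k < 2; k++) {
                uint32_t x = v[i], y = v[j], z = v[k];
                if (maj_direct(x, y, z) != maj_via_ch(x, y, z)) return 0;
                if (((y | z) ^ (y & z)) != (y ^ z)) return 0;
            }
    return 1;
}
```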

LLVM IR from Bitter Melon:

define i32 @Ch(i32 %x, i32 %y, i32 %z) nounwind readnone {
entry:
  %0 = and i32 %y, %x                             ; <i32> [#uses=1]
  %not = xor i32 %x, -1                           ; <i32> [#uses=1]
  %1 = and i32 %z, %not                           ; <i32> [#uses=1]
  %2 = xor i32 %1, %0                             ; <i32> [#uses=1]
  ret i32 %2
}

define i32 @Maj(i32 %x, i32 %y, i32 %z) nounwind readnone {
entry:
  %0 = xor i32 %z, %y                             ; <i32> [#uses=1]
  %1 = and i32 %0, %x                             ; <i32> [#uses=1]
  %2 = and i32 %z, %y                             ; <i32> [#uses=1]
  %3 = xor i32 %1, %2                             ; <i32> [#uses=1]
  ret i32 %3
}

define i32 @Ch2(i32 %x, i32 %y, i32 %z) nounwind readnone {
entry:
  %0 = xor i32 %z, %y                             ; <i32> [#uses=1]
  %1 = and i32 %0, %x                             ; <i32> [#uses=1]
  %2 = xor i32 %1, %z                             ; <i32> [#uses=1]
  ret i32 %2
}

define i32 @Maj2(i32 %x, i32 %y, i32 %z) nounwind readnone {
entry:
  %0 = and i32 %z, %y                             ; <i32> [#uses=1]
  %1 = xor i32 %z, %y                             ; <i32> [#uses=1]
  %2 = and i32 %1, %x                             ; <i32> [#uses=1]
  %3 = xor i32 %2, %0                             ; <i32> [#uses=1]
  ret i32 %3
}

From the following C source:

int Ch(int x, int y, int z) {
   return (x&y) ^ (~x&z);
}
int Maj(int x, int y, int z) {
   return (x&y) ^ (x&z) ^ (y&z);
}

int Ch2(int x, int y, int z) {
   return z ^ ((y ^ z) & x);
}
int Maj2(int x, int y, int z) {
   return Ch2(x, y|z, y&z);
}

Metadata

Labels

bugzilla (Issues migrated from bugzilla)
llvm:instcombine (Covers the InstCombine, InstSimplify and AggressiveInstCombine passes)
missed-optimization
