SelectionDAG: Improve expandFMINIMUM_FMAXIMUM
- Use FMAXNUM_IEEE, FMAXNUM, FMAXIMUMNUM if Legal or FMAXNUM_IEEE, FMAXIMUMNUM if Custom. We try the Legal ones first and then Custom ones then.
- For NaNs:
- If Max/Min Opc can be used, we set the result to qNaN if LHS is unordered with RHS.
- If we use FCMP, we move all NaNs in RHS to LHS only, as we perfer LHS if LHS is unordered with RHS.
- For +0 vs -0
- If Max/Min Opc can be used, as all of them process it correctly now, we can just skip it.
- If we use FCMP, we prefer RHS if equal, and then we determine the Sign of LHS when the result equals 0:
- if IsMax and LHS is positive, we switch LHS.
- if IsMax and LHS is negitive, we switch MinMax.
- if not IsMax and LHS is positive, we switch MinMax.
- if not IsMax and LHS is negitive, we switch LHS.
- we introduce a new static function
determineFloatSign, which return the SDValue to determine the Sign of a SDValue.
@llvm/pr-subscribers-backend-mips @llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-backend-nvptx @llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-llvm-selectiondag
Author: YunQiang Su (wzssyqa)
Changes
ISD::FMAXNUM and ISD::FMINNUM treat +0.0>-0.0 now, so let's set MinMaxMustRespectOrderedZero for it.
Full diff: https://github.com/llvm/llvm-project/pull/137367.diff
1 Files Affected:
- (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+1-2)
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 6930b54ddb14a..7baed2d591514 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -8573,8 +8573,6 @@ SDValue TargetLowering::expandFMINIMUM_FMAXIMUM(SDNode *N,
unsigned CompOpcIeee = IsMax ? ISD::FMAXNUM_IEEE : ISD::FMINNUM_IEEE;
unsigned CompOpc = IsMax ? ISD::FMAXNUM : ISD::FMINNUM;
- // FIXME: We should probably define fminnum/fmaxnum variants with correct
- // signed zero behavior.
bool MinMaxMustRespectOrderedZero = false;
if (isOperationLegalOrCustom(CompOpcIeee, VT)) {
@@ -8582,6 +8580,7 @@ SDValue TargetLowering::expandFMINIMUM_FMAXIMUM(SDNode *N,
MinMaxMustRespectOrderedZero = true;
} else if (isOperationLegalOrCustom(CompOpc, VT)) {
MinMax = DAG.getNode(CompOpc, DL, VT, LHS, RHS, Flags);
+ MinMaxMustRespectOrderedZero = true;
} else {
if (VT.isVector() && !isOperationLegalOrCustom(ISD::VSELECT, VT))
return DAG.UnrollVectorOp(N);
See:https://github.com/llvm/llvm-project/pull/168838
See:#168838
This was reverted, and the contentious part is signaling nan handling. One of the reasons simply reverting this is wrong is the signed zero behavior should remain regardless of that
:penguin: Linux x64 Test Results
- 187865 tests passed
- 4989 tests skipped
:white_check_mark: The build succeeded and all tests passed.
:window: Windows x64 Test Results
- 128941 tests passed
- 2842 tests skipped
:white_check_mark: The build succeeded and all tests passed.
llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll line 11831 at r6 (raw file):
; GFX90A-NEXT: .LBB148_1: ; %atomicrmw.start ; GFX90A-NEXT: ; =>This Inner Loop Header: Depth=1 ; GFX90A-NEXT: v_cmp_u_f16_e32 vcc, v5, v5
This broke the max formation