llvm-project icon indicating copy to clipboard operation
llvm-project copied to clipboard

SelectionDAG: Improve expandFMINIMUM_FMAXIMUM

Open wzssyqa opened this issue 10 months ago • 6 comments

  1. Use FMAXNUM_IEEE, FMAXNUM, FMAXIMUMNUM if Legal or FMAXNUM_IEEE, FMAXIMUMNUM if Custom. We try the Legal ones first and then Custom ones then.
  2. For NaNs:
    • If Max/Min Opc can be used, we set the result to qNaN if LHS is unordered with RHS.
    • If we use FCMP, we move all NaNs in RHS to LHS only, as we perfer LHS if LHS is unordered with RHS.
  3. For +0 vs -0
    • If Max/Min Opc can be used, as all of them process it correctly now, we can just skip it.
    • If we use FCMP, we prefer RHS if equal, and then we determine the Sign of LHS when the result equals 0:
      1. if IsMax and LHS is positive, we switch LHS.
      2. if IsMax and LHS is negitive, we switch MinMax.
      3. if not IsMax and LHS is positive, we switch MinMax.
      4. if not IsMax and LHS is negitive, we switch LHS.
  4. we introduce a new static function determineFloatSign, which return the SDValue to determine the Sign of a SDValue.

wzssyqa avatar Apr 25 '25 17:04 wzssyqa

@llvm/pr-subscribers-backend-mips @llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-backend-nvptx @llvm/pr-subscribers-backend-powerpc

@llvm/pr-subscribers-llvm-selectiondag

Author: YunQiang Su (wzssyqa)

Changes

ISD::FMAXNUM and ISD::FMINNUM treat +0.0>-0.0 now, so let's set MinMaxMustRespectOrderedZero for it.


Full diff: https://github.com/llvm/llvm-project/pull/137367.diff

1 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp (+1-2)
diff --git a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
index 6930b54ddb14a..7baed2d591514 100644
--- a/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
@@ -8573,8 +8573,6 @@ SDValue TargetLowering::expandFMINIMUM_FMAXIMUM(SDNode *N,
   unsigned CompOpcIeee = IsMax ? ISD::FMAXNUM_IEEE : ISD::FMINNUM_IEEE;
   unsigned CompOpc = IsMax ? ISD::FMAXNUM : ISD::FMINNUM;
 
-  // FIXME: We should probably define fminnum/fmaxnum variants with correct
-  // signed zero behavior.
   bool MinMaxMustRespectOrderedZero = false;
 
   if (isOperationLegalOrCustom(CompOpcIeee, VT)) {
@@ -8582,6 +8580,7 @@ SDValue TargetLowering::expandFMINIMUM_FMAXIMUM(SDNode *N,
     MinMaxMustRespectOrderedZero = true;
   } else if (isOperationLegalOrCustom(CompOpc, VT)) {
     MinMax = DAG.getNode(CompOpc, DL, VT, LHS, RHS, Flags);
+    MinMaxMustRespectOrderedZero = true;
   } else {
     if (VT.isVector() && !isOperationLegalOrCustom(ISD::VSELECT, VT))
       return DAG.UnrollVectorOp(N);

llvmbot avatar Apr 25 '25 17:04 llvmbot

See:https://github.com/llvm/llvm-project/pull/168838

wzssyqa avatar Dec 02 '25 02:12 wzssyqa

See:#168838

This was reverted, and the contentious part is signaling nan handling. One of the reasons simply reverting this is wrong is the signed zero behavior should remain regardless of that

arsenm avatar Dec 02 '25 06:12 arsenm

:penguin: Linux x64 Test Results

  • 187865 tests passed
  • 4989 tests skipped

:white_check_mark: The build succeeded and all tests passed.

github-actions[bot] avatar Dec 04 '25 03:12 github-actions[bot]

:window: Windows x64 Test Results

  • 128941 tests passed
  • 2842 tests skipped

:white_check_mark: The build succeeded and all tests passed.

github-actions[bot] avatar Dec 09 '25 06:12 github-actions[bot]

llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll line 11831 at r6 (raw file):

; GFX90A-NEXT:  .LBB148_1: ; %atomicrmw.start
; GFX90A-NEXT:    ; =>This Inner Loop Header: Depth=1
; GFX90A-NEXT:    v_cmp_u_f16_e32 vcc, v5, v5

This broke the max formation

arsenm avatar Dec 15 '25 20:12 arsenm