rector icon indicating copy to clipboard operation
rector copied to clipboard

Add rule to improve performance checking on empty array

Open JohJohan opened this issue 3 years ago • 1 comments

Feature Request

Just like https://github.com/rectorphp/rector/blob/main/docs/rector_rules_overview.md#countarraytoemptyarraycomparisonrector i think it would be good performance to change empty() function call on array values to [] check example:

Diff

$array = [];

-if (empty($array)) {}; 
+if ([] === $array) {};
-if (!empty($array)) {}; 
+ if ([] !== $array) {};

This can only be done if we know that the variable is array. So for typed properties, function arguments or function return types

JohJohan avatar Sep 15 '22 12:09 JohJohan

There is SimplifyEmptyArrayCheckRector that require is_array() check on the condition https://github.com/rectorphp/rector/blob/main/docs/rector_rules_overview.md#simplifyemptyarraycheckrector

Feel free to create feature PR for it.

Some note:

  • nullable array may need get type from Scope->getType() instead of NodeTypeResolver->getType(), see https://github.com/rectorphp/rector-src/pull/2935
  • check combination with RemoveUnusedNonEmptyArrayBeforeForeachRector to avoid unnecessary refactor as will be removed after that.

samsonasik avatar Sep 19 '22 14:09 samsonasik

@samsonasik am i allowed to extend the rule: SimplifyEmptyArrayCheckRector https://github.com/rectorphp/rector/blob/main/docs/rector_rules_overview.md#simplifyemptyarraycheckrector as it describes what i also want only changes the fact to check on in_array() and checking on types array but the rule name also suggests that

johanadivare avatar Oct 12 '22 06:10 johanadivare

I added some tests see https://github.com/rectorphp/rector-src/compare/rectorphp:main...JohJohan:7485?expand=1

johanadivare avatar Oct 12 '22 07:10 johanadivare

I prefer for new rule for that, as Empty_ is not a FuncCall, also, validating array type may cause an issue without is_array() check like previous issue I mentioned, so different rule is better so that will be easier to locate when found a bug.

samsonasik avatar Oct 12 '22 10:10 samsonasik

Isn't this a feature already provided by code style enforcing tools?

staabm avatar Nov 16 '22 06:11 staabm

@staabm maybe you can give an example? And might still benefit rector as we also have SimplifyEmptyArrayCheckRector rule

johanadivare avatar Nov 16 '22 07:11 johanadivare

@johanadivare I think new rule is a way to go here, as the construction is a bit different and empty has lot of traps we have to deal with also in other rules :). Could you send a PR?

TomasVotruba avatar Nov 16 '22 08:11 TomasVotruba

I created https://github.com/rectorphp/rector-src/pull/3069

I did checkout the notes from @samsonasik https://github.com/rectorphp/rector/issues/7485#issuecomment-1251129232 but: note 1: I only check on ArrayType so if the type is nullable i dont replace it note 2: do you want me to write a test in combination with RemoveUnusedNonEmptyArrayBeforeForeachRector

johanadivare avatar Nov 16 '22 13:11 johanadivare

Sorry to bring up this old issue, but what I am wondering where the statement comes from, that the comparison with an empty array is faster?

Some time ago I found the following StackOverflow answer which includes a benchmark. This benchmark shows that in fact checking against an empty array is the slowest (which might be due to the fact that for this one needs to construct/allocate an empty array). I can reproduce similar results for PHP 8.2.

In general I agree, that one should probably not use the empty check and the bool cast, which has similar issues. I usually use the \count(...) > 0 approach, which is in my option the best compromise.

Have you performed any benchmarks to verify the statements, which tell otherwise? Or do I miss something else?

aragon999 avatar Feb 14 '23 18:02 aragon999

@aragon999 count will be slower on complex array structure, if you have array structure like this:

for ($i = 0; $i < 50; $i++) {
    $arr1[] = 0;
    $arr2[] = [
        [
            'test' => [
                'test' => [
                    'test',
                ],
            ],
        ],
    ];
    $arr3[] = md5($i);
    $arr4[] = $i % 2 ? $i : md5($i);
    $arr5[md5($i)] = $i;
}

samsonasik avatar Feb 14 '23 18:02 samsonasik

@samsonasik thank you for the feedback. Thanks I will do some further benchmarks for myself, probably disable the rule and use the appropriate method depending on the array.

aragon999 avatar Feb 14 '23 18:02 aragon999

You should use a profiler to decide whether its worth the effort.

In most applications it doesn't matter at all which variant you choose

staabm avatar Feb 14 '23 18:02 staabm

As the docblock suggest: $array = []; if(empty($values))', '$array = []; if([] === $values) we check [] === $values Yoda style but the actual output will be if($values === []).

@TomasVotruba is this something i can fix with a pull request or is this something for a php ecs fixer?

I found this out after implementing it in our projects and i get allot off: 172 | ERROR | Use Yoda conditions when checking a variable against an expression to avoid an accidental assignment inside the condition statement.

johanadivare avatar Feb 28 '23 15:02 johanadivare

@johanadivare you can use YodaStyleFixer from php-cs-fixer on ecs for that:

use PhpCsFixer\Fixer\ControlStructure\YodaStyleFixer;

// ...

    $ecsConfig->ruleWithConfiguration(YodaStyleFixer::class, ['identical' => \true]);

samsonasik avatar Feb 28 '23 15:02 samsonasik