[Scoper] Scoping autoloaded files result in an autoload conflict
Edit to avoid to summarise the state of this issue.
When scoping a package which has autoloaded files, the scoped files will keep the same hash. As a result if your current application/library is using the scoped package and has files with the same hash, only one file will be loaded which results in undesirable behaviours.
It appears the hash strategy is not something we can change composer/composer#7942 and changing the package name of the scoped packages has undesirable side effects so it is not a solution either.
@jarrodbell https://github.com/humbug/php-scoper/issues/298#issuecomment-678089152 and @ondrejmirtes https://github.com/humbug/php-scoper/issues/298#issuecomment-683425709 made it work by changing it after the composer dump process as part of the scoping build process.
The discussion is carrying from that point to figure out if there is a way PHP-Scoper could fix it by itself without further actions from the user.
Bug report
| Question | Answer |
|---|---|
| Box version | PHP Scoper version 0.11.4 2018-11-13 08:49:16 UTC |
| PHP version | PHP 7.1.22 |
| Platform with version | Ubuntu 16.04.5 LTS |
| Github Repo | https://github.com/mollie/mollie-api-php |
We are using php-scoper in our client to generate a release for integrators to be used in for example Wordpress. When it is used there, there are often more packages with Guzzle via composer.
When this happens they will get the exception:
Uncaught Error: Call to undefined function GuzzleHttp\choose_handler()
This is because the corresponding guzzlehttp/guzzle/src/functions_include.php is not loaded for the second implementation due to the composer file require hash being the same.
Below there is an example of our package being ran alongside another guzzle package. This is a var_dump inside the composer_real.php where the functions_include is being required if the file hash is not already loaded.
array(3) {
["c964ee0ededf28c96ebd9db5099ef910"]=>
string(99) "{path_to_site}/vendor/composer/../guzzlehttp/promises/src/functions_include.php"
["a0edc8309cc5e1d60e3047b5df6b7052"]=>
string(95) "{path_to_site}/vendor/composer/../guzzlehttp/psr7/src/functions_include.php"
["37a3dc5111fe8f707ab4c132ef1dbc62"]=>
string(97) "{path_to_site}/vendor/composer/../guzzlehttp/guzzle/src/functions_include.php"
}
array(3) {
["c964ee0ededf28c96ebd9db5099ef910"]=>
string(106) "{path_to_site}/guzzle/vendor/composer/../guzzlehttp/promises/src/functions_include.php"
["a0edc8309cc5e1d60e3047b5df6b7052"]=>
string(102) "{path_to_site}/guzzle/vendor/composer/../guzzlehttp/psr7/src/functions_include.php"
["37a3dc5111fe8f707ab4c132ef1dbc62"]=>
string(104) "{path_to_site}k/guzzle/vendor/composer/../guzzlehttp/guzzle/src/functions_include.php"
}
So this causes only the first implementation it's choose_handler function to be available.
I'm not really sure if this bug should be reported here, but i'm curious to your expert opinion about this. Might it be worth it to also scope the files that outta be required by composer? To avoid some of them not being loaded? As manipulating the hash in above example would fix the problem.
Hi! Thanks for the report. It's nice to see it being tried on Wordpress :)
I admit I'm a bit confused.
Uncaught Error: Call to undefined function GuzzleHttp\choose_handler()
Since it is not a function from the global namespace, unless explicitly configured otherwise, I would expect this function to be scoped. If it is the case, then it means somewhere a piece of code has not been scoped properly and is still using the old reference (which can be an expected case. Unfortunately due to PHP nature PHP-Scoper won't be able to be perfect and will sometimes need some help via the patchers).
This is because the corresponding guzzlehttp/guzzle/src/functions_include.php is not loaded for the second implementation due to the composer file require hash being the same.
What I would find weird/confusing there is why would it be a problem for the scoped version and not the unscoped one? Also if the hash file is the same it means the content is the same, so indeed there shouldn't be a need to include it twice should it?
Yes, it's not a function from the global namespace, but since it is defined in the functions_include.php it will not be available for the second guzzle package.
The hash is the same even though the contents are not.
Scoped functions_include.php:
<?php
namespace _PhpScoper5c3efd0df2edb;
// Don't redefine the functions if included multiple times.
if (!\function_exists('_PhpScoper5c3efd0df2edb\\GuzzleHttp\\uri_template')) {
require __DIR__ . '/functions.php';
}
Unscoped functions_include.php:
<?php
// Don't redefine the functions if included multiple times.
if (!function_exists('GuzzleHttp\uri_template')) {
require __DIR__ . '/functions.php';
}
So whatever functions_include.php gets loaded first is the one that will work.
As the second one won't be included.
The hash is the same even though the contents are not.
Hm this is very weird then. Has the autoloader been dumped again after the scoping?
Yes, i've done a lot of testing with it.
I've tried adding even more code to the functions_include.
Does not change a thing, unfortunately.
I think it would be best to try to narrow down the issue in a reproducible repo to ease the debugging. It may be a bug in Composer or it means Composer is relying on the hash from the installed.json files which would be troublesome and means we would need to do something about it for PHP-Scoper to work properly in this scenario
I'll create a repo for you 👍
The repository: https://github.com/Smitsel/guzzle-scoper-reproduction
If you run into any problems or need any more information just let me know :)
👍 I'll try to take a look ASAP but it's likely gonna take a couple of weeks, this month is quite busy for me :/
Not a problem. We can hack in a solution if it becomes really pressing. Happy that you are willing to take a look!
It's probably here: https://github.com/composer/composer/blob/master/src/Composer/Autoload/AutoloadGenerator.php#L903
Yep, manipulating the package name fixes it indeed.
Very weird choice to do it just based on that. They should also compare content-size or something.
I'd hash the whole ~~package directory~~ file (incl contents), but that may be a bit resource expensive 😄
it's for the dumping the autoloader so I don't think it would be a big deal
Would using getDistSha1Checksum() be a solution? It's on the PackageInterface.
https://github.com/composer/composer/blob/19ba2edd5c9e86a5786245e6349e979e5ca380d1/src/Composer/Package/PackageInterface.php#L172
I'm not sure. But I think it's a Composer issue so might be better to create an issue and look for a solution there.
If not achievable, maybe we can look for a workaround in PHP-Scoper but I would like to avoid that if possible.
@sandervanhooft no its needs to hash the file contents so it can deal with out situation.
@theofidry Could you open an issue at Composer? I think it will carry more weight if you do it as the owner / maintainer of the humbug.
So as per the discussion in https://github.com/composer/composer/issues/7942, I changed my mind and I think it makes sense for PHP-Scoper to change the names of the scoped packages.
So I guess a solution would be to also change the packages names in composer.json and installed.json to solve the issue, maybe by just appending the prefix to the name.
The concerned code is this one I think: https://github.com/humbug/php-scoper/tree/ddcf428/src/Scoper/Composer
I agree with that approach, makes more sense since this will only happen when file contents of required files get scoped or altered. It's not a composer bug. Sounds like a good solution!
I've experimented with this a bit and couldn't find a simple way to solve this within scoper.
Once the package-name is altered in installed.json - a autoload dump ( which is recommended after scoping right? ) won't work - as the package name does no longer reflect the path on the file system for the vendor libraries.
For the time being I've went to parsing autoload_static.php and autoload_files.php and prefix the file hashes in there to trick autoloader into loading it again
But this is then a 3 step process:
- scoper
- dumper autoloader
- fix autoloader
Any ideas how to best move forward here?
I think fixing the autoloader itself is a must avoid: we are jumping in an incredibly more complex solution if we do so.
I'm quite busy at the moment so I can't take a proper look but maybe @naderman has an idea?
@naderman to avoid the whole thread, the issue here is that:
- Guzzle relies on autoloaded files for declaring some functions
- The scoped Guzzle works as expected, except the autoloaded files loading strategy relies on the file name + package name, which might be the same as the non-scoped Guzzle
The result is that one file is loaded (e.g. the non scoped Guzzle one) and the scoped one won't be loaded since has the same hash as the non-scoped one.
Related Composer issue: https://github.com/composer/composer/issues/7942
Understood - however I am not sure how to catch this in scoper - when one is supposed to dump the autoloader after using scoper.
What solved it for me was this simple script"
<?php
/**
* This helper is needed to "trick" composer autoloader to load the prefixed files
* Otherwise if owncloud/core contains the same libraries ( i.e. guzzle ) it won't
* load the files, as the file hash is the same and thus composer would think this was already loaded
*
* More information also found here: https://github.com/humbug/php-scoper/issues/298
*/
$scoper_path = './build/scoper/vendor/composer';
$static_loader_path = $scoper_path.'/autoload_static.php';
echo "Fixing $static_loader_path \n";
$static_loader = file_get_contents($static_loader_path);
$static_loader = \preg_replace('/\'([A-Za-z0-9]*?)\' => __DIR__ \. (.*?),/', '\'a$1\' => __DIR__ . $2,', $static_loader);
file_put_contents($static_loader_path, $static_loader);
$files_loader_path = $scoper_path.'/autoload_files.php';
echo "Fixing $files_loader_path \n";
$files_loader = file_get_contents($files_loader_path);
$files_loader = \preg_replace('/\'(.*?)\' => (.*?),/', '\'a$1\' => $2,', $files_loader);
file_put_contents($files_loader_path, $files_loader);
Maybe a way would be to collect the included scoped files and append them to the dumped autoloader?
I'm here with the same problem, I have a WordPress plugin bundling the scoped AWS SDK, but it loads a functions file before another plugin with the SDK.
@theofidry have you had any further thoughts on this?
@patrickjahns is https://github.com/humbug/php-scoper/issues/298#issuecomment-525700081 still your working solution? ~~Where does that code live?~~ Found it https://github.com/owncloud/files_primary_s3/pull/237/files#diff-89e7fc97925a92c2521656947a720895
Update: Thanks @patrickjahns - works a treat!
No I couldn't really check. One would need to check if, when the scoped file is appended to the scoper-autoload.php like it is already done for class & function aliases, works. If that's the case then it could be possible to collect those kind of files to append them to the scoper autoloader
Thinking about this again, a possible solution would be to have a FileCollector class and:
- when coming across a
installed.jsonfile:- collect all the files listed in
autoload.files(which can only be files, not directories, if I understood correctly) inFileCollector - remove all those files from the
installed.json
- collect all the files listed in
- when coming across a
composer.json:- collect all the files listed in
autoload.filesinFileCollector - remove all those files from
composer.json
- collect all the files listed in
- ignore
composer.lockfiles
And then when dumping the PHP-Scoper autoloader we can append all those files there and they should not be in the regular Composer vendor/composer/autoload_files.php (even after dumping the autoloader again).
I can help out if one desires to implement this, but I'm unlikely to work on this in a foreseeable future
It seems like the most simple solutions are:
- Patching the autoloader files after generation as @patrickjahns proposed
- Changing the
getFileIdentifier()function inside composer. It could add the composer package name of the current composer.json - if available - to generate a unique hash across multiple packages.
$composer = json_decode(file_get_contents('path/to/composer.json));
$rootName = isset($composer->name) ? $composer->name . ':' : '';
md5($rootName . $package->getName() . ':' . $path);
Is there anything I'm missing?
It could add the composer package name of the current composer.json - if available - to generate a unique hash across multiple packages.
This won't be enough: the example here is that the conflicting file is from the same package: one scoped and the other not. So instead it should either be based on the content or base on the file path for example (so that the same file name but placed in different places don't conflict)
the example here is that the conflicting file is from the same package: one scoped and the other not.
The initial issue mentioned WordPress plugins. What I understood - and what is my use case - is that plugin-a/vendor/guzzlehttp and plugin-b/vendor/guzzlehttp collide since the relative paths to the function files are the same (./vendors/guzzlehttp/src/functions_include.php) and so is the md5 hash.
For this general WordPress plugin use case including the root package name (plugin-a or plugin-b) into the md5 hash would solve this issue.
ha correct, the relative path was not enough hence the suggestion of using a hash of the file content instead somewhere