tycho icon indicating copy to clipboard operation
tycho copied to clipboard

Tycho doesn't use credentials while downloading artifacts from secured p2

Open masofcon opened this issue 2 years ago • 18 comments

We used 2.x Tycho to build our project. Some of our repos in target platform are secured. We faced with auth problems while migrating to 4.x Tycho.

Quick debug shows that MavenAuthenticator uses different instances of DefaultRepositoryIdManager during build.

I've attached some stacks during our build.

DefaultRepositoryIdManager first initialization first-init-stack.txt

Population with repos ids adds-mappings-to-created-repidmanager-stack.txt

Access to populated instance getKnownMavenRepositoryLocations-from-first-repoidmanager-stack.txt

During validate-classpath stage of tycho-compiler plugin Tycho uses newly created instance of DefaultRepositoryIdManager which is not populated with information about repo to id mappings second-init-stack.txt getKnownMavenRepositoryLocations-from-second-repoidmanager-stack.txt

I've also tried to create example in PasswordProtectedP2RepositoryTest but have no luck.

masofcon avatar Nov 19 '23 13:11 masofcon

This behavior is reproduced in Tycho 4.0.2 at least. The reason is in HttpTransportProtocolHandler.transportFactoryMap. During the validation phase contains two pairs of an URLHttpTransportFactory'es (two for JavaUrl and two for Java11). One of JavaUrl's is not properly initialized. HttpTransportProtocolHandler instance that is used during the validation phase also differs from one during the compilation phase. TychoRepositoryTransport also has double transportProtocolHandlers(8 totally) during the validation phase, yet it is the same instance which is used during the build - only with changed transportProtocolHandlers

Tested in 4.0.4 - same behavior.

image

EntrySet's iterable it's a DefaultPlexusBeans, so it's a filtered by Key[type=org.eclipse.tycho.p2maven.transport.TransportProtocolHandler, [email protected]] selection from loaded beans, not some local collection with occasional doubling

image

tretyakevich avatar Nov 20 '23 02:11 tretyakevich

sisu.log

Added SISU tracing which isn't clear to me. I can see a normal provider mapping of HttpTransportProtocolHandler (mapped by the class itself) on the start (InjectorImpl@29f86630). With the singleton scope an so on. But at the end (shutdown) I can see two entries for HttpTransportProtocolHandler - provider (expected) and some implicit constructor contribution (and that is strange, I can't see any constructor mappings in the code and/or during the start of the injector)

tretyakevich avatar Nov 20 '23 07:11 tretyakevich

Still waiting for the info from Tycho developers on this issue. It is hard to find what is going on without complete understanding of Tycho internals.

tretyakevich avatar Dec 13 '23 07:12 tretyakevich

sisu is not part of tycho it is maven, URLHttpTransportFactory should actually never be used unless you explicitly requested for it to it is not surprising it is not fully initialized but would be good to explain more what is meant by "not properly initialized". Also note that actually the transport factories are not important at all as they just delegate to ProxyHelper and MavenAuthenticator.

I still think it is eminent to reproduce the behavior in a unit test, you can try to reduce your case as much as possible to see whats needed for it to happen and then try to derive a test-case from that, e.g. it might depend on the structure of your update-sites as well.

The main difference between Tycho 2 and Tycho 3+ is that Tycho 2 always has authenticate on the hostname only, what make some setups "work" but is a major security issue if credentials are just send to arbitrary (maybe shared by different user) sites.

laeubi avatar Dec 13 '23 08:12 laeubi

The main difference between Tycho 2 and Tycho 3+ is that Tycho 2 always has authenticate on the hostname only, what make some setups "work" but is a major security issue if credentials are just send to arbitrary (maybe shared by different user) sites.

Yeah, this matches the results of our own investigation. It seems that in 80983d52b687ea4be1e3ea89b09f5c1d85dc3acf (Extract the RepositoryIdManager into an own component) this 'hack' was removed without additional comments as a side change that's why we were unable to find proper ticket without debug. The method 'setPasswordForLoading' was removed competelly.


About Plexus component loading - Tycho 4.0.X creates two instances of DefaultRepositoryIdManager, which holds settings of resolved repositories via knownMavenRepositoryIds collection. The first instance is created during the very start of the build cycle, and being filled by a target platform resolution process via addMapping method. That's why DefaultRepositoryIdManager becomes stateful thus vulnerable to multi-instatiation via Guice/Plexus.

The second instance is being instantiated during the initialization of BaselineServiceImpl (the Guice init sequence is P2MetadataDefaultMojo -> org.eclipse.tycho.core.osgitools.BaselineServiceImpl -> P2RepositoryManager -> DefaultRepositoryIdManager). Why it's being instantiated twice despite the fact that is being declared as singleton Component - haven't checked that deep, usually its either different class loader (probably not the case here) or injector. And this instance is not populated with Maven Repository Ids as normally all other components have reference to the original instance of the DefaultRepositoryIdManager. Except ones that performs querying of components instead of a direct injection like the HttpTransportProtocolHandler (getTransportFactory method), as it uses a registry mapping to the transportFactoryMap feature. It seems that protocol handlers are also being instantiated twice due to the same problem, and newest ones have reference to the second instance of a DefaultRepositoryIdManager

That's the place where the applciation has an access to both DefaultRepositoryIdManager instances, and as the internal implementation of the container has no means of conflict resolution/detection - we may receive the second instance of a DefaultRepositoryIdManager, and that's the problem.

We are using the following way of the reproduction:

  1. Local settings of m2 repository mirror
  2. The m2 + tycho application with a target platform file
  3. baseline (from tycho.ext) mojo is being configured for a test project
  4. mvnDebug -X verify -Dtycho.localArtifacts=ignore -DskipTests=true -Dtycho.p2.httptransport.type=JavaUrl

The first instance is being created via the P2ResolverFactoryImpl binding during the resolution of a target platform, the next one - during the verification phase before the start of the baseline mojo.

tretyakevich avatar Jan 15 '24 06:01 tretyakevich

Checked Plexus internals - and yes, another Injector is used to create second instance of the DefaultRepositoryIdManager class. The injector is being selected via the Plexus bean locator using the key Key[type=org.apache.maven.plugin.Mojo, [email protected]("org.eclipse.tycho:tycho-p2-plugin:4.0.2:p2-metadata-default")]

So it's the p2-metadata-default mojo from tycho-p2 plugin. Seems it has its own Plexus component definition - and that's the signal for a Plexus container to create separate injector for it (via the DefaultPlexusContainer#addPlexusInjector).

Probably some definition of Plexus components inside separate bundles (in the way that assumes existence of several instance hierarches, like realm/role filters). IRepositoryIdManager participates both in tycho-core (provided) and p2-maven-plugin (provider). As core bundle's P2ResolverFactory is being used by different Mojo's - it's being instantiated from Mojo's execution, with Mojo-matchable injector, with all its non-povided dependent beans being instantiated as well.

Checked old implementation (in tycho 2.x.x) - this component was instantiated directly using its constructor so no problems with its lifecycle. Seems during the migration to a new Plexus component infrastructure the excessive use of container-managed componentization caused this problem.

tretyakevich avatar Jan 16 '24 03:01 tretyakevich

@tretyakevich now we know its actually the use of p2-metadata mojo do you think you can derive an integration-test to demonstrate the issue?

laeubi avatar Jan 16 '24 05:01 laeubi

@tretyakevich now we know its actually the use of p2-metadata mojo do you think you can derive an integration-test to demonstrate the issue?

Now I'm trying to get the whole picture first and I'll be able to try the test preparation after that (probably)

But for now the quest continues. I've hacked HttpTransportProtocolHandler#getTransportFactory to get the right transport factory with a properly initialized MavenAuthenticator. This helped to reach the authentication stage during the creation of TP inside the CompareWithBaselineMojo - only to find that CompareWithBaselineMojo uses baselineTPStub.addP2Repository(toRepoURI(baselineRepo)) method variant to populate TP with baseline repo definition, which is marked as // convenience method for tests. And this is very strange, as this variant replaces the repository ID with null, and such repo definitions are being ignored by DefaultRepositoryIdManager#addMapping method completely (it has if (mavenRepositoryId == null) return; protection as a first construction in the method). So the authentication is still being ignored for the baseline repo definition. Seems that earlier it worked because of implicit host authentication binding instead of current URI matching.

Checked P2ResolverTest, PasswordProtectedP2RepositoryTest - the first one also ignores defined P2Repository during DefaultRepositoryIdManager population. The second one modified target platform file, which isn't the direct replacement of baseline repo specification, as declaring baseline repo as a target platform file contribution only to get access to it seems a bit weird. Is there any way to run baseline check using the baseline repo with basic auth without modifying CompareWithBaselineMojo?

tretyakevich avatar Jan 16 '24 11:01 tretyakevich

@tretyakevich it is entirely possible that the baseline compare only worked "by accident"... CompareWithBaselineMojo is part of Tycho extra and it has only one integration test that uses a local repo. Eclipse platform build is using it but never with password protection... I personally never used it nor found it useful so it seems its all a bit more best-effort than widely adopted.

Looking at it in more details, it uses a list of strings for the repository, while on other places we use a Repository with url+id for example org.eclipse.tycho.baseline.BaselineMojo and its baselines parameter.

So I would propose that at the very first step we deprecate org.eclipse.tycho.plugins.p2.extras.CompareWithBaselineMojo.baselines and ad a new baselineRepositories parameter that allows to specify an ID+URL and pass these properly to the other components. There is a good chance that then everything already works (or can be made to work)

laeubi avatar Jan 16 '24 11:01 laeubi

We have the same issues, which blocks us to update to Java 21.

@laeubi Would it be an option (as a temporary fix) to make the list of known Maven repository IDs static? (as you did as well in https://github.com/eclipse-tycho/tycho/blob/main/p2-maven-plugin/src/main/java/org/eclipse/tycho/p2maven/transport/MavenAuthenticator.java#L64)?

glatuske avatar Jan 23 '24 07:01 glatuske

@glatuske I'm not sure how this relates to Java 21, but making the id static won't help here as effectively no ID is passed at all what is the root cause here so even if the map is accessible how should Tycho know the ID?

Unless we have an integration-test to demonstrate the issue its also quite hard to test if a proposed fix actually works and I don't have a suitable test setup either for this so it would be more a bit of guessing.

laeubi avatar Jan 23 '24 07:01 laeubi

The Java 21 update does not work because of Tycho 3.x does not support it and without server authentication working there is currently no way to use Java 21 and Tycho together.

The map in DefaultRepositoryIdManager is correctly filled (but not for all created instances) and used to resolve the P2 repositories defined in the target platform. I have tested 4.0.5-SNAPSHOT with eager resolving (requireEagerResolve) and it works well. For me that means that the algorithm works well, it's just the dependency management which creates the issues.

We can start use eager resolving, but would require a Tycho 4.0.5 release including #3303.

glatuske avatar Jan 23 '24 08:01 glatuske

@glatuske you can use newer compilers with older Tycho versions: https://github.com/eclipse-tycho/tycho/wiki/Frequently-asked-questions#how-do-i-use-a-different-compiler-version

is there any chance to make a local build (mvn clean install -T1C -DskipTests) with making the map static to verify it fixes the issue for you as well? The You can just open a PR for it (Tycho 4.0.5 is about to released this week) that can be included in the bugfix release.

laeubi avatar Jan 23 '24 09:01 laeubi

@laeubi I know already done this for 4.0.5-SNAPSHOT, let me try this with 3.0.5 as well.

I have tried locally those combinations:

  • requireEagerResolve: false, no code change --> authentication failed
  • requireEagerResolve: true, no code change --> authentication works
  • requireEagerResolve: false, change to static map --> authentication works
  • requireEagerResolve: true, change to static map --> authentication works

I'll create a PR for master.

glatuske avatar Jan 23 '24 12:01 glatuske

I'll create a PR for master.

Yes that can then be backported to 4.x branch automatically, 3.x branch is not really important....

laeubi avatar Jan 23 '24 12:01 laeubi

It seems that problems go deeper then it seemed from the beginning.

The build seems worked well with my dirty fix of wrongly configured transport authenticator filtering, Until I've started to get random errors during target platform resolving while running integration junit tests via tycho-surefire. Based on detailed logs I've found the source of the problem - previous local m2 vesions (with timestamp) of our project artifacts were resolved instead of current ones via tycho4 target resolution process from provided p2 repository.

As it looked like some problem of a new tycho lazy resolution strategy (like existence of m2 artifacts with matching version in local m2 cache after previous build probably) and I've got no time to trace another problem to its roots - switched to the eager mode and got problems with multi-threaded build. As all Java11HttpTransportFactory'ies and URLHttpTransportFactory'ies have empty knownMavenRepositoryIds for enabled eager strategy and multy-threaded builds.

So it seems that proposed change with static collection is a sensible temporary solution until roots of the problem will be found/solved.

tretyakevich avatar Jan 29 '24 02:01 tretyakevich

@tretyakevich I always use -Dtycho.localArtifacts=ignore (e.g. in .mvn/maven.config) for my builds.

laeubi avatar Jan 29 '24 05:01 laeubi

@tretyakevich I always use -Dtycho.localArtifacts=ignore (e.g. in .mvn/maven.config) for my builds.

Yes, the same thing on my side as well, both during local builds and CI builds. But in this case it doesn't help. As I've said I had no time to dig deeper in this matter, probably it's some glitch of our CI scripts/P2/M2 repositories configuration. Yet with the eager resolution strategy I'm able to proceed further with builds without appearance of a stale artifacts.

tretyakevich avatar Jan 30 '24 03:01 tretyakevich