Bitbucket Server Provider does not support custom SSH port or HTTP context path
Hi!
I've just tried out the new bitbucketserver with the following configuration:
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
name: bitbucket
namespace: flux-system
spec:
type: bitbucketserver
address: ssh://git@mybitbucketserver:2222/ap/fluxcd-sandbox.git
secretRef:
name: bitbucket-token
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
name: bitbucket
namespace: flux-system
spec:
providerRef:
name: bitbucket
eventSources:
- kind: Kustomization
name: flux-system
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
name: helm-release-succeeded
spec:
eventMetadata:
name: app-name
namespace: app-namespace
version: 0.328.0
eventSeverity: info
eventSources:
- kind: HelmRelease
name: grafana
namespace: acp-observability
exclusionList:
- .*uninstall.*
- .*test.*
inclusionList:
- .*succeeded.*
providerRef:
name: bitbucket
summary: Summary - HelmRelease successfully reconciled
Unfortunately the provider is not functioning with either the following SSH or HTTPS urls:
- ssh://git@mybitbucketserver:2222/ap/fluxcd-sandbox.git
- https://mybitbucketserver/stash/scm/ap/fluxcd-sandbox.git
When using SSH, the provider (ghcr.io/fluxcd/notification-controller:v1.2.3) exits with a panic:
github.com/fluxcd/notification-controller/internal/notifier/bitbucketserver.go:195 +0x315
github.com/fluxcd/notification-controller/internal/notifier/bitbucketserver.go:155 +0x4e8
As far as I can see, this panic has been fixed in https://github.com/fluxcd/notification-controller/pull/722
However, due to the current implementation of parsing Git URLs, the resulting HTTPS URL would look like this: https://mybitbucketserver:2222/... which of course does not make any sense.
When using HTTPS, the controller can not validate the repository URL with the following error:
{
"level": "error",
"ts": "2024-02-15T09:57:23.947Z",
"logger": "event-server",
"msg": "failed to dispatch notification",
"eventInvolvedObject": {
"kind": "Kustomization",
"namespace": "flux-system",
"name": "flux-system",
"uid": "3d28463c-9809-4ca7-ab4e-ef7f05c6f395",
"apiVersion": "kustomize.toolkit.fluxcd.io/v1",
"resourceVersion": "369853"
},
"Alert": {
"name": "bitbucket",
"namespace": "flux-system"
},
"error": "failed to initialize notifier for provider bitbucket: failed to initialize notifier: invalid repository id \"stash/scm/ap/fluxcd-sandbox\""
}
I've looked up the respective locations in the code and it seems like there currently is no handling for custom (SSH) ports (e.g. 2222) or context paths (e.g /stash) implemented.
For SSH, the host is parsed incorrectly, for HTTPS the project and reposlug end up being wrong because of the additional path component:
Input 0: ssh://git@mybitbucket:2222/ap/fluxcd-sandbox.git
Length: 2
ProjectKey: ap
Reposlug: fluxcd-sandbox
Host: https://mybitbucket:2222
Input 1: https://mybitbucket/stash/scm/ap/fluxcd-sandbox.git
Length: 4
ID: stash/scm/ap/fluxcd-sandbox
ProjectKey: stash
Reposlug: scm
Host: https://mybitbucket
To solve this would it make sense to:
- only allow HTTPS addresses in the
addressfield of the provider, as any URL inferred from a given SSH URL can always be incorrect (e.g. when running your HTTPS server on another port than 443) - restrict the
spec.addressfield to the actual application URL (not the repo clone URL, trim any additional slashes at the end) - add new fields for the project key (not name) and repository slug
?
This would introduce a breaking change to the current behaviour but would be much clearer to understand and use in my opinion.
Without introducing breaking changes I could also imagine having an optional spec.contextPath field (or similar) to tell the bitbucket address parser to trim this before trying to determine the project and repository. In addition to this, a note in the docs would be required to inform the user that SSH URLs (with custom ports or installations with custom HTTP context path) are not supported and the user should fall back to using the HTTP Clone URL (+ optional contextPath).
What do you think?
Thank you very much and have a great day, Florian.
Hi, thanks for investigating the issue and sharing so much details. Tagging @gdasson as they are our known bitbucket server provider user and can provide some help here.
My understanding is that you had a look at the bitbucket server provider code and saw that the scm/ prefix is removed from the project ID, refer https://github.com/fluxcd/notification-controller/blob/bd242e091cc16b4b6fdc7ac804b5f40baad49bf6/internal/notifier/bitbucketserver.go#L269-L270. Since strings.TrimPrefix() only removes the prefix if it exists, I think this is okay. The core issue of selecting the wrong project ID could be in parseGitAddress(), refer https://github.com/fluxcd/notification-controller/blob/bd242e091cc16b4b6fdc7ac804b5f40baad49bf6/internal/notifier/util.go#L33. We use https://github.com/chainguard-dev/git-urls for parsing the address. It is capable of parsing git URLs with different schemes, see https://github.com/chainguard-dev/git-urls/blob/2c230de71af0263dc74e4b04a7220c74aee78a68/urls.go#L41. I believe this was done to make it easier for the user to just paste addresses in any known git address scheme when they want to post deployment updates to a git provider.
Since our Providers usually send http requests and ssh address is never used for such communications, just asking for http endpoint would be nice but we do have some providers that don't take http address directly. For example the azure event hub provider uses the azure SDK which internally resolves to some azure endpoint based on event hub name and namespace in Azure. Similarly, for google pub sub provider, the address is the GCP project ID. So depending on the provider, the meaning of address varies. Hence we can't make it http endpoint only. Unfortunately, this doesn't help with the scenario you described where the ssh port and the http application endpoint port may not match. But it works well for most of the providers where the git server is managed and can't be customized. So only a few providers have this problem of custom port and also custom context path. I believe we can try to accommodate for these providers in their implementations and try to avoid introducing new constructs if possible. But if we don't have any option, we can sure make changes to the constructs.
Coming back to the issue at hand, since the bitbucket server provider uses the parseGitAddress() function mentioned above, I think it should be clear why it parses the ssh host incorrectly, it just takes whatever is provided, but doesn't work in this particular case. I don't think we can do much for the port other than just documenting it in the bitbucket server provider docs. It may also be applicable to some other providers like gitea.
Regarding the invalid repository id error, maybe we can modify parseGitAddress() in some way or have another address parser specifically for bitbucket based on some bitbucket specific rules. The existing method of constructing the project ID is very simple, https://github.com/fluxcd/notification-controller/blob/bd242e091cc16b4b6fdc7ac804b5f40baad49bf6/internal/notifier/util.go#L44-L45, just trim / at the left of the URL path and add a .git suffix. I haven't used bitbucket server myself, but my understanding is that maybe something custom could work. I don't know for sure but maybe removal of scm/ would work only when bitbucket is deployed in a subdomain without a subpath. Or maybe we can make use of the .spec.channel to have some special meaning to convey how the address should be interpreted, similar to what some other providers do. For example the GCP pub/sub provider uses the channel field for setting the pub/sub topic ID. In case of bitbucket, maybe a channel could provide the context path and the last part of the address <repo-namespace>/<repo-name> could be extracted from the address field. These two can be combined to form the endpoint to which the requests would be sent for bitbucket.
These are just some ideas I could think of at present. Maybe you'll have better ideas of what'd work for bitbucket with the additional context above.
@fbuchmeier-abi : Thanks for reporting this. Honestly, I wasn't aware of Bitbucket Context path functionality since I never used it. I just read it here now.
Now, without introducing a breaking change, the easiest path I believe would be to modify this code here: https://github.com/fluxcd/notification-controller/blob/main/internal/notifier/bitbucketserver.go#L270
@darkowlzz , @fbuchmeier-abi Instead of doing a strings.TrimPrefix we could do a split on /scm/ as the delimiter and return all thats on the right side of this delimiter. For a HTTPs URL with context path for example https://abc.com/stash/scm/proj/repo.git , the id returned by parseGitAddress function here would be stash/scm/proj/repo.git. Now. after the scm/ we only expect project/reposlug and that's what is validated here. So that way the code would always work irrespective of the fact that there is a context path on the left of /scm/ or not
Now if the context path is used for building path for API interactions, then we would have to suffix what's before the /scm/ to the host variable there.
Thoughts?
@gdasson sounds good to me. If either of you would be interested in implementing and testing it, that would be great. I'll mark this issue as an enhancement.
@gdasson
Instead of doing a strings.TrimPrefix we could do a split on /scm/ as the delimiter and return all thats on the right side of this delimiter. That also sounds like the easiest solution for me in regards to the HTTP context path.
The problem with custom SSH ports could then be addressed by adding a note to the docs or even log a warning message when a custom port is detected. This way a user would be able to quickly detect the problem and use the HTTPS clone URL instead of SSH in this scenario.