notification-controller icon indicating copy to clipboard operation
notification-controller copied to clipboard

Bitbucket Server Provider does not support custom SSH port or HTTP context path

Open fbuchmeier-abi opened this issue 1 year ago • 4 comments

Hi!

I've just tried out the new bitbucketserver with the following configuration:

apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Provider
metadata:
  name: bitbucket
  namespace: flux-system
spec:
  type: bitbucketserver
  address: ssh://git@mybitbucketserver:2222/ap/fluxcd-sandbox.git
  secretRef:
    name: bitbucket-token
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: bitbucket
  namespace: flux-system
spec:
  providerRef:
    name: bitbucket
  eventSources:
    - kind: Kustomization
      name: flux-system
---
apiVersion: notification.toolkit.fluxcd.io/v1beta3
kind: Alert
metadata:
  name: helm-release-succeeded
spec:
  eventMetadata:
    name: app-name
    namespace: app-namespace
    version: 0.328.0
  eventSeverity: info
  eventSources:
  - kind: HelmRelease
    name: grafana
    namespace: acp-observability
  exclusionList:
  - .*uninstall.*
  - .*test.*
  inclusionList:
  - .*succeeded.*
  providerRef:
    name: bitbucket
  summary: Summary - HelmRelease successfully reconciled

Unfortunately the provider is not functioning with either the following SSH or HTTPS urls:

  • ssh://git@mybitbucketserver:2222/ap/fluxcd-sandbox.git
  • https://mybitbucketserver/stash/scm/ap/fluxcd-sandbox.git

When using SSH, the provider (ghcr.io/fluxcd/notification-controller:v1.2.3) exits with a panic:

github.com/fluxcd/notification-controller/internal/notifier/bitbucketserver.go:195 +0x315
github.com/fluxcd/notification-controller/internal/notifier/bitbucketserver.go:155 +0x4e8

As far as I can see, this panic has been fixed in https://github.com/fluxcd/notification-controller/pull/722 However, due to the current implementation of parsing Git URLs, the resulting HTTPS URL would look like this: https://mybitbucketserver:2222/... which of course does not make any sense.

When using HTTPS, the controller can not validate the repository URL with the following error:

{
  "level": "error",
  "ts": "2024-02-15T09:57:23.947Z",
  "logger": "event-server",
  "msg": "failed to dispatch notification",
  "eventInvolvedObject": {
    "kind": "Kustomization",
    "namespace": "flux-system",
    "name": "flux-system",
    "uid": "3d28463c-9809-4ca7-ab4e-ef7f05c6f395",
    "apiVersion": "kustomize.toolkit.fluxcd.io/v1",
    "resourceVersion": "369853"
  },
  "Alert": {
    "name": "bitbucket",
    "namespace": "flux-system"
  },
  "error": "failed to initialize notifier for provider bitbucket: failed to initialize notifier: invalid repository id \"stash/scm/ap/fluxcd-sandbox\""
}

I've looked up the respective locations in the code and it seems like there currently is no handling for custom (SSH) ports (e.g. 2222) or context paths (e.g /stash) implemented.

For SSH, the host is parsed incorrectly, for HTTPS the project and reposlug end up being wrong because of the additional path component:

Input 0: ssh://git@mybitbucket:2222/ap/fluxcd-sandbox.git
Length:  2
ProjectKey: ap
Reposlug: fluxcd-sandbox
Host: https://mybitbucket:2222

Input 1: https://mybitbucket/stash/scm/ap/fluxcd-sandbox.git
Length:  4
ID: stash/scm/ap/fluxcd-sandbox
ProjectKey: stash
Reposlug: scm
Host: https://mybitbucket

To solve this would it make sense to:

  • only allow HTTPS addresses in the address field of the provider, as any URL inferred from a given SSH URL can always be incorrect (e.g. when running your HTTPS server on another port than 443)
  • restrict the spec.address field to the actual application URL (not the repo clone URL, trim any additional slashes at the end)
  • add new fields for the project key (not name) and repository slug

?

This would introduce a breaking change to the current behaviour but would be much clearer to understand and use in my opinion.

Without introducing breaking changes I could also imagine having an optional spec.contextPath field (or similar) to tell the bitbucket address parser to trim this before trying to determine the project and repository. In addition to this, a note in the docs would be required to inform the user that SSH URLs (with custom ports or installations with custom HTTP context path) are not supported and the user should fall back to using the HTTP Clone URL (+ optional contextPath).

What do you think?

Thank you very much and have a great day, Florian.

fbuchmeier-abi avatar Feb 15 '24 12:02 fbuchmeier-abi

Hi, thanks for investigating the issue and sharing so much details. Tagging @gdasson as they are our known bitbucket server provider user and can provide some help here.

My understanding is that you had a look at the bitbucket server provider code and saw that the scm/ prefix is removed from the project ID, refer https://github.com/fluxcd/notification-controller/blob/bd242e091cc16b4b6fdc7ac804b5f40baad49bf6/internal/notifier/bitbucketserver.go#L269-L270. Since strings.TrimPrefix() only removes the prefix if it exists, I think this is okay. The core issue of selecting the wrong project ID could be in parseGitAddress(), refer https://github.com/fluxcd/notification-controller/blob/bd242e091cc16b4b6fdc7ac804b5f40baad49bf6/internal/notifier/util.go#L33. We use https://github.com/chainguard-dev/git-urls for parsing the address. It is capable of parsing git URLs with different schemes, see https://github.com/chainguard-dev/git-urls/blob/2c230de71af0263dc74e4b04a7220c74aee78a68/urls.go#L41. I believe this was done to make it easier for the user to just paste addresses in any known git address scheme when they want to post deployment updates to a git provider.

Since our Providers usually send http requests and ssh address is never used for such communications, just asking for http endpoint would be nice but we do have some providers that don't take http address directly. For example the azure event hub provider uses the azure SDK which internally resolves to some azure endpoint based on event hub name and namespace in Azure. Similarly, for google pub sub provider, the address is the GCP project ID. So depending on the provider, the meaning of address varies. Hence we can't make it http endpoint only. Unfortunately, this doesn't help with the scenario you described where the ssh port and the http application endpoint port may not match. But it works well for most of the providers where the git server is managed and can't be customized. So only a few providers have this problem of custom port and also custom context path. I believe we can try to accommodate for these providers in their implementations and try to avoid introducing new constructs if possible. But if we don't have any option, we can sure make changes to the constructs.

Coming back to the issue at hand, since the bitbucket server provider uses the parseGitAddress() function mentioned above, I think it should be clear why it parses the ssh host incorrectly, it just takes whatever is provided, but doesn't work in this particular case. I don't think we can do much for the port other than just documenting it in the bitbucket server provider docs. It may also be applicable to some other providers like gitea. Regarding the invalid repository id error, maybe we can modify parseGitAddress() in some way or have another address parser specifically for bitbucket based on some bitbucket specific rules. The existing method of constructing the project ID is very simple, https://github.com/fluxcd/notification-controller/blob/bd242e091cc16b4b6fdc7ac804b5f40baad49bf6/internal/notifier/util.go#L44-L45, just trim / at the left of the URL path and add a .git suffix. I haven't used bitbucket server myself, but my understanding is that maybe something custom could work. I don't know for sure but maybe removal of scm/ would work only when bitbucket is deployed in a subdomain without a subpath. Or maybe we can make use of the .spec.channel to have some special meaning to convey how the address should be interpreted, similar to what some other providers do. For example the GCP pub/sub provider uses the channel field for setting the pub/sub topic ID. In case of bitbucket, maybe a channel could provide the context path and the last part of the address <repo-namespace>/<repo-name> could be extracted from the address field. These two can be combined to form the endpoint to which the requests would be sent for bitbucket.

These are just some ideas I could think of at present. Maybe you'll have better ideas of what'd work for bitbucket with the additional context above.

darkowlzz avatar Feb 19 '24 20:02 darkowlzz

@fbuchmeier-abi : Thanks for reporting this. Honestly, I wasn't aware of Bitbucket Context path functionality since I never used it. I just read it here now.

Now, without introducing a breaking change, the easiest path I believe would be to modify this code here: https://github.com/fluxcd/notification-controller/blob/main/internal/notifier/bitbucketserver.go#L270

@darkowlzz , @fbuchmeier-abi Instead of doing a strings.TrimPrefix we could do a split on /scm/ as the delimiter and return all thats on the right side of this delimiter. For a HTTPs URL with context path for example https://abc.com/stash/scm/proj/repo.git , the id returned by parseGitAddress function here would be stash/scm/proj/repo.git. Now. after the scm/ we only expect project/reposlug and that's what is validated here. So that way the code would always work irrespective of the fact that there is a context path on the left of /scm/ or not

Now if the context path is used for building path for API interactions, then we would have to suffix what's before the /scm/ to the host variable there.

Thoughts?

gdasson avatar Feb 22 '24 18:02 gdasson

@gdasson sounds good to me. If either of you would be interested in implementing and testing it, that would be great. I'll mark this issue as an enhancement.

darkowlzz avatar Feb 23 '24 11:02 darkowlzz

@gdasson

Instead of doing a strings.TrimPrefix we could do a split on /scm/ as the delimiter and return all thats on the right side of this delimiter. That also sounds like the easiest solution for me in regards to the HTTP context path.

The problem with custom SSH ports could then be addressed by adding a note to the docs or even log a warning message when a custom port is detected. This way a user would be able to quickly detect the problem and use the HTTPS clone URL instead of SSH in this scenario.

fbuchmeier-abi avatar Feb 23 '24 12:02 fbuchmeier-abi