Propagate Confluence permissions to the vector database

Open DeltaMichael opened this issue 2 years ago • 2 comments

Overview

The primary objective is to ensure that permissions and restrictions set on Confluence pages are accurately reflected within an associated vector database. This aims to maintain consistent access controls across platforms, ensuring that users can only interact with data in the vector database according to their permissions in Confluence.

We should leverage the Atlassian Python API to programmatically fetch permissions and restrictions data for specific Confluence pages. This information should then be used to set or update access controls on the relevant entries in the vector database. The solution should include:

  • A scheduled task or trigger that periodically checks for changes in Confluence page permissions.
  • A mapping mechanism to relate Confluence pages to corresponding entities in the vector database.
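The two pieces above could fit together roughly as follows. This is a minimal sketch, not a proposed implementation: the snapshot format (page ID to the set of principals allowed to read), the `page_to_entries` mapping, and the `update_entry` callback are all hypothetical stand-ins for whatever the vector database and the scheduler actually provide.

```python
from typing import Callable, Dict, FrozenSet, List

# Hypothetical mapping: Confluence page ID -> IDs of the vector database
# entries (for example, embedded chunks) that were built from that page.
PageToEntries = Dict[str, List[str]]
# Hypothetical permission snapshot: page ID -> principals allowed to read it.
Permissions = Dict[str, FrozenSet[str]]


def find_changed_pages(previous: Permissions, current: Permissions) -> List[str]:
    """Return IDs of pages whose read permissions differ between two snapshots."""
    page_ids = set(previous) | set(current)
    return sorted(p for p in page_ids if previous.get(p) != current.get(p))


def sync_permissions(
    previous: Permissions,
    current: Permissions,
    page_to_entries: PageToEntries,
    update_entry: Callable[[str, FrozenSet[str]], None],
) -> List[str]:
    """Push changed permissions to every vector database entry mapped to a page.

    Pages missing from the current snapshot get an empty set, so their
    entries become unreadable (deny by default).
    """
    updated = []
    for page_id in find_changed_pages(previous, current):
        allowed = current.get(page_id, frozenset())
        for entry_id in page_to_entries.get(page_id, []):
            update_entry(entry_id, allowed)
            updated.append(entry_id)
    return updated
```

A scheduled job would call `sync_permissions` with the last stored snapshot and a freshly fetched one, then persist the new snapshot for the next run.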

Additionally, consider the integration capabilities with LangChain, as it may offer an alternative approach to accessing the Atlassian Python API and handling the permissions data. LangChain's integration with the Atlassian Python API could potentially enhance the process of fetching and applying permission settings, but this requires further investigation to understand whether it is available.

Here is an example of how to set up the Atlassian API client and fetch the restrictions for a page:

from atlassian import Confluence

confluence_url = 'your-confluence-url'
username = 'your-username'
api_token = 'your-api-token'

confluence = Confluence(
    url=confluence_url,
    username=username,
    password=api_token)

content_id = 'CONTENT_ID'  # replace with the actual ID of the Confluence page


# fetch the restrictions for a specific page
restrictions = confluence.get_all_restrictions_for_content(content_id=content_id)
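If the call does return data, the result still has to be flattened into something the vector database can store. The sketch below assumes the response follows the Confluence REST "restrictions by operation" shape (`read` / `update` keys, each with `restrictions.user.results` and `restrictions.group.results`). That shape, and the `accountId` / `username` / `name` fields, are assumptions that should be verified against a real response.

```python
from typing import Any, Dict, Set


def extract_principals(restrictions: Dict[str, Any], operation: str = "read") -> Dict[str, Set[str]]:
    """Collect the users and groups named in the restrictions for one operation.

    Assumes (unverified) a response shaped like:
    {"read": {"restrictions": {"user": {"results": [...]},
                               "group": {"results": [...]}}}}
    """
    by_operation = restrictions.get(operation, {}).get("restrictions", {})
    users = {
        user.get("accountId") or user.get("username")
        for user in by_operation.get("user", {}).get("results", [])
    }
    groups = {
        group.get("name")
        for group in by_operation.get("group", {}).get("results", [])
    }
    # Drop entries that carried none of the expected identifier fields.
    users.discard(None)
    groups.discard(None)
    return {"users": users, "groups": groups}
```

Empty sets here would most likely mean that the page itself carries no restriction for that operation, not that nobody can read it, so they should not be treated as "deny all" without checking space-level permissions.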

Acceptance criteria

  1. Only pages to which the user has access are used in the chatbot's response.
  2. Documentation is added on how permissions are handled by VDK.
  3. Additional stories are created for permission control.
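The first criterion could be enforced at retrieval time by filtering candidate chunks before they reach the chatbot. The sketch below is illustrative only: the `Chunk` type, its `allowed_principals` field, and the idea of storing principals alongside each chunk are assumptions, not something VDK or the vector database provides today.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Chunk:
    """Hypothetical retrieved chunk carrying the principals allowed to read it."""
    text: str
    page_id: str
    allowed_principals: FrozenSet[str] = field(default_factory=frozenset)


def filter_accessible(chunks: Iterable[Chunk], user_principals: Iterable[str]) -> List[Chunk]:
    """Keep only chunks that share at least one principal with the user.

    user_principals is the user's own identifier plus the groups they belong to.
    Chunks with no allowed principals are dropped (deny by default).
    """
    principals = set(user_principals)
    return [chunk for chunk in chunks if chunk.allowed_principals & principals]
```

Dropping chunks with an empty principal set is a deliberately conservative choice: a page whose permissions could not be fetched never leaks into a response.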

DeltaMichael avatar Feb 05 '24 13:02 DeltaMichael

The proposed API is not working, so a different approach should be used.

duyguHsnHsn avatar Feb 15 '24 15:02 duyguHsnHsn

There is no direct way to get permissions on content or spaces through the REST API, and there is a ticket linked to this issue: https://jira.atlassian.com/browse/CONFSERVER-78176. The full REST API docs are here: https://developer.atlassian.com/cloud/confluence/rest/v1/intro/#about

One thing I found, but couldn't get running, is the content permissions API: https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-content-permissions/#api-group-content-permissions

Even if it worked, does it make sense to use it? We would need to call it for each user or group and for every piece of content, which would add up to far too many requests.
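To put a rough number on that concern, the request count grows as the product of subjects and content items. The helper below is a back-of-the-envelope sketch; the figures in the usage note are made up for illustration and do not describe any real Confluence instance.

```python
from typing import Sequence


def permission_check_requests(
    num_subjects: int,
    num_pages: int,
    operations: Sequence[str] = ("read",),
) -> int:
    """Estimate how many calls are needed when each call checks one subject,
    one page, and one operation."""
    return num_subjects * num_pages * len(operations)
```

For a hypothetical instance with 500 users and groups and 10,000 pages, `permission_check_requests(500, 10_000)` gives 5,000,000 requests for a single full pass over read permissions alone.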

There is no direct way to get the permissions using CQL either: it provides no commands for retrieving permissions, and the response we get when fetching more information about content does not include them: https://developer.atlassian.com/cloud/confluence/advanced-searching-using-cql/

Apparently, you need to write your own Java plugin to extract the permissions, maybe using this: https://developer.atlassian.com/server/confluence/rest-module/

There is an idea of using webhooks (https://developer.atlassian.com/cloud/confluence/modules/webhook/), but I do not see them as reliable for the initial extraction (how would we even fetch all the permissions in the first place?). They could be used for tracking updates later on but, as I said, not for the initial fetch. There are also issues linked to the webhooks that we should consider further:

"Important: It must be noted that webhook delivery is not guaranteed; it is best effort. When a webhook event is triggered in Jira or Confluence instance then a single HTTP POST is sent to your add-on. If your add-on is down, or there is any network problems between the Atlassian product and your add-on, then you will never receive the webhook event. In general, webhooks are quite reliable; however you must always keep in mind that delivery is not guaranteed."
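Because delivery is best effort, one common way to use webhooks safely is to treat them only as hints and to fall back on a periodic full re-check that catches any missed events. The sketch below shows that selection logic only; the function name, the 24-hour default, and the way dirty pages are collected are assumptions, and the initial fetch problem described above remains unsolved.

```python
from datetime import datetime, timedelta
from typing import Iterable, Set


def pages_to_refresh(
    dirty_pages: Iterable[str],
    all_pages: Iterable[str],
    last_full_sync: datetime,
    now: datetime,
    full_sync_interval: timedelta = timedelta(hours=24),
) -> Set[str]:
    """Choose which pages need their permissions re-fetched on this run.

    dirty_pages are page IDs reported by webhook events since the last run.
    Once full_sync_interval has elapsed, every page is re-checked so that
    events lost in delivery are eventually corrected.
    """
    if now - last_full_sync >= full_sync_interval:
        return set(all_pages)
    return set(dirty_pages)
```

Between full passes only the pages flagged by webhooks are refreshed, which keeps request volume low while bounding how long a missed event can leave stale permissions in the vector database.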

duyguHsnHsn avatar Feb 20 '24 09:02 duyguHsnHsn