[Bug][github] Branches do not get removed
Search before asking
- [x] I had searched in the issues and found no similar issues.
What happened
When a branch is deleted in GitHub, the lake.refs table is not updated: the branch is neither marked as deleted nor removed from the table.
This causes problems when trying to detect stale branches, because the results include branches that have already been deleted or tidied up in GitHub.
What do you expect to happen
Either:
- Mark the branch as deleted in the `lake.refs` table
- Remove the branch from the `lake.refs` table
How to reproduce
- Create a repository
- Push an additional branch to the repository
- Scrape the repository
- Check the `lake.refs` table for the repo + branches
- Delete the branch in the GitHub UI
- Scrape the repository
- Check the `lake.refs` table for the branch - it will still be there.
Anything else
No response
Version
devlake-1.0.2-beta1
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
Hey @cypher7682! 👋 I'm here to help you with any bugs, questions, or contributions you have for the project. I'll assist you while we wait for a human maintainer.
The issue you're encountering is related to the lake.refs table not being updated to reflect branch deletions from GitHub. Currently, there is no field in the lake.refs table specifically designed to mark a branch as deleted [1]. This means that when a branch is deleted in GitHub, it is not automatically marked or removed in the lake.refs table during a repository scrape.
To address this, you might consider implementing a mechanism to either mark branches as deleted or remove them from the table when they are deleted in GitHub. This would help in accurately detecting stale branches by excluding those that have been deleted.
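As a rough illustration of the reconciliation step described above, the sketch below compares the branch rows already stored against the branch names the remote currently reports, and returns the rows that no longer exist upstream. Everything in it is an assumption made for illustration: the `StoredRef` class, its field names, the `"BRANCH"`/`"TAG"` values, and the `find_deleted_branches` helper are hypothetical and are not DevLake's actual schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredRef:
    """One row of a refs-like table (hypothetical fields, not DevLake's schema)."""
    repo_id: str
    name: str
    ref_type: str  # assumed values: "BRANCH" or "TAG"


def find_deleted_branches(stored, remote_branch_names):
    """Return stored branch rows whose names are no longer present on the remote.

    Tags are ignored, because the remote list only contains branch names.
    """
    remote = set(remote_branch_names)
    return [
        ref for ref in stored
        if ref.ref_type == "BRANCH" and ref.name not in remote
    ]


# Rows as they might look after the first scrape (made-up example data).
stored_refs = [
    StoredRef("example-repo", "main", "BRANCH"),
    StoredRef("example-repo", "feature/old", "BRANCH"),
    StoredRef("example-repo", "v1.0.0", "TAG"),
]

# Branch names reported by the remote after "feature/old" was deleted.
remote_now = ["main"]

stale_rows = find_deleted_branches(stored_refs, remote_now)
print([ref.name for ref in stale_rows])  # prints ['feature/old']
```

The returned rows could then either be deleted or flagged, which corresponds to the two options requested in the issue. Flagging would require a new column, since no such field exists in the table today.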
This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in the next 7 days if no further activity occurs.
Branch information is currently useless, since there is no certainty at all about whether the branches still exist.
Why do you need to detect stale branches using DevLake?
This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in the next 7 days if no further activity occurs.
This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter a similar problem in the future.