[Bug][Lake] CodeReview (pull requests) data pulled by lake app does not honour the sync policy time range in config-ui

Open sayeedhussain opened this issue 1 year ago • 6 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

CodeReview (pull requests) data pulled by the lake app does not honour the sync policy time range set in config-ui. Additional data is pulled from GitHub. Refer to the screenshots.

[Screenshot 2024-03-01 at 11 33 39 AM] [Screenshot 2024-03-01 at 11 33 29 AM]

What do you expect to happen

The sync policy time range in config-ui should be honoured by the lake app while pulling data.

How to reproduce

  1. Create a project with a sync policy time range of 3 months.
  2. Create a GitHub data source connection with a scope config for CodeReview. Ensure the repository has pull requests that are more than 3 months old.
  3. Collect data.
  4. View the pr.created dates for the project in MySQL.
  5. PRs created more than 3 months ago are also present.

Anything else

No response

Version

0.21.0-beta5

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

sayeedhussain avatar Mar 01 '24 06:03 sayeedhussain

That's because DevLake fetches GitHub pull requests via the GraphQL API, and this API doesn't support a createdAt filter, so DevLake collects all pull requests.
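
To make the limitation concrete, here is a minimal sketch of a request against the `pullRequests` connection. It is an illustration only, not DevLake's actual query, and `PULL_REQUESTS_QUERY` and `build_variables` are hypothetical names. The connection accepts paging, state, and ordering arguments, but as far as the public schema documents, nothing that restricts results by creation date.

```python
# Sketch of a GraphQL request for a repository's pull requests.
# Assumption: this is illustrative and is not the query DevLake sends.
# The argument list has paging and ordering controls, but no argument
# that limits results to pull requests created after a given date.
PULL_REQUESTS_QUERY = """
query($owner: String!, $name: String!, $pageSize: Int!, $cursor: String) {
  repository(owner: $owner, name: $name) {
    pullRequests(first: $pageSize, after: $cursor,
                 orderBy: {field: CREATED_AT, direction: DESC}) {
      pageInfo { hasNextPage endCursor }
      nodes { number createdAt }
    }
  }
}
"""


def build_variables(owner, name, cursor=None, page_size=100):
    """Return the variables for one page; `cursor` is None for the first page."""
    return {"owner": owner, "name": name, "pageSize": page_size, "cursor": cursor}
```

With only a cursor to move through the list, a collector that wants a time range has to page through results and apply the range itself.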

Maybe we can collect pull requests via the search API, as described here: https://github.com/orgs/community/discussions/24611 . We can vote on this matter.
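
As a rough sketch of what the search-based approach could look like, the helper below builds a search string using GitHub's `created:` qualifier. The function name `build_pr_search_query` is hypothetical; the resulting string would be passed as the query to GitHub's issue and pull request search.

```python
from datetime import date


def build_pr_search_query(owner, repo, created_since):
    """Build a GitHub search string for PRs created on or after `created_since`.

    Assumption: `created_since` is a datetime.date marking the start of the
    sync policy time range.
    """
    return f"repo:{owner}/{repo} is:pr created:>={created_since.isoformat()}"


# Example: pull requests created on or after 2023-12-01
query = build_pr_search_query("apache", "incubator-devlake", date(2023, 12, 1))
```

One design consideration: GitHub search returns at most 1,000 results per query, so a large repository would need the time range split into smaller windows.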

d4x1 avatar Mar 05 '24 14:03 d4x1

Thanks for the analysis, @d4x1. For now, we are able to work around this issue by fixing our MySQL queries.

But in general, I think it would be good to fix this so that there are no special conditions about honouring the sync policy time range.

sayeedhussain avatar Mar 08 '24 12:03 sayeedhussain

@sayeedhussain IMO, using the search API is not the right way. We could ask GitHub to update its GraphQL API, but that would take too long. Maybe we can filter out records that don't satisfy the time range. @abeizn Will it lead to other problems?
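
A minimal sketch of that filtering idea, assuming each collected record is a dict with a `createdAt` timestamp in GitHub's ISO 8601 UTC format. The names `parse_github_time` and `within_time_range` are hypothetical and do not come from the DevLake codebase.

```python
from datetime import datetime, timezone


def parse_github_time(value):
    """Parse a GitHub timestamp such as "2024-03-01T06:03:00Z" as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def within_time_range(pull_requests, time_after):
    """Keep only the records created at or after `time_after`."""
    return [
        pr for pr in pull_requests
        if parse_github_time(pr["createdAt"]) >= time_after
    ]


# Example: a cutoff of 2023-12-01 drops the older record
cutoff = datetime(2023, 12, 1, tzinfo=timezone.utc)
collected = [
    {"number": 1, "createdAt": "2023-06-15T10:00:00Z"},
    {"number": 2, "createdAt": "2024-01-20T08:30:00Z"},
]
kept = within_time_range(collected, cutoff)
```

Filtering this way still downloads every pull request; it only keeps out-of-range records from being stored.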

d4x1 avatar Mar 12 '24 09:03 d4x1

@d4x1 Sure. My suggestion was only that it would be good to fix the issue. How to fix it is best decided by you and the team. Thanks!

sayeedhussain avatar Mar 12 '24 12:03 sayeedhussain

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Jun 17 '24 00:06 github-actions[bot]

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Aug 17 '24 00:08 github-actions[bot]

This has been fixed in https://github.com/apache/incubator-devlake/pull/7878/files#diff-cebf96cf757874a695c6c6a61c75035530e6f6e33e61cbe594138736bbf45fb6R214

klesh avatar Sep 09 '24 08:09 klesh