lastUpdateDate doesn't match last commit date for repo/branch

Open anupbaranwal opened this issue 7 years ago • 3 comments

Summary

lastUpdateDate gives the date of the last update on any branch of the repo. It should either give the last commit date on that specific branch, or there should be a new field, lastCommitDate, for every branch.

Type of Issue

It is a :

  • [x] bug
  • [ ] request
  • [ ] question regarding the documentation

Motivation

The job should complete and return a field holding the last commit date on each branch.

Current Behavior

Currently, all branches of a repo have the same value for lastUpdateDate.

Expected Behavior

lastUpdateDate should be specific to each branch of a repo, or there should be a new field representing the last commit date on the branch.

Steps to Reproduce (for bugs)

Run the crawler with the crawlAllBranches config set to true: every branch of a repo shows the same lastUpdateDate.

Your Environment

  • Version used: 1.0.6
  • OS and version: Linux
  • Version of libs used:

anupbaranwal avatar Oct 03 '18 06:10 anupbaranwal

Thanks for raising this!

To accommodate this change, and further similar requests, I feel it's time to refactor the configuration.

What about a config like the one below?

miscRepoOperations:
  # existing CountSearchResultParser
  - name: nbOfMetricsInPomXml
    method: countSearchHits
    params:
      queryString: "q=metrics+extension:xml"
  # existing OwnershipParserImpl
  - name: owningTeam
    method: computeRepoOwnership
  # NEW repo operation fetching the last commit timestamp
  - name: lastCommitTimestamp
    method: fetchLastCommitTimestamp

We could consider the search and ownership computation as miscellaneous operations we perform on the repo. By defining the right RepositoryOperation interface, we could refactor and "retrofit" CountSearchResultParser and OwnershipParserImpl into it. We could also implement a class that will fetchLastCommitTimestamp, to fix the issue you're raising.
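To make the idea concrete, here is a rough sketch of what such an interface and the new operation might look like. Everything in it is an assumption for illustration, not existing github-crawler code: the shape of RepositoryOperation, the FetchLastCommitTimestamp and MiscRepoOperations classes, and the injected lookup function that stands in for whatever client would actually query the last commit on a branch.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical contract: one named operation, evaluated for a given repo and branch.
interface RepositoryOperation {
    // Name under which the result is reported, e.g. "lastCommitTimestamp".
    String name();

    // Computes the value of this operation for one branch of one repo.
    Object perform(String repoFullName, String branch);
}

// Hypothetical operation behind "method: fetchLastCommitTimestamp".
// The lookup is injected so the sketch does not depend on any particular HTTP client.
final class FetchLastCommitTimestamp implements RepositoryOperation {
    private final String name;
    private final BiFunction<String, String, Instant> lastCommitLookup;

    FetchLastCommitTimestamp(String name, BiFunction<String, String, Instant> lastCommitLookup) {
        this.name = name;
        this.lastCommitLookup = lastCommitLookup;
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public Object perform(String repoFullName, String branch) {
        // Delegates to the injected lookup, so each branch gets its own timestamp.
        return lastCommitLookup.apply(repoFullName, branch);
    }
}

// Hypothetical runner for the operations listed under miscRepoOperations.
final class MiscRepoOperations {
    private MiscRepoOperations() {
    }

    // Runs every configured operation and collects the results keyed by operation name.
    static Map<String, Object> runAll(List<RepositoryOperation> operations,
                                      String repoFullName, String branch) {
        Map<String, Object> results = new LinkedHashMap<>();
        for (RepositoryOperation operation : operations) {
            results.put(operation.name(), operation.perform(repoFullName, branch));
        }
        return results;
    }
}
```

Because the operation is evaluated per branch rather than per repo, two branches of the same repo can report different values, which is what the issue asks for.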

@anupbaranwal: do you want to give it a try?

vincent-fuchs avatar Oct 03 '18 16:10 vincent-fuchs

Sure, looks good to me. I'll give it a try. Thank you for the suggestion.

Thanks, Anup

anupbaranwal avatar Oct 03 '18 17:10 anupbaranwal

A lot of changes have been made in the past couple of days. Have a look at misc-repository-tasks; you'll need to implement a task following the same pattern.

vincent-fuchs avatar Nov 20 '18 16:11 vincent-fuchs