coding icon indicating copy to clipboard operation
coding copied to clipboard

Automatic Changelogs

Open ebuchman opened this issue 7 years ago • 9 comments

Changelogs

Requirements

  • All changes to the code come through a PR (eg. using GitFlow)
  • All PRs generate or modify at least one entry in the changelog
  • All entries in the changelog link to all issues addressed by the PRs that generated/modified them
  • Compiling changelog should be easy and automatic and intermittent to avoid merge conflict

Problem

The obvious but naiive solution is to update the CHANGELOG.md with every PR. This has problems:

  • constant merge conflicts
  • changelog entries get added under the wrong version

Ideal

  • Every commit is a single full changelog entry
    • eg. "break [x/auth] NewHandler takes a WetKeeper"

But this is probably too extreme of a discipline.

Proposal 1 - Use Commit History

  • approximate the ideal
  • commits that include #changelog:<type>:<issue> will be included in the changelog
  • <type> is one of break, feat, improve, fix
  • <issue> is the issue number, or hyphen separated list of them
  • every PR includes at least one commit with #changelog:<type>:<issue>
  • use a tool to get all commits from all PRs merged since last release and output the full changelog in the form of CHANGELOG.md

Pros

  • Approximates the ideal - can work towards it by becoming better committers

Cons

  • Complex to support multiple changelog entries in a single commit
  • Requires re-writing commit history to fix changelog data

Proposal 2 - Use a Directory of Unique Files

  • keep a .changelog directory
  • every PR must generate or modify a file in .changelog
  • files in .changelog have the same structures as changelog.md
  • files in .changelog should be named <author>/<issue> where <author> is the author of the PR and <issue> is the issue number being addressed.
    • If there are multiple issues, list them separated by hyphen. Eg. bucky/1546-1332-1102
  • use a tool to merge all files in .changelog and then delete them all

Pros

  • easy to make fixes and updates for the same entry by editing a file

Cons

  • need to maintain new directory and have extra tmp files in the repo

Tool

  • Call it clg for change-log-generator
  • clg <version> will output a changelog formatted like CHANGELOG.md

Enforcement

Both proposals can be enforced through rules in testing

Notes

  • Use git log 06216c1f^..06216c1f --oneline to print all commits involved in the merge commit `06216c1f
  • Use git log --oneline --merges --first-parent <version>..HEAD to get the list of all merge commits to the given branch since

Summary

I'm leaning towards using files. While commit history is the ideal, its hard to realize the ideal and its much more difficult to fix/update the changelog data since they're contained in commits. Using files makes it easiest to find the data and to continuously update it in an auditable way if need be

ebuchman avatar Jul 15 '18 02:07 ebuchman

I like the commit history option, though I have the same concern regarding fixing / updating changelog data. (Though I may not be the right person to opine on this, since I write long commit messages)

RE: File approach, is it possible for a github bot to read the PR message, and then make an additional commit when merging the PR? Then we could have the github bot do the file thing you described, and then have the tool aggregate it all together. (Making new files like that seems like more of a pain to me then dealing with the merge conflicts we have right now)

ValarDragon avatar Jul 15 '18 03:07 ValarDragon

Making new files like that seems like more of a pain to me then dealing with the merge conflicts we have right now

Can you explain why? It's effectively the same amount of work as opening CHANGELOG.md and making edits

ebuchman avatar Jul 15 '18 11:07 ebuchman

Good point, I misread the file proposal and thought it said PR number (not issue number). I didn't like the idea of having to figure out the PR number / making everything at least two commits, but since its issue number its nbd. However we have lots of PR's without assoc. issue numbers, not sure how those would update the changelog.

ValarDragon avatar Jul 15 '18 14:07 ValarDragon

With proposal 1, how does this integrate with squashing, also this will require devs to make sure commits are very precise in their nature (i.e. fix bug...and also cleanup a lot of slightly unrelated godocs). This can be done with interactive committing, but maybe I'm over thinking that.

I do like proposal 2. In regards to use a tool to merge all files in .changelog and then delete them all - - when will this step be done? When the PR is merged?

alexanderbez avatar Jul 15 '18 21:07 alexanderbez

I don't really think the merge conflicts are that annoying of a problem. I do agree that changelog entries going into the wrong section is a big problem.

With that in mind, I think the following may be easier. We have a single "pending_changelog" file. All commits add to that. We move pending changelog to changelog when cutting a new release. Now old PR's still just update the pending changelog. (Perhaps having to fix a merge conflict, tho perhaps not since its just lines that've been removed)

ValarDragon avatar Jul 16 '18 03:07 ValarDragon

I'm more leaning towards Proposal 1 - it feels sloppy to have a bunch of loose proposal files which need to get deleted by some CI on merge (or manually deleted at some point! yuck)

Some comments I'd like to make:

  • the #changelog entry in the commit history can be a part of the commit description. Doesn't seem like to much of a hassle to support multiple changelog entries this way, each item can just be on a new line
  • I think we can easily do this with no commit history re-write, as long as we use special indicators for each release (both in CHANGELOG.md and the release commit) we can have the tool only create new updates to the existing CHANGELOG.md this way the old changelog could be preserved and new information for a current release could just be tacked on-top

As per these points, I think the cons of proposal-1 are alleviated.

rigelrozanski avatar Jul 16 '18 14:07 rigelrozanski

Interesting. Seems other repos do this too. Their squashed commit contains a very verbose description with issue #'s, changes, etc...

alexanderbez avatar Jul 16 '18 14:07 alexanderbez

However we have lots of PR's without assoc. issue numbers, not sure how those would update the changelog.

We should never have PRs without associated issue numbers. It's basically a requirement that you open an issue describing the change you want to make before opening a PR.

it feels sloppy to have a bunch of loose proposal files which need to get deleted by some CI on merge (or manually deleted at some point! yuck)

One way or another there will be manual work to review changelog and fix things since the automation will never be perfect here. It seems more manageable to have a tool that reads a bunch of files and then deletes them than one that reads commits because the commits are immutable so we can't update entries as we go, unlike with files.

I think we can easily do this with no commit history re-write, as long as we use special indicators for each release (both in CHANGELOG.md and the release commit) we can have the tool only create new updates to the existing CHANGELOG.md this way the old changelog could be preserved and new information for a current release could just be tacked on-top

Not sure what you're saying here. The problem is that commits are immutable, so if in the course of the commits, we need to modify the same changelog entry multiple times (eg. we change the way a feature is implemented multiple times before we release, or we add a feature and then remove it). If all the entries are in commits, we either need new syntax to say "ignore old entry and use this one" which sounds very complex or we need to wait until its time to generate the changelog and then fix the duplication issues.

Seems much cleaner and simpler to me to just have a bunch of files that can be updated as we go.

With proposal 1, how does this integrate with squashing, also this will require devs to make sure commits are very precise in their nature (i.e. fix bug...and also cleanup a lot of slightly unrelated godocs). This can be done with interactive committing, but maybe I'm over thinking that.

Not exactly sure, seems we have to be careful with the squashing or very precise in the commits. I think using the commits is a bit too fragile right now.

In regards to use a tool to merge all files in .changelog and then delete them all - - when will this step be done? When the PR is merged?

Whenever a maintainer feels like it. If there's lots of changes, it can be intermittent. Otherwise it can all happen right before release. This is another reason why files are better than commits here because you can consolidate into a single changelog as you go more easily.

ebuchman avatar Jul 16 '18 15:07 ebuchman

We should never have PRs without associated issue numbers. It's basically a requirement that you open an issue describing the change you want to make before opening a PR.

This is definitely not whats happening right now. There are many PR's without assoc. issues. https://github.com/cosmos/cosmos-sdk/pull/1688 https://github.com/cosmos/cosmos-sdk/pull/1684 https://github.com/cosmos/cosmos-sdk/pull/1669 https://github.com/cosmos/cosmos-sdk/pull/1668 https://github.com/cosmos/cosmos-sdk/pull/1627 https://github.com/tendermint/tendermint/pull/1979

I don't think most of the above needed a new issue either. (That would increase development time for small PR's, which I think should be fast)

Also any thoughts on my proposal? (https://github.com/tendermint/tendermint/pull/1979)

ValarDragon avatar Jul 16 '18 16:07 ValarDragon

@ValarDragon ^ yeah it's true there are currently PR's without associated issues, however, we should probably get into the practice of making issues for every PR - even if you only make the issue while simultaneously opening a PR.

Interestingly, There are PR's which close multiple issues - so we need to make multiple changelog entries for a single PR - which should be easy.

@ebuchman

Not sure what you're saying here. The problem is that commits are immutable, so if in the course of the commits, we need to modify the same changelog entry multiple times (eg. we change the way a feature is implemented multiple times before we release, or we add a feature and then remove it). If all the entries are in commits, we either need new syntax to say "ignore old entry and use this one" which sounds very complex or we need to wait until its time to generate the changelog and then fix the duplication issues.

Oh I understand, I was referencing the entire historical changelog - you're talking about intermediate modifications to the same entry - Yeah I agree that it's nicer to just modify a file where each file represents one changelog entry, we can avoid merge conflicts. Koodos! let's do it

rigelrozanski avatar Jul 16 '18 17:07 rigelrozanski