Automatic Changelogs
Changelogs
Requirements
- All changes to the code come through a PR (eg. using GitFlow)
- All PRs generate or modify at least one entry in the changelog
- All entries in the changelog link to all issues addressed by the PRs that generated/modified them
- Compiling the changelog should be easy, automatic, and done intermittently to avoid merge conflicts
Problem
The obvious but naive solution is to update the CHANGELOG.md with every PR. This has problems:
- constant merge conflicts
- changelog entries get added under the wrong version
Ideal
- Every commit is a single full changelog entry
- eg. "break [x/auth] NewHandler takes a WetKeeper"
But this is probably too extreme of a discipline.
Proposal 1 - Use Commit History
- approximate the ideal
- commits that include `#changelog:<type>:<issue>` will be included in the changelog
- `<type>` is one of `break`, `feat`, `improve`, `fix`
- `<issue>` is the issue number, or hyphen separated list of them
- every PR includes at least one commit with `#changelog:<type>:<issue>`
- use a tool to get all commits from all PRs merged since last release and output the full changelog in the form of CHANGELOG.md
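The tag format above could be parsed roughly as follows. This is only a sketch: the names `TAG_RE`, `Entry` and `parse_commit` are hypothetical, and it assumes one entry per tagged line, with the rest of that line used as the entry text (the proposal does not specify where the entry text lives).

```python
import re
from typing import NamedTuple

# Matches "#changelog:<type>:<issue>", where <type> is one of the four
# types from the proposal and <issue> is one issue number or a
# hyphen-separated list such as 1546-1332.
TAG_RE = re.compile(r"#changelog:(break|feat|improve|fix):(\d+(?:-\d+)*)")


class Entry(NamedTuple):
    kind: str      # break, feat, improve or fix
    issues: tuple  # issue numbers as ints
    text: str      # assumption: the tagged line minus the tag


def parse_commit(message: str) -> list:
    """Return one Entry per tagged line in a commit message."""
    entries = []
    for line in message.splitlines():
        match = TAG_RE.search(line)
        if match is None:
            continue  # untagged lines are ignored
        issues = tuple(int(n) for n in match.group(2).split("-"))
        text = TAG_RE.sub("", line).strip()
        entries.append(Entry(match.group(1), issues, text))
    return entries
```

Parsing line by line means a commit carrying several tagged lines yields several entries, and commits with no tag yield none.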
Pros
- Approximates the ideal - can work towards it by becoming better committers
Cons
- Complex to support multiple changelog entries in a single commit
- Requires re-writing commit history to fix changelog data
Proposal 2 - Use a Directory of Unique Files
- keep a .changelog directory
- every PR must generate or modify a file in .changelog
- files in .changelog have the same structure as CHANGELOG.md
- files in .changelog should be named `<author>/<issue>` where `<author>` is the author of the PR and `<issue>` is the issue number being addressed.
  - If there are multiple issues, list them separated by hyphens. Eg. `bucky/1546-1332-1102`
- use a tool to merge all files in .changelog and then delete them all
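A minimal sketch of the merge-then-delete step, assuming entry files have no extension and that plain concatenation is an acceptable merge. A real tool would probably need to combine matching sections, since each file is shaped like CHANGELOG.md. The function names `collect_entries` and `merge_and_delete` are hypothetical.

```python
from pathlib import Path


def collect_entries(changelog_dir: Path) -> list:
    """Return (author, issues, body) for every <author>/<issue> file."""
    entries = []
    for path in sorted(changelog_dir.glob("*/*")):
        if not path.is_file():
            continue
        author = path.parent.name
        # "1546-1332-1102" -> [1546, 1332, 1102]; raises ValueError on
        # file names that do not follow the naming rule.
        issues = [int(n) for n in path.name.split("-")]
        entries.append((author, issues, path.read_text().strip()))
    return entries


def merge_and_delete(changelog_dir: Path) -> str:
    """Concatenate all entry bodies, then remove the entry files."""
    entries = collect_entries(changelog_dir)
    merged = "\n\n".join(body for _, _, body in entries)
    for path in changelog_dir.glob("*/*"):
        if path.is_file():
            path.unlink()
    return merged
```

Because entries live in ordinary files until the merge runs, an entry can be corrected any number of times before it reaches the changelog.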
Pros
- easy to make fixes and updates for the same entry by editing a file
Cons
- need to maintain new directory and have extra tmp files in the repo
Tool
- Call it `clg` for change-log-generator
- `clg <version>` will output a changelog formatted like CHANGELOG.md
Enforcement
Both proposals can be enforced through rules in testing
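As a rough illustration of what such rules might look like, here is one check per proposal. Both function names are hypothetical. How the inputs are obtained is also an assumption: the changed paths could come from something like `git diff --name-only`, and the commit messages from `git log`.

```python
from pathlib import PurePosixPath


def touches_changelog_dir(changed_files: list) -> bool:
    """Proposal 2: True if the PR changed a .changelog/<author>/<issue> file."""
    for name in changed_files:
        parts = PurePosixPath(name).parts
        if len(parts) == 3 and parts[0] == ".changelog":
            return True
    return False


def has_changelog_tag(commit_messages: list) -> bool:
    """Proposal 1: True if at least one commit carries a #changelog tag."""
    return any("#changelog:" in message for message in commit_messages)
```

A CI job could fail the build when the relevant check returns False.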
Notes
- Use `git log 06216c1f^..06216c1f --oneline` to print all commits involved in the merge commit `06216c1f`
- Use `git log --oneline --merges --first-parent <version>..HEAD` to get the list of all merge commits to the given branch since `<version>`
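A tool could wrap the two commands above along these lines. This is a sketch only: the helper names are hypothetical, and `run_git` assumes `git` is on the PATH and that `repo` points at a checkout.

```python
import subprocess


def merge_commits_cmd(version: str) -> list:
    """argv listing merge commits on the current branch since <version>."""
    return ["git", "log", "--oneline", "--merges", "--first-parent",
            f"{version}..HEAD"]


def commits_in_merge_cmd(merge_sha: str) -> list:
    """argv listing the commits brought in by one merge commit."""
    return ["git", "log", f"{merge_sha}^..{merge_sha}", "--oneline"]


def run_git(argv: list, repo: str = ".") -> list:
    """Run a git command in `repo` and return its output lines."""
    result = subprocess.run(argv, cwd=repo, check=True,
                            capture_output=True, text=True)
    return result.stdout.splitlines()
```

Calling `run_git(merge_commits_cmd(version))` and then `run_git(commits_in_merge_cmd(sha))` for each merge would give the tool every commit merged since the last release.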
Summary
I'm leaning towards using files. While commit history is the ideal, it's hard to realize the ideal, and it's much more difficult to fix/update the changelog data since it's contained in commits. Using files makes it easiest to find the data and to continuously update it in an auditable way if need be.
I like the commit history option, though I have the same concern regarding fixing / updating changelog data. (Though I may not be the right person to opine on this, since I write long commit messages)
RE: File approach, is it possible for a GitHub bot to read the PR message, and then make an additional commit when merging the PR? Then we could have the GitHub bot do the file thing you described, and then have the tool aggregate it all together. (Making new files like that seems like more of a pain to me than dealing with the merge conflicts we have right now)
> Making new files like that seems like more of a pain to me than dealing with the merge conflicts we have right now

Can you explain why? It's effectively the same amount of work as opening CHANGELOG.md and making edits.
Good point, I misread the file proposal and thought it said PR number (not issue number). I didn't like the idea of having to figure out the PR number / making everything at least two commits, but since it's the issue number, it's nbd. However, we have lots of PRs without assoc. issue numbers; not sure how those would update the changelog.
With proposal 1, how does this integrate with squashing? It will also require devs to make sure commits are very precise in their nature (i.e. no "fix bug... and also clean up a lot of slightly unrelated godocs"). This can be done with interactive committing, but maybe I'm overthinking that.
I do like proposal 2. Regarding "use a tool to merge all files in .changelog and then delete them all": when will this step be done? When the PR is merged?
I don't really think the merge conflicts are that annoying of a problem. I do agree that changelog entries going into the wrong section is a big problem.
With that in mind, I think the following may be easier. We have a single "pending_changelog" file. All commits add to that. We move the pending changelog into the changelog when cutting a new release. Old PRs still just update the pending changelog. (Perhaps having to fix a merge conflict, tho perhaps not since it's just lines that've been removed)
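The release step of this pending-file idea might look like the sketch below. The `## <version>` heading and the function name `cut_release` are assumptions for illustration; the thread does not specify the changelog's heading format.

```python
def cut_release(pending: str, changelog: str, version: str) -> tuple:
    """Move pending entries under a new version heading.

    Returns (new_changelog, new_pending); the pending text comes back
    empty so later PRs start from a clean file.
    """
    # Assumption: releases are headed "## <version>", newest first.
    section = f"## {version}\n\n{pending.strip()}\n"
    new_changelog = section + "\n" + changelog.lstrip()
    return new_changelog, ""
```

Since entries are only assigned a version at release time, a long-lived PR cannot land its entry under the wrong version.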
I'm leaning more towards Proposal 1. It feels sloppy to have a bunch of loose proposal files which need to get deleted by some CI on merge (or manually deleted at some point! yuck)
Some comments I'd like to make:
- the `#changelog` entry in the commit history can be a part of the commit `description`. Doesn't seem like too much of a hassle to support multiple changelog entries this way; each item can just be on a new line
- I think we can easily do this with no commit history re-write. As long as we use special indicators for each release (both in `CHANGELOG.md` and the release commit), we can have the tool only create new updates to the existing `CHANGELOG.md`. This way the old changelog could be preserved and new information for a current release could just be tacked on top
As per these points, I think the cons of proposal-1 are alleviated.
Interesting. Seems other repos do this too. Their squashed commit contains a very verbose description with issue #'s, changes, etc...
> However, we have lots of PRs without assoc. issue numbers; not sure how those would update the changelog.

We should never have PRs without associated issue numbers. It's basically a requirement that you open an issue describing the change you want to make before opening a PR.
> it feels sloppy to have a bunch of loose proposal files which need to get deleted by some CI on merge (or manually deleted at some point! yuck)

One way or another there will be manual work to review the changelog and fix things, since the automation will never be perfect here. It seems more manageable to have a tool that reads a bunch of files and then deletes them than one that reads commits: commits are immutable, so we can't update entries as we go, unlike with files.
> I think we can easily do this with no commit history re-write. As long as we use special indicators for each release (both in `CHANGELOG.md` and the release commit), we can have the tool only create new updates to the existing `CHANGELOG.md`. This way the old changelog could be preserved and new information for a current release could just be tacked on top

Not sure what you're saying here. The problem is that commits are immutable, which hurts if, over the course of the commits, we need to modify the same changelog entry multiple times (eg. we change the way a feature is implemented multiple times before we release, or we add a feature and then remove it). If all the entries are in commits, we either need new syntax to say "ignore old entry and use this one", which sounds very complex, or we need to wait until it's time to generate the changelog and then fix the duplication issues.
Seems much cleaner and simpler to me to just have a bunch of files that can be updated as we go.
> With proposal 1, how does this integrate with squashing? It will also require devs to make sure commits are very precise in their nature (i.e. no "fix bug... and also clean up a lot of slightly unrelated godocs"). This can be done with interactive committing, but maybe I'm overthinking that.

Not exactly sure, seems we have to be careful with the squashing or very precise in the commits. I think using the commits is a bit too fragile right now.
> Regarding "use a tool to merge all files in .changelog and then delete them all": when will this step be done? When the PR is merged?

Whenever a maintainer feels like it. If there are lots of changes, it can be intermittent. Otherwise it can all happen right before release. This is another reason why files are better than commits here: you can consolidate into a single changelog as you go more easily.
> We should never have PRs without associated issue numbers. It's basically a requirement that you open an issue describing the change you want to make before opening a PR.

This is definitely not what's happening right now. There are many PRs without assoc. issues: https://github.com/cosmos/cosmos-sdk/pull/1688 https://github.com/cosmos/cosmos-sdk/pull/1684 https://github.com/cosmos/cosmos-sdk/pull/1669 https://github.com/cosmos/cosmos-sdk/pull/1668 https://github.com/cosmos/cosmos-sdk/pull/1627 https://github.com/tendermint/tendermint/pull/1979
I don't think most of the above needed a new issue either. (That would increase development time for small PRs, which I think should be fast)
Also any thoughts on my proposal? (https://github.com/tendermint/tendermint/pull/1979)
@ValarDragon ^ yeah it's true there are currently PRs without associated issues. However, we should probably get into the practice of making issues for every PR, even if you only make the issue while simultaneously opening a PR.
Interestingly, there are PRs which close multiple issues, so we need to make multiple changelog entries for a single PR, which should be easy.
@ebuchman
> Not sure what you're saying here. The problem is that commits are immutable, which hurts if, over the course of the commits, we need to modify the same changelog entry multiple times (eg. we change the way a feature is implemented multiple times before we release, or we add a feature and then remove it). If all the entries are in commits, we either need new syntax to say "ignore old entry and use this one", which sounds very complex, or we need to wait until it's time to generate the changelog and then fix the duplication issues.

Oh I understand, I was referencing the entire historical changelog, and you're talking about intermediate modifications to the same entry. Yeah, I agree that it's nicer to just modify a file where each file represents one changelog entry; that way we can avoid merge conflicts. Kudos! Let's do it.