ozone icon indicating copy to clipboard operation
ozone copied to clipboard

HDDS-11072. Publish user-facing configs to the doc site

Open sarvekshayr opened this issue 1 year ago • 12 comments

What changes were proposed in this pull request?

This PR introduces a mechanism to automatically generate a markdown (.md) file from the ozone-default.xml and ozone-default-generated.xml files, with the entries sorted by configuration keys. The generated markdown file is then published to a new page under the documentation site.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11072

How was this patch tested?

Verified the output and format of the markdown (.md) file.

sarvekshayr avatar Jul 09 '24 08:07 sarvekshayr

@devabhishekpal could you please take a look at the changes, thanks.

tanvipenumudy avatar Jul 09 '24 08:07 tanvipenumudy

Thanks for working on this @sarvekshayr. The PR description mentions automation and publishing, but I don't see any automated way of running this in the PR currently. This is important to work out because it affects the implementation:

  • If we are generating the table on every Ozone build, a Java implementation would be preferred so no new deps are added to the build.
  • If we are generating the table with GitHub actions in CI, a shell based implementation would be preferred.
    • A script using an XML parser like yq (similar to jq and already bundled in Ubuntu actions runners) or xmllint with xpath queries can pull config keys from a config xml file.
    • If we plan to run this from the new website in a "pull" model for HDDS-10683, there will be no Java dependencies in that environment.
      • Note that the repo for the new website contains all the docs there instead of having them in the ozone repo.

Since it looks like the configs require outputs from the Ozone build, not just the source, I'm assuming the plan for automation will look something like this:

  • The Java implementation is plugged into the Maven build when the Ozone is built.
  • For the current website, the source file it generates as a result of the build is committed back to the repo somehow, either as a separate PR by a bot or a commit done in CI.
  • For the new website, the source file it generates is committed back to the ozone-site repo by a bot either automatically or through a pull request.

errose28 avatar Jul 09 '24 15:07 errose28

Hi @sarvekshayr, thanks for adding this patch and @errose28 for your inputs. I believe it would be better to use GitHub actions for committing the result rather than using a plugin. What we could do is:

  • After the build step we can upload the generated files as artifacts (there is a GitHub action actions/upload-artifact@v4 which stores the artifacts generated during a workflow for consumption later)
  • We add a workflow to download these uploaded artifacts (actions/download-artifact@v4) and as you suggested use shell commands to detect the generated xml files and transform them.
  • This can then be committed back into the same PR like we did for the dependabot script or raise a separate PR for both ozone-site (for the new website) and hdds-docs.

@sarvekshayr please refer below on how you might be able to share the data between the workflows: https://docs.github.com/en/actions/using-workflows/storing-workflow-data-as-artifacts

And for pushing the commit you can check the changes that were done for dependabot if you want to add the generated markdown file to the same PR. This is the reference to the line where that step is being executed.

Or you can use the bot to raise a separate PR for better tracking via GitHub actions.

@errose28 @adoroszlai let me know your thoughts as well.

devabhishekpal avatar Jul 28 '24 05:07 devabhishekpal

Thanks for the input @devabhishekpal. Here's what I'm thinking based on the extra info you shared:

  1. We have a GH action that runs as part of the CI on each commit merged to master only.
  2. The action generates the markdown file for the site.
  3. The action checks the hash of the markdown file against the hash of the one already committed. It only raises a PR if they differ.
  4. The action raises a separate PR with the docs changes that we manually review and merge.
    • I don't think we make config key changes too often so the extra manual review should be ok.
    • For now it can raise a PR to the current docs (in the ozone repo) and the new docs (in the ozone-site repo) at the same time.

Some thoughts:

  • In the dependabot changes, we added the commit onto the branch for the PR before it was merged. This was fine in that context because the PR was raised by a bot, but will be annoying to human developers because it will cause their local and upstream branches to diverge.
  • In step 2 above, either method of file generation would work:
    • A Java implementation would build the file as part of the Ozone build and it would be in the build artifact. It could pull the file out of the build artifact for that run and raise a PR.
    • A shell implementation could make the file from the source used in that run and raise a PR.
  • I'm not sure we need a separate artifact just for the generated file, because it will be raised in a PR in the same step which will serve as persistence.

errose28 avatar Jul 29 '24 16:07 errose28

Hello @errose28 @adoroszlai. I had added a comment for checking the hash of the configuration.md that would be generated. But before the first run of this workflow I believe we would require the file to be present initially. For this we can do either of the following:

  • Manually generate the file and raise PR to add it to the current repo on both ozone and ozone-site
  • Add a check to the hash checking step - in case the file is not present continue with the commit else compare hash

Which one do you think would be better?

devabhishekpal avatar Aug 17 '24 17:08 devabhishekpal

I had added a comment for checking the hash of the configuration.md that would be generated. But before the first run of this workflow I believe we would require the file to be present initially.

I would check the hash if the file exists, and raise a PR if the file does not exist. That way the workflow is not dependent on the initial repo state it is running on in order to succeed. If we are doing PRs instead of commits we have a chance to fix it manually if needed.

errose28 avatar Aug 19 '24 17:08 errose28

Thinking back on https://github.com/apache/ozone/pull/6916#discussion_r1722218763, do you think we should also have this run for 1.4.0 branch or run for older release branches as well? This would be applicable where we might backport changes? Or is it expected that config related changes will only be present on the latest release at the time and older releases will only receive fixes?

devabhishekpal avatar Aug 19 '24 18:08 devabhishekpal

do you think we should also have this run for 1.4.0 branch or run for older release branches as well? This would be applicable where we might backport changes? Or is it expected that config related changes will only be present on the latest release at the time and older releases will only receive fixes?

This is a good question. I think defining a mapping between release branches and locations in the docs may be difficult to maintain. This is true whether the docs live in the ozone or ozone-site repo, since doc versions are stored in different directories. We could come up with some regex type thing but that seems fragile. There likely won't be many config changes to past release lines, so it is probably ok to leave it as just the master branch of ozone and ozone-site for now.

errose28 avatar Aug 19 '24 21:08 errose28

I was talking with @sarvekshayr, who was testing this out on her forks of ozone and ozone-site and getting permission errors when ozone tried to raise a PR to ozone-site, since it would need to create a new branch for the PR in ozone-site.

After some research, we will need a personal access token (PAT) for apache/ozone to have permissions to create branches needed for PRs to apache/ozone-site. This needs ASF infra to approve, but it looks like there's already precedent for other projects doing it so we should be able to get approval. See INFRA-25629, INFRA-23946, INFRA-23220, and INFRA-23044 for some examples.

@sarvekshayr will test the PAT approach on her fork, where she has permissions to create them. Once we have a POC running on the forks we can raise an infra ticket to get a PAT for ozone-site and store it as a secret in the Ozone repo. This workflow can then reference the secret.

errose28 avatar Sep 17 '24 19:09 errose28

Thanks for the continued work on this @sarvekshayr. Looks like the PAT approach is working well on your fork. I'll file an infra ticket to get a PAT for the ozone-site repo, and we can continue work here in the mean time.

After reviewing the auto generated PRs, here are some high level suggestions:


PR Message

  • It would be nice to have the PR messages follow the PR template. We can just hardcode that into the code that is generating it.
  • It would also be good to have the branch and workflow names as actual links.
    • For branches, insert <branchname> into this template: [branchname](https://github.com/apache/ozone-site/tree/<branchname>).
    • For workflows, GitHub has some contexts we might be able to use.
  • Ideally we would be able to extract the Jira ID from the commit message, and use that in the PR message as well. I think we do a similar thing in our existing title CI check.
  • We can use the short version of the hash for the branch name. This can be obtained by passing the current hash to git rev-parse --short.

Overall I'm thinking the PR message could look like this, where HDDS-1234 is the hypothetical Ozone change that made the config modification:

source branch: config-doc-update-from-22ddfb951a

HDDS-1234. Update configurations.md page with changes from HDDS-1234

What changes were proposed in this pull request?

This is an automated pull request triggered by the workflow build-branch run from test-branch-2. Please delete the config-doc-update-from-22ddfb951a branch after this PR is merged or closed.

What is the link to the Apache Jira?

HDDS-1234

How was this patch tested?

Reviewers should manually verify the correctness of this change.


Webpage

  • Right now each config key is rendered as it's own table. We should make them one big table with a header row.
  • When a config key references another config key in its description, it would be nice to wrap it in single ticks.
    • This will render them as inline code and prevent the new website's spell checker from flagging them as errors.
    • Since we have the config key map stored in memory when building the page, we can check if a description contains a config key that needs to be wrapped by checking if a whitespace-delimited word in the description is also a key in the map.
      • This will probably slow down the runtime but since this is happening in the background I think that's ok.
  • We can put each tag on a new line for readability. I have an example below.
  • There are some concatenation errors in the existing code descriptions that the new website's spell checker is correctly flagging. We can fix those in a follow up task, don't worry about it for now.

Overall I'm thinking the webpage could look like this:

Unrendered:

---
sidebar_label: Appendix
---

# Configuration Key Appendix

This page provides a comprehensive overview of all configuration keys available in Ozone.

## Configuration Keys

| **Name** | **Value** | **Tags** | **Description** |
|-|-|-|-|
| `hadoop.hdds.db.rocksdb.WAL_size_limit_MB` | 0MB | `OM`<br/>`SCM`<br/>`DATANODE` | The total size limit of WAL log files. Once the total log file size exceeds this limit, the earliest files will be deleted.Default 0 means no limit. |
| `hadoop.hdds.db.rocksdb.WAL_ttl_seconds` | 1200 | `OM`<br/>`SCM`<br/>`DATANODE` | The lifetime of WAL log files. Default 1200 seconds. |
| `hdds.container.balancer.move.replication.timeout` | 50m | `BALANCER` | The amount of time to allow a single container's replication from source to target as part of container move. For example, if `hdds.container.balancer.move.timeout` is 65 minutes, then out of those 65 minutes 50 minutes will be the deadline for replication to complete. |

Rendered:

Configuration Key Appendix

This page provides a comprehensive overview of all configuration keys available in Ozone.

Configuration Keys

Name Value Tags Description
hadoop.hdds.db.rocksdb.WAL_size_limit_MB 0MB OM
SCM
DATANODE
The total size limit of WAL log files. Once the total log file size exceeds this limit, the earliest files will be deleted.Default 0 means no limit.
hadoop.hdds.db.rocksdb.WAL_ttl_seconds 1200 OM
SCM
DATANODE
The lifetime of WAL log files. Default 1200 seconds.
hdds.container.balancer.move.replication.timeout 50m BALANCER The amount of time to allow a single container's replication from source to target as part of container move. For example, if hdds.container.balancer.move.timeout is 65 minutes, then out of those 65 minutes 50 minutes will be the deadline for replication to complete.

errose28 avatar Sep 20 '24 20:09 errose28

I filed INFRA-26150 for adding a personal access token.

errose28 avatar Sep 23 '24 16:09 errose28

Hi @sarvekshayr can you temporarily modify this change to use the OZONE_WEBSITE_BUILD secret per INFRA-26150 to raise a test pull request from a commit in this pull request to the ozone-site repo? Similar to what you have done on your fork. If that test works we can close out the infra ticket and then revert the test change.

errose28 avatar Oct 15 '24 16:10 errose28

@errose28 @sarvekshayr ozone-site has automation to update the website daily with content generated from hadoop-hdds/docs. Shouldn't we use the same approach for documenting configs?

adoroszlai avatar Dec 11 '24 17:12 adoroszlai

ozone-site has automation to update the website daily with content generated from hadoop-hdds/docs. Shouldn't we use the same approach for documenting configs?

I think the biggest difference is that the docs content that is merged into the current website is automatically generated and we assume it does not need reviews. However for this config table I think we want the PR to be automatically created but an actual dev to verify the content lines up with the change that it came from, CI passes, etc before publishing.

So the question is: should ozone-site pull the changes and create a PR, or should ozone push the changes and create a PR. Since changes are being reviewed, it would be good to have a way to trace back which ozone changes led to which config changes needed in the docs, and have one doc update PR per config change commit in Ozone. Easiest way I see to do this is to use a push model, so Ozone can attach information about the commit that triggered the push in the ozone-site pull request.

errose28 avatar Dec 13 '24 18:12 errose28

for this config table I think we want the PR to be automatically created but an actual dev to verify the content lines up with the change that it came from

In that case current approach is fine, I didn't know manual review was desired.

adoroszlai avatar Dec 13 '24 18:12 adoroszlai

This PR has been marked as stale due to 21 days of inactivity. Please comment or remove the stale label to keep it open. Otherwise, it will be automatically closed in 7 days.

github-actions[bot] avatar Nov 12 '25 00:11 github-actions[bot]

Thank you for your contribution. This PR is being closed due to inactivity. If needed, feel free to reopen it.

github-actions[bot] avatar Nov 19 '25 00:11 github-actions[bot]