Links to raw repo or documentation sites?
There has been some progress on creating auto-generated interactive documentation for some of the repos:
- https://thealgorithms.github.io/C
- https://thealgorithms.github.io/C-Plus-Plus

This has raised some questions about the direction in which to proceed.
I see that software and software libraries typically link first to their documentation sites, which in turn link to their code repos. As an educational platform, wouldn't the docs be more beneficial to readers, who can then navigate to the repo and the respective code?
Questions
- What value do users/readers get from links to raw repo?
- Is there value in auto-generating docs for each repo in a similar fashion, published to their respective gh-pages branches?
- Should the main website provide links to these docs instead?
- This would, however, require contributors to document their contributions properly, which would in turn improve the quality of the code accepted into the repos. Is the reviewers' time consumed outweighed by the benefits?
(Did I miss anything else?)
What value do users/readers get from links to raw repo?
The notion that organized algorithms with reviewed comments and tests are "raw" is misleading. To answer the question above... tons of value. Look at the Gitter comments that people have when they find TheAlgorithms. Those comments are about the power of browsing and reading code. They are about seeing a clear catalog of algorithms organized by type. These repos allow visitors to learn fast by reading code and to understand implementation trade-offs with confidence. I personally would rather see more performance benchmarks than documentation. Auto-generated docs are merely reformatting what is already in the code. Such documentation has its place but it is not more important than the code itself.
You can look at the empirical data - https://github.com/TheAlgorithms/C-Plus-Plus/graphs/traffic and notice the increase in traffic to the repo after the deployment of the doc site. I don't think the value-proposition has been noted here. Hence the exercise to answer the above questions for each of the commenters.
- Benefits of using the GitHub repo as the target site:
- direct access to code
- no maintenance
- less code review
- drawback: filenames in the repos are not coherent
- Benefits of using the doc site:
- conveniently formatted formulae, links, descriptions, images, rich HTML
- submitted code needs to be formatted and thoroughly checked before it is accepted (note that much of the existing code in the C and C++ repos was not even compilable)
- a way to provide an educational site along with the code
- ability to inject JavaScript to provide compile-and-run access in the future
The problem with code comments is that there is no standardized way to write them, and the writing of comments and docstrings is generally not enforced. We need a way to standardize either the writing of comments or the writing of documentation.
Are you sure? While there might be multiple documentation standards, it is up to the group to pick one and stick with it:
- For the C/C++ repos, we are using Doxygen standards.
- For Python, there are Sphinx, pydoc, etc.
- Doxygen also handles other languages, such as Java.

If there is undocumented code in the repos, then that is an even bigger problem. The whole point of this repo is to provide educational code, so if code without documentation is being accepted, it is quite concerning.
Are you sure? While there might be multiple documentation standards, it is up to the group to pick one and stick with it
This is true, different programming languages have different standards. I am talking about each repository individually. Contribution guidelines encourage the use of comments, etc., but do not require it. I have often seen poorly named variables allowed into the code.
Here is a piece from the Python contribution guidelines:
We encourage you to put docstrings inside your functions but please pay attention to indentation of docstrings.
As you see, this is encouraged but not required.
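To make the gap between "encouraged" and "required" concrete, here is a minimal sketch of how a docstring requirement could be checked automatically, for example in a CI step. The function name `find_undocumented` and the sample source are hypothetical and not part of any TheAlgorithms repo; the sketch only relies on Python's standard `ast` module.

```python
import ast
from typing import List


def find_undocumented(source: str) -> List[str]:
    """Return the names of functions and classes in `source` that lack a docstring."""
    tree = ast.parse(source)
    missing = []
    for node in ast.walk(tree):
        # Only definitions that can carry a docstring are checked.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if ast.get_docstring(node) is None:
                missing.append(node.name)
    return missing


# Hypothetical contribution: one documented function, one undocumented.
SAMPLE = '''
def gcd(a, b):
    """Return the greatest common divisor of a and b."""
    while b:
        a, b = b, a % b
    return a

def lcm(a, b):
    return a * b // gcd(a, b)
'''

if __name__ == "__main__":
    # Reports the definitions a reviewer would otherwise have to spot by hand.
    print(find_undocumented(SAMPLE))
```

A check like this would only confirm that a docstring exists, not that it is accurate or helpful, so it would reduce rather than replace reviewer effort.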
Absolutely true. Enforcing the standards has taken real effort and is definitely not a one-day thing. See, for example, this PR; we are still not done. Others are working on the remaining folders.
It has also been observed that enforcing stricter rules and standards has improved both the quality of contributors and the quality of code. Although the cost of acquiring contributors has gone up, the cost of retaining quality contributors has gone down. 😄
I came across this repo, https://github.com/OpenGenus/cosmos, which has a really nice website. It is a really good set of code, nicely organized into one giant repo. Even they are using the same long, inefficient process that requires separate contributions to code, website, documentation, etc.
I like the approach of TheAlgorithms, where the repos are nicely organized by language. It is already evident that there is a great imbalance in the number of contributors across the repos, maybe due to language popularity. Finding contributors for a website like the one above is going to run into the same problem. Hence, this is further empirical support for requiring code to be self-explanatory and descriptive, so that the documentation gets generated and becomes available immediately.
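As a rough illustration of documentation being generated directly from descriptive code, the sketch below builds a plain-text entry from a function's signature and docstring using Python's standard `inspect` module. The helper `render_entry` and the `binary_search` example are hypothetical; real generators such as Doxygen or Sphinx do far more, but the principle is the same: what the contributor writes in the source is what the reader sees in the docs.

```python
import inspect


def binary_search(items, target):
    """Return the index of `target` in sorted `items`, or -1 if absent.

    >>> binary_search([1, 3, 5, 7], 5)
    2
    >>> binary_search([1, 3, 5, 7], 4)
    -1
    """
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return -1


def render_entry(func):
    """Build a plain-text documentation entry from a signature and docstring."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or "(undocumented)"
    return f"{func.__name__}{signature}\n{doc}"


if __name__ == "__main__":
    import doctest

    # The examples in the docstring double as tests.
    doctest.testmod()
    print(render_entry(binary_search))
```

Because the usage examples live in the docstring, they serve as both documentation and tests, which is one way the review cost of documentation could be partly offset.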