Duplication of glossary/terminology overview documents
We have a glossary of Gluster terminology in both the Quick Start Guide and the Administrator Guide.
The problem is that the glossary content is duplicated in two separate files with different formatting, which makes it hard to maintain and keep in sync.
Details
There are 2 files with gluster glossary overview:
- Terminologies.md from Quick Start Guide
- glossary.md from Administrator Guide
While the number of items explained in each file differs greatly:
$ grep '^###' Quick-Start-Guide/Terminologies.md | sed 's/### //' | wc -l
19
$ grep '^\*\*' Administrator\ Guide/glossary.md | sed 's/\*\*//g' | wc -l
45
There is a lot of duplication. The following terms are explained in both files:
$ comm -12 <(grep '^\*\*' Administrator\ Guide/glossary.md | sed 's/\*\*//g' | sort) <(grep '^###' Quick-Start-Guide/Terminologies.md | sed 's/### //' | sort)
Brick
Client
Cluster
Distributed File System
FUSE
Geo-Replication
glusterd
Metadata
Namespace
POSIX
RAID
RRDNS
Server
Trusted Storage Pool
Userspace
Volume
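The same comparison can be sketched in Python, which may be handier if the check should run as part of a docs build. This is only a rough sketch: the heading patterns are taken from the grep commands above, and the two sample strings are made-up stand-ins for the real files, not their actual content.

```python
import re

# Heading conventions, taken from the grep commands above.
QUICK_START_HEADING = r"^### (.+)$"      # e.g. "### Brick"
ADMIN_GUIDE_HEADING = r"^\*\*(.+?)\*\*"  # e.g. "**Brick**"


def glossary_terms(text, heading_pattern):
    """Return the set of terms found on lines matching heading_pattern."""
    matches = re.finditer(heading_pattern, text, flags=re.MULTILINE)
    return {m.group(1).strip() for m in matches}


def shared_terms(quick_start_text, admin_guide_text):
    """Return the sorted list of terms explained in both files."""
    quick_start = glossary_terms(quick_start_text, QUICK_START_HEADING)
    admin_guide = glossary_terms(admin_guide_text, ADMIN_GUIDE_HEADING)
    return sorted(quick_start & admin_guide)


# Made-up sample content; in practice, read the two .md files instead.
quick_start_sample = "### Brick\nPlaceholder text.\n### Volume\nPlaceholder text.\n"
admin_guide_sample = "**Brick**\n: Placeholder text.\n**Example Term**\n: Placeholder text.\n"

print(shared_terms(quick_start_sample, admin_guide_sample))
```

Like the `comm -12` one-liner, this compares terms by exact spelling, so entries that differ only in case or punctuation would not be reported as duplicates.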
Here is an example of how the explanation for Brick looks in each file. The first is from Quick-Start-Guide/Terminologies.md:
### Brick
Brick is the basic unit of storage, represented by an export directory
on a server in the trusted storage pool.
While this is from Administrator Guide/glossary.md:
**Brick**
: A Brick is the basic unit of storage in GlusterFS, represented by an export
directory on a server in the trusted storage pool.
A brick is expressed by combining a server with an export directory in the following format:
`SERVER:EXPORT`
For example:
`myhostname:/exports/myexportdir/`
Expected state
There should be no duplication of information.
We could, for example, store the content in a single file and include that file in both guides, so the content is not duplicated.
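One possible shape for the single-file approach is a small pre-build step that expands an include marker. This is a minimal sketch and not something glusterdocs supports today: the `{{include: ...}}` marker syntax, the `expand_includes` helper, and the file layout are all assumptions made up for illustration.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical marker syntax: a line consisting only of "{{include: some/file.md}}".
INCLUDE_MARKER = re.compile(r"^\{\{include:\s*(.+?)\s*\}\}$", re.MULTILINE)


def expand_includes(text, base_dir):
    """Replace each include marker with the content of the referenced file."""
    def load(match):
        included = Path(base_dir) / match.group(1)
        # Drop trailing newlines so the marker line's own newline is reused.
        return included.read_text(encoding="utf-8").rstrip("\n")
    return INCLUDE_MARKER.sub(load, text)


# Demo with a throwaway directory and placeholder glossary content.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "glossary.md").write_text(
        "**Brick**\n: Placeholder definition.\n", encoding="utf-8"
    )
    page = "# Terminologies\n\n{{include: glossary.md}}\n"
    expanded = expand_includes(page, tmp)

print(expanded)
```

Both guides would then contain only the marker line, and the glossary text would live in one place.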
@mbukatov Thanks! Would you like to send a PR for this?
@humblec Do you agree with squashing both documents into a single one, which would be included in both guides without duplication?
I think squashing is a must, otherwise it may just introduce too much confusion.
How about this:
- Quick-Start-Guide/Terminologies.md is still valid, but how it is presented should change:
  - it should consist of the list of terms used later in the Quick Start guide, the minimum set focused on a very specific, minimal gluster setup - so we should ditch RAID, geo-replication and so on...
  - each entry should be shortened to the minimum - one sentence if possible
  - each entry should point to the Administration guide for a more in-depth explanation of the term.
This way the Quick Start introduces the basic concepts without requiring the reader to go through tons of text at the start. Since the entries would reference the Administration guide glossary, we can merge the descriptions from both files into it. The first sentence of each glossary entry could also be the same as in Terminologies.
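To illustrate the idea, here is a rough sketch that derives the short Quick Start entries from the full glossary by taking the first sentence of each definition and appending a link. It assumes the `**Term**` / `: definition` layout shown in the Brick example above; the function names and the link target are made up for illustration, and the sentence splitting is naive (it would break on abbreviations such as "e.g.").

```python
import re


def parse_glossary(text):
    """Map each '**Term**' heading to its definition, joined into one line."""
    entries = {}
    term = None
    for line in text.splitlines():
        heading = re.match(r"^\*\*(.+?)\*\*\s*$", line)
        if heading:
            term = heading.group(1).strip()
            entries[term] = []
        elif term is not None and line.strip():
            # Drop the leading ': ' of a definition line, if present.
            entries[term].append(line.strip().lstrip(": ").strip())
    return {name: " ".join(parts) for name, parts in entries.items()}


def first_sentence(definition):
    """Return the text up to the first '.', '!' or '?' followed by whitespace or the end."""
    match = re.match(r"(.+?[.!?])(\s|$)", definition)
    return match.group(1) if match else definition


def quick_start_entries(glossary_text, wanted_terms, glossary_link):
    """Build '### Term' entries with a one-sentence summary and a link."""
    glossary = parse_glossary(glossary_text)
    blocks = []
    for term in wanted_terms:
        summary = first_sentence(glossary[term])
        blocks.append(
            f"### {term}\n{summary} See the [glossary]({glossary_link}) for details."
        )
    return "\n\n".join(blocks)


# Sample input: the Brick entry quoted earlier in this issue.
admin_glossary = """**Brick**
: A Brick is the basic unit of storage in GlusterFS, represented by an export
directory on a server in the trusted storage pool.
A brick is expressed by combining a server with an export directory in the following format:
`SERVER:EXPORT`
"""

# The link target is a placeholder, not a verified path in the docs tree.
short_entries = quick_start_entries(
    admin_glossary, ["Brick"], "../Administrator%20Guide/glossary.md"
)
print(short_entries)
```

Only the list of wanted terms would need to be maintained by hand for the Quick Start guide; the wording itself would come from the single glossary file.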
The best idea would actually be to convert the documentation to reStructuredText: it provides much better flexibility in managing documentation through linking between articles and including subsections. But this would take some time and would make the docs a bit more complex to configure.
@prashanthpai ^^^ Any thoughts here?
Just looked at the docs again after waking up. There's:
- Getting started with GlusterFS > Quick start Guide + Terminologies
- Install Guide > Common Criteria + Quick start to install
- Administration Guide > Terminologies (which links to a different section, causing a reading loop) + Glossary
So actually there are at least three places, and I think it's one too many :)
> The best idea would actually be to convert the documentation to reStructuredText: it provides much better flexibility in managing documentation through linking between articles and including subsections. But this would take some time and would make the docs a bit more complex to configure.
Agreed. New subprojects use rst. We're stuck with markdown for glusterdocs though, for now.
> So actually there are at least three places, and I think it's one too many :)
PRs to remove this duplication are always welcome :)
AFAIR sphinxdoc allows converting docs from Markdown to reStructuredText; I remember doing this. But then it would require changing the build process of the documentation. So the first step would be making a minimal new build script/config that processes .md files, which can later be easily converted to .rst. If I get some spare time I'll try to make a fork+branch to show this.
> AFAIR sphinxdoc allows converting docs from Markdown to reStructuredText; I remember doing this. But then it would require changing the build process of the documentation.
We explored this option earlier (using pandoc). In many cases it required manual intervention after the conversion, and it was too much effort.
Yeah, been there ;-) Either way the docs need reformatting, so it's inevitable.
The only thing I fear is that the docs may already be included in some other projects/packages that rely heavily on the current setup.
But switching to reStructuredText would allow easy export to multiple formats like man pages, PDF and so on.
Alright - so what are the next steps here? Is reconciling the documents and presenting a single glossary the decision?