
Duplication of glossary/terminology overview documents

Open mbukatov opened this issue 9 years ago • 10 comments

We have a glossary of gluster terminology in both the Quick Start Guide and the Administrator Guide.

The problem is that the glossary content is duplicated in two separate files with different formatting, which makes it hard to maintain and keep in sync.

Details

There are 2 files with a gluster glossary overview:

  • Quick-Start-Guide/Terminologies.md
  • Administrator Guide/glossary.md

The number of items explained in each file differs greatly:

$ grep '^###' Quick-Start-Guide/Terminologies.md | sed 's/### //' | wc -l
19
$ grep '^\*\*' Administrator\ Guide/glossary.md | sed 's/\*\*//g' | wc -l
45

There is a lot of duplication. The following terms are explained in both files:

$ comm -12 <(grep '^\*\*' Administrator\ Guide/glossary.md | sed 's/\*\*//g' | sort) <(grep '^###' Quick-Start-Guide/Terminologies.md | sed 's/### //' | sort)
Brick
Client
Cluster
Distributed File System
FUSE
Geo-Replication
glusterd
Metadata
Namespace
POSIX
RAID
RRDNS
Server
Trusted Storage Pool
Userspace
Volume
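
To plan a merge it may also help to see which terms exist in only one of the two files. The following is a minimal sketch that reuses the grep/sed patterns from the commands above and swaps `comm -12` for `comm -23` and `comm -13`. The function names are hypothetical, and the sketch assumes the same heading conventions (`**Term**` in the glossary, `### Term` in the terminology file).

```shell
#!/bin/sh
# Sketch: list glossary terms that appear in only one of the two files.
# File layout assumptions come from the issue; function names are hypothetical.

# Print the sorted terms of the Administrator Guide glossary ("**Term**" lines).
admin_terms() {
    grep '^\*\*' "$1" | sed 's/\*\*//g; s/[[:space:]]*$//' | sort
}

# Print the sorted terms of the Quick Start terminology file ("### Term" headings).
quickstart_terms() {
    grep '^###' "$1" | sed 's/^### //; s/[[:space:]]*$//' | sort
}

# Usage: unique_terms ADMIN_FILE QUICKSTART_FILE
unique_terms() {
    tmpdir=$(mktemp -d)
    admin_terms "$1" > "$tmpdir/admin"
    quickstart_terms "$2" > "$tmpdir/quick"
    echo "Only in $1:"
    comm -23 "$tmpdir/admin" "$tmpdir/quick"   # lines unique to the first list
    echo "Only in $2:"
    comm -13 "$tmpdir/admin" "$tmpdir/quick"   # lines unique to the second list
    rm -rf "$tmpdir"
}

# Example, run from the root of the docs checkout:
# unique_terms "Administrator Guide/glossary.md" "Quick-Start-Guide/Terminologies.md"
```

Temporary files are used instead of process substitution so the sketch also runs under a plain POSIX shell.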

Here is an example of how the explanation of Brick looks in each file. The first is from Quick-Start-Guide/Terminologies.md:

### Brick                                                                       
                                                                                
Brick is the basic unit of storage, represented by an export directory          
on a server in the trusted storage pool.                                        

While this is from Administrator Guide/glossary.md:

**Brick**                                                                       
:   A Brick is the basic unit of storage in GlusterFS, represented by an export 
    directory on a server in the trusted storage pool.                          
    A brick is expressed by combining a server with an export directory in the following format:
                                                                                
        `SERVER:EXPORT`                                                         
    For example:                                                                
        `myhostname:/exports/myexportdir/` 

Expected state

There should be no duplication of information.

We could, for example, store the content in a single file and include that file in both guides.

mbukatov avatar Jan 05 '17 13:01 mbukatov

@mbukatov Thanks! Would you like to send a PR for this?

humblec avatar Jan 05 '17 13:01 humblec

@humblec Do you agree with squashing both documents into a single one, which would be included in both guides without duplication?

mbukatov avatar Jan 06 '17 10:01 mbukatov

I think squashing is a must; otherwise it may just introduce too much confusion.

How about this:

  • Quick-Start-Guide/Terminologies.md is still valid, but the way it is presented should change:
    • it should list only the terms used later in the Quick Start Guide, the minimum set for a very specific, minimal gluster setup, so we should ditch RAID, geo-replication and so on
    • each entry should be shortened to the minimum, one sentence if possible
    • each entry should point to the Administration Guide for a more in-depth explanation of the term.

This way the Quick Start introduces basic concepts without requiring the reader to go through tons of text at the start. And by keeping references to the Administration Guide glossary file, we can merge the descriptions from both files. The first sentence of each glossary entry could also be the same as in Terminologies.
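
As a rough illustration of the "first sentence plus a pointer" idea, here is a minimal sketch that derives short entries from the glossary file. It assumes the `**Term**` / `:   definition` layout quoted earlier in this issue. The function name `short_entries`, the link target, and the wording of the pointer are hypothetical. Sentence detection is naive (first period followed by a space or end of line), so abbreviations such as "e.g." would cut an entry short.

```shell
#!/bin/sh
# Sketch: derive one-sentence Quick Start entries from the glossary.
# Assumes "**Term**" lines followed by ":   definition" text.
# short_entries and the link target are hypothetical.

# Usage: short_entries GLOSSARY_FILE LINK_TARGET
short_entries() {
    awk -v link="$2" '
        # Print the entry collected so far, if any, then reset the state.
        function emit() {
            if (term != "" && text != "")
                printf "### %s\n\n%s See the [glossary](%s) for details.\n\n", term, text, link
            term = ""; text = ""; done = 0
        }
        # A line such as "**Brick**" starts a new term.
        /^\*\*.*\*\*[ \t]*$/ {
            emit()
            term = $0
            gsub(/\*\*/, "", term)
            sub(/[ \t]+$/, "", term)
            next
        }
        # Collect definition text until the first sentence is complete.
        term != "" && !done {
            line = $0
            sub(/^:?[ \t]+/, "", line)
            sub(/[ \t]+$/, "", line)
            if (line == "") { if (text != "") done = 1; next }
            text = (text == "" ? line : text " " line)
            i = index(text, ". ")
            if (i == 0 && text ~ /\.$/) i = length(text)
            if (i > 0) { text = substr(text, 1, i); done = 1 }
        }
        END { emit() }
    ' "$1"
}

# Example, run from the root of the docs checkout (link target is a placeholder):
# short_entries "Administrator Guide/glossary.md" "glossary.md"
```

Generating the short list this way would keep the first sentence identical in both places, at the cost of adding a build step.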

The best idea would actually be converting the documentation to reStructuredText - it provides much better flexibility in managing documentation via linking between articles or including subsections. But this would take some time and would make the docs a bit more complex to configure.

nvtkaszpir avatar Feb 19 '17 21:02 nvtkaszpir

@prashanthpai ^^^ Any thoughts here ?

humblec avatar Feb 20 '17 03:02 humblec

Just looked at the docs again after waking up. There's:

  • Getting started with GlusterFS > Quick start Guide + Terminologies
  • Install Guide > Common Criteria + Quick start to install
  • Administration Guide > Terminologies (which links to a different section, causing a reading loop) + Glossary

So actually there are at least three places, and I think it's one too many :)

nvtkaszpir avatar Feb 20 '17 06:02 nvtkaszpir

> The best idea would actually be converting the documentation to reStructuredText - it provides much better flexibility in managing documentation via linking between articles or including subsections. But this would take some time and would make the docs a bit more complex to configure.

Agreed. New subprojects use rst. We're stuck with markdown for glusterdocs though, for now.

> So actually there are at least three places, and I think it's one too many :)

PRs to remove this duplication are always welcome :)

prashanthpai avatar Feb 22 '17 06:02 prashanthpai

AFAIR sphinxdoc allows converting docs from Markdown to reStructuredText; I remember doing this. But it would require changing the build process of the documentation. So the first step would be making a minimal new build script/configs that process the .md files, which can later be easily converted to .rst. If I get some spare time I'll try to make a fork+branch to show this.

nvtkaszpir avatar Feb 22 '17 07:02 nvtkaszpir

> AFAIR sphinxdoc allows converting docs from Markdown to reStructuredText; I remember doing this. But it would require changing the build process of the documentation.

We have visited this option earlier (using pandoc). In many cases it required manual intervention after the conversion, which was too much effort.

prashanthpai avatar Feb 22 '17 07:02 prashanthpai

Yeah, been there ;-) Either way the docs need reformatting, so it's inevitable.

The only thing I fear is that the docs may already be included in some other projects/packages that rely heavily on the current setup.

But switching to reStructuredText would allow easy export to multiple formats like man pages, PDF and so on.

nvtkaszpir avatar Feb 22 '17 07:02 nvtkaszpir

Alright - so what are the next steps here? Is reconciling the documents and presenting a single glossary the decision?

sankarshanmukhopadhyay avatar Aug 08 '18 02:08 sankarshanmukhopadhyay