extract headings to db

Open superiums opened this issue 2 years ago • 6 comments

How about extracting headings (marked with # through ######) into the db? Headings are the main structure of a markdown file. This would help structure the file and make a local knowledge db available.

superiums avatar Dec 04 '23 05:12 superiums

@superiums great suggestion. Do you have a specific structure you want e.g. do you want line numbers etc?

rufuspollock avatar Dec 08 '23 08:12 rufuspollock

Line numbers usually aren't the first concern, e.g.:

# medicine Axxx

> this is for catch cold.

this medicine contains following element:
- element A
- element B
- element xxx

## description: 3times per day.

other text xxxxxxxxxxxxxxx


# medicine Bxxx

> this is for catch hot.

this medicine contains following element:
- element yyy
- element uuu
- element xxx

## description: not eat meat.

other text oooooooooooooooooooooooooooooooooo

yet an other text oooooooooooooooooooooooooooo

...

The document could be serialized to sqlite via the following fields:

  • heading 1 (the medicine name line here)
  • heading 2 (the description line here)
  • quoting (the usage here)
  • listing (the list of elements here)
  • other text (the text after the last heading)

Which fields to serialize could be customized via command-line arguments. Line numbers don't seem that important: if a user needs the position of a heading, a SQL query with 'LIMIT 1 OFFSET 5' may work.
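To make the proposed record shape concrete, here is a minimal sketch of splitting a document like the one above into one record per level-1 heading. It is not part of MarkdownDB: the `SectionRecord` interface and the `splitByTopHeading` function are hypothetical names, and the line-based matching is a simplification (a real implementation would more likely walk the parsed AST). In this sketch all remaining text goes into `content`, not only the text after the last heading.

```typescript
// Hypothetical record shape; field names follow the list above, not MarkdownDB's schema.
interface SectionRecord {
  heading1: string;
  heading2: string | null;
  quote: string[];
  list: string[];
  content: string[];
}

// Splits markdown text into one record per level-1 heading (line-based, simplified).
function splitByTopHeading(markdown: string): SectionRecord[] {
  const records: SectionRecord[] = [];
  let current: SectionRecord | null = null;

  for (const rawLine of markdown.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;

    const h1 = /^#\s+(.*)$/.exec(line);
    if (h1) {
      // A level-1 heading starts a new record.
      current = { heading1: h1[1], heading2: null, quote: [], list: [], content: [] };
      records.push(current);
      continue;
    }
    if (!current) continue; // ignore text before the first level-1 heading

    const h2 = /^##\s+(.*)$/.exec(line);
    const quote = /^>\s?(.*)$/.exec(line);
    const item = /^[-*+]\s+(.*)$/.exec(line);

    if (h2) {
      // Keep only the first level-2 heading of each record.
      if (current.heading2 === null) current.heading2 = h2[1];
    } else if (quote) {
      current.quote.push(quote[1]);
    } else if (item) {
      current.list.push(item[1]);
    } else {
      current.content.push(line);
    }
  }
  return records;
}
```

Each returned record could then be written as one row, with the array fields joined into text columns.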

superiums avatar Dec 08 '23 09:12 superiums

This is efficient when the user needs to search for something in a specific place. In this example, after serialization, the user could easily search the records via:

  • heading 1
  • heading 2
  • quoting
  • listing

or something else. For common usage, any markdown file with this kind of markup could be handled like this. The main elements are:

  • headings
  • quote
  • list
  • links (both markdown links [xxx](url) and wiki links [[filename#heading]])
  • tags (e.g. #tagA #tagB)
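As an illustration of how the last two items could be picked out of plain text, here is a rough regex-based sketch. The function name `extractInlineRefs` and the `InlineRefs` shape are hypothetical, and the patterns are simplified assumptions: they ignore code spans, escaped brackets, and nested brackets, which a real markdown parser would handle.

```typescript
// Hypothetical result shape for this sketch.
interface InlineRefs {
  markdownLinks: { text: string; url: string }[];
  wikiLinks: { file: string; heading: string | null }[];
  tags: string[];
}

// Collects every match of a pattern using a fresh global copy of the regex.
function allMatches(pattern: RegExp, text: string): RegExpExecArray[] {
  const re = new RegExp(pattern.source, "g");
  const out: RegExpExecArray[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) out.push(m);
  return out;
}

function extractInlineRefs(text: string): InlineRefs {
  // [text](url)
  const markdownLinks = allMatches(/\[([^\[\]]+)\]\(([^()\s]+)\)/, text)
    .map((m) => ({ text: m[1], url: m[2] }));
  // [[filename]] or [[filename#heading]]
  const wikiLinks = allMatches(/\[\[([^\[\]#|]+)(?:#([^\[\]|]+))?\]\]/, text)
    .map((m) => ({ file: m[1], heading: m[2] ?? null }));
  // #tag at the start of the text or after whitespace; "# Heading" is not a tag
  const tags = allMatches(/(?:^|\s)#([A-Za-z][\w-]*)/, text).map((m) => m[1]);
  return { markdownLinks, wikiLinks, tags };
}
```

Requiring a letter directly after `#` is what keeps heading lines from being picked up as tags.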

superiums avatar Dec 08 '23 09:12 superiums

Thinking about it more, I found that only serializing headings is necessary (at most adding a quote as the description). Everything else could go into a content field.

Maybe like this:

markdowndb --extract-heading 3

means adding these fields to the db: heading1 heading2 heading3 content

markdowndb --extract-heading 3 --extract-description 2

means: heading1 heading2 heading3 description1 description2 content

The description is the quote line that follows the heading line.
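A small sketch of how the two proposed options could map to table columns. The `--extract-heading` and `--extract-description` flags are only the proposal from this comment, not existing MarkdownDB options, and `buildColumns` and `createTableSql` are hypothetical helper names. The rule that there cannot be more description levels than heading levels is my own assumption.

```typescript
// Maps the proposed option values to the column list described above.
function buildColumns(headingDepth: number, descriptionDepth: number = 0): string[] {
  if (!Number.isInteger(headingDepth) || headingDepth < 1 || headingDepth > 6) {
    throw new RangeError("headingDepth must be an integer from 1 to 6");
  }
  // Assumption: a description belongs to a heading, so it cannot go deeper.
  if (!Number.isInteger(descriptionDepth) || descriptionDepth < 0 || descriptionDepth > headingDepth) {
    throw new RangeError("descriptionDepth must be an integer from 0 to headingDepth");
  }
  const columns: string[] = [];
  for (let i = 1; i <= headingDepth; i++) columns.push(`heading${i}`);
  for (let i = 1; i <= descriptionDepth; i++) columns.push(`description${i}`);
  columns.push("content");
  return columns;
}

// Builds a SQLite CREATE TABLE statement with one TEXT column per field.
function createTableSql(table: string, columns: string[]): string {
  const defs = columns.map((c) => `"${c}" TEXT`).join(", ");
  return `CREATE TABLE IF NOT EXISTS "${table}" (${defs});`;
}
```

With these helpers, `buildColumns(3, 2)` yields the six fields listed for the second command.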

superiums avatar Dec 09 '23 01:12 superiums

@superiums This can be done using Markdowndb's new computed fields feature.

For example, the following:

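// Assumes unist-util-visit v3+, where visit and EXIT are named exports
import { visit, EXIT } from "unist-util-visit";
import type { Heading } from "mdast";

// Computed field: stores the first heading node of the file on the file object
function getFirstHeading(fileObject: any, ast: any) {
  let firstHeading: Heading | null = null;

  // Use unist-util-visit to traverse the AST and find the first heading
  visit(ast, "heading", (node: Heading) => {
    firstHeading = node;
    // Stop traversing as soon as the first heading is found
    return EXIT;
  });

  // Store the first heading node (or null if there is none) on the file object
  fileObject.header = firstHeading;
}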
  
client.indexFolder("PATH", {computedFields: [getFirstHeading]})

You can add additional fields (or functions) to compute the first n headings and the first n descriptions as needed.
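As a rough sketch of what such an additional function might look like, the following builds a computed-field function that records the text of the first heading at each level up to a chosen depth. The `(fileObject, ast)` signature follows the example above; the `AstNode` type, `nodeText`, and `makeFirstHeadings` are hypothetical names, and the node shape is a minimal mdast-like assumption declared locally so the sketch has no dependencies.

```typescript
// Minimal mdast-like node shape (an assumption for this sketch).
interface AstNode {
  type: string;
  depth?: number;
  value?: string;
  children?: AstNode[];
}

// Concatenates the text of a node and all of its descendants.
function nodeText(node: AstNode): string {
  if (typeof node.value === "string") return node.value;
  return (node.children ?? []).map(nodeText).join("");
}

// Returns a computed-field function that sets heading1..headingN on the file object,
// using the first heading found at each level (document order).
function makeFirstHeadings(maxDepth: number) {
  return function firstHeadings(fileObject: Record<string, unknown>, ast: AstNode): void {
    const walk = (node: AstNode): void => {
      if (node.type === "heading" && node.depth !== undefined && node.depth <= maxDepth) {
        const key = `heading${node.depth}`;
        if (fileObject[key] === undefined) fileObject[key] = nodeText(node);
      }
      for (const child of node.children ?? []) walk(child);
    };
    walk(ast);
  };
}
```

Under these assumptions, `makeFirstHeadings(3)` could be listed in `computedFields` next to `getFirstHeading`.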

mohamedsalem401 avatar Dec 11 '23 21:12 mohamedsalem401

@mohamedsalem401 this would be perfect for a small blogpost.

rufuspollock avatar May 03 '24 10:05 rufuspollock