Loading bibtex files for citations

Open will-hart opened this issue 7 years ago • 8 comments

Thanks for #27 / #64 💯

However, I'd love to be able to load bibtex files so that it fits my standard workflow. I've discovered that I can read a bibtex file into a JavaScript variable like so:

mixin loadBibliography(path)
  - var tex = fs.readFileSync(path)
  script.
    var bib = '!{tex}'
    // or window.bibtex = ...

+loadBibliography("./bibliography.txt")

However I don't believe there is currently a way to pass it through to the bibliography generator. (Are there plans to add support for this? via plugins?)

A slightly dodgy way I suppose would be to save the bibtex to window.bibtex in our pug, and then I think we should be able to check for that variable in the generator. I'm not very familiar with puppeteer but I believe the generator could do something like

const bibtex = await page.evaluate(() => bibtex);
const data = new Cite(bibtex);

Thoughts?

will-hart avatar May 30 '18 14:05 will-hart

I've got a "demo" working here https://github.com/will-hart/ReLaXed/commit/ce4588d3415d3b0d401ad536eff8789ffbf7ca17. It does the job, but I think it has slowed down compilation by a couple of seconds while the bibtex is parsed.

I've also added a "multiCite" command here https://github.com/will-hart/ReLaXed/commit/6d09c3b3a96e464ab6c480372733b5639de59c32

e.g.

  +multiCite('Q21972835', 'Zhang2009')

produces

[image: rendered output]

will-hart avatar May 30 '18 22:05 will-hart

The plan is to add a way to load a local bibliography library. I have not added one yet because I have not tested how citation-js loads one; I developed the module under the (probably false) assumption that citation-js would output the entire bibtex file as the bibliography, regardless of whether or not an entry was cited. As for loading the bibliography through the pug and then finding it in the page: it would be better to have a normal file extension to look for, which the bibliography module then loads automatically.

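// Assumes fs, path and Cite (citation-js) are already required in this module
var bibtex
const bibFile = path.join(masterPath, 'bibliography.txt')
if (fs.existsSync(bibFile)) {
    bibtex = fs.readFileSync(bibFile, 'utf8')
}
// Fall back to an empty Cite when no library file was found
const data = bibtex ? new Cite(bibtex) : new Cite()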

The multicite is a nice touch, I had not thought of that before.

As I am working on the plugin system, loading a library would be done when activating the bibliography.

{
  "plugins": {
    "bibliography": {
      "enabled": true,
      "library": "path/to/file.txt"
      ...
     }
  }
}

Or via comments

//- use-plugin: bibliography library='path/to/file.txt'

Drew-S avatar May 30 '18 22:05 Drew-S

Oh yeah, oops I was trying so hard to include the bibtex into the Pug template I didn't even think of just reading in a file :) I'll give that a go.

Would you accept a PR for one or both changes? Separate PRs?

EDIT: Also, yes citation.js does appear to put everything in the loaded file into the bibliography, even if there is no reference within the text.

will-hart avatar May 30 '18 23:05 will-hart

That is how I understood citation-js to work. I do not think there is much demand for including an entire library, though that could easily be implemented. For me, and I believe for most people, the goal is to reference a global library containing all the material we have referenced.

Since I use Zotero to automate grabbing data and holding it in a global library, I want to reference that library every time, but only include the entries I actually cited. So a custom solution to filter out material that is not cited is needed. I would probably accept the multicite one as is, after I have checked the tests. As for the bibtex, that should not be loaded using pug, but through the bibliography module directly. It is up to you if you want to do separate PRs; you may want @Zulko to look over it as well.
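The filtering step mentioned above could look roughly like the sketch below. It is only an illustration: `filterCited` is a hypothetical helper, not part of ReLaXed or citation-js, and it assumes the library has already been parsed into an array of CSL-JSON-like objects that carry an `id` field.

```javascript
// Hypothetical helper: keep only the library entries that were actually cited.
// Assumption: `library` is an array of CSL-JSON-like objects with an `id` field.
function filterCited(library, citedKeys) {
  const wanted = new Set(citedKeys);
  return library.filter((entry) => wanted.has(entry.id));
}

// Made-up sample data, for illustration only.
const library = [
  { id: 'Zhang2009', title: 'First paper' },
  { id: 'Smith2012', title: 'Second paper' },
  { id: 'Lee2015', title: 'Third paper' },
];

// The result follows the order of the library, not the order of citation.
const used = filterCited(library, ['Lee2015', 'Zhang2009']);
console.log(used.map((e) => e.id)); // [ 'Zhang2009', 'Lee2015' ]
```

Keys that are cited but missing from the library are silently dropped here; a real implementation would probably want to warn about them.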

Drew-S avatar May 31 '18 01:05 Drew-S

Lots to read, I'll try to follow :) Some questions and remarks:

  • @will-hart when you say it takes a couple of seconds, how big is your bibliography file?

  • Do I understand correctly that all references in the file will appear in the bibliography? Like, will you get 500 references printed even if only 2 are used in the main text? That is a deal-breaker for me.

  • I don't want to reinvent bibtex, but the system I see for extracting bibliographies from a big file all_references.bib is as follows:

      1. Extract all the references actually used from all_references.bib and store them in a local file .bibitems, which will be the one actually used.
      2. Every time all_references.bib is modified, regenerate .bibitems.
      3. Every time the page is rendered, list all references used. If some references are not in .bibitems (meaning the user has added some references to the document), regenerate .bibitems.
      4. As we said before, filter out unused references before printing the bibliography.

  • @Drew-S In your plugin system, the plugins should not be an object but rather a list of elements {name: bibliography, options: {library: ...}}. I am not sure what the use of "enabled" is. It is important that it is a list and not an object, because a list allows loading different plugins with different options, even when they share the same umbrella name.
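The regeneration rule for .bibitems described in the numbered steps above can be sketched as a single decision function. Everything here is an assumption for illustration: `needsRegeneration` and its parameter names are hypothetical, and in practice the modification times would come from something like Node's `fs.statSync(file).mtimeMs`.

```javascript
// Hypothetical sketch of the ".bibitems" regeneration rule.
// All names are illustrative; none of them exist in ReLaXed.
function needsRegeneration({ libraryMtimeMs, cacheMtimeMs, cachedKeys, citedKeys }) {
  // No cache file yet: it has to be built.
  if (cacheMtimeMs === null) return true;
  // The big library was modified after the cache was written.
  if (libraryMtimeMs > cacheMtimeMs) return true;
  // The document cites a key that the cache does not contain.
  const cached = new Set(cachedKeys);
  return citedKeys.some((key) => !cached.has(key));
}

// Cache is newer than the library, but the document cites a new key.
const stale = needsRegeneration({
  libraryMtimeMs: 1000,
  cacheMtimeMs: 2000,
  cachedKeys: ['Zhang2009'],
  citedKeys: ['Zhang2009', 'Q21972835'],
});
console.log(stale); // true
```

Keeping the check free of file access makes it easy to test; the caller is left to read the timestamps and the key lists.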

@will-hart you can propose a PR, but it will have to be checked by both @Drew-S and me as I'm not 100% up to date with that part (I haven't tried it myself yet)

Zulko avatar May 31 '18 07:05 Zulko

It takes about 3 seconds extra to build a bibliography with only a couple of test references in the file. I'll try it later with my 900-reference monster list to see if that makes it a lot worse :) Also, the increase in build time seems to occur only if you include the +bibliography mixin; otherwise build times are unchanged.

Citation.js definitely includes everything in the bibfile in the exported bibliography, even if it wasn't cited. Another approach would be to use a separate Cite object to convert the file, then just copy references from it as we encounter them in the text to the main Cite object. I might have a quick look at how hard this might be to do.

I'll put in a PR for the multiCite tag, but hold off on the bibtex file stuff for now.

will-hart avatar May 31 '18 09:05 will-hart

I've noticed that even with the current implementation, multiple +cite calls to the same reference result in multiple bibliography entries. I think we'll need to check if a citation key has already been added to prevent this.
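A minimal sketch of that check, assuming citation keys are plain strings: a small registry records each key the first time it is cited and returns the same position on every later call. `CitationRegistry` is a hypothetical name and is not taken from the ReLaXed code or the PRs mentioned in this thread.

```javascript
// Hypothetical registry that records each citation key only once.
class CitationRegistry {
  constructor() {
    this.order = new Map(); // key -> 1-based position of first citation
  }

  // Returns the position of the key, registering it on first use.
  cite(key) {
    if (!this.order.has(key)) {
      this.order.set(key, this.order.size + 1);
    }
    return this.order.get(key);
  }

  // Unique keys in order of first citation.
  keys() {
    return [...this.order.keys()];
  }
}

const registry = new CitationRegistry();
registry.cite('Q21972835');
registry.cite('Zhang2009');
registry.cite('Q21972835'); // repeated key, no new entry
console.log(registry.keys()); // [ 'Q21972835', 'Zhang2009' ]
```

Only keys that are new to the registry would then need to be passed on to the Cite object.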

My tasks

  • [x] PR for +multicite
  • [x] PR for caching citations to avoid repetition in the bibliography
  • [ ] Load bibtex references into the Cite object on demand
  • [ ] Performance tests with large bibtex files
  • [ ] PR for bibtex file loading

will-hart avatar Jun 01 '18 01:06 will-hart

Huh, in all my work I never did try citing the same reference multiple times; I only tested two unique ones. You talk about having two Cite() objects, one for duplicates and one for unique ones. I am not fond of this method: when you add a new unique +cite, each object would make a separate call to fetch the info from online, and that should not happen for network calls. But for local libraries there needs to be a separate Cite() to read the data properly. Just something to take note of.

Drew-S avatar Jun 01 '18 20:06 Drew-S