Add utils module with splitnames() and parsename() functions

Open bcbnz opened this issue 9 years ago • 7 comments

Firstly, this pull request adds a utils.splitnames() function to break apart a string "Donald E. Knuth and Leslie Lamport" into a list of individual names ["Donald E. Knuth", "Leslie Lamport"]. This obeys BibTeX brace level rules so "{Simon and Schuster}" results in an output of ["{Simon and Schuster}"]. It also handles a leading or trailing 'and' in the same way as BibTeX, treating it as part of the name so that "John Smith and Phil Holden and" becomes ["John Smith", "Phil Holden and"]. There are a number of test cases for this function.
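
As a rough illustration of the brace-level rule described above, here is a minimal sketch. It is not the code from this pull request: the function name `split_names_sketch` is made up for the example, and it assumes that a separator is the word "and" (in any case) at brace depth zero with whitespace on both sides.

```python
def split_names_sketch(names: str) -> list[str]:
    """Split a BibTeX-style name list on top-level 'and' (illustrative only)."""
    parts = []
    depth = 0  # current brace nesting level
    start = 0  # index where the current name begins
    i = 0
    n = len(names)
    while i < n:
        ch = names[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth = max(depth - 1, 0)
        elif (
            depth == 0
            and ch.isspace()
            and names[i + 1:i + 4].lower() == "and"
            and i + 4 < n
            and names[i + 4].isspace()
        ):
            # Found whitespace + 'and' + whitespace outside any braces.
            parts.append(names[start:i].strip())
            start = i + 5
            i = start
            continue
        i += 1
    parts.append(names[start:].strip())
    # Assumption: empty pieces (e.g. from an empty input) are dropped.
    return [p for p in parts if p]


print(split_names_sketch("Donald E. Knuth and Leslie Lamport"))
print(split_names_sketch("{Simon and Schuster}"))
print(split_names_sketch("John Smith and Phil Holden and"))
```

Because a leading or trailing "and" lacks whitespace on one side, the sketch leaves it attached to the neighbouring name, which matches the three examples given above.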

Secondly, the customization.splitname() function I previously submitted is renamed to utils.parsename(). I feel this is better suited to being in the utils module. The name change better reflects what the function does, and also avoids the confusion of having splitname and splitnames functions.

Finally, two scripts intended for developer use are added to a new top-level misc folder. The misc/splitnames.py script shows how BibTeX splits a given string of names, and the misc/parsename.py script shows how it parses a given name. They take input from the user (either as command-line arguments, or interactively if run with no arguments), then run BibTeX with a custom style file to generate the output. I used these to generate test cases for the two corresponding functions, and they may be useful for the same purpose in the future. I arbitrarily named the folder misc; if you want them in a different location, let me know.

bcbnz avatar Jun 10 '16 03:06 bcbnz

Coverage Status

Coverage increased (+0.06%) to 95.787% when pulling 6ca2e430acae0ca2976369c1d737fb67d8d8e749 on bcbnz:utils into 57fe207c221745c4a53f370928a8f038de8360b9 on sciunto-org:master.

coveralls avatar Jun 10 '16 03:06 coveralls

Coverage Status

Coverage increased (+0.06%) to 95.787% when pulling 9fbe12051db42b4acc7236479972e0271825c5f0 on bcbnz:utils into 57fe207c221745c4a53f370928a8f038de8360b9 on sciunto-org:master.

coveralls avatar Jun 10 '16 04:06 coveralls

Thanks! I'll try to read it and comment asap :)

sciunto avatar Jun 15 '16 19:06 sciunto

I think it would be better to rename misc to something like "testcase_generator". In that directory, I would add a README.rst file to explain why this folder exists (just what you said above).

Just out of curiosity, have you considered using pyparsing for some parts of your functions? It makes things more compact and generally less buggy. Nevertheless, I think this function is far enough from the lib core, so it's OK for me.

Would you like to cover this with a few lines in the documentation? Also, how would you like this to fit together with getnames() and author()? I would prefer to deal with them here as well. I'm not totally sure about creating utils.py. I think that all actions on bibtex field contents must be viewed as a customization. My view on this is basically to keep two blobs: a core to parse the general structure of a bibtex file, and a shell to modify the content and its structure (string vs list). This way, I believe it's easy to explain to users.

Apart from these little comments, that's a great and valuable contribution. Thanks!

sciunto avatar Jun 19 '16 15:06 sciunto

@bcbnz any news on this (and on https://github.com/sciunto-org/python-bibtexparser/issues/138)? It would be awesome to get this merged; it is one of the remaining issues to sort out before the next release.

Thanks!

Phyks avatar Jul 08 '17 12:07 Phyks

@Phyks Feel free to update the branch if you would like. It's been quite a long time since this PR was last updated.

sciunto avatar Jul 13 '17 11:07 sciunto

Ok, I'll try to have a look as soon as possible :)

Phyks avatar Jul 13 '17 12:07 Phyks

Handling of names is significantly simplified in v2.

This v1 PR is stale and thus closed.

MiWeiss avatar May 26 '23 14:05 MiWeiss