
Paper only in PDF?

Open iherman opened this issue 10 years ago • 19 comments

I have found the (only) publication of the journal: Interaction between cognitive and motor… but, to my great disappointment, the paper itself, i.e., the narrative content, is only available in PDF. Based on the URL of the PDF file, I also got to the article's repository, but that repository only contained a PDF as the finished text. I would have expected to find, through github.io, a version in HTML, properly readable in a mobile environment, possibly linking directly into, say, IPython notebooks or whatever else. This journal has the potential to show all the power of the Web for publishing...

iherman avatar Sep 07 '15 15:09 iherman

You're right but it is not our top priority for the time being (lack of human resources). However, this initiative that @ThomasA pointed to might be interesting: https://github.com/PeerJ/paper-now

rougier avatar Sep 07 '15 15:09 rougier

On 07 Sep 2015, at 17:14 , Nicolas P. Rougier [email protected] wrote:

You're right but it is not our top priority for the time being (lack of human resources). However, this initiative that @ThomasA pointed to might be interesting: https://github.com/PeerJ/paper-now

Indeed (I didn't know about that one either).

And I of course understand the resource issues… As long as you keep this issue open, I am of course perfectly fine. Eventually...

Thanks

iherman avatar Sep 07 '15 15:09 iherman

I tried to dig into this a little and, unsurprisingly, there is no quick win. The goal is to transform the *.md into an HTML fragment that could then be included directly in the website. I didn't find any robust, non-buggy way to go from the .tex file to .html. It seems that the answer will come from a standard .md -> .html transformation with some plugins (for cross references & bibliography) and a nice .css file. PeerJ/paper-now could be a good inspiration, but there is some work ahead anyway.

I'm +1'ing this though. It seems quite important to me.
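The .md -> .html route described above could be sketched roughly as follows. This is only an illustration of how such a pandoc call might be assembled, not a tested ReScience workflow: the file names, the `article.css` stylesheet, and the helper `pandoc_html_command` are all hypothetical, and it assumes that pandoc and the pandoc-crossref filter are installed. The function only builds the command line; it does not run it.

```python
import shlex


def pandoc_html_command(source, output, css="article.css",
                        bibliography=None, fragment=False):
    """Build (but do not run) a pandoc command line for Markdown -> HTML.

    All file names are placeholders chosen for this sketch.
    """
    cmd = ["pandoc", source, "--from", "markdown", "--to", "html5", "--mathjax"]
    if not fragment:
        # A stylesheet link only makes sense in a complete HTML page;
        # without --standalone pandoc emits a bare fragment.
        cmd += ["--standalone", "--css", css]
    # Cross references are resolved by the external pandoc-crossref filter,
    # which is meant to run before citation processing.
    cmd += ["--filter", "pandoc-crossref"]
    if bibliography:
        # --citeproc is the spelling used by recent pandoc releases;
        # older ones used "--filter pandoc-citeproc" instead.
        cmd += ["--citeproc", "--bibliography", bibliography]
    cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    cmd = pandoc_html_command("article.md", "article.html",
                              bibliography="article.bib")
    print(shlex.join(cmd))
```

Passing `fragment=True` drops `--standalone` and the stylesheet, which is closer to the "HTML fragment to include in the website" goal; the styling would then have to come from the website itself.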

antoine-lizee avatar Sep 16 '15 22:09 antoine-lizee

I did some experiments using rst and docutils, see:

http://www.labri.fr/perso/nrougier/coding/article/article.html http://www.labri.fr/perso/nrougier/coding/article/article.rst.html

but it only works for the reStructuredText format and cannot produce the PDF.

What about pandoc itself? Is the HTML it produces not good enough?

rougier avatar Sep 17 '15 05:09 rougier

Well, it doesn't include any styling, so you end up with basic markup. It supports neither the cross references nor the equations. I'm sure there are a lot of solutions, but pandoc out of the box is not going to cut it. rmarkdown could be an inspiration. We could even use that to do the conversion?

antoine-lizee avatar Sep 17 '15 17:09 antoine-lizee

Switching to rmarkdown would require authors to have it installed and I don't know to what extent people are ready to do that (I know we're already imposing the pandoc framework on them). Do you have any experience with rmarkdown? Does it require RStudio or can we have a lighter installation?

rougier avatar Sep 17 '15 17:09 rougier

I don't think it's really heavier than installing Haskell & cabal which is the current proposition. You do not need RStudio to compile the document, just R and the package rmarkdown. I just typed the following in my R console:

rmarkdown::render("your_article_name.md")

And it gave me a decent HTML from scratch. We would need to create our own template, but it seems slightly easier to start from there.

antoine-lizee avatar Sep 17 '15 18:09 antoine-lizee

just R and the package rmarkdown.

It still needs a pandoc binary as clearly stated in the rmarkdown sources.

eddelbuettel avatar Sep 19 '15 00:09 eddelbuettel

The one real advantage of rmarkdown as a format is that thanks to pandoc you get any one of

html   or    pdf (via latex, of course)    or    doc (which we can IMHO ignore)

In practice one tends to pick either pdf or html early and add specialisation. Lastly, I am firmly behind the use of (neatly typeset) pdf. To me, html rendering depends way too much on the browser used.

eddelbuettel avatar Sep 19 '15 00:09 eddelbuettel

Yes, which is conveniently shipped with the package if I remember correctly - in any case, lighter than installing Haskell (several gigs and minutes of install). But again, that might not be the right solution - any other idea?

antoine-lizee avatar Sep 19 '15 00:09 antoine-lizee

Yes, exactly: it provides a nice all-in-one, including rendering of equations, and I remember having the cross referencing working at some point too.

antoine-lizee avatar Sep 19 '15 00:09 antoine-lizee

@antoine-lizee: I am sorry, but you are still confused.

RStudio (either the server or desktop) happens to install a statically built pandoc binary. So if you have RStudio, you are good.

However, RStudio, as authors of CRAN packages, are way too cognizant of these dependencies for that, and the rmarkdown package does not depend on RStudio. It simply states that you need a pandoc binary in the path, which can be a pain on some systems. In practice, Debian/Ubuntu, brew, ... all make it possible.

To sum up, and use a concrete example, I use rmarkdown for all my beamer presentations, and while I have RStudio installed the rendering still uses the /usr/bin/pandoc binary from my distro.
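For anyone unsure which pandoc binary their setup would pick up, a small check along these lines may help. It is a sketch using only the Python standard library; the helper names `find_pandoc` and `pandoc_version` are made up for this example, and rmarkdown's own lookup logic may differ from a plain PATH search.

```python
import shutil
import subprocess


def find_pandoc(name="pandoc"):
    """Return the full path of the first matching executable on PATH, or None."""
    return shutil.which(name)


def pandoc_version(path):
    """Return the first line printed by `<path> --version`."""
    result = subprocess.run([path, "--version"],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()[0]


if __name__ == "__main__":
    path = find_pandoc()
    if path is None:
        print("no pandoc binary found on PATH")
    else:
        print(path, "->", pandoc_version(path))
```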

eddelbuettel avatar Sep 19 '15 00:09 eddelbuettel

Yep - I didn't catch that the dependency was installed by RStudio. But my point was that even downloading the pandoc binary is way easier than installing Haskell, which is currently needed for the pandoc-crossref filter only. Sorry if I haven't been clear: I'm just looking for the most user-friendly way to compile the .md into HTML and PDF at the same time. My personal experience suggested rmarkdown, but there might be more straightforward ways...

antoine-lizee avatar Sep 19 '15 00:09 antoine-lizee

Yup, no worries. Now, for the matter at hand, and while I am one of R fanboys here, I think the existing workflow is good. rmarkdown is a little too far out there for non-R users methinks...

eddelbuettel avatar Sep 19 '15 00:09 eddelbuettel

Hello, in the context of this issue I wanted to point you to PythonTeX, as I think it fits the purpose of this journal pretty well. To summarize briefly, it is implemented as a LaTeX package and allows you to include Python code (Ruby, Octave and Julia are also supported, R unfortunately not (yet)), execute that code, and access its output. The linked article motivates the features and obviously gives a much better overview. I realize that this is not a top priority and that the human resources issue probably did not magically disappear, but I would be keen to hear your opinion about it.

awakenting avatar Aug 16 '16 17:08 awakenting

Thanks for the reference. If I understand correctly, it allows embedding the output of a script directly into the PDF document, is that right? Do you have any experience with it?

However, this does not solve the HTML output "problem", but I suspect pandoc might be the solution. Another possibility could be to use Markdeep (https://casual-effects.com/markdeep/), which seems to be quite powerful, but I don't know if it exports to PDF and we might want to avoid "exotic" markup languages.

rougier avatar Aug 16 '16 19:08 rougier

Yes, it allows embedding the output directly. The 'Step-by-Step solution' section in this demo pdf illustrates it quite nicely. I have no experience with it yet, because I only just found it while searching for a way to automatically include all figures from a given directory (which is straightforward with it).

Yes, there is a depythontex utility that replaces the Python code with its output and gives you a plain LaTeX document that you can convert with pandoc.

Markdeep looks interesting, indeed!

awakenting avatar Aug 16 '16 20:08 awakenting

I played with Markdeep a while ago. It's quite powerful, and easy to deploy as all there is to copy around is a single HTML file with embedded JavaScript and the Markdown contents. However, for that same reason Markdeep is Web-only; I see no way to get PDF or whatever else from it.

khinsen avatar Aug 17 '16 06:08 khinsen

After a quick look at PythonTeX, I'd say it works much like a bunch of similar tools, such as Emacs/org-mode or Lepton. A single combined input contains text and code sections, from which the code is extracted and run, with the output being integrated into the final document.
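To make that workflow concrete, here is a toy version of the extract-run-integrate cycle. It is not how PythonTeX, org-mode or Lepton are implemented: the `<<code>>` / `<<end>>` markers and the `weave` function are invented for this sketch, and it simply runs the code with `exec`, so it should only be fed trusted input.

```python
import contextlib
import io
import re

# Hypothetical section markers for this sketch; the real tools use their own syntax.
CODE_SECTION = re.compile(r"<<code>>\n(.*?)<<end>>\n?", re.DOTALL)


def weave(document):
    """Run each code section and replace it with whatever it printed."""
    namespace = {}  # shared, so later sections can use earlier definitions

    def run(match):
        buffer = io.StringIO()
        # Capture everything the section prints to standard output.
        with contextlib.redirect_stdout(buffer):
            exec(match.group(1), namespace)
        return buffer.getvalue()

    return CODE_SECTION.sub(run, document)


if __name__ == "__main__":
    source = (
        "The sum of the first ten integers is:\n"
        "<<code>>\n"
        "total = sum(range(1, 11))\n"
        "print(total)\n"
        "<<end>>\n"
        "and twice that is:\n"
        "<<code>>\n"
        "print(2 * total)\n"
        "<<end>>\n"
    )
    print(weave(source), end="")
```

Even this toy version shows why reproducing the final document requires a working interpreter with the right packages on the reader's machine.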

All these tools are fine for personal use, but they all suffer from the same problem in the context of publication and reviewing: they depend on a monstrous number of dependencies and require tricky platform-dependent configuration. The name PythonTeX says it all: you must install Python and TeX, which is already a non-trivial task, and then install add-ons on both sides plus edit some configuration files to make it all work together. I wouldn't mind that effort for getting a nice tool for preparing my own publication, but as a reviewer, I expect a much lower overhead.

khinsen avatar Aug 17 '16 14:08 khinsen