Allow plain text `.qmd` files as source notebooks
Addresses #1461
Using Quarto and its VS Code extension, I find writing .qmd files to be a smoother interactive alternative to .ipynb files. Because .qmd files are plain text, they come with several advantages:
- `.qmd` seamlessly integrates with Cursor AI/other AI copilots.
- `.qmd` is fully compatible with standard git tooling.
- `.qmd` works better with VIM keybindings.
- `.qmd` files don't need a special `nbdev_clean` step to remove cell metadata and outputs, meaning your source files are not altered in any way by nbdev's transpilation process (something that bothers me immensely when developing in `.ipynb`).
Turns out, nbdev doesn't need many changes to implement this feature.
- Allow export globbing functions to search for `.qmd` in addition to `.ipynb`.
- Implement `read_qmd`/`write_qmd` functions for converting the `.qmd` to/from nbdev's `AttrDict` format. This means two-way sync (via `nbdev_update`) also works for `.qmd` and its corresponding `.py` files.
- Because outputs are not stored inside `.qmd` files, I use `execnb`'s `run_all` to generate outputs for the docs inside `_proc/`-cached `.ipynb` files.
- The custom frontmatter parser needed some tweaking to allow cells to include general markdown after the custom frontmatter.
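To make the conversion step concrete, here is a minimal sketch of how `.qmd` text could be split into notebook-style cells. It is not the `read_qmd` implementation from this PR: the helper name `split_qmd_cells`, the plain-dict cell shape, and the simple fence regex are assumptions for illustration only, and the sketch ignores frontmatter, nested fences, and cell options.

```python
import re

# Hypothetical helper for illustration; NOT the PR's read_qmd.
# Assumes executable cells open with a line like ```{python} and close with ```.
FENCE_OPEN = re.compile(r"^```\{(\w+)\}\s*$")

def split_qmd_cells(text):
    """Split .qmd text into a list of {'cell_type', 'source'} dicts."""
    cells, buf, in_code = [], [], False

    def flush(cell_type):
        # Emit the buffered lines as one cell, skipping empty cells.
        src = "\n".join(buf).strip("\n")
        if src:
            cells.append({"cell_type": cell_type, "source": src})
        buf.clear()

    for line in text.splitlines():
        if not in_code and FENCE_OPEN.match(line):
            flush("markdown")   # everything before the fence is markdown
            in_code = True
        elif in_code and line.strip() == "```":
            flush("code")       # closing fence ends the code cell
            in_code = False
        else:
            buf.append(line)
    flush("code" if in_code else "markdown")
    return cells

sample = "# Title\n\nSome text\n\n```{python}\n#| export\nx = 1\n```\n\nMore text\n"
for cell in split_qmd_cells(sample):
    print(cell["cell_type"], repr(cell["source"]))
```

A real converter would also need to carry the kernel language and cell metadata, which is where most of the edge cases live.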
It looks like there have been other attempts to add .qmd support to nbdev (see this quarto issue) or to allow plain-text sources (see #1499). However, .qmd support is still missing from the current version of nbdev, and the latter seems to introduce jupytext as an additional dependency, which uses the slow quarto convert command to pair a .ipynb and a .qmd (this PR introduces a faster .qmd <-> .ipynb parser). With this PR you can develop using a mix of .qmd and .ipynb, whichever you prefer, with no additional dependencies.
I've written up a small tutorial on setting good VS Code defaults in nbs/tutorials/develop_in_plain_text.qmd.
A few notes of caution and room for improvement:
- Ensure that all files under `nbs/` have distinct names: no `00_core.ipynb` and `00_core.qmd`, as both of these will create the intermediate `_proc/00_core.ipynb`.
- Currently, `nbdev_prepare` will run executable cells in `.qmd` documents twice: once when testing and, because outputs aren't saved, once when generating the docs.
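The first caveat can be checked before running nbdev. Below is a minimal sketch, not part of this PR, that looks for `.ipynb`/`.qmd` pairs sharing a path stem. The function name `find_stem_collisions` is hypothetical, and skipping path parts that start with `_` or `.` is an assumption about which files should be ignored.

```python
from collections import defaultdict
from pathlib import Path

def find_stem_collisions(nbs_dir, suffixes=(".ipynb", ".qmd")):
    """Return {relative stem: [paths]} for sources that share a stem.

    Hypothetical helper: two sources with the same stem would both map to
    the same intermediate notebook (e.g. _proc/00_core.ipynb).
    """
    root = Path(nbs_dir)
    by_stem = defaultdict(list)
    for p in sorted(root.rglob("*")):
        rel = p.relative_to(root)
        # Assumption: ignore hidden and underscore-prefixed files/dirs
        # such as _proc/ and .ipynb_checkpoints/.
        if any(part.startswith((".", "_")) for part in rel.parts):
            continue
        if p.is_file() and p.suffix in suffixes:
            by_stem[rel.with_suffix("").as_posix()].append(p)
    return {stem: paths for stem, paths in by_stem.items() if len(paths) > 1}
```

Calling `find_stem_collisions("nbs")` returns an empty dict when every source has a distinct name.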
The PR is in a pretty stable position already (see this fork of nbdev rewritten entirely using .qmd files). There may be edge cases that I haven't considered, but in all I hope this is nearing a good shape to distribute.
What are the next steps here?
I may be having challenges with this, but just wanted to check to see if you've seen this before or if it's something external to your fork:
nbdev_proc_nbs:
"""
Traceback (most recent call last):
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/process.py", line 261, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/process.py", line 210, in _process_chunk
return [fn(*args) for args in chunk]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/process.py", line 210, in <listcomp>
return [fn(*args) for args in chunk]
^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/parallel.py", line 63, in _call
return g(item)
^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/serve_drv.py", line 35, in main
elif src.suffix=='.qmd': exec_qmd(src, dst, x)
^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/serve_drv.py", line 23, in exec_qmd
cb()(nb)
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/processors.py", line 292, in __call__
def __call__(self, nb): return self.nb_proc(nb).process()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/process.py", line 130, in process
for proc in self.procs: self._proc(proc)
^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/process.py", line 122, in _proc
if hasattr(proc,'begin'): proc.begin()
^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/processors.py", line 108, in begin
if getattr(cells[idx+1], 'has_sd', 0):
~~~~~^^^^^^^
IndexError: list index out of range
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/n/home03/ttapera/.conda/envs/era5_sandbox/bin/nbdev_proc_nbs", line 8, in <module>
sys.exit(nbdev_proc_nbs())
^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/script.py", line 125, in _f
return tfunc(**merge(args, args_from_prog(func, xtra)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/quarto.py", line 217, in nbdev_proc_nbs
_pre_docs(**kwargs)[0]
^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/quarto.py", line 209, in _pre_docs
cache = proc_nbs(path, n_workers=n_workers, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/nbdev/serve.py", line 82, in proc_nbs
parallel(nbdev.serve_drv.main, files, n_workers=n_workers, pause=0.01, **kw)
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/parallel.py", line 134, in parallel
return L(r)
^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/foundation.py", line 100, in __call__
return super().__call__(x, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/foundation.py", line 108, in __init__
items = listify(items, *rest, use_list=use_list, match=match)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/site-packages/fastcore/basics.py", line 79, in listify
elif is_iter(o): res = list(o)
^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/process.py", line 620, in _chain_from_iterable_of_lists
for element in iterable:
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/_base.py", line 619, in result_iterator
yield _result_or_cancel(fs.pop())
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/_base.py", line 317, in _result_or_cancel
return fut.result(timeout)
^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/n/home03/ttapera/.conda/envs/era5_sandbox/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
IndexError: list index out of range
Any thoughts? What else would you like to see to help debug?
I got this PR working for my personal use cases and didn't see much initial interest in bringing it into the main branch. It seems to have gained some traction since I first opened it, and I'm happy to push it forward.
What are the next steps here? @football-kowshik
From my side, it has been a while since I've rebased onto main. I will do that, see what bugs/clashes have come up since then, and try to resolve them. Beyond that, it's up to the maintainers to decide whether this is worth incorporating into the main branch (I think it definitely is, but I am biased. The .qmd workflow has proven much smoother for my use cases and it is fully backward compatible with .ipynb files.)
@TinasheMTapera I am not positive, but this bug looks a lot like the weird edge cases I encountered when trying to parse .qmd files as valid nbdev source. Could you share a minimal .qmd file that reproduces this bug? I'm a bit new to contributing to larger OSS projects on GitHub, but I feel that this bug doesn't need its own issue since it is pertinent only to this PR.
@bhoov you haven't requested Jeremy Howard or any of the maintainers to review the PR.
I recently asked on Discord why this PR had not been reviewed, and the answer I received from Jeremy was:
No one requested a review so i didn't see it 🙂
@bhoov I was able to figure out the problem, and it was not related to your PR, but rather to an edge case of nbdev itself, so please disregard!
I recently asked in discord, why this PR was not reviewed and the answer I received from Jeremy was:
No one requested a review so i didn't see it 🙂
GitHub actually does not allow me to request reviewers or assign people on this repository, otherwise I would have.
@jph00 can you review this PR? :)
Will do!
Thanks for the review, Jeremy. I definitely prioritized feature completeness at the expense of tasteful code and minimal changes 🙃. I'm still new to contributing to larger existing projects, but I'll get there.
Do you suggest closing this mammoth PR and instead introducing bite-sized PRs to the main branch? Or should I make smaller PRs to a dedicated "qmd_support" branch?
I'd suggest closing this and doing a single small PR, and working through that together first.
Thanks for your patience! :)