Add cell label in Quarto render status #5772

yuninxia · 2023-06-01T00:59:07Z

In this pull request, I've refined the visibility and interpretability of Quarto render status messages.

Key modifications include:

Cells with labels: The syntax has been updated from " Cell {0}/{1}..." to " Cell '<label>' {0}/{1}...".
Cells without labels: The format has been altered from " Cell {0}/{1}..." to " Cell '' {0}/{1}...".

These refinements boost the readability of the script and contribute to the overall maintainability and professionalism of our codebase. I look forward to your review.

mcanouil · 2023-06-01T10:23:48Z

src/resources/jupyter/notebook.py

+        # get label
+        label = ''
+        source_lines = cell.source.split('\n')
+        label_line = next((line for line in source_lines if line.startswith('#| label:')), None)
+        if label_line:
+            label = label_line.replace('#| label:', '').strip()
+
+        status("  Cell '{0}' {1}/{2}...".format(
+           label, current_code_cell- 1, total_code_cells - 1
+        ))


Thanks for the PR!

It should be more reliable to get the id straight from the cell.
Notebooks, will always have an identifier, but the id is replaced by label if any.
This would avoid to re-do the parsing of parameters, etc.

Notebook cell identifier. Note that if there is no cell id then label will be used as the cell id if it is present. See https://jupyter.org/enhancement-proposals/62-cell-id/cell-id.html for additional details on cell ids.

Suggested change

# get label

label = ''

source_lines = cell.source.split('\n')

label_line = next((line for line in source_lines if line.startswith('#| label:')), None)

if label_line:

label = label_line.replace('#| label:', '').strip()

status(" Cell '{0}' {1}/{2}...".format(

label, current_code_cell- 1, total_code_cells - 1

))

status(" Cell {0}/{1}: {2} ... ".format(

current_code_cell - 1,

total_code_cells - 1,

cell.id,

))

The above suggested code produces the following output:

❯ quarto preview index.qmd --no-browser --no-watch-inputs Starting python3 kernel...Done Executing 'index.ipynb' Cell 1/2: test ... Done Cell 2/2: eb6da12e ... Done

I think that cells with automatically generated IDs shouldn't show the label. I also think that we should measure for the max width of a cell label and pad the other output so all progress messages have the same width (i.e. the Done lines up across lines).

I agree with the suggestion to standardise the messages.

A 80 width string filled with spaces and "Done" at the end would do?

Also other messages would need to be "width standardised".

We should be able to measure the max width of labels at the outset (and therefore get perfect alignment)

I still think we ought to not emit a1886270 and other automatically-generated IDs. If at that point in the code it's too late to know which ID is automatically generated and which isn't, then we need to find a way to carry this information into the right part of the code.

Thank you for bringing this to my attention! Yes, I agree with the suggestion that perhaps there needs to be a flag for whether the ID is automatically generated or not. The reason is, according to the documentation:

define: cell_id is the id in the output.

if "#| id": cell_id = cell.id elif "#| label": cell_id = cell.id else: cell_id = <auto-generated-id>

Since cell.id is automatically generated when neither "#| id" nor "#| label" exists, the above logic is equivalent to:

if is_auto_generated(cell.id) == False: cell_id = cell.id else: cell_id = <auto-generated-id>

Therefore, once we have a flag to know if the id is auto-generated, we can set cell_id to "":

if is_auto_generated(cell.id) == False: cell_id = cell.id else: cell_id = ""

We may need to modify this part of the code as well as some code that has called this function:
quarto-dev/quarto-cli@58a9462

Could you kindly provide some guidance on how the relevant code should be modified? Thank you!

ps. please ignore the latest commit (3cea7dc) in this branch.

We may actually want to marry this with some work that @dragonstyle has been doing on notebook embedding. There, we want people to be able to embed notebook cells by a unique id. We have an idiom there where we first take the label, then use the first cell tag (if any). Perhaps we need a few field that represents this embeddable identifier?

Thank you! While I was examining the code related to notebook embedding, I found the nb_cell_yaml_options() function. This led me to submit a commit (cc1c803) that might resolve the issue [FR] Replace "Cell 1/N" by label (if provided) when using Jupyter #5752. Furthermore, I've noticed a potential problem with the nb_cell_yaml_options() function. Here's the summary:

Summary of the Latest commit

This commit introduces a feature where the render status will now display the label when a user has specified "#| label" for a "code" cell. When running the "quarto render" command, the output will match the following example:

Cell 1/3: 'fig-polar-1'......Done Cell 2/3: 'fig-polar-1024'...Done Cell 3/3: ''.................Done

Notably, even if a user writes "#| id" in the "code" cell, the id will not be visible in the render status.

This is because the return list from nb_cell_yaml_options() just includes cell attributes like label, classes, fig-cap, fig-subcap, and so on, as specified in cell-attributes.yml and yaml-intelligence-resources.json, and does not include cell attributes like "#| id" "#| tags" and "#| export". I'm uncertain if this is an issue that needs addressing or if there's a different approach to access these attributes. Any advice on this would be greatly appreciated!

Additional Note

I observed that in a recent PR (Support export: in jupyter cells #4958), we added support for export, but this attribute is currently absent from the official documentation Code Cells: Jupyter. I believe an update to the documentation to include this attribute might be beneficial.

Then export syntax is a pre-processing directive used by nbdev so indeed not part of our schema or feature set. It (and the others ) can indeed be safely ignored here!

jjallaire · 2023-06-04T11:49:52Z

Thanks for all of the iteration here! This is looking great save for one thing. We end up doing this twice for each cell (once to compute the max and once to present the label for progress):

nb_cell_yaml_options(client.nb.metadata.kernelspec.language, cell)

I wonder if we could just map the cells into a cell_labels array of strings at the top, compute the max, and then just index into that array during the cell-by-cell progress?

yuninxia · 2023-06-05T01:20:50Z

Thank you for your advice!

Summary of the Latest Commit

It now avoids calling nb_cell_yaml_options() twice. Here's the modified code:

# compute total code cells (for progress)
current_code_cell = 1
total_code_cells = sum(cell.cell_type == 'code' for cell in client.nb.cells)

# map cells to their labels
cell_labels = [nb_cell_yaml_options(client.nb.metadata.kernelspec.language, cell).get('label', '') for cell in client.nb.cells]

# find max label length (for progress)
max_label_len = max(len(label) for label in cell_labels)

Further optimization is possible here by iterating client.nb.cells just once to boost efficiency, especially with large data (like using Quarto for book writing):

current_code_cell = 1
total_code_cells = 0
cell_labels = []
max_label_len = 0

for cell in client.nb.cells:
   # compute total code cells (for progress)
   if cell.cell_type == 'code':
      total_code_cells += 1
   # map cells to their labels
   label = nb_cell_yaml_options(client.nb.metadata.kernelspec.language, cell).get('label', '')
   cell_labels.append(label)
   # find max label length
   max_label_len = max(max_label_len, len(label))

Future Work: The output information of R is different from Python and Julia. Can we change the output information of R to the same format as Python and Julia?

The current test results using Python and Julia are consistent, as shown below:

  Cell 1/3: 'fig-polar-1'......Done
  Cell 2/3: 'fig-polar-1024'...Done
  Cell 3/3: ''.................Done

When using R with quarto render hello-r.qmd --to html, the output format is as follows:

processing file: hello-r.qmd
1/4                   
2/4 [fig-airquality]  
3/4                   
4/4 [fig-airquality-2]
output file: hello-r.knit.md

Concerns with R's output format:

It differs from Python/Julia.
Each cell run leads to two counts. For example, with only one code cell, the output is redundant. There should be 1/1 [fig-airquality].

processing file: hello-r.qmd
1/2                   
2/2 [fig-airquality]  
output file: hello-r.knit.md

The hello-r.knit.md file isn't found locally, yet the render status still displays output file: hello-r.knit.md.

jjallaire · 2023-06-05T13:53:15Z

The R output format is actually coming from a totally different stack (knitr) so we don't control it directly. In more recent versions of knitr it is actually a progress bar so even more different. I'm not worried about this divergence.

jjallaire · 2023-06-05T13:54:20Z

Thanks again so much for this (and for all of the iteration). Merging now!

yuninxia · 2023-06-05T18:20:55Z

Thank you for being so supportive!

bryanwhiting · 2023-09-13T02:55:54Z

Thanks for building this!!

cderv linked an issue Jun 1, 2023 that may be closed by this pull request

[FR] Replace "Cell 1/N" by label (if provided) when using Jupyter #5752

Closed

mcanouil reviewed Jun 1, 2023

View reviewed changes

yuninxia marked this pull request as draft June 1, 2023 17:33

yuninxia force-pushed the bugfix/issue-5752 branch from 072be84 to 7147814 Compare June 1, 2023 20:19

ghost assigned cscheid Jun 2, 2023

yuninxia marked this pull request as ready for review June 3, 2023 15:12

yuninxia added 6 commits June 4, 2023 15:32

Add cell label in Quarto render status

ad68714

Add cell label in render status with better alignment

248c8da

Add native cell id in .qmd file's render status

084057f

Add cell label in Quarto render status

26fbe4d

Add cell label in Quarto render status with nb_cell_yaml_options()

6583a39

Add cell label in Quarto render status with nb_cell_yaml_options()

f7d7bdb

yuninxia force-pushed the bugfix/issue-5752 branch from cc1c803 to f7d7bdb Compare June 5, 2023 00:24

jjallaire merged commit 84d4659 into quarto-dev:main Jun 5, 2023

Add cell label in Quarto render status #5772

Add cell label in Quarto render status #5772

Uh oh!

Conversation

yuninxia commented Jun 1, 2023

Uh oh!

mcanouil Jun 1, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mcanouil Jun 1, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jjallaire Jun 1, 2023

Choose a reason for hiding this comment

Uh oh!

mcanouil Jun 1, 2023

Choose a reason for hiding this comment

Uh oh!

jjallaire Jun 1, 2023

Choose a reason for hiding this comment

Uh oh!

cscheid Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

yuninxia Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

jjallaire Jun 2, 2023

Choose a reason for hiding this comment

Uh oh!

yuninxia Jun 3, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jjallaire Jun 4, 2023

Choose a reason for hiding this comment

Uh oh!

jjallaire commented Jun 4, 2023

Uh oh!

yuninxia commented Jun 5, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jjallaire commented Jun 5, 2023

Uh oh!

jjallaire commented Jun 5, 2023

Uh oh!

yuninxia commented Jun 5, 2023

Uh oh!

bryanwhiting commented Sep 13, 2023

Uh oh!

Uh oh!

mcanouil Jun 1, 2023 •

edited

Loading

mcanouil Jun 1, 2023 •

edited

Loading

yuninxia Jun 3, 2023 •

edited

Loading

yuninxia commented Jun 5, 2023 •

edited

Loading