-
Notifications
You must be signed in to change notification settings - Fork 370
Add cell label in Quarto render status #5772
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/resources/jupyter/notebook.py
Outdated
# get label | ||
label = '' | ||
source_lines = cell.source.split('\n') | ||
label_line = next((line for line in source_lines if line.startswith('#| label:')), None) | ||
if label_line: | ||
label = label_line.replace('#| label:', '').strip() | ||
|
||
status(" Cell '{0}' {1}/{2}...".format( | ||
label, current_code_cell- 1, total_code_cells - 1 | ||
)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR!
It should be more reliable to get the id
straight from the cell.
Notebooks, will always have an identifier, but the id
is replaced by label
if any.
This would avoid to re-do the parsing of parameters, etc.
Notebook cell identifier. Note that if there is no cell
id
thenlabel
will be used as the cellid
if it is present. See https://jupyter.org/enhancement-proposals/62-cell-id/cell-id.html for additional details on cell ids.
# get label | |
label = '' | |
source_lines = cell.source.split('\n') | |
label_line = next((line for line in source_lines if line.startswith('#| label:')), None) | |
if label_line: | |
label = label_line.replace('#| label:', '').strip() | |
status(" Cell '{0}' {1}/{2}...".format( | |
label, current_code_cell- 1, total_code_cells - 1 | |
)) | |
status(" Cell {0}/{1}: {2} ... ".format( | |
current_code_cell - 1, | |
total_code_cells - 1, | |
cell.id, | |
)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The above suggested code produces the following output:
❯ quarto preview index.qmd --no-browser --no-watch-inputs
Starting python3 kernel...Done
Executing 'index.ipynb'
Cell 1/2: test ... Done
Cell 2/2: eb6da12e ... Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that cells with automatically generated IDs shouldn't show the label. I also think that we should measure for the max width of a cell label and pad the other output so all progress messages have the same width (i.e. the Done
lines up across lines).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the suggestion to standardise the messages.
A 80 width string filled with spaces and "Done" at the end would do?
Also other messages would need to be "width standardised".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be able to measure the max width of labels at the outset (and therefore get perfect alignment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we ought to not emit a1886270
and other automatically-generated IDs. If at that point in the code it's too late to know which ID is automatically generated and which isn't, then we need to find a way to carry this information into the right part of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for bringing this to my attention! Yes, I agree with the suggestion that perhaps there needs to be a flag for whether the ID is automatically generated or not. The reason is, according to the documentation:
define: cell_id
is the id in the output.
if "#| id":
cell_id = cell.id
elif "#| label":
cell_id = cell.id
else:
cell_id = <auto-generated-id>
Since cell.id
is automatically generated when neither "#| id"
nor "#| label"
exists, the above logic is equivalent to:
if is_auto_generated(cell.id) == False:
cell_id = cell.id
else:
cell_id = <auto-generated-id>
Therefore, once we have a flag to know if the id is auto-generated, we can set cell_id to ""
:
if is_auto_generated(cell.id) == False:
cell_id = cell.id
else:
cell_id = ""
We may need to modify this part of the code as well as some code that has called this function:
quarto-dev/quarto-cli@58a9462
Could you kindly provide some guidance on how the relevant code should be modified? Thank you!
ps. please ignore the latest commit (3cea7dc
) in this branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may actually want to marry this with some work that @dragonstyle has been doing on notebook embedding. There, we want people to be able to embed notebook cells by a unique id. We have an idiom there where we first take the label
, then use the first cell tag (if any). Perhaps we need a few field that represents this embeddable identifier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! While I was examining the code related to notebook embedding, I found the nb_cell_yaml_options()
function. This led me to submit a commit (cc1c803) that might resolve the issue [FR] Replace "Cell 1/N" by label (if provided) when using Jupyter #5752. Furthermore, I've noticed a potential problem with the nb_cell_yaml_options()
function. Here's the summary:
Summary of the Latest commit
This commit introduces a feature where the render status will now display the label when a user has specified "#| label"
for a "code" cell. When running the "quarto render"
command, the output will match the following example:
Cell 1/3: 'fig-polar-1'......Done
Cell 2/3: 'fig-polar-1024'...Done
Cell 3/3: ''.................Done
Notably, even if a user writes "#| id"
in the "code" cell, the id will not be visible in the render status.
This is because the return list from nb_cell_yaml_options()
just includes cell attributes like label
, classes
, fig-cap
, fig-subcap
, and so on, as specified in cell-attributes.yml and yaml-intelligence-resources.json, and does not include cell attributes like "#| id"
"#| tags"
and "#| export"
. I'm uncertain if this is an issue that needs addressing or if there's a different approach to access these attributes. Any advice on this would be greatly appreciated!
Additional Note
I observed that in a recent PR (Support export: in jupyter cells #4958), we added support for export
, but this attribute is currently absent from the official documentation Code Cells: Jupyter. I believe an update to the documentation to include this attribute might be beneficial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then export
syntax is a pre-processing directive used by nbdev so indeed not part of our schema or feature set. It (and the others ) can indeed be safely ignored here!
072be84
to
7147814
Compare
Thanks for all of the iteration here! This is looking great save for one thing. We end up doing this twice for each cell (once to compute the max and once to present the label for progress): nb_cell_yaml_options(client.nb.metadata.kernelspec.language, cell) I wonder if we could just map the cells into a |
cc1c803
to
f7d7bdb
Compare
Thank you for your advice! Summary of the Latest Commit It now avoids calling # compute total code cells (for progress)
current_code_cell = 1
total_code_cells = sum(cell.cell_type == 'code' for cell in client.nb.cells)
# map cells to their labels
cell_labels = [nb_cell_yaml_options(client.nb.metadata.kernelspec.language, cell).get('label', '') for cell in client.nb.cells]
# find max label length (for progress)
max_label_len = max(len(label) for label in cell_labels) Further optimization is possible here by iterating current_code_cell = 1
total_code_cells = 0
cell_labels = []
max_label_len = 0
for cell in client.nb.cells:
# compute total code cells (for progress)
if cell.cell_type == 'code':
total_code_cells += 1
# map cells to their labels
label = nb_cell_yaml_options(client.nb.metadata.kernelspec.language, cell).get('label', '')
cell_labels.append(label)
# find max label length
max_label_len = max(max_label_len, len(label)) Future Work: The output information of R is different from Python and Julia. Can we change the output information of R to the same format as Python and Julia? The current test results using Python and Julia are consistent, as shown below:
When using R with
Concerns with R's output format:
|
The R output format is actually coming from a totally different stack (knitr) so we don't control it directly. In more recent versions of knitr it is actually a progress bar so even more different. I'm not worried about this divergence. |
Thanks again so much for this (and for all of the iteration). Merging now! |
Thank you for being so supportive! |
Thanks for building this!! |
In this pull request, I've refined the visibility and interpretability of Quarto render status messages.
Key modifications include:
Cells with labels: The syntax has been updated from
" Cell {0}/{1}..."
to" Cell '<label>' {0}/{1}..."
.Cells without labels: The format has been altered from
" Cell {0}/{1}..."
to" Cell '' {0}/{1}..."
.These refinements boost the readability of the script and contribute to the overall maintainability and professionalism of our codebase. I look forward to your review.