Add notebooks to docs download #652
base: main
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -28,4 +28,5 @@ venv | |
site_libs | ||
.DS_Store | ||
index_files | ||
digest.txt | ||
digest.txt | ||
**/*.quarto_ipynb |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,49 @@ | ||
#!/bin/bash | ||
# Add Jupyter notebook download links to rendered HTML files | ||
# This adds a download link to the toc-actions section (next to "Edit this page" and "Report an issue") | ||
|
||
set -e | ||
|
||
echo "Adding notebook download links to HTML pages..." | ||
|
||
# Link text variables | ||
DOWNLOAD_TEXT="Download notebook" | ||
COLAB_TEXT="Open in Colab" | ||
|
||
# Colab URL configuration (can be overridden via environment variables) | ||
COLAB_REPO="${COLAB_REPO:-TuringLang/docs}" | ||
COLAB_BRANCH="${COLAB_BRANCH:-gh-pages}" | ||
COLAB_PATH_PREFIX="${COLAB_PATH_PREFIX:-}" | ||
|
||
# Find all HTML files that have corresponding .ipynb files | ||
find _site/tutorials _site/usage _site/developers -name "index.html" 2>/dev/null | while read html_file; do | ||
dir=$(dirname "$html_file") | ||
ipynb_file="${dir}/index.ipynb" | ||
|
||
# Check if the corresponding .ipynb file exists | ||
if [ -f "$ipynb_file" ]; then | ||
# Check if link is already present | ||
if ! grep -q "$DOWNLOAD_TEXT" "$html_file"; then | ||
# Get relative path from _site/ directory | ||
relative_path="${html_file#_site/}" | ||
relative_path="${relative_path%/index.html}" | ||
|
||
# Extract notebook name from parent folder | ||
notebook_name=$(basename "$relative_path") | ||
|
||
# Construct Colab URL | ||
if [ -n "$COLAB_PATH_PREFIX" ]; then | ||
colab_url="https://colab.research.google.com/github/${COLAB_REPO}/blob/${COLAB_BRANCH}/${COLAB_PATH_PREFIX}/${relative_path}/index.ipynb" | ||
else | ||
colab_url="https://colab.research.google.com/github/${COLAB_REPO}/blob/${COLAB_BRANCH}/${relative_path}/index.ipynb" | ||
fi | ||
|
||
# Insert both download and Colab links BEFORE the "Edit this page" link | ||
# The download attribute forces browser to download with custom filename instead of navigate | ||
perl -i -pe 's|(<li><a href="[^"]*edit[^"]*"[^>]*><i class="bi[^"]*"></i>Edit this page</a></li>)|<li><a href="index.ipynb" class="toc-action" download="'"$notebook_name"'.ipynb"><i class="bi bi-journal-code"></i>'"$DOWNLOAD_TEXT"'</a></li><li><a href="'"$colab_url"'" class="toc-action" target="_blank" rel="noopener"><i class="bi bi-google"></i>'"$COLAB_TEXT"'</a></li>$1|g' "$html_file" | ||
echo " ✓ Added notebook links to $html_file" | ||
fi | ||
fi | ||
done | ||
|
||
echo "Notebook links added successfully!" |
My main other question is: if the qmd-to-ipynb script is already in Python, why not write the Bash scripts in Python as well? I don't mind the use of Python (it's way better for quick scripts than Julia is), but I think reducing the number of moving parts that need to work together will make for easier maintenance in the future. Especially because all of these scripts serve a common purpose and you're unlikely to want to run one without running the others.

Eh, personally I find any bash stuff gets harder to read the more complex it gets, I can do that though, not a problem. Let me do the current changes and verify it works then I can explore this fully.

oh yes bash is definitely hard to read but I meant moving the bash to python - not the other way round!

aha, gotchu. tbh, I originally wrote the bash as a quick fix with the CI, it was only because the default convert didn't work that I did the python too.
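As a rough illustration of the Python port discussed in this thread, the Colab URL step of the Bash script could be sketched as below. The function names, the `site_root` parameter, and the `example-tutorial` path are hypothetical and not part of the PR; only the defaults (`TuringLang/docs`, `gh-pages`, empty prefix) and the URL layout are taken from the script above.

```python
import os
from pathlib import Path

COLAB_BASE = "https://colab.research.google.com/github"


def build_colab_url(html_file, site_root="_site", repo="TuringLang/docs",
                    branch="gh-pages", path_prefix=""):
    """Return the Colab URL for the index.ipynb that sits next to html_file."""
    # Directory of the page relative to the site root, e.g. "tutorials/foo".
    relative_dir = Path(html_file).parent.relative_to(site_root).as_posix()
    # Drop the prefix when it is empty, mirroring the if/else in the Bash script.
    parts = [p for p in (path_prefix.strip("/"), relative_dir, "index.ipynb") if p]
    return f"{COLAB_BASE}/{repo}/blob/{branch}/" + "/".join(parts)


def colab_settings_from_env():
    """Read the same environment overrides that the Bash script supports."""
    return {
        "repo": os.environ.get("COLAB_REPO", "TuringLang/docs"),
        "branch": os.environ.get("COLAB_BRANCH", "gh-pages"),
        "path_prefix": os.environ.get("COLAB_PATH_PREFIX", ""),
    }


if __name__ == "__main__":
    # "example-tutorial" is a made-up page used only for illustration.
    print(build_colab_url("_site/tutorials/example-tutorial/index.html",
                          **colab_settings_from_env()))
```

Keeping the URL logic in a pure function makes it easy to unit test without rendering the site.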
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,32 @@ | ||||||
#!/bin/bash | ||||||
# Generate Jupyter notebooks from .qmd files without re-executing code | ||||||
# This script converts .qmd files to .ipynb format with proper cell structure | ||||||
|
||||||
set -e | ||||||
|
||||||
echo "Generating Jupyter notebooks from .qmd files..." | ||||||
|
||||||
# Find all .qmd files in tutorials, usage, and developers directories | ||||||
find tutorials usage developers -name "index.qmd" | while read qmd_file; do | ||||||
Copilot: Using 'while read' with 'find' can fail with filenames containing spaces or special characters. Use 'find ... -print0 | while IFS= read -r -d "" qmd_file; do' for safer file handling.
||||||
dir=$(dirname "$qmd_file") | ||||||
ipynb_file="${dir}/index.ipynb" | ||||||
|
||||||
echo "Converting $qmd_file to $ipynb_file" | ||||||
|
||||||
# Convert qmd to ipynb using our custom Python script | ||||||
# Use relative path from repo root (assets/scripts/qmd_to_ipynb.py) | ||||||
python3 assets/scripts/qmd_to_ipynb.py "$qmd_file" "$ipynb_file" | ||||||
|
||||||
# Check if conversion was successful | ||||||
if [ -f "$ipynb_file" ]; then | ||||||
# Move the notebook to the _site directory | ||||||
mkdir -p "_site/${dir}" | ||||||
cp "$ipynb_file" "_site/${ipynb_file}" | ||||||
echo " ✓ Generated _site/${ipynb_file}" | ||||||
else | ||||||
echo " ✗ Failed to generate $ipynb_file" | ||||||
fi | ||||||
done | ||||||
|
||||||
echo "Notebook generation complete!" | ||||||
echo "Generated notebooks are in _site/ directory alongside HTML files" |
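If this script is ported to Python as suggested earlier in the review, the `find ... | while read` concern goes away, because paths are handled as objects rather than as whitespace-delimited text. The sketch below assumes the same three source directories and the same converter path as the Bash script; `find_qmd_files` and `convert_all` are hypothetical names, not code from the PR.

```python
import shutil
import subprocess
import sys
from pathlib import Path

SOURCE_DIRS = ("tutorials", "usage", "developers")


def find_qmd_files(root="."):
    """Yield every index.qmd below the documentation source directories."""
    root = Path(root)
    for name in SOURCE_DIRS:
        base = root / name
        if base.is_dir():
            # rglob returns Path objects, so spaces in names need no quoting.
            yield from sorted(base.rglob("index.qmd"))


def convert_all(root=".", site_dir="_site",
                converter="assets/scripts/qmd_to_ipynb.py"):
    """Convert each index.qmd and copy the notebook next to the rendered HTML."""
    root = Path(root)
    for qmd_file in find_qmd_files(root):
        ipynb_file = qmd_file.with_name("index.ipynb")
        # check=True raises if the converter fails, similar in spirit to `set -e`.
        subprocess.run(
            [sys.executable, str(root / converter), str(qmd_file), str(ipynb_file)],
            check=True,
        )
        target = root / site_dir / ipynb_file.relative_to(root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(ipynb_file, target)
```

Calling the converter through `subprocess` keeps this sketch independent of how `qmd_to_ipynb.py` is organised; importing it directly would be another option once everything lives in Python.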
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,235 @@ | ||||||
#!/usr/bin/env python3 | ||||||
""" | ||||||
Convert Quarto .qmd files to Jupyter .ipynb notebooks with proper cell structure. | ||||||
Each code block becomes a code cell, and markdown content becomes markdown cells. | ||||||
""" | ||||||
|
||||||
import sys | ||||||
import json | ||||||
import re | ||||||
from pathlib import Path | ||||||
from typing import List, Dict, Any, Optional | ||||||
Comment on lines +7 to +11: We're only using Python's standard library here, so this can be done just as easily in Julia too. I mean, it's always better to avoid Python in Julia-specific packages!
||||||
|
||||||
|
||||||
class QmdToIpynb: | ||||||
def __init__(self, qmd_path: str): | ||||||
self.qmd_path = Path(qmd_path) | ||||||
self.cells: List[Dict[str, Any]] = [] | ||||||
self.kernel_name = "julia" # Default kernel | ||||||
self.packages: set = set() # Track packages found in using statements | ||||||
|
||||||
def _extract_packages_from_line(self, line: str) -> None: | ||||||
"""Extract package names from a 'using' statement and add to self.packages.""" | ||||||
Copilot: The docstring should specify that this method only works for Julia 'using' statements, as it's not clear from the current description that this is language-specific.
||||||
line = line.strip() | ||||||
if not line.startswith('using '): | ||||||
return | ||||||
|
||||||
# Remove 'using ' prefix and any trailing semicolon/whitespace | ||||||
remainder = line[6:].rstrip(';').strip() | ||||||
|
||||||
# Handle 'using Package: item1, item2' format - extract just the package name | ||||||
if ':' in remainder: | ||||||
package = remainder.split(':')[0].strip() | ||||||
if package and package != 'Pkg': | ||||||
self.packages.add(package) | ||||||
else: | ||||||
# Handle 'using Package1, Package2, ...' format | ||||||
packages = [pkg.strip() for pkg in remainder.split(',')] | ||||||
for pkg in packages: | ||||||
if pkg and pkg != 'Pkg': | ||||||
self.packages.add(pkg) | ||||||
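To make the behaviour of this method easier to check in isolation, here is a standalone sketch of the same logic with a docstring that states the Julia-only scope raised in the review comment. `extract_julia_packages` is a hypothetical free function, not the method from the PR, and it returns a set instead of mutating `self.packages`.

```python
from typing import Set


def extract_julia_packages(line: str) -> Set[str]:
    """Return the package names in a single Julia `using` statement.

    Julia-specific: understands `using A, B` and `using A: x, y`.
    Any other line yields an empty set. `Pkg` is ignored, as in the PR.
    """
    line = line.strip()
    if not line.startswith("using "):
        return set()

    # Drop the keyword and any trailing semicolon.
    remainder = line[len("using "):].rstrip(";").strip()

    if ":" in remainder:
        # `using Package: item1, item2` names exactly one package.
        names = [remainder.split(":", 1)[0]]
    else:
        # `using Package1, Package2` names several.
        names = remainder.split(",")

    cleaned = (name.strip() for name in names)
    return {name for name in cleaned if name and name != "Pkg"}


print(extract_julia_packages("using Turing, Distributions"))
print(extract_julia_packages("using Random: seed!"))
```

Like the original, this sketch does not handle a second statement on the same line after a semicolon (for example `using Foo; bar()`), so such lines would need extra care.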
|
||||||
def parse(self) -> None: | ||||||
"""Parse the .qmd file and extract cells.""" | ||||||
with open(self.qmd_path, 'r', encoding='utf-8') as f: | ||||||
content = f.read() | ||||||
|
||||||
lines = content.split('\n') | ||||||
i = 0 | ||||||
|
||||||
# Skip YAML frontmatter | ||||||
if lines[0].strip() == '---': | ||||||
i = 1 | ||||||
while i < len(lines) and lines[i].strip() != '---': | ||||||
# Check for engine specification | ||||||
if lines[i].strip().startswith('engine:'): | ||||||
engine = lines[i].split(':', 1)[1].strip() | ||||||
if engine == 'julia': | ||||||
self.kernel_name = "julia" | ||||||
elif engine == 'python': | ||||||
self.kernel_name = "python3" | ||||||
i += 1 | ||||||
i += 1 # Skip the closing --- | ||||||
|
||||||
# Parse the rest of the document | ||||||
current_markdown = [] | ||||||
|
||||||
while i < len(lines): | ||||||
line = lines[i] | ||||||
|
||||||
# Check for code block start | ||||||
code_block_match = re.match(r'^```\{(\w+)\}', line) | ||||||
if code_block_match: | ||||||
# Save any accumulated markdown | ||||||
if current_markdown: | ||||||
self._add_markdown_cell(current_markdown) | ||||||
current_markdown = [] | ||||||
|
||||||
# Extract code block | ||||||
lang = code_block_match.group(1) | ||||||
i += 1 | ||||||
code_lines = [] | ||||||
cell_options = [] | ||||||
|
||||||
# Collect code and options | ||||||
while i < len(lines) and not lines[i].startswith('```'): | ||||||
if lines[i].startswith('#|'): | ||||||
cell_options.append(lines[i]) | ||||||
else: | ||||||
code_lines.append(lines[i]) | ||||||
i += 1 | ||||||
|
||||||
# Check if this is the Pkg.instantiate() cell that we want to skip | ||||||
code_content = '\n'.join(code_lines).strip() | ||||||
is_pkg_instantiate = ( | ||||||
'using Pkg' in code_content and | ||||||
'Pkg.instantiate()' in code_content and | ||||||
len(code_content.split('\n')) <= 3 # Only skip if it's just these lines | ||||||
) | ||||||
Comment on lines +94 to +98 (Copilot): The magic number 3 should be defined as a named constant to improve code readability and maintainability.
||||||
|
||||||
# Add code cell (with options as comments at the top) unless it's the Pkg.instantiate cell | ||||||
if not is_pkg_instantiate: | ||||||
full_code = cell_options + code_lines | ||||||
self._add_code_cell(full_code, lang) | ||||||
|
||||||
i += 1 # Skip closing ``` | ||||||
else: | ||||||
# Accumulate markdown | ||||||
current_markdown.append(line) | ||||||
i += 1 | ||||||
|
||||||
# Add any remaining markdown | ||||||
if current_markdown: | ||||||
self._add_markdown_cell(current_markdown) | ||||||
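One possible way to address the magic-number comment on the `is_pkg_instantiate` check is to pull the test into a small helper with a named threshold. The constant name `MAX_PKG_INSTANTIATE_LINES` and the helper `is_pkg_instantiate_cell` are hypothetical; the conditions themselves are copied from `parse`.

```python
# A cell is treated as pure setup only if it is at most this many lines long.
MAX_PKG_INSTANTIATE_LINES = 3


def is_pkg_instantiate_cell(code: str) -> bool:
    """Return True for short cells that only load Pkg and call Pkg.instantiate()."""
    stripped = code.strip()
    return (
        "using Pkg" in stripped
        and "Pkg.instantiate()" in stripped
        and len(stripped.split("\n")) <= MAX_PKG_INSTANTIATE_LINES
    )


print(is_pkg_instantiate_cell("using Pkg\nPkg.instantiate()"))
```

Longer cells that happen to call `Pkg.instantiate()` alongside other work are kept, which matches the intent of the comment in the original code.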
|
||||||
def _add_markdown_cell(self, lines: List[str]) -> None: | ||||||
"""Add a markdown cell, stripping leading/trailing empty lines.""" | ||||||
# Strip leading empty lines | ||||||
while lines and not lines[0].strip(): | ||||||
lines.pop(0) | ||||||
|
||||||
# Strip trailing empty lines | ||||||
while lines and not lines[-1].strip(): | ||||||
lines.pop() | ||||||
|
||||||
if not lines: | ||||||
return | ||||||
|
||||||
content = '\n'.join(lines) | ||||||
cell = { | ||||||
"cell_type": "markdown", | ||||||
"metadata": {}, | ||||||
"source": content | ||||||
} | ||||||
self.cells.append(cell) | ||||||
|
||||||
def _add_code_cell(self, lines: List[str], lang: str) -> None: | ||||||
"""Add a code cell.""" | ||||||
# Extract packages from Julia code cells | ||||||
if lang == 'julia': | ||||||
for line in lines: | ||||||
self._extract_packages_from_line(line) | ||||||
|
||||||
content = '\n'.join(lines) | ||||||
|
||||||
# For non-Julia code blocks (like dot/graphviz), add as markdown with code formatting | ||||||
# since Jupyter notebooks typically use Julia kernel for these docs | ||||||
if lang != 'julia' and lang != 'python': | ||||||
# Convert to markdown with code fence | ||||||
markdown_content = f"```{lang}\n{content}\n```" | ||||||
cell = { | ||||||
"cell_type": "markdown", | ||||||
"metadata": {}, | ||||||
"source": markdown_content | ||||||
} | ||||||
else: | ||||||
cell = { | ||||||
"cell_type": "code", | ||||||
"execution_count": None, | ||||||
"metadata": {}, | ||||||
"outputs": [], | ||||||
"source": content | ||||||
} | ||||||
|
||||||
self.cells.append(cell) | ||||||
|
||||||
def to_notebook(self) -> Dict[str, Any]: | ||||||
"""Convert parsed cells to Jupyter notebook format.""" | ||||||
# Add package activation cell at the top for Julia notebooks | ||||||
cells = self.cells | ||||||
if self.kernel_name.startswith("julia"): | ||||||
Copilot: The condition checks if kernel_name starts with 'julia', but line 18 sets kernel_name to 'julia' as a string. This should use an equality check instead of startswith() to avoid potential false matches.
||||||
# Build the source code for the setup cell | ||||||
pkg_source_lines = ["using Pkg; Pkg.activate(; temp=true)"] | ||||||
|
||||||
# Add Pkg.add() calls for each package found in the document | ||||||
for package in sorted(self.packages): | ||||||
pkg_source_lines.append(f'Pkg.add("{package}")') | ||||||
|
||||||
pkg_cell = { | ||||||
"cell_type": "code", | ||||||
"execution_count": None, | ||||||
"metadata": {}, | ||||||
"outputs": [], | ||||||
"source": "\n".join(pkg_source_lines) | ||||||
} | ||||||
cells = [pkg_cell] + self.cells | ||||||
|
||||||
notebook = { | ||||||
"cells": cells, | ||||||
"metadata": { | ||||||
"kernelspec": { | ||||||
"display_name": "Julia", | ||||||
"language": "julia", | ||||||
"name": self.kernel_name | ||||||
}, | ||||||
"language_info": { | ||||||
"file_extension": ".jl", | ||||||
"mimetype": "application/julia", | ||||||
"name": "julia" | ||||||
} | ||||||
}, | ||||||
"nbformat": 4, | ||||||
"nbformat_minor": 5 | ||||||
} | ||||||
return notebook | ||||||
|
||||||
def write(self, output_path: str) -> None: | ||||||
"""Write the notebook to a file.""" | ||||||
notebook = self.to_notebook() | ||||||
with open(output_path, 'w', encoding='utf-8') as f: | ||||||
json.dump(notebook, f, indent=2, ensure_ascii=False) | ||||||
|
||||||
|
||||||
def main(): | ||||||
if len(sys.argv) < 2: | ||||||
print("Usage: qmd_to_ipynb.py <input.qmd> [output.ipynb]") | ||||||
sys.exit(1) | ||||||
|
||||||
qmd_path = sys.argv[1] | ||||||
|
||||||
# Determine output path | ||||||
if len(sys.argv) >= 3: | ||||||
ipynb_path = sys.argv[2] | ||||||
else: | ||||||
ipynb_path = Path(qmd_path).with_suffix('.ipynb') | ||||||
|
||||||
# Convert | ||||||
converter = QmdToIpynb(qmd_path) | ||||||
converter.parse() | ||||||
converter.write(ipynb_path) | ||||||
|
||||||
print(f"Converted {qmd_path} -> {ipynb_path}") | ||||||
|
||||||
|
||||||
if __name__ == "__main__": | ||||||
main() |
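One detail that may be worth checking: the notebook written above declares `nbformat_minor: 5`, and to my knowledge the 4.5 notebook schema expects every cell to carry an `id` field, which the cells built by this script do not have. The metadata is also hard-coded to Julia even when `engine: python` selects the `python3` kernel. The sketch below shows one way both could be handled; `make_cell`, `build_notebook`, and the `KERNELS` table are hypothetical helpers, and the metadata values for Python are my assumption, not something taken from the PR.

```python
import json
import uuid

# Assumed per-kernel metadata; the Julia entry mirrors the values in the PR.
KERNELS = {
    "julia": {
        "kernelspec": {"display_name": "Julia", "language": "julia", "name": "julia"},
        "language_info": {"file_extension": ".jl", "mimetype": "application/julia", "name": "julia"},
    },
    "python3": {
        "kernelspec": {"display_name": "Python 3", "language": "python", "name": "python3"},
        "language_info": {"file_extension": ".py", "mimetype": "text/x-python", "name": "python"},
    },
}


def make_cell(cell_type, source):
    """Build a notebook cell with a short random id."""
    cell = {
        "id": uuid.uuid4().hex[:8],
        "cell_type": cell_type,
        "metadata": {},
        "source": source,
    }
    if cell_type == "code":
        cell["execution_count"] = None
        cell["outputs"] = []
    return cell


def build_notebook(cells, kernel_name="julia"):
    """Wrap cells in a minimal nbformat 4.5 document for the given kernel."""
    return {
        "cells": cells,
        "metadata": KERNELS[kernel_name],
        "nbformat": 4,
        "nbformat_minor": 5,
    }


notebook = build_notebook([make_cell("markdown", "# Title"), make_cell("code", "1 + 1")])
print(json.dumps(notebook, indent=2))
```

If cell ids are not wanted, lowering `nbformat_minor` to 4 would be another way to keep the declared version and the cell structure consistent.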
Copilot: This complex regex replacement should be broken down into multiple steps or variables to improve readability and maintainability.
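A sketch of how the `perl -i -pe` substitution might be split into named pieces, assuming it is moved to Python along with the rest of the scripts. The pattern and the generated markup are copied from the Bash script; `toc_item`, `add_notebook_links`, and the sample HTML are hypothetical.

```python
import re

# The existing "Edit this page" entry in the toc-actions list.
EDIT_LINK = re.compile(
    r'(<li><a href="[^"]*edit[^"]*"[^>]*>'
    r'<i class="bi[^"]*"></i>Edit this page</a></li>)'
)


def toc_item(href, icon, text, extra_attrs=""):
    """Render one toc-actions list item."""
    return (
        f'<li><a href="{href}" class="toc-action"{extra_attrs}>'
        f'<i class="bi {icon}"></i>{text}</a></li>'
    )


def add_notebook_links(html, notebook_name, colab_url,
                       download_text="Download notebook",
                       colab_text="Open in Colab"):
    """Insert download and Colab links before the "Edit this page" link."""
    if download_text in html:
        return html  # already processed, same guard as the grep in the script

    download = toc_item("index.ipynb", "bi-journal-code", download_text,
                        f' download="{notebook_name}.ipynb"')
    colab = toc_item(colab_url, "bi-google", colab_text,
                     ' target="_blank" rel="noopener"')
    # A function replacement avoids backslash handling in the inserted text.
    return EDIT_LINK.sub(lambda m: download + colab + m.group(1), html)


# Made-up page fragment, used only to exercise the function.
sample = ('<ul><li><a href="https://github.com/TuringLang/docs/edit/main/x.qmd" '
          'class="toc-action"><i class="bi bi-github"></i>Edit this page</a></li></ul>')
print(add_notebook_links(sample, "x", "https://colab.research.google.com/github/TuringLang/docs/blob/gh-pages/x/index.ipynb"))
```

Each list item is built by one helper, so changing the icon or link text no longer requires editing a single long quoted string.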