diff --git a/guests/python/DEVELOPMENT.md b/guests/python/DEVELOPMENT.md
new file mode 100644
index 0000000..967e3cd
--- /dev/null
+++ b/guests/python/DEVELOPMENT.md
@@ -0,0 +1,61 @@
+# Development Notes
+## Execution Models
+### Embedded VM
+#### Pyodide
+Website: .
+
+Pros:
+- Supports loads of dependencies
+- Runs in the browser
+
+Cons:
+- Doesn't seem to be working with freestanding WASM runtimes / servers, esp. not without Node.js
+
+#### Official CPython WASM Builds
+Links:
+-
+-
+-
+-
+
+Pros:
+- Official project, so it has a somewhat stable future and it is easier to get buy-in from the community
+
+Cons:
+- Can only run as a WASI CLI-like app (so we would need to interact with it via stdio or a fake network)
+- Currently only offered as wasip1
+
+#### pyo3 + Official CPython WASM Builds
+Instead of using stdio to drive a Python interpreter, we use [pyo3].
+
+Pros:
+- We can interact w/ Python more efficiently.
+
+Cons:
+- Needs pre-released Python 3.14, because 3.13 seems to rely on "thread parking", which is implemented as WASM exceptions, which are not supported by wasmtime yet. Relevant code is .
+
+#### webassembly-language-runtimes
+Website:
+
+This was formerly a VMware project.
+
+Cons:
+- Seems dead?
+
+### Ahead-of-Time Compilation
+This is only going to work if
+
+- the ahead-of-time compiler itself is lightweight enough to be embedded within a database (esp. it should not call to some random C host toolchain)
+- the Python compiler/transpiler is solid and supports enough features
+
+#### componentize-py
+Website:
+
+#### py2wasm
+Website:
+
+### Other Notes
+-
+
+
+[pyo3]: https://pyo3.rs/
diff --git a/guests/python/README.md b/guests/python/README.md
index c7950da..5f98310 100644
--- a/guests/python/README.md
+++ b/guests/python/README.md
@@ -13,63 +13,195 @@ or
just release
```
-## Execution Models
-### Embedded VM
-#### Pyodide
-Website: .
+## Python Version
+We currently bundle [Python 3.14.0rc2].
+
+## Python Standard Library
+In contrast to a normal Python installation, a few notable public[^public] modules are **missing** from the [Python Standard Library]:
+
+- [`curses`](https://docs.python.org/3/library/curses.html)
+- [`ensurepip`](https://docs.python.org/3/library/ensurepip.html)
+- [`fcntl`](https://docs.python.org/3/library/fcntl.html)
+- [`grp`](https://docs.python.org/3/library/grp.html)
+- [`idlelib`](https://docs.python.org/3/library/idle.html)
+- [`mmap`](https://docs.python.org/3/library/mmap.html)
+- [`multiprocessing`](https://docs.python.org/3/library/multiprocessing.html)
+- [`pip`](https://pip.pypa.io/)
+- [`pwd`](https://docs.python.org/3/library/pwd.html)
+- [`readline`](https://docs.python.org/3/library/readline.html)
+- [`resource`](https://docs.python.org/3/library/resource.html)
+- [`syslog`](https://docs.python.org/3/library/syslog.html)
+- [`termios`](https://docs.python.org/3/library/termios.html)
+- [`tkinter`](https://docs.python.org/3/library/tkinter.html)
+- [`turtledemo`](https://docs.python.org/3/library/turtle.html#module-turtledemo)
+- [`venv`](https://docs.python.org/3/library/venv.html)
+- [`zlib`](https://docs.python.org/3/library/zlib.html)
+
+Some low-level modules like [`os`](https://docs.python.org/3/library/os.html) may not offer all methods, types, and constants.
+
+## Dependencies
+We do not bundle any additional libraries at the moment. It is currently NOT possible to install your own dependencies.
+
+## Methods
+Currently we only support [Scalar UDF]s. You can write one as a simple Python function:
+
+```python
+def add_one(x: int) -> int:
+    return x + 1
+```
+
+You may register multiple methods in one Python source text. Imported methods and private methods starting with `_` are ignored.
+
+## Types
+Types are mapped to/from [Apache Arrow] as follows:
+
+| Python | Arrow |
+| ------------ | ----------- |
+| [`bool`] | [`Boolean`] |
+| [`datetime`] | [`Timestamp`] w/ [`Microsecond`] and NO timezone |
+| [`float`] | [`Float64`] |
+| [`int`] | [`Int64`] |
+| [`str`] | [`Utf8`] |
+
+Additional types may be supported in the future.
+
+## NULLs
+NULLs are rather common in database contexts and a first-class citizen in [Apache Arrow] and [Apache DataFusion]. If you do not want to deal with them, just define your method with simple scalar types and we will skip NULL rows for you:
+
+```python
+def add_simple(x: int, y: int) -> int:
+    return x + y
+```
-Pros:
-- Supports loads of dependencies
-- Runs in the browser
+However, you can opt into full NULL handling. In Python, NULLs are expressed as optionals:
-Cons:
-- Doesn't seem to be working with freestanding WASM runtimes / servers, esp. not without Node.js
+```python
+def add_nulls(x: int | None, y: int | None) -> int | None:
+    if x is None or y is None:
+        return None
+    return x + y
+```
-#### Official CPython WASM Builds
-Links:
--
--
--
--
+or via the older syntax:
-Pros:
-- Official project, so it has a somewhat stable future and it is easier to get buy-in from the community
+```python
+from typing import Optional
-Cons:
-- Can only run as a WASI CLI-like app (so we would need to interact with it via stdio or a fake network)
-- Currently only offered as wasip1
+def add_old(x: Optional[int], y: Optional[int]) -> Optional[int]:
+    if x is None or y is None:
+        return None
+    return x + y
+```
-#### pyo3 + Official CPython WASM Builds
-Instead of using stdio to drive a Python interpreter, we use [pyo3].
+You may also partially opt into NULL handling for one parameter:
-Pros:
-- We can interact w/ Python more efficiently.
+```python
+def add_left(x: int | None, y: int) -> int | None:
+    if x is None:
+        return None
+    return x + y
-Cons:
-- Needs pre-released Python 3.14, because 3.13 seems to rely on "thread parking", which is implemented as WASM exceptions, which are not supported by wasmtime yet. Relevant code is .
+def add_right(x: int, y: int | None) -> int | None:
+    if y is None:
+        return None
+    return x + y
+```
-#### webassembly-language-runtimes
-Website:
+Note that if you define the return type as non-optional, you MUST NOT return `None`. Otherwise, the execution will fail.
-This was formally a VMWare project.
+To give you a better idea of when a Python method is called, consult this table:
-Cons:
-- Seems dead?
+| `x` | `y` | `add_simple` | `add_nulls` | `add_left` | `add_right` |
+| ------ | ------ | ------------ | ----------- | ---------- | ----------- |
+| `None` | `None` | 𐄂 | ✓ | 𐄂 | 𐄂 |
+| `None` | some | 𐄂 | ✓ | ✓ | 𐄂 |
+| some | `None` | 𐄂 | ✓ | 𐄂 | ✓ |
+| some | some | ✓ | ✓ | ✓ | ✓ |
-### Ahead-of-Time Compilation
-This is only going to work if
+You may find this feature helpful when you want to control default values for NULLs:
-- the ahead-of-time compiler itself is lightweight enough to be embedded within a database (esp. it should not call to some random C host toolchain)
-- the Python compiler/transpiler is solid and supports enough features
+```python
+def half(x: float | None) -> float:
+    # zero might be a sensible default
+    if x is None:
+        return 0.0
-#### componentize-py
-Website:
+    return x / 2.0
+```
+
+or if you want to turn a value into NULL:
-#### py2wasm
-Website:
+```python
+def add_one_limited(x: int) -> int | None:
+    # do not go beyond 100
+    if x >= 100:
+        return None
-### Other Notes
--
+    return x + 1
+```
+## Default Parameters and Kwargs
+Default parameters, `*args`, keyword-only parameters, and `**kwargs` are currently NOT supported. So these methods will be rejected:
+
+```python
+def m1(x: int = 1) -> int:
+    return x + 1
+
+def m2(*x: int) -> int:
+    return sum(x) + 1
+
+def m3(*, x: int) -> int:
+    return x + 1
+
+def m4(**x: int) -> int:
+    return sum(x.values()) + 1
+```
+
+## State
+We give no guarantees on the lifetime of the Python VM, but you may use state in your Python methods for performance reasons (e.g. to cache results):
+
+```python
+_cache = {}
+
+def compute(x: int) -> int:
+    try:
+        return _cache[x]
+    except KeyError:
+        # a missing dict key raises KeyError, so compute and store the value
+        y = x * 100
+        _cache[x] = y
+        return y
+```
+
+You may also use a built-in solution like [`functools.cache`]:
+
+```python
+from functools import cache
+
+@cache
+def compute(x: int) -> int:
+    return x * 100
+```
-[pyo3]: https://pyo3.rs/
+## I/O
+There is NO I/O available that escapes the sandbox. The [Python Standard Library] is mounted as a read-only filesystem.
+
+
+[^public]: Modules not starting with a `_`.
+
+[Apache Arrow]: https://arrow.apache.org/
+[Apache DataFusion]: https://datafusion.apache.org/
+[`bool`]: https://docs.python.org/3/library/stdtypes.html#boolean-type-bool
+[`Boolean`]: https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Boolean
+[`datetime`]: https://docs.python.org/3/library/datetime.html#datetime.datetime
+[`float`]: https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex
+[`Float64`]: https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Float64
+[`functools.cache`]: https://docs.python.org/3/library/functools.html#functools.cache
+[`int`]: https://docs.python.org/3/library/stdtypes.html#numeric-types-int-float-complex
+[`Int64`]: https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Int64
+[`Microsecond`]: https://docs.rs/arrow/latest/arrow/datatypes/enum.TimeUnit.html#variant.Microsecond
+[Python 3.14.0rc2]: https://www.python.org/downloads/release/python-3140rc2/
+[Python Standard Library]: https://docs.python.org/3/library/index.html
+[Scalar UDF]: https://docs.rs/datafusion/latest/datafusion/logical_expr/struct.ScalarUDF.html
+[`str`]: https://docs.python.org/3/library/stdtypes.html#text-sequence-type-str
+[`Timestamp`]: https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Timestamp
+[`Utf8`]: https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Utf8
diff --git a/host/tests/integration_tests/python/runtime/dependencies.rs b/host/tests/integration_tests/python/runtime/dependencies.rs
index c33ff66..ed984b2 100644
--- a/host/tests/integration_tests/python/runtime/dependencies.rs
+++ b/host/tests/integration_tests/python/runtime/dependencies.rs
@@ -42,3 +42,37 @@ def foo(x: int) -> int:
&Int64Array::from_iter([Some(12), Some(23), Some(34)]) as &dyn Array,
);
}
+
+#[tokio::test(flavor = "multi_thread")]
+async fn functools_cache() {
+    const CODE: &str = "
+from functools import cache
+
+_counter = 0
+
+@cache
+def foo(x: int) -> int:
+    global _counter
+    _counter += 1
+    return x + _counter
+";
+
+    let udf = python_scalar_udf(CODE).await.unwrap();
+    let array = udf
+        .invoke_with_args(ScalarFunctionArgs {
+            args: vec![ColumnarValue::Array(Arc::new(Int64Array::from_iter([
+                Some(10),
+                Some(20),
+                Some(10),
+            ])))],
+            arg_fields: vec![Arc::new(Field::new("a1", DataType::Int64, true))],
+            number_rows: 3,
+            return_field: Arc::new(Field::new("r", DataType::Int64, true)),
+        })
+        .unwrap()
+        .unwrap_array();
+    assert_eq!(
+        array.as_ref(),
+        &Int64Array::from_iter([Some(11), Some(22), Some(11)]) as &dyn Array,
+    );
+}