Embedding Python in Rust (for tests)

January 22, 2025

The latest generation of programming languages (Rust, Go, Zig) comes bundled not just with a standard library but with a suite of first-party tools for working with the code itself (e.g. cargo fmt, gofmt, zig fmt). But I suspect that some future generation of (statically typed) programming languages will also come with a first-party embedded scripting language to make it easier to write tests. Until then though, third-party embedded scripting languages can be convenient.

PyO3 does the heavy lifting for embedding Python in Rust. And its docs are pretty good. But it took me a little while to pull together all of the pieces for a few common things I'd want to do. So in this post we'll build a little test runner written in Rust that exposes some Rust functions to tests written in Python. And we'll embed the Python interpreter itself in our Rust test runner.

The code for this post is available on GitHub.

Building Python from source

Note: You don't have to build Python from source. Your system Python is fine if you just want to try PyO3 out.

Unlike some bindings libraries, PyO3 seems to have no option to ship builds of Python itself alongside the bindings. We are at the whim of each developer's local version of Python. That is, what builds on one machine with one local version of Python might not build on another machine with another local version of Python.

If we want to be able to control the exact build of Python we are embedding, we can build Python from scratch. (Another good reason to build from scratch is to try out --disable-gil. I became curious about this while trying out PyO3.) Let's go.

$ git clone https://github.com/python/cpython
$ cd cpython
$ git checkout v3.13.1

Let's disable _tkinter and _gdbm when we build (since the build with them was not working for me and I did not want to spend more time on this).

diff --git a/Modules/Setup b/Modules/Setup
index e4acf6bc7de..b2e4543bcf2 100644
--- a/Modules/Setup
+++ b/Modules/Setup
@@ -296,3 +296,6 @@ PYTHONPATH=$(COREPYTHONPATH)
#
# _sqlite3 _tkinter _curses pyexpat
# _codecs_jp _codecs_kr _codecs_tw unicodedata
+
+*disabled*
+_tkinter _gdbm

If you see more missing modules when you later run make -j16 to build the code, you can add them to this *disabled* list.

Now run the configure script and tell it to install to a local directory instead of a global one.

$ ./configure --prefix=$(pwd)/build

(Here is where you would pass --disable-gil if you wanted to: ./configure --disable-gil --prefix=$(pwd)/build.)

Then build and install.

$ make -j16
$ make install

And confirm the Python binary.

$ ./build/bin/python3.13 --version
Python 3.13.1

Now that we've got a controlled version of Python, let's move on to a Rust project that embeds it.

Tests and test runner boilerplate

Create a new Rust project and add pyo3 and pyo3_ffi as dependencies.

$ mkdir rust-python-test-runner
$ cd rust-python-test-runner
$ cargo init
$ cargo add pyo3 pyo3_ffi

Now let's create some Python test files in a tests/ directory.

First, tests/basic.py that simply prints out a message.

print("Ok :) ")

Now let's do one with Python's builtin threading module, in tests/threads.py.

import threading
def client1():
   print("client 1 here")
def client2():
   print("client 2 here")
ts = [threading.Thread(target=f) for f in [client1, client2]]
for t in ts:
   t.start()
for t in ts:
   t.join()

(If you want to fork bomb the test runner, try using Python's builtin multiprocessing library instead.)

Now in our Rust test runner (src/main.rs) let's first find all of these test files.

use std::fs;
use pyo3::prelude::*;
fn main() -> PyResult<()> {
   let test_dir = concat!(env!("CARGO_MANIFEST_DIR"), "/tests");
   let tests = fs::read_dir(test_dir).unwrap();
   for test in tests {
       let test = test.unwrap();
       let filename = test.path().into_os_string().into_string().unwrap();
       if !filename.ends_with(".py") {
           continue;
       }
       println!("[STARTED]: {}", filename.clone());
       // TODO: Run tests.
       println!("[PASSED]: {}", filename.clone());
   }
   Ok(())
}
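
As an aside, fs::read_dir makes no promise about the order of entries, and into_string().unwrap() will panic on a path that is not valid UTF-8. Here is a minimal sketch of a discovery helper that avoids both. The discover_tests name is made up for illustration and is not part of the code for this post; it uses only the standard library.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect every `.py` file directly inside `dir`, sorted by path so the
/// run order is the same on every machine.
/// (`discover_tests` is a hypothetical helper, not from the original code.)
fn discover_tests(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut tests = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Compare the extension on the path itself, so we never have to
        // convert the whole path into a UTF-8 String first.
        if path.is_file() && path.extension().map_or(false, |ext| ext == "py") {
            tests.push(path);
        }
    }
    // read_dir order is platform and filesystem dependent; sort explicitly.
    tests.sort();
    Ok(tests)
}
```

In the runner, the loop could then look something like for path in discover_tests(Path::new(test_dir)).unwrap(), in keeping with the unwrap style above.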

To get PyO3 to use our custom build of Python when we build the Rust code, we need to set the PYO3_PYTHON environment variable to the Python binary we built.

$ PYO3_PYTHON=$HOME/vendor/cpython/build/bin/python3.13 cargo run
  Compiling pyo3-build-config v0.23.4
  Compiling pyo3-macros-backend v0.23.4
  Compiling pyo3-ffi v0.23.4
  Compiling pyo3 v0.23.4
  Compiling pyo3-macros v0.23.4
  Compiling rust-python-test-runner v0.1.0 (/Users/phil/tmp/rust-python-test-runner)
   Finished `dev` profile [unoptimized + debuginfo] target(s) in 3.84s
    Running `target/debug/rust-python-test-runner`
[STARTED]: /Users/phil/tmp/rust-python-test-runner/tests/threads.py
[PASSED]: /Users/phil/tmp/rust-python-test-runner/tests/threads.py
[STARTED]: /Users/phil/tmp/rust-python-test-runner/tests/basic.py
[PASSED]: /Users/phil/tmp/rust-python-test-runner/tests/basic.py

Now let's actually run the Python test code.

Running Python code with PyO3

Before we can run any Python code we need to initialize the interpreter by calling pyo3::prepare_freethreaded_python.

diff --git a/src/main.rs b/src/main.rs
index 6cfd1cd..b456ea3 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -3,6 +3,8 @@ use std::fs;
use pyo3::prelude::*;

fn main() -> PyResult<()> {
+    pyo3::prepare_freethreaded_python();
+
    let test_dir = concat!(env!("CARGO_MANIFEST_DIR"), "/tests");
    let tests = fs::read_dir(test_dir).unwrap();
    for test in tests {

Then we can acquire the GIL with Python::with_gil and then run our source code with the Python object we get within the callback.

diff --git a/src/main.rs b/src/main.rs
index b456ea3..1200070 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -1,3 +1,4 @@
+use std::ffi::CString;
use std::fs;

use pyo3::prelude::*;
@@ -16,7 +17,24 @@ fn main() -> PyResult<()> {

        println!("[STARTED]: {}", filename.clone());

-        // TODO: Run tests.
+        let module_name = test
+            .path()
+            .file_stem()
+            .unwrap()
+            .to_os_string()
+            .into_string()
+            .unwrap();
+        let source = fs::read_to_string(filename.clone())?;
+
+        Python::with_gil(|py| -> PyResult<()> {
+            PyModule::from_code(
+                py,
+                &CString::new(source.as_str())?.as_c_str(),
+                &CString::new(filename.as_str())?.as_c_str(),
+                &CString::new(module_name.as_str())?.as_c_str(),
+            )?;
+            Ok(())
+        })?;

        println!("[PASSED]: {}", filename.clone());
    }

The only tedious bit is converting all our Rust strings to C strings Python is happy with.
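
CString::new returns an error when its input contains an interior NUL byte, which is why each conversion in the diff above ends in ?. If the repetition is bothersome, a small helper can do all three conversions in one place. This is just a sketch; to_c_strings is a hypothetical name, not something from PyO3 or from the code for this post.

```rust
use std::ffi::{CString, NulError};

/// Convert the three strings the runner needs into C strings in one place.
/// Returns an error if any of them contains an interior NUL byte.
/// (`to_c_strings` is a hypothetical helper for illustration.)
fn to_c_strings(
    source: &str,
    filename: &str,
    module_name: &str,
) -> Result<(CString, CString, CString), NulError> {
    Ok((
        CString::new(source)?,
        CString::new(filename)?,
        CString::new(module_name)?,
    ))
}
```

Each element of the returned tuple can then be passed along with as_c_str(), the same way the diff above does it inline.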

Let's build and run.

$ PYO3_PYTHON=$HOME/vendor/cpython/build/bin/python3.13 cargo run
  Compiling rust-python-test-runner v0.1.0 (/Users/phil/tmp/rust-python-test-runner)
   Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.18s
    Running `target/debug/rust-python-test-runner`
[STARTED]: /Users/phil/tmp/rust-python-test-runner/tests/threads.py
client 1 here
client 2 here
[PASSED]: /Users/phil/tmp/rust-python-test-runner/tests/threads.py
[STARTED]: /Users/phil/tmp/rust-python-test-runner/tests/basic.py
Ok :)
[PASSED]: /Users/phil/tmp/rust-python-test-runner/tests/basic.py

Pretty easy! Recall that tests/threads.py even imports builtin modules.

import threading
def client1():
   print("client 1 here")
def client2():
   print("client 2 here")
ts = [threading.Thread(target=f) for f in [client1, client2]]
for t in ts:
   t.start()
for t in ts:
   t.join()

Let's go one level deeper and expose Rust functions to our Python test code.

Calling Rust from Python

To call Rust code from Python we need to expose a Python module object that our Python test code can import. And then we need to attach Rust functions to that Python module object.

I was pleasantly surprised that when you want to expose a Rust function to Python, you don't need to write the function in terms of CPython objects (i.e. CPython argument types and CPython return types). You can write a Rust function with Rust types and the pyfunction macro handles conversion between Rust and CPython argument types and return types.

For example from the PyO3 user docs:

use pyo3::prelude::*;
use pyo3::ffi::c_str;

#[pyfunction]
fn add_one(x: i64) -> i64 {
   x + 1
}

#[pymodule]
fn foo(foo_module: &Bound<'_, PyModule>) -> PyResult<()> {
   foo_module.add_function(wrap_pyfunction!(add_one, foo_module)?)?;
   Ok(())
}

fn main() -> PyResult<()> {
   pyo3::append_to_inittab!(foo);
   Python::with_gil(|py| Python::run(py, c_str!("import foo; foo.add_one(6)"), None, None))
}

I was expecting we'd have to do something like:

fn add_one(x: PyObject) -> PyObject {
 let i = x.to_int();
  let res = i + 1;
 return res.to_pyobject();
}

It's great that we don't! Even for more complex types like arrays and hash tables.

Complex types

Let's say we want to expose a Rust interface for running a SQL query to our Python tests. We should be able to accept a query string and return a list of results.

The exact details of this function do not matter, so we will ignore them and simply return a literal list of results.

#[pyfunction]
fn sql(_sql_string: String) -> Vec<HashMap<String, String>> {
   vec![
       HashMap::from([
           ("name".to_string(), "Terry".to_string()),
           ("id".to_string(), "1".to_string()),
       ]),
       HashMap::from([
           ("name".to_string(), "Bina".to_string()),
           ("id".to_string(), "2".to_string()),
       ]),
   ]
}

We need to both declare this function and make it available in a Python module.

diff --git a/src/main.rs b/src/main.rs
index 1200070..63f006a 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -1,9 +1,30 @@
+use std::collections::HashMap;
use std::ffi::CString;
use std::fs;

use pyo3::prelude::*;

+#[pyfunction]
+fn sql(_sql_string: String) -> Vec<HashMap<String, String>> {
+    vec![
+        HashMap::from([
+            ("name".to_string(), "Terry".to_string()),
+            ("id".to_string(), "1".to_string()),
+        ]),
+        HashMap::from([
+            ("name".to_string(), "Bina".to_string()),
+            ("id".to_string(), "2".to_string()),
+        ]),
+    ]
+}
+
+#[pymodule]
+fn testrunner(testrunner_module: &Bound<'_, PyModule>) -> PyResult<()> {
+    testrunner_module.add_function(wrap_pyfunction!(sql, testrunner_module)?)
+}
+
fn main() -> PyResult<()> {
+    pyo3::append_to_inittab!(testrunner);
    pyo3::prepare_freethreaded_python();

    let test_dir = concat!(env!("CARGO_MANIFEST_DIR"), "/tests");
@@ -15,6 +36,10 @@ fn main() -> PyResult<()> {
            continue;
        }

+        if std::env::var("RUN_TEST").is_ok_and(|t| !filename.ends_with(&t)) {
+            continue;
+        }
+
        println!("[STARTED]: {}", filename.clone());

        let module_name = test

PyO3 does the rest.
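
One detail worth knowing about a filter like RUN_TEST: Rust's env! macro reads a variable at compile time and fails the build when the variable is unset, while std::env::var reads it at run time. Here is a minimal sketch of a run-time filter. The should_run and run_test_filter names are hypothetical and only the standard library is used.

```rust
use std::env;

/// Decide whether a test file should run. An unset or empty filter runs
/// everything; otherwise the filename must end with the filter.
/// (`should_run` is a hypothetical helper for illustration.)
fn should_run(filename: &str, filter: Option<&str>) -> bool {
    match filter {
        Some(f) if !f.is_empty() => filename.ends_with(f),
        _ => true,
    }
}

/// Read the filter from the RUN_TEST environment variable at run time.
/// Returns None when the variable is unset (or not valid Unicode).
fn run_test_filter() -> Option<String> {
    env::var("RUN_TEST").ok()
}
```

Inside the loop this might be used as if !should_run(&filename, run_test_filter().as_deref()) { continue; }, so that running without RUN_TEST set still executes every test.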

Now a test, in tests/sql.py.

import testrunner
print(testrunner.sql("SELECT * FROM users"))

Build and execute the runner (this time we'll filter only the sql.py test by setting RUN_TEST=sql.py).

$ RUN_TEST=sql.py PYO3_PYTHON=$HOME/vendor/cpython/build/bin/python3.13 cargo run
  Compiling rust-python-test-runner v0.1.0 (/Users/phil/tmp/rust-python-test-runner)
   Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.27s
    Running `target/debug/rust-python-test-runner`
[STARTED]: /Users/phil/tmp/rust-python-test-runner/tests/sql.py
[{'name': 'Terry', 'id': '1'}, {'name': 'Bina', 'id': '2'}]
[PASSED]: /Users/phil/tmp/rust-python-test-runner/tests/sql.py

Great!

But we oversimplified by having the query results be an array of string to string hashmaps. In reality, it should be an array of string to whatever hashmaps because the database will return dynamically typed results, different for each query. This would be tedious to express in Rust, but Python should be fine with it.
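
To get a feel for the Rust-only version, here is a rough sketch of rows that map column names to a hand-written enum. Value, Row, and example_rows are hypothetical names for illustration, and nothing here is wired up to PyO3; it only shows the shape of the data and how every new column type would need its own variant.

```rust
use std::collections::HashMap;

/// A hypothetical stand-in for "whatever a database column can hold".
/// A real database would need many more variants than these two.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// One result row: column name to dynamically typed value.
type Row = HashMap<String, Value>;

/// The same two literal rows as before, but with integer ids.
fn example_rows() -> Vec<Row> {
    vec![
        HashMap::from([
            ("name".to_string(), Value::Text("Terry".to_string())),
            ("id".to_string(), Value::Int(1)),
        ]),
        HashMap::from([
            ("name".to_string(), Value::Text("Bina".to_string())),
            ("id".to_string(), Value::Int(2)),
        ]),
    ]
}
```

Every consumer of a Row then has to match on Value before using it, which is the tedium that handing the values to Python avoids.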

Dynamic types

We got this far without explicit conversions between Rust and Python data types. But if we want to be able to return dynamic types to Python from a Rust function we can use Python data types directly.

Specifically, for a function like the following, we want to return the PyObject type because a Python object can be a string or an integer.

#[pyfunction]
fn different_return_types_sometimes() -> PyObject {
 if some_condition {
   // return a string
 } else {
   // return an integer
 }
}

We can convert most basic Rust types into a Python object that is not type-checked using IntoPyObjectExt::into_py_any. But that method requires access to the Python interpreter (which we can get with Python::with_gil). This is likely because we are now attempting to construct Python objects ourselves; objects whose memory is managed by the Python interpreter. Whereas before this, we were only trying to create Rust objects and the conversion to Python objects could happen at some later point.

#[pyfunction]
fn different_return_types_sometimes() -> PyResult<PyObject> {
   Python::with_gil(|py| -> PyResult<PyObject> {
       if true {
           "hey".to_string().into_py_any(py)
       } else {
           1_i32.into_py_any(py)
       }
   })
}

So let's apply this to our SQL method and have it return values of dynamic types.

diff --git a/src/main.rs b/src/main.rs
index bbc1125..153dc6f 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -3,19 +3,23 @@ use std::ffi::CString;
use std::fs;

use pyo3::prelude::*;
+use pyo3::IntoPyObjectExt;

#[pyfunction]
-fn sql(_sql_string: String) -> Vec<HashMap<String, String>> {
-    vec![
-        HashMap::from([
-            ("name".to_string(), "Terry".to_string()),
-            ("id".to_string(), "1".to_string()),
-        ]),
-        HashMap::from([
-            ("name".to_string(), "Bina".to_string()),
-            ("id".to_string(), "2".to_string()),
-        ]),
-    ]
+fn sql(_sql_string: String) -> PyResult<PyObject> {
+    Python::with_gil(|py| -> PyResult<PyObject> {
+        vec![
+            HashMap::from([
+                ("name".to_string(), "Terry".into_py_any(py)?),
+                ("id".to_string(), 1_i32.into_py_any(py)?),
+            ]),
+            HashMap::from([
+                ("name".to_string(), "Bina".into_py_any(py)?),
+                ("id".to_string(), 2_i32.into_py_any(py)?),
+            ]),
+        ]
+        .into_py_any(py)
+    })
}

#[pymodule]

And build and test it:

$ RUN_TEST=sql.py PYO3_PYTHON=$HOME/vendor/cpython/build/bin/python3.13 cargo run
  Compiling rust-python-test-runner v0.1.0 (/Users/phil/tmp/rust-python-test-runner)
   Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.29s
    Running `target/debug/rust-python-test-runner`
[STARTED]: /Users/phil/tmp/rust-python-test-runner/tests/sql.py
[{'name': 'Terry', 'id': 1}, {'name': 'Bina', 'id': 2}]
[PASSED]: /Users/phil/tmp/rust-python-test-runner/tests/sql.py

Look carefully: the id values are integers this time, not strings. The name values are still strings. Dynamic types.

Other embedded options in Rust

Rhai (a new embedded language written in Rust), mlua (Rust bindings to Lua), and rusty_v8 (Rust bindings to V8) seem compelling, but they do not have support for parallelism within the scripting language itself. Parallelism would be particularly convenient for something like tests that need to do multiple things at the same time. Lua has coroutines and JavaScript has async/await but both can be more tedious and error-prone than threads and blocking IO. For tests you'd ideally write the simplest code you can, but no simpler.
