Extending Python with C Extension Modules

C-based modules are a common sight in Python today. Popular libraries such as NumPy, OpenCV, and PyTorch all have their core built on C or C++. This means that when we call a NumPy function, NumPy internally calls compiled C code, which executes the operation natively on the machine and returns the result wrapped in a Python object. But why do this? The simple answer is performance. Python is dynamically typed, so before the interpreter executes an operation it has to work out the types of the operands at run time. In code that repeats an operation many times, these checks add up to a significant share of the execution time.
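To get a rough feel for this overhead, here is a minimal sketch that times a hand-written Python loop against the built-in sum, which runs its loop in C. The helper python_sum and the list size are assumptions chosen for illustration, and the timings will vary from machine to machine.

```python
import timeit


def python_sum(values):
    """Add numbers one at a time in interpreted bytecode."""
    total = 0
    for value in values:
        total += value  # the operand types are checked on every iteration
    return total


values = list(range(100_000))

# Both approaches compute the same result; only where the loop runs differs.
assert python_sum(values) == sum(values)

loop_time = timeit.timeit(lambda: python_sum(values), number=20)
builtin_time = timeit.timeit(lambda: sum(values), number=20)
print(f"Python loop: {loop_time:.4f}s, built-in sum (C): {builtin_time:.4f}s")
```

On a typical CPython build the built-in version should finish noticeably faster, although the exact ratio depends on the interpreter version and hardware.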

C modules allow us to bypass these checks and execute native machine code directly from Python. There are multiple ways to link a C module to Python. One of them is the Python/C API that CPython provides.

To demonstrate how we can use this API, let us build a small greeting program in C and call it from Python. Our C program will take a name as input and return "Hello <name>" as the output.
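Before writing any C, it can help to pin down the expected behaviour in plain Python. The sketch below is only a reference for what the extension should do; the parameter name person and the explicit type check are assumptions added for illustration.

```python
def name(person: str) -> str:
    """Pure-Python reference for the greeting we are about to write in C."""
    if not isinstance(person, str):
        raise TypeError("name() expects a string")
    return "Hello " + person


print(name("Lezwon"))  # prints "Hello Lezwon"
```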

Building our extension

To integrate C with Python, we first need the Python development headers. On Debian or Ubuntu, they are provided by the python3-dev package.

sudo apt-get install python3-dev

Now create a directory called extension and create a file called greetmodule.c within it. Include the following headers in the file.

#include <Python.h>
#include <string.h>

The Python.h header is provided by CPython. It declares many types and functions, all prefixed with Py. The API lets us integrate with Python by exposing an interface to wrap native C types into supported Python types and to link the Python module to the C extension module.

Let's build our greeting function. Note that this function deals with PyObject, the C representation of a Python object instance. All other object types extend from this type. The function takes two arguments, self and args.

  • self represents the current object or module being referred to.
  • args points to the arguments that are passed by the Python caller.

static PyObject* name(PyObject *self, PyObject *args){
    const char *name;
    char greeting[255];

    /* Convert the single Python str argument to a C string. */
    if (!PyArg_ParseTuple(args, "s", &name)){
        return NULL;
    }

    /* snprintf truncates long names instead of overflowing the buffer. */
    snprintf(greeting, sizeof(greeting), "Hello %s", name);
    return Py_BuildValue("s", greeting);
}

The PyArg_ParseTuple function parses the arguments passed by the Python caller and converts them to native C values. The s in the second argument specifies that a single argument of string type is expected. If the arguments do not match, PyArg_ParseTuple sets a TypeError and returns false. In that case we stop and return NULL, which tells the interpreter to raise the exception in the calling Python code.
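From the Python side, a C function that sets an exception and returns NULL simply looks like a function that raised. As a small illustration, the sketch below passes a wrong argument type to math.sqrt, which is implemented in C in CPython. Note that math.sqrt is only a stand-in here; it may use a different argument parser internally than PyArg_ParseTuple.

```python
import math

try:
    math.sqrt("four")  # a str where the C code expects a number
except TypeError as error:
    caught = error
    print("TypeError raised by C code:", error)
else:
    caught = None
```

Our greet.name function should behave the same way when it is called with anything other than a single string.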

Py_BuildValue is the counterpart of PyArg_ParseTuple: it converts native C values to Python objects. We use it to wrap our C string in a Python object so that the value can be used within the interpreter.

Once we have defined our function, the next step is to attach it to a Python module so that it is exposed to the Python interpreter. To do this, we declare it within an array of type PyMethodDef.

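static PyMethodDef moduleMethods[] = {
    {"name", name, METH_VARARGS, "Greets with your name"},
    {NULL, NULL, 0, NULL}  /* sentinel marking the end of the table */
};

This array lists every method along with its details, in the form {<method name>, <function pointer>, <calling convention flags>, <docstring>}. The Python interpreter uses these details to call the function with the correct arguments. The array must end with an all-NULL sentinel entry so that the interpreter knows where the table stops. The flags, such as METH_VARARGS, are described in the Python/C API documentation.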
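To see how such a table entry surfaces in Python, we can inspect a function from a module that ships with CPython and is written in C. The sketch below uses math.sqrt purely as an example; the variable name func is an assumption.

```python
import math
import types

func = math.sqrt  # defined in C and registered through a method table

print(type(func).__name__)  # builtin_function_or_method
print(func.__name__)        # sqrt
print(func.__doc__)         # the docstring stored alongside the function

# Functions defined in C show up as this built-in function type.
assert isinstance(func, types.BuiltinFunctionType)
```

Once our module is built, greet.name should report its name and docstring in the same way.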

Now let's define our module using a PyModuleDef structure. The first and fourth fields usually stay the same and are not something to worry about here: the first is always PyModuleDef_HEAD_INIT, and the fourth is the size of the per-module state, where -1 means the module keeps its state in global variables. The second field specifies the name of the module (__name__), the third the docstring (__doc__), and the last passes the moduleMethods array we defined earlier.

static struct PyModuleDef greetModule = {
    PyModuleDef_HEAD_INIT,
    "greet",
    "Greetings Module",
    -1,
    moduleMethods
};

Our last step is to create a module using the PyModule_Create function. We call this function within the PyInit_greet function. Note that the function name should strictly be in the format of PyInit_<module name>. Python will look for this function when we call import greet.

PyMODINIT_FUNC PyInit_greet(void){
    return PyModule_Create(&greetModule);
}

Now that we have created our module object, we can move on to use it within Python. Create a main.py file within the extension directory, add the following code, and run it using python main.py.

import greet

print("Name: ", greet.__name__)
print("Docstring: ", greet.__doc__)
print("Greeting: ", greet.name("Lezwon"))

You will notice an error that says ModuleNotFoundError: No module named 'greet'. This is because our C module has not been built yet and hence is not available to Python. We need to compile it into a shared object (.so) file so that Python can import it.
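If we would rather check for the module without triggering the error, importlib can tell us whether Python is able to locate it. This is a minimal sketch; the helper name is_importable is an assumption.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the import system can locate the named module."""
    return importlib.util.find_spec(module_name) is not None


# Expected to be False until the extension has been built in this directory.
print(is_importable("greet"))
```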

Compiling our extension

To create the shared object file we shall make use of setuptools, a Python package that automates building extension modules and linking them to Python. Install it using the following command.

pip install setuptools

Now create a setup.py file and add the following code within it.

from setuptools import setup, Extension

ext_modules = [
    Extension('greet', sources = ['greetmodule.c']),
]

setup(
    name = 'Greeting Project',
    ext_modules = ext_modules
)

Within this code, we define a Python extension greet and point it to the C source file. We then pass our extension module to setup which compiles and provides the greet module within our project.

Let's compile our code with the following command.

python setup.py build_ext --inplace

You will now notice that a file such as greet.cpython-38-x86_64-linux-gnu.so appears in your directory; the exact name depends on your Python version and platform. This is a shared object file that the interpreter loads at runtime when we import it. It contains the greet module that we defined in C.
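The suffix in that file name comes from the interpreter itself. As a rough sketch, we can ask Python which suffix it uses for extension modules and predict the file name for our build. The variable names are assumptions, and the printed values differ across platforms and Python versions.

```python
import importlib.machinery
import sysconfig

# Every suffix this interpreter accepts for extension modules.
print(importlib.machinery.EXTENSION_SUFFIXES)

# The suffix used when building; fall back to the first accepted suffix
# in case the configuration variable is not set on this platform.
ext_suffix = (
    sysconfig.get_config_var("EXT_SUFFIX")
    or importlib.machinery.EXTENSION_SUFFIXES[0]
)
expected_file = "greet" + ext_suffix
print(expected_file)
```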

Running our program

With the .so file available, run the main.py file again. You will notice the following output.

>> python main.py
Name:  greet
Docstring:  Greetings Module
Greeting:  Hello Lezwon

We see the module name, docstring, and greeting string printed in our console. We can verify that the imported module is the C extension by using the inspect module.

import inspect
inspect.getfile(greet)
# -> '/workspaces/extension/greet.cpython-38-x86_64-linux-gnu.so'
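Another way to check, sketched below, is to look at the loader that imported the module. In CPython, modules loaded from a shared object file use ExtensionFileLoader. The helper name is_extension_module is an assumption, and json is used only as a pure-Python module to compare against.

```python
import importlib.machinery
import json


def is_extension_module(module) -> bool:
    """Return True if the module was loaded from a compiled extension file."""
    spec = getattr(module, "__spec__", None)
    loader = getattr(spec, "loader", None)
    return isinstance(loader, importlib.machinery.ExtensionFileLoader)


print(is_extension_module(json))  # False: json is written in Python
```

Calling is_extension_module(greet) after the build should return True.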

Note that shared object files take precedence over Python source files during import. This means that if there is a Python file called greet.py in the same directory, Python would skip it and import the greet.cpython-38-x86_64-linux-gnu.so file instead. Refer to this document for more information.
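We can observe this lookup order without compiling anything. The sketch below places a .py file and an empty placeholder with an extension suffix side by side in a temporary directory and asks the import system which one it would pick. The module name greet_demo and the helper winning_file are assumptions, and the result reflects CPython's current behaviour rather than a language guarantee.

```python
import importlib.machinery
import pathlib
import sys
import tempfile


def winning_file(module_name: str = "greet_demo") -> str:
    """Return the file name the import system selects for module_name."""
    ext_suffix = importlib.machinery.EXTENSION_SUFFIXES[0]
    with tempfile.TemporaryDirectory() as directory:
        folder = pathlib.Path(directory)
        (folder / f"{module_name}.py").write_text("# pure-Python candidate\n")
        # Empty placeholder: find_spec only locates files, it never loads them.
        (folder / f"{module_name}{ext_suffix}").write_bytes(b"")
        spec = importlib.machinery.PathFinder.find_spec(module_name, [directory])
        # Drop the cached finder for the directory we are about to delete.
        sys.path_importer_cache.pop(directory, None)
        return pathlib.Path(spec.origin).name


print(winning_file())
```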

We have now successfully created a C extension module for Python. Using this method, we could create a Python wrapper for any C library we need. Many of the popular scientific computing libraries use this approach to link their Python packages to their highly performant C counterparts. C extensions are not constrained by the run-time type checks of the CPython interpreter and hence can provide a significant boost to operations that involve a large number of iterations and calculations. Tools such as Cython, Numba, ctypes, cffi, and pybind11 make the task of running C code from Python easier. If you are curious about performance in Python, I suggest you have a look at them.
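As a small taste of one of these tools, here is a minimal ctypes sketch that calls a C function without writing any extension code. It assumes a standard CPython build and uses ctypes.pythonapi to reach Py_GetVersion, a function from the same C API we used above; the variable names are assumptions.

```python
import ctypes
import sys

get_version = ctypes.pythonapi.Py_GetVersion
get_version.restype = ctypes.c_char_p  # the C function returns const char *
get_version.argtypes = []              # and takes no arguments

version_from_c = get_version().decode("utf-8")
print(version_from_c)
print(version_from_c == sys.version)
```

The trade-off is that ctypes performs the type conversions at run time on every call, so it suits wrapping existing libraries more than speeding up tight loops.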

The code for this post is also available as a gist on GitHub.



Lezwon Castelino

Freelancer | Open Source Contributor | Ex- @PyTorchLightnin Core ⚡ | Solutions Hacker | 20+ Hackathons