Thursday, September 10, 2020

Python Extension Programming using C



Hello dear readers! Welcome back to another section of my tutorial on Python. In this tutorial post, we are going to discuss Python extension programming using C.

Code written in a compiled language such as C or C++ can be integrated into a Python script. Such code is called an "extension".

A Python extension module is nothing more than a normal C library. On Unix systems, these libraries usually end in .so (for shared object). On Windows machines, you typically see .pyd, which is a DLL (dynamically linked library) with a Python-specific file extension.


Pre-Requisites for Writing Extensions

To start writing your extension, you will need Python header files.

  • On Unix systems, this normally requires installing a developer-specific package such as python2.5-dev (or python3-dev for Python 3).
  • Windows users get these headers as part of the package when they use the binary Python installer.

Additionally, it is assumed that you have a good knowledge of C or C++, since that is the language you will write the extension in.


Structure of Python Extension Module

A Python extension module will have the following parts -

  • A header file Python.h.
  • The C functions you want to expose as the interface from your module.
  • A table mapping the names of your functions, as Python developers see them, to the C functions inside the extension module.
  • An initialization function.


The Header File Python.h

You need to include the Python.h header file in your C source file. It gives you access to the internal Python API that is used to hook your module into the interpreter.

Make sure to include Python.h before any other headers you might need. After the includes come all the functions that you want to call from Python.

The C Functions

The signatures of the C implementations of your functions always take one of the three forms below -

static PyObject *MyFunction(PyObject *self, PyObject *args);

static PyObject *MyFunctionWithKeywords(PyObject *self,
                                        PyObject *args,
                                        PyObject *kw);

static PyObject *MyFunctionWithNoArgs(PyObject *self, PyObject *unused);

Each of the preceding declarations returns a Python object. There is no such thing as a void function in Python as there is in C. If you do not want your function to return a value, return the C equivalent of Python's None value. The Python headers define a macro, Py_RETURN_NONE, that does this for us. In the third form, the second parameter is always NULL and is simply ignored.

The names of your C functions can be whatever you like, because they are never seen outside of the extension module. They are defined as static functions.

By convention, C functions are named by combining the Python module and function names. Below is an example -

static PyObject *module_func(PyObject *self, PyObject *args) {
   /* Do your stuff here. */
   Py_RETURN_NONE;
}

This is a Python function called func inside the module named module. You will put pointers to your C functions into the method table for the module, which usually comes next in your source code.

The Method Mapping Table

The method mapping table is a simple array of PyMethodDef structures. The structure looks something like this -

struct PyMethodDef {
   char *ml_name;
   PyCFunction ml_meth;
   int ml_flags;
   char *ml_doc;
};

Following is the description of the members of this structure -

  • ml_name - The name of the function as it appears to Python programs.
  • ml_meth - The address of a function that has any one of the signatures described above.
  • ml_flags - Tells the interpreter which of the three signatures ml_meth is using.
    • This flag usually has a value of METH_VARARGS.
    • It can be bitwise OR'ed with METH_KEYWORDS if you want to allow keyword arguments in your function.
    • It can instead have a value of METH_NOARGS, which indicates that the function accepts no arguments.
  • ml_doc - The docstring for the function, which can be NULL if you do not feel like writing one.

The table must be terminated with a sentinel entry that consists of NULL and zero values for the appropriate members.

Example

For the function defined above, we have the following method mapping table -

static PyMethodDef module_methods[] = {
   { "func", (PyCFunction)module_func, METH_NOARGS, NULL },
   { NULL, NULL, 0, NULL }
};
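To see why the sentinel entry matters, here is a small standalone sketch. It does not use the real Python headers: DemoMethodDef, count_methods and demo_methods are hypothetical stand-ins made up for illustration, with only the name and flags members. The point is that whoever reads the table has no length to go by, so it walks the entries until it finds a NULL name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyMethodDef, so that this sketch
   compiles without the Python headers. */
typedef struct {
    const char *name;   /* plays the role of ml_name */
    int flags;          /* plays the role of ml_flags */
} DemoMethodDef;

/* Count the entries by walking the array until the sentinel
   (the entry whose name is NULL) is reached. */
static size_t count_methods(const DemoMethodDef *table) {
    size_t n = 0;
    assert(table != NULL);
    while (table[n].name != NULL) {
        n++;
    }
    return n;
}

static const DemoMethodDef demo_methods[] = {
    { "func", 0 },
    { "other_func", 0 },
    { NULL, 0 }          /* sentinel: marks the end of the table */
};
```

Without the last entry, the loop would run past the end of the array, which is why forgetting the sentinel in a real method table tends to crash at import time.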

The Initialization Function

The last section of your extension module is the initialization function. This function is called by the Python interpreter when the module is loaded. In Python 2, whose C API this post uses, the function must be named initModule, where Module is the name of the module. (Python 3 looks for PyInit_Module instead.)

The initialization function needs to be exported from the library that you will be building. The Python headers define PyMODINIT_FUNC to include the proper incantations in order for that to happen for the particular environment that we are compiling. All you have to do is use it when you are defining the function.

Your C initialization function has the following overall structure -

PyMODINIT_FUNC initModule(void) {
   Py_InitModule3("Module", module_methods, "docstring...");
}

Following is the description of the arguments of the Py_InitModule3 function -

  • "Module" - The name of the module, as a C string.
  • module_methods - The name of the method mapping table defined above.
  • docstring - The docstring you want to give your module.

Putting this all together, the module looks like this -

#include <Python.h>

static PyObject *module_func(PyObject *self, PyObject *args) {
   /* Do your stuff here. */
   Py_RETURN_NONE;
}

static PyMethodDef module_methods[] = {
   { "func", (PyCFunction)module_func, METH_NOARGS, NULL },
   { NULL, NULL, 0, NULL }
};

PyMODINIT_FUNC initModule(void) {
   Py_InitModule3("Module", module_methods, "docstring...");
}

Example

Below is a simple example that makes use of all the above concepts -

#include <Python.h>

/* A METH_NOARGS function still receives a second argument; it is unused. */
static PyObject* helloworld(PyObject* self, PyObject* unused) {
   return Py_BuildValue("s", "Hello, Python extensions!!");
}

static char helloworld_docs[] =
   "helloworld( ): Any message you want to put here!!\n";

static PyMethodDef helloworld_funcs[] = {
   {"helloworld", (PyCFunction)helloworld,
      METH_NOARGS, helloworld_docs},
   {NULL, NULL, 0, NULL}   /* sentinel */
};

/* PyMODINIT_FUNC exports the function so the interpreter can find it. */
PyMODINIT_FUNC inithelloworld(void) {
   Py_InitModule3("helloworld", helloworld_funcs,
                  "Extension module example!");
}

In the above example, the Py_BuildValue function is used to build a Python value. Save the above code in a file named hello.c. Next, we will see how to compile and install this module so that it can be called from a Python script.


Building and Installing Extensions

The distutils package (and setuptools, which offers the same setup and Extension interface and is the usual choice today) makes it simple to distribute Python modules, both pure Python and extension modules, in a standard way. Modules are distributed in source form, then built and installed via a setup script usually called setup.py.

For the above module, you need the following setup.py script -

from setuptools import setup, Extension

setup(name='helloworld', version='1.0',
      ext_modules=[Extension('helloworld', ['hello.c'])])

Now, use the following command. It performs all the needed compilation and linking steps, with the right compiler and linker commands and flags, and then copies the resulting dynamic library into an appropriate directory -

$ python setup.py install

On Unix-based machines, you will most likely need to run this command as root in order to have permission to write to the system site-packages directory, unless you are installing into a virtual environment. This is usually not a problem on Windows.


Importing Extensions

Once you have installed your extension, you can import and call it in your Python script as follows -

#!/usr/bin/python
import helloworld

print(helloworld.helloworld())

Output

When the above code is executed, it will produce the following result -

Hello, Python extensions!!

Passing Function Parameters

As you will most likely want to define functions that accept arguments, you can use one of the other signatures for your C functions. For example, a function that accepts a number of parameters would be defined like this -

static PyObject *module_func(PyObject *self, PyObject *args) {
   /* Parse args and do something interesting here. */
   Py_RETURN_NONE;
}

The method table holding an entry for the new function would look like this -

static PyMethodDef module_methods[] = {
   { "func", module_func, METH_VARARGS, NULL },
   { NULL, NULL, 0, NULL }
};

You can use the PyArg_ParseTuple API function to extract the arguments from the single PyObject pointer that is passed into your C function.

The first argument to PyArg_ParseTuple is the args argument. It is the object you will parse. The second argument is a format string that describes the arguments as you expect them to appear. Each argument is represented by one or more characters in the format string. Below is an example -

static PyObject *module_func(PyObject *self, PyObject *args) {
   int i;
   double d;
   char *s;

   if (!PyArg_ParseTuple(args, "ids", &i, &d, &s)) {
      return NULL;
   }
   
   /* Do something interesting here. */
   Py_RETURN_NONE;
}

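Compiling the new version of your module and importing it enables you to invoke the new function with an int, a float, and a string, passed positionally in that order -

module.func(1, 2.0, "three")

Calling it with the wrong number or types of arguments raises a TypeError. Keyword arguments such as module.func(i=1, d=2.0, s="three") are not accepted by this version; they require the METH_KEYWORDS flag and the PyArg_ParseTupleAndKeywords function.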

The PyArg_ParseTuple Function
Here is a standard signature for the PyArg_ParseTuple function -

int PyArg_ParseTuple(PyObject* tuple,char* format,...)

This function returns 0 on error and a nonzero value on success. Here tuple is the PyObject* that was the C function's second argument, and format is a C string that describes the mandatory and optional arguments.


Below is the list of format codes for the PyArg_ParseTuple function -

Code     C type            Meaning
c        char              A Python string of length 1 becomes a C char.
d        double            A Python float becomes a C double.
f        float             A Python float becomes a C float.
i        int               A Python int becomes a C int.
l        long              A Python int becomes a C long.
L        long long         A Python int becomes a C long long.
O        PyObject*         Gets non-NULL borrowed reference to Python argument.
s        char*             Python string without embedded nulls to C char*.
s#       char*+int         Any Python string to C address and length.
t#       char*+int         Read-only single-segment buffer to C address and length.
u        Py_UNICODE*       Python Unicode without embedded nulls to C.
u#       Py_UNICODE*+int   Any Python Unicode to C address and length.
w#       char*+int         Read/write single-segment buffer to C address and length.
z        char*             Like s, also accepts None (sets C char* to NULL).
z#       char*+int         Like s#, also accepts None (sets C char* to NULL).
(...)    as per ...        A Python sequence is treated as one argument per item.
|                          The following arguments are optional.
:                          Format end, followed by function name for error messages.
;                          Format end, followed by entire error message text.

Return Value
The Py_BuildValue function takes a format string much like PyArg_ParseTuple does. But rather than passing in the addresses of the values that you are building, you pass in the actual values.

Example
Here is an example showing how to implement an add function -

static PyObject *foo_add(PyObject *self, PyObject *args) {
   int a;
   int b;

   if (!PyArg_ParseTuple(args, "ii", &a, &b)) {
      return NULL;
   }
   return Py_BuildValue("i", a + b);
}

This is what it is going to look like if implemented in Python -

def add(a, b):
   return (a + b)

You can return two values from your function as follows. A format string with two or more codes builds a tuple, so the caller receives the values as a tuple in Python.

static PyObject *foo_add_subtract(PyObject *self, PyObject *args) {
   int a;
   int b;

   if (!PyArg_ParseTuple(args, "ii", &a, &b)) {
      return NULL;
   }
   return Py_BuildValue("ii", a + b, a - b);
}

This is what it is going to look like if implemented in Python -

def add_subtract(a, b):
   return (a + b, a - b)


The Py_BuildValue Function
Here is the standard signature for the Py_BuildValue function -

PyObject* Py_BuildValue(char* format,...)

In the above signature, format is a C string that describes the Python object to build. The remaining arguments of Py_BuildValue are the C values from which the result is built. The PyObject* result is a new reference.

The table below lists the commonly used format codes, of which zero or more are joined into the format string.

Code     C type            Meaning
c        char              A C char becomes a Python string of length 1.
d        double            A C double becomes a Python float.
f        float             A C float becomes a Python float.
i        int               A C int becomes a Python int.
l        long              A C long becomes a Python int.
N        PyObject*         Passes a Python object and steals a reference.
O        PyObject*         Passes a Python object and INCREFs it as normal.
O&       convert+void*     Arbitrary conversion.
s        char*             C 0-terminated char* to Python string, or NULL to None.
s#       char*+int         C char* and length to Python string, or NULL to None.
u        Py_UNICODE*       C-wide, null-terminated string to Python Unicode, or NULL to None.
u#       Py_UNICODE*+int   C-wide string and length to Python Unicode, or NULL to None.
z        char*             Same as s.
z#       char*+int         Same as s#.
(...)    as per ...        Builds Python tuple from C values.
[...]    as per ...        Builds Python list from C values.
{...}    as per ...        Builds Python dictionary from C values, alternating keys and values.


Alright guys! This is where we are rounding up for this tutorial post. In my next tutorial, we are going to be starting our tutorials on PHP.

Feel free to ask your questions where necessary and I will attend to them as soon as possible. If this tutorial was helpful to you, you can use the share button to share it.

Follow us on our various social media platforms to stay updated with our latest tutorials. You can also subscribe to our newsletter in order to get our tutorials delivered directly to your emails.

Thanks for reading and bye for now.