The presentation has four parts.
First we provide some information about Nexedi.
Then we go straight to the conclusions, so that those who are in a hurry can leave.
Then we take some time to explain the rationale behind our efforts.
And last we show some code.
Nexedi clients are mainly large companies and governments looking for scalable enterprise solutions such as ERP, CRM, DMS, data lake, big data cloud, etc.
One of the recent services launched by Nexedi is a Big Data cloud entirely based on Open Hardware and entirely running on Free Software that anyone can contribute to. The target cost is 30% to 60% lower than low-cost European clouds, and 10 times lower than US-based public clouds. Target use cases include performance testing, disaster recovery and big data batch processing.
Nexedi supports Pyodide - we need more hands
We of course studied all alternatives to the CPython runtime, searching for a faster one with JIT or multi-core support: pypy, micropython, pyjion, pyparallel, etc. (see High Performance Multi-core Python at Nexedi).
We also studied all implementations of Python for web browsers.
None of the alternatives brought good results, except PyPy, which improved performance in a few cases and showed impressive performance in a web browser. But PyPy has other issues: it brings no improvement over CPython for multi-core, and it is still harder to interface with C/C++ (though this is improving).
After some time, we finally found an approach that we believe has a future: use Python for high-level scripting or reflective programming (as it was intended), and use something else for high performance.
This something else already exists: it is called Cython. It has a syntax similar to Python. It is already widely adopted. It is fully compatible with Python 3 (Cython is a superset of the Python language).
This something else could also be C, C++, FORTRAN, etc. which interface with CPython very easily thanks to Cython.
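As a rough illustration of this split, the sketch below keeps Python as the driver and hands the hot loop to compiled code. It uses CPython's built-in sum, which is implemented in C, purely as a stand-in for a Cython or C extension. The function names and the problem size are assumptions made for this example, and timings will vary from one machine to another.

```python
import timeit


def total_pure_python(n):
    """Hot loop executed by the interpreter, one bytecode step at a time."""
    total = 0
    for i in range(n):
        total += i
    return total


def total_compiled(n):
    """Same result, but the loop runs inside sum(), which CPython implements in C."""
    return sum(range(n))


if __name__ == "__main__":
    n = 1_000_000  # arbitrary problem size chosen for the illustration
    assert total_pure_python(n) == total_compiled(n)
    t_py = timeit.timeit(lambda: total_pure_python(n), number=5)
    t_c = timeit.timeit(lambda: total_compiled(n), number=5)
    print(f"interpreted loop: {t_py:.3f}s, compiled loop: {t_c:.3f}s")
```

The driver code stays ordinary Python in both cases. Only the inner loop moves out of the interpreter, which is the same division of labour that a Cython module would provide.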
The power of using Python as a driver and something else such as Cython for performance is showcased in a project called Pyodide.
Pyodide is a project that originated at Mozilla and that brings a complete Python-based data science notebook into a Web browser.
Pyodide is now also sponsored by Nexedi (see OfficeJS Iodide: Ubiquitous Data Science Notebooks for Business and Education) through the work of Roman Yurchak, who added dynamic loading of Python modules, accelerated the build process, fixed stability issues and is on the way to releasing scipy and scikit-learn.
Pyodide is really interesting for us because, despite the extreme slowness of the CPython runtime compiled to WebAssembly and running in a Web browser, Pyodide is actually fast and usable.
Running a Pyodide notebook generally involves executing CPython (10 times slower than native) and NumPy (nearly the same as native). On average, Pyodide is only 50% slower than native.
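The two figures above can be reconciled with a simple weighted model. The sketch below is only an assumption made for illustration, not a measurement from the Pyodide project: it supposes that a notebook spends a fixed share of its native run time in interpreter code (slowed down 10 times) and the rest in compiled numeric code (unchanged), then solves for the share that gives an overall slowdown of 1.5.

```python
def overall_slowdown(interp_share, interp_factor=10.0, native_factor=1.0):
    """Slowdown of a whole run relative to native execution.

    interp_share:  fraction of the *native* run time spent in interpreter code.
    interp_factor: slowdown of interpreter code (10x, the figure quoted above).
    native_factor: slowdown of compiled numeric code (about 1x, as quoted above).
    """
    return interp_share * interp_factor + (1.0 - interp_share) * native_factor


def interp_share_for(target, interp_factor=10.0, native_factor=1.0):
    """Interpreter share of native run time that yields the target slowdown."""
    return (target - native_factor) / (interp_factor - native_factor)


if __name__ == "__main__":
    share = interp_share_for(1.5)
    print(f"interpreter share of native time: {share:.1%}")
    print(f"check: overall slowdown = {overall_slowdown(share):.2f}x")
```

Under this model, an overall slowdown of 50% corresponds to roughly 5.6% of the native run time being spent in the interpreter. In other words, the numbers are consistent with workloads where almost all of the time goes to compiled numeric code.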
By using LWAN and Cython, we managed to create a small HTTP server in Python that runs faster than the fastest golang HTTP server and that also scales linearly on multiple cores.
And if code is added to this server using Cython cdef functions to create some dynamic pages (e.g. a page with a Fibonacci result), then the server remains as scalable as golang and is still faster.
We have thus been able to demonstrate how to equal Go in concurrency and beat it in performance: use Cython with the nogil option.
We looked further into the underlying LWAN library and found that it includes a coroutine library. Based on the ideas of Juliusz Chroboczek's system programming project at Paris 7 University, we created a small coroutine library for Cython and compared it with asyncio, gevent and goroutines on an "empty coroutine" benchmark.
Surprisingly, it is an order of magnitude faster than existing Python libraries. The run-time part is probably as good as in golang. The spawn time is still high, most likely because we rely on malloc, but it could be improved.
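To make the idea of an "empty coroutine" benchmark concrete, here is a minimal sketch of the asyncio side only. It is not the benchmark used for the results above: the function names and the task count are assumptions chosen for illustration, and the Cython, gevent and goroutine variants are not shown. It separates the spawn time (creating the tasks) from the run time (driving them to completion), the two quantities discussed above.

```python
import asyncio
import time


async def empty():
    """A coroutine that does nothing, so only scheduling overhead is measured."""
    return None


async def bench(n):
    """Return (spawn_seconds, run_seconds) for n empty asyncio tasks."""
    t0 = time.perf_counter()
    tasks = [asyncio.create_task(empty()) for _ in range(n)]  # spawn phase
    t1 = time.perf_counter()
    await asyncio.gather(*tasks)  # run phase: every task runs to completion
    t2 = time.perf_counter()
    return t1 - t0, t2 - t1


if __name__ == "__main__":
    n = 100_000  # arbitrary task count chosen for the illustration
    spawn, run = asyncio.run(bench(n))
    print(f"spawn: {spawn / n * 1e9:.0f} ns per task, "
          f"run: {run / n * 1e9:.0f} ns per task")
```

Reporting the two phases separately makes it possible to tell whether the overhead comes from allocation at spawn time or from the scheduler itself.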