Today we’re very excited to announce Pyston-lite, a JIT for Python that is easily installable as an extension module. We’ve taken the core technology of Pyston and repackaged it so that you can install it through your existing Python package manager, making it dramatically easier to use. Pyston-lite doesn’t contain all of the optimizations of regular Pyston, but it is roughly 10-25% faster than stock Python 3.8 depending on the workload and we are not done optimizing it.
When we started Pyston v2 two years ago we originally built it as a PEP 523 extension module. We quickly, however, ran into optimizations that are prevented by that API, and we decided to fork the entire CPython codebase instead. We knew that ease-of-use was a primary factor in getting people to switch Python implementations so we made Pyston as easy to install as possible such as through a portable package and a conda package, but we kept noticing that there was still friction in the transition process which put off potential users. So we decided to do the one thing we could do to make the process even easier and offer an extension module again.
As a bonus, this is the first time that Mac users can use the Pyston JIT.
We’ve also released version v2.3.4 of full Pyston, which contains additional optimizations since v2.3.3. It is approximately 6% faster than v2.3.3 on pyperformance, for a total speedup of 66%. This release particularly improves the performance of Python floats and speeds up benchmarks like richards by 65% over v2.3.3.
The way you write your code can affect how well Pyston can optimize it; Kevin gave a talk at PyCon about this and the video is now available, where he gives some examples of how a smart optimizer introduces some new considerations for the programmer.
Using Pyston-lite
We think we’ve made it as easy as it can get now: to start using a JIT for Python 3.8 on Linux or Mac, simply do
pip install pyston_lite_autoload
or
conda install pyston_lite_autoload -c pyston -c conda-forge
This will install our JIT and configure the current environment to automatically use it until you uninstall the package. You don’t have to create a new environment or recompile any packages, and we are not aware of any compatibility issues or limitations in the code that can be run.
We offer two packages: pyston_lite
, which contains our JIT, and pyston_lite_autoload
which automatically injects our JIT into the Python process at startup. If you want programmatic control over enabling the JIT you can call pyston_lite.enable()
without installing the autoload package, and if you install the autoload package you can disable the JIT injection by setting the DISABLE_PYSTON=1
environment variable.
Caveats
There are many things that we’re still hoping to add to Pyston-lite, but we wanted to launch an early version of it to get feedback and gauge interest. In particular, we’re planning on adding:
- More optimizations. We’ve ported most of the Pyston optimizations to Pyston-lite, but there are still a few more that we can port with additional work.
- Support for more Python versions. This current release only supports Python 3.8, but since Python-lite has a much smaller surface area we believe we can target multiple releases, unlike Pyston which will only target a single release for the foreseeable future.
- Working with upstream CPython to add more JIT APIs. We currently are unable to use all of our techniques in Pyston-lite due to having less control over the system, but we are in discussions with the CPython team to add more JIT hooks to Python 3.12. Ideally we will be able to offer an extension module for 3.12 that has the same performance as a full fork of CPython.
A final caveat is that the performance is quite variable: we’ve seen very different performance results on different processors, with better improvements on a recent AMD processor than a moderately-old Intel processor. This is in addition to the workload-sensitivity that is inherent in a Python optimizer.
Final words
We hope you’ll try Pyston-lite out using the command above, because speeding up your Python code will not get any easier than this. If you have any trouble or questions you can find us on Discord and GitHub — we’d love to hear about your experience and which features you’d like us to prioritize.
Addendum: benchmarking
As more people are running Python benchmarks we wanted to make sure to publish our benchmarking methodology. The following numbers were generated on EC2 instances, either c6i.2xlarge or c6g.2xlarge, using an Ubuntu 20.04 AMI. The baseline was the Ubuntu-provided Python 3.8.10. The script used was essentially
git clone https://github.com/pyston/python-macrobenchmarks
cd python-macrobenchmarks
git checkout c7dbe453
bash compare.sh python3.8 pyston
We found that while EC2 instances exhibit performance drift over time, varying by +-1% over the course of a day, when benchmarks are run back-to-back the results are surprisingly consistent. We chose to use EC2 instances as representative of user environments, and for ease of reproduction.
We got the following results:
pyperformance x86 | ARM | Pyston macrobenchmarks x86 | ARM | |
Pyston 2.3.4 | +66% | +54% | +35% | +25% |
Pyston-lite 2.3.4 | +28% | +25% | +8% | +8% |
Pyston-lite 2.3.4 Mac | +27% | +39% | +8% | +5% |
CPython 3.11.0b3 | +15% | +10% | +8% | +5% |