Monday, April 3, 2017

PyPy 5.7.1 bugfix released

We have released a bugfix PyPy2.7-v5.7.1 and PyPy3.5-v5.7.1 beta (Linux 64bit), due to the following issues:
  • correctly handle an edge case in dict.pop (issue 2508)
  • fix a regression to correctly handle multiple inheritance in a C-API type where the second base is an app-level class with a __new__ function
  • fix a regression to fill a C-API type’s tp_getattr slot from a __getattr__ method (issue 2523)
Thanks to those who reported issues and helped test out the fixes

You can download the v5.7.1 release here:

What is PyPy?

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7 and CPython 3.5. It’s fast (PyPy and CPython 2.7.x performance comparison) due to its integrated tracing JIT compiler.
We also welcome developers of other dynamic languages to see what RPython can do for them.
The PyPy 2.7 release supports:
  • x86 machines on most common operating systems (Linux 32/64 bits, Mac OS X 64 bits, Windows 32 bits, OpenBSD, FreeBSD)
  • newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux,
  • big- and little-endian variants of PPC64 running Linux,
  • s390x running Linux
Please update, and continue to help us make PyPy better.

Cheers, The PyPy team

Saturday, April 1, 2017

Native profiling in VMProf

We are happy to announce a new release for the PyPI package vmprof.
It is now able to capture native stack frames on Linux and Mac OS X to show you bottle necks in compiled code (such as CFFI modules, Cython or C Python extensions). It supports PyPy, CPython versions 2.7, 3.4, 3.5 and 3.6. Special thanks to Jetbrains for funding the native profiling support.

vmprof logo

What is vmprof?

If you have already worked with vmprof you can skip the next two section. If not, here is a short introduction:

The goal of vmprof package is to give you more insight into your program. It is a statistical profiler. Another prominent profiler you might already have worked with is cProfile. It is bundled with the Python standard library.

vmprof's distinct feature (from most other profilers) is that it does not significantly slow down your program execution. The employed strategy is statistical, rather than deterministic. Not every function call is intercepted, but it samples stack traces and memory usage at a configured sample rate (usually around 100hz). You can imagine that this creates a lot less contention than doing work before and after each function call.

As mentioned earlier cProfile gives you a complete profile, but it needs to intercept every function call (it is a deterministic profiler). Usually this means that you have to capture and record every function call, but this takes an significant amount time.

The overhead vmprof consumes is roughly 3-4% of your total program runtime or even less if you reduce the sampling frequency. Indeed it lets you sample and inspect much larger programs. If you failed to profile a large application with cProfile, please give vmprof a shot. or PyCharm

There are two major alternatives to the command-line tools shipped with vmprof:
  • A web service on
  • PyCharm Professional Edition
While the command line tool is only good for quick inspections, and PyCharm compliment each other providing deeper insight into your program. With PyCharm you can view the per-line profiling results inside the editor. With the you get a handy visualization of the profiling results as a flame chart and memory usage graph.

Since the PyPy Team runs and maintains the service on (which is by the way free and open-source), I’ll explain some more details here. On you can inspect the generated profile interactively instead of looking at console output. What is sent to You can find details here.

Flamegraph: Accumulates and displays the most frequent codepaths. It allows you to quickly and accurately identify hot spots in your code. The flame graph below is a very short run of (Thus it shows a lot of time spent in PyPy's JIT compiler).

List all functions (optionally sorted): the equivalent of the vmprof command line output in the web.

 Memory curve: A line plot that shows how how many MBytes have been consumed over the lifetime of your program (see more info in the section below).

Native programs

The new feature introduced in vmprof 0.4.x allows you to look beyond the Python level. As you might know, Python maintains a stack of frames to save the execution. Up to now the vmprof profiles only contained that level of information. But what if you program jumps to native code (such as calling gzip compression on a large file)? Up to now you would not see that information.

Many packages make use of the CPython C API (which we discurage, please lookup cffi for a better way to call C). Have you ever had the issue that you know that your performance problems reach down to, but you could not profile it properly? Now you can!

Let's inspect a very simple Python program to find out why a program is significantly slower on Linux than on Mac:

import numpy as np
n = 1000
a = np.random.random((n, n))
b = np.random.random((n, n))
c =, b)

Take two NxN random matrix objects and create a dot product. The first argument to the dot product provides the absolute value of the random matrix.

RunPythonNumPyOSn=... Took
[1]CPython 3.5.2NumPy 1.12.1Mac OS X, 10.12.3n=5000~9 sec
[2]CPython 3.6.0NumPy 1.12.1Linux 64, Kernel 4.9.14n=1000~26 sec

Note that the Linux machine operates on a 5 times smaller matrix, still it takes much longer. What is wrong? Is Linux slow? CPython 3.6.0? Well no, lets inspect and [1] and [2] (shown below in that order).

[2] runs on Linux, spends nearly all of the time in PyArray_MatrixProduct2, if you compare to [1] on Mac OS X, you'll see that a lot of time is spent in generating the random numbers and the rest in cblas_matrixproduct.

Blas has a very efficient implementation so you can achieve the same on Linux if you install a blas implementation (such as openblas).

Usually you can spot potential program source locations that take a lot of time and might be the first starting point to resolve performance issues.

Beyond Python programs

It is not unthinkable that the strategy can be reused for native programs. Indeed this can already be done by creating a small cffi wrapper around an entry point of a compiled C program. It would even work for programs compiled from other languages (e.g. C++ or Fortran). The resulting function names are the full symbol name embedded into either the executable symboltable or extracted from the dwarf debugging information. Most of those will be compiler specific and contain some cryptic information.

Memory profiling
We thankfully received a code contribution from the company Blue Yonder. They have built a memory profiler (for Linux and Mac OS X) on top of that displays the memory consumption for the runtime of your process.

You can run it the following way:

$ python -m vmprof --mem --web

By adding --mem, vmprof will capture memory information and display it in the dedicated view on You can tha view by by clicking the 'Memory' switch in the flamegraph view.

There is more

Some more minor highlights contained in 0.4.x:
  • VMProf support for Windows 64 bit (No native profiling)
  • VMProf can read profiles generated by another host system
  • VMProf is now bundled in several binary wheel for fast and easy installation (Mac OS X, Linux 32/64 for CPython 2.7, 3.4, 3.5, 3.6)
Future plans - Profile Streaming

vmprof has not reached the end of development. There are many features we could implement. But there is one feature that could be a great asset to many Python developers.

Continuous delivery of your statistical profile, or in short, profile streaming. One of the great strengths of vmprof is that is consumes very little overhead. It is not a crazy idea to run this in production.

It would require a smart way to stream the profile in the background to and new visualizations to look at much more data your Python service produces.

If that sounds like a solid vmprof improvement, don't hesitate to get in touch with us (e.g. IRC #pypy, mailing list pypy-dev, or comment below)

You can help!

There are some immediate things other people could help with. Either by donating time or money (yes we have occasional contributors which is great)!
  • We gladly received code contribution for the memory profiler. But it was not enough time to finish the migration completely. Sadly it is a bit brittle right now.
  • We would like to spend more time on other visualizations. This should include to give a much better user experience on (like a tutorial that explains the visualization that we already have). 
  • Build Windows 32/64 bit wheels (for all CPython versions we currently support)
We are also happy to accept google summer of code projects on vmprof for new visualizations and other improvements. If you qualify and are interested, don't hesitate to ask!

Richard Plangger (plan_rich) and the PyPy Team

[1] Mac OS X
[2] Linux64

Tuesday, March 21, 2017

PyPy2.7 and PyPy3.5 v5.7 - two in one release

The PyPy team is proud to release both PyPy2.7 v5.7 (an interpreter supporting Python v2.7 syntax), and a beta-quality PyPy3.5 v5.7 (an interpreter for Python v3.5 syntax). The two releases are both based on much the same codebase, thus the dual release. Note that PyPy3.5 only supports Linux 64bit for now.

This new PyPy2.7 release includes the upstream stdlib version 2.7.13, and PyPy3.5 (our first in the 3.5 series) includes the upstream stdlib version 3.5.3.

We continue to make incremental improvements to our C-API compatibility layer (cpyext). PyPy2 can now import and run many C-extension packages, among the most notable are Numpy, Cython, and Pandas. Performance may be slower than CPython, especially for frequently-called short C functions. Please let us know if your use case is slow, we have ideas how to make things faster but need real-world examples (not micro-benchmarks) of problematic code.

Work proceeds at a good pace on the PyPy3.5 version due to a grant from the Mozilla Foundation, hence our first 3.5.3 beta release. Thanks Mozilla !!! While we do not pass all tests yet, asyncio works and as these benchmarks show it already gives a nice speed bump. We also backported the f"" formatting from 3.6 (as an exception; otherwise “PyPy3.5” supports the Python 3.5 language).

CFFI has been updated to 1.10, improving an already great package for interfacing with C.

We now use shadowstack as our default gcrootfinder even on Linux. The alternative, asmgcc, will be deprecated at some future point. While about 3% slower, shadowstack is much more easily maintained and debuggable. Also, the performance of shadowstack has been improved in general: this should close the speed gap between other platforms and Linux.

As always, this release fixed many issues and bugs raised by the growing community of PyPy users. We strongly recommend updating.

You can download the v5.7 release here:
We would like to thank our donors for the continued support of the PyPy project.
We would also like to thank our contributors and encourage new people to join the project. PyPy has many layers and we need help with all of them: PyPy and RPython documentation improvements, tweaking popular modules to run on pypy, or general help with making RPython’s JIT even better.


What is PyPy?

PyPy is a very compliant Python interpreter, almost a drop-in replacement for CPython 2.7 and CPython 3.5. It’s fast (PyPy and CPython 2.7.x performance comparison) due to its integrated tracing JIT compiler.
We also welcome developers of other dynamic languages to see what RPython can do for them.
The PyPy 2.7 release supports:
  • x86 machines on most common operating systems (Linux 32/64 bits, Mac OS X 64 bits, Windows 32 bits, OpenBSD, FreeBSD)
  • newer ARM hardware (ARMv6 or ARMv7, with VFPv3) running Linux,
  • big- and little-endian variants of PPC64 running Linux,
  • s390x running Linux


What else is new?

(since the releases of PyPy 2.7 and 3.3 at the end of 2016)
There are many incremental improvements to RPython and PyPy, the complete listing is here.
Please update, and continue to help us make PyPy better.

Cheers, The PyPy team

Saturday, March 4, 2017

Leysin Winter Sprint Summary

Today is the last day of our yearly sprint event in Leysin. We had lots of ideas on how to enhance the current state of PyPy, we went skiing and had interesting discussions around virtual machines, the Python ecosystem, and other real world problems.

Why don't you join us next time?

A usual PyPy sprints day goes through the following stages:

  1.  Planning Session: Tasks from previous days that have seen progress or are completed are noted in a shared document. Everyone adds new tasks and then assigns themselves to one or more tasks (usually in pairs). As soon as everybody is happy with their task and has a partner to work with, the planning session is concluded and the work can start.
  2. Discussions: A sprint is a good occasion to discuss difficult and important topics in person. We usually sit down in a separate area in the sprint room and discuss until a) nobody wants to discuss anymore or b) we found a solution to the problem. The good thing is that usally the outcome is b).
  3. Lunch: For lunch we prepare sandwiches and other finger food.
  4. Continue working until dinner, which we eat at a random restaurant in Leysin.
  5. Goto 1 the next day, if sprint has not ended.
Sprints are open to everybody and help newcomers to get started with PyPy (we usually pair you with a developer familiar with PyPy). They are perfect to discuss and find solutions to problems we currently face. If you are eager to join next year, please don't hesitate to register next year around January.

Sprint Summary   

Sprint goals included to work on the following topics:
  • Work towards releasing PyPy 3.5 (it will be released soon)
  • CPython Extension (CPyExt) modules on PyPy
  • Have fun in winter sports (a side goal)


  • We have spent lots of time debugging and fixing memory issues on CPyExt. In particular, we fixed a serious memory leak where taking a memoryview would prevent numpy arrays from ever being freed. More work is still required to ensure that our GC always releases arrays in a timely manner.
  • Fruitful discussions and progress about how to flesh out some details about the unicode representation in PyPy. Our current goal is to use utf-8 as the unicode representation internally and have fast vectorized operations (indexing, check if valid, ...).
  • PyPy will participate in GSoC 2017 and we will try to allocate more resources to that than last year.
  • Profile and think about some details how to reduce the starting size of the interpreter. The starting point would be to look at the parser and reduce the amount of strings to keep alive.
  • Found a topic for a student's master thesis: correctly freeing cpyext reference cycles.
  • Run lots of Python3 code on top of PyPy3 and resolve issues we found along the way.
  • Initial work on making RPython thread-safe without a GIL.

List of attendees

- Stefan Beyer
- Antonio Cuni
- Maciej Fijalkowski
- Manuel Jacob
- Ronan Lamy
- Remi Meier
- Richard Plangger
- Armin Rigo
- Robert Zaremba

We would like to thank our donors for the continued support of the PyPy project and we looking forward to next years sprint in Leysin.

The PyPy Team

Wednesday, March 1, 2017

Async HTTP benchmarks on PyPy3

Hello everyone,

Since Mozilla announced funding, we've been working quite hard on delivering you a working Python 3.5.
We are almost ready to release an alpha version of PyPy 3.5. Our goal is to release it shortly after the sprint. Many modules have already been ported and  it can probably run many Python 3 programs already. We are happy to receive any feedback after the next release. 

To show that the heart (asyncio) of Python 3 is already working we have prepared some benchmarks. They are done by Paweł Piotr Przeradowski @squeaky_pl for a HTTP workload on serveral asynchronous IO libraries, namely the relatively new asyncio and curio libraries and the battle-tested tornado, gevent and Twisted libraries. To see the benchmarks check out and the instructions for reproducing can be found inside in the repository. Raw results can be obtained from

The purpose of the presented benchmarks is showing that the upcoming PyPy release is already working with unmodified code that runs on CPython 3.5. PyPy also manages to make them run significantly faster.

The benchmarks consist of HTTP servers implemented on the top of the mentioned libraries. All the servers are single-threaded relying on underlying event loops to provide concurrency. Access logging was disabled to exclude terminal I/O from the results. The view code consists of a lookup in a dictionary mapping ASCII letters to verses from the famous Zen of Python. If a verse is found the view returns it, otherwise a 404 Not Found response is served. The 400 Bad Request and 500 Internal Server Error cases are also handled.

The workload was generated with the wrk HTTP benchmarking tool. It is run with one thread opening up to 100 concurrent connections for 2 seconds and repeated 1010 times to get consecutive measures. There is a Lua script provided that instructs wrk to continuously send 24 different requests that hit different execution paths (200, 404, 400) in the view code. Also it is worth noting that wrk will only count 200 responses as successful so the actual request per second throughput is higher.

For your convenience all the used libraries versions are vendored into the benchmark repository. There is also a precompiled portable version of wrk provided that should run on any reasonably recent (10 year old or newer) Linux x86_64 distribution. The benchmark was performed on a public cloud scaleway x86_64 server launched in a Paris data center. The server was running Ubuntu 16.04.01 LTS and reported Intel(R) Xeon(R) CPU D-1531 @ 2.20GHz CPU. CPython 3.5.2 (shipped by default in Ubuntu) was benchmarked against a pypy-c-jit-90326-88ef793308eb-linux64 snapshot of the 3.5 compatibility branch of PyPy.

We want to thank Mozilla for supporting our work!

fijal, squeaky_pl and the PyPy Team

Tuesday, January 24, 2017

Leysin Winter Sprint: 25/26th Feb. - 4th March 2017

The next PyPy sprint will be in Leysin, Switzerland, for the twelveth time. This is a fully public sprint: newcomers and topics other than those proposed below are welcome.

Goals and topics of the sprint

The list of topics is very open.

  • The main topic is Python 3.5 support in PyPy, as most py3.5 contributors should be present. It is also a good topic if you have no or limited experience with PyPy contribution: we can easily find something semi-independent that is not done in py3.5 so far, and do pair-programming with you.
  • Any other topic is fine too: JIT compiler optimizations, CFFI, the RevDB reverse debugger, improving to speed of your program on PyPy, etc.
  • And as usual, the main side goal is to have fun in winter sports :-) We can take a day off (for ski or anything else).

Exact times

Work days: starting 26th Feb (~noon), ending March 4th (~noon).

I have pre-booked the week from Saturday Feb 25th to Saturday March 4th. If it is possible for you to arrive Sunday before mid-afternoon, then you should get a booking from Sunday only. The break day should be around Wednesday.

It is fine to stay a few more days on either side, or conversely to book for a part of that time only.

Location & Accomodation

Leysin, Switzerland, "same place as before".

Let me refresh your memory: both the sprint venue and the lodging will be in a pair of chalets built specifically for bed & breakfast: The place has a good ADSL Internet connection with wireless installed. You can also arrange your own lodging elsewhere (as long as you are in Leysin, you cannot be more than a 15 minutes walk away from the sprint venue).

Please confirm that you are coming so that we can adjust the reservations as appropriate.

The options of rooms are a bit more limited than on previous years because the place for bed-and-breakfast is shrinking; but we should still have enough room for us. The price is around 60 CHF, breakfast included, in shared rooms (3 or 4 people). If there are people that would prefer a double or single room, please contact me and we'll see what choices you have. There are also a choice of hotels in Leysin.

Please register by Mercurial:

or on the pypy-dev mailing list if you do not yet have check-in rights:

You need a Swiss-to-(insert country here) power adapter. There will be some Swiss-to-EU adapters around, and at least one EU-format power strip.