New paper in Communications Physics: A phase diagram for bacterial swarming

I’m happy to say that the paper “A phase diagram for bacterial swarming” has been published in Communications Physics (https://www.nature.com/articles/s42005-020-0327-1). This paper is the result of long-running research (started in 2015…) and is joint work with Avraham Be’er, Bella Ilkanaiv, Daniel Kearns, Sebastian Heidenreich, Markus Bär and Gil Ariel. In it, we analyze how colonies of bacteria move collectively, as a function of two physical parameters: the length of individual bacteria, and the density of the colony in space. We did this by filming the bacterial colonies through a microscope and tracking the paths of individual bacteria; if anything, working with the little buggers is worth it just for these amazing films. I was responsible for developing the image tracking and analysis tools. In this post, I’ll first give a tiny overview of the biology and statistical properties involved, then talk a bit about how the software works.

Bacterial phases

This is Bacillus subtilis (a bit fuzzy, I know):

Bacillus subtilis is a nice bacterium to study. It has a collection of tiny oars, or flagella, on its cell wall, making it an effective swimmer. It is about 7 microns long and 1 micron thick, and so has an aspect ratio of about 7. By means of genetic engineering, it’s possible to control some of the mechanisms with which the bacterium decides how much to grow and when to divide, effectively creating mutant strains which are either shorter or longer than those found in the wild. In our study, we made bacteria with aspect ratios of 5.5, 13, and 19. Here is an example of a long fella:

The nice thing about these mutants is that they are individually identical in essentially all other aspects – only their length is changed. So, for example, the swimming speed of an individual bacterium doesn’t really depend on the bacterium’s aspect ratio. However, different lengths may affect the interactions between bacteria, and by putting the different mutants to the test, we can see how their size affects their collective swimming speed, or, more generally, their swarming behavior.

Bacillus subtilis, under the right conditions, can move en masse, with gigantic clusters of tightly-packed bacteria swimming quickly and in the same direction. This usually happens at high population densities, when the bacteria are all squished against each other. At low densities, a lot of the individual bacteria are immobile, and movement is scattered:

At higher densities, many more bacteria start moving, and in larger clusters:

This looks a lot scarier for the longer bacteria types:

So now we have two parameters to look at: The length of the bacteria, and their density (measured either by bacteria per unit area, or by the fraction of the surface area covered by them). We know that these must affect the collective motion in some way (trivially: when density is near 0, there cannot be collective movement, and when density is near 1, much of the movement is collective). Can we quantify these effects? Are there different “phases” of motion, i.e., ranges of parameters where the types of movement are fundamentally different in some way? This is what our research focused on.

To figure all these things out, we must calculate different properties of the bacterial motion, and see how they change as a function of the parameters. By “property of motion”, I mean some aggregate quantity that somehow tells you something meaningful about the combined movement of thousands of tiny agents. They can be relatively simple, like the average velocity of all of the bacteria, or more complex, like how well they align relative to each other. Calculating them often involves extracting the individual orientation, position, and velocity of each bacterium in the image.
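To make “property of motion” a bit more concrete, here is a minimal Python sketch of three such aggregate quantities. The function names and the input format (plain lists of velocity pairs and orientation angles) are my own choices for illustration and are not taken from our MATLAB code; the sketch also assumes that immobile bacteria have already been filtered out.

```python
import cmath
import math

def mean_speed(velocities):
    """Average speed (velocity magnitude) over all bacteria."""
    return sum(math.hypot(vx, vy) for vx, vy in velocities) / len(velocities)

def polar_order(velocities):
    """Length of the mean unit velocity vector.

    Equals 1 when everyone moves in the same direction and is close to 0
    when the directions cancel out. Assumes no velocity is exactly zero.
    """
    units = [cmath.exp(1j * math.atan2(vy, vx)) for vx, vy in velocities]
    return abs(sum(units) / len(units))

def nematic_order(orientations):
    """Alignment of rod axes, given as angles in radians.

    A rod at angle theta looks the same as a rod at theta + pi, so the
    angles are doubled before averaging.
    """
    doubled = [cmath.exp(2j * theta) for theta in orientations]
    return abs(sum(doubled) / len(doubled))

# Two bacteria swimming head-on: no net direction, but perfectly aligned axes.
head_on = [(1.0, 0.0), (-1.0, 0.0)]
print(mean_speed(head_on), polar_order(head_on), nematic_order([0.0, math.pi]))
```

The head-on example shows why more than one property is worth computing: the polar order vanishes while the nematic order is maximal, so the two numbers tell different stories about the same frame.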

A “phase transition” in the swarming behavior of Bacillus subtilis means that as we vary some parameter continuously (say, the density), some property of motion makes a drastic jump, and is (practically) discontinuous. Some property, but not all of them; there may be particular properties which stay continuous always. Finding phase transitions is then a matter of finding the “right” property of motion, which is not always an easy task; knowing what to look for is half the art.

For example, here is a graph showing the average speed of all moving bacteria in the frame, as a function of their surface density, for all bacteria lengths:

You can see that in general, the speed goes up as a function of density, for all aspect ratios. Also, the fastest type is the non-mutant one found in nature (a coincidence, or natural optimization? who knows). In any case, the average speed is a continuous function of density, and does not indicate different phases of motion. The same goes for many other properties of motion, such as the correlation of velocities and orientation of the same bacterium as a function of time.

But if you probe enough, you do eventually find sharp, pronounced transitions, which show that short bacteria behave differently than long bacteria, and long bacteria behave qualitatively differently at low and high densities. The key here is analyzing the clusters of moving bacteria. As the bacteria move about, they go in and out of the camera frame, so the density and number of bacteria are never exactly constant throughout the entire movie. How these densities and numbers change as a function of time depends on the sizes of the groups of bacteria which move together. If the motion is carried out in lots of small clusters which move (more or less) independently, we can expect a lot of small fluctuations, making the density behave erratically on small scales. On the other hand, if all bacteria move as one, we can expect larger, smoother fluctuations in the density (imagine a gigantic cluster sweeping across the field of view of the camera). One aspect of this can be captured, for example, by the Hurst exponent of the density time series.
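For readers who have not met the Hurst exponent before, here is a rough Python sketch of one common way to estimate it: measure how the spread of the increments x(t + lag) - x(t) grows with the lag, and fit a power law lag^H. This is only one of several estimators (rescaled range and detrended fluctuation analysis are others), and it is not necessarily the one used in the paper; the function name, the lag range, and the synthetic test series are all assumptions made for illustration.

```python
import math
import random

def hurst_exponent(series, lags=range(2, 30)):
    """Estimate H from the scaling std(x[t + lag] - x[t]) ~ lag ** H.

    Assumes the series is long compared to the largest lag and is not
    constant (otherwise the logarithm below is undefined).
    """
    log_lags, log_stds = [], []
    for lag in lags:
        diffs = [series[i + lag] - series[i] for i in range(len(series) - lag)]
        mean = sum(diffs) / len(diffs)
        var = sum((d - mean) ** 2 for d in diffs) / len(diffs)
        log_lags.append(math.log(lag))
        log_stds.append(0.5 * math.log(var))
    # Least-squares slope of log(std) against log(lag).
    n = len(log_lags)
    mx = sum(log_lags) / n
    my = sum(log_stds) / n
    num = sum((x - mx) * (y - my) for x, y in zip(log_lags, log_stds))
    den = sum((x - mx) ** 2 for x in log_lags)
    return num / den

# Synthetic stand-ins for a density time series (not real data).
rng = random.Random(0)
walk = [0.0]
for _ in range(4000):
    walk.append(walk[-1] + rng.gauss(0.0, 1.0))   # ordinary random walk

smooth = [0.0]
for value in walk:
    smooth.append(smooth[-1] + value)             # integrated walk: much smoother

h_walk = hurst_exponent(walk)
h_smooth = hurst_exponent(smooth)
print(h_walk, h_smooth)
```

The ordinary random walk should come out near 0.5, while the integrated (smoother) series should come out close to 1, matching the intuition that smoother, more persistent fluctuations go with a larger exponent.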

The idea is this: The density of bacteria in the image can be thought of as a stochastic process which varies in time, fluctuating around some average:

The Hurst exponent is a measure of how well correlated a stochastic process is with itself. The higher the Hurst exponent, the larger the correlation between different times, and also the smoother the paths. The following image shows a clear transition in the Hurst exponent in long bacteria, when the average density reaches about $0.25$:

This is not the only indication of different cluster sizes; we also analyzed the spatial distribution of bacteria as well as the distribution of cluster sizes for some very basic notion of cluster (it’s not at all easy to pin down what a moving cluster is, and even harder to decide whether a particular bacterium should be part of one or not). But all in all, we believe that the phase diagram of the swarming of Bacillus subtilis looks as follows:

This is just a cartoon, of course. For very small bacterial densities, nothing ever moves. Then, there is a difference between short and long bacteria (we only experimented with 4 types, so of course we can say nothing about continuity; this is a very coarse phase diagram). The short bacteria can always exhibit swarming behavior. For long bacteria, the density dictates whether there will only be small clusters, or whether there will be very large clusters as well. Finally, at densities that are too high, there is again no movement, because things are jammed (this is suggested in another paper; we didn’t look at jammed phases). Pictorially, we may imagine these phases like this:

Neat, isn’t it?

Algorithmics

Some of our analysis required looking at the position, orientation, and velocity of individual bacteria. Extracting this information is not always easy. Here is a rough sketch of the image analysis, which works quite well for the shorter bacteria. For the technically inclined, you can find our MATLAB code here: https://bitbucket.org/renang/bacterial_swarming_phase.

We invert the colors, improve the contrast and clean up the image a bit:

The image is now ready for thresholding: If a pixel is sufficiently bright we keep it and color it white, and if it is too dark we discard it (color it black). Hopefully, this will separate the image into connected components of pixels, each of which represents just a single bacterium. This is wishful thinking.
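In code, the thresholding and connected-component steps look roughly like the following Python sketch. It uses nested lists and a breadth-first search so that it runs without any libraries; the toy image, the cutoff value, and the choice of 4-connectivity are assumptions made for illustration, and our actual MATLAB code is organized differently.

```python
from collections import deque

def threshold(image, cutoff):
    """Binary mask: True (white) where a pixel is at least as bright as cutoff."""
    return [[pixel >= cutoff for pixel in row] for row in image]

def connected_components(mask):
    """Group white pixels into 4-connected blobs; returns a list of pixel lists."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Flood-fill outward from this unvisited white pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components

# Toy "micrograph": two bright rods and one dim speck that gets discarded.
image = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0, 0],
    [0, 0, 0, 0, 8, 0],
    [0, 0, 0, 0, 8, 0],
    [0, 2, 0, 0, 7, 0],
]
mask = threshold(image, cutoff=5)
blobs = connected_components(mask)
sizes = sorted(len(blob) for blob in blobs)
area_fraction = sum(sizes) / (len(image) * len(image[0]))
print(sizes, area_fraction)
```

The last line also computes the covered-area fraction, which is one of the two ways of measuring density mentioned earlier. On real images, of course, touching bacteria end up in the same blob, which is exactly the problem the rest of this section deals with.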

It’s quite obvious that some clumps of pixels are too thin (i.e., we chopped off chunks of the bacterium), while some clumps of pixels are connected where they shouldn’t be, representing more than one bacterium. If we zoom in on the red circle above, for example, we see lots of bacteria whose pixel sets are connected:

Our goal is then to cut these apart, reaching better bacterial identification:

We do this in a two-step algorithm. The first step follows the neat work “Detecting and Tracking Motion of Myxococcus xanthus Bacteria in Swarms”, by Liu et al. This algorithm is based on the topological skeleton, or medial axis, of the background image. The idea is this: Take your white connected-component of pixels, which you think represents more than one bacterium, and surround it by a large black background box. Then, “set fire” to the edges of the box, and see how the fire propagates. When two different sources of fire reach a pixel at the same time, color it red. The result will be a “skeleton” of the background image. For example, if the following connected-component, which represents a single bacterium, is given as input,

the output after drawing the background skeleton is

The idea is that two connected or nearly-connected bacteria will have a very concave shape, and the background’s skeleton will represent this in the form of dangling spurs:

All we have to do in this case is find those two dangling edges and connect them, effectively slicing the white image in twain:

This method works fairly well in practice, though not always; you can even see in the picture above that the cut didn’t really happen where we would have wanted it to happen. Also, connecting the dangling skeleton lines is actually not an easy task: The lines don’t always come in easily recognizable pairs, and in fact sometimes they don’t come in pairs at all, and there are singleton dangling edges which have to be “paired off with infinity”:

So, as often happens in real-life applications, this method isn’t fool-proof, and we are sometimes left with connected-components which are still untouched. For these we apply an iteration of an in-house algorithm which tries to match axes to the bacterial pixel blobs. The algorithm is as follows. We first divide the bacterial pixels into a grid:

From each point in the grid, we send out rays in all directions, extending them until they hit the background, i.e., exit the bacterial pixels. If a ray happens to be aligned with a bacterium, it will be quite long, but if it is perpendicular to a bacterium, it will be short:

So we assume that long rays are well-aligned with some bacterium, and pick the angle that gives the longest line. This gives us an $(x,y)$ coordinate (the initial sampled point) and an angle $\theta$ (of the best-fitting line); together, we get a point in $\mathbb{R}^3$. This point is a very noisy representation of a bacterium. The idea is that if we sample many such points, we’ll be able to cluster similar points together, perhaps revealing the actual bacterial structure hiding in the pixels behind the noise. We do this by using the k-means algorithm for various values of $k$. The result, while again not always accurate, manages to “find out” additional bacteria in the original connected-components. Together with the previous method, many of the bacteria in the image can be found.
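The ray-casting step can be sketched in a few lines of Python. This is a simplified illustration and not a transcription of our MATLAB code: the step size, the number of sampled angles, and the decision to add up the two opposite rays into a single line through the point are all assumptions I made to keep the example short.

```python
import math

def ray_length(mask, y, x, theta, step=0.5):
    """Distance travelled from (y, x) in direction theta before leaving the blob."""
    rows, cols = len(mask), len(mask[0])
    length = 0.0
    while True:
        # Pixel reached after one more step along the ray.
        py = int(round(y + (length + step) * math.sin(theta)))
        px = int(round(x + (length + step) * math.cos(theta)))
        if not (0 <= py < rows and 0 <= px < cols) or not mask[py][px]:
            return length
        length += step

def best_axis(mask, y, x, n_angles=36):
    """Angle in [0, pi) whose line through (y, x) stays inside the blob longest."""
    best_theta, best_len = 0.0, -1.0
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        # A line is two opposite rays glued together.
        total = (ray_length(mask, y, x, theta)
                 + ray_length(mask, y, x, theta + math.pi))
        if total > best_len:
            best_theta, best_len = theta, total
    return best_theta, best_len

# Toy blob: a horizontal rod, 10 pixels long and 2 pixels thick.
mask = [[2 <= r <= 3 and 1 <= c <= 10 for c in range(12)] for r in range(6)]

along = ray_length(mask, 2, 5, 0.0)            # ray pointing along the rod
across = ray_length(mask, 2, 5, math.pi / 2)   # ray pointing across the rod
theta, length = best_axis(mask, 2, 5)
sample = (5, 2, theta)                         # one noisy (x, y, theta) point
print(along, across, sample, length)
```

On the toy rod, the ray along the axis is much longer than the one across it, and the best angle comes out close to horizontal (up to pixel-discretization error). Collecting one such `sample` triple per grid point gives the cloud of points that is then clustered; note that since $\theta$ and $\theta + \pi$ describe the same axis, any clustering has to treat the angle coordinate with some care.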

Overall, I had quite a lot of fun writing the code for this project, trying to solve the various algorithmic problems which arose along the way (we didn’t even talk about how to track bacteria between frames!). If this project interests you in any way, I invite you once more to read the paper. We have also made the MATLAB code available at https://bitbucket.org/renang/bacterial_swarming_phase.