A while ago I read in some dubious piece that beloved and wonderful π can be found everywhere in nature. In circles, in spheres, in electrical circuits, when dropping needles over the American flag. It was stated that one of the places where π loves to appear is in rivers. Specifically, *meandering* rivers – rivers that twist and coil and snake about, definitely not taking the shortest route towards the sea. Here is a nice example of such a river, the Rio Cauto in Cuba:

Supposedly, it was found that the ratio between the path that the river takes, and the actual distance between its starting and ending point is π. This ratio is called “sinuosity”. Amazing, is it not?

This statement bugged me. Why should the ratio be π, and not any other magical number? In fact, I was skeptic on whether this ratio should be constant at all. While there are some beautiful pictures of meandering rivers out there, I have seen plenty of water streams which are rather straight. It turns out that the snakiness of a river depends on the slope of the hill down which it slides. Gravity is a pretty strong force, which doesn’t give the river much choice as to where it will go. As they say, “obey gravity, it’s the law”. Rivers that run in plains and flatlands are much windier than those on mountains.

Further, water, like any other object when influenced by gravity, tends to take the shortest route possible. The fact that it swirls and curves around is because there are obstacles which prevent it from heading straight on. The river may overcome these obstacles by either overflowing them, going around them, or by cutting out a new path. This means that the actual shape of the river strongly depends on the ground conditions – how tall the rocks are, how hard it is to erode them, what type of soil is found there. Even if π does exist somewhere in there, I expect different geographical situations to yield different ratios.

So shall I let this issue pass silently? Of course not. Python to the rescue!

I took on the task of writing a small program that, when given an aerial view of a river, can tell me its sinuosity – the ratio between its winding length, and the actual distance it covers.

How it does so is fairly simple. Calculating the distance between the starting and ending points is rather easy – that’s just the regular, good ol’ Euclidian distance. What’s more difficult is to calculate the actual winding length. Doing so requires going along the river, and measuring its length. Computerwise, this means that we have to create “internal sampling points” inside the river itself. We will then play “connect the dots” and measure the length by summing up the distances between these internal points.

We start with a picture of a river. In this example, I used GoogleMaps and homed in on a small section of minor channel in South America, which eventually merges with the Amazon.

The first step is for the computer to find out where the river is. Humans have a rather easy time at it – any child could come up to the screen and yell “look, the river is right there!”. But alas, computers are dumb and have to be told what to do – currently, the computer just sees a bunch of raw pixel data. The classic way of doing so is by computing gradients in order to find edges. But I wanted something quick and easy, so I chose instead to filter by color. The picture was scanned pixel by pixel, and everything not blue enough was deleted. Afterwards, the picture was scanned again, and individual blue pixels – which might have evaded the previous filter by chance – were deleted. Only sufficiently large groups of blue were allowed to stay. In the end, we are left with something that is clean enough for the computer to work with, if we give it a few hints on what to do. This filter worked fine for me in the above example, but it is very picture specific. When dealing with the Nile, for example, the blue intensity doesn’t work, and something else is needed.

Now we are left with a collection of blue pixels, most of which represent the river. However, we are still dealing with raw data – the pixels of the picture itself. The next step is to find out what path the river takes.

We start with some point that we know is inside the river; this point is given manually, and has to be inputted to the program. From this point on, we look around us at a certain distance r, and ask: which points that are a distance r from us are river points? The identification, of course is done by color, now that everything is blue. We then continue the process, but now with the new points we just found out about as our starting points. This way, we traverse across the entire river, going about jumps of size r.

Another way to picture it is like this: whenever we find a point that we know is inside the river, we draw a circle of radius r around that point. If we happen to draw parts of the circle on blue pixels, we assume that those parts are in the river, and we can repeat the process again.

There are several things to consider when implementing this naive algorithm. First, we must be careful not to return to where we already looked: we want to avoid the case where we say, “oh, the point 10 steps to the left is a river point”, go there, and then say “oh, the point 10 steps to the right is a river point” – because that is where we started from. This may seem trivial, but calculation errors mean that you have to be careful with what you consider as “two identical points”.

Second, the distance r must be chosen carefully. If it is smaller than the width of the river, then instead of going along the length of the river, we may find ourselves going back and forth along the width. On the other hand, r cannot be too big, otherwise we might miss acute twists and turns in the river, or even accidentally skip an entire section of the river, when it bends too close to itself. The ideal would be to use an adaptive radius, which changes based on how thick the river is, and if there is any river beside it. However, this would require much more complicated image analysis, and I chose to remain with the simple algorithm.

Another factor to consider is that we cannot draw a perfect circle – we are dealing with a pixelated environment. The circle must be drawn in discrete angle steps. Again we have a dependence on the step size – if it is too large, then we will skip slender curves in the river. If it is too small, then two adjacent points on the circle will go along the width of the river, and not its length, and this requires correction. Again, an adaptive model would have been ideal, but for the rivers used, constant step size worked with reasonable results.

Speaking of results, here is the same picture after processed by the algorithm (on top: with dots, on bottom: with shown circles).

Now we have a bunch of points which we assume are inside the river. Unfortunately, the circle algorithm doesn’t guarantee that they are generated in proper order. In order to play “connect the dots”, and go along the entire river, we must go from start to finish. Jumping around between random points in the river will give us an incorrect value for distance travelled. Again, we use a naive algorithm for finding the correct path. Assuming we know either the starting or the ending position, we go from point to point by always going to the closest next point we have. This simple method works elegantly when there are no branches in the river, meaning, the river doesn’t diverge into two different channels. If it does, then the algorithm is hopeless can cannot aim to be correct – after it finishes processing one channel, it abruptly jumps to the next, generating an artifact in the path.

The algorithm can be fixed by recognizing when there are large jumps – too large to be in adjacent parts of the river. These jumps are caused when the path taken by the algorithm jumps from channel to channel. When we recognize these, we can treat the two channels as two different rivers, applying the algorithm separately to each channel. This partially solves the problem of divergence, but not of convergence – what if the river splits up, then joins back again? This problem was not addressed here; for the program to deal with it, manual editing has to be performed. Overall, the algorithm gave decent, although not perfect, results (there are some bends which it could have included but missed).

Path with branch correction:

Path without branch correction:

Now that all the dots are connected in the right order, we can start actually measuring. The program ran over the river, and saved the aerial distance, as well as the actual distance covered. So, we can see how the ratio changes along the river itself. The result? Hardly π.

This is a nice looking graph. At first, every twist and turn in the river changes the ratio dramatically – this is because of the relative size of the meanders themselves, when compared with the length we have already traversed along the river. As we go further and further, the value stabilizes to about 2.56.

Could I be missing something here? Maybe. Perhaps I need to look at the entire river in order to get π, although this seems unlikely, since the river in the picture is several dozens of kilometers long. Perhaps this is an exception to the π rule. Or perhaps π just doesn’t play a theoretical part in this river game. Whichever case it is, I’m glad I started this project. I had a lot of fun coding and playing around with various river pictures – the python PIL module made it extraordinarily easy. Perhaps one day I will take a look at the geophysics which govern the true behavior of meanders. Until then – happy image processing, everybody!

Just for fun: remember when I said that if the circle radius r is too big, and the angle step is just so, that it may happen that we sometimes skip river bends and twists by accident? Here is a test case I made to demonstrate the effect: