The Samovar


Fast fractals with Python and numpy

This will be of little interest to people who regularly read my blog, but might be of some interest to people who find their way here by the power of Google.

The standard way to compute fractals like the Mandelbrot set using Python and numpy is to use vectorisation and do the operations on a whole set of points. The problem is that this is slower than it needs to be because you keep doing computations on points that have already escaped. This can be avoided though, and the version below is about 3x faster than the standard way of doing it with numpy.
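For comparison, the standard vectorised approach iterates every point at every step, using a mask only to decide which pixels to record. Here is a minimal sketch of that approach (the function name mandel_standard is mine, for illustration):

from numpy import *

def mandel_standard(n, m, itermax, xmin, xmax, ymin, ymax):
    ix, iy = mgrid[0:n, 0:m]
    c = linspace(xmin, xmax, n)[ix] + 1j*linspace(ymin, ymax, m)[iy]
    z = copy(c)
    img = zeros(c.shape, dtype=int)
    for i in xrange(itermax):
        # every point is iterated, even ones that have already
        # escaped; where() keeps escaped points frozen so their
        # values don't overflow
        z = where(abs(z) > 2.0, z, z*z + c)
        # record the iteration number only for newly escaped points
        img[(img == 0) & (abs(z) > 2.0)] = i + 1
    return img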

The trick is to create a new array at each iteration that stores only the points which haven’t yet escaped. The slight complication is that if you do this you need to keep track of the x, y coordinates of each of the points as well as the values of the iterate z. The same trick can be applied to many types of fractals and makes Python and numpy almost as good as C++ for mathematical exploration of fractals.

I’ve included the code below, both with and without explanatory comments. The 400×400 image below, computed with 100 iterations, took 1.1s on my 1.8GHz laptop:

[Image: the resulting 400×400 rendering of the Mandelbrot set]

Uncommented version:

from numpy import *

def mandel(n, m, itermax, xmin, xmax, ymin, ymax):
    ix, iy = mgrid[0:n, 0:m]
    x = linspace(xmin, xmax, n)[ix]
    y = linspace(ymin, ymax, m)[iy]
    c = x+complex(0,1)*y
    del x, y
    img = zeros(c.shape, dtype=int)
    ix.shape = n*m
    iy.shape = n*m
    c.shape = n*m
    z = copy(c)
    for i in xrange(itermax):
        if not len(z): break
        multiply(z, z, z)
        add(z, c, z)
        rem = abs(z)>2.0
        img[ix[rem], iy[rem]] = i+1
        rem = ~rem
        z = z[rem]
        ix, iy = ix[rem], iy[rem]
        c = c[rem]
    return img

Commented version:

from numpy import *

def mandel(n, m, itermax, xmin, xmax, ymin, ymax):
    '''
    Fast mandelbrot computation using numpy.

    (n, m) are the output image dimensions
    itermax is the maximum number of iterations to do
    xmin, xmax, ymin, ymax specify the region of the
    set to compute.
    '''
    # The point of ix and iy is that they are 2D arrays
    # giving the x-coord and y-coord at each point in
    # the array. The reason for doing this will become
    # clear below...
    ix, iy = mgrid[0:n, 0:m]
    # Now x and y are the x-values and y-values at each
    # point in the array, linspace(start, end, n)
    # is an array of n linearly spaced points between
    # start and end, and we then index this array using
    # numpy fancy indexing. If A is an array and I is
    # an array of indices, then A[I] has the same shape
    # as I and at each place i in I has the value A[i].
    x = linspace(xmin, xmax, n)[ix]
    y = linspace(ymin, ymax, m)[iy]
    # c is the complex number with the given x, y coords
    c = x+complex(0,1)*y
    del x, y # save a bit of memory, we only need z
    # the output image coloured according to the number
    # of iterations it takes to get to the boundary
    # abs(z)>2
    img = zeros(c.shape, dtype=int)
    # Here is where the improvement over the standard
    # algorithm for drawing fractals in numpy comes in.
    # We flatten all the arrays ix, iy and c. This
    # flattening doesn't use any more memory because
    # we are just changing the shape of the array, the
    # data in memory stays the same. It also affects
    # each array in the same way, so that index i in
    # array c has x, y coords ix[i], iy[i]. The way the
    # algorithm works is that whenever abs(z)>2 we
    # remove the corresponding index from each of the
    # arrays ix, iy and c. Since we do the same thing
    # to each array, the correspondence between c and
    # the x, y coords stored in ix and iy is kept.
    ix.shape = n*m
    iy.shape = n*m
    c.shape = n*m
    # we iterate z->z^2+c with z starting at 0, but the
    # first iteration makes z=c so we just start there.
    # We need to copy c because otherwise the operation
    # z->z^2 will send c->c^2.
    z = copy(c)
    for i in xrange(itermax):
        if not len(z): break # all points have escaped
        # equivalent to z = z*z+c but quicker and uses
        # less memory
        multiply(z, z, z)
        add(z, c, z)
        # these are the points that have escaped
        rem = abs(z)>2.0
        # colour them with the iteration number, we
        # add one so that points which haven't
        # escaped have 0 as their iteration number,
        # this is why we keep the arrays ix and iy
        # because we need to know which point in img
        # to colour
        img[ix[rem], iy[rem]] = i+1
        # ~rem is the array of points which haven't
        # escaped, in numpy ~A for a boolean array A
        # is the NOT operation.
        rem = ~rem
        # So we select out the points in
        # z, ix, iy and c which are still to be
        # iterated on in the next step
        z = z[rem]
        ix, iy = ix[rem], iy[rem]
        c = c[rem]
    return img

if __name__=='__main__':
    from pylab import *
    import time
    start = time.time()
    I = mandel(400, 400, 100, -2, .5, -1.25, 1.25)
    print 'Time taken:', time.time()-start
    I[I==0] = 101
    img = imshow(I.T, origin='lower')
    img.write_png('mandel.png', noscale=True)
    show()

22 Comments

You are a GOD

Comment by Al Juarîsmi

I’m wondering about

# less memory
multiply(z, z, z)
add(z, c, z)

Is this really better than

z[:] = z[:]*z[:] + c[:]

I would expect that multiply does a loop and add does a loop, while the vectorized assign does a loop to compute a temporary array, then another loop to assign to the result.

Comment by benny

Hey benny, I did some timings on this with IPython. For arrays of length 10000 I got that my method takes around 40 microseconds per loop, whereas yours takes 50 microseconds. For arrays of length 1000000 mine takes around 8ms and yours around 20ms – so it’s a significant saving, especially for larger arrays.

Also, your analysis is not quite right. First of all, your code z[:]=z[:]*z[:]+c[:] in fact does exactly the same as z[:]=z*z+c. This translates into something like:

temp1 = empty_like(z)
multiply(z, z, temp1)
temp2 = empty_like(z)
add(temp1, c, temp2)
z[:] = temp2

In other words, your version creates two intermediate temporary variables, and does one extra loop and one extra copy. It also requires three Python function calls instead of just two in my case.
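You can redo this kind of timing yourself in an IPython session with something like the following (a sketch; the exact numbers will of course vary by machine):

from numpy import ones, zeros, multiply, add
z = ones(1000000, dtype=complex)
c = zeros(1000000, dtype=complex)
# in-place version: two ufunc calls, no temporaries
# (with c=0, z stays at 1 so the values never overflow)
%timeit multiply(z, z, z); add(z, c, z)
# temporary-array version: two temporaries plus a copy
%timeit z[:] = z*z + c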

Comment by Dan | thesamovar

You rule man. Thanks a bunch!

Comment by Blaine

It’s not working! :*( It keeps saying there is an unexpected indent on lines 19 through 22. I’m new to Python, sooo yeah…

Comment by Anonymous

In Python, whitespace is significant, so if you don’t copy the program exactly it might cause problems. Try using the ‘copy to clipboard’ button at the top right of the code window.

Comment by Dan | thesamovar

UPDATE: Using numexpr, it’s possible to get this running even faster. The trick is to combine several iterations into one, so at each step you compute z -> (z**2+c)**2+c instead of z -> z**2+c. Using 8 iterations per step seems to be optimal and gives a speed-up of 4.5x over the version above. Here’s the code:

from numpy import *
import numexpr as ne

def mandel(n, m, itermax, xmin, xmax, ymin, ymax):
    ix, iy = mgrid[0:n, 0:m]
    x = linspace(xmin, xmax, n)[ix]
    y = linspace(ymin, ymax, m)[iy]
    c = x+complex(0,1)*y
    del x, y # save a bit of memory, we only need z
    img = zeros(c.shape, dtype=int)
    ix.shape = n*m
    iy.shape = n*m
    c.shape = n*m
    z = copy(c)
    for i in xrange(itermax):
        if not len(z): break # all points have escaped
        multiply(z, z, z)
        add(z, c, z)
        rem = abs(z)>2.0
        img[ix[rem], iy[rem]] = i+1
        rem = ~rem
        z = z[rem]
        ix, iy = ix[rem], iy[rem]
        c = c[rem]
    return img

def nemandel(n, m, itermax, xmin, xmax, ymin, ymax,
             depth=1):
    expr = 'z**2+c'
    for _ in xrange(depth-1):
        expr = '({expr})**2+c'.format(expr=expr)
    itermax = itermax/depth
    print 'Expression used:', expr
    ix, iy = mgrid[0:n, 0:m]
    x = linspace(xmin, xmax, n)[ix]
    y = linspace(ymin, ymax, m)[iy]
    c = x+complex(0,1)*y
    del x, y # save a bit of memory, we only need z
    img = zeros(c.shape, dtype=int)
    ix.shape = n*m
    iy.shape = n*m
    c.shape = n*m
    z = copy(c)
    for i in xrange(itermax):
        if not len(z): break # all points have escaped
        z = ne.evaluate(expr)
        rem = abs(z)>2.0
        img[ix[rem], iy[rem]] = i+1
        rem = ~rem
        z = z[rem]
        ix, iy = ix[rem], iy[rem]
        c = c[rem]
    img[img==0] = itermax+1
    return img

if __name__=='__main__':
    from pylab import *
    import time
    doplot = True
    args = (1000, 1000, 100, -2, .5, -1.25, 1.25)
    start = time.time()
    I = mandel(*args)
    print 'Mandel time taken:', time.time()-start
    start = time.time()
    I2 = nemandel(*args)
    print 'Nemandel time taken:', time.time()-start
    start = time.time()
    I3 = nemandel(*args, depth=2)
    print 'Nemandel 2 time taken:', time.time()-start
    start = time.time()
    I4 = nemandel(*args, depth=3)
    print 'Nemandel 3 time taken:', time.time()-start
    for d in xrange(4, 9):
        start = time.time()
        I4 = nemandel(*args, depth=d)
        print 'Nemandel', d, 'time taken:', time.time()-start

    if doplot:
        subplot(221)
        img = imshow(I.T, origin='lower')
        subplot(222)
        img = imshow(I2.T, origin='lower')
        subplot(223)
        img = imshow(I3.T, origin='lower')
        subplot(224)
        img = imshow(I4.T, origin='lower')
        show()
Comment by Dan | thesamovar

I have been toying with your code and have discovered an improvement. Rather than:

x = linspace(xmin, xmax, n)[ix]
y = linspace(ymin, ymax, m)[iy]
c = x+complex(0,1)*y

which uses advanced indexing, instead do:

x = linspace(xmin, xmax, n)
x.shape = n, 1
y = linspace(ymin, ymax, m)
y.shape = 1, m
c = x+complex(0,1)*y

which works because of numpy’s broadcasting. It’s about 0.1 seconds faster on my machine for a 1920×1080 image. Admittedly 0.1 seconds isn’t much, but I found it interesting!

Comment by Paul Etherton

Hi Paul, thanks for that! I hadn’t really thought about optimising the startup costs and that’s a neat trick.

Comment by Dan | thesamovar

I’m stoked to play around with a Mandelbrot set, but the original program only gave me a matrix. The new version is not working on my Python 3.2.3, either, even after putting parens around the print arguments. “Numexpr” wasn’t found. Keep us posted if you upgrade the code!

Comment by Peter Farrell

Hi Peter – numexpr hasn’t been upgraded to Python 3 yet, but the version in the original post that doesn’t use it should work fine.

The output of the mandel function is a matrix, which you can convert into an image – that’s what the bit at the bottom of the program does: it converts the matrix to a PNG file.
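For example, something like this will go straight from the matrix to a PNG (a sketch using matplotlib’s imsave):

from pylab import cm, imsave
I = mandel(400, 400, 100, -2, .5, -1.25, 1.25)
I[I==0] = 101   # give the interior its own colour
imsave('mandel.png', I.T, cmap=cm.jet, origin='lower')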

Comment by Dan | thesamovar

Okay, being a super newbie, how come when I copy and run this it won’t work? I have no idea what numpy is… Can someone let me know where that file is?

Comment by Anonymous

You need to have numpy installed to run it. If you install matplotlib from here: http://www.ehow.com/how_8519555_install-pylab-python.html, it will have the files you need.

Comment by redlac1

Hi Dan — Great stuff! I know Python, and I don’t know numpy/pylab, so please forgive me if this is really stupid. It appears to me that the coloring algorithm you use here produces unexpected artifacts when applied to subregions. These are easier to see in context (see http://anvil.zeetix.net/abaxoscope.html, zoom in at least 2 clicks).

Boundaries that are subtle when the entire set is rendered are colored incorrectly when a portion is rendered. I used an image size of 256×256 in rendering the above, and here is an example parameter set that demonstrates the problem (this is for the top left tile at zoom level 2):

n = m = 256
itermax = 100
xmin = -2.0
xmax = -1.375
ymin = -1.25
ymax = -0.625

I created these images using Python 2.7.2 and numpy 1.7.0 on a Windows 7-Pro system.

I wonder if you could perhaps write more about the algorithm for actually coloring each point in the image? I *think* the relevant line is “img[ix[rem], iy[rem]] = i+1”. Since I’m a numpy newbie, I’m hazy about what this is doing. A coloring scheme used by a different approach (that, sadly, is not nearly as beautiful!) that avoids these artifacts is to calculate the RGB values separately:
r = i % 4 * 64
g = i % 8 * 32
b = i % 16 * 16
color = b * 65536 + g * 256 + r

Perhaps there’s a way to preserve your coloring scheme while avoiding the anomalies in certain (but not all!) of the tiles?

In any case, your example is totally awesome. I’d love to get this coloring issue solved, then I can *really* have fun with it.

Comment by Thomas M. Stambaugh

Hi Thomas, interesting page – it’s generated using my Python code plus some sort of clever zooming HTML thing? Nice!

The problem is that you are rendering rectangular blocks separately and the colouring algorithm isn’t designed for this. The way it colours is to use the hot-cold (jet) colour map from the lowest to highest stored iteration number. This means you always get a nice scaling on any given image, but they are not comparable on zooms or across tiles. To get that effect, you’d want to use a cyclic colour map. This is something that can be done in numpy. Just choose a number of iterations as the cycle (say 100), take x=(i%100)/100 and then the colour would be cmap(x) for cmap a cyclic colour map. Pick one from http://www.scipy.org/Cookbook/Matplotlib/Show_colormaps (e.g. hsv will do the trick). Now when you call imshow add the keywords cmap=cmap, vmin=0, vmax=1.

Unfortunately this won’t work with my stupid trick to make the interior coloured differently, so you’ll have to handle that separately. I don’t have time to write the code now, but you can apply a matplotlib cmap to a 2D array and get back a 3D array where the last dimension has size 4 and gives the RGBA components, and this can be passed to imshow. So, first create an array of the right shape initialised to black (zeros), then set the points which are not in the interior to have the colours output by applying cmap. Hope that makes sense, sorry I don’t have more time to write it out fully.
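In outline, though, it would be something like this (an untested sketch; the colour helper function is just for illustration):

import matplotlib.cm as cm

def colour(img, cycle=100):
    # cyclic value in [0, 1) from the iteration counts
    x = (img % cycle) / float(cycle)
    rgba = cm.hsv(x)               # RGBA array of shape (n, m, 4)
    rgba[img == 0] = (0, 0, 0, 1)  # interior points stay black
    return rgba

and then imshow(colour(I).transpose(1, 0, 2), origin='lower') to display it.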

Do post results and code here if you do it, would love to see the result.

Comment by Dan | thesamovar

Ah, I’m glad it wasn’t just some stupid error on my part.

My “abaxoscope” (a device for viewing things that only exist as a result of computation, of which Mandelbrot sets are an example) must remain a sideline while I make its underlying platform viable. That means I’ll have to restrict myself to working on it on occasional weekends.

Your approach sounds perfectly reasonable, and has the added benefit that it will force me to actually wrap my head around numpy and matplotlib – things I really MUST do.

Of course, if you know more than 17 people who might be interested in paying real money for such magic, I’m all ears :).

Comment by Thomas M. Stambaugh

I solved one problem and uncovered another. It turns out that there’s a totally easy way to fix the original rendering issue (thank you, stackoverflow.com!). Simply call imshow with “0” and “itermax” as the “vmin” and “vmax” arguments:

imshow(I.T, origin='lower', vmin=0, vmax=100)

Problem solved!

Sadly, however, it reveals another problem that I’m still chasing – a memory leak in imshow. Its memory consumption grows without bound until it crashes my 32-bit Python by using 2GB of memory. I’ll see if I can get around that, perhaps by just calling PIL directly.

I’ll update the abaxoscope link once I get it running properly. Although five zoom levels are fun, I *really* want to demonstrate the full effect of arbitrary zoom.

Comment by Thomas M. Stambaugh

Hi Tom, yes that vmin/vmax thing will work, but I still think the colour map will get less interesting the more you zoom in. Another issue is that you will probably reach the limit of floating point arithmetic quite quickly as you start zooming in. Numpy does have support for higher precision with dtype=float128, but that will only get you so much further. At some point you’ll need arbitrary precision arithmetic and that’ll mean not using numpy.
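To give an idea of what that would look like, here’s a sketch of the iteration for a single point using the standard library’s decimal module (the function name is mine; it would be far slower than numpy):

from decimal import Decimal, getcontext

def escape_time(cx, cy, itermax, prec=50):
    # iterate z -> z^2+c for one point, doing the complex
    # arithmetic by hand with arbitrary precision reals
    getcontext().prec = prec
    cx, cy = Decimal(cx), Decimal(cy)
    zx, zy = Decimal(0), Decimal(0)
    for i in xrange(itermax):
        zx, zy = zx*zx - zy*zy + cx, 2*zx*zy + cy
        if zx*zx + zy*zy > 4:
            return i + 1
    return 0   # never escaped

e.g. escape_time('-0.7453', '0.1127', 1000, prec=100) – pass the coordinates as strings to keep full precision.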

Comment by Dan | thesamovar

Ah, the floating point question really gets my attention. I’m going to have the same issue in my mainline activity (which is all about the Google Map V3 functionality) because V3 expresses world coordinates as floats between 0 and 256. I’ve been worried about the implications of that, and your comment makes me more so.

It looks like I’m going to have to understand the Decimal package anyway; I guess maybe I’ll go back to more conventional approaches to rendering the Mandelbrot tiles.

Comment by Thomas M. Stambaugh

I would guess even 32 bit floats would be enough for map coordinates (though on the edge), and 64 bit floats would certainly be plenty! For Mandelbrot though, once you start zooming in you’ll reach that boundary pretty quickly. Unfortunately, it will get horribly slow once you start zooming in that much (you’ll also need to increase the number of iterations a lot).

Comment by Dan | thesamovar

Great news! It turns out there are no floating point problems *at all* (at least that I can see), so long as I render the image one tile at a time. This is great, because it is precisely what I hoped for. I did some tiles at zoom level 19, and found some gorgeous detail just as I had hoped. I also found and solved the memory problem; it turned out to be just my misunderstanding of what “imshow” actually does.

I have to do some mainline work on the rest of my stuff that will actually help this along. When I get the demo updated, I’ll ping you (the same link I posted above will work, just with more detail).

Comment by Thomas M. Stambaugh

Cool – do let me know when it’s done! 🙂

Comment by Dan | thesamovar



