premature optimization is the root of all…oh wait (a roguelike in python, #11)

Up to now I’ve been running the demo in a rather small window (about 40×20 cells) with a small viewport (19×19). Actually I like the idea of a game this size, but with the demo I’d like to try something bigger. Since I’m scrolling I knew I might run into issues with how fast the screen redraws (as I’m redrawing the whole thing every turn).

So first I put the level in a text file and made it much bigger (it’s now about 80×50), and then I increased the window size, and the viewport size to 40×40. As I found out the game really got sluggish. Reducing the viewport to 30×30 improved things to make it playable, and so I wanted to see exactly where things were slowing down.

To start, I firmly believe in first writing without optimization in mind. Much of what I write is very verbose, using strings insteads of integers for properties and so on. I think that for what I’m doing the bottlenecks are going to come in specific cases, and it’s not really necessary to bitshift when I can just multiply integers, or whatever real programmers do, I’m sure I’m getting the terms wrong here.

As with many things Python makes it really easy to use support modules and tools, at least for basic things. So what follows here is an idiot’s guide to profiling the bottlenecks in your code.

The cProfile module is dead simple to use and in the standard library. I used it in two ways. First, from the command line, you can do this:

python -m cProfile -o nameOfYourOutputFile nameOfYourGameFile

Your output file doesn’t need to exist first. What this will do is start up your game using the cProfile module to time how many seconds are spent in each function call. When you exit your game you’ll find the output file in the game’s directory. Then you can create a simple .py script like so (this is all in the Python documentation for cProfile by the way) — let’s say your output file was called test_profile:

import pstats
p = pstats.Stats('test_profile')

p.sort_stats('time').print_stats(10)

Running this script will give you a breakdown of time spent. You can sort this output in various ways, see the docs for details.

Now what if you want to profile only one function in your code? When I ran the first script I saw that the viewport draw call was taking a significant portion of time, but the output doesn’t break down the draw function itself. All you need to do is something like this, where you call your draw function:

        PROFILER.runcall(self.view.update)
        PROFILER.dump_stats('test_profile')

Remembering at the top of your game file to import cProfile and set up the profiler like so:

#############################################
# imports
#############################################

import os
import random
import yaml
import cProfile


import libtcod.libtcodpy as libtcod


PROFILER = cProfile.Profile()

That’s it. Then you just run the pstats script above in the same way.

These are the calls that take the most time in my draw function:

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      202    7.430    0.037   16.854    0.083 diana.py:427(draw)
   194171    1.873    0.000    4.267    0.000 C:\Documents and Settings\GO\My Documents\dev\diana\libtcod\libtcodpy.py:58(__mul__)
   194171    1.393    0.000    2.036    0.000 C:\Documents and Settings\GO\My Documents\dev\diana\libtcod\libtcodpy.py:74(int_to_col)
   182002    1.360    0.000    1.647    0.000 C:\Documents and Settings\GO\My Documents\dev\diana\libtcod\libtcodpy.py:448(console_put_char)
   181800    0.926    0.000    0.926    0.000 C:\Documents and Settings\GO\My Documents\dev\diana\libtcod\libtcodpy.py:454(console_set_back)
   194171    0.642    0.000    0.642    0.000 C:\Documents and Settings\GO\My Documents\dev\diana\libtcod\libtcodpy.py:50(__init__)
   181800    0.496    0.000    0.496    0.000 diana.py:415(viewport_ij_to_world_xy)
   181800    0.494    0.000    0.494    0.000 diana.py:117(cell_type_get)
    94940    0.390    0.000    0.390    0.000 diana.py:421(xy_in_viewport)
    83478    0.380    0.000    0.380    0.000 C:\Documents and Settings\GO\My Documents\dev\diana\libtcod\libtcodpy.py:1050(map_is_in_fov)
   194171    0.358    0.000    0.358    0.000 {isinstance}
   203134    0.345    0.000    0.345    0.000 {method 'append' of 'list' objects}
   182002    0.288    0.000    0.288    0.000 {ord}

So it looks like multiplying colors together takes a significant portion of time. I may have to look into precaching this information somehow. However there is an even easier way to speed things up, though it requires a third-party module — the Psyco module.

All you need to do to get Psyco working (after downloading and installing it) is put in your game file like this:

if __name__ == '__main__':
    try:
        import psyco
        psyco.full()
    except ImportError:
        pass
    
    game = Game()
    player = Awesome()

    while not game.exit:
        game.update()
        player.ui.draw()

You can also have Psyco optimize only specific functions in your code — as it uses more memory in its overhead, this may be important for some people.

I did notice a speed improvement with Psyco, but as it’s an external dependency my first line of attack will be speeding up the routines in the draw code itself. It may be enough to keep the viewport a reasonable size, thus drawing less to the screen. Of course I haven’t added any real gameplay yet…now that can’t take much time to compute, can it?

Advertisements

No comments yet

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: