NME – profiling your app performance

Profilers are some of the most important tools for optimizing an application – yet many developers don’t even know they exist.

There are profilers for most runtimes; that is, you’ll use a different profiler for C++, .NET, Java and even ActionScript. For instance, we use JetBrains dotTrace to profile FlashDevelop (a .NET application), and for ActionScript there is the fantastic Adobe Scout.

Haxe NME, for desktop/mobile targets, cross-compiles your Haxe code into C++ and builds a pure native application, so we’ll use a C++ profiler.

There are many things we can profile:

  • method call timings – how much time is spent executing each function,
  • memory allocation – do you waste/leak memory?
  • framerate – without displaying a FPS counter,
  • CPU usage – are we consuming all the CPU power?
  • GPU usage – do we make efficient calls?
  • etc. (Profiling on Wikipedia)

Let’s see how to profile CPU time on both Mac and Windows.

Time profiling on Mac OS X

If you’re on Mac, then Xcode comes with a complete set of excellent profilers for free; it’s called Instruments.

1. Launch Instruments from Xcode

[Screenshot: opening Instruments from Xcode]

2. Choose Time Profiler tool

[Screenshot: choosing the Time Profiler tool]

3. Launch your app

[Screenshot: launching the app]

When you’ve selected the app, click the big “Record” button (isn’t it cool to profile directly from the device?) – this starts your app and monitors it until you press the button again.

4. Explore the results

[Screenshot: timing results]

By default the results aren’t displayed in a very useful way, so look in the left pane and check “Top Functions” – this orders all the function calls by total time spent.

It’s important to note that the time shown next to a method includes the time of all the sub-calls it makes; in this screenshot you can see that the time spent in FlxGame’s draw() method accounts for nearly 50% of the CPU time.

5. Explore in depth

[Screenshot: line-by-line timing details]

If you double-click on a method in the timing results, you’ll get a line-by-line breakdown of the time spent in sub-methods. Here you can see the C++ code generated from Haxe (which contains a lot of debug information), but it’s fairly easy to work out which original Haxe code it corresponds to. In this sample, most of FlxGame.draw()’s time is actually spent in the FlxState’s draw() method, so you’ll have to dig deeper to find the bottleneck.

6. Do a Release build from time to time

[Screenshot: Release build configuration]

By default, as you can see in the screenshots, NME adds a lot of debug information, which has a noticeable CPU cost. That’s because it’s a “Debug” build; to do a “Release” build you can:

  • either edit the “Scheme” and change the “Build Configuration”,
  • or make an “Archive”, i.e. export an IPA that you’ll install on the device using Xcode Organizer.

In both cases you won’t have as much information in the profiler, but it can dramatically reduce the CPU load – for instance the HaxeFlixel port tests jump from 20 to 60 fps.

Time profiling on PC

On PC, you have the option to pay several thousands of dollars for Visual Studio Ultimate, or get one of the free/cheap alternative profilers; I found Sleepy to be quite good for code timing but you can explore StackOverflow discussions to find more on the subject.

1. Start Sleepy

[Screenshot: Sleepy start window]

2. Select the application

[Screenshot: selecting the executable]

Here I’m going to profile a desktop exe. Haxe NME builds a debug version of your application under cpp/windows/obj. Make sure you select the right working directory so the application can load its assets!

3. Run the profiler

[Screenshot: profiler running]

The profiler will launch the application and monitor its execution – stop it after a little while (click “Ok”) to inspect the results.

4. Explore the results

[Screenshot: profiling results]

Here again I’ve changed the default sorting by clicking the %Inclusive column, so I see first the methods where most of the CPU time is spent. In this sample, the most costly method of my application is my main enterFrame handler, which redraws the scene.

5. Dig in the details

[Screenshot: per-method timings]

As in Xcode Instruments, you’ll get a line-by-line breakdown of the time spent in sub-methods. Here you can see the C++ code generated from Haxe (which contains a lot of debug information), but it’s fairly easy to work out which Haxe code it corresponds to.

Awesome, isn’t it?

Profiling is priceless; you can find exactly where your code sucks instead of complaining that your tech is slow and randomly optimizing your code without really locating the bottleneck.

Take the time to explore and learn how to use profilers – aside from timings, you’ll also want to look at a memory allocation profiler to see if you’re wasting memory on useless allocations! For Mac, Xcode includes one, and for Windows you can use VMMap to get an overview.

3 thoughts on “NME – profiling your app performance”

  1. Philippe, nice writeup. I really like the “very sleepy” profiler – I find the flat view much easier to analyse than the Xcode tree. Apple should look a bit more outward when designing their software.
    For proper hxcpp analysis, you really need to compile for release – there are a few template things that need to be inlined for any sort of performance. On Windows, there is a define, “debuglink”, which will keep the symbols in the exe/ndll but still do the optimisations, so you get more accurate output. You can add this with “-D debuglink”.

  2. Would you happen to know of any tools we might use for memory profiling? I’m leaking memory and I don’t know what’s causing it 🙁
