NME – profiling your app performance
Profilers are some of the most important tools for optimizing an application – yet many developers don’t even know they exist.
There are profilers for most runtimes: you’ll use a different profiler for C++, .NET, Java and even ActionScript. For instance we’re using JetBrains’ dotTrace to profile FlashDevelop (a .NET application), and for ActionScript you have TheMiner, with a new one expected from Adobe soon.
Haxe NME, for desktop/mobile targets, cross-compiles your Haxe code into C++ and builds a pure native application, so we’ll use a C++ profiler.
There are many things we can profile:
- method call timings – how much time is spent executing each function,
- memory allocation – do you waste/leak memory?
- framerate – without displaying a FPS counter,
- CPU usage – are we consuming all the CPU power?
- GPU usage – do we make efficient calls?
- etc. (Profiling on Wikipedia)
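To make the first item concrete, here is a minimal C++ sketch of what measuring method call timings by hand could look like. It is only an illustration of the idea, not how Instruments or Sleepy work internally (they sample the running process instead), and every name in it (ScopedTimer, g_timings, updateScene) is hypothetical.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Accumulated wall-clock time and call count for one label (hypothetical helper).
struct TimingEntry {
    long long totalNs = 0;
    int calls = 0;
};

// Global table of timings, keyed by a label chosen by the caller.
static std::map<std::string, TimingEntry> g_timings;

// Measures how long the enclosing scope lives and adds it to g_timings.
class ScopedTimer {
public:
    explicit ScopedTimer(std::string label)
        : label_(std::move(label)), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        TimingEntry& entry = g_timings[label_];
        entry.totalNs +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
        entry.calls += 1;
    }

private:
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};

// Stand-in for a per-frame function you suspect is slow.
double updateScene(int entities) {
    ScopedTimer timer("updateScene");  // starts timing here, stops at return
    double sum = 0.0;
    for (int i = 0; i < entities; ++i) {
        sum += i * 0.5;
    }
    return sum;
}
```

After a few calls to updateScene(), g_timings["updateScene"] holds the number of calls and the total time spent. A real profiler gives you this for every function without touching your code, which is why it is the better tool.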
Let’s see, on both Mac and Windows, how you can profile where your CPU time goes.
Time profiling on Mac OS X
If you’re on Mac, then Xcode comes with a complete set of excellent profilers for free; it’s called Instruments.
1. Launch Instruments from Xcode
2. Choose the Time Profiler tool
3. Launch your app
When you’ve selected the app, click the big “Record” button (isn’t it cool to profile directly from the device?) – this will start your app and monitor it until you press the button again.
4. Explore the results
By default the results aren’t displayed in a very interesting way so look in the left pane and check “Top Functions” – this will order all the function calls by total time spent.
It’s important to note that the time shown next to a method includes the time of all the sub-calls it makes; in this screenshot you can see that the time spent in FlxGame’s draw() method corresponds to nearly 50% of the CPU time.
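The difference between a method’s own time and its total (inclusive) time can be sketched in a few lines of C++. This is only a toy model of a call tree: the CallNode type, the function names and all the numbers are made up for illustration, and the “sprites” entry is a hypothetical leaf, not something taken from the screenshot.

```cpp
#include <string>
#include <vector>

// One node of a (hypothetical) profiler call tree.
struct CallNode {
    std::string name;
    double selfMs;                   // time spent in the function's own code
    std::vector<CallNode> children;  // the functions it called
};

// Inclusive time = own time + inclusive time of every sub-call.
double inclusiveMs(const CallNode& node) {
    double total = node.selfMs;
    for (const CallNode& child : node.children) {
        total += inclusiveMs(child);
    }
    return total;
}

// Made-up numbers, only to illustrate the shape of the report.
CallNode sampleTree() {
    CallNode sprites{"sprites (hypothetical leaf)", 30.0, {}};
    CallNode state{"FlxState::draw", 5.0, {sprites}};
    CallNode game{"FlxGame::draw", 1.0, {state}};
    return game;
}
```

With these invented values, the top-level draw call reports 36 ms inclusive while only 1 ms is spent in its own code. A large inclusive time therefore tells you where to start digging, not which function is actually slow.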
5. Explore in depth
If you double-click on a method in the timing results, you’ll get a line-by-line breakdown of the time spent in sub-methods. Here you can see the C++ code generated from Haxe (which contains a lot of debug information), but it’s fairly easy to work out which original Haxe code it corresponds to. In this sample, most of the FlxGame.draw() time is actually spent in the FlxState’s draw() method, so you’ll have to dig deeper to find the bottleneck.
6. Do a Release build from time to time
By default, as you can see in the screenshots, NME adds a lot of debug information which has a noticeable CPU cost. That’s because it’s a “Debug” build; to do a “Release” build you can:
- either edit the “Scheme” and change the “Build Configuration”,
- or make an “Archive”, i.e. export an IPA that you’ll install on the device using Xcode Organizer.
In both cases you won’t have as much information in the profiler, but it can dramatically reduce the CPU load – for instance the HaxeFlixel port tests jump from 20 to 60 fps.
Time profiling on PC
On PC, you have the option to pay several thousand dollars for Visual Studio Ultimate, or get one of the free/cheap alternative profilers; I found Sleepy to be quite good for code timing, but you can explore StackOverflow discussions to find more on the subject.
1. Start Sleepy
2. Select the application
Here I’m going to profile a desktop exe. Haxe NME builds a debug version of your application under cpp/windows/obj. Make sure you select the right working directory so it will be able to load the application assets!
3. Run the profiler
The profiler will launch the application and monitor its execution – stop it after a little while (click “Ok”) to inspect the results.
4. Explore the results
Here again I’ve changed the default sorting by sorting on the %Inclusive column, so I see the methods where most of the CPU time is spent. In this sample, the most costly method of my application is my main enterFrame handler, which redraws the scene.
5. Dig in the details
As in Xcode Instruments, you’ll get a line-by-line breakdown of the time spent in sub-methods. Here you can see the C++ code generated from Haxe (which contains a lot of debug information), but it’s fairly easy to work out which Haxe code it corresponds to.
Awesome, isn’t it?
Profiling is priceless; you can find exactly where your code sucks instead of complaining that your tech is slow and randomly optimizing your code without really locating the bottleneck.
Take the time to explore and learn how to use profilers – aside from timings, you’ll also want to look at the Memory Allocations profiler to see if you’re wasting memory doing useless allocations!
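As a rough idea of what a “useless allocation” looks like, here is a small C++ sketch that counts allocations when a buffer is recreated every frame versus reused. It is an analogy only: Haxe code compiled through NME goes through its own runtime and garbage collector, and the CountingAllocator, Buffer type and function names below are all hypothetical.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Number of allocations made through CountingAllocator (illustration only).
static std::size_t g_allocations = 0;

// Minimal allocator that counts every allocation request.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

using Buffer = std::vector<float, CountingAllocator<float>>;

// Wasteful: a fresh buffer is allocated and freed on every frame.
std::size_t allocationsWhenRecreating(int frames) {
    g_allocations = 0;
    for (int f = 0; f < frames; ++f) {
        Buffer vertices(256);
    }
    return g_allocations;
}

// Better: allocate once, then reuse the same storage on every frame.
std::size_t allocationsWhenReusing(int frames) {
    g_allocations = 0;
    Buffer vertices;
    vertices.reserve(256);
    for (int f = 0; f < frames; ++f) {
        vertices.clear();      // keeps the capacity
        vertices.resize(256);  // fits in the existing storage
    }
    return g_allocations;
}
```

The first function allocates at least once per frame, while the second allocates up front and then stays flat. That per-frame pattern is the kind of thing a memory allocations profiler makes visible.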