Using the Simulation Profiler
The simulation profiler included in SIMbox SDK installation is a powerful tool for identifying performance issues and bottlenecks. The tool can help you to pinpoint the source of the issues down to a line of code. Performance issues can occur for multiple reasons, such as scenario complexity, low hardware profile, terrain complexity, entity composition complexity and more. This article will help you to locate the source of the problem in your component code.
Once you have identified that there is an actual performance issue (for example, easily identified by a low frame rate) you should enable the simulation profiler by using the Simulation Settings.
This tool is accessible from the SIMbox SDK add-in toolbar (or menu) in Visual Studio:
or from the start menu (Start->Programs->KnowBook->Settings->Simulation Settings):
Click the Developer Tools tab and check the Enable profiler check box to enable the profiler:
Next time you run the simulation a new pane will open up along with it showing the profiler:
• Toggle profiler On/Off': 'The profiler can create a major overhead for the simulation and generally speaking will always cause an additional performance hit. There should be a certain factor taken into consideration when using the profiler.
While the profiler is off, the FPS graph is active, but no sampling is performed and therefore no additional performance hit. This is useful in cases where you only want to profile during a certain situation or state and not before.
• Toggle always on top': 'This will make sure this window is always on top of other windows.
• FPS graph': 'This graph shows the current (black) FPS graph and the average (blue) FPS. The graph will show and calculate the average for the past 20 seconds only. This graph is a good way to identify FPS spikes and to correlate those to specific situations or actions invoked during the simulation.
• Current FPS':' A value indicating the current FPS
• Last 20 seconds average FPS: A value indicating the current average FPS (20 seconds history)
• View selection: Use this tabbed layout to switch view orientation. Read further to learn more about the different views.
• Data area: This area will show the raw data matrix for the currently selected view.
• Graph area: This area will show a relative line graph of selected data items metrics. The profiler measures CPU consumption as a percentage of the overall frame time, and only from the profiled time. This means that there might be other factors affecting the performance which are not part of the simulation frame loop. Moreover, there is always a line for NON-PROFILED which is a place holder for simulation loop time which is not included in the profiling. The profiler collects runtime information and displays the data in a table. The table is showing the following columns:
• Name – the name of the component
• CPU – the current CPU time the component is consuming
• Average CPU – the overall average component CPU consumption
• Last 20 seconds average CPU – the last 20 seconds average component CPU consumption
There are several different data views, all arranged as separate tab pages:
• All Modules – this view will show all module data – including the simulation engine, extensions and components (LOC/COC/OOC)
• Extensions – this view will show only extensions loaded by the engine
• OC’s (LOC/OOC) – this view will show only LOC’s and OOC’s data (will aggregate all instances in all entities)
• COC’s – this view will show only COC’s data (will aggregate all instances in all entities)
• Base Objects – this view will help you understand which of the base objects is performing poorly. This might happen if the performance hit is caused by a difference in the base object structure and the component it includes.
• Entities – this will help you understand if there is a specific entity which is performing poorly. This might happen if you have two instances of the same base object but one of them is in a different state which is causing a different logic or calculation flow.
• Module Operations – this view will enable you to see more details regarding CPU consumed by different operations your component is performing.
For this article we will create a simple scenario with multiple Ambulance objects following a route and a controlled Cessna172F on the runway. The Cessna was modified to include a new component that was created for the demo. When running the scenario it is obvious that there is a performance issue since the frame rate is very low – average ~20FPS:
Once activating the profiler (using the toggle button), the profiler starts to collect data and populates the lists. In the image below you can see the All Modules view with the sort set to the last 20 seconds average descending.:
As you can see in the image above, it is also possible to view a visual comparison using a simple line graph of the selected items in the data table. It is also clear that the engine is one of the two major consumers of CPU time (you will see why later on). The other major consumer is our newly added component – DemoComponent.
Before we investigate the new component, let’s review the different available views. The image below shows the CPU load by base object type. You can see that the Cessna is the major CPU consumer:
The image below shows the data by entities (an entity is an instance of a base object):
The next image shows all the components (logic and output). Once again, it is clear that our component is consuming CPU:
The most important view is the Module Operations view. Here you can investigate the specific operations your component is performing and see the CPU consumption distribution to identify the bottlenecks. In the image below you can see the Engine module breakdown. You can see that the most consuming operation is called [Update] update Scenes. This is the phase that the graphics engine is rendering all the scenes and cameras and will always be the major consumer (since we are discussing frame time percentage and not time).
The next image shows the breakdown for our component. Since we are using the default registered operations, there is still a lot to discover, but you can easily see that the update method/operation is consuming a lot of CPU. There is also a callback operation that we will notice later on.
When you review the code you can see that there is a single call to another method. Usually the code will be much more complicated and this is a simplified example for obvious reasons:
And the implementation below:
In order to use the profiler full capabilities, add an include directive to your code file:
Wrap the suspected code segment using these macro calls:
Build and run the scenario. You will see the new operation added to the list:
You may also notice that the CPU consumption of the update operation is now minor. This is happening because of the fact that there are nested macro calls and the profiler will only show the operation exclusive CPU consumption. Let’s add some more macro calls in the next suspected method. Notice the nesting of macros (it should be pretty obvious now why it is consuming CPU time…)
Review the new output below:
See the code segment below and it should be clear why there was another entry under the callback function for attribute change:
Using the techniques described in this article it should be possible to pin point a performance issue up to a single line of code.