next up previous contents
Next: 4 Performance Modeling Environments Up: Appnotes Index Previous:2 Purposes

RASSP Token-based Performance Modeling Application Note

3.0 Metrics

The primary measurements associated with token-based performance modeling are throughput, latency, and utilization. Time-line activity graphs display the concurrency of events as a function of time. Related measurements include efficiency which can be derived from the primary measurements as a ratio compared to their theoretical maximum values.

3.1 Throughput

Throughput is the rate at which data is processed by the system. It is usually specified in terms of the number of data-elements or operations that can be processed per unit time. Throughput may be specified as an instantaneous peak or a sustainable average over time. The latter is often more meaningful in determining system size requirements if there is sufficient memory to spread the processing load over time.

3.2 Latency

Latency is the time delay between when an activity starts and when it completes or reaches an equivalent point. Latencies typically apply to processing or data-transfer activities. For instance, a processing latency is the time between when a piece of data is applied to a processing system and when the result due to that piece of data becomes available at the output of the processing system. Or for a communication latency, the latency for a packet transfer is the time delay between when the first word of data transfers from the source device until that first word arrives at the destination device. Note that this definition does not consider the time to transfer the entire packet, as that would be a function of the packet length and the transfer rate (throughput). By defining the metrics in this way, the delays and rates are orthogonal; so latency and throughput are specified separately.

3.3 Utilization

Utilization is the percentage of time that a resource is actively performing an application activity. The opposite of utilization is idleness. Utilization percentages are typically calculated for critical system resources such as processor elements and network linkages or buses.

The instantaneous utilization of a memory element may also be recorded or plotted. Such a measurement is not time-based, but rather expresses the percentage of memory-space consumed or allocated to application data at a given instant in time. The peak value is typically most relevant, and the instantaneous value can be plotted as a function of time.

To calculate a time-based utilization, the total time that a resource is in-use must be accumulated over some time interval. The interval is often assumed from time-zero until several iterations of the algorithm processing have completed to achieve a good steady-state value. Such a time-interval may include some dead-time, as systems can take a while for the data to get distributed to the devices and to reach steady-state processing. If only a few sets of data are applied to the system and the simulation is run until all the data has been processed, then on the trailing-end, many of the devices have long-finished by the time the last device finishes. These effects produce lower utilization percentages than might be expected intuitively.

Alternatively, some designers specify the interval from the first time a given device was used until the last time it was used. However, even this definition can yield misleading results. For instance, when a device is used only once, for only for a very small portion of the total simulation duration, this method would report 100% - utilization even though the device was hardly used.

Therefore, it is recommended that the method for defining the utilization interval be stated and understood while analyzing time-based utilization numbers.

3.4 Time-lines

An activity time-line graph displays the concurrent activity of a set of system resources as a function of time. Typically, the times when a processor is actively computing an application task are shown as a dark or colored bar. Colors often identify the particular software tasks being executed. The intervening times, when a processor is idle while waiting for input data to arrive or while waiting for output data to be sent away, are shown as white-space. Such a plot is specifically called a processing time-line. Processing time-lines provide graphic visualization that shows the designer the patterns of processing concurrency in the design. It also helps reveal bottle-necks and the critical processing paths.

In a similar way, the data transfer activity across the network linkages and buses can be displayed on a time-line graph. Such a plot is specifically called a communications time-line. Communication time-lines help the designer visualize the traffic flow through the system as a function of time. They also help to identify hot-spots or bottlenecks in the architecture or mapping. Such visualization assists in evaluating architectures and guides the designer in balancing the processing load and transfer patterns to achieve the best processing yield from a given architecture.

Although primarily used to display hardware activity, time-lines can also display the activity of software tasks as a function of time. Such a graph is called a software time-line. Colors can be used to identify the particular processor a task is executing on.


next up previous contents
Next: 4 Performance Modeling Environments Up: Appnotes Index Previous:2 Purposes

Approved for Public Release; Distribution Unlimited Dennis Basara