How should I report on processor utilization for machines running Intel Hyper-threading (HT) technology?

Hyper-threading (HT) is the brand name for the technology Intel uses in many of its Xeon 32-bit processors that enables one physical processor core to execute two instruction streams (or threads) concurrently. On an HT machine, when HT is enabled, each physical processor currently presents two “logical” CPU interfaces to the operating system so that two program threads can be dispatched at a time. The best way to report on processor utilization for an HT machine is to calculate the average utilization of the logical processors associated with the same physical processor core.

Figuring out whether HT is beneficial or detrimental for a specific workload is difficult today unless you can do an apples-to-apples comparison between an HT machine and a non-HT machine running the exact same workload. On an HT machine, all the processor-level resource usage measurements, such as % Processor Time, represent utilization of a logical processor. Some authorities recommend averaging the processor utilization of the two logical processors that share a physical processor core to calculate utilization of the physical processor. To do this, you must understand which logical processors are associated with the same physical processor core.
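The averaging approach described above can be sketched in a few lines of Python. This is only an illustration: the function name, the instance names, the core labels, and the sample utilization values are all hypothetical, and the mapping from logical processor to physical core is assumed to have been obtained already.

```python
from collections import defaultdict

def physical_core_utilization(logical_util, logical_to_core):
    """Average % Processor Time over the logical processors that share a core.

    logical_util:    {logical instance name: % Processor Time}
    logical_to_core: {logical instance name: physical core label}
    Both dictionaries are hypothetical inputs used for illustration.
    """
    grouped = defaultdict(list)
    for instance, pct in logical_util.items():
        grouped[logical_to_core[instance]].append(pct)
    # One averaged value per physical core
    return {core: sum(values) / len(values) for core, values in grouped.items()}

# Made-up sample: two physical cores, each presenting two logical processors
logical_util = {"0": 80.0, "1": 40.0, "2": 20.0, "3": 10.0}
logical_to_core = {"0": "CPU0", "2": "CPU0", "1": "CPU1", "3": "CPU1"}

result = physical_core_utilization(logical_util, logical_to_core)
print(result)
```

In this sample, logical processors 0 and 2 average to 50% for the first core, and 1 and 3 average to 25% for the second.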

Two new Processor configuration records, introduced in Performance Sentry version 2.4.7, allow you to identify HT machines definitively and determine which logical processors share a physical processor core. An instance of a DTS.CPU configuration record is written for each physical processor that is present. These records contain a counter called # Logical Processors Supported that tells you whether the machine is HT-capable, along with a counter called # Logical Processors Active that shows you whether HT is enabled. If the # Logical Processors Supported counter contains a null value, then the machine is not HT-capable. If it contains valid numeric data, then the machine is HT-capable. (You should see a numeric value of 2 for current HT-ready processors. Note that Intel’s processor roadmap shows them contemplating building HT machines with more than 2 logical processors sometime in the future.) On an HT machine, if # Logical Processors Active is less than # Logical Processors Supported, then HT support has been disabled.
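The decision rules above can be expressed as a small Python sketch. The function name and the use of None to stand for a null counter value are assumptions made for illustration; how you actually read the two counters out of a DTS.CPU record depends on your reporting tool and is not shown here.

```python
from typing import Optional

def ht_status(supported: Optional[int], active: Optional[int]) -> str:
    """Classify a physical processor from its two DTS.CPU counters.

    supported: value of # Logical Processors Supported (None if null)
    active:    value of # Logical Processors Active (None if null)
    """
    if supported is None:
        # A null Supported counter means the machine is not HT-capable
        return "not HT-capable"
    if active is None:
        # Assumption: this case is not covered by the text above
        return "HT-capable, state unknown"
    if active < supported:
        return "HT-capable, HT disabled"
    return "HT-capable, HT enabled"

print(ht_status(None, 1))
print(ht_status(2, 1))
print(ht_status(2, 2))
```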

The DTS.CPU records contain some additional CPU hardware configuration data that you might find interesting, such as the amount of L1, L2 and L3 cache memory that is installed, where that information is available.

DTS.LogicalProcessor records are also written that associate a logical processor instance name (the same instance name used in the Processor records) with a DTS.CPU physical processor core parent instance. Both sets of Processor configuration records are automatically written once to the beginning of each NTSMF data file, just before the first interval data records.

The core technology that Intel uses in its HT machines is known as Simultaneous Multithreading (SMT), which you can learn about at this University of Washington Computer Science department web site. Much of the research published there shows SMT to be quite promising. Executing multiple threads simultaneously on the same processor core works well when an instruction from one thread blocks inside the instruction pipeline, because the processor can continue to make forward progress executing instructions from another thread. On the other hand, in practice HT is sometimes detrimental to overall performance on real-world workloads, forcing customers to disable HT in some instances. Threads executing concurrently on the same processor core must contend for shared resources inside the processor, particularly the processor cache. Multiple threads can also interfere with each other’s instruction execution progress, which leads to degraded performance levels. One suggestion is that this interference is more likely to occur when you are attempting to run a homogeneous workload and less likely to occur when the processor is executing threads from unrelated processes. In other words, on a machine dedicated to a specific application or to one instance of SQL Server, HT could do more harm than good.

According to this white paper posted on Microsoft’s web site, logical processor instance names on an HT machine are generated in sequence, one to a physical processor until all the physical processors have one logical processor, and then in sequence again until all the HT logical processors have been accounted for. For example, on an HT-enabled machine with 4 processor cores, processor instances 0 and 4 are associated with the first physical processor present, instances 1 and 5 are associated with the next, and so on. Since the assignment of logical processor numbers to physical processor cores is a BIOS function, the authors of the Microsoft white paper were not entirely certain that every HT machine you ever come across will look this way, but at least every one that they have seen so far conforms to this numbering scheme.
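The numbering scheme described above can be sketched as follows. The function name and parameters are hypothetical, and the sketch assumes the machine follows the observed scheme. Because the assignment is a BIOS function, the DTS.LogicalProcessor records remain the more reliable source where they are available.

```python
def sibling_map(physical_count, logical_per_core=2):
    """Group logical processor instance numbers by physical processor.

    Assumes instances are numbered one per physical processor in sequence,
    then in sequence again for the additional HT logical processors.
    """
    return {
        core: [core + k * physical_count for k in range(logical_per_core)]
        for core in range(physical_count)
    }

# HT-enabled machine with 4 physical processor cores
mapping = sibling_map(4)
for core, instances in mapping.items():
    print(core, instances)
```

For 4 physical processor cores this pairs instances 0 and 4, 1 and 5, 2 and 6, and 3 and 7, matching the example in the text.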
