Malware can exploit new flaw in Intel ‘Coffee Lake’ and ‘Skylake’ processors

On Mar 9, 2021

New research has yielded yet another means to pilfer sensitive data by exploiting what is described as the first “on-chip, cross-core” side-channel in Intel Coffee Lake and Skylake processors.

Published by a group of academics from the University of Illinois at Urbana-Champaign, the findings are expected to be presented at the USENIX Security Symposium coming this August.

Information leakage attacks targeting the CPU microarchitecture have previously been shown to break the isolation between user applications and the operating system, allowing a malicious program to access memory used by other programs (e.g., Meltdown and Spectre). The new attack instead leverages contention on the ring interconnect.

The SoC ring interconnect is an on-die bus arranged in a ring topology that enables communication between the different components (aka agents) housed inside the CPU, such as the cores, the last level cache (LLC), the graphics unit, and the system agent. Each ring agent communicates with the ring through what’s called a ring stop.

To build the attack, the researchers reverse-engineered the ring interconnect’s protocols to uncover the conditions under which two or more processes cause ring contention. They then used those conditions to build a covert channel with a capacity of 4.18 Mbps, which the researchers say is the largest to date for cross-core channels that, unlike Flush+Flush or Flush+Reload, do not rely on shared memory.

“Importantly, unlike prior attacks, our attacks do not rely on sharing memory, cache sets, core-private resources or any specific uncore structures,” Riccardo Paccagnella, one of the authors of the study, said. “As a consequence, they are hard to mitigate using existing ‘domain isolation’ techniques.”

Observing that a ring stop always prioritizes traffic that is already on the ring over new traffic entering from its agents, the researchers said a contention occurs when existing on-ring traffic delays the injection of new ring traffic.
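The arbitration rule described above can be illustrated with a toy model. The following Python sketch is not taken from the paper: the function name `injection_delays`, the one-slot-per-cycle timing, and the traffic patterns are all assumptions made for illustration. It only shows how giving priority to traffic already on the ring turns someone else's traffic into extra waiting time for a local agent.

```python
def injection_delays(on_ring_busy, inject_requests):
    """Toy model of a single ring stop (illustrative assumption, not real hardware).

    on_ring_busy[t] is True when a packet already on the ring occupies the
    stop's slot at cycle t. inject_requests lists the cycles at which the
    local agent wants to inject a new packet. On-ring traffic always wins,
    so a request waits until the first free slot.
    Returns the number of cycles each request had to wait.
    """
    delays = []
    next_free = 0  # the agent can inject at most one packet per cycle
    for req in sorted(inject_requests):
        t = max(req, next_free)
        # Passing traffic has priority: skip every occupied slot.
        while t < len(on_ring_busy) and on_ring_busy[t]:
            t += 1
        if t >= len(on_ring_busy):
            raise ValueError("simulation window too short for this request")
        delays.append(t - req)
        next_free = t + 1
    return delays


# Hypothetical traffic: an idle ring versus one with bursts of passing packets.
idle_ring = [False] * 10
busy_ring = [True, True, False, True, True, False, False, False, False, False]
print(injection_delays(idle_ring, [0, 3, 6]))  # [0, 0, 0]
print(injection_delays(busy_ring, [0, 3, 6]))  # [2, 2, 0]
```

In this model the waiting time of the local agent depends only on when other traffic passes its ring stop, which is the property the attack turns into a signal.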

Armed with this information, an adversary can measure the delay in a malicious process’s memory accesses that results when a victim process’s memory accesses saturate the available bandwidth. This, however, requires that the spy process consistently misses in its private caches (L1 and L2) and performs loads from a target LLC slice.
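As a rough illustration of how such latency measurements could carry information, the sketch below decodes bits from a trace of load latencies by averaging fixed-size windows and comparing each average to a threshold. It is a simplification under stated assumptions: the function name `decode_bits`, the window size, the threshold, and the cycle counts are all made up for the example and are not measurements or code from the paper.

```python
from statistics import mean


def decode_bits(latencies, samples_per_bit, threshold):
    """Turn a latency trace into bits (hypothetical decoding scheme).

    Each bit is assumed to span `samples_per_bit` consecutive load-latency
    samples. A window whose mean latency exceeds `threshold` is read as 1
    (contention present), otherwise as 0 (ring idle).
    """
    bits = []
    last_start = len(latencies) - samples_per_bit
    for start in range(0, last_start + 1, samples_per_bit):
        window = latencies[start:start + samples_per_bit]
        bits.append(1 if mean(window) > threshold else 0)
    return bits


# Synthetic trace in "cycles"; the values are invented for illustration only.
trace = [60, 61, 59, 60,   # idle
         72, 74, 73, 71,   # contended
         60, 60, 61, 59,   # idle
         73, 72, 74, 75]   # contended
print(decode_bits(trace, samples_per_bit=4, threshold=66))  # [0, 1, 0, 1]
```

A real receiver would have to obtain the latencies from timed loads that miss in the private caches, as described above, and would have to cope with noise that this sketch ignores.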

In doing so, the repeated latency in memory loads from the LLC due to ring contention can allow an attacker to use the measurements as a side-channel to leak key bits from vulnerable EdDSA and RSA implementations, as well as to reconstruct passwords by extracting the precise timing of keystrokes typed by a victim user.

Specifically, “an attacker with knowledge of our reverse engineering efforts can set itself up in such a way that its loads are guaranteed to contend with the first process’ loads, […] abuses mitigations to preemptive scheduling cache attacks to cause the victim’s loads to miss in the cache, monitors ring contention while the victim is computing, and employs a standard machine learning classifier to de-noise traces and leak bits.”

The study also marks the first time a contention-based microarchitectural channel has been exploited for keystroke timing attacks to infer sensitive data typed by the victim.

In response to the disclosures, Intel categorized the attacks as a “traditional side channel,” which refers to a class of oracle attacks that typically take advantage of the differences in execution timing to infer secrets.

The chipmaker’s guidelines for countering timing attacks against cryptographic implementations recommend adhering to constant-time programming principles:

Runtime is independent of secret values,
The order in which the instructions are executed (aka code access patterns) is independent of secret values, and
The order in which memory operands are loaded and stored (data access patterns) is independent of secret values.
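The first two principles can be illustrated with a byte-string comparison. The sketch below is an assumption-laden illustration rather than Intel's own example: `leaky_equals` and `constant_time_equals` are hypothetical names, and Python itself gives no hard constant-time guarantees, so production code would normally rely on a vetted primitive such as the standard library's `hmac.compare_digest` or on carefully written native code.

```python
import hmac


def leaky_equals(secret: bytes, guess: bytes) -> bool:
    """Comparison that violates the principles: it returns at the first
    mismatch, so runtime and control flow depend on the secret."""
    if len(secret) != len(guess):
        return False
    for s, g in zip(secret, guess):
        if s != g:
            return False  # early exit leaks the position of the mismatch
    return True


def constant_time_equals(secret: bytes, guess: bytes) -> bool:
    """Comparison written in the constant-time style: every byte pair is
    processed and no branch depends on the secret bytes. Only the length
    check can exit early, and lengths are assumed to be public."""
    if len(secret) != len(guess):
        return False
    diff = 0
    for s, g in zip(secret, guess):
        diff |= s ^ g  # accumulate differences without branching
    return diff == 0


# The vetted standard-library equivalent for real use:
print(hmac.compare_digest(b"token-123", b"token-123"))
```

Both functions return the same results; the difference lies in how much their execution pattern reveals about where the inputs differ.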

Additional guidance on safe development practices to mitigate traditional side-channel attacks is also available, as is the source code to reproduce the experimental setup detailed in the paper.
