New ‘Performance Cloning’ Techniques Designed to Improve Computer Chip Memory Systems

North Carolina State University researchers have developed software using two new techniques to help computer chip designers improve memory systems. The techniques rely on “performance cloning,” which can assess the behavior of software without compromising privileged data or proprietary computer code.

Computer chip manufacturers try to design their chips to provide the best possible performance. But to find the most effective designs, manufacturers need to know what sort of software their clients will be using.

“For example, programs that model protein folding use a lot of computing power, but very little data – so manufacturers know to design chips with lots of central processing units (CPUs), but significantly less memory storage than would be found on other chips,” says Yan Solihin, an associate professor of computer engineering at NC State and an author of two papers describing the new techniques.

However, many large customers – from major corporations to Wall Street firms – don’t want to share their code with outsiders. And that makes it tough for chip manufacturers to develop the best possible chip designs.

One way to address this problem is through performance cloning. The concept behind performance cloning is that a chip manufacturer would give profiler software to a client. The client would use the profiler to assess its proprietary software, and the profiler would then generate a statistical report on the proprietary software’s performance. That report could be given to the chip manufacturer without compromising the client’s data or code.

The profiler report would then be fed into generator software, which can develop a synthetic program that mimics the performance characteristics of the client’s software. This synthetic program would then serve as the basis for designing chips that will better meet the client’s needs.
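The profiler-to-generator flow can be illustrated with a toy sketch. This is not the researchers' software: the function names `profile` and `generate` are hypothetical, and using a histogram of address strides as the "statistical report" is an assumption made only to keep the example small. The point it shows is that the generator sees only the summary, never the original trace.

```python
import random
from collections import Counter

def profile(trace):
    """Client side: reduce an address trace to a histogram of strides
    (differences between consecutive addresses). Hypothetical report format."""
    strides = Counter(b - a for a, b in zip(trace, trace[1:]))
    return {"length": len(trace), "strides": dict(strides)}

def generate(report, base=0, seed=0):
    """Manufacturer side: build a synthetic trace by sampling strides
    from the reported histogram. Only the report is needed."""
    rng = random.Random(seed)
    values = list(report["strides"])
    weights = list(report["strides"].values())
    addr = base
    clone = [addr]
    for _ in range(report["length"] - 1):
        addr += rng.choices(values, weights=weights)[0]
        clone.append(addr)
    return clone

# Stand-in for a proprietary access pattern.
original = [0, 8, 16, 24, 32, 4096, 4104, 4112]
report = profile(original)   # this summary is what leaves the client
clone = generate(report)     # synthetic stand-in used for design studies
```

A real profiler would capture far richer statistics than this; the sketch only mirrors the division of labor described above.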

Previous work at Ghent University and the University of Texas at Austin has used performance cloning to address issues related to CPU design – but those initiatives did not focus on memory systems, which are an important element of overall chip design.

The NC State researchers have now developed software using two new techniques to help optimize memory systems.

The first technique, called MEMST (Memory EMulation using Stochastic Traces), assesses memory in a synthetic program by focusing on the amount of memory a program uses, the location of the data being retrieved and the pattern of retrieval.

For example, MEMST looks at how often a program retrieves data from the same location in a short period of time, and at how likely a program is to retrieve data from a location that is near other data that’s been retrieved recently. Both of these variables affect how quickly the program can retrieve data.
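These two properties are commonly called temporal and spatial locality. The sketch below shows one simple way they could be measured on an address trace. It is an illustration under assumed definitions, not MEMST itself: the function names, the `window` of recent accesses, and the `radius` in bytes are all hypothetical parameters.

```python
def temporal_reuse_rate(trace, window):
    """Fraction of accesses whose exact address was also touched
    within the previous `window` accesses."""
    hits = 0
    for i, addr in enumerate(trace):
        if addr in trace[max(0, i - window):i]:
            hits += 1
    return hits / len(trace)

def spatial_neighbor_rate(trace, window, radius):
    """Fraction of accesses that land within `radius` bytes of
    (but not exactly on) an address touched in the previous `window` accesses."""
    hits = 0
    for i, addr in enumerate(trace):
        recent = trace[max(0, i - window):i]
        if any(0 < abs(addr - r) <= radius for r in recent):
            hits += 1
    return hits / len(trace)

# Made-up trace: two addresses revisited, then a jump to a distant region.
trace = [0, 64, 0, 64, 4096, 4104]
temporal = temporal_reuse_rate(trace, window=2)
spatial = spatial_neighbor_rate(trace, window=2, radius=64)
```

Higher values of either rate generally mean caches and row buffers can serve more requests quickly.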

The second technique, called MeToo, focuses on memory timing behavior – how often the program retrieves data and whether the program has periods in which it makes many memory requests in a short time. Memory timing behavior can have a significant impact on how a computer’s memory system is designed.

For example, if you think of memory requests as cars, you don’t want to have a traffic jam – so you may want to be sure there are enough lanes for the traffic. These traffic lanes equate to memory bandwidth; the broader the bandwidth, the more lanes there are.
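The traffic analogy can be made concrete with a small sketch. It is not MeToo's model: the function names are hypothetical, and treating bandwidth as a fixed number of requests served per cycle (`lanes`) is a simplifying assumption. The first function measures how bursty a request stream is; the second shows how many requests pile up for a given number of lanes.

```python
def peak_window_count(timestamps, window):
    """Most requests issued within any `window`-cycle interval.
    Assumes `timestamps` is sorted in ascending order."""
    best = 0
    start = 0
    for end, t in enumerate(timestamps):
        while t - timestamps[start] >= window:
            start += 1
        best = max(best, end - start + 1)
    return best

def peak_backlog(arrivals_per_cycle, lanes):
    """Largest number of requests left waiting when at most
    `lanes` requests can be served per cycle."""
    backlog = 0
    peak = 0
    for arrivals in arrivals_per_cycle:
        backlog = max(0, backlog + arrivals - lanes)
        peak = max(peak, backlog)
    return peak

# Two made-up streams with the same total traffic (16 requests in 8 cycles).
bursty = [8, 0, 0, 0, 8, 0, 0, 0]
steady = [2, 2, 2, 2, 2, 2, 2, 2]
jam_bursty = peak_backlog(bursty, lanes=2)
jam_steady = peak_backlog(steady, lanes=2)
```

With the same average request rate, the bursty stream builds a queue while the steady one does not, which is why timing behavior, and not just the total number of requests, matters when sizing bandwidth.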

“Both MEMST and MeToo are useful for chip designers, particularly for designers who work on memory components, such as DRAM, memory controllers and memory buses,” Solihin says.

The new techniques expand on previous work done by Solihin that used performance cloning to look at cache memory.

“Our next step is to take MEMST and MeToo, as well as our work on cache memory, and develop an integrated program that we can commercialize,” says Solihin, author of the forthcoming Fundamentals of Parallel Multicore Architecture, which addresses memory hierarchy design.

The paper on MEMST, “MEMST: Cloning Memory Behavior using Stochastic Traces,” will be presented at the International Symposium on Memory Systems, being held Oct. 5-8 in Washington, D.C. The paper was co-authored by Solihin and Ganesh Balakrishnan, a former NC State Ph.D. student who is with Advanced Micro Devices.

The paper on MeToo, “MeToo: Stochastic Modeling of Memory Traffic Timing Behavior,” will be presented at the International Conference on Parallel Architectures and Compilation Techniques, being held Oct. 18-21 in San Francisco, Calif. Lead author of the paper is Yipeng Wang, a Ph.D. student at NC State. Co-authors are Balakrishnan and Solihin. The work was supported by the National Science Foundation under grant number CNS-0834664.