Monday, September 19, 2011

Real Economic Development: Sandia, Cray Form Institute

Real “development” of any economy comes from doing new things or doing existing things more productively. Doing new things can mean luring a new company to an area. Or it can mean existing companies doing something new.

Back in May I got a news release from Sandia National Laboratories about a new partnership with Cray Inc., the supercomputer manufacturer. It slipped to the bottom of the electronic pile, but in thinking about developing our lagging economy and in my continued annoyance at those whining about “dependence on the government,” I dug it out.

Much of Sandia’s release follows. What the release doesn’t say is that New Mexicans at national laboratories have led large-scale scientific computing since before there were national laboratories. Los Alamos was known as the Manhattan Project in those days. When I started following such things in the 1980s, New Mexico had more supercomputers per capita than any other state.

Here is the release, dated May 27, 2011:

ALBUQUERQUE, N.M. — Sandia National Laboratories and supercomputer manufacturer Cray Inc. are forming an institute focused on data-intensive supercomputers. The Supercomputing Institute for Learning and Knowledge Systems (SILKS), to be located at Sandia in Albuquerque, will take advantage of the strengths of Sandia and Cray by making software and hardware resources available to researchers who focus on a relatively new application of supercomputing.

That task is to make sense of huge collections of data rather than carry out more traditional modeling and simulation of scientific problems. Sandia and Cray signed a cooperative research and development agreement (CRADA) to establish the institute.

“It’s an unusual opportunity,” said Bruce Hendrickson, Sandia senior manager of computational sciences and math. “Cray has an exciting machine [the XMT] and we know how to use it well. This CRADA should help originate new technologies for efficiently analyzing large data sets. New capabilities will be applicable to Sandia’s fundamental science and mission work.”

Shoaib Mufti, director of knowledge management in Cray’s custom engineering group, said, “Sandia is a leading national lab with strong expertise in areas of data analysis. The concept of big data in the HPC [high-performance computing] environment is an important area of focus for Cray, and we are excited about the prospect of new solutions that may result from this collaborative effort with Sandia.”

Rob Leland, Sandia director of computing research, added, “This is a great example of how Sandia engages our industrial partners. The XMT was originally developed at Sandia’s suggestion. It combined an older processor technology Cray had developed with the Red Storm infrastructure we jointly designed, giving birth to a new class of machines. That’s now come full circle. The Institute will leverage this technology to help us in our national security work, benefiting the Labs and the nation as well as our partner.”

Red Storm was the first parallel-processing supercomputer to break the teraflop barrier. Its descendants, built by Cray, are still the world’s most widely purchased supercomputers. The XMT, however, operates differently from conventional parallel-processing systems.

Says Hendrickson, “Think about your desktop: The memory system’s main job is to keep the processor fed. It achieves this through a complex hierarchy of intermediate memory caches that stage data that might be needed soon. The XMT does away with this hierarchy. Though its memory is distant and time-consuming to reach, the processor keeps busy by finding something else to do in the meantime.”

In a desktop machine or an ordinary supercomputer, Hendrickson said, high performance can be achieved only if the memory hierarchy succeeds at getting data to the processor fast enough. For many important applications this isn’t possible, so processors idle most of the time. Said another way, traditional machines try to avoid latency (waiting for data) through the use of complex memory hierarchies. The XMT doesn’t avoid latency; it embraces it. By supporting many fine-grained snippets of a program called “threads,” the processor switches to a new thread whenever a memory access would otherwise force it to wait for data.

“Traditional machines are pretty good for many science applications, but the XMT’s latency tolerance is a superior approach for lots of complex data applications,” Hendrickson says. “For example, following a chain of data links to draw some inference totally trashes memory locality because the data may be anywhere.”

More broadly, he says, the XMT is well suited to programs that work with large data collections that can be represented as graphs. Such computations appear in biology, law enforcement, business intelligence, and various national security applications. Instead of a single answer, results are often best viewed as graphs.

Sandia and other labs have already built software to run graph algorithms, though “the software is still pretty immature,” Hendrickson said. “That’s one reason for the institute. As semantic database technology grows in popularity, these kinds of applications may become the norm.”

Among its other virtues, the XMT saves power because it runs at slower clock speeds. A slow clock is normally a drawback, but it matters little here, because the goal is not raw computation speed but the accurate laying out of data.
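To make Hendrickson’s latency-tolerance point concrete, here is a toy simulation in Python of a processor that stalls on every memory access versus one that can switch among many threads. The cycle counts are made-up numbers chosen for illustration, and the model is a crude sketch; the real XMT switches hardware threads every clock cycle with far more sophistication.

```python
# Toy model: each thread alternates WORK compute cycles with a memory
# access that takes MEM_LATENCY cycles to return. The processor issues
# one instruction per cycle, from any thread whose data has arrived.

MEM_LATENCY = 100   # cycles a memory access takes (hypothetical number)
WORK = 10           # compute cycles between memory accesses (hypothetical)
ACCESSES = 50       # memory accesses each thread performs

def utilization(num_threads):
    remaining = [ACCESSES] * num_threads   # accesses left per thread
    compute_left = [WORK] * num_threads    # compute cycles until next access
    ready_at = [0] * num_threads           # cycle when the thread's data arrives
    cycle = busy = 0
    while any(r > 0 for r in remaining):
        for t in range(num_threads):
            if remaining[t] > 0 and ready_at[t] <= cycle:
                compute_left[t] -= 1
                if compute_left[t] == 0:          # time to touch memory
                    remaining[t] -= 1
                    ready_at[t] = cycle + MEM_LATENCY
                    compute_left[t] = WORK
                busy += 1
                break                             # one instruction per cycle
        cycle += 1                                # idle cycles still count
    return busy / cycle

for n in (1, 4, 16):
    print(f"{n:2d} thread(s): processor busy {utilization(n):.0%} of cycles")
```

With one thread, the processor is busy roughly 9 percent of the time; with sixteen, it approaches 100 percent. That is the whole argument for trading a cache hierarchy for massive multithreading.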
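The graph workloads Hendrickson describes look roughly like the sketch below: chase links hop by hop through an irregular structure, where each hop dereferences an essentially arbitrary address. The tiny “semantic” graph and entity names here are hypothetical, purely for illustration.

```python
# A minimal sketch of link-chasing over graph-structured data.
from collections import deque

# Toy semantic graph: entity -> entities it links to (hypothetical data)
links = {
    "person:alice": ["org:acme", "person:bob"],
    "person:bob":   ["org:globex"],
    "org:acme":     ["city:abq"],
    "org:globex":   ["city:abq"],
    "city:abq":     [],
}

def reachable_within(graph, start, max_hops):
    """Breadth-first search: everything reachable in <= max_hops links."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue
        # Each lookup below can land anywhere in memory on a large graph,
        # which is why caches help little and latency tolerance helps a lot.
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return seen

print(reachable_within(links, "person:alice", 2))
```

On a real data set with billions of edges, each of those neighbor lookups is a pointer chase into distant memory, exactly the access pattern that, per Hendrickson, “totally trashes memory locality.”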
SILKS’ primary objectives, as described in the CRADA, are to accelerate the development of high-performance computing, overcome barriers to implementation, and apply new technologies to enable discovery and innovation in science, engineering, and homeland security.