Tokyo Tech News
December 10, 2015
Tokyo Tech's TSUBAME-KFC/DL (Deep Learning) supercomputer, which employs oil immersion cooling and other advanced power-saving features, took second place in the November 2015 Green500 List(1), the ranking of the world's supercomputers by energy efficiency.
Its predecessor, the TSUBAME-KFC supercomputer, was developed by Tokyo Tech's Global Scientific Information and Computing Center (GSIC) as a prototype for its next-generation TSUBAME 3.0 supercomputer, with funding from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) under the "Ultra Green Technology in Supercomputing Clouds Information Infrastructure" project. The project aims to substantially reduce the power required by both the machine and its cooling system, and is being conducted in collaboration with world-renowned domestic and overseas supercomputing vendors such as NEC and NVIDIA. TSUBAME-KFC became operational in October 2013, and in November 2013 and June 2014 it was recognized, on two consecutive lists, as the world's most energy-efficient supercomputer on the Green500 List, which ranks the world's top supercomputers according to their energy efficiency in computing.
In October 2015, TSUBAME-KFC was upgraded to TSUBAME-KFC/DL by replacing the Tesla K20X GPUs(2) with K80s in order to enhance its computational capacity for deep learning. Although this was peripheral to the main objective, the Green500 metrics were re-measured to investigate whether improvements had been made. On November 18, it was announced that TSUBAME-KFC/DL had ranked No.2 in the world in the November 2015 release of the Green500, having achieved 5,331.79 MFLOPS(3) per watt, nearly 1,000 MFLOPS/W higher than the previous measurement. Being ranked within the top 5 in five consecutive Green500 editions (No.1 in November 2013, No.1 in June 2014, No.3 in November 2014, No.5 in June 2015, No.2 in November 2015) demonstrates the project's technological leadership with regard to the future of supercomputers, whose performance will be predominantly limited by electrical power requirements as we strive towards a low-carbon society.
As a project prototype, TSUBAME-KFC was designed to drastically reduce cooling power through a combination of two techniques: oil immersion cooling, in which the entire system is submerged in non-conductive warm oil that circulates both inside and outside the compute nodes, and ambient cooling, which uses cooling towers cooled by the natural environment instead of power-hungry compressors.
The TSUBAME-KFC/DL system is composed of forty-two compute nodes interconnected by an FDR InfiniBand network. Each compute node is equipped with two Intel Xeon E5-2620 v2 processors (Ivy Bridge EP) and four NVIDIA Tesla K80 GPU boards (two GPUs per board) inside a 1U server to achieve extremely high density, accommodating forty-two nodes, or 336 GPUs, in one oil-immersion rack. The system's overall theoretical peak performance is 318 TeraFlops (493 with GPU auto-boost) in double-precision floating point arithmetic and over 951 TeraFlops (1,476 with auto-boost) in single precision, which with auto-boost amounts to more than a PetaFlop per rack.
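The GPU count quoted above follows directly from the node configuration. As a minimal sketch in Python (the variable names are illustrative and not taken from the project), the arithmetic can be checked as follows. The per-node figure at the end is simply the quoted system peak divided evenly across the nodes, which is an assumption made for illustration only.

```python
# Node configuration as described in the article.
NODES = 42            # compute nodes in the single oil-immersion rack
BOARDS_PER_NODE = 4   # NVIDIA Tesla K80 boards per node
GPUS_PER_BOARD = 2    # each K80 board carries two GPUs

total_boards = NODES * BOARDS_PER_NODE
total_gpus = total_boards * GPUS_PER_BOARD

# Quoted theoretical peak for the whole system
# (double precision, without GPU auto-boost).
PEAK_DP_TFLOPS = 318.0

# Assumption for illustration: spread the system peak evenly over the nodes.
peak_dp_per_node = PEAK_DP_TFLOPS / NODES

print(f"K80 boards: {total_boards}")
print(f"GPUs:       {total_gpus}")
print(f"Average double-precision peak per node: {peak_dp_per_node:.2f} TFLOPS")
```

The GPU total printed by the sketch matches the 336 GPUs stated for the rack.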
The results achieved today are the fruition of a series of research projects conducted by GSIC. Along with the Ultra Green Project, there are a number of past and ongoing supercomputing projects such as "Ultra Low Power High Performance Computing (ULPHPC)" and "EBD: Next Generation Big Data Infrastructure Technologies Towards Yottabyte/Year" as part of the Core Research for Evolutionary Science and Technology program of the Japan Science and Technology Agency. Much of the research has also revolved around GPUs utilized for supercomputers under the NVIDIA GPU Center of Excellence (formerly NVIDIA CUDA Center of Excellence) program. In addition, the results achieved by TSUBAME-KFC would not have been possible without extensive collaboration with industrial partners in both Japan and the US, including NEC, NVIDIA, Green Revolution Cooling, Supermicro, Intel, and Mellanox.
Explanations of Technical Terms
1.Green500 List
A list that ranks the world's top 500 supercomputers in terms of their power efficiency (FLOPS/Watt) in computing. It uses the LINPACK Benchmark, which is also used by the well-known Top500 List, which ranks machines by their absolute achieved performance. The list was started in November 2007 and, like the Top500, is updated in June and November of each year. The list is growing in significance because future supercomputer performance will be predominantly limited by the overall power available to the facility, and as such power efficiency will be the most important metric for a given machine.
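To make the FLOPS/Watt metric concrete, the following is a minimal Python sketch. The function names and the "hypothetical machine" figures are assumptions introduced here for illustration, and the sketch does not reproduce the official Green500 measurement procedure. The 5,331.79 MFLOPS/W value is the one reported earlier in this article.

```python
def mflops_per_watt(linpack_flops: float, power_watts: float) -> float:
    """Energy efficiency: LINPACK FLOPS divided by power, in MFLOPS/W."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return linpack_flops / power_watts / 1e6


def watts_required(linpack_flops: float, efficiency_mflops_per_watt: float) -> float:
    """Power needed to sustain a LINPACK rate at a given efficiency."""
    if efficiency_mflops_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    return linpack_flops / (efficiency_mflops_per_watt * 1e6)


# Hypothetical machine (not a real system): 100 TFLOPS on LINPACK at 25 kW.
hypothetical_efficiency = mflops_per_watt(100e12, 25e3)
print(f"Hypothetical machine: {hypothetical_efficiency:.1f} MFLOPS/W")

# Power that would be needed to sustain 1 PFLOPS at the efficiency
# reported for TSUBAME-KFC/DL (5,331.79 MFLOPS/W).
power = watts_required(1e15, 5331.79)
print(f"1 PFLOPS at 5,331.79 MFLOPS/W: about {power / 1e3:.1f} kW")
```

The second calculation shows why the metric matters for facility planning: at a fixed power budget, sustained performance scales directly with MFLOPS/W.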
2.GPU (Graphics Processing Unit)
A type of many-core processor that utilizes hundreds to thousands of low-power processing cores in a single chip to achieve high performance in applications via massively parallel processing. Although it was initially a special-purpose processor for graphics processing accompanying a standard CPU, over time it has become increasingly generalized, programmable, and powerful, such that it is now utilized in many applications as a general-purpose compute engine. The majority of scientific applications now run in a GPU-CPU environment, exhibiting higher performance and lower power consumption than in a CPU-only environment.
3.Flops (FLOPS), MegaFlops (MFLOPS), GigaFlops (GFLOPS), TeraFlops (TFLOPS), PetaFlops (PFLOPS)
Flops (FLOPS or Floating-point Operations Per Second) is a primary performance index that indicates how many floating point calculations a given computer performs in one second. Mega, Giga, Tera, and Peta are prefix multipliers and indicate 10 to the 6th (million), 9th (billion), 12th (trillion), and 15th (quadrillion) powers respectively.
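The prefix multipliers above can be applied mechanically. As a minimal sketch in Python (the helper name `convert_flops` and the lookup table are hypothetical, introduced here only for illustration), converting between the units used in this article looks like this:

```python
# Power-of-ten exponent for each prefix described above.
PREFIX_EXPONENT = {"": 0, "M": 6, "G": 9, "T": 12, "P": 15}


def convert_flops(value: float, from_prefix: str, to_prefix: str) -> float:
    """Convert a FLOPS figure between prefixes, e.g. MFLOPS to GFLOPS."""
    shift = PREFIX_EXPONENT[from_prefix] - PREFIX_EXPONENT[to_prefix]
    return value * 10.0 ** shift


# The reported efficiency of 5,331.79 MFLOPS/W, expressed in GFLOPS/W.
efficiency_gflops = convert_flops(5331.79, "M", "G")
print(f"{efficiency_gflops:.5f} GFLOPS/W")

# The single-precision peak with auto-boost, 1,476 TeraFlops, in PetaFlops.
peak_pflops = convert_flops(1476, "T", "P")
print(f"{peak_pflops:.3f} PFLOPS")
```

Expressed this way, the single-precision peak with auto-boost is just under one and a half PetaFlops.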