2002-04: Super-Scalable Algorithms for Next-Generation High-Performance Cellular Architectures
This research in cellular architectures was part of a Cooperative Research and Development Agreement (CRADA) between IBM and Oak Ridge National Laboratory (ORNL) to develop algorithms for the next generation of supercomputers. It focused on algorithms that can use a 100,000-processor machine efficiently and can adapt to, or simply survive, faults. Systems of this size, such as the IBM Blue Gene/L, must address existing problems in algorithm scalability and fault tolerance, which grow more severe as the processor count increases. As a first step, the team at ORNL developed a simulator in Java, since a 100,000-processor machine was not available. A prototype of the Java Cellular Architecture Simulator (JCAS) was presented at the ACM/IEEE International Conference on Supercomputing (SC) in 2001 and was able to emulate up to 5000 virtual processors on a single real processor solving Laplace’s equation. Another demonstration at the follow-up conference in 2002 was capable of emulating up to 500,000 virtual processors on a cluster with 5 real processors solving Laplace’s equation and the global maximum problem.
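As a rough illustration of the two demonstration problems, the following sketch emulates a line of virtual processors inside a single real process. Each virtual processor holds one value and communicates only with its direct neighbors. This is not JCAS code: JCAS was written in Java, while this sketch uses Python for brevity, and the function names (`jacobi_step`, `solve_laplace_1d`, `global_max`), the one-dimensional topology, and the synchronous rounds are all assumptions made for the example.

```python
def jacobi_step(values):
    """One synchronous round: each interior virtual processor replaces
    its value with the average of its two neighbors (Jacobi update).
    The two end processors hold fixed boundary values."""
    new = values[:]
    for i in range(1, len(values) - 1):
        new[i] = 0.5 * (values[i - 1] + values[i + 1])
    return new


def solve_laplace_1d(n, left, right, rounds):
    """Emulate n virtual processors on a line solving the 1D Laplace
    equation with boundary values `left` and `right` (hypothetical setup)."""
    values = [0.0] * n
    values[0], values[-1] = left, right
    for _ in range(rounds):
        values = jacobi_step(values)
    return values


def global_max(values):
    """Each virtual processor repeatedly keeps the maximum of its own
    value and its neighbors' values. On a line of n processors,
    n - 1 rounds are enough for every processor to hold the global maximum."""
    current = values[:]
    n = len(current)
    for _ in range(n - 1):
        current = [max(current[max(i - 1, 0):i + 2]) for i in range(n)]
    return current


if __name__ == "__main__":
    print(solve_laplace_1d(5, 0.0, 4.0, 200))
    print(global_max([3.0, 9.0, 1.0, 4.0]))
```

Both routines rely only on neighbor-to-neighbor exchange, which is the property that lets such algorithms scale with processor count. Fault handling is not modeled here.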
Participating Institutions
Funding Sources
- Laboratory Directed Research and Development, Oak Ridge National Laboratory
Important Publications
- Christian Engelmann and George A. (Al) Geist. Super-Scalable Algorithms for Computing on 100,000 Processors. In Lecture Notes in Computer Science: Proceedings of the 5th International Conference on Computational Science (ICCS) 2005, Part I, pages 313-320, Atlanta, GA, USA, May 22-25, 2005. Springer Verlag, Berlin, Germany. ISBN 978-3-540-26032-5. ISSN 0302-9743.