CFD Review  
Serving the CFD Community with News, Articles, and Discussion

Tech Report: Overflow Solver Performance on 512 CPU Origin
Posted Tue May 08, 2001 @04:32PM
Researchers at NASA Ames have developed a new, highly scalable parallelism technique for use with the Overflow CFD code. The new method, termed Multi-Level Parallelism (MLP), was designed to be simpler and more scalable than MPI, specifically for the new shared memory systems with large CPU counts.

Building on their success with Overflow, a Chimera grid based CFD solver, on a 256 CPU SGI Origin 2000, the Ames team applied the code to a new 512 processor Origin 2000.

The machine

The 512 CPU Origin 2000 used by the team is the world's first 512 CPU shared memory symmetric multi-processing system. The machine was specially built for NASA Ames by SGI and was demonstrated to scale well to 512 processes once a few OS performance problems were sorted out.

The problems

The main difficulty was doubling the parallelism on the same grid that had been used previously on the 256 CPU machine: the researchers were interested in reducing the solution time for the given problem, not in solving a larger problem. This exposed problems with scaling the inner loops in the code and with remote memory access latency, neither of which had affected the results on the 256 processor machine.

The results

By optimizing the Overflow code further, the Ames team was able to get the results to scale linearly up to 512 processors on the full aircraft simulation (35 million node grid). A fully converged solution of the full aircraft configuration is now available in less than 2 hours of elapsed time. Another way of looking at this performance is that it would take 117 Cray C90 supercomputers to match the performance of the code on the 512 CPU Origin 2000.
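Linear scaling on a fixed grid means that doubling the CPU count halves both the work per CPU and the elapsed time. The sketch below works through that arithmetic for the 35 million node grid. It assumes a perfectly even split of grid points across CPUs, which a real Chimera grid decomposition will not achieve exactly. The function names are illustrative, and the timings passed to the efficiency function are hypothetical placeholders, not measurements from the report.

```python
def points_per_cpu(total_points: int, cpus: int) -> float:
    """Average grid points per CPU, assuming a perfectly even split."""
    return total_points / cpus


def strong_scaling_efficiency(t_base: float, cpus_base: int,
                              t_new: float, cpus_new: int) -> float:
    """Parallel efficiency of a fixed-size problem relative to a baseline run.

    A value of 1.0 means linear scaling: the speedup in elapsed time
    matches the increase in CPU count.
    """
    speedup = t_base / t_new
    ideal_speedup = cpus_new / cpus_base
    return speedup / ideal_speedup


GRID_POINTS = 35_000_000  # full aircraft grid size cited in the article

for cpus in (256, 512):
    avg = points_per_cpu(GRID_POINTS, cpus)
    print(f"{cpus} CPUs: about {avg:,.0f} grid points per CPU")

# Hypothetical elapsed times (hours), for illustration only.
eff = strong_scaling_efficiency(t_base=4.0, cpus_base=256,
                                t_new=2.0, cpus_new=512)
print(f"efficiency going from 256 to 512 CPUs: {eff:.2f}")
```

With half as many grid points per CPU at 512 processors, each CPU has less computation to hide communication and memory latency behind, which is consistent with the inner loop and remote memory issues appearing only at the larger CPU count.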

Future Work

The NASA Ames team is happy with the performance they were able to achieve with the Overflow code on the 512 CPU system. However, they believe they may be able to speed up the simulation by a factor of 2 with further optimization of the code.


This is an overview of NAS Technical Report 00-005.

Related Links
  • NAS Technical Report 00-005
  • NASA
  • Overflow
  • SGI
  • Also by nwyman

    All content except comments ©2018, Viable Computing.