   Overview

What is flowCart?
What kind of discretization does it use?
How is it integrated into Cart3D?
What's clic?
Parallelization and Domain Decomposition
Multigrid
Further Documentation:

What is flowCart?
flowCart is a scalable, multilevel solver for the Euler equations governing the inviscid flow of a compressible fluid. It is intended to (ultimately) replace the Tiger code, which most existing Cart3D users are familiar with. Meshes from cubes are treated as unstructured collections of Cartesian cells, and the solver takes advantage of the fact that cells are Cartesian wherever possible to reduce the operation count. Both the parallelization and multigrid are completely transparent to the user and are turned on by simple command line arguments to encourage their use.
What kind of discretization does it use?
Spatial Discretization:
flowCart uses cell-centered, finite-volume, upwind differencing. Several flux functions are provided, and the solver is extensible, so you can add your favorite to the growing list of available flux functions. The linear reconstruction scheme makes it formally second-order accurate. In van Leer's notation, this is a "kappa = 0" scheme, implying that the stencil for the gradient reconstruction is central difference. In the implementation, this gradient is computed using a least-squares reconstruction, and the least-squares problem is solved via the normal equations. At cut-cells, and at cells which neighbor refinement boundaries, the reconstruction is always performed using true cell and face centroids. This results in substantially improved accuracy at wall boundaries and through refinement boundaries. Numerical accuracy assessments (AIAA 2000-0808) demonstrate an order of accuracy of 1.88, and validation cases have demonstrated that discrete solutions from this solver are comparable with those of body-fitted structured and unstructured solvers for comparable numbers of cells. If you're an existing Cart3D user, you can expect noticeably lower dissipation at wall boundaries, and substantially improved shock resolution and propagation.
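To make the least-squares step concrete, here is a minimal sketch (in Python, 2-D for brevity) of a gradient computed from neighbor centroids by solving the normal equations, followed by linear reconstruction to a face centroid. The function names, the unweighted fit and the choice of neighbors are assumptions made for illustration; this is not flowCart's actual code.

```python
def lsq_gradient(center, u_center, neighbors):
    """Least-squares gradient of u at `center` from neighbor centroids (2-D).

    neighbors: iterable of ((x, y), u) pairs.
    Solves the 2x2 normal equations (A^T A) g = A^T b, where each row of A
    is a centroid offset (dx, dy) and b holds the differences u_j - u_center.
    """
    xc, yc = center
    sxx = sxy = syy = bx = by = 0.0
    for (x, y), u in neighbors:
        dx, dy, du = x - xc, y - yc, u - u_center
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
        bx += dx * du
        by += dy * du
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-14:
        raise ValueError("neighbor centroids are collinear; gradient is undetermined")
    # Invert the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]] directly.
    gx = (syy * bx - sxy * by) / det
    gy = (sxx * by - sxy * bx) / det
    return gx, gy


def reconstruct(u_center, grad, center, point):
    """Linear reconstruction of u at `point` (e.g. a face centroid)."""
    return (u_center
            + grad[0] * (point[0] - center[0])
            + grad[1] * (point[1] - center[1]))
```

Because the fit uses the true centroid locations, a linear field is recovered exactly even when the neighbors are irregularly placed, which is the situation at cut-cells and refinement boundaries.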
Temporal Discretization:
Advance to steady state is performed by an unstructured, nested multigrid procedure. The solver is not time-accurate, and no time-accurate mode is currently provided. A user-selectable Runge-Kutta scheme provides the inner smoothing that drives the multigrid. A collection of Runge-Kutta schemes is provided for you to choose from, and you can substitute your favorite in addition to those provided. Multigrid performance is competitive with results found in the literature, and the multigrid is easy to use; again, see AIAA 2000-0808 for details and examples. Of course, if you just want to drive your solutions with plain old Runge-Kutta, you're free to do so. In all cases, flowCart drives the cut-cells somewhat more aggressively than Tiger does, since it uses a monotonicity-preserving timestep in these cells rather than a simple area scaling. As a result, even single grid (no multigrid) runs converge somewhat faster.
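For readers who have not met multistage Runge-Kutta smoothers, the sketch below shows the general shape of one step for du/dt = -R(u). The stage coefficients are a commonly used five-stage set chosen purely as an assumption for illustration; they are not taken from flowCart, and the function name is hypothetical.

```python
def rk_smooth(u, residual, dt, alphas=(1 / 4, 1 / 6, 3 / 8, 1 / 2, 1.0)):
    """One multistage Runge-Kutta step for du/dt = -residual(u).

    Every stage restarts from the initial state u0:
        u_k = u0 - alpha_k * dt * residual(u_{k-1})
    The default alphas are an illustrative assumption, not flowCart's values.
    """
    u0 = list(u)
    uk = list(u)
    for alpha in alphas:
        r = residual(uk)
        uk = [u0_i - alpha * dt * r_i for u0_i, r_i in zip(u0, r)]
    return uk
```

Swapping in a different scheme only means passing a different `alphas` tuple, which is roughly what "substitute your favorite" amounts to for schemes of this form.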
How is it integrated into Cart3D?
flowCart is tightly integrated into Cart3D. Pre/Post processing operations have been substantially reduced. Since it was written specifically as a module for Cart3D, file translation steps have been completely eliminated. The cubes mesh generator puts out Mesh.c3d files, and these can be used directly by flowCart, without translation. flowCart can be asked to extract Cp's and other flow quantities both on the body's surface and on cutting planes through the domain directly (-T and -clic options). It also provides both residual and lift/drag information for convergence monitoring (the -his option). After a run, you can extract a variety of information using clic. Loads and moment information are all conservatively transferred back to the input surface triangulation, so that you can postprocess on the surface without having to load the entire discrete solution.
What's clic?
clic is a component-based force and moment module developed as a post-processor and data extractor for Cart3D. It's an extremely flexible and powerful package. Using clic, you can extract Cp cuts on any component or group of components in your configuration, compute LDM (lift, drag, moment) for components, component groups or configurations, and extract the usual bevy of point-moments, line-moments ("hinge moments"), etc. If you want to see some of what it was designed to do, take a look at the original ISO software project plan. clic can be run as a postprocessor, or called directly through an API. The clic home page will get you started with the package.
Parallelization and Domain Decomposition
flowCart uses a domain decomposition approach to parallelization. As a result, scalability on large numbers of processors is quite good. Speedups in excess of 56 on 64 processors are typical. In domain decomposition approaches, the computation proceeds on subdomains that are farmed off to the machine's processors. After a certain amount of work is done, each subdomain passes information to its neighboring subdomain in an explicitly coded communication step.
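The work/communicate pattern described above can be illustrated with a toy 1-D example: the domain is split into subdomains with one ghost cell on each side, an explicit exchange step fills the ghosts from the neighboring subdomains, and each subdomain then does its local work independently. All names here are hypothetical and the three-point average merely stands in for the real flow solver's update. (For scale, a speedup of 56 on 64 processors is a parallel efficiency of 56/64 = 87.5%.)

```python
def split_with_halos(values, n_parts):
    """Split a 1-D list into contiguous subdomains, each padded with one ghost cell per side."""
    n = len(values)
    bounds = [n * p // n_parts for p in range(n_parts + 1)]
    return [[None] + values[bounds[p]:bounds[p + 1]] + [None] for p in range(n_parts)]


def exchange_halos(subs, left_bc, right_bc):
    """Communication step: copy each neighbor's edge cell into the local ghost cell."""
    for p, sub in enumerate(subs):
        sub[0] = subs[p - 1][-2] if p > 0 else left_bc
        sub[-1] = subs[p + 1][1] if p < len(subs) - 1 else right_bc


def smooth_local(sub):
    """Work step on one subdomain: three-point average over the interior cells only."""
    interior = [(sub[i - 1] + sub[i] + sub[i + 1]) / 3.0 for i in range(1, len(sub) - 1)]
    return [sub[0]] + interior + [sub[-1]]


def parallel_step(values, n_parts, left_bc=0.0, right_bc=0.0):
    """One exchange-then-work step; returns the updated global array."""
    subs = split_with_halos(values, n_parts)
    exchange_halos(subs, left_bc, right_bc)
    subs = [smooth_local(s) for s in subs]  # each subdomain could run on its own processor
    return [v for s in subs for v in s[1:-1]]
```

The result is the same however many subdomains are used, which is the property that lets the decomposition stay invisible to the user.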

OpenMP

In its current form, flowCart communicates using the OpenMP standard. Thus, although it uses explicit message passing, it is formally a shared memory model. For machines with physically distributed memory, OpenMP requires a distributed shared memory layer (DVSM) to be present on the machine so that physically distant memory is addressable by all processors. This paradigm works well on a variety of existing machines, and more machines are being developed with this programming model in mind. We currently have an MPI port in progress, but this is not the highest priority. In addition, we are working with KAI to ensure compatibility with the DVSM that they are developing for Beowulf-type machines. We're not tied to KAI, nor are we promoting their products. If you are developing a DVSM and want an ANSI standard C application to test it on, contact us.
Domain Decomposition
Our domain decomposition approach was designed to be transparent to the user. It relies on a space-filling-curve based reordering of the cubes mesh, which is then partitioned on the fly when you start up the solver. All you need to know is that if you want to run in parallel you just have to (1) reorder the mesh with the reorder utility, and (2) set your MP_SET_NUMTHREADS environment variable to the desired number of processors. flowCart does everything else. Here's how you'd do that for a 16 CPU case (in csh):
1.% reorder  # reorder the mesh (using default file names)
2.% setenv MP_SET_NUMTHREADS 16 # choose number of processors
3.% flowCart # start the run
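If you're curious why reordering first makes on-the-fly partitioning so cheap, the sketch below shows the idea: once cells are sorted along a space-filling curve, a partition is just a contiguous slice of the list. The text does not say which curve the reorder utility uses; the Morton (Z-order) curve here is an assumption picked for brevity, and all function names are hypothetical.

```python
def morton_key(i, j, k, bits=10):
    """Interleave the bits of integer cell coordinates (i, j, k) into a Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((i >> b) & 1) << (3 * b)
        key |= ((j >> b) & 1) << (3 * b + 1)
        key |= ((k >> b) & 1) << (3 * b + 2)
    return key


def reorder_cells(cells):
    """Sort cells along the curve (a stand-in for what a reordering utility does)."""
    return sorted(cells, key=lambda c: morton_key(*c))


def partition(ordered_cells, n_parts):
    """Cut the ordered list into n_parts contiguous, nearly equal segments."""
    n = len(ordered_cells)
    bounds = [n * p // n_parts for p in range(n_parts + 1)]
    return [ordered_cells[bounds[p]:bounds[p + 1]] for p in range(n_parts)]
```

Because nearby positions on the curve are nearby in space, each slice is a compact block of cells, and the number of partitions can be chosen at startup without touching the mesh file again.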
Multigrid
flowCart uses Full Approximation Storage (FAS) multigrid for convergence acceleration. You invoke it via the -mg %d command line option (e.g. for 4 levels of multigrid you'd type "% flowCart -mg 4"). By default it's set up to use a full multigrid startup procedure. This means that computation starts on the coarsest mesh; once that converges a bit, the next finer mesh gets into the act and it does two-level multigrid. After a while the evolving solution gets transferred to the next finer mesh, and it does three-level multigrid. This continues until we've reached the finest mesh, and then the entire mesh hierarchy is used to converge the solution on the finest mesh. When you run in parallel, each processor gets its own hierarchy of fine to coarse meshes to work with; here is what it would look like for 2 processors:
[figure: multigrid mesh hierarchy on 2 processors]
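The full multigrid startup sequence described above can be written down as a simple schedule of which mesh levels take part in each phase. This is only a sketch of the ordering: the level numbering (0 = finest) and the function name are assumptions, and it says nothing about how long flowCart spends in each phase.

```python
def fmg_startup_phases(n_levels):
    """Return, phase by phase, the mesh levels taking part in multigrid.

    Levels are numbered 0 (finest) to n_levels - 1 (coarsest); this numbering
    is an assumption made for the sketch.
    """
    coarsest = n_levels - 1
    return [list(range(start, n_levels)) for start in range(coarsest, -1, -1)]
```

For "-mg 4" this gives four phases: the coarsest mesh alone, then two, three and finally all four levels.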

Multigrid obviously needs a sequence of coarse meshes to get going. These meshes are derived from the fine grid that you make with cubes. We use a special coarsening procedure which attempts to coarsen with an 8:1 coarsening ratio. Any time a parent cell finds all its children at the same level of refinement, that parent cell gets inserted into the coarse mesh. Of course there are situations where one or more children have further subdivision, and coarsening gets "suspended" in that cell. As a result, coarsening ratios with this algorithm approach (but do not exceed) 8:1. In practice, finer meshes achieve coarsening ratios above 7:1, which is sufficient to ensure good smoothing. W-cycle multigrid requires at least 4:1 to maintain its (theoretical) constant time bound. Meshes at any level in the hierarchy fully cover the computational domain, and may contain cells at many levels of refinement.
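The coarsening rule can be sketched on a toy octree mesh stored as a set of leaf cells. The data layout, the function name and the single-pass structure are assumptions for illustration; the real mgPrep procedure is described in AIAA 2000-0808.

```python
from collections import defaultdict


def coarsen(leaves):
    """One coarsening pass over an octree mesh given as a set of leaf cells.

    Each leaf is (level, i, j, k), with (i, j, k) integer indices at that level.
    A parent replaces its children only when all 8 children are leaves, i.e.
    they all sit at the same refinement level; otherwise those cells are kept.
    """
    by_parent = defaultdict(list)
    for level, i, j, k in leaves:
        if level > 0:
            by_parent[(level - 1, i // 2, j // 2, k // 2)].append((level, i, j, k))
    coarse = {c for c in leaves if c[0] == 0}  # root-level cells cannot coarsen
    for parent, children in by_parent.items():
        if len(children) == 8:
            coarse.add(parent)
        else:
            coarse.update(children)  # coarsening is suspended in this parent
    return coarse
```

On a uniformly refined block the ratio is exactly 8:1. Subdividing a single cell further suspends coarsening in its parent, so that parent's remaining children survive into the coarse mesh and the overall ratio drops below 8:1.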

The mgPrep utility takes a reordered fine mesh (output from reorder) and creates a sequence of coarse meshes directly from this mesh. Since the fine mesh has already been reordered, the coarse meshes will automatically be ready for domain decomposition, should you wish to run the final multigrid run in parallel. When a multigrid mesh is run in parallel, all meshes in the hierarchy get automatically partitioned using the exact same space-filling curves, so coarse meshes in a given partition overlap maximally with the finer grids that they support.

Building coarse meshes with mgPrep is extremely easy. mgPrep takes in a reordered mesh (usually called Mesh.R.c3d) and outputs the input mesh together with the series of coarse meshes generated from it. The hierarchy of meshes is stored in a single file, usually called Mesh.mg.c3d. Here's how (starting from Mesh.c3d output from cubes):
1.% reorder  # reorder the mesh (using default file names)
2.% mgPrep -n 6 # generate coarser meshes (orig + 5 coarser)
3.% setenv MP_SET_NUMTHREADS 16 # choose number of processors
4.% flowCart -mg 6 # run on 16 CPUs with 6 level multigrid

Note: This example has some options that you could play with. For example, step 2 creates 5 levels of coarser meshes (with the original mesh this makes 6 meshes total). By default it creates Mesh.mg.c3d, and flowCart's input control file now needs to point to this file as the input mesh. In step 4, the example uses all 6 meshes in the hierarchy, but you don't need to; you could use anywhere from 1 to 6 levels of mesh, so running on 3 meshes (for example) with "% flowCart -mg 3" would be an equally valid command line. Even running single mesh with "% flowCart" will work fine (-mg 1 is the default).
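If you script your runs, the same sequence can be assembled programmatically. The helper below is a hypothetical sketch: it uses only the options that appear above ("mgPrep -n", "flowCart -mg" and the MP_SET_NUMTHREADS environment variable), it checks that the requested number of multigrid levels does not exceed the number of meshes built, and it does not edit flowCart's input control file for you.

```python
def multigrid_run_plan(n_meshes, n_threads, mg_levels=None):
    """Return (commands, env) for the reorder -> mgPrep -> flowCart sequence.

    n_meshes  : total meshes in the hierarchy (value passed to `mgPrep -n`)
    n_threads : value for the MP_SET_NUMTHREADS environment variable
    mg_levels : levels for `flowCart -mg`; defaults to the whole hierarchy
    """
    if mg_levels is None:
        mg_levels = n_meshes
    if not 1 <= mg_levels <= n_meshes:
        raise ValueError("flowCart can use between 1 and n_meshes levels")
    commands = [
        ["reorder"],
        ["mgPrep", "-n", str(n_meshes)],
        ["flowCart", "-mg", str(mg_levels)],
    ]
    env = {"MP_SET_NUMTHREADS": str(n_threads)}
    return commands, env
```

Each command list could then be handed to subprocess.run with the extra environment merged into os.environ, in the order given.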
Further Documentation:
Technical information on all topics covered in this overview is given in "A parallel multilevel method for adaptively refined Cartesian grids with embedded boundaries," AIAA Paper 2000-0808, Jan 2000.


last update 21 Jun 2000, M. Aftosmis