Next: Summary of this thesis Up: Introduction and Motivation Previous: History of exponential divergence

The kinds of errors made in N-body simulations

Input and output Errors

I first distinguish between two general types of errors.

Input errors are errors that can be controlled directly while devising and implementing models of N-body systems. Input errors in N-body systems may also be called approximations, and may be divided into modelling approximations and implementation approximations. Modelling approximations simplify the system being simulated, and include:

finite-N sampling, also called discreteness noise, because the N used is generally several orders of magnitude less than the N of the real system being modelled. The general consensus seems to be that this is the limiting source of error in current large N-body simulations. For example, Hernquist, Hut and Makino [15], Barnes and Hut [5], and Sellwood [31] explicitly say this; Singh, Hennessy and Gupta [32] have a ``master error equation'' in which clearly discreteness noise is dominant; and Pfenniger and Friedl [27], Jernigan and Porter [17], and Barnes and Hut [3] all imply that using the largest possible N is a desirable characteristic.
force softening, i.e., replacing with in the denominator of the gravitational force computation for some small constant , usually chosen to approximate the average inter-particle separation. This is done because it allows a smaller N to approximate a larger N, and also to eliminate the singularity at r=0 [6].
reducing the dimensionality of the problem from 3 to 2, if applicable. This is not done as often as in the past.

Implementation approximations measure how well the implementation simulates the model, and include such things as

numerical integration truncation error.
machine roundoff error.
Using approximate force computation algorithms like the Barnes-Hut tree code [3] or the Fast Multipole Method [12]. Hernquist, Hut, and Makino [15] try to show that the effect of this error is negligible, by showing that the energy of each individual particle is conserved to a high degree regardless of whether the Barnes-Hut or the direct algorithm is used. However, they used the leapfrog integrator with a constant timestep, which guarantees that the energy error for the entire system is bounded. I'm not sure if this integrator guarantees that the energy of individual particles is conserved; if it does, the results may not be as conclusive as they appear. Furthermore, as discussed below, energy conservation may not be a stringent enough error criterion.

Output error measures the difference between the output and a real system, and results from the cumulative effect of all the input errors. A simulation with small output errors is said to have high accuracy. Given that N-body systems are chaotic, and that their simulation introduces the above input errors, we must now ask precisely what we mean by the ``accuracy'' of a simulation. Amazing as it may seem, there is currently no clear definition of simulation accuracy [29]. Obviously, attempting to follow the individual paths of all N particles is infeasible; Goodman, Heggie and Hut [10] show that this would require O(N) digits of precision. On the other hand, most astronomical publications quote energy conservation as their only measure of output error, even though there are infinitely many solutions with equal energy but vastly different phase-space trajectories. Some of these simulations even use an energy-conserving integrator like leapfrog, in which case quoting energy conservation is of dubious merit, because the integrator conserves energy no matter how big other errors become!

Macroscopic statistics vs. microscopic details

In large N-body simulations, one is not usually concerned with the precise evolution of individual particles, but instead with the evolution of the distribution of particles. Most practitioners know that the exponential magnification of errors means they cannot possibly trust the microscopic details, but they believe that the statistical results are independent of the microscopic errors, although little work has been done to test this belief [10]. Barnes and Hut [5] claim that astrophysical N-body simulations require only ``modest'' accuracy levels, but also concede that quoting energy conservation isn't enough, and that more stringent tests are needed.

An example of conservation of macroscopic properties is given by Kandrup and Smith [19]. They show that a histogram of the e-folding times of individual particles stays constant within statistical uncertainties, even though the phase-space distribution of those particles is vastly different for different initial conditions.

However, until more stringent tests are applied to N-body simulations, we'll never know, for example, if our simulations of spiral galaxies produce spirals for the same reason that real spiral galaxies do.

Suggestions for measures of output error

We now must distinguish between the desired properties of simulations, and deviations that simulations make from those properties, i.e., the output errors they make.

0.

We could demand that a simulation precisely follow the exact evolution of the true solution.

This would be possible in principle for chaotic maps, but not for ODEs. For maps, an arbitrary precision arithmetic package could be employed, but this is infeasible because it requires keeping all the digits of every operation, and each multiplication operation typically doubles the number of digits.

For problems in which the map is really an ODE integration, like N-body systems, this is not possible even in principle, because no numerical integration routine is known that can give exact solutions to arbitrary nonlinear ODEs.

1.

Next, we could demand that a simulation have the property that the phase-space distance between the computed solution and the true solution with the same initial conditions be bounded by a small constant.

If this property could be realized, all our troubles would be over, for it is a sufficient condition under any reasonable definition of simulation validity. Unfortunately, ODE integrations have truncation errors, so the magnitude of these errors will be magnified exponentially on a short time scale. Goodman, Heggie and Hut [10] offer some solace in that the exact evolution could be closely followed for a long time if O(N) digits are kept, but this is currently infeasible. There seems little hope of obtaining valid simulations by this criterion.

2.

At the least stringent extreme we can demand that such physically conserved quantities as energy, and linear and angular momentum, are conserved to some small error.

Certainly global conservation of these quantities is necessary for any reasonable definition of simulation validity, but it is unclear what other properties are implied by such conservations.

There are several possibilities between these extremes.

3.

A little less stringent than (1), we can demand that a simulation follow, with some small error, the exact phase-space evolution of a set of initial conditions that is close to the actual set of initial conditions used in the simulation.

Since, with large simulations, we are only interested in the evolution of the distribution of particles, and since the initial conditions are usually generated from some random distribution anyway, this is almost as good as option (1). The study of shadowing relates precisely to this property.

What if shadowing turns out also to be an unattainable goal? We will need to demand less stringent properties of simulations, such as:

4.

Even if a shadow solution cannot be found for a particular simulation, it is still possible that the global distribution of particles is close to the exact global distribution of a true solution with similar initial conditions. In other words, a low-resolution, ``smoothed'' animation of the real system and the computed solution would be indistinguishable to the unaided eye.

This would be almost as good as shadowing, at least for collisionless systems, but it is unclear how one would go about proving the existence of this property for a simulation.

5.

If the global distribution cannot be followed closely, then perhaps at least some statistical properties of the distribution could be reproduced by a simulation. A reasonable statistical property may be something like the time-evolution of the Fourier spectrum of the density distribution in spherical or cylindrical co-ordinates, so that the wavelengths of the distribution (i.e., the relative abundance of ``clumpiness'' of various sizes) are similar to those of the real system.

This is the least stringent property, that I can think of, that a large N-body simulation would need to be considered valid; i.e., it is the weakest necessary condition I can think of. Note it is more stringent than energy conservation. However, it is again unclear how one would go about proving this property exists for a simulation.

It should be clear that the ordering in stringency of the above properties, from most to least stringent, allows the following logical deductions:

Next: Summary of this thesis Up: Introduction and Motivation Previous: History of exponential divergence

Wayne Hayes
Sun Dec 29 23:43:59 EST 1996

Access count (updated once a day) since 1 Jan 1997: 11732