The refinement procedure of GHYS and QT can be likened to Newton's method for finding a zero of a function. Indeed, it may be possible to formulate it exactly as a Newton's method with boundary conditions, although this has not yet been shown. For pedagogical purposes, I will assume in the following paragraph that the GHYS refinement procedure can be formulated as a Newton's method.
The basic idea is as follows. Let $\{p_i\}_{i=0}^{S}$ be a trajectory with $S$ steps that has noise $\delta = K\varepsilon$, where $\varepsilon$ is the machine precision, and $K$ is some constant significantly greater than 1 that allows room for improvement towards the machine precision. Let $e_{i+1}$ be the 1-step error at step $i+1$, where $|e_{i+1}| \le \delta$ for all $i$. The set of 1-step errors is represented by $\{e_i\}$, and is estimated by a numerical integration technique that has higher accuracy than the one used to compute $\{p_i\}$. This describes a function, call it $g$, taking as input the entire orbit and whose output is the set of 1-step errors, i.e., $g(\{p_i\}) = \{e_i\}$. Since the 1-step errors are assumed to be small, $g(\{p_i\})$ is small. That is, $\{p_i\}$ is close to a zero of $g$, if one exists. A zero of the function would represent an orbit with zero 1-step error, i.e., a true orbit. This is an ideal situation in which to run Newton's method. If Newton's method converges, then a true orbit has been found.
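To make the analogy concrete, here is a minimal sketch of one Newton step on the map $g$. The logistic map, the noise level, and the minimum-norm treatment of the underdetermined linear system are all illustrative assumptions, not the GHYS procedure itself:

```python
import numpy as np

def f(x):           # stand-in "true" map (logistic map)
    return 3.8 * x * (1.0 - x)

def fprime(x):
    return 3.8 * (1.0 - 2.0 * x)

def g(orbit):       # g maps an orbit to its 1-step errors e_{i+1} = p_{i+1} - f(p_i)
    return orbit[1:] - f(orbit[:-1])

rng = np.random.default_rng(0)
S = 32
p = np.empty(S + 1)
p[0] = 0.4
for i in range(S):  # noisy orbit: exact map plus injected noise
    p[i + 1] = f(p[i]) + rng.uniform(-1e-8, 1e-8)

# Jacobian of g: S equations in S+1 unknowns (the corrections to each point).
J = np.zeros((S, S + 1))
for i in range(S):
    J[i, i] = -fprime(p[i])
    J[i, i + 1] = 1.0

# Newton step: solve J c = -g(p) for the minimum-norm correction c.
c = np.linalg.lstsq(J, -g(p), rcond=None)[0]
q = p + c           # refined orbit with smaller 1-step errors

print(np.abs(g(p)).max(), np.abs(g(q)).max())
```

Because the system has one more unknown than equations, the Newton step is not unique, echoing the fact that shadows, when they exist, are not unique.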
There are exactly two criteria for a trajectory to be called a numerical shadow of $\{p_i\}$: it must remain close to $\{p_i\}$, and its 1-step errors must be significantly smaller than those of $\{p_i\}$.
This section presents the GHYS refinement procedure for a two-dimensional Hamiltonian system. Assume we have a noisy orbit $\{p_i\}$, and we want to find a less noisy orbit $\{q_i\}$. The 1-step errors are estimated by

    $e_{i+1} = p_{i+1} - f(p_i)$,

where $f$ is an integrator with higher accuracy than the one used to compute $\{p_i\}$. The refined orbit will be constructed by setting

    $q_i = p_i + c_i$,

where $c_i$ is a correction term for step $i$. Now, demanding that the refined orbit have zero 1-step error means

    $p_{i+1} + c_{i+1} = f(p_i + c_i)$.

Assuming the correction terms to be small, $f(p_i + c_i)$ can be expanded in a Taylor series about $p_i$:

    $f(p_i + c_i) \approx f(p_i) + L_i c_i$,

where $L_i$ is the linearized map. For a discrete chaotic map, $L_i$ is just the Jacobian of the map at step $i$. For a system of ODEs, $L_i$ is the Jacobian of the integral of the ODE from step $i$ to step $i+1$. The final equation for the corrections is

    $c_{i+1} = L_i c_i - e_{i+1}$.    (1.1)
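As an illustration of the ODE case, the following sketch computes a resolvent by integrating the variational equations $d\Phi/dt = Df(x(t))\,\Phi$ alongside the ODE with a hand-rolled RK4 integrator, and checks it against a finite-difference Jacobian of the flow map. The pendulum system, the step sizes, and all names are illustrative assumptions:

```python
import numpy as np

def f(x):                     # pendulum: x = (q, p), dx/dt = (p, -sin q)
    return np.array([x[1], -np.sin(x[0])])

def Df(x):                    # Jacobian of the vector field
    return np.array([[0.0, 1.0], [-np.cos(x[0]), 0.0]])

def rhs(z):
    # z packs the state x (2 entries) and the variational matrix Phi (2x2, flattened)
    x, Phi = z[:2], z[2:].reshape(2, 2)
    return np.concatenate([f(x), (Df(x) @ Phi).ravel()])

def rk4(z, h, n):
    for _ in range(n):
        k1 = rhs(z); k2 = rhs(z + 0.5*h*k1)
        k3 = rhs(z + 0.5*h*k2); k4 = rhs(z + h*k3)
        z = z + (h/6.0)*(k1 + 2*k2 + 2*k3 + k4)
    return z

x0 = np.array([1.0, 0.5])
z = rk4(np.concatenate([x0, np.eye(2).ravel()]), 1e-3, 1000)
L = z[2:].reshape(2, 2)       # resolvent over a "timestep" of length 1

# Cross-check: centered finite difference of the flow map itself.
eps = 1e-6
cols = []
for j in range(2):
    d = np.zeros(2); d[j] = eps
    xp = rk4(np.concatenate([x0 + d, np.eye(2).ravel()]), 1e-3, 1000)[:2]
    xm = rk4(np.concatenate([x0 - d, np.eye(2).ravel()]), 1e-3, 1000)[:2]
    cols.append((xp - xm) / (2*eps))
L_fd = np.stack(cols, axis=1)
print(np.abs(L - L_fd).max())
```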
As will be seen later, it is the computation of the linear maps $L_i$, called resolvents, that takes most of the CPU time during a refinement, because the resolvent has $(2D)^2$ terms in a $2D$-dimensional phase space, and it needs to be computed to high accuracy. Presumably, if one is interested in studying simpler high-dimensional systems, a chaotic map would be a better choice than an ODE system, because no ODE integration is needed.
If the problem were not chaotic, the correction terms could be computed directly from equation 1.1. But since $L_i$ will amplify any errors in $c_i$ that occur near the unstable direction, computing the $c_i$'s by iterating equation 1.1 quickly produces nothing but noise. Therefore, GHYS suggest splitting the error and correction terms into components in the stable ($\hat s_i$) and unstable ($\hat u_i$) directions at each timestep:

    $c_i = \alpha_i \hat u_i + \beta_i \hat s_i$,    (1.2)
    $e_i = \tilde e^u_i \hat u_i + \tilde e^s_i \hat s_i$.    (1.3)
Computing the unstable unit direction vectors is currently done by initializing the unstable vector at time 0, $\hat u_0$, to an arbitrary unit vector and iterating the linearized map forward with

    $\hat u_{i+1} = L_i \hat u_i / \|L_i \hat u_i\|$.    (1.4)
Since $L_i$ magnifies any component that lies in the unstable direction, and assuming we are not so unlucky as to choose a $\hat u_0$ that lies precisely along the stable direction, then after a few e-folding times $\hat u_i$ will point roughly in the actual unstable direction. Similarly, the stable unit direction vectors are computed by initializing $\hat s_S$ to an arbitrary unit vector and iterating backwards,

    $\hat s_i = L_i^{-1} \hat s_{i+1} / \|L_i^{-1} \hat s_{i+1}\|$,    (1.5)
where $L_i^{-1}$ can be computed either by inverting $L_i$, or by integrating the variational equations backwards from step $i+1$ to step $i$. The latter is far more expensive, but may be more reliable in rare instances. In any case, it is intended that $L_i^{-1} L_i = I$, the identity matrix. There may be more efficient ways to compute the stable and unstable vectors, possibly having to do with eigenvector decomposition of the Jacobian of the map (not the Jacobian of the integral of the map), but I have not looked into this.
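The behaviour of equations 1.4 and 1.5 can be demonstrated in the simplest possible setting, a constant hyperbolic linear map, where the true stable and unstable directions are eigenvectors. The constant matrix is an illustrative stand-in for the varying $L_i$:

```python
import numpy as np

L = np.array([[2.0, 1.0], [1.0, 1.0]])   # constant stand-in for the L_i

w, V = np.linalg.eig(L)
V_u = V[:, np.argmax(np.abs(w))]          # true unstable direction (leading eigenvector)
V_s = V[:, np.argmin(np.abs(w))]          # true stable direction

rng = np.random.default_rng(1)
u = rng.standard_normal(2); u /= np.linalg.norm(u)
s = rng.standard_normal(2); s /= np.linalg.norm(s)
for _ in range(50):
    u = L @ u; u /= np.linalg.norm(u)                    # eq 1.4: forward iteration
    s = np.linalg.solve(L, s); s /= np.linalg.norm(s)    # eq 1.5: backward iteration
print(abs(u @ V_u), abs(s @ V_s))
```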
Substituting equations 1.2 and 1.3 into equation 1.1 yields

    $\alpha_{i+1} \hat u_{i+1} + \beta_{i+1} \hat s_{i+1} = L_i (\alpha_i \hat u_i + \beta_i \hat s_i) - (\tilde e^u_{i+1} \hat u_{i+1} + \tilde e^s_{i+1} \hat s_{i+1})$.    (1.6)
For the same reason that $L_i$ magnifies errors in the unstable direction, it diminishes errors in the stable direction. Likewise, $L_i^{-1}$ diminishes errors in the unstable direction and magnifies errors in the stable direction. Thus the $\alpha_i$ terms should be computed in the backwards direction, and the $\beta_i$ terms in the forward direction. Taking components of equation 1.6 in the unstable direction at step $i+1$ (recall that $L_i \hat u_i$ lies in the same direction as $\hat u_{i+1}$), iterate backwards on

    $\alpha_i = (\alpha_{i+1} + \tilde e^u_{i+1}) / \|L_i \hat u_i\|$,    (1.7)
and taking components in the stable direction, iterate forwards on

    $\beta_{i+1} = \beta_i / \|L_i^{-1} \hat s_{i+1}\| - \tilde e^s_{i+1}$.    (1.8)
The initial choices for $\alpha_S$ and $\beta_0$ are arbitrary as long as they are small -- smaller than the maximum shadowing distance -- because equation 1.8 damps initial conditions, and equation 1.7 damps final conditions. QT and GHYS choose them both as 0. This choice is probably as good as any, but it can be seen here that if one shadow exists, there are infinitely many of them. (Another justification of this is offered below.) Another way of looking at these initial choices for $\alpha_S$ and $\beta_0$ is that they ``pinch'' the growing components at the final end point, and the backwards-growing components at the initial point, to be small, so that $c_S$ has no unstable component and $c_0$ has no stable component. That is, boundary conditions are being forced on the problem so that the exponential divergence is forcibly masked, if possible.
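Putting equations 1.1-1.8 together, here is a sketch of one complete refinement pass on a perturbed cat map of the torus. The map, the noise level, and all names are illustrative assumptions; the exact map plays the role of the ``accurate'' integrator, so the only noise is the noise injected deliberately:

```python
import numpy as np

DELTA = 0.03                      # size of the nonlinear perturbation

def f(p):                         # perturbed cat map on the unit torus
    x, y = p
    return np.array([(2*x + y + DELTA*np.sin(2*np.pi*x)) % 1.0, (x + y) % 1.0])

def jac(p):                       # exact Jacobian L_i of the map
    return np.array([[2.0 + 2*np.pi*DELTA*np.cos(2*np.pi*p[0]), 1.0],
                     [1.0, 1.0]])

def wrap(v):                      # smallest representative of a difference on the torus
    return (v + 0.5) % 1.0 - 0.5

rng = np.random.default_rng(2)
S = 60
p = [np.array([0.21, 0.43])]
for i in range(S):                # noisy orbit: exact map plus injected noise
    p.append((f(p[-1]) + rng.uniform(-1e-6, 1e-6, 2)) % 1.0)

Lmat = [jac(p[i]) for i in range(S)]
e = [None] + [wrap(p[i+1] - f(p[i])) for i in range(S)]   # 1-step errors

# Unstable vectors forward (eq 1.4), stable vectors backward (eq 1.5).
u = [np.array([1.0, 0.0])]; lam = []
for i in range(S):
    v = Lmat[i] @ u[-1]; lam.append(np.linalg.norm(v)); u.append(v/lam[-1])
s = [np.array([0.0, 1.0])]; mu = [None]*S
for i in range(S-1, -1, -1):
    v = np.linalg.solve(Lmat[i], s[0]); mu[i] = np.linalg.norm(v); s.insert(0, v/mu[i])

# Split each 1-step error into unstable/stable components (eqs 1.2, 1.3).
eu = [0.0]*(S+1); es = [0.0]*(S+1)
for i in range(1, S+1):
    eu[i], es[i] = np.linalg.solve(np.column_stack([u[i], s[i]]), e[i])

# Backward sweep for alpha (eq 1.7), forward sweep for beta (eq 1.8).
alpha = [0.0]*(S+1); beta = [0.0]*(S+1)
for i in range(S-1, -1, -1):
    alpha[i] = (alpha[i+1] + eu[i+1]) / lam[i]
for i in range(S):
    beta[i+1] = beta[i]/mu[i] - es[i+1]

q = [(p[i] + alpha[i]*u[i] + beta[i]*s[i]) % 1.0 for i in range(S+1)]
err_p = max(np.abs(wrap(p[i+1]-f(p[i]))).max() for i in range(S))
err_q = max(np.abs(wrap(q[i+1]-f(q[i]))).max() for i in range(S))
print(err_p, err_q)
```

One pass reduces the 1-step errors from the injected noise level down to the level of the neglected quadratic terms of the Taylor expansion, which is exactly what the linearization predicts.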
Note that these boundary conditions allow the initial conditions for the shadow and noisy orbits to differ along the unstable direction. In fact, this must occur if the change in initial conditions is to have any effect. That is, when looking for a shadow, if perturbations are only allowed in the stable direction, those perturbations would die out, leading the ``shadow'' to follow the true orbit that passes through our initial conditions -- the one that is already known to diverge exponentially from the computed (noisy) orbit.
This section is derived from QT's Appendix B, although it is presented slightly differently.
If the configuration space is $D$-dimensional, then there are $2D$ dimensions in the phase space. It can be shown that in a Hamiltonian system, the number of stable directions and the number of unstable directions are each equal to $D$. At timestep $i$, let $\hat u_{j,i}$, $j=1,\dots,D$, be the $D$ unstable unit vectors, and let $\hat s_{j,i}$, $j=1,\dots,D$, be the $D$ stable unit vectors. For any particular timestep, it will be convenient if the unstable vectors are orthogonal to each other, and the stable vectors are orthogonal to each other. However, the stable and unstable vectors together will not in general form an orthogonal system.
The vectors are evolved exactly as before, except using Gram-Schmidt orthonormalization to produce two sets of $D$ orthonormal vectors at each timestep. Since we do not know a priori what directions are stable and unstable at each timestep, we choose an arbitrary orthonormal basis $\{\hat u_{j,0}\}$ at time zero, and an arbitrary orthonormal basis $\{\hat s_{j,S}\}$ at time $S$, and evolve them as:

    $\hat u'_{j,i+1} = L_i \hat u_{j,i}$,
    $\hat s'_{j,i} = L_i^{-1} \hat s_{j,i+1}$.
Then, at each timestep $i$, do two Gram-Schmidt orthonormalizations: one on $\{\hat u'_{j,i}\}$ to produce $\{\hat u_{j,i}\}$, and another on $\{\hat s'_{j,i}\}$ to produce $\{\hat s_{j,i}\}$. After a few e-folding times, we find that $\hat u_{1,i}$ points in the most unstable direction at timestep $i$, $\hat u_{2,i}$ points in the second most unstable direction, etc. Likewise, $\hat s_{1,i}$ points in the most stable direction at time $i$, $\hat s_{2,i}$ in the second most stable direction, etc.
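Since Gram-Schmidt orthonormalization of a set of vectors is exactly a reduced QR factorization, the evolution above can be sketched with one QR step per timestep. Here a fixed hyperbolic matrix stands in for the varying $L_i$, so the limiting directions are known eigenvectors and the ordering by instability can be checked:

```python
import numpy as np

A = np.array([[3.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0, 1.0]])   # symmetric, eigenvalues ~3.62, 2.62, 1.38, 0.38

w, V = np.linalg.eig(A)
order = np.argsort(-np.abs(w))          # most unstable first

rng = np.random.default_rng(3)
U = np.linalg.qr(rng.standard_normal((4, 2)))[0]   # arbitrary orthonormal basis, D = 2
for _ in range(40):
    U, R = np.linalg.qr(A @ U)          # evolve, then re-orthonormalize (Gram-Schmidt)
print(abs(U[:, 0] @ V[:, order[0]]), abs(U[:, 1] @ V[:, order[1]]))
```

After the transient, the first column tracks the most unstable direction and the second column the second most unstable, as claimed above.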
The multidimensional generalization of the error and correction vectors is the obvious

    $c_i = \sum_{j=1}^{D} \alpha_{j,i} \hat u_{j,i} + \sum_{j=1}^{D} \beta_{j,i} \hat s_{j,i}$,
    $e_i = \sum_{j=1}^{D} \tilde e^u_{j,i} \hat u_{j,i} + \sum_{j=1}^{D} \tilde e^s_{j,i} \hat s_{j,i}$.
To convert the 1-step error $e_i$ at timestep $i$ from phase-space co-ordinates to the stable and unstable basis, one constructs the $2D \times 2D$ matrix $M_i$ whose columns are the unstable and stable unit vectors,

    $M_i = [\hat u_{1,i} \cdots \hat u_{D,i} \ \hat s_{1,i} \cdots \hat s_{D,i}]$,

and solves the linear system $M_i \tilde e_i = e_i$ for the vector of co-efficients $\tilde e_i = (\tilde e^u_{1,i}, \dots, \tilde e^u_{D,i}, \tilde e^s_{1,i}, \dots, \tilde e^s_{D,i})^T$.
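The change of basis is a single linear solve per timestep; a tiny sketch with random stand-in basis vectors:

```python
import numpy as np

rng = np.random.default_rng(4)
D = 2
# Stand-in unit vectors (in practice these come from the evolved bases).
u_vecs = [v/np.linalg.norm(v) for v in rng.standard_normal((D, 2*D))]
s_vecs = [v/np.linalg.norm(v) for v in rng.standard_normal((D, 2*D))]
M = np.column_stack(u_vecs + s_vecs)     # columns: u_1..u_D, s_1..s_D
e = rng.standard_normal(2*D)             # a 1-step error in phase-space co-ordinates
coeffs = np.linalg.solve(M, e)           # first D entries: unstable; last D: stable
print(np.allclose(M @ coeffs, e))
```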
The equation for the correction co-efficients in the unstable subspace at timestep $i+1$ is

    $\sum_j \alpha_{j,i+1} \hat u_{j,i+1} = L_i \sum_j \alpha_{j,i} \hat u_{j,i} - \sum_j \tilde e^u_{j,i+1} \hat u_{j,i+1}$,

which we project out along $\hat u_{k,i+1}$ producing

    $\alpha_{k,i+1} = \sum_j \lambda_{jk,i} \alpha_{j,i} - \tilde e^u_{k,i+1}$,

where the scalar $\lambda_{jk,i} = \hat u_{k,i+1} \cdot L_i \hat u_{j,i}$, and the Gram-Schmidt process ensures $\lambda_{jk,i} = 0$ if $j<k$. The boundary condition at timestep $S$ is $\alpha_{j,S} = 0$ for all $j$, and we compute the co-efficients backwards using

    $\alpha_{k,i} = \left( \alpha_{k,i+1} + \tilde e^u_{k,i+1} - \sum_{j>k} \lambda_{jk,i} \alpha_{j,i} \right) / \lambda_{kk,i}$.

We first solve for $\alpha_{D,i}$, which doesn't require knowledge of the other co-efficients, then we solve for $\alpha_{D-1,i}$, etc.
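The back-substitution order can be sketched as follows, with the array entry $T[k,j]$ holding the assumed scalars $\lambda_{jk,i}$ at one timestep (upper triangular, since $\lambda_{jk,i} = 0$ for $j < k$); all names here are illustrative:

```python
import numpy as np

def unstable_step_back(T, a_next, eu):
    # One backward step: given alpha_{k,i+1} (a_next) and the unstable error
    # co-efficients (eu), solve the triangular system for alpha_{k,i}.
    D = len(a_next)
    a = np.zeros(D)
    for k in range(D - 1, -1, -1):   # alpha_{D,i} first, then alpha_{D-1,i}, ...
        a[k] = (a_next[k] + eu[k] - T[k, k+1:] @ a[k+1:]) / T[k, k]
    return a

rng = np.random.default_rng(5)
D = 3
T = np.triu(rng.standard_normal((D, D)), 1) + np.diag(2 + rng.random(D))
a_next = rng.standard_normal(D); eu = rng.standard_normal(D)
a = unstable_step_back(T, a_next, eu)
print(np.allclose(T @ a, a_next + eu))   # verifies the triangular solve
```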
The equation for the correction co-efficients in the stable subspace at timestep $i$ is

    $\sum_j \beta_{j,i} \hat s_{j,i} = L_i^{-1} \sum_j \left( \beta_{j,i+1} + \tilde e^s_{j,i+1} \right) \hat s_{j,i+1}$,

which we project out along $\hat s_{k,i}$ producing

    $\beta_{k,i} = \sum_j \mu_{jk,i} \left( \beta_{j,i+1} + \tilde e^s_{j,i+1} \right)$,

where $\mu_{jk,i} = \hat s_{k,i} \cdot L_i^{-1} \hat s_{j,i+1}$, and the Gram-Schmidt process ensures $\mu_{jk,i} = 0$ if $j<k$. The boundary condition at time 0 is $\beta_{j,0} = 0$ for all $j$, and we compute the co-efficients forwards using

    $\beta_{k,i+1} = \left( \beta_{k,i} - \sum_{j>k} \mu_{jk,i} (\beta_{j,i+1} + \tilde e^s_{j,i+1}) \right) / \mu_{kk,i} - \tilde e^s_{k,i+1}$.

As with the unstable corrections, we first compute $\beta_{D,i+1}$, which does not require knowledge of the other co-efficients, then we compute $\beta_{D-1,i+1}$, etc.
There is no guarantee that refinement converges towards a true orbit; if there were, then all noisy orbits would be shadowable. In fact, even if some refinements are successful, numerical refinement alone does not prove rigorously that a true shadow exists; it only proves the existence of a numerical shadow, i.e., a trajectory that has less noise than the original. Furthermore, the 1-step error computed by any numerical technique measures the difference between the noisy and more accurate solutions at timestep $i+1$, where both start from the same position at timestep $i$. Even if this distance is close to the machine precision, it does not imply that the difference between the noisy solution at timestep $i+1$ and the true solution passing through step $i$ is equally small. This was dramatically demonstrated one day when I accidentally set the precision of the ``accurate'' integrator to a tolerance many orders of magnitude coarser than the machine precision, and the shadowing routine happily refined the orbit until the 1-step errors all had magnitudes near the machine precision. Considering that the ``accurate'' trajectories were only being computed to that coarse tolerance, it seems that refinement had converged, to within roundoff, to a specific numerical orbit whose true 1-step errors were about as large as the integrator's tolerance.
Furthermore, a numerical integration routine that is requested to integrate to a tolerance close to the machine precision might not achieve it, because it might undetectably lose a few digits near the machine precision. Thus, even when a numerical shadow is found whose 1-step errors are close to the machine precision, the true 1-step errors are probably a few digits larger. GHYS provide a method called containment that can prove rigorously when a true shadow exists, but we have not implemented containment. As a surrogate for containment, QT did experiments on simple chaotic maps with 100-digit accuracy (using the Maple symbolic manipulation package), showing that if the GHYS refinement procedure refined the trajectory to 1-step errors of roughly the double-precision machine epsilon, then successful refinements could be continued down to far smaller 1-step errors. It is reasonable to assume that refinement would continue to decrease the noise, converging in the limit to a noiseless (true) trajectory.
For the above reasons, we are confident that convergence to a numerical shadow implies, with high probability, the existence of a true shadow. However, to prove it rigorously requires implementing a scheme such as the GHYS containment procedure. This is one possible avenue for further research.
There is also no guarantee, even if the refinement procedure does converge, that it converges to a reasonable shadow of $\{p_i\}$; in principle it could converge to a true orbit that is far from $\{p_i\}$, in which case the term ``shadow'' would be inappropriate. However, I have found in practice that when refinement fails, it always fails because the 1-step errors ``explode'' (become large). I have never seen a case in which the refined orbit diverged from the original orbit while retaining small 1-step errors.
The error explosion occurs when 1-step errors are so large that the linearized map becomes invalid for computing corrections. Since the method is global (i.e., each correction depends on all the others), inaccuracies in the computation of the corrections can quickly amplify the noise rather than decrease it. Thus, within one or two refinement iterations, the 1-step errors can grow by many orders of magnitude, resulting in failed refinements. It is unclear whether local methods like SLES (introduced below) suffer the same consequences; probably they do, but the errors probably grow much more slowly, and only locally in the orbit where 1-step errors are large.
I have tried two integrators as my ``accurate'' integrator, both of which were variable-order, variable-timestep Adams methods: SDRIV2 and LSODE [18, 16]. QT used a Bulirsch-Stoer integrator by Press et al. [28]. It would be interesting to study whether the choice of accurate integrator influences which numerical shadow is found, because if one true shadow exists, then infinitely many (closely packed) true shadows exist. However, if the boundary conditions are the same, then the solutions should be the same.
Since $\hat u_0$ is arbitrary, while it is the displacement along the actual unstable subspace at time 0 that determines the evolution of the orbit, is it possible that we may fail to find a shadow because the initial perturbations allowed by our choice of $\hat u_0$ don't include the perturbations necessary to find a shadow?
I have not pursued this question at all, but it seems an interesting one. There are examples introduced below where the GHYS refinement procedure failed to find a shadow for a noisy orbit $\{p_i\}$ when it can be shown that one exists; the above scenario may be a cause. One way to test it is to try to find a shadow for $\{p_i\}$ using a different initial guess for $\hat u_0$, although I have not tried this. Similar comments apply to $\hat s_S$.
However, there are at least three arguments against this dependence being a problem. First, since the stable subspace has measure zero in the total space, it seems unlikely that an arbitrary choice for $\hat u_0$ could select a subspace that doesn't contain ``enough'' of the unstable subspace. Secondly, the correction co-efficients for the unstable subspace are computed starting at the end of the trajectory, where $\hat u_S$ is presumably correct; similarly, the correction co-efficients for the stable subspace are computed starting at time 0, where $\hat s_0$ is presumably correct. Thirdly, assuming the shadowing distance is much larger than the 1-step errors, it is likely that the intervals near the endpoints where the stable/unstable vectors are inaccurate are too small to allow instabilities in the computations of the corrections to build from 1-step-error sizes to shadow-distance sizes. Further study of this possible problem may be helpful.
If it does turn out to be a problem, there are many possible fixes. First, we could attempt to shadow only between points that are known to have accurate approximations to the stable and unstable directions. One measure of this accuracy is to start with two different arbitrary unit vectors at time 0 and evolve them both forward using equation 1.4. When they converge to within some tolerance of the same vector at timestep $a$, we can assume the unstable vectors are correct after timestep $a$. The same could be done for the stable vectors, starting with two different guesses at time $S$ and evolving backwards using equation 1.5 until they converge at timestep $b$. Then, shadowing would be attempted only between $a$ and $b$.
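The proposed convergence test can be sketched as follows, again with a perturbed cat map as an assumed stand-in system; two different seed vectors are evolved with equation 1.4, and the first timestep $a$ at which they agree to within a tolerance is recorded:

```python
import numpy as np

DELTA = 0.03

def f(p):                         # perturbed cat map, a stand-in system
    x, y = p
    return np.array([(2*x + y + DELTA*np.sin(2*np.pi*x)) % 1.0, (x + y) % 1.0])

def jac(p):
    return np.array([[2.0 + 2*np.pi*DELTA*np.cos(2*np.pi*p[0]), 1.0],
                     [1.0, 1.0]])

rng = np.random.default_rng(6)
orbit = [np.array([0.37, 0.11])]
for _ in range(60):
    orbit.append(f(orbit[-1]))

v = rng.standard_normal(2); v /= np.linalg.norm(v)
w = rng.standard_normal(2); w /= np.linalg.norm(w)
a = None
for i in range(60):
    v = jac(orbit[i]) @ v; v /= np.linalg.norm(v)   # eq 1.4 for both seeds
    w = jac(orbit[i]) @ w; w /= np.linalg.norm(w)
    if a is None and abs(v @ w) > 1 - 1e-12:
        a = i + 1                 # first timestep where the two seeds agree
print(a, abs(v @ w))
```

Once the two seeds agree, they stay aligned, since further iteration only contracts them toward the unstable direction.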
If we want to shadow all the way to the endpoints, we could attempt to get reliable stable/unstable vectors everywhere in the following manner: compute the times $a$ and $b$ as above. Then, starting from the trusted $\hat u_a$, evolve it backwards to obtain $\hat u_i$ for $i < a$, hoping that instabilities from the stable subspace don't overwhelm the computation for the few steps that the iteration proceeds backwards. Do the same for $\hat s_b$, except evolving forward. This seems to me the most promising method, although I have not yet tested it. To test how much of the unstable subspace has encroached on $\hat s_i$, perhaps we can dot $\hat s_i$ with $\hat u_i$ to ensure the dot product stays small.
Another (untested) method to compute the stable and unstable vectors out to the endpoints is: compute the $\hat u_i$ in the normal way. Then choose $\hat s_S$ to be orthogonal to $\hat u_S$. Even though the stable and unstable vectors are not generally orthogonal, choosing $\hat s_S$ orthogonal to $\hat u_S$ seems better than a random choice. Then, compute the $\hat s_i$ normally. Then recompute the $\hat u_i$ using a $\hat u_0$ orthogonal to $\hat s_0$.
If a method can be invented that computes the stable and unstable vectors using eigenvector decomposition, it would probably be better than all of the above methods, because it would not require any resolvents and would not be restricted to staying away from the endpoints.