Although the following refinement algorithm has nothing to do with neural
networks, the idea for it was inspired by how the neural-net
backpropagation learning algorithm works. In backpropagation, a function
measures the error in the output of a neural network,
given a certain input. The neural network ``learns'' by adjusting its
internal variables to reduce its output error;
learning stops when the output error is sufficiently small.
It reduces the error by taking partial
derivatives of the error function with respect to all the internal
variables, producing the gradient of the error with respect to those
variables. Moving the variables a small amount in the direction opposite
the gradient should then reduce the error. The size of the step taken
along the negative gradient is called the *learning rate*.
In one sentence, backpropagation says, ``Find a direction to move that
reduces the error, and then move a little bit in that direction.''
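This loop --- measure the error, compute its gradient, move a little in the
opposite direction --- can be sketched in a few lines. The quadratic error
function, the learning rate of 0.1, and the target vector below are arbitrary
illustrative choices, not anything from this thesis:

```python
import numpy as np

def gradient_descent(grad, w0, learning_rate=0.1, steps=100):
    """Repeatedly move a small amount opposite the gradient of the error."""
    w = np.asarray(w0, dtype=float)
    for _ in range(steps):
        w = w - learning_rate * grad(w)  # step along the negative gradient
    return w

# Toy error function E(w) = ||w - target||^2, whose gradient is 2*(w - target);
# descent should drive the internal variables w toward the target.
target = np.array([1.0, -2.0])
w = gradient_descent(lambda w: 2.0 * (w - target), w0=[0.0, 0.0])
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8
per step, so after 100 steps the error is negligible.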

The inspiration for shadowing is this: once we have computed the 1-step
error vectors $e_i = p_i - f(p_{i-1})$ along the noisy trajectory
$\{p_i\}_{i=0}^S$, where $f$ integrates the solution forward from time
*i*-1 to time *i*, it seems reasonable that the
1-step error at step *i* would be reduced if we simply
*subtracted* a fraction $\alpha$
of $e_i$ from $p_i$, for some $0 < \alpha < 1$.
This is the basis for what I call a *Local Error
Subtraction* method.

A naïve implementation of the above idea might be simply to set
\[ p_i \leftarrow p_i - \alpha e_i \]
for *i* from 1 to *S*.
However, this does not define how to change the initial point $p_0$,
because there is no $e_0$. Since any new true trajectory *must*
have a different initial condition, we must find a way to correct
$p_0$.
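A toy sketch of this naïve sweep, taking the 1-step error to be
$e_i = p_i - f(p_{i-1})$ and using an affine map as the ``system'' so the
algebra is exact; the map, noise level, and damping factor are illustrative
assumptions only. Note that the initial point is never corrected:

```python
import numpy as np

def f(x):
    # Toy affine "system" used only for illustration (not chaotic).
    return 0.5 * x + 0.1

def one_step_errors(p):
    # e_i = p_i - f(p_{i-1}), defined for i = 1..S.
    return p[1:] - f(p[:-1])

def naive_les_sweep(p, alpha=0.9):
    # Subtract a fraction alpha of each 1-step error from its point.
    # p[0] is left alone: there is no e_0 to subtract.
    q = p.copy()
    q[1:] -= alpha * one_step_errors(p)
    return q

# Build a true orbit, perturb it with noise, and do one refinement sweep.
rng = np.random.default_rng(0)
true_orbit = np.empty(20)
true_orbit[0] = 0.3
for i in range(1, 20):
    true_orbit[i] = f(true_orbit[i - 1])
noisy = true_orbit + 1e-3 * rng.standard_normal(20)

refined = naive_les_sweep(noisy)
```

For this mild affine map one sweep demonstrably shrinks the largest 1-step
error; for a chaotic system the unstable directions spoil this, which is the
point of the discussion that follows.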

To allow corrections to $p_0$, I introduce the *backward error*
\[ \bar e_i = p_i - \bar f(p_{i+1}), \]
where $\bar f$ integrates the solution backwards
from time *i*+1 to time *i*, and $\bar e_i$ is defined on $0 \le i \le S-1$,
where *S* is the number of shadow steps. Thus each timestep from 1 to
*S*-1 has both a forward error $e_i$ and a backward error $\bar e_i$;
timestep 0 has only a backward error $\bar e_0$,
and timestep *S* has only a forward error $e_S$.
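The two families of errors can be computed side by side. The sketch below uses
an invertible affine map (an illustrative stand-in for the integrator) so that
$\bar f$ is simply $f^{-1}$; for this contracting map each backward error comes
out exactly twice the corresponding forward error in magnitude, a small
demonstration of how backward integration amplifies components that the
forward map contracts:

```python
import numpy as np

def f(x):          # forward map: time i -> i+1 (toy affine system)
    return 0.5 * x + 0.1

def f_back(x):     # backward map: time i+1 -> i (exact inverse of f here)
    return (x - 0.1) / 0.5

p = np.linspace(0.0, 1.0, 6)   # a noisy "trajectory" p_0 .. p_S with S = 5

# Forward 1-step errors    e_i    = p_i - f(p_{i-1}),    for i = 1..S.
e_fwd = p[1:] - f(p[:-1])
# Backward 1-step errors   ebar_i = p_i - fbar(p_{i+1}), for i = 0..S-1.
e_bwd = p[:-1] - f_back(p[1:])

# Because f is affine with slope 1/2, ebar_i = -2 * e_{i+1} exactly:
# the backward error is the forward error amplified by 1/f'.
```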

Subtracting some linear combination of the two errors from each point
(that is, setting $p_i \leftarrow p_i - \alpha e_i - \beta \bar e_i$)
does not work for any reasonable values of $\alpha$ and $\beta$ that I have
tried (between 0.1 and 0.9). I believe this is because 1-step errors computed
in the forward direction will be slightly unstable in their components that
lie in the unstable subspace, and backwards 1-step errors will be unstable in
their components that lie in the stable subspace. The method works, however,
if we rewrite $e_i$ using equation 1.2 (which was the decomposition of the
error into its stable and unstable components, $e_i = e_i^s + e_i^u$),
decompose $\bar e_i = \bar e_i^s + \bar e_i^u$ similarly, and update $p_i$ as
\[ p_i \leftarrow p_i - \alpha \left( e_i^s + \bar e_i^u \right). \]
In other words, we use only the components of the 1-step error that are
computationally stable.
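A minimal end-to-end sketch of such a sweep, on a toy two-dimensional linear
system whose stable and unstable subspaces are just the coordinate axes, so
the stable/unstable projections are trivial diagonal matrices; the matrix,
noise level, number of sweeps, and $\alpha$ are illustrative assumptions, not
values from this thesis. Note that $p_0$ *is* corrected here, through the
unstable component of its backward error:

```python
import numpy as np

A = np.diag([2.0, 0.5])               # toy linear system: x expands, y contracts
A_inv = np.linalg.inv(A)
P_u = np.diag([1.0, 0.0])             # projector onto the unstable (x) subspace
P_s = np.diag([0.0, 1.0])             # projector onto the stable (y) subspace

def forward_errors(p):
    return p[1:] - p[:-1] @ A.T       # e_i    = p_i - f(p_{i-1}),    i = 1..S

def backward_errors(p):
    return p[:-1] - p[1:] @ A_inv.T   # ebar_i = p_i - fbar(p_{i+1}), i = 0..S-1

def sles_sweep(p, alpha=0.9):
    # Subtract only the computationally stable error components:
    # stable part of the forward errors, unstable part of the backward errors.
    q = p.copy()
    q[1:]  -= alpha * forward_errors(p) @ P_s
    q[:-1] -= alpha * backward_errors(p) @ P_u   # corrects p_0 as well
    return q

S = 12
rng = np.random.default_rng(2)
true_orbit = np.array([[2.0**i * 1e-3, 0.5**i] for i in range(S + 1)])
p = true_orbit + 1e-3 * rng.standard_normal((S + 1, 2))

initial_err = np.max(np.abs(forward_errors(p)))
for _ in range(50):
    p = sles_sweep(p)
final_err = np.max(np.abs(forward_errors(p)))
```

On this linear toy the sweeps converge to an exact nearby trajectory (a
shadow) whose initial condition differs from the noisy one, and the 1-step
errors drop to roundoff.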

The SLES method seems to converge about as fast as the optimized GHYS method introduced in the next chapter. For example, using $\alpha = 0.9$, 1-step errors are generally reduced by about a factor of 10 per refinement iteration, as one would expect when subtracting away 90% of the error. However, I have not tested it extensively, and since it seems to have a weaker theoretical basis than the GHYS method, I have not used it in any of the results mentioned in later chapters. I have no reason to believe that the method is unreliable, though, so further testing in the future seems warranted.

Finally, note that SLES makes no explicit use of the linear maps.
Currently, the only use of the resolvents in SLES is implicitly through the
construction of the stable and unstable directions. If a method of computing
the stable and unstable directions can be found that doesn't use the linear
maps (*e.g.*, an eigenvector decomposition, as mentioned above), then the SLES
method will not use the linear maps at all, and they will not need to
be computed. Since computing the linear
maps (for an ODE system) is currently by far
the most computationally intensive section of the program, getting rid
of them entirely is an exciting prospect.

It is perhaps important to note that there is a fundamental difference
between the GHYS refinement procedure, which I call a *global*
method, and SLES, which I call a *local* method. Let
the initial noisy orbit be $\{p_i^0\}_{i=0}^S$, with the superscript
referring to the refinement iteration.
Assuming the GHYS refinement procedure can be written as a Newton's method,
it can be summarized as
\[ p^{k+1} = p^k - \left[ g'(p^k) \right]^{-1} g(p^k), \]
where $p^k$ denotes the entire orbit at refinement iteration *k*,
*g* was defined near the beginning of this chapter as the function
computing the forward 1-step errors on the entire orbit,
and *g*' is the Jacobian of *g*. Thus the entire orbit is updated
using a single (large) equation.
SLES, on the other hand, applies the local correction formula of
equation 1.9 to each point independently; namely
\[ p_i^{k+1} = p_i^k - \alpha \left( (e_i^s)^k + (\bar e_i^u)^k \right), \]
with the endpoints $p_0$ and $p_S$ corrected using only the error that
exists there.
One consequence of this difference occurs when the 1-step errors
in the GHYS method ``explode'' and cause the iteration to fail.
The explosion seems to occur much faster using GHYS than using SLES.
I believe this is because in GHYS, all the 1-step errors along the entire
trajectory contribute to each other's computation of the correction,
so that if a single 1-step error is too big for its linear map to
be valid, then the corrections for *all* the points will be invalid.
However, in SLES, there is no such dependence between points at arbitrary
distances along the trajectory, so a point with large 1-step error can propagate
its instability to neighbors *j* shadow steps away only after *j* refinement
iterations.

Currently, the only known way to isolate glitches using the GHYS method is
the way QT did it, namely to try shadowing for longer and longer times
until the orbit explodes, and then to do a binary search
for shorter and
longer shadows until the shadow step at which the glitch occurs is pinpointed.
This is a lot of work.
Because of the slow way glitches propagate their effects in SLES, it may
be possible to isolate glitches simply by their large 1-step errors and
those of their neighbors. It may even be possible to pinpoint multiple
glitches along a single trajectory. For example, points $p_i$ and $p_j$
would be glitches if we could shadow the trajectory from 0 to $i-1$, from
$i+1$ to $j-1$, and from $j+1$ to *S*, but not shadow any part
of the trajectory containing $p_i$ or $p_j$. This introduces the concept
of *shadowable segments*, although it is not clear what use such a concept
is.
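One hypothetical way such a detector might look: flag steps whose 1-step error
is far above the typical level, then split the remaining steps into maximal
glitch-free runs --- the candidate shadowable segments. The threshold factor
and the median-based scale below are invented for illustration:

```python
import numpy as np

def candidate_glitches(e, factor=100.0):
    """Flag steps whose 1-step error is far above the typical level.
    A crude proxy for 'isolate glitches by their large errors'."""
    scale = np.median(np.abs(e))
    return np.flatnonzero(np.abs(e) > factor * scale)

def shadowable_segments(n_steps, glitches):
    """Split steps 1..n_steps into maximal runs containing no glitch."""
    segments, start = [], 1
    for gl in list(glitches) + [n_steps + 1]:   # sentinel past the last step
        if gl > start:
            segments.append((start, gl - 1))
        start = gl + 1
    return segments

e = np.full(30, 1e-8)      # typical 1-step error level; e[k] is the error at step k+1
e[[9, 21]] = 1e-2          # two artificial 'glitches' at steps 10 and 22
gl = candidate_glitches(e)
segs = shadowable_segments(30, gl + 1)   # shift array indices to step numbers
```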

Sun Dec 29 23:43:59 EST 1996
