1.. _benchmarks:
2
3Benchmarks
4==========
5
6In this section we provide a few performance comparisons between heyoka and other popular
7ODE integration packages. Specifically, we compare heyoka to:
8
9- `DifferentialEquations.jl <https://diffeq.sciml.ai/>`__, a popular Julia
10  library implementing several ODE solvers. In these benchmarks, we will be using
11  explicit Runge-Kutta solvers such as ``Vern9`` and ``DP8`` (see
12  `here <https://diffeq.sciml.ai/stable/solvers/ode_solve/>`__ for a list of
13  ODE solvers available in DifferentialEquations.jl);
14- `Boost.ODEInt <https://www.boost.org/doc/libs/master/libs/numeric/odeint/doc/html/index.html>`__,
15  a C++ package implementing various algorithms for the solution of systems of ODEs. In these
16  benchmarks, we will be using the explicit Runge-Kutta-Fehlberg 78 solver;
17- the ``IAS15`` integrator from `REBOUND <https://github.com/hannorein/rebound>`__,
18  a popular N-body integration package. Like heyoka, ``IAS15`` is a high-precision
19  non-symplectic integrator with adaptive timestepping capable of conserving the
20  dynamical invariants over billions of dynamical timescales. Note, however, that
21  ``IAS15`` is not a general-purpose integrator, and thus we will be able to use
22  it only in benchmarks involving gravitational N-body systems.
23
24Note that the integrators from DifferentialEquations.jl by default
25enable dense output, which however incurs in a heavy computational cost. While heyoka also supports
26:ref:`dense output <tut_d_output>`, this feature is opt-in and its performance impact is much more limited.
27In these benchmarks, we will be testing both with and without dense output.
28
29All benchmarks were run on an Intel Xeon Platinum 8360Y CPU. More benchmark results are available in the
30`heyoka paper <https://arxiv.org/abs/2105.00800>`__.
31The benchmarks' source code is available in the `github repository <https://github.com/bluescarni/heyoka/tree/master/benchmark>`__.
32
33The planetary three-body problem
34--------------------------------
35
36Here we will numerically integrate a specific case of the `three-body problem <https://en.wikipedia.org/wiki/Three-body_problem>`__
37in which the three particles are the Sun, Jupiter and Saturn, all represented as point masses
38attracting each other according to `Newtonian gravity <https://en.wikipedia.org/wiki/Newton%27s_law_of_universal_gravitation>`__.
39The initial conditions are taken from `this paper <https://ntrs.nasa.gov/citations/19860060859>`__, and the integration
40is run for a total of :math:`10^5` years.
41For ``Vern9`` and heyoka, the test is run both with and without :ref:`dense output <tut_d_output>`. When dense output is enabled,
42the result of the integration over :math:`5 \times 10^5` equispaced time grid points is requested.
43
44In order to measure the accuracy of the integration, we will also compare the final state of the system
45with the result of a numerical integration in quadruple precision with a tolerance of :math:`10^{-30}`.
46
47Let us see first the results for an error tolerance of :math:`10^{-15}`:
48
49.. image:: images/ss_3bp_15.png
50  :align: center
51  :alt: 3BP benchmark 1e-15
52
53We can see how, without dense output, heyoka is about 3 times faster than ``Vern9``. When dense output is requested,
54heyoka's runtime increases by a modest :math:`\sim 24\%`, whereas for ``Vern9`` the runtime increases by a factor of
55:math:`\sim 3`, so that, with dense output, heyoka is about :math:`\sim 7` times faster than ``Vern9``. Performance-wise,
56Boost.ODEint is comparable to ``Vern9`` (note that the RKF78 integrator from Boost.ODEInt does not support dense output).
57
58From the point of view of the integration accuracy, we can see how the RMS of the error across the components of the state
59vector with respect to the quadruple-precision integration is of the order of :math:`10^{-9}` for heyoka, while for both
60``Vern9`` and Boost.ODEInt the error is about :math:`\sim 35` times larger.
61
62Note that, even if the error tolerance for the integration is set to :math:`10^{-15}`, the error at the end of the integration
63is of the order of :math:`10^{-9}`. This is to be expected, as the error on the state variables accumulates as (at least)
64:math:`t^{\frac{3}{2}}` (a result known as Brouwer's law).
65
66Let us now see the results for an error tolerance of :math:`10^{-9}`:
67
68.. image:: images/ss_3bp_9.png
69  :align: center
70  :alt: 3BP benchmark 1e-9
71
72Whereas heyoka is still faster than ``Vern9`` and Boost.ODEInt, at this higher integration tolerance the performance
73advantage is smaller.
74
75The integration accuracy of both heyoka and ``Vern9`` is of the order of :math:`10^{-3}`. By contrast,
76the accuracy of Boost.ODEInt is two orders of magnitude worse.
77
78The outer Solar System
79----------------------
80
81In this benchmark, we will integrate the motion of the outer Solar System for 1 million years. We define the outer Solar
82System as the 6-body problem consisting of the Sun, Jupiter, Saturn, Uranus, Neptune and Pluto, all considered as point
83masses attracting each other according to `Newtonian gravity <https://en.wikipedia.org/wiki/Newton%27s_law_of_universal_gravitation>`__.
84The initial conditions are taken from `this paper <https://ntrs.nasa.gov/citations/19860060859>`__.
85
86For this benchmark, we will be comparing heyoka to the ``IAS15`` integrator from `REBOUND <https://github.com/hannorein/rebound>`__.
87
88Here are the results:
89
90.. image:: images/outer_ss_bench.png
91  :scale: 60%
92  :align: center
93  :alt: Outer Solar System benchmark
94
95We can see how heyoka is about 5 times faster than ``IAS15`` in this specific test. A detailed performance comparison with ``IAS15``
96is available in the `heyoka paper <https://arxiv.org/abs/2105.00800>`__.
97
98Event detection
99---------------
100
101In this benchmark we measure the overhead of heyoka's :ref:`event detection <tut_events>` system and compare it to
102the ``Vern9`` and ``DP8`` integrators from DifferentialEquations.jl.
103We consider the dynamical system studied by Hénon and Heiles in a
104`famous numerical experiment <https://ui.adsabs.harvard.edu/abs/1964AJ.....69...73H/abstract>`__ investigating
105the existence of additional integrals of motion in axisymmetric potentials. The differential equations are:
106
107.. math::
108
109   \begin{cases}
110   v_x^\prime &= -x-2xy \\
111   v_y^\prime &= y^2-y-x^2 \\
112   x^\prime &= v_x \\
113   y^\prime &= v_y
114   \end{cases},
115
116with initial conditions
117
118.. math::
119
120   \begin{cases}
121   v_x\left(0\right) &= -0.2525875586263492  \\
122   v_y\left(0\right) &= -0.2178423952983717  \\
123   x\left(0\right) &= 0  \\
124   y\left(0\right) &=  0.2587703282931232 \\
125   \end{cases}.
126
127Our objective is to compute the `Poincaré section <https://en.wikipedia.org/wiki/Poincar%C3%A9_map>`__
128of the solution on the :math:`\left( y,v_y \right)` plane. This can be accomplished by setting up the event equation
129
130.. math::
131
132   x = 0
133
134to detect when the solution crosses the :math:`\left( y,v_y \right)` plane. Like in the original paper,
135we impose the additional constraint that the event direction must be *positive* (i.e., we only detect
136crossing of the plane from below). The total integration time is :math:`2000` time units and the tolerance
137is set to :math:`10^{-15}`. For both heyoka and the DifferentialEquations.jl integrators, we measure the runtime
138both with and without event detection.
139
140Here are the results:
141
142.. image:: images/event_det.png
143  :align: center
144  :alt: Event detection benchmark
145
146We can see how heyoka's event detection system has a much lower overhead than the event detection system in
147DifferentialEquations.jl. heyoka's event detection system combines the free dense output from Taylor's method
148with state-of-the-art polynomial root finding techniques to provide an event-detection methodology which is
149both rigorous and computationally efficient. By contrast, DifferentialEquations.jl adopts the approach
150of checking for sign changes in the event equation using the interpolant of the solution
151within a timestep at discrete points. Note that
152this approach is not rigorous, in the sense that if the event equation has two zeroes between the interpolation
153points the event will be missed. By contrast, heyoka's approach does not suffer from this issue.
154