1. INTRODUCTION
Historically, fully self-consistent realistic astrophysical N -body simulations are a challenging problem (Aarseth 1971; Efstathiou et al. 1985; Stadel 2001; Springel et al. 2001; Dehnen & Read 2011; Klypin 2018). On the purely gravitational case, direct summation N -body codes suffer from a bottleneck due to the computational cost of force calculation on a particle-by-particle basis, which scales as the square of particle number (Np2). For this reason, they are primarily used for simulating dense stellar environments or planetary systems. Such limitations triggered the development of sophisticated approximate hybrid collisionless methods, like the TreePM (Tree-Particle Mesh)/P3M (Particle ParticleParticle Mesh)/P3T (Particle ParticleParticle Tree) codes (Xu 1995; Bode et al. 2000; Bagla 2002; Bode & Ostriker 2003) where the short-range component of the force is carried out either by expensive/accurate direct summation or tree force solvers (Couchman 1991; Oshino et al. 2011; Habib et al. 2013). Alternatively, AMR (Adaptive Mesh refinement) methods are used to compute the large dynamical range of the gravitational force (Villumsen 1989; Jessop et al. 1994; Kravtsov et al. 1997; Teyssier 2002), seeking a balance between accuracy and computational efficiency. N-body simulations require both a fast way to calculate the accelerations and an accurate and efficient integration method to evolve particles in time. The secondorder leapfrog symplectic integrator (Verlet 1967) is the most widely used in collisionless N -body simulations (e.g. Klypin 2018; Angulo & Hahn 2021). It is strictly symplectic when a global-constant time-step is adopted; however, this is not suitable for addressing problems with a large dynamical range that are currently studied with modern codes. In the quest to improve efficiency, it is necessary to adopt multiple or adaptive time-steps. The general problem of geometric/symplectic (preserving phase space volume), time-symmetric (recover initial conditions after changing dt for -dt) and reversible integrators (recover initial conditions after changing the sign of velocities) has been addressed in the field of differential equations for dynamical systems (Hairer et al. 2002). In such work, they point out the differences between adaptive global time-steps or multi-time-steps (several rates of evolution for different parts of the system) and discuss the constraints required for the integration method and the time-step selection function to preserve the mapping properties. In astronomy, the influential study of Quinn et al. (1997) discusses different operator-based leapfrog implementations using time step blocks plus a time-step selection function in the KDK/DKD leapfrog integrator for massive N -body simulations. They point out that particle migration across time-step blocks may break up the symmetry and sometimes involves backward integration, which can be difficult to reconcile with a dissipative component like gas. Current collisionless simulations codes commonly use the KDK leapfrog implementation with adaptive time-steps (e.g. Quinn et al. 1997; Springel 2005; Dehnen & Read 2011; Klypin 2018), which we call hereafter AKDK. Recently, Dehnen (2017) discusses conditions where the Hairer et al. (2002) analysis for time-symmetric integrators can be extended to discrete time-stepping. They conclude that there is no general solution. Many of the proposed integrators truly preserve the symmetries; however, the specific time-step selection function should also respect symmetrization. In several cases, the computational overhead makes the proposal impractical.
In this paper, we explore and quantify the Hierarchical Hamiltonian Splitting (HHS) strategy proposed by Pelupessy et al. (2012), which is, as leapfrog, a second-order scheme. This integrator was tested for small number of bodies or collisional simulations, delivering good energy conservation and momentum conservation at machine accuracy. However, no analysis of time-symmetry or reversibility was presented. Some modern N -body codes like AREPO (Weinberger et al. 2020) and GADGET4 (Springel et al. 2021) have implemented versions of HHS with some differences with respect to the original proposal, although they do not give details of the performance or other properties that allow the comparison with the commonly used integrators. In this work, we extend the discussion of HHS in the context of collisionless N-body simulations, by numerically investigating the time-symmetry and velocity reversibility. We also test some time-step selection functions to explore the potential advantages. In all cases we compare with the global constant time-step leapfrog integrator and the adaptive one (AKDK) in order to assess the conditions under which HHS is a competitive alternative. As we discussed above, in some modern P3M codes running in hybrid architectures, the most expensive calculation is the shortrange direct summation force integration, in some cases processed inside GPUs (Habib et al. 2016). Motivated by that, we implemented HHS in a direct summation code running in GPUs and complement that with additional tests with the Tree/FMM code GADGET4.
The rest of the paper has been organized as follows: § 2 summarizes the main properties of the integrators used here to carry out their comparison while § 3 introduces the time-step selection functions. § 4 shows accuracy tests performed with emphasis on the cases on an isolated halo and a minor merger. § 5, § 6 and § 7 contain the results of these tests taking into account the effect of time-step functions, performance and long-term stability, respectively. In § 8, we quantify reversibility and time-symmetry for the different codes. Finally, a discussion and the main conclusions are given in § 9.
2. INTEGRATORS
We implemented three different integrators in a direct summation code dubbed as NPsplitt (Aguilar-Argüello et al. in prep.), the leapfrog, Adaptive-KDK (AKDK) and the Hierarchical Hamiltonian Splitting integrators (HHS). Below, we describe each integrator. It is common to express integrators as a composition of operators using the Hamiltonian splitting technique in potential (Kick ) and kinetic energy (Drift), although there are other possibilities (Oshino et al. 2011).
2.1 Leapfrog
The leapfrog integrator is a second-order widely used integrator. As mentioned previously, this integrator is strictly symplectic only when a global-constant time-step is adopted. Symplectic integrators are designed to numerically preserve the integrals of motion and the phase-space volume of the simulated system.
In the leapfrog method, the evolution of the gravitational system can be written as a sequence of Kick (advance of velocities) and Drift (advance of positions) operators , defined by:
where x, v and a are the position, velocity and acceleration of a particle, respectively, and dt is the time step. In this paper, we use the operator sequence called KDK leapfrog (also known as velocity Verlet, Swope et al. 1982):
where we consider that the evolution is for one time step, i.e. from tn to tn + dt. Through this paper, we will refer to KDK leapfrog with a global-constant time-step as the Leapfrog integrator.
2.2 AKDK
Contemporary codes have extensively used KDK (equation 2) combined with a block time-step scheme (Hayli 1967; Sellwood 1985; Hut & McMillan 1986; Hernquist & Katz 1989; Makino 1991), frequently using rungs which are power of two: dtr = dt02(−r) and different assigning time-step functions, most frequently an acceleration based one (Springel 2005). We will use it as a reference integration scheme, but it should be noted that it is not symplectic (Hairer et al. 2002) and that the block-step is a multi time-step scheme.
2.3. Hierarchical Hamiltonian Splitting
The hierarchical Hamiltonian Splitting (HHS) method is a second-order integrator that uses individual time steps of the particles (Pelupessy et al. 2012) through recursively splitting the Hamiltonian. It accurately preserves linear and angular momentum and has a good energy conservation.
This integrator consists of adaptively and recursively splitting the Hamiltonian as a function of the current time step, dt, so that the so called Slow system (hereinafter S) contains all the particles with a time step larger than dt, and the so called Fast system (hereinafter F) contains all the particles with a time step smaller than dt. Thus, the splitting is as follows:
where
are, respectively, the kinetic and potential energies, and VSF is the potential energy of the interactions between S and F particles. The previous splitting scheme is known as HOLD (since it “holds” VSF for evaluation at the slow time-step, Pelupessy et al. 2012).
The S system is solved using the DKD scheme (also known as position Verlet, Tuckerman et al. 1990), which consists of drifts of the particles in this system (due to TS) and kicks on the particles of both systems (due to VSS + VSF). For the F system, the same procedure as for the original system is applied but using a halved time-step. Hence, the splitting is applied recursively to the F system with time-step dt/2r. The recursion ends when the system F (of the rung r) has no particles. At the end of the current integration step, the new time-step of a particle is calculated. In this scheme, a particle can change its time step to higher (lower) value if its current integration time is synchronized with the higher (lower) rung.
It is well known that the Kick and Drift operators are symplectic. However, using multiple or adaptive time-steps may not preserve such properties in a general way (Hairer et al. 2002). Therefore we need to investigate the behaviour of HHS.
3. Time-Step Selection Function
Besides the formulation of integrators with individual time steps based on symmetric operators, the choice for each particle time step is made through the so called time-step selection function. There is no unique choice; arguably the most commonly used time-step function in contemporary collisionless N body codes (e.g. GADGET, Springel 2005) is based on the acceleration as:
where ai is the acceleration acting on the particle i, giving the code the possibility of adapting to high/low accelerations, and is the force Plummer softening1. Improvements have been recently discussed by using a dynamical time proxy (Zemp et al. 2007) and a tidal force time scale (Dehnen & Read 2011; Grudić & Hopkins 2020), establishing a balance between short and long time steps, which may translate into higher efficiency. Extensive comparisons of AKDK with both choices have been discussed in Zemp et al. (2007) and Grudić & Hopkins (2020).
In our study, in an attempt to preserve the energy stability of the HHS integrator while allowing adaptive multi time-steps, and following Pelupessy et al. (2012), we use an approximated time-symmetrized time-step extrapolation criterion for each particle. To obtain such a time-step criterion, we start from the implicit criterion (Hut et al. 1995):
where τ is a time-step function (non-symmetrized), and τsym is the symmetrized time-step function of τ. To a first-order perturbative expansion:
hence, from equation 6:
so that the time-step we will use is given by
It is important to state that the minimization indicated above is across the so called Slow particles. For a time-step proportional to the inter-particle free-fall time:
The former option is a two-body-based proxy for the dynamical-time-motivated step function suggested by Zemp et al. (2007).
For completeness with Pelupessy et al. (2012), for a time-step proportional to the inter-particle fly-by time (typically used in collisional problems):
We will quantify the efficiency of such time-step functions. However, the high acceleration derivatives in the case of collisional problems may require going beyond the first order in the perturbative expansion.
Along the paper, we will mostly use the approximated symmetric free-fall time-step for HHS (defined by equations 9 and 10), and only in a few tests we will use the minimum of this and the approximated symmetric fly-by time-step (equations 9 and 11). In § 5, we will present a comparison with the GADGET4 implementation of HHS (Springel et al. 2021); this code uses a time-step function similar to equation 5 but the accuracy parameter, η accel, is included inside the square root. For AKDK, we will use the standard time-step function given by equation 5.
Table 1 summarizes the combinations of integrators and time-step selection functions used through this work.
Integrator name | Integration scheme | Time-step scheme | Time-step selection function | Reference |
---|---|---|---|---|
Leapfrog | KDK | Global-constant | Verlet (1967) | |
AKDK | KDK | BLOCK | eq. 5 | e.g. Quinn et al. (1997); Springel (2005) |
Dehnen & Read (2011); Klypin (2018) | ||||
HHS | HHS | HOLD | eqs. 9, 10 | Pelupessy et al. (2012) |
nsHHS | HHS | HOLD | eqs. 9, 10 | This work |
(without dτij/dt) | ||||
sAKDK | KDK | BLOCK | eqs. 9, 10 | This work |
4. Accuracy Tests
In this section, we present the test results of the HHS algorithm in terms of accuracy by simulating an isolated halo and sinking satellites, and compare them with the global-constant time-step leapfrog and the adaptive one, AKDK. To proceed with the comparison we implemented the different integrators in a direct summation N-body code (Aguilar-Argüello et al. in prep.). All the experiments were run in a single GPU. As a sanity check, we performed binary system tests (not reported here) and the results are consistent with those reported in other works (e.g. Dehnen & Read 2011; Pelupessy et al. 2012; Springel 2005).
4.1. Isolated Cuspy Halo
We adopted as a reference model an equal particle mass, isolated halo following the NFW cuspy density profile predicted by collisionless dark matter cosmological simulations (Navarro et al. 1997). The large density range and the corresponding different dynamical times make it a suitable system for an adaptive time-step code. Such tests depend on resolution to actually capture the benefit of individual time-steps as compared with a global-constant timestep scheme. We will use as a reference time scale the dynamical time2 at the NFW characteristic radius (rs), since it has been used to study the stability of the halo in other works (e.g. Klypin et al. 2015).
For the integration of our fiducial model, we adopted G = 1 (gravitational constant), Mvir = 1 (virial mass) and rs = 1 (scale length, also called characteristic radius), as model units. We will use these model units through the paper.
To investigate and quantify differences in accuracy and performance between the integrators first, we followed a fiducial halo sampled with 105 particles for 40 dynamical times at rs, tdyn.
Because our implementations of AKDK and HHS have different time-step function, a meaningful comparison is to assume an energy conservation threshold, which implies using distinct accuracy parameters for both integrators. For the first test, we considered a 10−7 threshold and accuracy parameters ηFF = 0.003 and ηaccel = 0.01 for HHS and AKDK, respectively, both constrained to 6 time-step rungs. Figure 1 shows the result of these tests. The upper left panel shows the energy error. HHS (black) stays very close to Leapfrog (red) during the first 20 tdyn, afterwards it shows a small drift. AKDK (blue) drifts almost linearly and after 20 tdyn it slightly flattens. Because the main computational overhead of HHS over AKDK comes from building and updating the time-step hierarchy in HHS, we decided to explore experiments where we delayed such an update, and denoted them as HHS-sTSS. We observed that such an action results in important savings in computational time (yellow line). The energy accuracy test is lower but acceptable for a collisionless simulation, and it is faster; in addition linear and angular momentum are preserved to machine precision.
Regarding the conservation of other dynamical quantities like linear and angular momentum or the system barycenter the situation is different (see Figure 1). Leapfrog and HHS preserve almost at machine precision the linear and angular momentum (see bottom panels), whereas AKDK presents a smaller accuracy, although it has a slope until 10 t dyn, afterwards it flattens. Interestingly, HHS with a delay in updating the time-step hierarchy (HHS-sTSS, yellow) is almost indistinguishable from HHS and leapfrog. The upper right panel shows the accuracy in preserving the halo centroid. Once again, leapfrog and both HHS versions accurately keep the centroid, while the AKDK accuracy is degraded two orders of magnitude. The wall-clock time for leapfrog, HHS and HSS-sTSS experiments was, respectively, 1.7, 0.8 and 0.6 times the corresponding for AKDK. As a complement, we performed tests using larger time steps reaching lower energy accuracy, the general results are the same. Finally, we emphasize that all integrators accurately preserve the density profile, as we can see in Figure 2; with obvious dependence on the number of particles, we decided to show the test with two million particles in order to minimize discreteness effects.
4.2. Minor Merger: Sinking Satellites
Satellite accretion onto larger galaxies is an astrophysical problem commonly simulated by both isolated and cosmological N -body simulations (Miller et al. 2020; Arca-Sedda & Capuzzo-Dolcetta 2016). We simulated a satellite, represented by a softened and massive particle, falling into a spherical system, represented by collisionless softened particles, for seven dynamical times. The spherical system consists of N = 105 equal mass particles spatially distributed according to the NFW density profile (Navarro et al. 1997). We adopted the same units as the previous fiducial isolated halo case. We used Plummer softening and we chose a softening parameter ∈ = 0.026 (in model units). The satellite’s initial separation from the center of the spherical system is R sat = 2.6, and its mass is m sat = 0.01, which is ≈1300 times bigger than the mass of one collisionless particle.
This test is particularly useful because the sinking process involves orbital angular momentum and energy transfer into the host system. We tracked energy, linear and angular momentum conservation as well as the host center of mass behavior (see Figure 3). Leapfrog (red) stays flat. HHS (black) starts to jump at one dynamical time, afterwards it stays flat. AKDK (blue) shows a lower accuracy in energy conservation and presents a systematic energy growth. As in the case of the isolated halo, the HHS energy drift slope is considerably flatter than the corresponding to AKDK. Linear and angular momentum are less accurate for AKDK by several orders of magnitude, while system barycenter behaves essentially the same for all three integrators. These results indicate that HHS is an excellent alternative for dynamical friction studies.
The evolution of the satellite radial position shows differences below the one percent level (see Figure 4). We conclude that, for a reduced number of dynamical times, the three integrators can provide an accurate description of the sinking process.
5. Time-step selection function tests
As it has already been discussed some hierarchical/adaptive time-step integrators, like HHS and AKDK, include a time-step selection function; such a prescription may help restoring the integrator symmetry. We may question if the approximated symmetric time-step selection function based on particle pairs given by equation 9 is only useful for a particular kind of simulation or code, and if we lose all the convenient properties of HHS, observed at this point, when the problem is not tractable by a direct summation code. To quantify such an effect we evolved the fiducial isolated halo switching the time-step selection function and we additionally performed test with the Tree/FMM code GADGET4.
5.1. Direct Summation Code
First, we consider our direct summation code implementation, simulating the fiducial isolated halo considering the HHS and AKDK integrators, as it can be seen in the upper panel of Figure 5. We take as a reference the AKDK integrator with an accuracy parameter ηaccel = 0.08 (fiducial, solid blue line). The dotted black line shows HHS (with ηFF = 0.055), which is almost twice faster but slightly less accurate, and the dashed black line shows a case of HHS (with ηFF = 0.015) remarkably more accurate but 10% slower than the fiducial AKDK. We include another AKDK case (dashed blue) with a smaller accuracy parameter ηaccel = 0.049, which is 20% less accurate than HHS (dashed black line) but 10% slower, and also is 20% slower and an order of magnitude more accurate than the fiducial AKDK case (solid blue). This means that considering even smaller values for the parameter η will not make AKDK more efficient that HHS. Next, we performed some changes in the time-step selection function, we removed the derivative term from equation 9 which represents the symmetrizing correction for the HHS integrator, we dubbed such tests nsHHS and they appear as the magenta lines in Figure 5. The energy drift is larger as compared with HHS but it is still acceptable for collisionless simulations, and it is faster. For completeness, we performed tests using equation 9 in AKDK (dubbed as sAKDK, green lines) with the same parameters (ηFF, rungs, etc.) as in the HHS tests. The energy drift is larger for the sAKDK test than the one corresponding to HHS; however, sAKDK is faster. Note that it is possible to match HHS accuracy by lowering the ηFF parameter in sAKDK tests, but it is slower (we do not show it because it does not add new information). So far, our tests suggest that HHS may benefit hybrid codes that use direct summation force calculation as part of their algorithm. For example, in the P3M technique (Hockney & Eastwood 1998) the most expensive part corresponds to direct summation, which runs in accelerators like GPUs (Habib et al. 2013, 2016; Cheng et al. 2020). Even using different time-step selection function, HHS may still obtain considerable performance. We analyze such a situation in the following sections, as well as the HHS performance in codes using approximated force computations (e.g. a tree code).
5.2. Tree Code: GADGET4
Recently, the 4th version of the publicly available code GADGET has implemented the HHS in the so called hierarchical gravity mode (Springel et al. 2021). The time-step selection function is similar to our equation 5 but the accuracy parameter ηaccel is inside the square root. Changing the step function is a sensible choice because our equation 9, based on pair interactions, is not practical for very large numbers of particles. Also, in the GADGET4 HHS implementation, instead of using the DKD representation (as in our implementation), the authors adopted the KDK one (for further details we refer to Springel et al. 2021), in a similar way as Zhu (2017). Hence, this allow us to explore the case of an approximated gravitational acceleration code, such as a tree code (e.g. Barnes & Hut 1986), with different time-step selection function. For that purpose, we perform some tests using AKDK and HHS with the Tree version of GADGET4. As in the case of direct summation tests (§ 5.1), we simulated the fiducial isolated halo in such a way that we can directly compare with the direct summation code tests, including a non-symmetrized time-step function. The middle panel of Figure 5 shows the results for the Tree code version of GADGET4. For the corresponding fiducial AKDK (solid blue line), GADGET4 opens three time-step rungs, and for its HHS implementation (dotted black line), it opens only two time-step rungs. Both integrators show an energy drift similar to 10−3, however, HHS (dotted black line) is 66% faster, but less accurate. Motivated by this result, we ran a new HHS test decreasing the ηaccel parameter (dashed black line); the energy is quite flat; therefore, energy conservation after 40 tdyn is almost an order of magnitude better than the fiducial AKDK (solid blue line), but it is 7% slower. If we decrease ηaccel for the AKDK integrator in order to match energy conservation (dashed blue line), the wall-clock time is considerably larger, by 50%. Although this is a particular example model, it is consistent with our previous tests. HHS is more stable and for medium and long-term integrations may be more efficient than AKDK, regardless of not using the approximated symmetric free-fall particle pairs time-step function (equations 9 and 10).
5.3 Fast Multipole Method Code: GADGET4
Gadget4 has implemented a different gravity solver based on the Fast Multipole Method expansion (FMM, e.g. Dehnen 2000, 2002). We ran the same fiducial isolated halo, as in the previous section, adopting the FMM scheme truncating the expansion at the quadrupole term. Results are presented in the lower panel of Figure 5. The blue solid line shows the fiducial case using AKDK (ηaccel = 0.01125), the corresponding HHS case (ηaccel = 0.01125, dotted black line) is relatively faster (66%); however, it is less accurate. We experimented lowering the ηaccel parameter for HHS, and the energy evolution is flat (dashed black line). We also decreased ηaccel for AKDK (dashed blue line); the energy accuracy is 10% worse but the wall-clock time is almost 100% larger; therefore, there is no point in trying smaller ηaccel values. The experiments with the GADGET4 FMM version also confirm the conclusions from our direct summation tests.
6. Performance
Although the adaptive nature of the HHS and AKDK integrators may imply a higher efficiency compared with the global-constant time-step leapfrog, the benefit of taking adaptive time-steps is evident only when the dynamical range is large. In comparison with AKDK, HHS has a computational overhead due to the recursive splitting of the Hamiltonian needed to build the time-step hierarchy.
We decided to make a short exploration of the simulation parameters (accuracy, rung number, and minimum time-step) with our direct summation code. We present the results in Figure 6, which shows a pragmatic diagnostic of the integrator performance: energy conservation vs. wall-clock time. The upper and middle panels correspond to the fiducial isolated halo. In the upper panel we fixed the number of time-step rungs and we varied the minimum time-step; whereas in the middle panel, we fixed the minimum time-step (same for both integrators) and we varied the number of time-step rungs. The lower panel corresponds to the cuspy halo including four live cuspy satellites, a common situation in cosmological simulations that requires a large dynamical range. In many cases HHS outperforms AKDK. We may wonder what the reason is given the extra computations related with the splitting process. One possible suspect is the individual timestep distribution. To investigate that, we built the histogram of time steps for particles for the initial conditions and at later times (see Figure 7). For HHS (solid circles) it is clear that only a moderate fraction of particles are found in the deepest time-step rung. For AKDK the distribution is almost the opposite, there is a peak of particles in the three deepest rungs, which translates into many more time-steps than HHS. The difference in performance is partly due to the time-step selection function, in agreement with Zemp et al. (2007); Grudić & Hopkins (2020). Although tweaking parameters we may obtain differences in performance and accuracy, we noticed that for a defined energy conservation threshold in many cases HHS outperforms AKDK, because the energy growth is smaller for HHS than for the standard AKDK implementation, allowing HHS to use larger steps. Nevertheless, it is important to mention that if all the particles are similarly distributed in the time-step rungs for both integrators the computational overhead of HHS starts to play a more important role. This may happen for very low resolution runs where the dynamical range is artificially shortened. This is in agreement with the middle panel of the figure, showing that only when using more than three step runs HHS is faster than AKDK. The time-step histogram presented in Figure 7 is a handy tool to asses the situation.
Our conclusions are consistent with recent studies that published successful results using HHS with GADGET4 and reaching an extremely large dynamical range with multi-million particle numbers (Wang et al. 2020).
7. Long-Term Stability
At this point we have compared the accuracy and performance of the integrators for some dynamical times. A natural question arises, whether the HHS advantages are relevant for realistic long-term integration. Dark matter halos survive around 30 − 200 dynamical times in cosmological simulations depending on the merger/accretion history (Klypin et al. 2015). As before, we investigated the stability of energy, linear and angular momentum and density centroid for our fiducial isolated halo model, this time for hundreds of dynamical times (see Figure 8). Energy conservation of HHS (black) is quite close to the global-constant time-step leapfrog (red) behavior during the first 20 − 40 dynamical times (consistent with our previous tests). However, after that time it starts to slowly drift which seems to get slower at the end of the simulation (≈ 700 tdyn). AKDK (blue) quickly drifts to a considerably larger energy error and keeps systematically growing. The yellow curve represents the HHS-sTSS version that delays the time-step hierarchy updating. As we observed before, it behaves as AKDK but with a smaller accuracy, although it is faster (≈ 40%). For the linear and angular momentum all integrators preserve them. However, while HHS preserves them almost at machine precision, AKDK preservation is almost 8 orders of magnitude worse. A similar disparity is obtained by following the halo centroid. Seeking a cause of this difference in accuracy we tracked the circular velocity in experiments with a smaller number of particles (N = 2000) evolving the system for 400 dynamical times at 2.1rs, where the circular velocity peaks. Figure 9 shows the circular velocity profile at several times. We found some differences, but they are all inside 10%. We conclude that, at the level of high accuracy, we do not expect important differences between integrators, and performance is the most relevant difference. As for as N-body simulations reaching a larger dynamical range, the smallest structures may live for a larger number of dynamical times; in this context HHS may offer a more stable option.
Recent studies regarding long-term N -body evolution (Hernandez et al. 2020), suggest that longterm integrations may be unstable to small changes in initial conditions realizations. This may be particularly critical for highly non-linear situations like the three-body problem. We generated a small ensemble of realizations for the fiducial isolated halo and for the sinking satellite problem. Results are shown in Figure 10 only for Leapfrog (red) and HHS (black). Indeed, some scatter is found; however, overall the results are robust.
8. Reversibility and Time Symmetry
The above numerical experiments show that there are certain parameter combinations where HHS is more accurate than AKDK or, alternatively, it is faster for a given energy accuracy. Because both integrators have different parameters it is natural to ask if there is a true advantage of HHS or if it is a misleading result, dependent on our implementation, accuracy or even particle number.
As we discussed in the introduction and in agreement with Hairer et al. (2003) we performed time symmetry and velocity reversibility tests using both HHS and AKDK integrators; we used leapfrog with a global-constant time-step as a reference case.
We chose the sinking satellite system as the test bed because the satellite orbit allows easily to track the system response in configuration space as well as in energy. At three different moments, termed BW1, BW2 and BW3, we reversed the velocity signs for all particles, and for one case instead of velocities we reversed the time sign; after that we continued the integration. Figure 11 shows the global result of forward (FW, solid lines) and backward (BW) evolution (i.e. inverting the sign of velocities) of the satellite distance to the halo center for all integrators, either for high (left panels) or low (right panels) energy conservation accuracy. For high energy conservation accuracy (≈ 10−8, left panels), the differences in configuration space are in general small, which is consistent with Hernandez & Bertschinger (2018)), where they discuss that AKDK preserves quantities like angular and linear momentum (see also Figure 3). However, there are still some differences as we can see in the lower panel. The satellite distance change with respect to its own forward evolution (lower panels) is a good diagnostic of reversibility. Clearly, the position difference in HHS (black) regarding the symplectic Leapfrog (red, global constant step) is below the simulation resolution determined by the softening (horizontal grey line), while for AKDK (blue) the calculated position difference is larger but close to the simulation softening. At a less accurate energy conservation but more common in collisionless simulations (10−3, right panels), differences between forward and velocity reversal integration in Leapfrog (global constant step) and HHS are still below the simulation softening. However, AKDK has notable differences that are well above the simulation resolution and they are even detected in configuration space. We also tracked the fractional energy change (Figure 12). The first outstanding fact is the truly reversible behavior of the global-constant time-step leapfrog (upper panel); almost every peak and valley is reproduced after the velocity reversal. The middle panel shows the high accuracy tests for AKDK (blue lines) and HHS (black lines), and the bottom panel the corresponding low accuracy tests. For AKDK, there is a roughly systematic growth for both forward integration (blue solid opaque line) and also after reversing velocities, suggesting that AKDK is non-reversible. Instead, HHS is almost flat, suggesting that it is approximately reversible. We also ran a test using HHS where we changed the time variable sign. In this case, the energy presents a systematic growth, showing that HHS is not time symmetric. However, for the sake of clarity, it was not included. We performed similar tests for the fiducial isolated halo and obtained similar results. The small slope showed in the energy drift suggests that HHS is approximately reversible. Furthermore the energy drift is small compared to AKDK, which is non-reversible and non time-symmetric. This in agreement with Hairer et al. (2002) who showed that the symplectic Sto¨rmer-Verlet-LeapFrog (AKDK) performs poorly for an adaptive time-step integrator, unless the timestep assigning function is properly selected with a possible computational cost.
Figure 5 shows that even GADGET4 with the selection function given by equation 5 and its implementation of HHS allows, under certain conditions, for a faster or more accurate integration than AKDK. Figure 13 investigates the reversibility of HHS and AKDK in the Tree (upper panel) and FMM (lower panel) implementations using GADGET4. The forward (FW) evolution of both integrators is shown with solid opaque lines, and the backward (BW) evolution after changing the velocity sign, at 40 tdyn, is shown with dashed lines. For the Tree gravity solver using HHS, the energy drift for both forward and backward evolution is similar, particularly during the first 10 tdyn. However, although encouraging, we do not have enough evidence to claim that HHS is reversible. The time-step function adopted by GADGET4 may be the reason it is slightly different to our tests with a direct summation code. On the other hand, the reversed integration of AKDK shows a discontinuous change in its slope, indicating that it is not reversible. We conclude that Tree-HHS is more stable than Tree-AKDK. For the FMM gravity solver the test is not conclusive, because it does not show a significantly smaller slope in energy drift using the HHS scheme (Hierarchical Gravity).
9. Discussion and Conclusions
Using a GPU direct summation N-body code, we tested and characterized the Hierarchical Hamiltonian Splitting (HHS) integrator proposed by Pelupessy et al. (2012), but we focused on collisionless simulations. As a reference we compared with the global-constant time-step symplectic Leapfrog integrator and the widely used Adaptive one (AKDK). We also complemented our study using the HHS implementation in GADGET4 (dubbed as hierarchical gravity, Springel et al. 2021), which uses a different time-step selection function and approximate force solvers (Tree and FMM).
As recently discussed (e.g. Dehnen 2017), there is no general solution for a symplectic adaptive multi time-step integrator, although there are several proposals (e.g. Huang & Leimkuhler 1997; Hairer 1997; Calvo et al. 1998; Hardy et al. 1999; Farr & Bertschinger 2007). The problem is not exclusive of astrophysics; the field of differential equations of dynamical systems has extensively reviewed the subject (Hairer et al. 2002). In particular, it is important to say that adaptive and multi-step techniques are not identical but they are related. Hernandez & Bertschinger (2018) address the important case of adaptive time steps, inspired by Hairer et al. (2002). They discuss what is called the “backwards error analysis” in order to explore commonly used integrators in astrophysics, concluding that symplecticity, time symmetry and reversibility are not the same and they are not always a guarantee of energy conservation. Nevertheless, the KAM (Kolmogorov-Arnold-Moser) theorem assures stability for symplectic (geometrical) and reversible integrators (Hairer et al. 2002) in contrast to nonreversible ones, like AKDK with the standard timestep selection function. The case of multi-step integration, requires extra conditions like the so called impulse (splitting) or averaging (less often force evaluation) techniques (Hairer et al. 2002). Similar problems have been discussed in the molecular dynamics field, and the r-RESPA method (Tuckerman et al. 1992) that is a splitting technique has been applied to the long-short range splitting case or multiple mass segregation with considerable advantage in performance. An important feature is that such approaches to the multi-step problem resulted in a symmetric or reversible method, not in a symplectic one. The major concrete benefit has been an improvement in the performance at a given energy accuracy. A similar situation applies to HHS, the case we study in this paper. HHS is a composition of Fast and Slow operators (Pelupessy et al. 2012) and based in our numerical experiments it is approximately reversible, which explains the performance advantage shown in § 6. The discussion in Hairer et al. (2002) and Hernandez & Bertschinger (2018) shows that the reversibility depends on the symmetry under a velocity sign change of hamiltonian and the corresponding time-step assigning function, and not on the accuracy of gravity calculation or particle number. The particular case of equation 10 is an approximate 1st order version of the one proposed by Hut et al. (1995) that depends only on the module of particle pairs relative velocities; therefore it is reversible. A future interesting avenue inspired by the r-RESPA method presented by Tuckerman et al. (1992) is the extension already discussed by Portegies Zwart et al. (2020). We experimented with the strategy motivated by the multi-step averaging technique (Hairer et al. 2002), updating the time-step hierarchy only at a given number of global time-steps (dubbed as HHS-sTSS, Figures 1 and 8). There is a gain in CPU time; however, the energy drift increases suggesting that further investigation is required.
Our results are summarized below:
1. Based on reversibility and time symmetry tests we concluded that HHS is not time symmetric but it is approximately reversible; it is also more stable than AKDK for a given energy accuracy. Although the exact correspondence between forward and backwards integration lasts only for few dynamical times, even for ten dynamical times the energy drift growth is small. In contrast, the AKDK energy drift grows systematically with time, clearly showing that it is non-reversible using the commonly used time-step selection function. Based in our tests HHS reversibility explains the advantage in performance regarding AKDK. Based on the Hairer et al. (2002) discussion and the used time-step selection functions that depend only on the velocity module, such properties are independent of accuracy in gravity calculation and of particle number.
2. Our findings with the direct summation code may be also relevant for the high accuracy and costly section (PP) of codes using the P3M technique, as it has been also shown in molecular dynamics studies (Plimpton et al. 1997).
3. In agreement with our direct summation code tests, changing the time-step selection function for a non-symmetrized one, it is possible to find a combination of parameters in GADGET4 (using both the Tree and FMM code version) where HHS is more efficient than AKDK. We found approximate reversibility for the Tree Gadget4 test. However, for FMM Gadget4 our tests were inconclusive.
4. The population of particle histogram across the time-step rungs is useful to find a convenient parameter combination for the integrators.