1 Introduction
Ever since the seminal work of Landau [1], the study of phases and continuous phase transitions via an order parameter has been a fundamental paradigm in condensed matter physics. In the Landau scheme, the purpose of an order parameter is to signal whether the phase of a given system breaks a symmetry of the underlying microscopic Hamiltonian [2]. This process, in which the ground state of a physical system ends up with fewer symmetries than the original Hamiltonian, is called spontaneous symmetry breaking [2]. A plethora of phases of matter such as crystals [3], magnets [4,5], and conventional superconductors [3] can be identified through the spontaneous symmetry breaking mechanism. However, not all phases of matter can be classified by an order parameter; some are instead recognized by other attributes. For instance, the many-body localization transition manifests itself through a change in the entanglement dynamics [6,7], the BEC-BCS crossover can be detected through the decay of correlation functions [8,9], and the so-called topological phases of matter [10,11] are distinguished by evaluating topological invariants such as the Chern number [12].
Although the conventional and non-conventional phases of matter cannot be characterized by the same theoretical scheme, machine learning techniques offer the possibility of classifying both, using different algorithms and procedures [13,14,15,16]. In fact, machine learning has emerged as a powerful tool to classify and identify phases of matter. For instance, it has been used to predict crystal structures [17], solve impurity problems [18], and classify thermal and quantum phases of matter [19,20,21,22,23]. More recently, recurrent neural networks have been employed to build variational wave functions for quantum many-body problems [24], and convolutional neural networks have been used to distinguish the dynamics of an Anderson insulator from a many-body localized phase [25].
In this manuscript, we use machine learning techniques to address the problem of single-particle localization in one-dimensional quasiperiodic lattices with both nearest-neighbor and next-nearest-neighbor tunneling. In particular, using supervised learning, where the learner is trained with previously classified data, we demonstrate the efficiency of an artificial neural network in classifying extended and localized wave functions. For this purpose, we first train the neural network (NN) using the eigenstates obtained from exact diagonalization of the well-known Aubry-André (AA) model [26]. This model is well suited to studying how single-particle localization emerges from correlated disorder in a lattice [26,27]. In contrast to the one-dimensional Anderson model [28], where any strength of the uncorrelated disorder yields exponential localization of the single-particle eigenstates, in the AA model there is a threshold in the correlated disorder that signals the transition between extended and localized single-particle eigenstates. To avoid confusion with uncorrelated or random disorder, we shall use the term quasidisorder for the correlated disorder introduced in the AA model. After the training procedure, we probe the performance of the neural network by classifying eigenstates belonging to a particular generalization of the Aubry-André model that includes next-nearest-neighbor tunneling. Using the inverse participation ratio (IPR), we demonstrate that the NN classifies above 96% of the profiles correctly. Our results are of relevance to the study of disordered systems with machine learning techniques and can serve as a benchmark for further theoretical studies.
The manuscript is organized as follows. In Sec. 2, we introduce the two models on which the machine learning technique is applied: the Aubry-André model and the extended Aubry-André model, both of which describe a quasiperiodic lattice in one dimension. Section 3 presents the theoretical tools used to probe the performance of the neural network, as well as the architecture and parameters of the network. The results of the classification task are shown in Sec. 4. Finally, in Sec. 5, we discuss and summarize our findings.
2 Model
To study the localization phenomenon in quasiperiodic lattices through a neural network, we first consider the well-known Aubry-André model [26] on a lattice having L sites with periodic boundary conditions. The Hamiltonian of the AA model is

H_AA = -J Σ_i (c†_i c_{i+1} + H.c.) + Δ Σ_i cos(2πβi + φ) n_i ,    (1)

where c†_i (c_i) creates (annihilates) a particle at lattice site i, n_i = c†_i c_i is the number operator, J is the nearest-neighbor tunneling amplitude, Δ is the quasidisorder strength, and φ is an arbitrary phase, β being an irrational number, customarily chosen as the inverse golden ratio (√5 − 1)/2. For Δ < 2J all single-particle eigenstates are extended, while for Δ > 2J all of them are exponentially localized [26,27]. The extended Aubry-André (EAA) model incorporates, in addition, tunneling to next-nearest neighbors with amplitude J′:

H_EAA = H_AA − J′ Σ_i (c†_i c_{i+2} + H.c.) .    (2)

The next-nearest-neighbor term spoils the self-dual structure of the AA model and gives rise to single-particle mobility edges [30].
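As an illustration, the following minimal Python sketch builds the single-particle Hamiltonian matrices of Eqs. (1) and (2) and diagonalizes them numerically. The function name, default parameters, and the specific choice of β are ours, not taken from the original implementation:

```python
import numpy as np

def eaa_hamiltonian(L, J=1.0, Jp=0.0, delta=0.5, phi=0.0,
                    beta=(np.sqrt(5) - 1) / 2):
    """Single-particle EAA Hamiltonian on L sites with periodic boundary
    conditions; Jp = 0 recovers the AA model of Eq. (1)."""
    H = np.zeros((L, L))
    for i in range(L):
        H[i, i] = delta * np.cos(2 * np.pi * beta * i + phi)  # quasidisorder
        H[i, (i + 1) % L] = H[(i + 1) % L, i] = -J            # NN tunneling
        H[i, (i + 2) % L] = H[(i + 2) % L, i] = -Jp           # NNN tunneling
    return H

# Eigenvalues in ascending order; the columns of psi are the eigenstates
energies, psi = np.linalg.eigh(eaa_hamiltonian(L=144, delta=1.5))
```

Choosing L to be a Fibonacci number (e.g., 144) is a common convention that makes the quasiperiodic potential compatible with periodic boundary conditions.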
3 Methods
3.1 Localization tools
Before proceeding to the description of the neural network, we introduce an important and widely used physical quantity that is a footprint of the localization transition. This parameter, called the inverse participation ratio (IPR), measures the inverse of the number of lattice sites over which the wave function has a non-negligible amplitude. For a normalized state |ψ⟩ = Σ_i ψ_i |i⟩, the IPR is defined as

IPR = Σ_i |ψ_i|^4 ,    (3)

where ψ_i is the amplitude of the wave function at lattice site i. For an extended state the IPR scales as 1/L and vanishes in the thermodynamic limit, whereas for a localized state it remains of order unity.
The definition of the IPR in Eq. (3) is related to a single state. To characterize the whole spectrum, we also consider the average inverse participation ratio,

⟨IPR⟩ = (1/L) Σ_{j=1}^{L} Σ_{i=1}^{L} |ψ_{i,j}|^4 ,    (4)

where the subscript j indicates the j-th eigenstate (ordered according to the energy value from lowest to highest), and the index i signals the lattice site. The average inverse participation ratio is a measure of the amount of extended or localized states in the whole spectrum [27]. As we shall see, both the inverse participation ratio of individual eigenstates and its average ⟨IPR⟩ serve as benchmarks for the neural network classification.
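A direct implementation of Eqs. (3) and (4) is straightforward. The sketch below is ours, assuming the conventions of numpy.linalg.eigh, which returns normalized eigenstates stored as columns:

```python
import numpy as np

def ipr(psi):
    """IPR of Eq. (3) for each column (eigenstate) of psi."""
    return np.sum(np.abs(psi) ** 4, axis=0)

def average_ipr(psi):
    """Average IPR of Eq. (4): mean of the IPR over all L eigenstates."""
    return ipr(psi).mean()
```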
3.2 Data preparation
Like any machine learning implementation, the first task is to provide the raw data that will feed the learning algorithm [14,15]. For this purpose, we perform numerical exact diagonalization to obtain the eigenstates of the Hamiltonian in Eq. (1) for 42 evenly spaced values of the quasidisorder strength Δ, collecting all L eigenvectors for each value. Since in the AA model every eigenstate is extended for Δ < 2J and localized for Δ > 2J, each profile can be tagged unambiguously as extended or localized. The profiles are arranged into data matrices, each row containing the L components of a single eigenstate, and are randomly split into a training set and a test set.
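A possible realization of this pipeline, reusing the eaa_hamiltonian sketch above, is shown below. The range of Δ values, the system size, and the 80/20 split are our assumptions; the tags follow the exact AA criterion (extended for Δ < 2J, localized for Δ > 2J, with J = 1):

```python
import numpy as np

deltas = np.linspace(0.1, 4.0, 42)      # 42 evenly spaced quasidisorder values
X, y = [], []
for delta in deltas:
    _, psi = np.linalg.eigh(eaa_hamiltonian(L=144, J=1.0, delta=delta))
    X.append(np.abs(psi.T) ** 2)        # one row per eigenstate profile
    y.append(np.full(psi.shape[1], int(delta > 2.0)))  # 0: extended, 1: localized
X, y = np.vstack(X), np.concatenate(y)

# Random split into training and test sets
rng = np.random.default_rng(0)
idx = rng.permutation(len(y))
n_train = int(0.8 * len(y))
X_train, X_test = X[idx[:n_train]], X[idx[n_train:]]
y_train, y_test = y[idx[:n_train]], y[idx[n_train:]]
```

Whether the rows contain the amplitudes ψ_i or the probability densities |ψ_i|^2 is a design choice; we use the latter in this sketch.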
3.3 Artificial neural network architecture and training procedure
Artificial neural networks are nonlinear models used for supervised learning; their structure and architecture were originally inspired by biological neural networks [15]. An artificial neural network contains several layers of interconnected nodes. These nodes, called neurons, are the basic units of a neural network. Typically, the first layer of neurons is called the input layer, the middle layers are called "hidden layers", and the last layer is called the output layer. To perform supervised learning on the data described above, we employ a fully connected feed-forward network whose input layer receives the L components of an eigenstate and whose output layer returns the probabilities of the two classes, extended and localized.
We now describe the full action of the network on the data. The feed-forward attribute of the NN means that the flow of data is from left to right, with the output of one layer serving as the input for the next. In the first layer, a given input vector x is linearly transformed as

z = W x + b ,

where W and b are the weights and biases of the layer. Here, a nonlinear activation function σ is applied elementwise to produce the layer output, a = σ(z), which is fed to the next layer. The output layer employs the softmax function,

ŷ_k = exp(z_k) / Σ_{k'=1}^{K} exp(z_{k'}) ,

where K is the number of classes that the NN has to classify. As pointed out, in this manuscript K = 2: extended and localized profiles. Like all supervised learning procedures, a loss function must be specified; this function, denoted by J, quantifies the precision of the NN and has to be minimized with respect to all weights and biases in order to optimize the neural network classification [15]. We employ a cross-entropy cost function supplemented with an L2 regularization term,

J = -(1/N) Σ_{n=1}^{N} Σ_{k=1}^{K} V_{nk} ln ŷ_{nk} + (λ/2N) Σ_w w^2 ,    (10)

where V is the matrix of tags and λ is the regularization parameter. Similarly to the number of neurons in the hidden layer, the value of λ has to be tuned to improve the performance. In Eq. (10), we denote by ŷ_{nk} the output-layer value of the n-th sample for class k, N is the number of training samples, and the last sum runs over all weights w of the network.
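The forward pass and the cost function of Eq. (10) can be written compactly as follows. This is a schematic NumPy version under our own assumptions (sigmoid hidden activation, 30 hidden neurons), not the authors' exact implementation:

```python
import numpy as np

def forward(X, W1, b1, W2, b2):
    """Feed-forward pass: one sigmoid hidden layer, softmax output."""
    A1 = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))      # hidden activations
    Z2 = A1 @ W2 + b2
    Z2 -= Z2.max(axis=1, keepdims=True)            # numerical stability
    expZ = np.exp(Z2)
    return expZ / expZ.sum(axis=1, keepdims=True)  # class probabilities y_hat

def cost(X, V, W1, b1, W2, b2, lam):
    """Cross-entropy of Eq. (10) with L2 regularization; V is the one-hot
    matrix of tags and lam the regularization parameter lambda."""
    Y = forward(X, W1, b1, W2, b2)
    N = X.shape[0]
    ce = -np.sum(V * np.log(Y + 1e-12)) / N        # cross-entropy term
    l2 = lam / (2 * N) * (np.sum(W1 ** 2) + np.sum(W2 ** 2))
    return ce + l2

# Hypothetical initialization; the hidden-layer size is a tuned hyperparameter
rng = np.random.default_rng(1)
n_in, n_hid, K = X_train.shape[1], 30, 2
W1 = rng.normal(0.0, n_in ** -0.5, (n_in, n_hid)); b1 = np.zeros(n_hid)
W2 = rng.normal(0.0, n_hid ** -0.5, (n_hid, K)); b2 = np.zeros(K)
```

Minimizing this cost with respect to all weights and biases (e.g., by gradient descent or any off-the-shelf optimizer) yields the trained network.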
4 Results
4.1 Testing set
After cost function minimization, the neural network performance is analyzed using previously unseen data, that is, the data belonging to the test set. In Fig. 3a) we show the average test accuracy of the output layer as a function of the quasidisorder strength Δ. Figure 3b) illustrates the average output layer outcome as a function of the quasidisorder strength Δ.
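Reusing the sketches above, the test evaluation amounts to comparing the predicted class with the AA tag. The helper below (hypothetical, for a single value of Δ) illustrates how a curve like that of Fig. 3a) can be generated:

```python
import numpy as np

def test_accuracy(delta, W1, b1, W2, b2, L=144):
    """Fraction of correctly classified eigenstates at quasidisorder delta,
    tagged by the exact AA criterion (extended for delta < 2J, J = 1)."""
    _, psi = np.linalg.eigh(eaa_hamiltonian(L=L, J=1.0, delta=delta))
    probs = forward(np.abs(psi.T) ** 2, W1, b1, W2, b2)
    return np.mean(probs.argmax(axis=1) == int(delta > 2.0))
```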
4.2 Extended Aubry-André model
To go beyond the test set, we probe the neural network performance on wave functions of the extended Aubry-André model in Eq. (2). In contrast to the AA Hamiltonian, the EAA Hamiltonian includes, as stated above, tunneling to next-nearest neighbors, which gives rise to the emergence of mobility edges [30]. We stress that the network was trained only with data generated from the AA model; thus, the eigenstates belonging to the EAA model are new for the network. In Fig. 4, we show the average output layer outcomes and the IPR of each eigenvector in the spectrum for a representative set of EAA parameters.
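A sketch of this probe, again reusing the functions defined above (the parameter values here are placeholders, not those of Fig. 4):

```python
import numpy as np

# Diagonalize the EAA model (Jp != 0) and compare, eigenstate by eigenstate,
# the network's localization probability with the IPR
_, psi = np.linalg.eigh(eaa_hamiltonian(L=144, J=1.0, Jp=0.2, delta=1.5))
p_localized = forward(np.abs(psi.T) ** 2, W1, b1, W2, b2)[:, 1]
iprs = ipr(psi)
```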
Now, we turn to the classification of the resulting wave functions of the EAA model as we vary the quasidisorder strength Δ.
To conclude this study, we concentrate now on the average inverse participation ratio ⟨IPR⟩ and the corresponding average output of the network over the whole spectrum.
5 Conclusion
In this manuscript, we have illustrated the capacity of an artificial neural network
to classify extended and localized single-particle states that arise in
quasiperiodic one-dimensional lattices. In particular, we first train and test the
artificial neural network using eigenstates belonging to the celebrated Aubry-André
(AA) model. By collecting not just the ground state but all eigenstates, we
accomplish an excellent classification in both the low- and high-energy sectors of
the model. Then, we demonstrate the versatility of the network by probing its
performance on the eigenstates of the Extended Aubry-André (EAA) model. Our results
show that the neural network does not simply learn the IPR, since the IPR and the
output-layer values do not match quantitatively. This suggests that the network
builds its own internal measures of localization. Surprisingly, the performance of
the neural network is nevertheless satisfactory, since it classifies above 96% of
the profiles correctly.
The study addressed here demonstrates the efficiency and capacity of a neural network to classify profiles that come from a more complex model than the one used to train the NN. Although our analysis focuses on one-dimensional models with nearest-neighbor and next-nearest-neighbor hopping, supervised learning with neural networks can also be used to analyze localization phenomena in higher dimensions and in lattices with power-law hopping, where peculiar multifractal states arise. The classification of extended and localized single-particle states through neural networks provides a useful benchmark for tackling the many-body localization problem using supervised learning techniques. Diagnosing many-body phases of matter requires, in addition to fully connected neural networks, the use of convolutional neural networks or principal component analysis to deal with the exponential dimension of quantum many-body states [25].