1 Introduction
The goal of this paper is to identify the pervasive statistical systematic risk factors in the Mexican Stock Exchange (BMV, for its acronym in Spanish) by means of a computational technique that is seldom used in this context, namely Independent Component Analysis (ICA), in order to detect a more reliable structure of the factors driving the returns on equities in that market.
ICA assumes that the observed data are a linear mixture of random variables that are not normally distributed, which is a relevant property for the problem at hand. By extracting statistically independent components from a set of observed parallel time series, the technique reveals the underlying series whose linear combinations generate them and thus helps to explain their pervasive sources.
ICA has been used mainly in fields such as signal and image processing, speech and audio separation, biomedical signal and image analysis, telecommunications, neurophysiology, text and document processing, bioinformatics, environmental issues and some industrial applications. More recently, applications of ICA to different fields of finance have been studied in several countries.
The works that we consider most relevant to our research have used ICA to extract the following: the underlying factors explaining stock returns in Japan [2], Hong Kong [4], Italy [9], the USA [24] and during the crisis period [25]; the relevant factors driving the movements of implied volatility surfaces of index options [1]; the factors driving the movements of the term structure of interest rates in Germany [35]; the factors driving spot rate curve movements in the USA [3]; the factors moving the returns of real estate investment trusts in the USA [30]; the factor model of returns for the USA Thrift Savings Plan funds [37]; and the factors for pricing multiasset derivatives [26].
Moreover, some other representative studies of ICA in Finance have used this technique for the following purposes:
(1) to analyze the interactions between currencies in the Foreign Exchange [36];
(2) to model the conditional higher moments risk in international stock markets [48], the term structure of multiple yield curves [46], and the volatility of market price indexes [47];
(3) to manage investment portfolios [8];
(4) to allocate assets [32];
(5) to forecast financial time series [30];
(6) to compute improved portfolio risk measures such as VaR in the banking sector [6, 7];
(7) to explain the volatility of investment funds [45];
(8) to generate an equity sector classification [43];
(9) to improve bank performance evaluation [29];
(10) to produce multifactor index variance from the SPX sector ETF returns [38];
(11) to measure the dependency between stocks in the USA [17], and
(12) to analyze herding among hedge fund styles [27].
To the best of our knowledge, there is no study on the application of ICA in finance focused on Mexico. Consequently, we try to fill this gap in the financial literature by applying a novel extraction technique to uncover the underlying structure of risk factors in the Mexican Stock Exchange.
The outline of this paper is as follows. In section 2, we briefly describe the ICA technique; in section 3, we present an empirical study; and in section 4, we draw the main conclusions.
2 Independent Components Analysis
2.1 ICA Basics
Despite the widespread evidence concerning the non-Gaussianity of the returns on equities, the most popular latent variables analysis techniques used for extracting the pervasive factors underlying the financial multivariate data are Principal Component Analysis (PCA) and Factor Analysis (FA), which assume a Gaussian distribution of the latent factors.
ICA represents an improved extraction technique for this kind of data, since it is based on a multivariate non-normality approach and looks for mutually and statistically independent components. According to [21], statistical independence means that none of the components gives any information about the others.
Following [10], mutually and statistically independent components can also be interpreted as being of a different nature. ICA was introduced in the field of signal processing and neural computation as a tool to solve the problem of Blind Source Separation (BSS) and Signal Reconstruction.
According to [40], the former concept implies revealing hidden factors from observable measures, where we know very little about the original signals and their process of generation.1 The basic technique for solving this kind of problem is ICA, which assumes that the observed variables are the result of an unknown mixing process of some latent original sources. Consequently, the observed variables can be decomposed by means of a demixing process, capable of estimating some statistically independent components that can be considered as reliable proxies for the original sources that generated the observed variables (s ≈ y).
The main characteristic of the latent sources is that they are assumed to be non-Gaussian and mutually independent. They are known as the independent components of the multivariate observed data.
According to [5], the formal expressions of the mixing and demixing processes in the basic ICA model are as follows:

$$x = As \qquad (1)$$

$$y = Wx \qquad (2)$$
where x represents the vector of observed variables; A, the mixing matrix; s, the vector of original sources; y, the vector of the independent components; and W, the demixing matrix, which we assume to be invertible. Since both the mixing process and the original sources are unknown, the ICA methodology makes several assumptions: a) both the original sources s and the components y are non-Gaussian and mutually independent; b) the number of observed mixtures is equal to the number of original sources, so the unknown mixing matrix is square; c) if the independent components are equal to the original sources, the mixing matrix A will be the inverse of the demixing matrix W:

$$A = W^{-1} \qquad (3)$$
Under these assumptions we can estimate both W and y from x by looking for some components as statistically independent as possible. Thus, the objective of ICA is to find a demixing linear mapping W in which the components y would be as statistically independent as possible.
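As a minimal numerical sketch of the mixing and demixing processes, the following Python code uses two hypothetical non-Gaussian sources and an arbitrary mixing matrix chosen only for illustration (none of these values come from our data). It shows that when W is the exact inverse of A, the components y coincide with the sources s; in practice A is unknown and W has to be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two hypothetical non-Gaussian sources (rows), 1000 observations each.
s = np.vstack([rng.laplace(size=1000), rng.uniform(-1.0, 1.0, size=1000)])

# Illustrative square mixing matrix A (assumed values, invertible).
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

x = A @ s                 # mixing process: x = A s
W = np.linalg.inv(A)      # ideal demixing matrix: W = A^{-1}
y = W @ x                 # demixing process: y = W x

max_error = np.max(np.abs(y - s))
print(f"largest reconstruction error: {max_error:.2e}")
```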
In the relevant literature we find three main estimation criteria for ICA: a) the maximization of non-Gaussianity, b) maximum likelihood estimation, and c) the minimization of mutual information. As stated in [23], under some conditions the three approaches are essentially equivalent or at least closely related.
These three criteria allow for different methods of computing the ICs, which resemble one another in that the optimization step is carried out by an iterative algorithm. The two main methods are the adaptive algorithms based on gradient methods and the fixed-point iteration scheme known as the fast fixed-point or FastICA algorithm.
2.2 PCA, FA, ICA and Finance
In reference to PCA and FA, [21] state that ICA is capable of finding the underlying factor when these techniques fail; furthermore, [39] declare that ICA might reveal some features that otherwise would remain hidden. In addition, PCA and FA present a limitation that ICA overcomes. It is often believed that PCA and FA generate independent components; however, this is only true if the data are multivariate normally distributed, since uncorrelated components are also independent for Gaussian data.
Real-world data, and especially financial time series, are usually non-Gaussian, and ICA searches for statistically independent components in non-Gaussian data. Moreover, independence is a stronger property than uncorrelatedness, since the former implies the latter but not vice versa. Therefore, uncorrelatedness is not enough to separate the underlying components. From a different perspective, PCA and FA use only the covariance matrix to obtain linearly decorrelated components, i.e., they work only with second-order statistics.
ICA additionally uses higher-order statistics, which contain information not included in the covariance matrix. Another problem with the use of PCA and FA on financial time series is that financial probability distributions have fat tails, so outliers can distort the parameter estimates in both cases.
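The difference between uncorrelatedness and independence can be illustrated with a small simulated example. The sketch below uses a hypothetical pair of variables (a symmetric uniform variable and its square, not taken from our databases): their linear correlation is approximately zero, although one is a deterministic function of the other, and the dependence only becomes visible through a higher-order statistic.

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.uniform(-1.0, 1.0, size=100_000)   # symmetric, non-Gaussian variable
v = u ** 2                                  # fully determined by u

# Second-order statistic: the linear correlation is close to zero.
corr_uv = np.corrcoef(u, v)[0, 1]

# Higher-order statistic: correlating u**2 with v reveals the dependence.
corr_u2v = np.corrcoef(u ** 2, v)[0, 1]

print(f"corr(u, v)   = {corr_uv:.4f}")
print(f"corr(u^2, v) = {corr_u2v:.4f}")
```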
Conversely, ICA presents a special problem absent in both PCA and FA: the estimated independent components (ICs) are not explicitly ranked as in the other methods, where the factors are automatically ranked by their eigenvalues. Therefore, we have to apply an algorithm able to order the ICs according to some criterion.
In the case of financial series, it is reasonable to assume that a set of independent factors underlies the observed time series, which might be related to political, meteorological, technical, fundamental, macroeconomic, market, national or international aspects, and that ICA might be an appropriate model to extract them. ICA is therefore well suited to financial time series for the following reasons: first, ICA deals with the problem of blind source separation in parallel time series, like those obtained from financial variables; second, ICA works with non-Gaussian random variables, which are the ones most commonly found in financial data; third, from statistical and financial standpoints, ICA produces more reliable underlying components or factors, since they are statistically independent and not only uncorrelated. This contributes directly to the aim of extracting the systematic risk factors affecting the returns on equities in a multifactor asset-pricing model such as the Arbitrage Pricing Theory.
3 Empirical Study
3.1 The Data
We used four different databases formed as follows. First, for the sake of comparison with previous research [28], we ran our study over two databases consisting of 291 weekly observations, computed as log-returns of weekly closing prices for 20 stocks of the Mexican Stock Exchange over the period from July 3, 2000 to January 27, 2006.2 One of these two databases is stated in returns (DBWR) and the other in excesses over the risk-free interest rate (DBWE).3
In addition, we used two daily databases, one expressed in returns (DBDR) and another in excesses (DBDE). The daily databases consist of 1410 observations for 22 stocks and cover the period from July 3, 2000 to January 27, 2006.4
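The returns were calculated as the logarithmic returns of the stocks’ closing prices, in accordance with the following expression:

$$r_t = \ln\left(\frac{P_t}{P_{t-1}}\right) \qquad (4)$$

where P_t denotes the closing price of a stock at time t.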
Although ICA does not require the time series to be stationary, computing the returns on equities as continuous logarithmic returns, as in expression 4, already acknowledges that the price series are not stationary: the series are differenced in order to make them stationary in mean. In addition, since the returns are differenced values, the underlying mean and trend are discarded, and thus the ICA algorithm is able to capture the interactions between the different stocks at a given moment.
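The computation of logarithmic returns can be sketched as follows. The prices and the risk-free rate are hypothetical values used only for illustration, and the helper `excess_returns` assumes that excesses are simple differences between each return and a per-period risk-free rate, which is our simplification here.

```python
import math

def log_returns(prices):
    """Continuously compounded returns r_t = ln(P_t / P_{t-1})."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices[:-1], prices[1:])]

def excess_returns(returns, risk_free):
    """Returns in excess of a per-period risk-free rate series of the same length."""
    return [r - rf for r, rf in zip(returns, risk_free)]

# Hypothetical closing prices, for illustration only.
prices = [100.0, 110.0, 121.0, 108.9]
r = log_returns(prices)
ex = excess_returns(r, [0.001] * len(r))
print(r)
print(ex)
```

A series of n prices yields n - 1 returns, and equal proportional price changes give equal log-returns regardless of the price level.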
Moreover, ICA as a methodology does not require each time series to be intrinsically stationary. What ICA assumes is that the set of time series as a whole preserves the same kind of interactions between the series; that is, the statistics of the observations might change, but the interaction between them captured by the matrix W does not.
Finally, aggregating over longer time intervals, e.g., moving from daily to weekly to monthly returns, yields time series with increasingly lower discrepancies with respect to a Gaussian distribution (see [11]); however, the discrepancies that remain at the high values of the returns in the QQ plots at the one-month level are compatible with the non-Gaussianity assumption required by the ICA algorithm.
3.2 Methodology and Results
3.2.1 Tests for Univariate and Multivariate Normality
It is known [21] that PCA (implicitly) and FA (explicitly) require a normally distributed multivariate sample in order to produce completely reliable results, i.e., they will only produce uncorrelated and independent components if the sample data have no higher order statistics beyond the variance.
Thus, if the samples do not fulfill these conditions, we will be prompted to use a more suitable technique such as ICA to uncover the underlying sources in a non-Gaussian sample. Therefore, we first tested the univariate normality (UVN) of each individual series, since ICA requires that no more than one of the signals (the returns on equities) be Gaussian.
Tables 1 to 4 present the descriptive statistics up to the fourth moment of the four databases used in this study. We can observe that the skewness and the kurtosis of practically all the stocks differ from those of the Gaussian distribution.
Stock | Mean | Median | Std. Dev. | Skewness | Kurtosis | Jarque-Bera | Probability |
---|---|---|---|---|---|---|---|
ALFAA | 0.0036 | 0.0041 | 0.0619 | -0.6609 | 7.4108 | 257.0801 | 0.0000 |
ARA_01 | 0.0049 | 0.0061 | 0.0406 | -0.1335 | 3.5483 | 4.5102 | 0.1049 |
BIMBOA | 0.0032 | 0.0019 | 0.0422 | 0.0777 | 4.7718 | 38.3563 | 0.0000 |
CIEB | -0.0019 | 0.0004 | 0.0505 | -0.7843 | 6.2150 | 155.1639 | 0.0000 |
COMERUBC | 0.0023 | 0.0010 | 0.0454 | 0.1356 | 4.4699 | 27.0904 | 0.0000 |
CONTAL_01 | 0.0020 | 0.0000 | 0.0438 | 0.0716 | 4.6692 | 34.0319 | 0.0000 |
ELEKTRA_01 | 0.0027 | 0.0033 | 0.0569 | -0.2465 | 4.3674 | 25.6200 | 0.0000 |
FEMSAUBD | 0.0024 | 0.0017 | 0.0424 | -0.2520 | 4.7448 | 39.9911 | 0.0000 |
GCARSOA1 | 0.0034 | 0.0062 | 0.0445 | -0.3802 | 4.3096 | 27.8059 | 0.0000 |
GEOB | 0.0082 | 0.0128 | 0.0629 | -0.2622 | 5.1221 | 57.9405 | 0.0000 |
GFINBURO | 0.0025 | 0.0031 | 0.0426 | -0.3496 | 5.3609 | 73.5098 | 0.0000 |
GFNORTEO | 0.0069 | 0.0077 | 0.0436 | 0.2487 | 4.5283 | 31.3195 | 0.0000 |
GMODELOC | 0.0019 | 0.0017 | 0.0321 | 0.3192 | 5.2380 | 65.6702 | 0.0000 |
PE_OLES_01 | 0.0047 | 0.0000 | 0.0674 | 0.3414 | 4.3948 | 29.2415 | 0.0000 |
SORIANAB | 0.0007 | 0.0000 | 0.0438 | -0.0533 | 4.7728 | 38.2445 | 0.0000 |
TELECOA1 | 0.0013 | 0.0025 | 0.0444 | -0.1219 | 3.7457 | 7.4627 | 0.0240 |
TELMEXL | 0.0012 | 0.0000 | 0.0334 | -0.5724 | 7.7828 | 293.2540 | 0.0000 |
TLEVICPO | 0.0009 | 0.0020 | 0.0475 | -0.3993 | 5.7427 | 98.9405 | 0.0000 |
TVAZTCPO | -0.0003 | 0.0000 | 0.0528 | -0.3567 | 4.4700 | 32.3714 | 0.0000 |
WALMEXV | 0.0033 | 0.0030 | 0.0398 | -0.0261 | 4.5949 | 30.8752 | 0.0000 |
Stock | Mean | Median | Std. Dev. | Skewness | Kurtosis | Jarque-Bera | Probability |
---|---|---|---|---|---|---|---|
ALFAA | 0.0019 | 0.0030 | 0.0620 | -0.6709 | 7.3742 | 253.8279 | 0.0000 |
ARA_01 | 0.0032 | 0.0045 | 0.0406 | -0.1423 | 3.5319 | 4.4115 | 0.1102 |
BIMBOA | 0.0015 | 0.0002 | 0.0422 | 0.0699 | 4.7836 | 38.8079 | 0.0000 |
CIEB | -0.0036 | -0.0010 | 0.0506 | -0.7874 | 6.1942 | 153.7829 | 0.0000 |
COMERUBC | 0.0006 | -0.0005 | 0.0455 | 0.1275 | 4.4335 | 25.7027 | 0.0000 |
CONTAL_01 | 0.0004 | -0.0018 | 0.0438 | 0.0597 | 4.6472 | 33.0725 | 0.0000 |
ELEKTRA_01 | 0.0010 | 0.0017 | 0.0569 | -0.2500 | 4.3482 | 25.0695 | 0.0000 |
FEMSAUBD | 0.0007 | 0.0003 | 0.0424 | -0.2723 | 4.7356 | 40.1191 | 0.0000 |
GCARSOA1 | 0.0017 | 0.0052 | 0.0446 | -0.4009 | 4.3393 | 29.5442 | 0.0000 |
GEOB | 0.0065 | 0.0103 | 0.0630 | -0.2847 | 5.1160 | 58.2218 | 0.0000 |
GFINBURO | 0.0008 | 0.0015 | 0.0426 | -0.3555 | 5.3354 | 72.2614 | 0.0000 |
GFNORTEO | 0.0052 | 0.0062 | 0.0437 | 0.2379 | 4.4759 | 29.1582 | 0.0000 |
GMODELOC | 0.0002 | 0.0001 | 0.0322 | 0.2873 | 5.2272 | 64.1473 | 0.0000 |
PE_OLES_01 | 0.0030 | -0.0017 | 0.0675 | 0.3316 | 4.3801 | 28.4267 | 0.0000 |
SORIANAB | -0.0009 | -0.0010 | 0.0439 | -0.0721 | 4.7767 | 38.5244 | 0.0000 |
TELECOA1 | -0.0004 | 0.0006 | 0.0445 | -0.1458 | 3.7462 | 7.7812 | 0.0204 |
TELMEXL | -0.0005 | -0.0015 | 0.0335 | -0.6063 | 7.8238 | 299.9606 | 0.0000 |
TLEVICPO | -0.0008 | 0.0007 | 0.0476 | -0.4135 | 5.7603 | 100.6749 | 0.0000 |
TVAZTCPO | -0.0020 | -0.0009 | 0.0528 | -0.3650 | 4.4637 | 32.4391 | 0.0000 |
WALMEXV | 0.0016 | 0.0016 | 0.0399 | -0.0627 | 4.5845 | 30.6314 | 0.0000 |
Stock | Mean | Median | Std. Dev. | Skewness | Kurtosis | Jarque-Bera | Probability |
---|---|---|---|---|---|---|---|
ALFAA | 0.0007 | 0.0000 | 0.0246 | -0.1153 | 6.3963 | 680.8083 | 0.0000 |
ARA_01 | 0.0010 | 0.0000 | 0.0189 | -0.0442 | 5.9361 | 506.9414 | 0.0000 |
BIMBOA | 0.0007 | 0.0000 | 0.0187 | 0.3740 | 7.6206 | 1287.2010 | 0.0000 |
CIEB | -0.0004 | 0.0000 | 0.0213 | -0.6673 | 9.9616 | 2951.9139 | 0.0000 |
COMERUBC | 0.0005 | 0.0000 | 0.0204 | 0.4306 | 6.4539 | 744.4508 | 0.0000 |
CONTAL_01 | 0.0004 | 0.0000 | 0.0211 | -0.1938 | 6.8047 | 859.2542 | 0.0000 |
ELEKTRA_01 | 0.0005 | 0.0002 | 0.0245 | -0.1246 | 6.4904 | 719.3973 | 0.0000 |
FEMSAUBD | 0.0005 | 0.0000 | 0.0175 | -0.2518 | 7.1901 | 1046.3697 | 0.0000 |
GCARSOA1 | 0.0007 | 0.0000 | 0.0192 | -0.2304 | 6.1817 | 607.2330 | 0.0000 |
GEOB | 0.0017 | 0.0000 | 0.0245 | -0.1054 | 10.2044 | 3051.9052 | 0.0000 |
GFINBURO | 0.0005 | 0.0000 | 0.0194 | 0.2199 | 5.0447 | 256.9903 | 0.0000 |
GFNORTEO | 0.0014 | 0.0000 | 0.0205 | 0.2748 | 6.7824 | 858.2517 | 0.0000 |
GMODELOC | 0.0004 | 0.0000 | 0.0158 | 0.1737 | 5.6468 | 418.6632 | 0.0000 |
PE_OLES_01 | 0.0010 | 0.0000 | 0.0295 | -0.3729 | 10.1686 | 3051.7488 | 0.0000 |
SORIANAB | 0.0002 | 0.0000 | 0.0186 | -0.0839 | 4.6112 | 154.1588 | 0.0000 |
TELECOA1 | 0.0003 | 0.0006 | 0.0195 | -0.1156 | 4.7901 | 191.3930 | 0.0000 |
TELMEXL | 0.0002 | 0.0000 | 0.0156 | -0.1018 | 6.0378 | 544.6098 | 0.0000 |
TLEVICPO | 0.0002 | 0.0006 | 0.0220 | -0.1052 | 6.6617 | 790.3090 | 0.0000 |
TVAZTCPO | -0.0001 | 0.0000 | 0.0244 | -0.5064 | 8.0397 | 1552.4342 | 0.0000 |
WALMEXV | 0.0007 | 0.0006 | 0.0187 | 0.1244 | 5.9440 | 512.8407 | 0.0000 |
CEMEXCP | 0.0008 | 0.0000 | 0.0162 | 0.1342 | 4.2068 | 89.7969 | 0.0000 |
KIMBERA | 0.0002 | 0.0000 | 0.0151 | -0.5530 | 9.0290 | 2207.3787 | 0.0000 |
Stock | Mean | Median | Std. Dev. | Skewness | Kurtosis | Jarque-Bera | Probability |
---|---|---|---|---|---|---|---|
ALFAA | 0.0005 | -0.0001 | 0.0246 | -0.1215 | 6.3955 | 680.8189 | 0.0000 |
ARA_01 | 0.0008 | -0.0002 | 0.0189 | -0.0495 | 5.9402 | 508.4618 | 0.0000 |
BIMBOA | 0.0004 | -0.0002 | 0.0187 | 0.3744 | 7.6211 | 1287.5568 | 0.0000 |
CIEB | -0.0006 | -0.0002 | 0.0213 | -0.6697 | 9.9707 | 2960.0790 | 0.0000 |
COMERUBC | 0.0003 | -0.0002 | 0.0204 | 0.4273 | 6.4467 | 740.8504 | 0.0000 |
CONTAL_01 | 0.0002 | -0.0002 | 0.0211 | -0.1962 | 6.7999 | 857.3613 | 0.0000 |
ELEKTRA_01 | 0.0003 | 0.0000 | 0.0245 | -0.1266 | 6.4854 | 717.4653 | 0.0000 |
FEMSAUBD | 0.0002 | -0.0002 | 0.0175 | -0.2567 | 7.2068 | 1055.2038 | 0.0000 |
GCARSOA1 | 0.0005 | -0.0001 | 0.0192 | -0.2365 | 6.1774 | 606.2876 | 0.0000 |
GEOB | 0.0015 | -0.0001 | 0.0245 | -0.1144 | 10.1975 | 3046.6028 | 0.0000 |
GFINBURO | 0.0003 | -0.0002 | 0.0193 | 0.2208 | 5.0571 | 260.0685 | 0.0000 |
GFNORTEO | 0.0012 | -0.0001 | 0.0205 | 0.2716 | 6.7766 | 855.2821 | 0.0000 |
GMODELOC | 0.0001 | -0.0002 | 0.0158 | 0.1670 | 5.6406 | 416.2018 | 0.0000 |
PE_OLES_01 | 0.0008 | -0.0002 | 0.0295 | -0.3695 | 10.1326 | 3020.9541 | 0.0000 |
SORIANAB | -0.0001 | -0.0002 | 0.0186 | -0.0883 | 4.6225 | 156.4975 | 0.0000 |
TELECOA1 | 0.0000 | 0.0005 | 0.0195 | -0.1242 | 4.7890 | 191.6613 | 0.0000 |
TELMEXL | 0.0000 | -0.0002 | 0.0156 | -0.1130 | 6.0560 | 551.6562 | 0.0000 |
TLEVICPO | -0.0001 | 0.0004 | 0.0220 | -0.1122 | 6.6667 | 792.8200 | 0.0000 |
TVAZTCPO | -0.0003 | -0.0002 | 0.0244 | -0.5083 | 8.0248 | 1544.0783 | 0.0000 |
WALMEXV | 0.0004 | 0.0004 | 0.0187 | 0.1142 | 5.9465 | 513.1155 | 0.0000 |
CEMEXCP | 0.0006 | -0.0002 | 0.0161 | 0.1316 | 4.2152 | 90.8231 | 0.0000 |
KIMBERA | 0.0000 | -0.0002 | 0.0151 | -0.5621 | 9.0350 | 2213.9756 | 0.0000 |
We also carried out the Jarque-Bera test for UVN on the four databases. At the 5% significance level, the null hypothesis of normality is rejected for all the stocks in the daily databases and for all but one stock in the weekly databases. The last two columns of Tables 1 to 4 present the results of the Jarque-Bera test.
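For reference, the Jarque-Bera statistic can be computed directly from the sample skewness S and kurtosis K as JB = n/6 (S^2 + (K - 3)^2 / 4), which is asymptotically chi-square distributed with two degrees of freedom. The sketch below applies this standard formula to the rounded moments reported for two stocks in the weekly returns table, assuming n = 291 and that the tabulated kurtosis is the raw (non-excess) kurtosis; the results agree with the tabulated statistics up to the rounding of the inputs.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    p_value = math.exp(-jb / 2.0)   # survival function of chi-square with 2 d.o.f.
    return jb, p_value

# Sample moments as reported (rounded) for the weekly returns, n = 291.
jb_alfa, p_alfa = jarque_bera(291, -0.6609, 7.4108)   # ALFAA
jb_ara, p_ara = jarque_bera(291, -0.1335, 3.5483)     # ARA_01
print(f"ALFAA:  JB = {jb_alfa:.2f}, p = {p_alfa:.4f}")
print(f"ARA_01: JB = {jb_ara:.2f}, p = {p_ara:.4f}")
```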
We used two classical tests for assessing multivariate normality (MVN): the Mardia [33] and the Henze-Zirkler [18] MVN tests. Mardia’s test is based on the multivariate skewness and kurtosis of the sample. Henze-Zirkler’s (H-Z) test considers a measure of the distance between the characteristic function of the MVN and the empirical one, where the computed statistic will be lognormally distributed if the data are multivariate normal. Both tests have shown very good performance in assessing MVN compared with other classic and newer alternatives, as [34] remark in their study.
We performed two tests following the accepted criterion of applying more than one MVN test when assessing this property of a sample.5 Both tests reject the null hypothesis of MVN at the 5% significance level for all the databases. Tables 5 and 6 present the results of Mardia’s and H-Z’s tests, respectively.
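To make the construction of Mardia's statistics concrete, the following sketch computes the multivariate skewness and kurtosis in their standard textbook form, together with the usual asymptotic test statistics. It is only an illustration on simulated data: the function name `mardia_statistics` and the simulated samples are our own assumptions, the small-sample correction of the skewness statistic is omitted, and the code is not the software used to produce Table 5.

```python
import numpy as np

def mardia_statistics(X):
    """Mardia's multivariate skewness and kurtosis with their test statistics.

    X has shape (n_observations, p_variables).
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                       # maximum-likelihood covariance
    D = Xc @ np.linalg.inv(S) @ Xc.T        # Mahalanobis cross-products
    b1 = np.sum(D ** 3) / n ** 2            # multivariate skewness
    b2 = np.mean(np.diag(D) ** 2)           # multivariate kurtosis
    skew_stat = n * b1 / 6.0                # ~ chi-square, p(p+1)(p+2)/6 d.o.f.
    kurt_stat = (b2 - p * (p + 2)) / np.sqrt(8.0 * p * (p + 2) / n)  # ~ N(0, 1)
    return b1, b2, skew_stat, kurt_stat

rng = np.random.default_rng(0)
gaussian = rng.standard_normal((2000, 3))             # simulated Gaussian sample
heavy_tailed = rng.standard_t(df=3, size=(2000, 3))   # simulated fat-tailed sample

print(mardia_statistics(gaussian))
print(mardia_statistics(heavy_tailed))
```

Under multivariate normality the kurtosis is close to p(p + 2), so fat-tailed data such as equity returns typically produce a large positive kurtosis statistic.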
Statistic | DBWR | DBWE | DBDR | DBDE |
---|---|---|---|---|
Multivariate Skewness (Ms) | 3305.50 | 3297.10 | 6659.40 | 6666.30 |
p-value | 0.00 | 0.00 | 0.00 | 0.00 |
Multivariate Skewness corrected (Msc) | 3342.80 | 3334.40 | 6674.80 | 6681.70 |
p-value | 0.00 | 0.00 | 0.00 | 0.00 |
Multivariate Kurtosis (Mk) | 37.83 | 37.71 | 141.05 | 141.16 |
p-value | 0.00 | 0.00 | 0.00 | 0.00 |
Notes:
DBWR = Database of weekly returns. DBWE = Database of weekly excesses. DBDR = Database of daily returns. DBDE = Database of daily excesses. H0 = Multivariate Normality. p-value lower than 0.05 = Rejection of the H0.
Statistic | DBWR | DBWE | DBDR | DBDE |
---|---|---|---|---|
Henze-Zirkler's Statistic | 1.05 | 1.05 | 1.22 | 1.22 |
p-value | 0.00 | 0.00 | 0.00 | 0.00 |
Notes:
DBWR = Database of weekly returns. DBWE = Database of weekly excesses. DBDR = Database of daily returns. DBDE = Database of daily excesses. H0 = Multivariate Normality. p-value lower than 0.05 = Rejection of the H0.
We extended this analysis by making an experiment concerning the horizon of Mardia’s test, i.e., we ran the test using different numbers of observations so as to check the multivariate normality in different scenarios. The results showed that from 101 observations on, inclusive, the sample is non-Gaussian according to the three statistics.
On the basis of the foregoing results6, we cannot accept as completely reliable the outcomes of techniques that assume multivariate normality of the data, such as PCA and FA; thus, we are led to apply more suitable techniques such as ICA. In fact, this part of our investigation addresses an important, but often ignored, aspect of empirical studies that use classic multivariate techniques to extract the pervasive factors: since in many cases MVN is assumed but not tested, the results and conclusions may be flawed.
In addition, ICA models assume that the third and fourth moments differ significantly from those of a Gaussian distribution, and the normality tests are based on checking this assumption. In particular, the nonlinearities used in the experiments of this paper guarantee the presence of high-order interactions through their Taylor expansions, and therefore the presence of moments of all orders.
3.2.2 Estimation of the ICA Model
In order to estimate the ICA model in expression (2), we used the ICASSO methodology [20], which is based on the FastICA algorithm [22]7. According to the foregoing authors, the FastICA algorithm is based on a fixed-point iteration scheme for finding the local extrema of the objective functions. The basic iteration for the vector w for each IC obtained by this method is:

$$w \leftarrow E\{z\, g(w^T z)\} - E\{g'(w^T z)\}\, w$$
where the nonlinearity g can be almost any smooth function such as:
and g’ is the derivative of g(.).8
The final vector gives one of the ICs as the linear combination y = w^T z. The specific resulting algorithm depends both on the estimation principle used and on the approach selected to estimate several ICs, i.e., on the nonlinearity and the decorrelation method chosen. In [21], the authors state that by choosing the tanh (hyperbolic tangent) nonlinearity and the symmetric approach, one can obtain a good estimation of the ICA model; this would be equivalent to performing the three estimation approaches at the same time.
In addition, the positive kurtosis obtained in the multivariate normality tests leads us to use the hyperbolic tangent function.
Furthermore, as reported in [14], the best trade-off for estimating the ICA model, from statistical performance and computational load perspectives, is represented by the FastICA algorithm with symmetric orthogonalization and tanh nonlinearity estimation. In our study we followed these specifications.
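The specification just described (fixed-point iteration, tanh nonlinearity, symmetric orthogonalization) can be sketched in a few lines. The code below is a simplified illustration written for this purpose, not the FastICA or ICASSO package used in our study: the function names, the convergence rule and the default parameters are our own assumptions, and the data are whitened with a plain eigenvalue decomposition of the covariance matrix.

```python
import numpy as np

def whiten(X):
    """Center X (n_samples, n_features); return whitened data Z and whitening matrix K."""
    Xc = X - X.mean(axis=0)
    eigval, eigvec = np.linalg.eigh(np.cov(Xc, rowvar=False))
    K = (eigvec / np.sqrt(eigval)).T            # K = D^{-1/2} E^T
    return Xc @ K.T, K

def sym_decorrelate(W):
    """Symmetric orthogonalization: W <- (W W^T)^{-1/2} W."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W

def fastica_symmetric(X, n_iter=200, tol=1e-6, seed=0):
    """Estimate all ICs at once with the tanh nonlinearity (illustrative sketch)."""
    Z, K = whiten(X)
    n, m = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((m, m)))
    for _ in range(n_iter):
        Y = Z @ W.T                              # current component estimates
        G = np.tanh(Y)                           # g(w^T z)
        G_prime = 1.0 - G ** 2                   # g'(w^T z)
        # Row-wise fixed-point update: E{z g(w^T z)} - E{g'(w^T z)} w
        W_new = (G.T @ Z) / n - np.diag(G_prime.mean(axis=0)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when every row keeps its direction (up to sign).
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return Z @ W.T, W @ K                        # components and demixing matrix
```

The returned components are defined only up to sign and order, which is one of the reasons why a ranking criterion is needed afterwards.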
The choice of the ideal number of ICs to estimate is still an unsolved problem.
Although in ICA literature we can find diverse criteria to determine this number, in most cases it is actually chosen by trial and error without any theoretical basis. One alternative is to reduce the number of dimensions in the whitening pre-processing stage, considering some criteria from among those used in PCA or FA, and to estimate the same number of ICs. For the sake of comparison with our previous study, we use the same test window, which ranges from two to nine components.9
As stated by [20], one problem of ICA estimation is that the reliability of the estimated ICs is not known, since the results are stochastic, i.e., they might differ between runs of the algorithm.
Thus, the results of a single run of the FastICA algorithm cannot be completely trusted, and an additional analysis of the reliability of the estimation should be performed. In this context, reliability has two aspects: the algorithmic and the statistical. According to these authors, the ICASSO methodology is an alternative for dealing with this problem, since it ensures the algorithmic and statistical stability and reliability of the estimated components by running the FastICA algorithm many times, using different initial conditions and/or a differently bootstrapped data set.
Following [20], ICASSO first runs the FastICA algorithm M times on data set
These elements form the similarity matrix, which can be obtained by:
where, Σ is the covariance matrix of dataset x, and
According to [19], reliable estimates of ICs correspond to tight clusters, since they agglomerate estimates generated by many runs of the algorithm which are similar, even when the initial values and datasets for the estimation have been changed. Conversely, estimates which do not belong to any cluster are considered unreliable estimates. The centrotype of each cluster is considered a more reliable estimate than that generated by any single run.
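The similarity between estimates from different runs can be sketched as follows, assuming, as is usual in this methodology, that similarity is measured by the absolute value of the correlation between the estimated components. The demixing vectors below are hypothetical values chosen for illustration, the function name is ours, and the subsequent clustering step is omitted.

```python
import numpy as np

def similarity_matrix(W_hat, cov):
    """Absolute correlation between components estimated in different runs.

    W_hat stacks the demixing vectors of all runs as rows, shape
    (n_estimates, n_variables); cov is the covariance matrix of the data.
    """
    R = W_hat @ cov @ W_hat.T                 # covariances between estimated components
    d = np.sqrt(np.diag(R))
    return np.abs(R / np.outer(d, d))         # normalize to correlations, drop the sign

# Hypothetical demixing vectors from several runs (illustrative values only).
W_hat = np.array([[1.0, 0.0],
                  [-2.0, 0.0],     # same direction as row 0, up to sign and scale
                  [0.0, 1.0]])
cov = np.eye(2)
sim = similarity_matrix(W_hat, cov)
print(sim)
```

Estimates that differ only in sign or scale obtain a similarity of one and therefore fall into the same tight cluster, whereas unrelated estimates obtain values close to zero.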
Besides the previously declared parameters for FastICA, there are some additional parameters to set when using ICASSO, such as the resampling mode, the number of resampling cycles (M) and the number of clusters (L). In order to ensure both statistical and algorithmic reliability, in our study we used both resampling modes, i.e., each time the dataset was bootstrapped and the initial conditions of the algorithm were randomized. We used the default number of resampling cycles fixed by the software, i.e., 30, and we set the number of clusters according to the number of ICs (m) estimated in each experiment in order to obtain square mixing (A) and demixing (W) matrices.
The demixing matrix (W) computed by ICASSO corresponds to the centrotypes of each cluster as well, representing a more reliable estimate than that produced by a single run of FastICA; however, its rows are not strictly orthogonalized. In the context of our research, where we need to obtain orthogonalized ICs, we have to perform an orthogonalization procedure in a later step.
Consequently, we first took the demixing matrix (W) produced by ICASSO, then we computed the mixing matrix:

$$A = W^{-1}$$

and the matrix of independent components or sources:

$$S = WX$$
3.2.3 Ranking and Orthogonalization of the Independent Components
The ICA model performs a decomposition based on a criterion related to statistical independence, which does not provide a natural ordering of the components or, consequently, of the residual. The criterion presented in this section is one that makes sense for the application at hand. In contrast with linear regression or PCA, where the driving noise is easy to identify because it is the residual that remains after the components of maximum variance have been determined, in ICA such an interpretation is not natural. For this reason, the ICA literature does not clearly specify the difference between the components and the residual, and the results are usually presented as a complete projection onto the space of statistically independent components.
Next, we ordered the independent components in terms of their explained variability by means of the criterion proposed by [12]. This criterion ranks the ICs according to the amount of variance of the stocks that each of them explains; thus we obtain a ranked matrix of independent components (Sr), as well as sorted mixing (Ar) and demixing (Wr) matrices.
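One plausible way to implement a ranking by explained variance is sketched below. It assumes that the contribution of each IC is the sum over stocks of the squared mixing coefficients multiplied by the variance of the component; this is our simplified reading, and the exact criterion of [12] may differ in its details. The function name and the numerical values are hypothetical.

```python
import numpy as np

def rank_components(A, S):
    """Sort ICs by the share of the variance of the observed series they account for.

    A: mixing matrix, shape (n_series, n_components);
    S: components, shape (n_components, n_observations).
    """
    contribution = (A ** 2).sum(axis=0) * S.var(axis=1)   # variance carried by each IC
    share = contribution / contribution.sum()
    order = np.argsort(share)[::-1]                        # largest share first
    return A[:, order], S[order, :], share[order]

# Hypothetical mixing matrix and simulated components, for illustration only.
rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 500))
A = np.array([[0.1, 1.0],
              [0.2, 0.8],
              [0.1, 1.2]])
A_r, S_r, share_r = rank_components(A, S)
print(share_r)
```

Because columns of the mixing matrix and rows of the component matrix are permuted together, the product A_r S_r reproduces exactly the same observed series as A S.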
Finally, we orthogonalized the matrix of ICs by means of the following transformation:

$$S_o = V S_r$$
where V is a transformation matrix to decorrelate the matrix of sorted independent components, and So represents the matrix of orthogonalized ICs.
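A decorrelating transformation of this kind can be sketched as follows. The choice of V as the symmetric inverse square root of the covariance matrix of the sorted components is an assumption made for this illustration, since other decorrelating matrices are possible; the function name and the simulated data are also hypothetical.

```python
import numpy as np

def decorrelate(S_r):
    """Return V and S_o = V S_r, where the rows of S_o are uncorrelated with unit variance."""
    eigval, eigvec = np.linalg.eigh(np.cov(S_r))   # rows of S_r are components
    V = (eigvec / np.sqrt(eigval)) @ eigvec.T      # symmetric inverse square root
    return V, V @ S_r

# Simulated, slightly correlated components and a hypothetical sorted mixing matrix.
rng = np.random.default_rng(0)
S_r = np.array([[1.0, 0.4], [0.0, 1.0]]) @ rng.laplace(size=(2, 2000))
A_r = np.array([[1.0, 0.2],
                [0.8, 0.3],
                [1.2, 0.1]])

V, S_o = decorrelate(S_r)
A_o = A_r @ np.linalg.inv(V)       # mixing matrix adjusted with the inverse of V
X_hat = A_o @ S_o                  # reconstruction of the observed series
```

Adjusting the mixing matrix with the inverse of V leaves the reconstructed series unchanged, so the orthogonalization does not alter the fit of the generative model.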
3.2.4 Extraction of Underlying Systematic Risk Factors Via ICA
In each one of the four databases, we computed eight multifactor models in order to extract a window from two to nine independent components. Then, we proceeded to reconstruct the original variables according to the generation process of expression (1), including the inverse of the transformation matrix V in order to orthogonalize the mixing matrix A as well:

$$X = A_r V^{-1} S_o$$
The reproduced values were very similar to the observed series for most of the equities in all the datasets, which indicates that the generative multifactor model estimated by ICA was effective. However, stocks such as GMODELO, CEMEX, SORIANA and GCARSO were not reconstructed very well, especially in the cases of daily returns and excesses, due to the high volatility they presented during the studied period. To save space, we only present the line plots of the observed and reproduced returns and excesses for the first five stocks in each database.
Figures 1 to 4 present the results for the case in which we extracted nine underlying factors; the reconstruction performance is evident.10 An interesting feature of the ICA algorithm is that it captures the global interaction between stocks, independently of the non-stationarity of their joint behavior. That is, the only assumption required by the model is that there are independent sources that are mixed by a matrix A. If this matrix does not change, the ICA algorithm will impute the components of volatility to some of the non-observable factors.
Note: Logarithmic returns of the first five stocks observed in each database and their respective reconstructions using the estimated ICA model. The symbol of each stock appears above its line plot.
3.2.5 Independence Test
In order to test the independence of the computed ICs, we ran the Hilbert-Schmidt Independence Criterion (HSIC) test [15]11, which tests whether random variables X and Y are independent based on a sample of observed pairs (xi, yi). The results of our independence tests confirmed the statistical independence between each pair of components estimated from the weekly and daily databases.
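The statistic behind this test can be illustrated with a short sketch. The code computes a biased empirical HSIC, trace(KHLH)/n^2, with Gaussian kernels of fixed bandwidth on simulated data; the bandwidth, the function names and the samples are assumptions made for illustration, and the computation of the rejection threshold used in the actual test is omitted.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    """Gram matrix of a Gaussian kernel for a one-dimensional sample."""
    d = x[:, None] - x[None, :]
    return np.exp(-d ** 2 / (2.0 * sigma ** 2))

def hsic(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / n^2."""
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    K = gaussian_gram(x, sigma)
    L = gaussian_gram(y, sigma)
    return np.trace(K @ H @ L @ H) / n ** 2

rng = np.random.default_rng(0)
x = rng.standard_normal(400)
y_indep = rng.standard_normal(400)             # independent of x
y_dep = x ** 2 - 1.0                           # uncorrelated with x but dependent on it

hsic_indep = hsic(x, y_indep)
hsic_dep = hsic(x, y_dep)
print(f"HSIC independent pair: {hsic_indep:.5f}")
print(f"HSIC dependent pair:   {hsic_dep:.5f}")
```

The statistic is close to zero for independent variables and clearly larger for a pair that is uncorrelated but dependent, which is why it is a suitable check for components that are meant to be independent rather than merely uncorrelated.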
3.2.6 Econometric Contrast
We carried out an econometric contrast under a statistical approach to the Arbitrage Pricing Theory (APT) using the underlying systematic risk factors extracted via ICA. The APT’s pricing equation is expressed as follows:
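A standard way to write this pricing relation, using the symbols defined in the next paragraph, is sketched below. The asset index i, the number of factors k, and the double subscript on the betas are our notational assumptions.

```latex
E(R_i) = \lambda_0 + \lambda_1 \beta_{i1} + \lambda_2 \beta_{i2} + \dots + \lambda_k \beta_{ik}
```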
As in [28], λ0 represents the riskless interest rate, λk the risk premium for each type of systematic risk factor, and βk the exposure to each type of systematic risk. We tested the former expression by means of an average cross-section methodology, estimating the coefficients by ordinary least squares (OLS) in the following regression model:
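Assuming an asset index i and k factors, the cross-sectional regression can be sketched as follows. Here \bar{R}_i denotes the average return (or excess) of asset i, \hat{\beta}_{ij} the first-stage beta estimates, and \varepsilon_i the error term; these three symbols are our notation and are not taken from the original text.

```latex
\bar{R}_i = \lambda_0 + \sum_{j=1}^{k} \lambda_j \hat{\beta}_{ij} + \varepsilon_i
```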
We again used the two-stage methodology for the econometric contrast of the APT applied in our aforementioned study [28]. In the first stage, we estimated the betas to be used in expression 18 from the scores of the extracted factors; in the second stage, we estimated the lambdas. Specifically, in the first stage we estimated the betas by regressing the returns and excesses on the factor scores obtained by ICA, which enter as a cross-section of regressors. In order to improve the efficiency of the parameter estimates and to eliminate autocorrelation in the error terms of the regressions, we used weighted least squares (WLS) to estimate the entire system of equations simultaneously.
The results of the regressions in the four databases were very good: in almost all cases they produced statistically significant parameters, high R2 coefficients, and Durbin-Watson statistics that led us not to reject the null hypothesis of no autocorrelation. In the second stage, we estimated the lambdas, or risk premia, in expression 17 by regressing the average returns and excesses on the betas obtained in the first stage as a cross-section, using ordinary least squares (OLS).
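The two stages can be sketched in a few lines. This is a simplified illustration, not the estimation used in the study: the first stage here runs equation-by-equation OLS instead of system-wide WLS, and the function names `ols` and `two_stage_apt` are hypothetical.

```python
def ols(X, y):
    """OLS coefficients for y = X b + e, solving (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    M = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))]
         for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[r][k] for r in range(k)]

def two_stage_apt(returns, factors):
    """returns: T rows of n asset returns; factors: T rows of k factor scores.

    Stage 1: regress each asset's return series on the factor scores
             (with intercept) and keep the slopes as betas.
    Stage 2: regress the average returns on the betas (with intercept);
             the coefficients are lambda_0, lambda_1, ..., lambda_k.
    """
    T, n = len(returns), len(returns[0])
    X1 = [[1.0] + list(f_t) for f_t in factors]
    betas = [ols(X1, [r_t[i] for r_t in returns])[1:] for i in range(n)]
    avg = [sum(r_t[i] for r_t in returns) / T for i in range(n)]
    X2 = [[1.0] + b for b in betas]
    lambdas = ols(X2, avg)
    return betas, lambdas
```

The second stage needs more assets than estimated parameters (n > k + 1), which is why the number of extracted factors is bounded by the number of stocks in each database.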
In order to avoid the econometric problems of heteroskedasticity and autocorrelation in the residuals of the model estimated through OLS, we corrected the covariance matrix by means of the Newey-West heteroskedasticity and autocorrelation consistent (HAC) covariance estimates. Additionally, we verified the normality of the residuals by carrying out the Jarque-Bera test of normality.
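The normality check can be sketched as follows, assuming the textbook form of the Jarque-Bera statistic and its asymptotic chi-square distribution with two degrees of freedom. The function name is hypothetical, and the sketch does not cover the Newey-West correction.

```python
import math

def jarque_bera(residuals):
    """Return (JB statistic, asymptotic p-value) under H0: normality.

    JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the sample skewness and
    K the sample kurtosis. Under H0, JB is asymptotically chi-square with
    2 degrees of freedom, whose survival function is exp(-x / 2).
    """
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)
```

At the 5% level, a p-value above 0.05 corresponds to non-rejection of normality. The asymptotic approximation can be rough in small cross-sections.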
In order to accept the APT pricing model, we require the statistical significance of at least one parameter lambda other than λ0, and the equality of the independent term to its theoretic value. In the models expressed in returns, this value is the average riskless interest rate:
H0: λ0 = average riskless interest rate
and in the models expressed in excesses of the riskless interest rate, it is zero:
H0: λ0 = 0
We used Wald’s test to confirm these equalities.
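For a single restriction on the intercept, the Wald statistic reduces to a squared standardized distance. The sketch below assumes that the standard error passed in is the (Newey-West corrected) standard error of the estimated λ0; the function name and arguments are hypothetical.

```python
import math

def wald_single_restriction(estimate, std_error, hypothesized):
    """Wald test of H0: coefficient = hypothesized value (one restriction).

    W = ((estimate - hypothesized) / std_error)^2 is asymptotically
    chi-square with 1 degree of freedom; its survival function at w
    equals erfc(sqrt(w / 2)).
    """
    w = ((estimate - hypothesized) / std_error) ** 2
    return w, math.erfc(math.sqrt(w / 2.0))
```

In the models in returns, `hypothesized` would be the average riskless interest rate; in the models in excesses, it would be zero. Non-rejection of H0 is the outcome consistent with the APT.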
In Table 7, we present a summary of the results of the econometric contrast for the four databases. In general, the explanatory power measured by the adjusted R-squared (R2*), the statistical significance of the multivariate test (F), and the Jarque-Bera normality test of the residuals are very good in almost all the contrasted models. The univariate t-tests for the individual statistical significance of the parameters indicate that from one to five factors, exclusive of λ0, were priced in the weekly and daily databases, thus giving evidence in favor of the APT in 27 models.
Model | λ0 | λ1 | λ2 | λ3 | λ4 | λ5 | λ6 | λ7 | λ8 | λ9 | R2* | λsig/λtot | F | WALD | J-B
Database of weekly returns.
Model with 2 betas | ● | ● | ● | | | | | | | | 5.78% | 0.00% | ● | ○ | ○
Model with 3 betas | 0.00530 | ● | ● | 0.01665 | | | | | | | 46.78% | 33.33% | ○ | ● | ○
Model with 4 betas | 0.00546 | ● | -0.01492 | -0.01219 | ● | | | | | | 46.58% | 50.00% | ○ | ● | ○
Model with 5 betas | 0.00507 | ● | -0.01770 | ● | ● | ● | | | | | 47.28% | 20.00% | ○ | ● | ○
Model with 6 betas | 0.00546 | ● | -0.01899 | ● | ● | ● | ● | | | | 44.21% | 16.67% | ○ | ○ | ○
Model with 7 betas | 0.00505 | ● | 0.02035 | ● | ● | ● | ● | ● | | | 38.45% | 14.29% | ● | ● | ●
Model with 8 betas | 0.00557 | ● | 0.01043 | -0.01765 | ● | ● | ● | ● | ● | | 49.69% | 25.00% | ○ | ○ | ○
Model with 9 betas | 0.00557 | ● | ● | ● | ● | -0.01158 | ● | ● | ● | ● | 34.51% | 11.11% | ● | ○ | ○
Database of weekly excesses.
Model with 2 betas | ● | ● | ● | | | | | | | | 17.81% | 0.00% | ● | ○ | ○
Model with 3 betas | 0.00376 | ● | ● | 0.01662 | | | | | | | 37.21% | 33.33% | ○ | ● | ○
Model with 4 betas | 0.00341 | ● | -0.01774 | 0.00890 | ● | | | | | | 45.25% | 50.00% | ○ | ● | ○
Model with 5 betas | ● | ● | ● | ● | ● | ● | | | | | -29.79% | 0.00% | ● | ○ | ○
Model with 6 betas | 0.00249 | ● | ● | ● | ● | ● | 0.01716 | | | | 39.81% | 16.67% | ○ | ● | ○
Model with 7 betas | ● | ● | ● | ● | ● | ● | 0.01431 | -0.00499 | | | 31.63% | 14.29% | ● | ○ | ○
Model with 8 betas | ● | ● | ● | ● | ● | ● | ● | ● | -0.01046 | | 9.34% | 12.50% | ● | ○ | ○
Model with 9 betas | 0.00450 | ● | -0.01257 | ● | ● | 0.01049 | ● | 0.01246 | -0.01057 | 0.00941 | 63.49% | 55.56% | ○ | ● | ○
Database of daily returns.
Model with 2 betas | ● | ● | ● | | | | | | | | -2.48% | 0.00% | ● | ○ | ○
Model with 3 betas | 0.00055 | ● | -0.00302 | ● | | | | | | | 30.49% | 33.33% | ○ | ○ | ○
Model with 4 betas | 0.00108 | ● | 0.00286 | -0.00262 | ● | | | | | | 52.34% | 50.00% | ○ | ● | ○
Model with 5 betas | 0.00105 | ● | ● | -0.00254 | ● | ● | ● | | | | 46.41% | 20.00% | ○ | ○ | ○
Model with 6 betas | ● | ● | ● | ● | ● | 0.00290 | -0.00162 | | | | 40.33% | 33.33% | ○ | ○ | ○
Model with 7 betas | ● | ● | ● | ● | 0.00288 | ● | 0.00118 | ● | | | 40.22% | 28.57% | ○ | ○ | ○
Model with 8 betas | 0.00131 | 0.00243 | 0.00329 | ● | 0.00281 | ● | ● | ● | 0.002665 | | 56.08% | 50.00% | ○ | ● | ○
Model with 9 betas | ● | ● | -0.00353 | ● | ● | 0.00287 | ● | ● | ● | 0.001 | 69.62% | 33.33% | ○ | ○ | ○
Database of daily excesses.
Model with 2 betas | ● | ● | ● | | | | | | | | -1.91% | 0.00% | ● | ○ | ○
Model with 3 betas | ● | ● | 0.00318 | ● | | | | | | | 34.55% | 33.33% | ○ | ○ | ○
Model with 4 betas | ● | ● | ● | 0.00244 | ● | | | | | | 50.53% | 25.00% | ○ | ○ | ○
Model with 5 betas | ● | ● | -0.00289 | ● | ● | ● | | | | | 39.87% | 20.00% | ○ | ○ | ○
Model with 6 betas | ● | ● | ● | ● | 0.00309 | ● | ● | | | | 36.25% | 16.67% | ○ | ○ | ○
Model with 7 betas | ● | ● | 0.00222 | ● | ● | ● | -0.00287 | ● | | | 45.30% | 28.57% | ○ | ○ | ○
Model with 8 betas | ● | -0.00197 | ● | ● | 0.00096 | ● | 0.00283 | ● | ● | | 44.95% | 37.50% | ○ | ○ | ○
Model with 9 betas | ● | 0.00300 | -0.00183 | 0.00250 | ● | -0.00076 | ● | ● | 0.002742 | 0.00109 | 78.98% | 66.67% | ○ | ○ | ○
Notes:
(1) The level of statistical significance used in all the tests was 5%.
(2) Empty circles mean that the required results in the different tests were fulfilled, whereas filled circles mean that those tests were not passed according to the null hypotheses posed in each of them.
(3) λj: Estimated coefficients. H0: λj = 0. Numeric value of the coefficient = Rejection of H0; parameter significant. ● = Non-rejection of H0; parameter not significant.
(4) R2*: Adjusted R-squared = Explanatory capacity of the model.
(5) λsig/λtot: Ratio of the number of significant lambdas to the total number of lambdas in the model.
(6) F: Global statistical significance of the model. H0: λ1 = λ2 = … = λk = 0. ○ = Rejection of H0; model globally significant. ● = Non-rejection of H0; model globally not significant.
(7) Wald: Wald's test for coefficient restrictions. Databases in returns: H0: λ0 = Average riskless interest rate. Databases in excesses: H0: λ0 = 0. ○ = Non-rejection of H0; the independent term is equal to its theoretic value. ● = Rejection of H0; the independent term is not equal to its theoretic value.
(8) J-B: Jarque-Bera test for normality of the residuals. H0: Normality. ○ = Non-rejection of H0; the residuals are normally distributed. ● = Rejection of H0; the residuals are not normally distributed.
Nevertheless, only four models fulfilled both the statistical significance of the parameters and the equality of the independent term to its theoretic value, in addition to the fulfilment of normality in the residuals.
These models appear marked in Table 7, where we followed the same methodology of presentation and analysis of the results as in our preceding paper [28].
4 Conclusions
Our results showed that the data of the Mexican Stock Exchange used in the study presented univariate and multivariate non-Gaussianity, revealing that classic techniques such as PCA and FA will produce a biased estimation of the betas.
This finding led us directly to techniques more suitable for non-Gaussian series, such as ICA. Using the ICASSO methodology, ICA produces a more reliable and realistic estimation of the underlying generative multifactor model of returns on equities than PCA and FA, since it is capable of extracting the underlying systematic risk factors from non-Gaussian financial time series and solves the problem that regular ICA model estimation presents.
Regarding the results of our empirical study, on the one hand, the reconstruction of the observed signals with our estimated ICA model, using fewer factors than original variables, was suitable. On the other hand, our econometric contrast of the APT in the stocks and periods used in this study produced signals in favor of the APT, revealing from 1 to 5 priced factors in the statistically significant models.
Compared with the results of our previous study [28], and given the univariate and multivariate non-Gaussianity of the financial time series used in both studies, we find that, from a theoretical standpoint, the underlying systematic factors extracted using ICA would represent a more reliable estimation than that produced by PCA and FA. Nevertheless, from an empirical stance, both the reconstruction of the observed data and the results of the econometric contrast of the APT were, in general, similar. Further research will be needed in order to compare the performance of these extraction techniques in this context.