1 Introduction
Recently, a general approach to the representation and construction of correlation and association coefficients was proposed [1-3]. These coefficients are considered as (correlation or association) functions of two arguments defined on a set with an involutive operation and taking values in the interval $[-1, 1]$.
Correlation functions can be generated by similarity and dissimilarity functions satisfying suitable properties [1-3,6]. Moreover, a one-to-one correspondence exists between correlation and bipolar similarity functions [2]. For this reason, correlation functions are also referred to as invertible similarity correlations.
These results pave the way for constructing correlation functions on almost any set, provided that one can define an involutive operation and a suitable similarity or dissimilarity function on this set. This approach was used in [7] to introduce a bipolar dissimilarity function and the corresponding correlation function on the set of probability and relative frequency distributions. The construction used the involutive negation of probability distributions [8]. Surprisingly, the constructed correlation function coincided with the Pearson product-moment correlation coefficient [4,5]. The authors of [7] used this correlation function to calculate the correlation between frequency distributions in contingency tables.
Usually, these frequency distributions are defined on a set of categories of categorical variables presented in contingency tables [9-12].
In this paper, we apply the dissimilarity and correlation functions proposed in [7] to analyze the cognitive decline severity in asthmatics.
Currently, the manifestation of cognitive impairment in various diseases, including bronchial asthma, is being actively investigated [13]. Cognitive dysfunctions affect the control and development of asthma, which determines the relevance of more detailed studies of the role of cognitive impairment in the course of the disease. Various factors, including age, disease duration, education, lifestyle, etc., determine the degree of cognitive decline in asthmatics.
As a rule, existing tests for assessing cognitive impairment are analyzed by traditional statistical analysis methods allowing us to determine the significance of associations between the studied indicators [9,10]. In this paper, we analyze new data about relationships between the sociodemographic data of patients and the severity of cognitive decline in asthmatics published in [14].
The paper has the following structure. Section 2 discusses the basics of the theory of invertible similarity correlations.
The dissimilarity and correlation functions introduced in [7] for the analysis of relationships between frequency distributions and categorical data are considered in Section 3. Section 4 presents the data from [14]. Section 5 describes the results of the analysis of these data using the proposed method. The last Section discusses the obtained results, draws conclusions, and outlines future work.
2 Basics of Correlation Functions
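Consider the basic definitions and results related to correlation functions [1-3]. Let $\Omega$ be a nonempty set
and $N: \Omega \to \Omega$ an involutive operation on $\Omega$, called a negation, i.e., $N(N(x)) = x$ for all $x \in \Omega$. An element $x \in \Omega$ satisfying $N(x) = x$
is called a fixed point of the negation $N$.
Denote by $FP(N)$ the set of all fixed points of $N$ in $\Omega$.
For any element $x \in \Omega$, the elements $x$ and $N(x)$ are called opposite.
A correlation function (association measure) on a set $V \subseteq \Omega \setminus FP(N)$ closed under $N$ is a function $A: V \times V \to [-1, 1]$ satisfying, for all $x, y \in V$, the following properties:
A1. Symmetry: $A(x, y) = A(y, x)$;
A2. Reflexivity: $A(x, x) = 1$;
A3. Inverse relationship: $A(x, N(y)) = -A(x, y)$.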
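Usually, correlation and association coefficients used in statistics are calculated between real variables, dichotomous variables, rankings, etc., without explicitly considering an involutive operation on the corresponding set of variables [4, 5]. For such coefficients taking values in the interval $[-1, 1]$, a suitable involutive operation can usually be introduced so that properties A1-A3 hold [1-3]; for example, the Pearson coefficient satisfies them with the negation $N(x) = -x$.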
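From properties A1-A3, it follows that the correlation function $A$ satisfies, for all $x$ in its domain and with $N$ denoting the negation, the following property:
Opposite elements: $A(x, N(x)) = -A(x, x) = -1$.
Hence the correlation between opposite elements equals $-1$.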
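Properties A1-A3 and the derived property for opposite elements can be checked numerically. The sketch below is only an illustration: it assumes the Pearson coefficient on real-valued samples as the correlation function and the sign change $N(x) = -x$ as the negation. The helper names `pearson` and `negate` and the sample values are hypothetical and are not taken from [1-3].

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equally long samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator


def negate(x):
    """Assumed involutive negation N(x) = -x, so that N(N(x)) = x."""
    return [-a for a in x]


# Hypothetical sample values, used only for illustration.
x = [1.0, 2.0, 4.0, 7.0]
y = [2.0, 1.0, 5.0, 6.0]

a_xy = pearson(x, y)
print("A1 symmetry:            ", math.isclose(a_xy, pearson(y, x)))
print("A2 reflexivity:         ", math.isclose(pearson(x, x), 1.0))
print("A3 inverse relationship:", math.isclose(pearson(x, negate(y)), -a_xy))
print("Opposite elements:      ", math.isclose(pearson(x, negate(x)), -1.0))
```

The check for opposite elements is not independent of the others: it is the special case of A3 with $y = x$ combined with A2.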
Correlation functions can be constructed from similarity and dissimilarity functions [1-3].
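A function $D: V \times V \to [0, 1]$ on the domain $V$ of the correlation function is called a dissimilarity function if, for all $x, y \in V$, it is symmetric: $D(x, y) = D(y, x)$,
and irreflexive: $D(x, x) = 0$.
A function $S: V \times V \to [0, 1]$ is called a similarity function if, for all $x, y \in V$, it is symmetric: $S(x, y) = S(y, x)$,
and reflexive: $S(x, x) = 1$.
Similarity $S$ and dissimilarity $D$ functions are called complementary if $S(x, y) + D(x, y) = 1$ for all $x, y \in V$.
These functions are called bipolar if, for all $x, y \in V$, $S(x, N(y)) = 1 - S(x, y)$ and $D(x, N(y)) = 1 - D(x, y)$, where $N$ is the negation.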
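There exists a one-to-one correspondence between invertible correlation functions $A$ and bipolar similarity functions $S$ (dissimilarity functions $D$) [2]:
$A(x, y) = 2S(x, y) - 1$, $S(x, y) = \frac{1}{2}(A(x, y) + 1)$, (4)
$A(x, y) = 1 - 2D(x, y)$, $D(x, y) = \frac{1}{2}(1 - A(x, y))$. (5)
These three functions compose the complementary triplet $(A, S, D)$.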
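As follows from (4), the invertible correlation function is nothing but a rescaled bipolar similarity function. For this reason, the correlation function is also referred to as a similarity correlation function. From (4), we see that the similarity values from the interval $[0, 1]$ are mapped linearly onto the correlation values in the interval $[-1, 1]$.
As we can see, the correlation is positive if the similarity is greater than 0.5 (equivalently, the dissimilarity is less than 0.5), negative if the similarity is less than 0.5, and zero if both are equal to 0.5. Denoting the correlation, similarity, and dissimilarity functions by $A$, $S$, and $D$, we have:
$A(x, y) > 0 \iff S(x, y) > 0.5 \iff D(x, y) < 0.5$. (6)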
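The correspondence between the three functions amounts to a linear rescaling, which the following minimal sketch makes explicit. It assumes the relations $A = 2S - 1 = 1 - 2D$ for the correlation $A$, the bipolar similarity $S$, and the complementary dissimilarity $D = 1 - S$; the function names are hypothetical and serve only as an illustration.

```python
def correlation_from_similarity(s):
    """Rescale a similarity value from [0, 1] to a correlation in [-1, 1]."""
    return 2.0 * s - 1.0


def correlation_from_dissimilarity(d):
    """Rescale a dissimilarity value from [0, 1] to a correlation in [-1, 1]."""
    return 1.0 - 2.0 * d


def similarity_from_correlation(a):
    """Inverse rescaling: correlation in [-1, 1] to similarity in [0, 1]."""
    return (a + 1.0) / 2.0


def dissimilarity_from_correlation(a):
    """Inverse rescaling: correlation in [-1, 1] to dissimilarity in [0, 1]."""
    return (1.0 - a) / 2.0


for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    d = 1.0 - s  # complementary dissimilarity
    a = correlation_from_similarity(s)
    sign = "positive" if a > 0 else "negative" if a < 0 else "zero"
    print(f"S = {s:.2f}, D = {d:.2f}, A = {a:+.2f} ({sign})")
```

The loop shows that the sign of the correlation changes exactly at the similarity value 0.5.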
In the following Section, we show how the correlation between frequency distributions is constructed using the involutive negation of probability distributions and a suitable dissimilarity function between distributions.
3 Correlation of Frequency and Probability Distributions
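The correlation of frequency distributions was introduced in [7]. We briefly describe how the steps from the previous Section are used to construct the corresponding correlation function. Suppose $P = (p_1, \ldots, p_n)$ is a probability distribution, i.e., $p_i \ge 0$ for all $i = 1, \ldots, n$ and $\sum_{i=1}^{n} p_i = 1$.
We can consider $p_i$ as the relative frequency of the $i$-th category of a categorical variable.
Let $N(P) = (N(p_1), \ldots, N(p_n))$ be the involutive negation of the probability distribution $P$ [8]:
$N(p_i) = \frac{\max(P) + \min(P) - p_i}{n(\max(P) + \min(P)) - 1}$,
where $\max(P) = \max_i p_i$ and $\min(P) = \min_i p_i$.
The uniform probability distribution $P_U = (1/n, \ldots, 1/n)$
is a unique fixed point of the negation $N$.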
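The bipolar dissimilarity function $D(P, Q)$ between probability distributions $P = (p_1, \ldots, p_n)$ and $Q = (q_1, \ldots, q_n)$ different from the uniform distribution was introduced in [7].
It defines by (5) the invertible correlation function [7]:
$A(P, Q) = 1 - 2D(P, Q) = \frac{\sum_{i=1}^{n} (p_i - \frac{1}{n})(q_i - \frac{1}{n})}{\sqrt{\sum_{i=1}^{n} (p_i - \frac{1}{n})^2 \sum_{i=1}^{n} (q_i - \frac{1}{n})^2}}$.
This correlation function coincides with the Pearson product-moment correlation coefficient, because $1/n$ is the mean of the components of any probability distribution of length $n$.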
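The construction can be illustrated with a short sketch. It assumes the negation $N(p_i) = (\max(P) + \min(P) - p_i) / (n(\max(P) + \min(P)) - 1)$, which is our reading of [8], and computes the correlation as the Pearson coefficient of the two distributions, whose components always have the mean $1/n$. The function names and the example distributions are illustrative only.

```python
import math


def negation(p):
    """Assumed involutive negation of a probability distribution (after [8])."""
    c = max(p) + min(p)
    k = len(p) * c - 1.0  # normalizing constant; equals 1 for the uniform distribution
    return [(c - p_i) / k for p_i in p]


def correlation(p, q):
    """Pearson correlation of two probability distributions of equal length n."""
    n = len(p)
    u = [p_i - 1.0 / n for p_i in p]  # deviations from the uniform distribution
    v = [q_i - 1.0 / n for q_i in q]
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("correlation is undefined for the uniform distribution")
    return sum(a * b for a, b in zip(u, v)) / norm


def dissimilarity(p, q):
    """Dissimilarity recovered from the correlation: D = (1 - A) / 2."""
    return (1.0 - correlation(p, q)) / 2.0


# Illustrative distributions with three categories.
p = [0.545, 0.31, 0.145]
q = [0.475, 0.35, 0.175]

print("N(P) sums to 1:       ", math.isclose(sum(negation(p)), 1.0))
print("N(N(P)) equals P:     ",
      all(math.isclose(a, b) for a, b in zip(negation(negation(p)), p)))
print("A(P, N(Q)) = -A(P, Q):",
      math.isclose(correlation(p, negation(q)), -correlation(p, q)))
```

The uniform distribution is excluded explicitly because it is the fixed point of the negation and its deviations from $1/n$ are all zero, so the denominator vanishes.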
In the following Section, we use this correlation function to analyze cognitive decline severity in asthmatics.
4 Data of Cognitive Declines in Asthmatics
In the literature on the effect of cognitive impairment on various characteristics of patients with asthma, the article by Haq Satti et al. [14] drew our attention. These authors studied associations between sociodemographic factors and cognitive decline severity in asthmatics. Table 1 presents the results obtained in that work.
Table 1 Characteristics of the asthmatic patients and their cognitive decline severity. Adapted from [14]
Factors | No Cog. Decline, N (%) | Mild Cog. Decline, N (%) | Moder. Cog. Decline, N (%) | Severe Cog. Decline, N (%) |
Total | 68 (50.4%) | 45 (33.3%) | 16 (11.8%) | 6 (4.4%) |
Age | | | | |
25-40 | 30 (44.1%) | 17 (37.8%) | 6 (37.5%) | 2 (33.3%) |
>40 | 38 (55.9%) | 28 (62.2%) | 10 (62.5%) | 4 (66.7%) |
Education | | | | |
10 or less | 53 (77.9%) | 32 (71.1%) | 12 (75%) | 4 (66.7%) |
>10 | 15 (22.1%) | 13 (28.9%) | 4 (25%) | 2 (33.3%) |
Duration of Illness | | | | |
<5 years | 63 (92.6%) | 32 (71.1%) | 13 (81.2%) | 3 (50%) |
>5 years | 5 (7.4%) | 13 (28.9%) | 3 (18.8%) | 3 (50%) |
Tobacco Smoking | | | | |
Non Smoker | 34 (50%) | 16 (35.5%) | 5 (31.2%) | 2 (33.3%) |
Smoker | 34 (50%) | 29 (64.5%) | 11 (68.8%) | 4 (66.7%) |
Poly-Pharmacy | | | | |
No | 36 (52.9%) | 12 (26.6%) | 4 (25%) | 3 (50%) |
Yes | 32 (47.1%) | 33 (73.4%) | 12 (75%) | 3 (50%) |
Note: Cog. – cognitive; Moder. – moderate
The study [14] showed that Duration of Illness and Poly-Pharmacy were closely associated with the presence and severity of cognitive decline (p=0.005 and p=0.019, respectively).
In this paper, we analyzed the presented data using the similarity and correlation of frequency distributions considered in the previous Section.
5 Results
Since the number of patients with severe cognitive decline in Table 1 is small, we combined the last two columns into one.
In addition, we transformed the number of patients in each cell of the table into a relative frequency so that the values in every row sum to 1 (see Table 2).
Table 2 Characteristics of the asthmatic patients and their cognitive decline severity (modified Table 1)
Factors | No Cog. Decline | Mild Cog. Decline | Moder. + Severe Cog. Decline |
Total | N=68 | N=45 | N=22 |
Age | | | |
25-40 (n=55) | 30 (0.545) | 17 (0.31) | 8 (0.145) |
>40 (n=80) | 38 (0.475) | 28 (0.35) | 14 (0.175) |
Education | | | |
10 or less (n=101) | 53 (0.52) | 32 (0.32) | 16 (0.16) |
>10 (n=34) | 15 (0.44) | 13 (0.38) | 6 (0.18) |
Duration of Illness | | | |
<5 years (n=111) | 63 (0.57) | 32 (0.29) | 16 (0.14) |
>5 years (n=24) | 5 (0.21) | 13 (0.54) | 6 (0.25) |
Tobacco Smoking | | | |
Non Smoker (n=57) | 34 (0.60) | 16 (0.28) | 7 (0.12) |
Smoker (n=78) | 34 (0.44) | 29 (0.37) | 15 (0.19) |
Poly-Pharmacy | | | |
No (n=55) | 36 (0.65) | 12 (0.22) | 7 (0.13) |
Yes (n=80) | 32 (0.4) | 33 (0.41) | 15 (0.19) |
Note: Cog. – cognitive; Moder. – moderate
As a result, we obtain for each of the five factors two relative frequency (probability) distributions of the categorical variable Cognitive Decline Severity containing three levels (categories): No Cognitive Decline, Mild Cognitive Decline, and Moderate or Severe Cognitive Decline.
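The transformation of Table 1 into Table 2 can be sketched as follows. The counts are taken from Table 1; the dictionary layout and the helper names `merge_last_two` and `relative_frequencies` are illustrative assumptions, not part of the original analysis.

```python
# Counts from Table 1: (no, mild, moderate, severe) cognitive decline.
counts = {
    ("Age", "25-40"): (30, 17, 6, 2),
    ("Age", ">40"): (38, 28, 10, 4),
    ("Education", "10 or less"): (53, 32, 12, 4),
    ("Education", ">10"): (15, 13, 4, 2),
    ("Duration of Illness", "<5 years"): (63, 32, 13, 3),
    ("Duration of Illness", ">5 years"): (5, 13, 3, 3),
    ("Tobacco Smoking", "Non Smoker"): (34, 16, 5, 2),
    ("Tobacco Smoking", "Smoker"): (34, 29, 11, 4),
    ("Poly-Pharmacy", "No"): (36, 12, 4, 3),
    ("Poly-Pharmacy", "Yes"): (32, 33, 12, 3),
}


def merge_last_two(row):
    """Combine the moderate and severe columns into one."""
    return row[:-2] + (row[-2] + row[-1],)


def relative_frequencies(row):
    """Divide each count by the row total so that the row sums to 1."""
    total = sum(row)
    return tuple(c / total for c in row)


table2 = {key: relative_frequencies(merge_last_two(row))
          for key, row in counts.items()}

for (factor, level), dist in table2.items():
    print(f"{factor:20s} {level:11s}", " ".join(f"{x:.3f}" for x in dist))
```

Keeping the unrounded relative frequencies in `table2` avoids rounding effects in any later computation.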
Denoting the first of each pair of distributions by $P$ and the second by $Q$, we obtain the following values of the dissimilarity $D(P, Q)$, the similarity $S(P, Q)$, and the correlation $A(P, Q)$:
Age:
25-40: P = (0.545, 0.31, 0.145),
>40: Q = (0.475, 0.35, 0.175),
$D(P, Q) = 0.0100$, $S(P, Q) = 0.9900$, $A(P, Q) = 0.9800$;
Education:
10 or less: P = (0.52, 0.32, 0.16),
>10: Q = (0.44, 0.38, 0.18),
$D(P, Q) = 0.0372$, $S(P, Q) = 0.9628$, $A(P, Q) = 0.9256$;
Duration of Illness:
<5 years: P = (0.57, 0.29, 0.14),
>5 years: Q = (0.21, 0.54, 0.25),
$D(P, Q) = 0.6464$, $S(P, Q) = 0.3536$, $A(P, Q) = -0.2928$;
Tobacco Smoking:
Non Smoker: P = (0.60, 0.28, 0.12),
Smoker: Q = (0.44, 0.37, 0.19),
$D(P, Q) = 0.0513$, $S(P, Q) = 0.9487$, $A(P, Q) = 0.8973$;
Poly-Pharmacy:
No: P = (0.65, 0.22, 0.13),
Yes: Q = (0.40, 0.41, 0.19),
$D(P, Q) = 0.2030$, $S(P, Q) = 0.7970$, $A(P, Q) = 0.5941$.
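The values above can be recomputed with the following minimal sketch. It assumes that the correlation is the Pearson coefficient of the two rows of relative frequencies and that the similarity and dissimilarity follow from it by $S = (1 + A)/2$ and $D = (1 - A)/2$. The sketch starts from the merged counts of Table 2 rather than from the rounded frequencies in order to avoid rounding effects; the names `rows`, `distribution`, and `correlation` are illustrative.

```python
import math

# Merged counts from Table 2: (no, mild, moderate or severe) cognitive decline
# for the first and the second level of each factor.
rows = {
    "Age": ((30, 17, 8), (38, 28, 14)),
    "Education": ((53, 32, 16), (15, 13, 6)),
    "Duration of Illness": ((63, 32, 16), (5, 13, 6)),
    "Tobacco Smoking": ((34, 16, 7), (34, 29, 15)),
    "Poly-Pharmacy": ((36, 12, 7), (32, 33, 15)),
}


def distribution(counts):
    """Relative frequency distribution of a row of counts."""
    total = sum(counts)
    return [c / total for c in counts]


def correlation(p, q):
    """Pearson correlation of two distributions of equal length n."""
    n = len(p)
    u = [p_i - 1.0 / n for p_i in p]
    v = [q_i - 1.0 / n for q_i in q]
    norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return sum(a * b for a, b in zip(u, v)) / norm


results = {}
for factor, (first, second) in rows.items():
    a = correlation(distribution(first), distribution(second))
    results[factor] = {"D": (1.0 - a) / 2.0, "S": (1.0 + a) / 2.0, "A": a}
    print(f"{factor:20s} D = {results[factor]['D']:.4f}, "
          f"S = {results[factor]['S']:.4f}, A = {a:.4f}")
```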
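One can see that the relative frequency distributions $P$ and $Q$ for the factors Age, Education, and Tobacco Smoking are very similar: the similarity exceeds 0.94, and the correlation exceeds 0.89.
On the contrary, the frequency distributions $P$ and $Q$ for the factor Duration of Illness differ considerably.
The similarity is less than 0.5, the dissimilarity is greater than 0.5, and the correlation is negative (see also (6) for the relationship between these three functions).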
These results allow us to conclude that Cognitive Decline Severity is associated with the factor Duration of Illness because a change in the level of this factor is accompanied by a considerable change in the corresponding distributions. This result is consistent with the results of [14].
Although the difference between the distributions for Poly-Pharmacy is larger than for Age, Education, and Tobacco Smoking, the similarity between these distributions remains high (0.797), and the correlation is clearly positive (0.594).
For this reason, we can conclude that the association between Cognitive Decline Severity and Poly-Pharmacy is not very high.
6 Discussion and Conclusion
The method presented in this paper allows us to measure the similarity and difference in the frequency distributions of one categorical variable for different levels of another variable. Our method is based on calculating the similarities and correlations between the rows of the contingency table.
The proposed method for the association analysis of categorical data can complement the classical chi-square test as an additional assessment of the relationship.
A comparison of our results with those of [14], in which the Pearson chi-square test was used, showed the same associations for four factors. At the same time, our calculations revealed a substantial similarity and a positive correlation (r=0.6) between the distributions of the degree of cognitive decline for different levels of Poly-Pharmacy, indicating that the association between the considered variables is not strong.
These differences call for more detailed research on the relationship between our method and the classical methods used to analyze the association of categorical data.
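For comparison with the classical approach, the Pearson chi-square statistic can be computed for the merged 2 x 3 tables of Table 2. The sketch below is an illustration under stated assumptions and not a reproduction of [14]: the p-values reported there were obtained for the original table with four severity levels, so the numbers need not coincide. The function name is hypothetical, and the closed form $\exp(-x/2)$ for the p-value is valid only for two degrees of freedom.

```python
import math


def chi_square_2x3(first, second):
    """Pearson chi-square statistic and p-value for a 2 x 3 contingency table."""
    table = (first, second)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    # With (2 - 1) * (3 - 1) = 2 degrees of freedom, the survival function
    # of the chi-square distribution has the closed form exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Merged counts from Table 2 for two of the factors.
tables = {
    "Duration of Illness": ((63, 32, 16), (5, 13, 6)),
    "Poly-Pharmacy": ((36, 12, 7), (32, 33, 15)),
}

for factor, (first, second) in tables.items():
    statistic, p_value = chi_square_2x3(first, second)
    print(f"{factor:20s} chi-square = {statistic:.3f}, p = {p_value:.4f}")
```

The chi-square test addresses whether the two rows could come from the same distribution, whereas the correlation measures how similar their shapes are relative to the uniform distribution, so the two assessments need not agree.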
Frequency distributions appear in social-behavioral sciences, biology, medicine, marketing, business, etc. [9-12, 16-18]. We plan to apply the proposed method to data analysis in some of these areas. Another possible application of the considered methods is an analysis of relationships between subjective probability distributions and subjective weight distributions in models of probability reasoning and multicriteria decision-making [19].