The Role of WordNet Similarity in the Affective Analysis Pipeline

Segura Navarrete, Alejandra; Vidal-Castro, Christian; Rubio-Manzano, Clemente; Martínez-Araneda, Claudia

doi:10.13053/cys-23-3-3250


Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Comp. y Sist. vol.23 no.3 Ciudad de México jul./sep. 2019 Epub 09-Ago-2021

Articles of the Thematic Issue

The Role of WordNet Similarity in the Affective Analysis Pipeline

Alejandra Segura Navarrete¹*

Christian Vidal-Castro¹

Clemente Rubio-Manzano¹

Claudia Martínez-Araneda²

¹ Universidad del Bío-Bío, Facultad de Ciencias Empresariales, Concepción, Chile. asegura@ubiobio.cl, clrubio@ubiobio.cl, cvidal@ubiobio.cl

² Universidad Católica de la Santísima Concepción, Facultad de Ingeniería, Concepción, Chile. cmartinez@ucsc.cl

Abstract

Sentiment Analysis (SA) is a useful and important discipline in Computer Science, as it makes it possible to build a knowledge base about people's opinions on a topic. This knowledge is used to improve decision-making processes. One approach to this task is based on lexical knowledge structures. In particular, our aim is to enrich an affective lexicon by analyzing the similarity relationship between words. The hypothesis of this work is that the similarities of the words belonging to an affective category, with respect to any other word, behave homogeneously within each affective category. The experimental results show that words of the same affective category have a homogeneous similarity with an antonym, and that the similarities of these words with any of their antonyms have low variability. The novelty of this paper is that it lays the foundations of a mechanism for automatically incorporating intensity into an affective lexicon.

Keywords: Natural language processing; computational linguistics; affective computing; sentiment analysis; knowledge representation

1 Introduction

Nowadays, Sentiment Analysis is a useful and important discipline in Computer Science. It yields potentially valuable knowledge about users' perceptions, expectations and attitudes, which can improve decision-making regarding products and marketing strategies, among other uses. Sentiment Analysis has been applied not only in business but also in very different areas such as recommender systems [1, 2], electoral analysis [3], management of virtual museums [4] and multilingual processing [5, 6], among others.

In general terms, there are two approaches to perform this type of analysis according to [7]. The first one uses a corpus of tagged texts to build a classifier trained to execute this task. This approach uses supervised learning techniques that come from machine learning and statistics [8]. The second approach uses lexical resources, such as dictionaries or lexicons, which are defined as a previously tagged set of words [9], i.e., every word is tagged according to its orientation [10]. There are two main lines of work. The first is the identification of positive and negative opinions, emotions and evaluations, using computing tools to assign a polarity to the content [11].

The second is the estimation of the affective aspect of a text [9], called Affective Analysis, where a lexicon contains a set of words classified according to the emotions they represent [12, 13]. The emotion expressed in a sentence or text is obtained by considering the emotions of all the words contained in that text [14].

The classification process in Sentiment Analysis is simpler than in Affective Analysis. The former classifies into two or three categories, either {positive, negative} or {positive, negative, neutral}, whereas the latter can use many categories, depending on the emotion model adopted. In addition, a text can express more than one emotion, and two different texts can express the same emotion with different intensities.

In lexicon-based Affective Analysis, the results depend on the quality and completeness of the lexicon used in the process. Affective lexicons group words by affective category. This only allows a bag-of-words analysis, since the words of each affective category carry no affective intensity information; therefore, it is not possible to determine the affective profile of a document. In Sentiment Analysis, some works have improved the quality of affective lexicons by adding information such as valence, arousal and dominance [15-17]. In lexicon-based Affective Analysis, studies are mainly aimed at increasing the number of words in an affective lexicon [18].

The objective of this paper is to enrich an affective lexicon by analyzing the similarity relationship between words. The hypothesis of this work is that the similarities of the words belonging to an affective category, with respect to any other word, behave homogeneously within each affective category. We found evidence supporting this hypothesis. This finding will allow us to determine intensities for the emotions of an affective category and to improve the automatic enrichment of affective lexicons.

The rest of the article is structured as follows: Section 2 presents the background and a brief state of the art on the use of lexicons in affective analysis and on similarity measures between two words. Section 3 presents the hypothesis and the experiments performed to test it. Section 4 presents the results and discusses the similarity behavior of the words in the lexicon's affective classes. Finally, conclusions and future work are presented in Section 5.

2 Background and Related Work

Lexicons commonly used in Affective Analysis adopt different classifications of basic emotions, assuming that all other emotions derive from these subsets.

For example, Ekman proposed six categories in [19]: anger, disgust, fear, joy, sadness and surprise. An affective lexicon called WordNet-Affect was proposed in [20]; it was built on the WordNet knowledge base through the selection and tagging of affective concepts. This initial base was extended using sentences and patterns extracted from Open Mind Common Sense [21]. WordNet-Affect classifies words into Ekman's six categories.

Each word in the lexicon carries lexical and affective information, for example, the role of the word in speech (part of speech) and its classification according to an emotion theory or representation, among others. Another affective lexicon was proposed in [22], which considers 8 affective categories: anger, anticipation, disgust, fear, joy, sadness, surprise and trust. This lexicon was generated from a list of affective words extracted from the Thesaurus WordNet-Affect and the most frequent words in the Google n-gram corpus [23].

In lexicon-based Affective Analysis, the main objective of the studies is to increase the number of words in an affective lexicon. For example, the authors in [18] present an approach for the Japanese language in which a similarity metric was used to expand a small group of emotionally charged words (containing 503 nouns) into an emotion dictionary (containing 15612 verbs). Other studies have improved lexical resources through integration [24] or through the creation of lexicons for specific domains [25, 26].

One aspect to consider is that a lexicon can be used in different domains, which may imply that a word represents different affects in different domains [27]. On the other hand, in order to incorporate the concept of semantic similarity among words into Affective Analysis, it is necessary to analyze both the metrics based on structure and the metrics based on Information Content (IC).

Structure-based semantic similarity measures use variables such as the lowest common ancestor (LCA) or least common subsumer (LCS), the local specificity of the subtree that contains the concepts, the distance between concepts and the types of relationships involved between them (as shown in Figure 1 and Figure 2).

Fig. 1 Depth of the lowest common predecessor

Fig. 2 Local density of the subtree, distance between concepts and types of relationships involved between two concepts

For example, the Path measure proposed in [28] is based on the shortest path that connects the senses in the "is-a" relationship of WordNet (1). The Wu & Palmer measure, proposed in [29], is calculated from the depth of the LCA and the number of links between the concepts and their common ancestor (2). The measure proposed in [30] is calculated from the depth of each of the concepts, the LCA depth and the shortest distance between concepts. Another proposal from the same authors [31] also includes local specificity. Finally, in [32] the shortest path between concepts, the LCA depth and empirical information are considered.

In addition to the previous ones, the Leacock & Chodorow metric (3) was used in this work, since it determines how similar two senses are based on the shortest path that connects them (as above) and the maximum depth of the taxonomy in which they occur:

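$$\mathrm{Sim}_{path}(c_1, c_2) = -\log \mathrm{pathlen}(c_1, c_2), \qquad (1)$$

$$\mathrm{Sim}_{Wu\&P}(c_1, c_2) = \frac{2\,\mathrm{Depth}(LCA)}{\mathrm{Depth}_1 + \mathrm{Depth}_2}, \qquad (2)$$

$$\mathrm{Sim}_{L\&CH}(c_1, c_2) = -\log \frac{\mathrm{Len}_{LCA}}{2\,\mathrm{maxDepth}_{LCA.pos}}, \qquad (3)$$

where Depth1 = min(depth({tree in T1 | tree contains LCA}))

and Depth2 = min(depth({tree in T2 | tree contains LCA}))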
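As a minimal sketch of how the structure-based measures (2) and (3) can be evaluated, the following Python fragment works on a tiny hand-made "is-a" tree. The taxonomy, the function names and the convention of counting depth and path length in nodes are assumptions made for illustration; they are not taken from WordNet or from the WS4J implementation.

```python
import math

# Hypothetical toy "is-a" taxonomy (child -> parent); not real WordNet data.
PARENT = {
    "feeling": "state",
    "emotion": "feeling",
    "joy": "emotion",
    "sadness": "emotion",
    "gladness": "joy",
    "grief": "sadness",
}

def ancestors(c):
    """Return the chain [c, parent(c), ..., root]."""
    chain = [c]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def depth(c):
    """Depth counted in nodes, so the root has depth 1 (assumed convention)."""
    return len(ancestors(c))

def lca(c1, c2):
    """Lowest common ancestor: first node on c1's chain that is also on c2's chain."""
    seen = set(ancestors(c2))
    return next(a for a in ancestors(c1) if a in seen)

def path_len(c1, c2):
    """Number of nodes on the shortest is-a path between c1 and c2."""
    a = lca(c1, c2)
    return (depth(c1) - depth(a)) + (depth(c2) - depth(a)) + 1

def sim_wup(c1, c2):
    """Equation (2): 2 * Depth(LCA) / (Depth1 + Depth2)."""
    return 2 * depth(lca(c1, c2)) / (depth(c1) + depth(c2))

def sim_lch(c1, c2, max_depth):
    """Equation (3): -log(Len / (2 * maxDepth)), using natural logarithms."""
    return -math.log(path_len(c1, c2) / (2 * max_depth))

print(lca("gladness", "grief"), sim_wup("gladness", "grief"))
```

In this toy tree, gladness and grief meet at emotion, so the Wu & Palmer value depends only on how deep that shared node lies relative to the two concepts, while the Leacock & Chodorow value depends on the path length and on the maximum depth passed in by the caller.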

Semantic similarity measures based on information content consider the IC of the nodes, derived from the model and a statistical corpus, for example through measures such as term frequency and inverse document frequency (tf-idf). The more information (IC) two concepts share, the more similar they are. Some measures in this category are the ones proposed in [32-34]. Resnik's metric [34], shown in (4), considers only the IC of the LCA, whereas Jiang & Conrath's [33] and Lin's proposals improve on Resnik's measure by adding the IC of each of the concepts. Lin's measure (5) is based on Jiang & Conrath's proposal (6):

$$\mathrm{Sim}_{Res}(c_1, c_2) = -\log P(LCA(c_1, c_2)), \qquad (4)$$

$$\mathrm{Sim}_{Lin}(c_1, c_2) = \frac{2\,IC(LCA)}{IC(c_1) + IC(c_2)}, \qquad (5)$$

$$\mathrm{Sim}_{J\&C}(c_1, c_2) = \frac{1}{IC(c_1) + IC(c_2) - 2\,IC(LCA)}, \qquad (6)$$

where c1 and c2 are the concepts compared,

LCA(c1, c2) is the lowest common ancestor between c1 and c2,

P(c) is the probability of encountering an instance of concept c,

IC(LCA) is the information content of the LCA,

IC(c1) is the information content of c1,

IC(c2) is the information content of c2.
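The three IC-based measures can be sketched in a few lines of Python. The probabilities below are invented for illustration, IC is taken as the negative natural logarithm of P(c), and the handling of degenerate cases (returning zero for Lin when a concept has zero IC, returning infinity for Jiang & Conrath when the distance is zero) is a choice of this sketch rather than a description of any particular library.

```python
import math

# Hypothetical corpus probabilities P(c); illustrative values only.
P = {"entity": 1.0, "feeling": 0.01, "joy": 0.002, "sadness": 0.004}

def ic(c):
    """Information content IC(c) = -log P(c)."""
    return -math.log(P[c])

def sim_res(lca):
    """Equation (4): the IC of the lowest common ancestor."""
    return ic(lca)

def sim_lin(c1, c2, lca):
    """Equation (5); returns 0.0 when either concept has zero IC (assumed convention)."""
    ic1, ic2 = ic(c1), ic(c2)
    if ic1 == 0 or ic2 == 0:
        return 0.0
    return 2 * ic(lca) / (ic1 + ic2)

def sim_jcn(c1, c2, lca):
    """Equation (6): inverse of IC(c1) + IC(c2) - 2 IC(LCA)."""
    distance = ic(c1) + ic(c2) - 2 * ic(lca)
    # A zero distance (e.g. identical concepts) has no finite inverse;
    # this sketch returns infinity in that case.
    return 1 / distance if distance > 0 else math.inf

print(sim_res("feeling"), sim_lin("joy", "sadness", "feeling"), sim_jcn("joy", "sadness", "feeling"))
```

Note that `sim_res` takes only the LCA as input: any two pairs that share the same LCA receive the same Resnik value, regardless of the concepts compared.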

Recent works incorporate a metric [35] that uses IC and the number of nodes to simplify and improve the calculation of similarity between pairs of words contained in graphs. In lexicon-based sentiment analysis, some works have improved the quality of affective lexicons by adding new information such as valence, arousal and dominance [15-17]. These works rely on pointwise mutual information (PMI), latent semantic analysis (LSA) and the semantic proximity determined by the co-occurrence between words and benchmark terms to obtain an index of proximity.

3 Method

To test our hypothesis, an experiment was designed using the English affective lexicon proposed in [20], chosen for its availability and reputation in the affective computing area. This lexicon has 1080 words (w₁, w₂, ..., w₁₀₈₀), grouped into six affective categories (including repetitions): the anger category has 242 words, disgust has 50, fear has 143, joy has 387, sadness has 195 and surprise has 63.

From each category, the words with an affective connotation were selected, 371 in total (34.35%). The concept of affective connotation refers to words that have at least one synset in WordNet whose hypernyms include at least one of the following concepts: emotion, affect, emotionality, feeling and mood, hereinafter called affective ancestors. Among the selected words, four belong to more than one affective category: horror ∈ {fear, disgust}, dismay ∈ {fear, sadness}, suspense ∈ {joy, fear} and admiration ∈ {joy, surprise}. Of the 371 words selected, 100 belong to the anger category, 6 to disgust, 46 to fear, 145 to joy, 66 to sadness and 8 to surprise.
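The selection rule can be illustrated with a small Python sketch. The sense inventory below is hypothetical and hand-written; a real pipeline would read the senses and hypernym chains from WordNet, and the function names are introduced here only for illustration.

```python
# Affective ancestors listed in the text (singular concept names).
AFFECTIVE_ANCESTORS = {"emotion", "affect", "emotionality", "feeling", "mood"}

# Hypothetical sense inventory: word -> {sense label: hypernym chain}.
# A real pipeline would obtain senses and hypernyms from WordNet instead.
SENSES = {
    "grief": {"grief#n#1": ["sorrow", "sadness", "feeling", "state"]},
    "storm": {
        "storm#n#1": ["atmospheric_phenomenon", "natural_phenomenon"],
        "storm#n#2": ["disturbance", "commotion"],
    },
    "horror": {"horror#n#1": ["fear", "emotion", "feeling", "state"]},
}

def affective_senses(word):
    """Senses of `word` whose hypernym chain contains an affective ancestor."""
    return [
        sense
        for sense, hypernyms in SENSES.get(word, {}).items()
        if AFFECTIVE_ANCESTORS.intersection(hypernyms)
    ]

def has_affective_connotation(word):
    """True when at least one sense of `word` has an affective ancestor."""
    return bool(affective_senses(word))

# Keep only the words that would enter the experiment under this rule.
selected = [w for w in SENSES if has_affective_connotation(w)]
print(selected)
```

Keeping the matching sense label, and not only a yes/no answer, is convenient because the later similarity calculations operate on specific senses (word#pos#sensenumber) rather than on bare words.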

Considering that similarity between two words indicates the closeness between them, this work calculated the similarity between words of an affective category and an antonym.

The experiment was divided into two parts: The first one analyzes the behavior of words of an affective category based on the similarity of these words with an antonym (as shown in Figure 3). Regarding this, it is expected that, within each affective category, the variability of the similarity of words with their antonym is homogeneous. The second one (as shown in Figure 4) analyzes the similarity of words with 3 antonyms of the affective category. For this case, it is expected that the variability of the similarity of words, with each of their antonyms, is homogeneous and low.

Fig. 3 Research hypothesis and experiments (1/2)

Fig. 4 Research hypothesis and experiments (2/2)

An expert in the English language identified the antonyms, selecting the three best antonyms for each affective category.

It is worth mentioning that, although WordNet does not provide an antonym for each affective category, the ones proposed by the expert are available in WordNet.

Table 1 shows the three antonyms ranked according to the importance determined by the expert. It was verified that each antonym also has an affective connotation. In the table, each antonym is represented by the word itself, its part of speech and the sense number that carries the affective connotation (word#pos#sensenumber).

Table 1 List of antonyms for affective class

Class	Ranking	Antonym
Anger	1	happiness#n#1
	2	calmness#n#3
	3	peace#n#3
Disgust	1	fondness#n#1
	2	admiration#n#1
	3	love#n#1
Fear	1	fearlessness#n#1
	2	bravery#n#2
	3	confidence#n#2
Joy	1	sorrow#n#1
	2	sadness#n#1
	3	melancholy#n#1
Sadness	1	happiness#n#1
	2	joy#n#1
	3	gladness#n#1
Surprise	1	expectation#n#3
	2	calmness#n#3
	3	coolness#n#2
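The word#pos#sensenumber notation used in Table 1 is easy to handle programmatically. The following Python sketch parses such labels; the `Sense` type and `parse_sense` helper are names introduced here for illustration, and only two of the six classes from Table 1 are copied into the example.

```python
from typing import NamedTuple

class Sense(NamedTuple):
    word: str
    pos: str
    number: int

def parse_sense(label: str) -> Sense:
    """Split a 'word#pos#sensenumber' label into its three parts."""
    word, pos, number = label.split("#")
    return Sense(word, pos, int(number))

# Antonyms in expert ranking order for two classes, copied from Table 1.
ANTONYMS = {
    "anger": ["happiness#n#1", "calmness#n#3", "peace#n#3"],
    "sadness": ["happiness#n#1", "joy#n#1", "gladness#n#1"],
}

for label in ANTONYMS["anger"]:
    print(parse_sense(label))
```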

To obtain the similarities between the words of each affective category and each antonym, a Java application was developed using WS4J¹ (WordNet Similarity for Java), an API that implements several semantic relatedness/similarity algorithms.

4 Results and Discussion

The analysis of the similarity between the words and their three antonyms using the six metrics reveals a homogeneous behavior in each category; that is, the ranges of values over which the similarities vary are small. As an example, Figure 5 and Figure 6 show the results of the six metrics obtained for the words of the anger category with respect to their three antonyms (calmness#n#3, happiness#n#1, peace#n#3).

Fig. 5 Behavior of metrics based on structure applied to words of the anger affective category, with their three antonyms

Fig. 6 Behavior of the three metrics based on IC applied to words of the anger affective category, with their three antonyms

In Resnik's case, the similarity yields modest results for all words. This can be explained by the fact that its calculation is based exclusively on the IC of the LCA (4): for many words of the category and their antonyms, the closest common ancestor is the same, and it coincides with one of the concepts used to identify the affective character of a word (as shown in Figure 7).

Fig. 7 Low similarity case in Resnik

For example, considering the word gladness#n#1 as antonym of the sadness category and the word w₁=dispiritedness#n#1 as part of it, the result of Resnik's metric is Sim_Res(antonym, w₁) = 4.627, a value that repeats for 100% of the sadness category and 98% of all the words of the other affective categories, since most words share the common ancestor feeling#n#1.

The IC calculation proposed by Resnik (4) is based on the frequency of words in a corpus; in this example, the IC of the common ancestor feeling#n#1 is 4.627.
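A short worked step may make the constant Resnik values easier to see. Assuming every word w considered shares the lowest common ancestor feeling#n#1 with the antonym, equation (4) returns the same value for all of them, so the standard deviation over those words is zero. The conversion to a probability assumes natural logarithms, which is an assumption of this note and is not stated in the text.

```latex
\begin{align*}
\mathrm{Sim}_{Res}(\text{antonym}, w) &= IC(\text{feeling\#n\#1}) = -\log P(\text{feeling\#n\#1}) = 4.627 \quad \text{for every such } w,\\
P(\text{feeling\#n\#1}) &= e^{-4.627} \approx 0.0098 \quad \text{(natural logarithm assumed)},\\
\sigma\bigl(\{\mathrm{Sim}_{Res}(\text{antonym}, w_i)\}_{i=1}^{n}\bigr) &= 0 .
\end{align*}
```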

In general terms, for the metrics based on IC, a low or zero similarity could be due to the low frequency of some of the concepts in the vocabulary, even when both concepts are semantically related. This could explain many of the zero values obtained with the Jiang & Conrath and Lin metrics. In most cases, these values repeat for the same words with respect to their three antonyms.

In addition, when Lin's metric (5) is undetermined, the WS4J implementation returns zero. The same API returns zero when the IC of either of the two words is zero. The variabilities of the similarities of the words of each category with their antonyms, that is, the standard deviation of [Sim(antonym₁, w₁), Sim(antonym₁, w₂), ..., Sim(antonym₁, wₙ)], are summarized in Table 2.

Table 2 Variabilities of the similarity of words with their antonyms per category

Class	Antonym	Wu&P	J&C	L&CH	LIN	RES	PATH
Anger	calmness#n#3	0.123	0.051	0.159	0.237	0.000	0.029
	happiness#n#1	0.108	0.057	0.159	0.271	0.252	0.028
	peace#n#3	0.093	0.049	0.117	0.234	0.000	0.017
disgust	admiration#n#1	0.015	0.037	0.067	0.180	0.000	0.011
	fondness#n#1	0.015	0.034	0.067	0.169	0.000	0.011
	love#n#1	0.041	0.050	0.151	0.210	0.347	0.030
fear	bravery#n#2	0.037	0.039	0.160	0.185	0.000	0.031
	confidence#n#2	0.032	0.038	0.117	0.184	0.000	0.018
	fearlessness#n#1	0.037	0.039	0.160	0.185	0.000	0.031
joy	melancholy#n#1	0.086	0.044	0.190	0.218	0.576	0.031
	sadness#n#1	0.096	0.053	0.225	0.238	0.576	0.044
	sorrow#n#1	0.086	0.045	0.190	0.220	0.576	0.031
sadness	gladness#n#1	0.081	0.000	0.145	0.000	0.000	0.026
	joy#n#1	0.080	0.054	0.139	0.244	0.159	0.025
	happiness#n#1	0.074	0.052	0.139	0.231	0.614	0.025
surprise	calmness#n#3	0.034	0.013	0.158	0.024	0.000	0.042
	coolness#n#2	0.018	0.004	0.058	0.005	0.000	0.006
	expectation#n#3	0.034	0.013	0.158	0.027	0.000	0.042
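The per-antonym variability reported in Table 2 can be sketched as follows. The choice of the population standard deviation is an assumption (the text does not state which estimator was used), and the function name is illustrative. The two Lin similarities with joy#n#1 are the values quoted in the text for grief#n#1 and sorrow#n#1; the zeros for gladness#n#1 are assumed here to mirror the zero-IC case. With only two words, the result is not comparable to the full-category figures in the table.

```python
from statistics import pstdev

def variability_per_antonym(sim, antonyms, words):
    """Map each antonym to the population std dev of its similarity with every word."""
    return {a: pstdev([sim[(a, w)] for w in words]) for a in antonyms}

words = ["grief#n#1", "sorrow#n#1"]
antonyms = ["joy#n#1", "gladness#n#1"]

# Lin similarities: the joy#n#1 values are quoted in the text,
# the gladness#n#1 zeros are an assumption of this sketch.
sim_lin = {
    ("joy#n#1", "grief#n#1"): 0.50,
    ("joy#n#1", "sorrow#n#1"): 0.52,
    ("gladness#n#1", "grief#n#1"): 0.0,
    ("gladness#n#1", "sorrow#n#1"): 0.0,
}

variability = variability_per_antonym(sim_lin, antonyms, words)
print(variability)
```

When every similarity with an antonym is zero, the deviation is also zero, which shows why a zero entry in the table signals a degenerate metric rather than perfect agreement among informative values.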

The minimum standard deviation is zero and the maximum is 0.614. The minimum was obtained in the following cases: for the 3 antonyms of the surprise category, for the 3 antonyms of the fear category, for 2 antonyms of disgust, for 2 antonyms of anger and for 1 antonym of sadness.

These minimum values were mainly obtained with Resnik's metric. However, zero variability was also obtained with the Jiang & Conrath and Lin metrics for the antonym gladness#n#1 of the sadness category. On the other hand, the maximum variability was obtained for the antonym happiness#n#1 of the sadness category with Resnik's metric.

In general terms, as is presented in Table 2, the deviations obtained were low. It is worth mentioning that a certain degree of variability in the results is reasonable since in a same category there are words with different affective intensity.

For example, when calculating the similarities of the words grief#n#1 and sorrow#n#1, both belonging to the sadness category, with their antonym joy#n#1, the results were Sim_Lin(joy#n#1, grief#n#1) = 0.50 and Sim_Lin(joy#n#1, sorrow#n#1) = 0.52, and Sim_Wu&P(joy#n#1, grief#n#1) = 0.70 and Sim_Wu&P(joy#n#1, sorrow#n#1) = 0.75.

This would show there are variations of similarity between words of a same category, indicating differences in the affective intensity of words. On the other hand, the standard deviation of the similarity of each word with their 3 antonyms was calculated, and since the antonyms were selected by the expert as the best antonyms, it is logical to expect a similar behavior of the words of a same category, regarding similarity, regardless of the antonym chosen.

In general terms, it is expected that the similarity calculation of a same word with the three antonyms yields a low variability. This is ratified in all affective categories; there were even cases with zero deviation.
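The per-word analysis can be sketched in the same way: for each word, take the standard deviation of its similarities with the three antonyms, then summarize the category by the maximum, the minimum and their difference. The scores below are hypothetical, the helper names are illustrative, and the population standard deviation is again an assumption.

```python
from statistics import pstdev

def per_word_variability(scores):
    """scores maps word -> similarities with each antonym; returns word -> std dev."""
    return {word: pstdev(values) for word, values in scores.items()}

def summarize(deviations):
    """Return (max, min, max - min) of the per-word deviations for one category."""
    highest = max(deviations.values())
    lowest = min(deviations.values())
    return highest, lowest, highest - lowest

# Hypothetical similarities of three words with three antonyms (illustrative values).
scores = {
    "w1": [0.75, 0.75, 0.75],
    "w2": [0.70, 0.70, 0.71],
    "w3": [0.60, 0.66, 0.66],
}

deviations = per_word_variability(scores)
print(summarize(deviations))
```

A word with identical similarities to all three antonyms, such as the hypothetical w1, has zero deviation, which corresponds to the zero-deviation cases mentioned above.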

As shown in Table 3, the similarities calculated between the words of each category and their antonyms have, for most of the metrics, a low variability, which supports the hypothesis. The results in Table 4 indicate that the antonym selection performed by the expert was accurate, since the antonyms are similar to each other. As shown in Table 3, for each metric there were small differences between minimums and maximums, and they were especially low in the fear category.

Table 3 Variabilities of similarities between antonyms

Class	Statistic	Wu&P	J&C	L&CH	LIN	RES	PATH
Anger	max	0.088065632	0.014142136	0.188561808	0.044969125	0.4384062	0.0377124
	min	0.028284271	0	0.103708995	0	0	0.0094281
	max-min	0.059781361	0.014142136	0.084852814	0.044969125	0.4384062	0.0282843
Disgust	max	0.051854497	0.030912062	0.188561808	0.078740079	0.4384062	0.0377124
	min	0	0	0	0	0	0
	max-min	0.051854497	0.030912062	0.188561808	0.078740079	0.4384062	0.0377124
Fear	max	0.042426407	0.004714045	0.188561808	0.004714045	0	0.0377124
	min	0.032998316	0	0.11785113	0	0	0.0141421
	max-min	0.00942809	0.004714045	0.070710678	0.004714045	0	0.0235702
Joy	max	0.051854497	0.014142136	0.136707311	0.026246693	1.11E-16	0.0377124
	min	0.004714045	0	0.032998316	0	0	0
	max-min	0.047140452	0.014142136	0.103708995	0.026246693	1.11E-16	0.0377124
Sadness	max	0.070395707	0.098432154	0.230362034	0.34127213	1.5167143	0.0449691
	min	0.018856181	0	0.061282588	0	0	0.0094281
	max-min	0.051539526	0.098432154	0.169079446	0.34127213	1.5167143	0.035541
Surprise	max	0.188561808	0.035590261	0.565685425	0.224251843	1.8149074	0.108423
	min	0.164991582	0.023570226	0.414835978	0.193448242	1.8149074	0.0565685
	max-min	0.023570226	0.012020035	0.150849447	0.030803601	0	0.0518545

Table 4 Similarity between antonyms for each affective category and each metric

Class	Similarity between	Wu&P	J&C	L&CH	LIN	RES	PATH
Anger	calmness-happiness	0.75	0.9	2.07	0.474	4.62	0.2
	happiness-peace	0.66	0.09	1.74	0.4664	4.62	0.14
	peace-calmness	0.87	3.14	2.59	0.9833	9.36	0.33
	σ	0.02	4.98	0.36	0.1755	14.98	0.02
disgust	admiration-fondness	0.87	0.25	2.59	0.806	8.15	0.33
	fondness-love	0.75	0.10	2.07	0.4981	4.62	0.2
	admiration-love	0.75	0.12	2.07	0.5294	4.62	0.2
	σ	0.01	0.012	0.17	0.0574	8.29	0.01
fear	fearlessness-bravery	1	1.3E+07	3.68	1	9.81	1
	confidence-fearlessness	0.87	6.48	2.59	0.9922	9.82	0.33
	bravery-confidence	0.87	6.48	2.59	0.9922	9.82	0.33
	σ	0.01	1.1E+14	0.80	0.00004	0.00004	0.3
joy	melancholy-sadness	0.93	0.56	2.99	0.903	8.21	0.5
	sadness-sorrow	0.93	0.67	2.99	0.9175	8.21	0.5
	melancholy-sorrow	0.87	0.30	2.59	0.8352	8.21	0.33
	σ	0.002	0.071	0.10	0.0039	0	0.02
sadness	gladness-joy	0.75	0	2.07	0	4.62	0.2
	happiness-joy	0.82	0.14	2.30	0.613	5.55	0.25
	gladness-happiness	0.70	0	1.89	0	4.62	0.17
	σ	0.007	0.013	0.08	0.2505	0.57	0
surprise	calmness-coolness	0.5	0.06	1.49	0.227	2.39	0.11
	coolness-expectation	0.5	0.06	1.49	0.237	2.39	0.11
	calmness-expectation	0.85	0.11	2.59	0.5189	4.62	0.33
	σ	0.08	0.001	0.80	0.0549	3.31	0.03

During this work, the use of antonyms to analyze the behavior of the words of each affective category was very useful, since it made it possible to validate that similarity has a homogeneous behavior and a low variability. It is important to highlight that, in order to analyze the similarity values and their relationship with the affective intensity or with the fuzzy membership of a word in more than one affective category, it is necessary to use another word as a pivot: not an antonym, but simply a word with an opposite sense. This way, the problem represented in Figure 7 would be eliminated, and more significant similarity values would be obtained for future analysis, as visualized in Figure 8.

Fig. 8 Use of opposite concept to improve problem of common affective ancestor

5 Conclusions and Future Work

This article provides evidence that words of the same affective category have a homogeneous similarity, as stated in the hypothesis. This statement is supported by the results, which show a low standard deviation of the similarity of the words that make up an affective category.

The results obtained so far show the usefulness of similarity for enriching a lexicon, for example, when identifying words regarding their fuzzy classification or when determining the intensity of words that belong to the same affective category. In this regard, it is possible to add words that do not have affective ancestors to a lexicon tagged with intensities by using synonymy relationships. For this, it is important to identify the words that will be used as pivots. Although an antonym was used in this work, we believe that an intensity analysis requires a pivot that provides more meaningful information and that mitigates the problem of the IC-based metrics explained in the previous section. A priori, we believe it would be possible to obtain better results if a word with an opposite affective sense is used, for example, following Plutchik's taxonomy (8 emotions); this would require re-classifying the lexicon, which is based on Ekman's taxonomy (6 emotions).

The existence of mechanisms that improve the treatment of antonym relationships in WordNet, as well as the implementation of other semantic similarity metrics, would make it possible to improve affective knowledge bases, such as lexicons or dictionaries, with less effort, considering, for example, the use of transitivity relationships and relationships of the semantic type for the analysis of knowledge structures.

Acknowledgements

This paper is the result of work by the SOMOS research group (SOftware - MOdelling - Science), funded by the Dirección de Investigación and the Facultad de Ciencias Empresariales of the Universidad del Bío-Bío, Chile. The authors thank the Facultad de Ingeniería of the Universidad Católica de la Santísima Concepción, Chile.

References

1. Sun, J., Wang, G., Cheng, X., & Fu, Y. (2015). Mining affective text to improve social media item recommendation. Information Processing & Management, Vol. 51, No. 4, pp. 444-457. DOI: 10.1016/j.ipm.2014.09.002.

2. Gurini, D.F., Gasparetti, F., Micarelli, A., & Sansonetti, G. (2018). Temporal people-to-people recommendation on social networks with sentiment-based matrix factorization. Future Generation Computer Systems, Vol. 78, No. 1, pp. 430-439. DOI: 10.1016/j.future.2017.03.020.

3. Mohammad, S.M., Zhu, X., Kiritchenko, S., & Martin, J. (2015). Sentiment, emotion, purpose, and style in electoral tweets. Information Processing & Management, Vol. 51, No. 4, pp. 480-499. DOI: 10.1016/j.ipm.2014.09.003.

4. Bertola, F. & Patti, V. (2016). Ontology-based affective models to organize artworks in the social semantic web. Information Processing & Management, Vol. 52, No. 1, pp. 139-162. DOI: 10.1016/j.ipm.2015.10.003.

5. Balahur, A. & Perea-Ortega, J.M. (2015). Sentiment analysis system adaptation for multilingual processing: The case of tweets. Information Processing & Management, Vol. 51, No. 4, pp. 547-556. DOI: 10.1016/j.ipm.2014.10.004.

6. Severyn, A., Moschitti, A., Uryupina, O., Plank, B., & Filippova, K. (2016). Multi-lingual opinion mining on YouTube. Information Processing & Management, Vol. 52, No. 1, pp. 46-60. DOI: 10.1016/j.ipm.2015.03.002.

7. Taboada, M., Brooke, J., Tofiloski, M., Voll, K., & Stede, M. (2011). Lexicon-based methods for sentiment analysis. Computational linguistics, Vol. 37, No. 2, pp. 267-307. DOI: 10.1162/COLI_a_00049.

8. Pang, B., Lee, L., & Vaithyanathan, S. (2002). Thumbs up?: sentiment classification using machine learning techniques. Proceedings of the ACL-02 conference on Empirical methods in natural language processing, Vol. 10, pp. 79-86. DOI: 10.3115/1118693.1118704.

9. Wilson, T., Wiebe, J., & Hoffmann, P. (2005). Recognizing contextual polarity in phrase-level sentiment analysis. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pp. 347-354.

10. Carrillo de Albornoz, J. (2011). Un modelo lingüístico-semántico basado en emociones para la clasificación de textos según su polaridad e intensidad.

11. Grefenstette, G., Qu, Y., Shanahan, J.G., & Evans, D.A. (2004). Coupling niche browsers and affect analysis for an opinion mining application. Proceedings of Recherche d'Information Assistée par Ordinateur (RIAO).

12. Ekman, P. (1993). Facial expression and emotion. American psychologist, Vol. 48, No. 4, pp. 384.

13. Strapparava, C. & Valitutti, A. (2004). Wordnet affect: an affective extension of wordnet. Lrec, pp. 1083-1086.

14. D'Urso, V. & Trentin, R. (2007). Introduzione alla psicologia delle emozioni. Laterza.

15. Bestgen, Y. (2002). Détermination de la valence affective de termes dans de grands corpus de textes. Actes du Colloque International sur la Fouille de Texte, pp. 1-14.

16. Redondo, J., Fraga, I., Padrón, I., & Comesaña, M. (2007). The Spanish adaptation of ANEW (affective norms for English words). Behavior research methods, Vol. 39, No. 3, pp. 600-605. DOI: 10.3758/BF03193031.

17. Turney, P.D. & Littman, M.L. (2003). Measuring praise and criticism: Inference of semantic orientation from association. ACM Transactions on Information Systems (TOIS), Vol. 21, No. 4, pp. 315-346. DOI: 10.1145/944012.944013.

18. Ptaszynski, M., Rzepka, R., Araki, K., & Momouchi, Y. (2014). Automatically annotating a five-billion-word corpus of Japanese blogs for sentiment and affect analysis. Computer Speech & Language, Vol. 28, No. 1, pp. 38-55. DOI: 10.1016/j.csl.2013.04.010.

19. Ekman, P. (1973). Cross-cultural studies of facial expression. Darwin and facial expression: A century of research in review, pp. 169-222.

20. Valitutti, A., Strapparava, C., & Stock, O. (2004). Developing affective lexical resources. PsychNology Journal, Vol. 2, No. 1, pp. 61-83.

21. Singh, P., Lin, T., Mueller, E.T., Lim, G., Perkins, T., & Zhu, W.L. (2002). Open Mind Common Sense: Knowledge acquisition from the general public. OTM Confederated International Conferences on the Move to Meaningful Internet Systems, pp. 1223-1237. DOI: 10.1007/3-540-36124-3_77.

22. Mohammad, S.M. & Turney, P.D. (2013). Crowdsourcing a word-emotion association lexicon. Computational Intelligence, Vol. 29, No. 3, pp. 436-465. DOI: 10.1111/j.1467-8640.2012.00460.x.

23. Brants, T. & Franz, A. (2006). Web 1T 5-gram Version 1 LDC2006T13.

24. Cho, H., Kim, S., Lee, J., & Lee, J.S. (2014). Data-driven integration of multiple sentiment dictionaries for lexicon-based sentiment classification of product reviews. Knowledge-Based Systems, Vol. 71, pp. 61-71. DOI: 10.1016/j.knosys.2014.06.001.

25. Huang, S., Niu, Z., & Shi, C. (2014). Automatic construction of domain-specific sentiment lexicon based on constrained label propagation. Knowledge-Based Systems, Vol. 56, pp. 191-200. DOI: 10.1016/j.knosys.2013.11.009.

26. Park, S., Lee, W., & Moon, I.C. (2015). Efficient extraction of domain specific sentiment lexicon with active learning. Pattern Recognition Letters, Vol. 56, pp. 38-44. DOI: 10.1016/j.patrec.2015.01.004.

27. Hung, C. (2017). Word of mouth quality classification based on contextual sentiment lexicons. Information Processing & Management, Vol. 53, No. 4, pp. 751-763. DOI: 10.1016/j.ipm.2017.02.007.

28. Rada, R., Mili, H., Bicknell, E., & Blettner, M. (1989). Development and application of a metric on semantic nets. IEEE transactions on systems, man, and cybernetics, Vol. 19, No. 1, pp. 17-30.

29. Wu, Z. & Palmer, M. (1994). Verbs semantics and lexical selection. Proceedings of the 32nd annual meeting on Association for Computational Linguistics, pp. 133-138. DOI: 10.3115/981732.981751.

30. Al-Mubaid, H. & Nguyen, H.A. (2006). A cluster-based approach for semantic similarity in the biomedical domain. International IEEE Conference in Medicine and Biology Society, pp. 2713-2717.

31. Nguyen, H.A. & Al-Mubaid, H. (2006). New ontology-based semantic similarity measure for the biomedical domain. International IEEE Conference on Granular Computing, pp. 623-628. DOI: 10.1109/GRC.2006.1635880.

32. Li, Y., Bandar, Z.A., & McLean, D. (2003). An approach for measuring semantic similarity between words using multiple information sources. IEEE Transactions on knowledge and data engineering, Vol. 15, No. 4, pp. 871-882. DOI: 10.1109/TKDE.2003.1209005.

33. Jiang, J.J. & Conrath, D.W. (1997). Semantic similarity based on corpus statistics and lexical taxonomy. International Conference Research on Computational Linguistics (ROCLING X).

34. Resnik, P. (1999). Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of artificial intelligence research, Vol. 11, pp. 95-130. DOI: 10.1613/jair.514.

35. Gao, J.B., Zhang, B.W., & Chen, X.H. (2015). A WordNet-based semantic similarity measurement combining edge-counting and information content theory. Engineering Applications of Artificial Intelligence, Vol. 39, pp. 80-88. DOI: 10.1016/j.engappai.2014.11.009.

¹ http://ws4jdemo.appspot.com

Received: January 17, 2019; Accepted: March 04, 2019

* Corresponding author: Alejandra Segura Navarrete. asegura@ubiobio.cl

This is an open-access article distributed under the terms of the Creative Commons Attribution License