1. Introduction
Artificial Neural Networks are inspired by the functioning of the human brain. Although Alan Turing was the first to relate computing to the human brain, in 1936, it was Warren McCulloch and Walter Pitts who created the theory of how a neuron works [10].
In 2012, Joel W. Johnson presented the main factors affecting vote inequality among incumbent cohorts (members of the same party and district), indicating the strong influence of vote-splitting incentives in candidate-centered electoral environments [11].
The 2014 study by Ching-Hsing Wang indicates that conscientiousness and emotional stability can significantly increase female participation in electoral votes, but have no effect on male participation [18]. Furthermore, openness to experience has opposite effects on male and female participation: as openness to experience increases, men are more likely to vote, while women are less likely to cast ballots. However, extraversion and agreeableness are not associated with participation, regardless of gender [18].
In July 2015, Orlando D'Adamo studied the usefulness and scope of social networks during electoral campaigns. The authors in [5] present the results of an investigation that analyzes the use of social networks by candidates for deputy and senator for the city of Buenos Aires in the legislative elections.
In 2017, Dimitrios Xefteris studied several factors influencing electoral voting, including religion, race, and culture, among others, and reported that optimized data access for the Data Warehouse maximizes differentiated voting participation [8].
Artificial Neural Networks (ANNs) are the subject of numerous publications each year. Several researchers have studied different neural networks, the simplest being the multilayer perceptron, whose pattern recognition architecture connects neurons only from one layer to the next [6,13]. Artificial Neural Networks can be used to predict the difficulties of the electoral process, since they have been applied to different complex problems using adaptive and cognitive mechanisms of human learning. According to the literature, training a neural network is an NP-hard optimization problem that presents several theoretical and computational limitations [7]. In January 2019, Dat Thanh Tran proposed a library to avoid the bottleneck in Machine Learning using the perceptron [15]. Recurrent Neural Networks for sequential data modeling have also been published, for voice recognition that considers the morphology of words, syntax, and semantics [16].
The literature shows that the information influencing electoral choice is subject to uncertainty. It is clear that various factors and attributes influence who wins an election. Analyzing them manually represents a very laborious workload, and knowing the main influencing attributes is always of great importance for both voters and the nominated candidates.
2. Problem statement
A database of artificial data is randomly generated so that the Artificial Neural Network can evaluate the attributes that affect a vote. The data must be reliable to avoid inconsistencies, since they are managed by software which, according to the restrictions of the problem and the input information obtained from the database, shows the efficiency of the proposed algorithm. This software implements an algorithm that finds the attribute more efficiently by means of a backpropagation method. The database holds the initial information required by the Artificial Neural Network. This information represents the different considerations behind a voting choice, among which are the economy, socio-cultural movements, and work. Taking that into account, the input data to the Artificial Neural Network are the following:
The output variables are the political parties under consideration. In this case we use the most common political ideology patterns, usually classified as left, right, and agnostic or center. With these we create three fictitious political parties called:
To understand the way in which supporters of each party are classified, Fig. 1 shows the schematic distribution of the political spectrum.
This political distribution originated in France in 1789, during the National Constituent Assembly, in which the revocation of the monarchy's political power was discussed. Those who were against it sat on the right, and those who promoted a change, seeking national sovereignty, sat on the left (Péronnet, 1985). This distribution was later modified while preserving the same political bases, since at the beginning of the 19th century the aristocracy was supplanted by the bourgeoisie as the predominant class [12]. We can therefore say that liberal or left politics seeks political equality for the progress of the people, without imposing the law of the few on the many. Right-wing or conservative politics seeks to maintain the current political order, representing those who possess power and wealth and pursuing the individual good without taking all social classes into account.
Moderate politics has gained popularity in recent years because it represents the union of liberal and conservative politics, trying to get the best of both. Placing a vote on the scale from left to right entails accepting the way that group of people works, taking into account the way in which they deal with problems, that is, the means used to resolve conflicts. Each side thus defines itself against the other end of the scale, attributing pejorative connotations to the identity of the opposition, and vice versa [3]. Recently, tension has arisen between political geometries for reasons that are not entirely clear, setting aside the debate that differentiates political thought and focusing instead on discrediting the adversary. These acts confuse voters and frustrate the reasoning behind their vote, which means that in many cases the electoral decision is overwhelmed by the economy, socio-cultural movements, and work. In our representation, the political disturbance is represented by socio-cultural movements and other events not initially considered, such as natural phenomena and electoral fraud, among others.
3. Backpropagation neural network
An Artificial Neural Network is a complex mathematical function inspired by the operation of its biological namesake. It is also the interaction of many simpler parts called neurons, which have numerical inputs and outputs and work together. Its goal is to solve problems in a way similar to the human brain. Each neuron performs a weighted sum whose weighting is given by the weight assigned to each of its input connections. This means that each connection that reaches the neuron has an associated value that defines the intensity with which the input variable affects the neuron and, therefore, influences the result produced by the output layer [2]. Backpropagation networks use feedback as a supervised method and consist of three layers: input, hidden, and output. They achieve better precision because the error propagates backwards, that is, it starts from the output layer and passes through the hidden layer to reach the input layer [14].
As shown in Fig. 2, the variables X1 to Xn represent the inputs to the network, and the variables Y1 to Yn represent the results obtained from the neurons in the output layer.
4. Kohonen neural network
Unlike the backpropagation neural network, the Kohonen neural network, shown in Fig. 3, is simpler because it has only one layer. It uses an unsupervised method, so it has no specific target vector to be trained against, among other reasons that affect the veracity of the output result [1].
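The unsupervised update of a single-layer Kohonen network can be illustrated with a minimal sketch. It assumes a winner-take-all simplification (no neighborhood function), squared Euclidean distance, and a fixed learning rate; the function names and sample values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def best_matching_unit(weights: List[List[float]], x: Sequence[float]) -> int:
    """Index of the neuron whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def kohonen_step(weights: List[List[float]], x: Sequence[float],
                 learning_rate: float = 0.5) -> int:
    """Move only the winning neuron toward x; no target vector is involved."""
    winner = best_matching_unit(weights, x)
    weights[winner] = [w + learning_rate * (v - w)
                       for w, v in zip(weights[winner], x)]
    return winner


# Hypothetical two-neuron layer and one input sample.
layer = [[0.0, 0.0], [1.0, 1.0]]
winner = kohonen_step(layer, [0.9, 0.8])
```

A full Kohonen map would also update the neighbors of the winning neuron and shrink the neighborhood and learning rate over time; those parts are omitted here for brevity.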
5. Model of an artificial neuron
The artificial neuron is the base unit of a neural network. It is an elementary processor that takes a vector X and produces an output resulting from the weighted sum [9]. The model of an artificial neuron imitates the process of a biological neuron, as seen in Fig. 4.
Here, the Xi are the inputs (through the dendrites) to the neuron. Each input is multiplied by its weight Wi as it is communicated to the nucleus of the neuron, and b is the bias. This yields Eq. (1), as can be seen in Fig. 4.
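The weighted sum performed by one neuron can be sketched as follows. This is a minimal illustration that assumes the usual form, the sum of each input times its weight plus the bias; the function name and the numeric values are hypothetical.

```python
from typing import Sequence


def weighted_sum(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Return sum_i(w_i * x_i) + b for a single artificial neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(w * x for w, x in zip(weights, inputs)) + bias


# Hypothetical values chosen only for illustration.
z = weighted_sum([1.0, 2.0, 3.0], [0.5, -1.0, 0.25], bias=0.1)
```

The result z is the value that is later passed through the activation function.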
The basic characteristic of a neural network is that it is composed of three layers. The input layer receives the input values and sends them to the second layer, called the hidden layer, which processes them and transfers the information to the output layer. The network can contain more layers if required, and it can be modified by adding or removing input or output variables or by changing the learning or training process [4]. A conventional neural network has three characteristics:
The interconnection model between the different layers of the network.
The learning process, which varies the weights of the interconnections.
The activation function, which transforms the weighted result of the network into the output activation value.
In this case, the neural network is trained with a backpropagation algorithm based on gradient descent and the chain rule.
6. Activation function
The activation function maps the data into a narrower range to simplify the calculation. The activation function used in our backpropagation algorithm [19] is the sigmoid function, represented in Eq. (2), which maps high input values close to 1 and very low input values close to 0.
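A minimal sketch of the sigmoid function follows, assuming the standard logistic form 1 / (1 + e^(-z)). The two-branch implementation is a design choice of this sketch to avoid overflow for very negative inputs; it is not taken from the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return e / (1.0 + e)


def sigmoid_derivative(z: float) -> float:
    """Derivative of the sigmoid with respect to z: s * (1 - s)."""
    s = sigmoid(z)
    return s * (1.0 - s)
```

High inputs such as sigmoid(10.0) come out close to 1, very low inputs such as sigmoid(-10.0) close to 0, and sigmoid(0.0) is exactly 0.5.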
6.1. Backpropagation
Backpropagation is an algorithm widely used to train feedforward neural networks in supervised learning. It calculates the gradient of the loss function with respect to each weight by the chain rule, iterating backwards one layer at a time from the last layer to avoid redundant calculations of intermediate terms in the chain rule, and it is based on the partial derivatives of calculus. Each weight and bias value has an associated partial derivative. A partial derivative can be thought of as a value that contains information about how much, and in which direction, a weight value should be adjusted to reduce the error. The collection of all partial derivatives is called the gradient. However, for simplicity, each partial derivative is commonly called a gradient [17].
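How a partial derivative indicates the size and direction of an adjustment can be sketched with a plain gradient descent update. The cost function, learning rate, and names below are hypothetical and chosen only to keep the example small.

```python
from typing import List, Sequence


def gradient_descent_step(params: Sequence[float], grads: Sequence[float],
                          learning_rate: float) -> List[float]:
    """Move each parameter against its partial derivative."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Hypothetical one-parameter cost C(w) = (w - 3)^2, with dC/dw = 2 * (w - 3).
w = [0.0]
for _ in range(100):
    w = gradient_descent_step(w, [2.0 * (w[0] - 3.0)], learning_rate=0.1)
```

After the loop, w[0] is very close to 3, the minimizer of this toy cost. In a neural network the same update is applied to every weight and bias, with the partial derivatives supplied by backpropagation.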
6.2. Chain rule
If a variable y depends on a second variable u, which in turn depends on a third variable x, then the rate of change of y with respect to x can be calculated as the product of the rate of change of y with respect to u and the rate of change of u with respect to x. If g is differentiable at the point x and f is differentiable at the point g(x), then the composition f(g(x)) is differentiable at x. Letting y = f(g(x)) and u = g(x), Eq. (3), the chain rule, is obtained.
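The chain rule can be checked numerically on a small composition. The functions g(x) = 3x + 1 and f(u) = u^2 are hypothetical examples chosen for this sketch; the finite-difference check is only a sanity test, not part of the paper's method.

```python
def g(x: float) -> float:
    """Inner function u = g(x)."""
    return 3.0 * x + 1.0


def f(u: float) -> float:
    """Outer function y = f(u)."""
    return u ** 2


def dy_dx_chain(x: float) -> float:
    """dy/dx computed as (dy/du) * (du/dx)."""
    u = g(x)
    dy_du = 2.0 * u
    du_dx = 3.0
    return dy_du * du_dx


def dy_dx_numeric(x: float, h: float = 1e-6) -> float:
    """Central finite difference of the composition f(g(x))."""
    return (f(g(x + h)) - f(g(x - h))) / (2.0 * h)
```

At x = 2 the inner value is u = 7, so the chain rule gives 2 * 7 * 3 = 42, which the finite difference reproduces up to rounding.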
6.3. Cost function
The cost function determines the error between the estimated value and the real value in order to optimize the parameters of the neural network. In this case we use the root mean square error. In regression analysis, the mean square error is the mean of the squared deviations of the predictions from the true values over an out-of-sample test space, generated by a model estimated over a particular sample space; the root mean square error is its square root. Its formula is shown in Eq. (4).
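Because the text mentions both the mean square error and the root mean square error, the sketch below shows the two side by side. The function names are hypothetical, and the exact form of Eq. (4) is not reproduced here, so this should be read as an illustration of the standard definitions only.

```python
import math
from typing import Sequence


def mean_squared_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Mean of the squared deviations between predictions and true values."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def root_mean_squared_error(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Square root of the mean squared error."""
    return math.sqrt(mean_squared_error(predicted, target))
```

Both costs are minimized by the same parameters; the square root changes the scale of the error and of its gradient.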
6.4. Mathematical model of our artificial neural network
Initially, the network values are randomly generated. The resulting error is therefore very likely to be high, so the network must be trained to obtain the minimum possible error. We begin by calculating the derivatives of the parameters in the last layer, whose weighted sum is shown in Eq. (5), where z is the result of the weighted sum, W is the weight, and b is the bias.
Subsequently, the activation function and the cost function are applied to this result, which yields the network error represented in Eq. (5) as the composition of the weighted sum, the activation function, and the cost function. What we are looking for are the partial derivatives of the cost with respect to the weight and bias parameters, so two derivatives must be calculated. Working from back to front, we begin with the derivatives of the parameters of the last layer. Each parameter is indexed by the number of the layer to which it belongs; if the network has L layers, the last layer is layer L.
To calculate these derivatives, it is important to analyze the path that connects the value of a parameter to the final cost. In the last layer this path is not very long, although it still has several steps. As seen in the operation of the neuron, the parameter participates in a weighted sum, which is then passed through the activation function represented in Eq. (2). The activations of the neurons in the last layer form the output of the network, which is then evaluated by the cost function to determine the network error. This forms a composition of functions, so we use the chain rule represented in Eq. (6): to calculate the derivative of a composition of functions, we simply multiply the intermediate derivatives. Eqs. (7) and (10) represent the weighted sum of the last layer.
We obtain the derivative of the cost with respect to the weight in Eq. (8) and the derivative of the cost with respect to the bias in Eq. (9).
This yields three partial derivatives. The first is the derivative of the cost with respect to the activation, Eq. (9), which gives the variation of the network cost when the activation output of the neurons in the last layer is varied; that is, it is the derivative of the cost function with respect to the output of the neural network. The cost function used in this case is the root mean square error described in Eq. (4), written with the parameters of our network in Eq. (10).
The derivative of the cost function with respect to the output of the network is given in Eq. (11). Eq. (12) indicates the derivative of the activation function with respect to the output of the network.
We continue with the activation function of the weighted sum, Eq. (13), and its derivative with respect to the weighted sum, Eq. (14). This derivative gives the variation of the neuron output when the weighted sum of the neuron is varied. It depends on the type of activation function; in this case we use the sigmoid function, Eq. (15). With this, only two partial derivatives are missing: the partial derivative of the weighted sum with respect to the bias, Eq. (16), and with respect to the weight, Eq. (17). These are obtained by differentiating the weighted sum of the neuron, as shown in Eq. (15).
By applying the chain rule and using two partial derivatives in the derivative of the cost with respect to the bias, Eq. (18), the error is obtained as a function of the weighted sum calculated within the neuron, as represented in Eq. (19). This derivative tells us to what degree the cost is modified by a small change in the weighted sum of the neuron. If the derivative is large, a small change in the value of the neuron is reflected in the final result; conversely, if the derivative is small, varying the value of the sum hardly affects the error of the network. In other words, this derivative tells us what responsibility the neuron has for the final result and, therefore, for the error. If the neuron is partly responsible for the final error, this information should be used to attribute part of that error to it.
The derivative of the cost with respect to the bias, with the error imputed to neuron 1, is represented by Eq. (20), and the corresponding derivative with the error imputed to neuron 2 is shown in Eq. (21), which is calculated in Eq. (22), the error imputed to the neuron.
We then do the same with the partial derivative of the cost with respect to the weight, with the error imputed to neuron 1, Eq. (23), which reduces to Eq. (24), the derivative of the cost with respect to the weight with the error imputed to neuron 2.
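The last-layer derivation can be sketched for a single sigmoid neuron with one input. As a simplifying assumption, the cost here is the squared error (a - y)^2 rather than the root mean square error of Eq. (4), and all names and values are hypothetical. The finite-difference helper exists only to check the analytic result.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def neuron_cost(w: float, b: float, a_prev: float, y: float) -> float:
    """Squared error of one sigmoid neuron (assumed cost for this sketch)."""
    return (sigmoid(w * a_prev + b) - y) ** 2


def last_layer_gradients(w: float, b: float, a_prev: float, y: float):
    """Return (dC/dw, dC/db) as products of the chained partial derivatives."""
    a = sigmoid(w * a_prev + b)
    dC_da = 2.0 * (a - y)    # cost with respect to the activation
    da_dz = a * (1.0 - a)    # activation with respect to the weighted sum
    delta = dC_da * da_dz    # error imputed to the neuron, dC/dz
    dz_dw = a_prev           # weighted sum with respect to the weight
    dz_db = 1.0              # weighted sum with respect to the bias
    return delta * dz_dw, delta * dz_db


def numeric_gradients(w: float, b: float, a_prev: float, y: float, h: float = 1e-6):
    """Central finite differences, used only to verify the analytic values."""
    dw = (neuron_cost(w + h, b, a_prev, y) - neuron_cost(w - h, b, a_prev, y)) / (2.0 * h)
    db = (neuron_cost(w, b + h, a_prev, y) - neuron_cost(w, b - h, a_prev, y)) / (2.0 * h)
    return dw, db
```

Because the derivative of the weighted sum with respect to the bias is 1, the bias gradient equals the error imputed to the neuron, while the weight gradient is that error scaled by the incoming activation.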
We have deduced three expressions that give the partial derivatives we are looking for in the last layer: one that tells us how to calculate the error of the neurons in the last layer, and one for each of the two partial derivatives. This gives the result for the last layer. To obtain the result for the previous layer, we apply the chain rule again to the following composition. Eq. (25) indicates the error in the penultimate layer, which with the chain rule generates two derivatives: the derivative of the cost with respect to the weight in the penultimate layer, Eq. (26), and the derivative of the cost with respect to the bias in the penultimate layer, Eq. (27).
The error of the following layer has already been calculated, and these derivatives are handled in the same way as before, now for the penultimate layer: one is the activation of the previous layer and the other is the derivative of the activation function. The only thing left to calculate is the derivative that tells us how the weighted sum of a layer varies when the output of a neuron in the previous layer is varied. This derivative is also simple to calculate: it is basically the parameter matrix, Eq. (28), that connects both layers. It moves the error from one layer to the previous layer, distributing the error according to the weights of the connections. With this we again have an expression from which to obtain the partial derivatives we are looking for. Again, the block highlighted in Eq. (29) becomes the derivative in Eq. (30), which again represents the error of the neurons in this layer.
The effectiveness of the backpropagation algorithm lies in the fact that what we have done in this layer extends to the rest of the layers of the network. Applying the same logic, we take the error already calculated for the following layer, multiply it by the weight matrix in a transformation that represents the backpropagation of the errors, Eq. (28), and calculate the partial derivatives with respect to the parameters, and so on through all the layers of the network. With a single backward pass we have calculated all the errors and partial derivatives of our network using only four expressions.
In the end we obtain four expressions. The first, Eq. (30), computes the error of the last layer. The second, Eq. (31), backpropagates the error from one layer to the previous layer. The remaining two give the derivatives of each layer: with Eq. (32), the bias derivative of the layer is obtained from the error, and with Eq. (28), the weight derivative of the layer is calculated using the error.
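For reference, the four expressions are commonly written as follows. The notation is an assumption of this note, since the paper's own symbols are not reproduced here: C is the cost, z^l and a^l are the weighted sums and activations of layer l, W^l and b^l are its weights and biases, \delta^l = \partial C / \partial z^l is the error of layer l, \sigma is the activation function, L is the last layer, and \odot is the element-wise product. No one-to-one correspondence with the paper's equation numbers is asserted.

```latex
\begin{align*}
\delta^{L} &= \frac{\partial C}{\partial a^{L}} \odot \sigma'\!\left(z^{L}\right)
  && \text{error of the last layer} \\
\delta^{l-1} &= \left( \left(W^{l}\right)^{\top} \delta^{l} \right) \odot \sigma'\!\left(z^{l-1}\right)
  && \text{error moved back one layer} \\
\frac{\partial C}{\partial b^{l}} &= \delta^{l}
  && \text{bias derivative} \\
\frac{\partial C}{\partial W^{l}} &= \delta^{l} \left(a^{l-1}\right)^{\top}
  && \text{weight derivative}
\end{align*}
```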
The expressions in Eq. (32) and Eq. (31) used to calculate the partial derivatives are quite intuitive, because they simply tell us how to use the error already calculated for one layer to obtain the error of the layer before it. There are two different cases: the last layer, where the error comes directly from the cost function, Eq. (30), and the rest of the layers of the network, whose error depends on another layer, Eq. (31). Once we have these two expressions, we know how to calculate the error in the current layer from the error of the layer processed before it.
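The four expressions can be sketched in code for a small fully connected network. This is a minimal illustration under stated assumptions: sigmoid activations in every layer, a mean squared error cost over the output neurons (not necessarily the cost of Eq. (4)), and a hypothetical 4-3-3 network with fixed sample values. The finite-difference helper is included only to check the analytic gradients.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(weights, biases, x):
    """Return the activations of every layer; activations[0] is the input."""
    activations = [list(x)]
    for W, b in zip(weights, biases):
        prev = activations[-1]
        activations.append(
            [sigmoid(sum(w * a for w, a in zip(row, prev)) + b_j) for row, b_j in zip(W, b)]
        )
    return activations


def cost(output, target):
    """Mean squared error over the output neurons (assumed cost)."""
    return sum((a - y) ** 2 for a, y in zip(output, target)) / len(output)


def backprop(weights, biases, x, target):
    """Return (grad_w, grad_b), shaped like weights and biases."""
    acts = forward(weights, biases, x)
    n_out = len(acts[-1])
    # Expression 1: error of the last layer, dC/da times the sigmoid derivative.
    delta = [(2.0 / n_out) * (a - y) * a * (1.0 - a) for a, y in zip(acts[-1], target)]
    grad_w = [None] * len(weights)
    grad_b = [None] * len(biases)
    for layer in range(len(weights) - 1, -1, -1):
        prev = acts[layer]  # activations feeding this layer
        grad_b[layer] = list(delta)                             # Expression 3
        grad_w[layer] = [[d * a for a in prev] for d in delta]  # Expression 4
        if layer > 0:
            # Expression 2: move the error back through the weight matrix.
            W = weights[layer]
            delta = [
                sum(W[j][k] * delta[j] for j in range(len(delta))) * prev[k] * (1.0 - prev[k])
                for k in range(len(prev))
            ]
    return grad_w, grad_b


def numeric_weight_gradient(weights, biases, x, target, layer, j, k, h=1e-6):
    """Central finite difference, used only to check the analytic gradient."""
    original = weights[layer][j][k]
    weights[layer][j][k] = original + h
    plus = cost(forward(weights, biases, x)[-1], target)
    weights[layer][j][k] = original - h
    minus = cost(forward(weights, biases, x)[-1], target)
    weights[layer][j][k] = original
    return (plus - minus) / (2.0 * h)


# Hypothetical 4-3-3 network with small random parameters.
rng = random.Random(0)
sizes = [4, 3, 3]
weights = [[[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [[rng.uniform(-0.5, 0.5) for _ in range(n_out)] for n_out in sizes[1:]]
x, target = [0.5, -1.2, 0.3, 2.0], [0.0, 1.0, 0.0]
grad_w, grad_b = backprop(weights, biases, x, target)
```

A single backward loop yields the gradients of every layer, since each layer reuses the error already computed for the layer after it.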
7. Implementation of the artificial neural network
The program creates an artificial database by randomly generating 1,000 synthetic data items.
Each data item has four input values and three output values, as can be seen in Fig. 5. The four input values are all between -10.0 and +10.0 and correspond to predictor values that have been normalized, so that values below zero are less than the average and values above zero are greater than the average. The three output values encode the variable to predict, a person's political inclination, which can take one of three categorical values: conservative, moderate, or liberal.
The program randomly divides the data into a training set of 800 items and a test set of 200 items (Fig. 6). The training set is used to create the neural network model, and the test set is used to estimate the accuracy of the model. After the data is partitioned, the program creates an instance of a neural network with n hidden nodes. The number of hidden nodes is arbitrary and must be determined by trial and error. Finally, the program generates a final neural network with the optimal weights and biases obtained during training, which produces the final result.
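The data generation and partitioning steps can be sketched as follows. Several details are assumptions of this sketch and are not stated in the paper: the inputs are drawn uniformly, the three output values are a one-hot encoding of the class, the class is drawn at random as a placeholder for the paper's labeling rule, and the seed and function names are hypothetical.

```python
import random

LABELS = ("conservative", "moderate", "liberal")


def generate_items(n, rng):
    """Create n synthetic items: four inputs in [-10, 10] and a one-hot output."""
    items = []
    for _ in range(n):
        inputs = [rng.uniform(-10.0, 10.0) for _ in range(4)]
        label = rng.randrange(len(LABELS))  # placeholder labeling rule
        outputs = [1.0 if i == label else 0.0 for i in range(len(LABELS))]
        items.append((inputs, outputs))
    return items


def split_items(items, n_train, rng):
    """Shuffle a copy of the items and split it into training and test sets."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
data = generate_items(1000, rng)
train_set, test_set = split_items(data, 800, rng)
```

Shuffling before slicing keeps the split random while guaranteeing exactly 800 training items and 200 test items.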
8. Algorithm
In Fig. 7, we present a flowchart illustrating the application of the backpropagation algorithm within an artificial neural network. The primary goal is to develop an algorithm capable of predicting electoral votes. The process commences by defining the parameters of the network, followed by the generation of a randomized database, as depicted in Fig. 5. Subsequently, an artificial neural network is created.
The input data extracted from the randomized database is then introduced into the input layer of the neural network. After traversing the hidden layers, the processed data ultimately reaches the output layer. The ensuing step involves a comparison between the neural network's output and the anticipated results derived from the training data. This assessment yields an error or loss metric, which quantifies the network's performance.
If the calculated error falls within an acceptable range, the results are displayed, as illustrated in Fig. 8, and the process concludes. However, if the error remains outside this acceptable threshold, the data is looped back to the input layer, where it undergoes further processing iterations. This iterative procedure continues until the error converges to an acceptable value.
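The iterative loop of the flowchart can be sketched with a deliberately small stand-in for the full network: a single sigmoid neuron trained by gradient descent. The tolerance, epoch limit, learning rate, and toy data are hypothetical values chosen for this sketch; the epoch limit is an added safeguard so the loop always terminates.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_error(w, b, samples):
    """Mean squared error of a single sigmoid neuron over all samples."""
    return sum((sigmoid(w * x + b) - y) ** 2 for x, y in samples) / len(samples)


def train(samples, tolerance=0.01, max_epochs=10_000, learning_rate=1.0):
    """Repeat forward pass, error check, and update until the error is acceptable."""
    w, b = 0.0, 0.0
    error = mean_error(w, b, samples)
    epoch = 0
    while error > tolerance and epoch < max_epochs:
        grad_w = grad_b = 0.0
        for x, y in samples:
            a = sigmoid(w * x + b)
            delta = 2.0 * (a - y) * a * (1.0 - a)  # error imputed to the neuron
            grad_w += delta * x / len(samples)
            grad_b += delta / len(samples)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        error = mean_error(w, b, samples)
        epoch += 1
    return w, b, error, epoch


# Hypothetical toy data: a negative input maps to 0, a positive input to 1.
samples = [(-1.0, 0.0), (1.0, 1.0)]
initial_error = mean_error(0.0, 0.0, samples)
w, b, final_error, epochs = train(samples)
```

The loop stops as soon as the error falls within the tolerance, which mirrors the decision node of the flowchart; otherwise the data is passed through the network again with the updated parameters.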
9. Results
These results are tuned to determine whether the dependence on the parameters of the Artificial Neural Network finds the normalized average value, and whether the number of hidden layers affects the voting results. Fig. 8 shows that the optimal number of hidden layers (NumHidden) is one, because it yields the smallest mean value, 0.0667240612695. The worst result is obtained with 8 hidden layers in the network, with a mean of 0.116324855868.
10. Conclusions
In conclusion, the parameters with the best performance in our Artificial Neural Network are those shown in Fig. 9.
It only remains for us to take the approach based on electoral votes, with experimental data, and the disturbance as a determining event in the alteration, or not, of the final result. We have three possible outcomes, which are on a scale of 0 to 10. If a person is younger than average, has a much lower income than average, is somewhat more educated than average, and has more debt than average, that person has a liberal political view. Likewise, if another person on the same scale is older than average, has a higher-than-average income, is slightly less than, equal to, or slightly more educated than average, and has less debt than average, that person has a conservative political view. When a person is at the average, or a little below or above it, in all the parameters, that person has a moderate ideology.