1 Introduction
Selecting qualified human resources is a key success factor for an organization. Adequate personnel training has a large effect on employees’ performance, which in turn has a direct impact on the growth and competence of the whole organization, especially in large and multinational companies. Personnel selection is one of the fundamental activities of human resource management; its objective is to select the most appropriate candidate for the organization. Organizations therefore seek more powerful ways of ranking a set of employees or personnel who have been evaluated in terms of different competencies [1]. Personnel selection is the process of choosing, among the candidates applying for a particular job in the company, those who have the qualifications required to perform the job in the best way [2]. The personnel selection problem involves many conflicting objectives and is therefore a complicated problem.
This process is defined as a comparison and decision-making process in which human experts participate actively. Recently, the authors of [3] proposed to treat this problem as a multi-criteria decision-making problem under uncertainty. Many conflicting criteria must be considered when comparing the alternatives, so a Multi-Criteria Decision Making (MCDM) approach is used [4, 5, 6, 7, 8, 9, 10]. The application of ranking and choice processes to decision making is crucial in different human activities (engineering, economics, etc.). The main goal of managers is to obtain a ranking of the set of candidates who have been evaluated according to different competencies; therefore, the development of efficient and flexible information aggregation methods has become a main issue in personnel selection [11].
A ranking is an ordering of a set of elements (objects, alternatives, actions, or candidates in personnel selection), indicating some sort of preference relationship among them, from the best to the worst, while these objects are evaluated from multiple points of view considered relevant for the problem. Every ranking can be viewed as being produced by applying an overall ordering criterion to a given set of objects. As different people may judge the criteria differently, they usually end up with different orderings.
Dealing with permutations and rankings is a research area that has attracted great interest. Ranking is among the most frequent real-world decision problems: ranking data is ubiquitous nowadays, with applications in many fields such as preference lists, voting in elections, information retrieval, collaborative filtering, combinatorial optimization and computational biology [12]. Magazines regularly publish rankings of universities, colleges, study programs, hospitals, pension funds, or cities [13].
More formally the problem can be described as follows: Let
For instance, in politics, rankings focus on the comparison of economic, social, environmental, and governance performance of countries; in [14], the authors address the issue of how to construct suitable aggregates of individual journal rankings, using an optimization-based consensus ranking approach. The personnel selection problem is another area in which rankings are relevant: different rankings of the candidate workers can be established by taking different criteria into account, and an interesting problem is to aggregate these rankings to support the decision maker.
In the rank aggregation problem, a way of measuring how different two rankings are is required, and distances are the conventional tool for this. A common approach is to find a permutation that minimizes the sum of the distances to the voters' rankings, where in principle any distance(-like) function on permutations can be used [15]. The most frequently used distance measures among rankings are the Spearman footrule distance and the Kendall tau distance.
Computing the consensus ranking is equivalent to the rank aggregation problem. The problem of computing the consensus ranking is nowadays an active field of research. An aggregation for a set of rankings
A new method for rank aggregation is proposed in this paper; it is based on a Reinforcement Learning (RL) approach.
RL is a powerful technique for learning to take optimal decisions by trying out actions and evaluating their effect. Performance gradually improves based on the feedback received; see [21] for an introduction to the field and [22] for an overview of recent advances. In the setting considered here, the feedback is the quality of the resulting aggregated ranking.
An important aspect of the learning process is the balance between exploration and exploitation. Exploration refers to trying out new actions or decisions; exploitation refers to using the knowledge acquired so far. In this paper, we will show that RL can lead to an efficient exploration of the search space. To the best of our knowledge, our approach is novel for the ranking problem. In the next section, some related work on the rank aggregation problem is discussed.
Learning automata are introduced in section 3, and the method proposed in this paper is presented in section 4. After that, an experimental study of the performance of this method is reported in section 5.
2 Related Work
Among the most commonly applied methods for this purpose are those based on distance measures between individual and collective preferences, which look for the solution that minimizes the disagreement across decision makers [24]. The Kemeny ranking problem is the problem of finding the ranking defined by equation (1); it is the ranking that minimizes the total number of disagreements with the input rankings:
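In a commonly used notation, this objective can be written as below. The symbols are assumptions made for illustration and may differ from those of equation (1): $\pi$ ranges over all permutations of the candidates, $\pi_1, \dots, \pi_m$ are the input rankings, and $d$ is the Kendall tau distance, which counts pairwise disagreements.

```latex
\pi^{*} = \arg\min_{\pi} \sum_{i=1}^{m} d(\pi, \pi_i)
```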
The distance
Below we summarize some algorithms that have been proposed to solve the NP-hard minimization problem given in (1).
Bargagliotti [25] presents a study of the aggregation of ranked data; she analyzes some characteristics and difficulties of this problem and the relation between the overall ranking and the input rankings. There are different methods for extracting overall rankings in specific applications, but there are also some more general methods to solve this problem.
In [12], the authors propose using a genetic algorithm (GA) to tackle the rank aggregation problem. According to the authors, these algorithms perform especially well on complex instances (those with large dimension and a small degree of consensus). The study shows that the GA always obtains the best result (especially when the number of rankings increases); with respect to computational time, however, the GA is slower than other approaches.
In [15], a new approach to the problem of aggregating preferences of multiple agents based on the notion of popular ranking is introduced: a ranking of a set of elements is popular if there is no other permutation of the elements that a majority of the voters prefer. The authors analyze the computational complexity and prove that it is NP-hard to find a ranking with a majority of preferences.
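To make the difficulty concrete, the sketch below solves the Kemeny problem by exhaustive search, which is only practical for a handful of candidates because the number of permutations grows as n!. The function names and the toy rankings are hypothetical and chosen for illustration only.

```python
from itertools import combinations, permutations


def kendall_tau_distance(a, b):
    """Count the pairs of items that the two rankings order differently."""
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def kemeny_brute_force(rankings):
    """Try all n! permutations and keep the one with the fewest disagreements."""
    items = rankings[0]

    def total_cost(candidate):
        return sum(kendall_tau_distance(candidate, r) for r in rankings)

    best = min(permutations(items), key=total_cost)
    return list(best), total_cost(best)


# Toy instance: three experts rank four candidates (best first).
rankings = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["b", "a", "c", "d"],
]
consensus, cost = kemeny_brute_force(rankings)
print(consensus, cost)
```

With 10 candidates the same search would already have to score 3,628,800 permutations, which is why heuristic and learning-based methods are of interest.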
The problem of aggregating preference rankings is analyzed in [26], where the authors propose a method based on Ordered Weighted Averaging (OWA) operators [27]. Each candidate may receive votes in different ranking places; the total score of each candidate is the weighted sum of the votes she/he receives in the different ranking places. Usually, the value of a vote depends on the place (for instance in the Borda-Kendall method [28]); a key issue in preference aggregation is how to determine the weights associated with the different ranking places. In [26], OWA is used to determine these weights; OWA operators are also used in [11], where the authors moreover introduce a parametric aggregation model based on the fuzzy weighted.
The problem of preference ranking in the case of partial and/or incomplete preference data at multiple times is studied in [29]; an algorithm is developed to determine the maximum consensus sequences from the users’ partial ranking data.
Another alternative is presented in [30]: a semi-supervised rank aggregation method is proposed, in which preference constraints on several item pairs are given and the aggregation function is learned based on the ordering agreement of different rankers; these methods use a weight vector.
In the following sections a new method is proposed. The purpose is to introduce a general approach, designed for a broad variety of applications, that takes into account a minimum number of parameters (usually the estimation/learning of the model parameters can be done), which simplifies the process.
3 Learning Automata
Learning Automata (LA) [31, 32] are simple reinforcement learning components for adaptive decision making in unknown environments. An LA operates in a feedback loop with its environment and receives feedback (reward or punishment) for the actions taken. A single learning automaton maintains a probability vector
In this table,
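As an illustration of how such a probability vector can be adapted, the sketch below uses the linear reward-inaction scheme, a standard LA update rule. Whether this matches the exact scheme and parameter values used in this paper is an assumption; the class name and the `learning_rate` parameter are hypothetical.

```python
import random


class LearningAutomaton:
    """Single automaton with a linear reward-inaction update (assumed scheme)."""

    def __init__(self, n_actions, learning_rate=0.1):
        # Start from a uniform distribution over the actions.
        self.p = [1.0 / n_actions] * n_actions
        self.learning_rate = learning_rate

    def choose(self, rng=random):
        """Sample an action index according to the current probabilities."""
        return rng.choices(range(len(self.p)), weights=self.p, k=1)[0]

    def update(self, action, reward):
        """Shift probability mass towards `action`; reward is in [0, 1]."""
        step = self.learning_rate * reward
        self.p = [
            p + step * (1.0 - p) if i == action else p - step * p
            for i, p in enumerate(self.p)
        ]


la = LearningAutomaton(n_actions=3, learning_rate=0.5)
la.update(action=0, reward=1.0)
print(la.p)
```

A reward of zero leaves the vector unchanged (the "inaction" part), and the probabilities keep summing to one after every update.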
4 Our Approach to Rank Aggregation for the Personnel Selection Problem
In our setting we consider a set of
The decision maker needs to aggregate the rankings in
Two questions need to be addressed in the approach presented above: how should the distance between rankings be defined, and what algorithm should be used to compute the aggregation efficiently? In this paper the Spearman footrule distance is used, given by expression (4). The Spearman footrule distance between two given rankings
where
This distance is extended to measure distances between one ranking and a set of rankings. Given a ranking
According to [16], a strong connection exists between the Kemeny optimal aggregation and the aggregation based on the Spearman footrule (called footrule aggregation). This result is interesting because the footrule aggregation can be computed in polynomial time; consequently, the difficult task of computing the Kemeny optimal aggregation can be avoided by approximating it with the footrule aggregation. This shows the usefulness of the Spearman footrule distance in the case of the full rankings aggregation problem. This distance was used in [20] to compare the results of different methods.
We propose an algorithm based on reinforcement learning to minimise expression (5) efficiently. The proposed method uses learning automata for learning the overall ranking. This idea is inspired by the method proposed in [33] for permutation learning. The advantages of this approach are that it does not use any problem-specific information, does not rely on domain knowledge, and involves only very few parameters.
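The footrule distance and its extension to a set of rankings can be sketched as follows. The unnormalised sum of absolute position differences is the standard definition; averaging over the set of rankings is an assumption made here, since the paper's expression (5) may use a different normalisation. The function names are hypothetical.

```python
def footrule_distance(a, b):
    """Spearman footrule: sum of absolute differences between item positions."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


def footrule_to_set(candidate, rankings):
    """Mean footrule distance from one ranking to a set of rankings (assumed)."""
    return sum(footrule_distance(candidate, r) for r in rankings) / len(rankings)


experts = [["a", "b", "c"], ["c", "b", "a"]]
print(footrule_distance(["a", "b", "c"], ["c", "b", "a"]))  # 4
print(footrule_to_set(["a", "b", "c"], experts))            # 2.0
```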
Our approach assumes a stochastic matrix
Generating a permutation from M: Uniformly select a row
Retrieve a reward: expression (5) is used as the reward function.
Update M using reward r: The probability matrix
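The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure of the paper: rows of `M` are taken to be ranking positions and columns candidates, rows are visited in random order, the reward is one minus a normalised footrule distance, and each row is updated with a linear reward-inaction rule. All function and parameter names are hypothetical.

```python
import random


def footrule(a, b):
    """Spearman footrule distance between two rankings of the same items."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


def sample_permutation(M, rng):
    """Visit rows in random order; each row picks a still-free candidate."""
    n = len(M)
    perm = [None] * n
    free = list(range(n))
    rows = list(range(n))
    rng.shuffle(rows)
    for row in rows:
        weights = [M[row][c] for c in free]
        if sum(weights) <= 0.0:
            col = rng.choice(free)  # fall back to uniform if all mass is gone
        else:
            col = rng.choices(free, weights=weights, k=1)[0]
        perm[row] = col
        free.remove(col)
    return perm


def aggregate(rankings, iterations=2000, learning_rate=0.1, seed=0):
    """Learn a consensus ranking; returns the best permutation and its cost."""
    rng = random.Random(seed)
    n = len(rankings[0])
    max_cost = len(rankings) * (n * n // 2)  # footrule is at most n*n//2
    M = [[1.0 / n] * n for _ in range(n)]    # row-stochastic matrix
    best, best_cost = None, float("inf")
    for _ in range(iterations):
        perm = sample_permutation(M, rng)
        cost = sum(footrule(perm, r) for r in rankings)
        if cost < best_cost:
            best, best_cost = perm, cost
        step = learning_rate * (1.0 - cost / max_cost)  # assumed reward in [0, 1]
        for row, col in enumerate(perm):
            M[row] = [
                p + step * (1.0 - p) if c == col else p - step * p
                for c, p in enumerate(M[row])
            ]
    return best, best_cost


# Three experts rank five candidates, identified by the indices 0..4.
experts = [[0, 1, 2, 3, 4], [1, 0, 2, 3, 4], [0, 1, 3, 2, 4]]
best, best_cost = aggregate(experts)
print(best, best_cost)
```

Tracking the best permutation seen so far, rather than reading the result off the final matrix, keeps the sketch useful even if the probabilities converge to a suboptimal ranking.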
5 Experimental results
First, the performance of the method is reported; then, we study how the proposed consensus ranking methodology performs against other methods of aggregating individual rankings, including one based on an evolutionary method. Table 1 lists the criteria and parameters used in the experimentation.
Table 2 and Fig. 1 show the average minimum distance found in 30 runs of the proposed algorithm for 1000, 5000, 10000 and 50000 iterations, with the number of candidates ranging from 3 to 10. The results show that our approach always finds the optimal result for small instances, i.e. NC <= 6, in at most 1000 iterations. For larger instances, we find good solutions in 1000 iterations, but increasing the number of iterations improves the quality of the overall ranking found by our algorithm.
Criteria | Parameters |
Number of candidates for one job (NC) | 3, 4, 5, 6, 7, 8, 9, 10 |
Number of iterations (NI) | 1000, 5000, 10000, 50000 |
Number of runs (NR) | 10 |
Number of experts (NE) | 3 |
Stop condition (SC) | NI reached, or 5 iterations without improvement |
 | 0.75 |
Table 3 shows the number of iterations needed for the algorithm to find the minimum distance value, averaged over 30 runs, for 1000, 5000, 10000 and 50000 iterations and for 3 to 10 candidates per job. For small instances of up to 6 candidates, the algorithm finds the optimal solution (the error is zero in Table 2) in fewer than 500 iterations; for larger instances, it also quickly finds quite good solutions. For example, in the case of 10 candidates, the best solution found after at most 1000 iterations has an error of 0.11.
NC | NI = 1000 | NI = 5000 | NI = 10000 | NI = 50000 |
3 | 0.00 | 0.00 | 0.00 | 0.00 |
4 | 0.00 | 0.00 | 0.00 | 0.00 |
5 | 0.00 | 0.00 | 0.00 | 0.00 |
6 | 0.00 | 0.00 | 0.00 | 0.00 |
7 | 0.04 | 0.02 | 0.01 | 0.00 |
8 | 0.08 | 0.04 | 0.04 | 0.01 |
9 | 0.11 | 0.06 | 0.06 | 0.04 |
10 | 0.11 | 0.08 | 0.07 | 0.05 |
This error quickly drops to 0.08 after max. 5000 iterations.
Figures 2 and 3 display the number of permutations grouped in distance intervals, for all possible permutations and for the permutations generated by the algorithm respectively, for 10 job candidates. Figure 2 gives an idea of the hardness of the problem: only 41 permutations belong to the first interval, corresponding to a distance between 0 and 0.9. Figure 3 shows that the proposed algorithm tends to generate permutations near the optimal value of zero.
Thus, Figure 2 illustrates the hardness of the problem, as there are only very few permutations in the best bin, and Figure 3 gives some insight into how the algorithm explores the search space.
Below we compare our algorithm to the well-known Borda-Kendall (BK) method. The BK method is a typical application of OWA operator weights in preference aggregation; that is, it corresponds to a special case of the OWA operator weight method [26]. In the study presented in [20], the BK method obtained results similar to those of other, more complex methods, and according to the experts' criteria it was rated second-best owing to its simplicity and simple calculations.
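A histogram of this kind can be reproduced for small instances by enumerating every permutation, as in the sketch below. The normalisation of the distance (division by the largest possible total footrule distance) and the number of bins are assumptions, and the function names are hypothetical; the toy instance uses 5 candidates so that only 120 permutations are scored.

```python
from collections import Counter
from itertools import permutations


def footrule(a, b):
    """Spearman footrule distance between two rankings of the same items."""
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(abs(i - pos_b[item]) for i, item in enumerate(a))


def distance_histogram(rankings, n_bins=10):
    """Count how many permutations fall into each normalised distance bin."""
    n = len(rankings[0])
    max_total = len(rankings) * (n * n // 2)  # upper bound on the total distance
    counts = Counter()
    for perm in permutations(range(n)):
        d = sum(footrule(perm, r) for r in rankings) / max_total
        counts[min(int(d * n_bins), n_bins - 1)] += 1
    return [counts[b] for b in range(n_bins)]


experts = [[0, 1, 2, 3, 4], [1, 0, 2, 3, 4], [0, 1, 3, 2, 4]]
hist = distance_histogram(experts)
print(hist)
```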
5.1 Borda-Kendall Method
The Borda-Kendall method is the most widely used technique for rank aggregation. For
The final rankings are determined by a weighted sum, where the alternative with the highest sum is most preferred followed by the other alternatives in descending sum order. Since this method determines weights to be used in a weighted sum, it is called a weight-determining method. Several weight-determining methods have been developed from this one.
In this case are
The votes that each candidate receives in the j-th ranking place. The best candidate will be the one with the least total score.
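A minimal sketch of this scoring rule follows, assuming that the weight of a vote equals the place itself (1 for first place, 2 for second, and so on), so that the lowest total wins. Breaking ties alphabetically is an additional assumption, and the function name and candidate names are hypothetical.

```python
def borda_kendall(rankings):
    """Sum each candidate's ranking places; the lowest total is ranked first."""
    scores = {}
    for ranking in rankings:
        for place, candidate in enumerate(ranking, start=1):
            scores[candidate] = scores.get(candidate, 0) + place
    # Ties are broken alphabetically (an assumption made for determinism).
    order = sorted(scores, key=lambda c: (scores[c], c))
    return order, scores


experts = [
    ["ann", "bob", "cy"],
    ["bob", "ann", "cy"],
    ["ann", "cy", "bob"],
]
order, scores = borda_kendall(experts)
print(order)   # ['ann', 'bob', 'cy']
print(scores)  # {'ann': 4, 'bob': 6, 'cy': 8}
```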
Table 4 and Figure 4 show the probability of finding permutations in the range from 0 to 0.09 for a number of iterations between 100 and 1000, as listed in column 1 of the table; the results of the proposed algorithm (column 4) are compared with the probability over all possible permutations (column 2) and with the method based on the Genetic Algorithm (GA) proposed in [12] (column 3), which was selected because of the good results reported in that paper compared with other existing methods.
NC | NI = 1000 | NI = 5000 | NI = 10000 | NI = 50000 |
3 | 2.60 | 2.10 | 2.50 | 2.50 |
4 | 14.90 | 8.80 | 9.60 | 7.90 |
5 | 51.60 | 53.50 | 49.20 | 74.20 |
6 | 254.00 | 374.20 | 356.50 | 210.30 |
7 | 376.70 | 803.40 | 1210.00 | 2790.40 |
8 | 365.10 | 1819.00 | 2592.70 | 10376.00 |
9 | 367.60 | 1954.00 | 2490.20 | 19687.40 |
10 | 368.90 | 2172.90 | 3567.00 | 18625.90 |
NI | Probability permutation random | Probability permutation GA | Probability permutation proposed-algorithm |
1000 | 0.63 | 0.67 | 0.90 |
700 | 0.50 | 0.58 | 0.81 |
600 | 0.45 | 0.55 | 0.75 |
500 | 0.39 | 0.46 | 0.72 |
400 | 0.32 | 0.41 | 0.60 |
300 | 0.26 | 0.33 | 0.48 |
200 | 0.18 | 0.26 | 0.39 |
100 | 0.10 | 0.20 | 0.31 |
Table 4 shows the probability of finding the best permutation for eight candidates.
Fig. 4 presents the probability of finding the best permutation randomly in the case of eight candidates. The table shows that our algorithm does significantly better than both random search and the genetic algorithm approach [12]. In fact, the approach proposed in [12] is similar to random search in terms of results.
6 Conclusion
The personnel selection problem is a typical problem in decision making. It concerns a set of candidates who are evaluated from the perspective of several criteria and must be ordered into a single ranking. A common problem in personnel selection is to aggregate these rankings into a definite order that assists the decision maker in the selection.
The method proposed in this paper for the aggregation is based on the reinforcement learning approach. The goal is to find a ranking that is as much in line with the individual rankings as possible.
Our study shows that we can efficiently find the best solution, or a near-best solution for large instances, compared to other methods. Another advantage of our approach is that it requires only a few parameters, so it is easy for the decision maker to use; moreover, the implementation is simple.