1 Introduction
The Internet allows efficient communication throughout the world [10]. The Internet has revolutionized the way business is carried out due to the incorporation of commercial marketing, sales, and customer service tools [10].
Due to the great importance of the Internet in organizations, e-commerce has become one of the main contributors to large companies [1]. In addition, the Internet allows communication among multiple digital devices, such as sensors and cameras, and supports environments such as smart cities [2, 3, 4]. Nowadays, the online purchasing scenario is known as “The Internet Shopping Problem”.
It is a classic scenario of electronic commerce due to the multiple benefits that users obtain by buying or acquiring goods or services through the Internet [5]. Online shopping makes it easier for people to access a wide variety of products and services offered by companies without having restrictions on time, place, or space [1].
In one of the most relevant works in the state-of-the-art field, the authors propose an innovative solution for the basic case of the Internet shopping problem with shipping costs.
This method consists of a memetic algorithm (MAIShOP) that incorporates standard instances, solution generation through the first-best heuristic, and a local search based on a heuristic that selects the lowest cost of each product in all stores [6].
Morales et al. [7] review the developed models, the implemented solution methods, and the instances used to analyze the performance of the algorithms described in the state-of-the-art.
Finally, one of the least investigated variants is the one that involves more than one optimization objective, in which both the total cost of the purchase and the delivery time of the products are considered.
Some Internet purchases require optimizing the total purchase cost, including the shipping cost and delivery time of different online stores [1]. Typically, users want to find the store with the lowest total cost and the shortest delivery time [1].
These decisions allow us to minimize the effort and maximize the benefit of the shopping list [10]. Chung [8] proposes a new Internet shopping optimization model with two objectives (total cost and delivery time), which is the first to formulate the problem as a multi-objective optimization model.
Chaerani et al. [9] establish the similarity between the model developed by Chung and the maximum flow problem with circular demand (MFP-CD), because the multiple sources correspond to the multiple stores.
Chung’s bi-objective model incorporates a decision variable on delivery time. Chaerani et al. [9] modify this decision variable using an adjustable robust counterpart (ARC) method. Chaerani et al. [1] propose the Benders decomposition method to solve the adjustable robust counterpart problem adapted to “the Internet Shopping Problem (ARC-ISOP)”.
García-Morales et al. [10] propose a “MOEA/D algorithm to solve the bi-objective Internet shopping optimization problem (MOEA/D-BIShOP)”; this algorithm is a basic version of MOEA/D and is clearly superior to the state-of-the-art results in two of the three evaluated metrics.
This research work proposes the implementation of a multi-objective evolutionary algorithm based on decomposition with adaptive adjustment of control parameters as a solution method to “the Bi-objective problem of Internet Shopping (MOEA/D-AACPBIShOP)”.
In the computational feasibility tests, nine instances generated using the Web Scraping technique with data from technological products extracted from Amazon were used [10].
1.1 Definition of the Problem
This model is first proposed by Chung [8] to solve “the bi-objective Internet shopping optimization problem”. “In this problem, a customer wants to buy a set of
Now, the set
The Bi-objective Internet Shopping Optimization Problem (BIShOP) consists of minimizing the total cost of purchasing all products
Variable/Parameter | Description |
Group of stores | |
Group of products | |
Array solution | |
Number of stores, |
|
Number of products, |
|
Store indicator | |
Product indicator | |
Container of products available in a store |
|
Shipping cost of all products in the store |
|
Cost of product |
|
Delivery time of a product |
|
Binary variable that indicates whether product |
|
Binary variable indicating whether to add the shipping cost of store
The model presents the optimization of two objectives: one is the purchase cost, and the other is the delivery time limitation. The first objective seeks to minimize the purchase cost; the second objective seeks to minimize the delivery time of the products (see Equation 1):
where
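As an illustration of the two objectives described above, the following Python sketch evaluates one candidate solution. It is not the authors' implementation. The names `evaluate`, `cost`, `days`, and `ship` are hypothetical, the solution is assumed to be a vector that assigns a store index to each product, and the delivery-time objective is assumed to be the sum of the delivery times of the selected offers, because the exact form of Equation 1 is not reproduced in this text. A different aggregation, such as `max`, can be passed instead.

```python
from typing import Callable, Sequence, Tuple

def evaluate(solution: Sequence[int],
             cost: Sequence[Sequence[float]],
             days: Sequence[Sequence[float]],
             ship: Sequence[float],
             aggregate: Callable = sum) -> Tuple[float, float]:
    """Return (total_cost, delivery_time) for one candidate solution.

    solution[j] is the store chosen for product j (assumed encoding).
    cost[i][j] and days[i][j] are the price and delivery time of product j
    in store i; ship[i] is charged once for every store that is used.
    """
    product_cost = sum(cost[store][j] for j, store in enumerate(solution))
    shipping_cost = sum(ship[store] for store in set(solution))
    delivery = aggregate(days[store][j] for j, store in enumerate(solution))
    return product_cost + shipping_cost, delivery

# Hypothetical toy instance: 2 stores, 3 products.
cost = [[10.0, 20.0, 30.0], [12.0, 15.0, 33.0]]
days = [[1, 2, 3], [2, 1, 5]]
ship = [5.0, 8.0]
print(evaluate([0, 1, 0], cost, days, ship))
```

Buying everything in one store avoids the second shipping charge but may raise the product cost or the delivery time, which is the trade-off the bi-objective model captures.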
2 General Structure of Multi-Objective Algorithms Applied to BIShOP
This section provides a detailed explanation of the essential components that form the multi-objective optimization algorithms utilized in “BIShOP”. To represent each solution in the population, these algorithms employ a vector representation, which is an
2.1 Crossover Operator
This operator randomly selects two solutions called parent1 and parent2 [11]. The solution child1 is generated by taking the initial half of parent1 and joining it with the second half of parent2. Later, to form child2, the initial half of parent2 is joined with the second half of parent1 [12]. Subsequently, a random number is generated; if this generated value is less than 0.5, the crossover operator selects child1; otherwise, it takes child2 to advance to the mutation process.
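The half-and-half crossover described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `half_crossover` is hypothetical, solutions are assumed to be lists of store indices, and for odd lengths the "initial half" is assumed to be rounded down.

```python
import random
from typing import List, Optional

def half_crossover(parent1: List[int], parent2: List[int],
                   rng: Optional[random.Random] = None) -> List[int]:
    """Build two children from the halves of the parents and return one of them."""
    rng = rng or random.Random()
    half = len(parent1) // 2                   # assumption: floor for odd lengths
    child1 = parent1[:half] + parent2[half:]   # first half of parent1, second half of parent2
    child2 = parent2[:half] + parent1[half:]   # first half of parent2, second half of parent1
    # A random number below 0.5 selects child1; otherwise child2 is selected.
    return child1 if rng.random() < 0.5 else child2

print(half_crossover([0, 0, 0, 0], [1, 1, 1, 1], random.Random(1)))
```

Passing a seeded `random.Random` instance makes the choice between the two children reproducible across runs.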
The crossover operator uses
2.2 Mutation Operator
The mutation process of the MOEA/D algorithm takes the candidate solution selected by the crossover operator. It immediately positions itself on the first element of the solution and generates a random number; if this random value is less
This process continues until all elements of the current solution have been examined. The mutation process of the NSGA-II algorithm goes through all the elements of the vector and searches for the store in which each product has the lowest cost. This search ends when all stores have been reviewed for all products.
2.3 The Non-Dominated Sorting Genetic Algorithm II to Solve the IShOP Bi-Objective Problem (NSGA-II/BIShOP)
NSGA-II is a multi-objective optimization algorithm proposed as an improvement of NSGA [13]. It uses the structure of genetic algorithms and is based on two principles: the best individuals never disappear from the population, and, during selection, if two non-dominated solutions are found, the most diverse one is preferred.
Algorithm 1 describes the general structure of the NSGA-II algorithm applied to the BIShOP problem. The algorithm in step 1 starts by defining the parameters such as chromosome size
In step 2, a Pop population is created randomly. From steps 3 to 5, the population is ordered according to the levels of non-dominance (ordering of the Pareto fronts:
The population obtained in the previous step is used in the crossover operator and is updated in step 8. In step 9, the mutation operator is applied and a new population of descendants, PopM, is obtained. In step 10, the three populations (Pop, PopC, and PopM) are joined. From steps 11 to 13, a rank is assigned to each individual according to its front and the crowding distance is computed; the individuals are then ordered, first by front from lowest to highest and then by crowding distance from highest to lowest. In step 14, the list of elements is truncated to leave only the best individuals, and which fits the initial
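The ranking and truncation steps described above can be illustrated with the following Python sketch for two minimized objectives. It is a simplified illustration under stated assumptions, not Algorithm 1 itself: the function names are hypothetical, and the sorting routine is a straightforward quadratic version rather than the fast non-dominated sort used in NSGA-II implementations.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]

def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_sort(points: Sequence[Point]) -> List[List[int]]:
    """Return the fronts as lists of indices; the first front is non-dominated."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i]) for j in remaining if j != i)]
        fronts.append(sorted(front))
        remaining -= set(front)
    return fronts

def crowding_distance(points: Sequence[Point], front: List[int]) -> Dict[int, float]:
    """Crowding distance of every index in one front (boundary points get infinity)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][m])
        lo, hi = points[ordered[0]][m], points[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(ordered, ordered[1:], ordered[2:]):
            dist[cur] += (points[nxt][m] - points[prev][m]) / (hi - lo)
    return dist

def truncate(points: Sequence[Point], size: int) -> List[int]:
    """Order by front (ascending), then crowding distance (descending), and cut."""
    selected: List[int] = []
    for front in non_dominated_sort(points):
        dist = crowding_distance(points, front)
        selected.extend(sorted(front, key=lambda i: -dist[i]))
    return selected[:size]

points = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
print(non_dominated_sort(points))
print(truncate(points, 2))
```

In the toy data, the first three points form the first front, and the two boundary points of that front survive a truncation to size 2 because their crowding distance is infinite.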
2.4 The Multi-Objective Evolutionary Algorithm based on Decomposition with Adaptive Adjustment of Control Parameters to Solve the IShOP Bi-Objective Problem (MOEA/D AACPBIShOP)
The multi-objective evolutionary algorithm based on decomposition (MOEA/D) was developed by Zhang and Li [14, 15, 16] and serves as a reliable and robust alternative for working with MOPs. Initially it makes a distribution of a set of weight vectors
Subsequently it creates a matrix of
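The initialization of weight vectors and neighborhoods can be sketched as below for the two-objective case. This is a hedged illustration that follows the general MOEA/D scheme of Zhang and Li, in which each subproblem's neighborhood consists of the closest weight vectors by Euclidean distance. The function names and the evenly spaced weights are assumptions for illustration, and the neighborhood size `t` is a hypothetical parameter.

```python
import math
from typing import List, Tuple

def uniform_weights(n: int) -> List[Tuple[float, float]]:
    """n evenly spaced weight vectors (w, 1 - w) for two objectives (n >= 2)."""
    return [(i / (n - 1), 1.0 - i / (n - 1)) for i in range(n)]

def neighborhoods(weights: List[Tuple[float, float]], t: int) -> List[List[int]]:
    """For each weight vector, the indices of its t closest vectors (itself included)."""
    result = []
    for w in weights:
        order = sorted(range(len(weights)), key=lambda j: math.dist(w, weights[j]))
        result.append(order[:t])
    return result

weights = uniform_weights(5)
for index, neighbors in enumerate(neighborhoods(weights, 3)):
    print(index, weights[index], neighbors)
```

Because parents are later drawn from these neighborhoods, subproblems with similar weight vectors exchange information, which is the central idea of decomposition-based search.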
The MOEA/D-AACPBIShOP algorithm is represented in Algorithm 2. In steps 1 to 4,
The main loop runs through all individuals within the population. In step 7, two parents are chosen. These are taken from the neighborhoods created in
The aggregation values of the two are calculated using
In this research work, a modified version of the adaptive operator selection method, the Fitness-Rate-Rank-Based Multi-Armed Bandit (FRRMAB) method [18], is used to achieve adaptive adjustment of control parameters. Because the range of raw fitness improvements varies between problems and between stages of the search, the FRRMAB method avoids this problem by using fitness improvement rates (FIR). The formula for calculating these rates is shown in Equation 3:
where
To assign credits to action
The lower the
Bandit-based action selection chooses an action considering the credits assigned to it and using the FRR values as a quality indicator [18]. This process is shown in Algorithm 4.
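The credit assignment and bandit-based selection described above can be sketched in Python as follows. The sketch follows the general FRRMAB scheme (fitness improvement rates stored in a sliding window, rank-based decay, and an upper-confidence-bound choice) and is not a transcription of Algorithms 3 and 4. All function and parameter names are hypothetical, the FIR is assumed to take the form (parent fitness - child fitness) / parent fitness because Equation 3 is not reproduced in this text, and the default values of `scale` and `decay` are illustrative only.

```python
import math
import random
from typing import List, Sequence, Tuple

def fir(parent_fitness: float, child_fitness: float) -> float:
    """Fitness improvement rate of a child over its parent (minimization, assumed form)."""
    return (parent_fitness - child_fitness) / parent_fitness

def frr_values(window: Sequence[Tuple[int, float]], n_actions: int,
               decay: float = 1.0) -> List[float]:
    """Sum the FIRs per action, rank the sums, decay them by rank, and normalize."""
    reward = [0.0] * n_actions
    for action, rate in window:
        reward[action] += rate
    order = sorted(range(n_actions), key=lambda a: -reward[a])
    rank = {a: r + 1 for r, a in enumerate(order)}      # the best action has rank 1
    decayed = [decay ** rank[a] * reward[a] for a in range(n_actions)]
    total = sum(decayed)
    return [d / total if total > 0 else 0.0 for d in decayed]

def select_action(window: Sequence[Tuple[int, float]], n_actions: int,
                  scale: float = 5.0, decay: float = 1.0, rng=random) -> int:
    """Try actions absent from the window first; otherwise maximize FRR plus exploration."""
    counts = [0] * n_actions
    for action, _ in window:
        counts[action] += 1
    unused = [a for a in range(n_actions) if counts[a] == 0]
    if unused:
        return rng.choice(unused)
    frr = frr_values(window, n_actions, decay)
    total = sum(counts)
    score = [frr[a] + scale * math.sqrt(2 * math.log(total) / counts[a])
             for a in range(n_actions)]
    return max(range(n_actions), key=lambda a: score[a])

# Hypothetical sliding window of (action, FIR) pairs.
window = [(0, fir(10.0, 8.0)), (1, fir(10.0, 4.0)), (0, fir(5.0, 4.0))]
print(frr_values(window, 2))
print(select_action(window, 2))
```

A smaller `decay` gives more weight to the best-ranked action, while a larger `scale` favors exploring actions that appear rarely in the window.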
Algorithm 5 contains the various actions that can be executed; the selected action determines the increase or decrease in the value of the parameters that are adjusted adaptively.
3 Computational Experiments
The names of instances determine their size,
The instances are built from data obtained through Web Scraping of multiple technological products (USB flash drives, modems, RAM) on Amazon’s e-commerce website. In this process, approximately 8002 records containing product names, prices, suppliers, delivery times, and shipping costs were obtained [10].
Fig. 1 shows the process of building the instances from real-world data, which is described below. First, product and store information is collected from the Amazon.com page.
To do so, an application is built in the Python language that explores the search engine and obtains information using the Web Scraping technique, with various keywords such as laptop, headphones, and speakers, among others.
Each keyword is searched to a depth of 10 pages, and the Beautiful Soup Python library is used to process the information [10]. A first version of the instances is then generated by taking the products obtained within a defined price range, and the corresponding stores are obtained.
Shipping times are defined arbitrarily (randomly) with values between 1 and 5 days. For the shipping cost, four arbitrary values (88, 99, 120, and 140) are used, which are assigned randomly. The types of instances generated are shown in Table 2 [10].
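The random assignment of shipping times and shipping costs can be sketched as follows. This is only an illustration of the rule stated above: the function name `random_attributes` is hypothetical, and the sketch assumes one shipping cost per store and one delivery time per store and product, which the text does not state explicitly.

```python
import random
from typing import List, Optional, Tuple

SHIPPING_COSTS = (88, 99, 120, 140)   # the four values stated in the text

def random_attributes(n_stores: int, n_products: int,
                      seed: Optional[int] = None) -> Tuple[List[int], List[List[int]]]:
    """Draw a shipping cost per store and a delivery time (1 to 5 days) per offer."""
    rng = random.Random(seed)
    ship = [rng.choice(SHIPPING_COSTS) for _ in range(n_stores)]
    days = [[rng.randint(1, 5) for _ in range(n_products)] for _ in range(n_stores)]
    return ship, days

ship, days = random_attributes(n_stores=3, n_products=20, seed=7)
print(ship)
print(days[0])
```

Fixing the seed keeps a generated instance reproducible, which matters when the same instance is reused across the 30 runs of each algorithm.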
3.1 Configuration of the Parameters
The configuration parameters of the proposed MOEA/D-AACPBIShOP algorithm are shown below: pop =
The above configuration was determined based on related works found in the state-of-the-art. Modifying the values of the parameters can affect the behavior of the algorithm and, therefore, the quality of the solutions.
The size of the population is important because it affects the diversity and convergence of the algorithm. A small population can lead to loss of diversity, early convergence, and reduced performance.
An inadequate number of generations can cause the algorithm to converge prematurely or to consume excessive resources. Incorrect use of the crossover and mutation operators can lead to stagnation or inefficient exploration of the solution space. The size of the neighborhood also contributes to the quality of the generated solutions, because it determines the number of neighboring solutions to explore.
In the computational experiments, the 30 non-dominated fronts were obtained from each of the three sets of instances for each subset; subsequently, non-parametric tests were applied, and the
Table 3 shows the parameters used for each algorithm. The algorithms were implemented in the Java language.
3.2 Results
Tables 4, 6, and 8 organize the experimental results by metric. Friedman and Wilcoxon non-parametric tests were used with a significance level of 5%.
Table 4. Hypervolume (HV) results; the second column (MOEA/D-BIShOP) is the reference algorithm.
Problem | MOEA/D-BIShOP | MOEA/D-AACPBIShOP | NSGA-II/BIShOP |
3n20m | 0.00e+00 3.33e-01 | 3.33e-01 3.33e-01 == | 1.00e+00 3.33e-16 ▼ |
4n20m | 0.00e+00 2.50e-01 | 0.00e+00 2.50e-01 == | 1.00e+00 2.50e-01 ▼ |
5n20m | 0.00e+00 3.24e-01 | 0.00e+00 3.33e-01 == | 1.00e+00 3.33e-16 ▼ |
5n240m | 0.00e+00 3.33e-01 | 0.00e+00 3.33e-01 == | 1.00e+00 3.33e-16 ▼ |
5n400m | 0.00e+00 0.00e+00 | 0.00e+00 3.33e-01 == | 1.00e+00 3.33e-16 ▼ |
50n240m | 0.00e+00 0.00e+00 | 0.00e+00 0.00e+00 == | 0.00e+00 0.00e+00 == |
50n400m | 0.00e+00 0.00e+00 | 0.00e+00 0.00e+00 == | 0.00e+00 0.00e+00 == |
100n240m | 1.00e+00 0.00e+00 | 1.00e+00 0.00e+00 == | 0.00e+00 0.00e+00 == |
100n400m | 1.00e+00 0.00e+00 | 0.00e+00 0.00e+00 == | 1.00e+00 0.00e+00 == |
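Table 6. Generalized spread (GS) results; the second column (MOEA/D-BIShOP) is the reference algorithm.
Problem | MOEA/D-BIShOP | MOEA/D-AACPBIShOP | NSGA-II/BIShOP |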
3n20m | 4.86e-01 1.02e-01 | 4.83e-01 6.74e-02 ▼ | 4.66e-01 5.52e-02 ▼ |
4n20m | 4.15e-01 6.55e-02 | 4.15e-01 6.64e-02 == | 4.05e-01 5.43e-02 ▼ |
5n20m | 4.86e-01 1.28e-01 | 4.98e-01 9.84e-02 ▲ | 4.66e-01 5.52e-02 ▼ |
5n240m | 4.94e-01 9.88e-02 | 4.89e-01 7.74e-02 ▼ | 4.76e-01 8.29e-02 ▼ |
5n400m | 4.15e-01 8.92e-02 | 4.11e-01 7.67e-02 ▼ | 4.07e-01 6.41e-02 ▼ |
50n240m | 0.00e+00 0.00e+00 | 0.00e+00 0.00e+00 == | 0.00e+00 0.00e+00 == |
50n400m | 0.00e+00 0.00e+00 | 0.00e+00 0.00e+00 == | 0.00e+00 0.00e+00 == |
100n240m | 0.00e+00 0.00e+00 | 0.00e+00 0.00e+00 == | 0.00e+00 0.00e+00 == |
100n400m | 0.00e+00 0.00e+00 | 0.00e+00 0.00e+00 == | 0.00e+00 0.00e+00 == |
Table 8. Inverted generational distance (IGD) results; the second column (MOEA/D-BIShOP) is the reference algorithm.
Problem | MOEA/D-BIShOP | MOEA/D-AACPBIShOP | NSGA-II/BIShOP |
3n20m | 6.94e-01 1.93e-01 | 6.94e-01 2.20e-01 == | 6.65e-01 2.57e-01 ▼ |
4n20m | 1.58e+00 8.64e-01 | 1.59e+00 8.74e-01 ▲ | 1.53e+00 9.49e-01 ▼ |
5n20m | 9.53e-01 6.41e-01 | 9.60e-01 6.45e-01 ▲ | 8.92e-01 7.43e-01 ▼ |
5n240m | 1.87e+00 1.25e+00 | 1.89e+00 1.16e+00 ▲ | 1.83e+00 1.30e+00 ▼ |
5n400m | 1.77e+00 1.23e+00 | 1.79e+00 1.12e+00 ▲ | 1.73e+00 1.39e+00 ▼ |
50n240m | 1.34e+154 0.00e+00 | 1.34e+154 0.00e+00 == | 1.34e+154 0.00e+00 == |
50n400m | 1.34e+154 0.00e+00 | 1.34e+154 0.00e+00 == | 1.34e+154 0.00e+00 == |
100n240m | 1.34e+154 0.00e+00 | 1.34e+154 0.00e+00 == | 1.34e+154 0.00e+00 == |
100n400m | 1.34e+154 0.00e+00 | 1.34e+154 0.00e+00 == | 1.34e+154 0.00e+00 == |
The first column of each table corresponds to the name of the evaluated instance. The second column corresponds to the results of the reference algorithm (MOEA/D-BIShOP). The third column contains the results of the proposed MOEA/D-AACPBIShOP, and the fourth contains the results of the NSGA-II algorithm.
In each table, the symbol ▲ indicates a statistically significant difference in favor of the reference algorithm, the symbol ▼ indicates a statistically significant difference in favor of the comparison algorithm (current column), and the symbol == means that the algorithms being compared have the same statistical performance.
Cells shaded in dark gray mark the winning algorithm for a given problem, and cells shaded in light gray mark second place.
3.2.1 Hypervolume
The hypervolume (HV) calculates the volume of the objective space weakly dominated by an approximation set [17]. In Table 4, the second column corresponds to the reference algorithm. As can be seen, in the hypervolume metric, the NSGA-II/BIShOP algorithm is statistically better than the reference algorithm in five of the nine problems, whereas the MOEA/D-AACPBIShOP algorithm has a performance similar to that of the reference algorithm.
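For two minimized objectives, the hypervolume reduces to an area that can be computed with a simple sweep. The following Python sketch illustrates the idea. It is not the tool used to produce the reported values: the function name `hypervolume_2d` and the reference point are assumptions, and the sketch applies no normalization.

```python
from typing import Iterable, Tuple

def hypervolume_2d(front: Iterable[Tuple[float, float]],
                   ref: Tuple[float, float]) -> float:
    """Area dominated by `front` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that are no worse than the reference point, sorted by f1.
    pts = sorted(p for p in front if p[0] <= ref[0] and p[1] <= ref[1])
    area, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:                      # points dominated by earlier ones add nothing
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area

front = [(1, 3), (2, 2), (3, 1), (3, 3)]      # the last point is dominated
print(hypervolume_2d(front, ref=(4, 4)))
```

A larger value means that the approximation set covers more of the objective space between the front and the reference point, so higher is better for this metric.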
3.2.1.1 Friedman Test
The p-value calculated with the Friedman test is 0.12110333239233029; since this value exceeds the 5% level of statistical significance, the result is not significant. Table 5 shows the average ranks per algorithm obtained with the Friedman test. The Friedman test therefore suggests that no algorithm differs significantly.
The above shows that, in terms of hypervolume, the three algorithms obtain approximate Pareto fronts of statistically similar quality for the evaluated instances.
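To make the decision rule explicit, the following Python sketch computes the Friedman statistic and its p-value for the specific case of three algorithms. It is an illustration under stated assumptions, not the procedure used by the authors: the function names are hypothetical, ties receive average ranks without a tie correction, and the p-value uses the chi-square approximation, whose survival function for two degrees of freedom is exp(-x/2). With a significance level of 5%, the hypothesis of equal performance is rejected only when the p-value is below 0.05.

```python
import math
from typing import List, Sequence, Tuple

def average_ranks(row: Sequence[float]) -> List[float]:
    """Rank values in ascending order; tied values share the mean of their ranks."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def friedman_three(rows: Sequence[Sequence[float]]) -> Tuple[List[float], float, float]:
    """Mean ranks, Friedman statistic, and p-value for exactly three algorithms."""
    n, k = len(rows), 3
    ranked = [average_ranks(r) for r in rows]
    mean_rank = [sum(r[a] for r in ranked) / n for a in range(k)]
    stat = 12 * n / (k * (k + 1)) * sum((m - (k + 1) / 2) ** 2 for m in mean_rank)
    p_value = math.exp(-stat / 2)     # chi-square survival function, 2 degrees of freedom
    return mean_rank, stat, p_value

# Hypothetical metric values: one row per instance, one column per algorithm.
rows = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.3], [0.1, 0.3, 0.2], [0.3, 0.3, 0.3]]
mean_rank, stat, p_value = friedman_three(rows)
print(mean_rank, stat, p_value, "significant" if p_value < 0.05 else "not significant")
```

Rows in which all algorithms obtain the same value contribute identical ranks and pull the statistic toward zero, which pushes the p-value toward 1.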
3.2.2 Generalized Spread
Generalized Spread (GS) evaluates the degree of dispersion and uniformity of the solutions identified. In Table 6, the second column is the reference algorithm. As can be seen, in the generalized spread metric, the reference algorithm is statistically better than the MOEA/D-AACPBIShOP algorithm in one of the nine problems, and it has a lower performance than the NSGA-II/BIShOP algorithm.
3.2.2.1 Friedman Test
The p-value calculated with the Friedman test is 0.09697196786440554; since this value exceeds the 5% level of statistical significance, the result is not significant. Table 7 shows the average ranks per algorithm obtained with the Friedman test. The Friedman test therefore suggests that no algorithm differs significantly, and the approximate Pareto fronts obtained by the three algorithms have similar performance.
3.2.3 Inverted Generational Distance
The inverted generational distance (IGD) gives the average distance between any point in the reference set and its nearest point in the approximation set [18]. In Table 8, the second column is considered as the reference algorithm.
As can be seen, in the inverted generational distance metric, the reference algorithm is statistically better than the MOEA/D-AACPBIShOP algorithm in four of the nine problems, and it has a lower performance than the NSGA-II/BIShOP algorithm.
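The definition of the metric given above translates directly into a few lines of Python. The sketch below is illustrative only: the function name `igd` is hypothetical, Euclidean distance is assumed, and no normalization of the objectives is applied.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]

def igd(reference_set: Sequence[Point], approximation_set: Sequence[Point]) -> float:
    """Mean distance from each reference point to its nearest approximation point."""
    return sum(min(math.dist(r, a) for a in approximation_set)
               for r in reference_set) / len(reference_set)

reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
approximation = [(0.0, 1.0), (1.0, 0.0)]
print(igd(reference, approximation))
```

Lower values are better: the metric is zero only when every reference point is matched exactly, so it penalizes both poor convergence and gaps in coverage of the reference front.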
3.2.3.1 Friedman Test
The p-value calculated with the Friedman test is 1.0, so with a level of statistical significance of 5%, it is not significant. Table 9 shows the average ranks per algorithm obtained with the Friedman test. The Friedman test suggests that no algorithm differs significantly. Therefore, under the inverted generational distance metric, the three algorithms approximate the reference front with statistically similar quality.
4 Conclusions and Future Work
With the results obtained, it is observed that the three multi-objective algorithms have statistically similar performance in the three evaluated metrics, which suggests that the algorithms achieve good dispersion of the solutions and similar convergence.
Therefore, it is expected that the performance of these new BIShOP solution methods can be improved by using other genetic operators and including new elements.
As future work, this paper proposes exploring and developing new genetic operators. The resulting solution methods would also be very useful in online stores, Internet search engines, and other complex problems similar to BIShOP.
These tools allow Internet searches to be carried out considering more than one attribute at a time and allow more than one solution to be offered, which can provide great benefits to users and companies.