The modifiable areal unit problem in the analysis of the demand for urban freight transport

Tovar Plata, Lizbeth; Hinojosa Reyes, Raquel

doi:10.35424/rcarto.i105.1383

Revista cartográfica

On-line version ISSN 2663-3981; print version ISSN 0080-2085

Rev. cartogr. no.105 Ciudad de México jul./dic. 2022 Epub 10-Oct-2022

https://doi.org/10.35424/rcarto.i105.1383

Articles

The modifiable areal unit problem in the analysis of the demand for urban freight transport

El problema de la unidad espacial modificable en el análisis de la demanda de transporte urbano de mercancías

Lizbeth Tovar Plata¹
http://orcid.org/0000-0003-1499-0810

Raquel Hinojosa Reyes²
http://orcid.org/0000-0002-6511-0759

¹ Universidad Autónoma del Estado de México, México, e-mail: ltovarp@uaemex.mx

² Universidad Autónoma del Estado de México, México, e-mail: rhinojosar@uaemex.mx

Abstract

This document diagnoses the effects of the Modifiable Areal Unit Problem (MAUP) in the study of the demand for urban freight transport by comparing different scales of analysis to determine the relevance of working with different levels of detail. It proposes analyzing the number of retail businesses, because they are the main attractors of urban freight transport, together with other socio-territorial variables, in three different spatial units: basic geostatistical area, electoral section, and regular 500-by-500-meter grids. For each scale, descriptive statistics, correlation between variables, and global and local spatial autocorrelation (Moran's I and LISA) were obtained to determine the impact of the MAUP. The results show that the electoral section is not an adequate scale unless it is used as a complementary one, since its statistical results show greater variation. However, its cartographic representation reveals spatial groupings that are not observed at the other scales, which can be a valuable contribution. At the Basic Geostatistical Area (BGA) scale and in the 500-by-500-meter grids, the data show less dispersion, and the variables analyzed show a greater correlation and a greater global spatial autocorrelation, so these may be the most appropriate scales for modeling freight transport in the study area. The site that showed no discrepancy, regardless of the scale of analysis, was the historic downtown of the city of Toluca. It can therefore be considered a priority site, based on its demand for merchandise and its territorial characteristics, where regulation and control alternatives for this transport can be proposed, together with infrastructure to make deliveries more efficient and to mitigate its negative impacts.

Keywords: Urban Freight Transport; Modifiable Areal Unit Problem; Spatial Autocorrelation; Analysis scale

Resumen

El presente documento diagnostica los efectos del Problema de la Unidad Espacial Modificable (PUEM) en el estudio de la demanda de transporte urbano de mercancías mediante una comparación de distintas escalas de análisis para determinar la pertinencia de trabajar con diferentes niveles de detalle. Se propone analizar los comercios minoristas, por ser los principales atractores de transporte urbano de mercancías, y otras variables socio-territoriales en tres unidades espaciales distintas: área geoestadística básica, sección electoral y cuadrícula regular de 500 x 500 metros. Para cada escala se obtuvieron parámetros de estadísticos descriptivos, correlación entre variables y autocorrelación espacial global y local (I de Moran-LISA) para determinar el impacto del Problema de la Unidad Espacial Modificable. Los resultados obtenidos muestran que las secciones electorales no resultan una escala adecuada, a menos que se use como una escala complementaria, ya que los resultados estadísticos señalan una mayor variación, sin embargo, su representación cartográfica permite ver agrupaciones espaciales que no se observan en las otras escalas, lo que puede ser un valioso aporte. En el caso de la escala por AGEB y las cuadrículas de 500 x 500 metros los datos tienen menor dispersión, las variables analizadas muestran una mayor correlación y presentan una mayor autocorrelación espacial global por lo que se determina que pueden ser las escalas más adecuadas para modelar el transporte de mercancía en la zona de estudio. El sitio que no tuvo discrepancia, independientemente de la escala de análisis, es el centro histórico de la ciudad de Toluca, por lo que puede considerarse un sitio prioritario en función de su demanda de mercancías y características territoriales en el que se pueden proponer alternativas de regulación y control de este transporte, así como creación de infraestructura para hacer más eficientes las entregas y mitigar sus impactos negativos.

Palabras clave: Transporte Urbano de Mercancías; Problema de la Unidad Espacial Modificable; Autocorrelación espacial; Escala de análisis

1. Introduction

In Mexico, there is no specific zoning for transportation studies, such as the Traffic Analysis Zones (TAZ) used in countries such as the United States and Canada to model transportation demand (Department Of Transport U.S, 2021; Sun, 2007). At the national level, the National Institute of Statistics and Geography (INEGI, by its acronym in Spanish) is in charge of producing and disseminating geographic information at several scales: national, state, municipal, locality, Basic Geostatistical Area (BGA), electoral section, and block. In addition, the Mexican Transportation Institute (IMT) generates and publishes information on national transportation, mainly at the state and municipal levels, which are the two main scales used in most transportation demand studies (Moreno et al., 2021; Betanzo, 2015; Gradilla & Rico, 2005).

By its nature, urban freight transport requires data at the urban, local, or even establishment level. However, given the aggregation levels of the existing data for transport studies, the Modifiable Areal Unit Problem (MAUP) often arises. The MAUP refers to the sensitivity of statistical and cartographic results to the unit of analysis in which the data are aggregated. It occurs because areal units are usually defined subjectively and their limits are modifiable (Buzzelli, 2020). This problem is well known in geography and statistics, but it has rarely been analyzed in the study of transport (Briz-Redón et al., 2019). This document therefore aims to show how descriptive statistics and some spatial analysis techniques vary across different scales of analysis when modeling the demand for urban freight transport, in order to determine which scale is best suited to analyzing this transport.

To achieve this goal, this document is divided into four sections. The first presents a literature review on two main topics: the MAUP and, specifically, its consideration in urban freight transport studies. The second describes the materials and methods used to analyze the effect of the MAUP at different scales of analysis (BGA, electoral section, and regular 500-by-500-meter grids) on the variables related to the demand for urban freight transport in the study zone. The third presents the results of the descriptive statistics, such as measures of central tendency, dispersion, and distribution, the correlation between variables, and the global and local spatial autocorrelation obtained at the three scales of analysis. Finally, the fourth states the final considerations.

1.1 The Modifiable Spatial Unit Problem in spatial analysis

The Modifiable Spatial Unit Problem, also known as the Modifiable Areal Unit Problem (MAUP), arises because measurements on cross-sectional data are sensitive to the level of aggregation and to the way contiguous units are combined (Anselin, 1988). It has two effects. The first is the scale effect, which refers to variations in the results when spatial units are aggregated into larger ones, for example, data aggregated by federal entity versus data disaggregated by municipality. The second is the zoning effect, which concerns the differences in the results when the units are drawn differently, that is, when different aggregation alternatives are used for a fixed number of zones (Vela, 2016).

When the unit of analysis is modified, the data can be regrouped in a practically unlimited number of different arrangements. As a result, the results and the relationships between phenomena differ according to the scale of observation and the spatial extent of the region studied. This difference is evident in both the cartographic representation and the statistical results of the phenomena analyzed, because the results of these analyses depend directly on the definition of the units studied (Buzzelli, 2020).
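The scale and zoning effects described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the synthetic point pattern, the 4,000 m extent, the cell sizes, and the function name `count_per_cell` are illustrative assumptions, not the data or the procedure of this study. The same points are counted in regular grids of two cell sizes (scale effect) and in a grid of the same cell size with shifted boundaries (zoning effect).

```python
import math
import random
from statistics import mean, pvariance


def count_per_cell(points, cell_size, extent, origin=0.0):
    """Count points per square cell of a regular grid.

    The grid starts at (origin, origin) and has just enough cells to
    cover every point with coordinates in [0, extent).
    """
    n_cells = math.ceil((extent - origin) / cell_size)
    counts = [[0] * n_cells for _ in range(n_cells)]
    for x, y in points:
        col = int((x - origin) // cell_size)
        row = int((y - origin) // cell_size)
        counts[row][col] += 1
    return [c for row in counts for c in row]


random.seed(42)
EXTENT = 4000.0  # metres; hypothetical study area


def clamp(v):
    """Keep a coordinate inside [0, EXTENT)."""
    return min(max(v, 0.0), EXTENT - 1e-6)


# Hypothetical "stores": 300 clustered around a centre, 300 spread uniformly.
points = [(clamp(random.gauss(2000, 300)), clamp(random.gauss(2000, 300)))
          for _ in range(300)]
points += [(random.uniform(0, EXTENT - 1e-6), random.uniform(0, EXTENT - 1e-6))
           for _ in range(300)]

fine = count_per_cell(points, 500.0, EXTENT)             # scale: 500 m cells
coarse = count_per_cell(points, 1000.0, EXTENT)          # scale: 1000 m cells
shifted = count_per_cell(points, 500.0, EXTENT, -250.0)  # zoning: moved boundaries

for name, counts in [("500 m", fine), ("1000 m", coarse), ("500 m shifted", shifted)]:
    print(name, len(counts), round(mean(counts), 2),
          round(pvariance(counts), 2), max(counts))
```

The total number of points is identical in the three aggregations, yet the number of units and the per-unit summaries differ, which is the behavior the MAUP describes.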

Some studies of the sensitivity to the MAUP found that parameters such as the correlation coefficient, univariate statistics, factor analysis, and variance, as well as the results and significance of bivariate and multiple regression models, vary with the number and size of the analysis areas (Clark & Avery, 1976; Ravenel, 2003; Xu et al., 2014; Biehl et al., 2018). The MAUP directly affects the results of spatial analysis tools and can substantially change both the cartography and the output statistics. This occurs mainly in tools whose methods are based on spatial contiguity (maps of discontinuities, local deviations, spatial autocorrelation, measurement of spillover effects, among others), since contiguity depends entirely on the choice of basic territorial units (Grasland & Madelin, 2006). Therefore, this paper analyzes the sensitivity to the MAUP with three methods in particular: descriptive statistics, correlation between variables, and spatial autocorrelation, which make it possible to observe the behavior of the data individually, with respect to other analysis variables, and at the spatial level, respectively.

In the past, most studies ignored MAUP effects because the data and tools needed to address them were not available. Digital cartographic data and Geographic Information Systems now make it possible to evaluate the effects of the MAUP and select the optimal units of analysis. Currently, various fields that rely on spatial analysis pay special attention to the MAUP. Some of the main topics are health studies, vegetation analysis, spatial segregation, and relationships between species, among others (Wang & Di, 2020; Nouri et al., 2017; Nielsen & Hennerdal, 2017; Lechner et al., 2012).

1.2 MAUP in urban freight transport studies

The effects of the MAUP have received little attention in transportation studies. They have been analyzed mainly in passenger transport, both motorized and non-motorized (Biehl et al., 2018; Mitra & Buliung, 2012), in road safety issues (Briz-Redón et al., 2019; Xu et al., 2018), in the influence of urban form on the choice of travel mode (Zhang & Kukadia, 2005), and in the design of traffic analysis zones (Viegas et al., 2009).

Although urban freight transport has not received as much attention as passenger transport, in recent decades there has been growing interest in including spatial variables in its study (Ducret, 2015; Alho & Silva, 2014). These variables are measured at different scales of analysis, which can be macroscopic (city or metropolitan level), mesoscopic (corridor, neighborhood, or local level), or microscopic (establishment level). The microscopic scale has advantages over the other two, as it can be related more directly to the explanatory variables and can reflect the behavior of decision makers in freight transport. However, this scale requires estimating the data that are aggregated at the macroscopic and mesoscopic scales (Pani et al., 2019).

The choice of the unit of analysis is determined mainly by the availability of data, so the results of the models vary depending on the unit used. This variation, caused by the MAUP, is one of the problems that researchers can face when modeling freight transport. The variability in the model parameters arises because the limits of the analysis units can be modified in countless ways, each of which entails a different loss of information.

Existing studies of urban freight transport modeling have used different units of analysis. Some have been based on census units (Sánchez-Díaz et al., 2016); municipalities (Cantillo et al., Veras, 2014); regular grids (Ducret et al., 2016; Alho & Silva, 2014); zones of influence or buffers (Kawamura & Miodonski, 2012); homogeneous industrial sectors (Sahu & Pani, 2020); commercial sectors (Sánchez-Díaz, 2017); and service sectors (Sánchez-Díaz, 2018).

Despite the different scales used to analyze transport, little attention has been paid to demonstrating the link between the choice of the unit of analysis and the quality of the model. Here, the model in question is the one that represents freight activity, since freight travel is a demand induced by retail establishments (Rodrigue et al., 2016).

Biehl et al. (2018) mention that, to adequately represent the true relationship between measures of the urban environment and travel demand, the MAUP must be evaluated in the context of the several spatial representations available for measuring aggregate variables. González and Sánchez (2019) analyzed the impact of the units of analysis and of data aggregation on freight trip generation models. They concluded that the level of disaggregation can influence the quality of freight trip generation rates, although not proportionally, and that disaggregated estimates generally give more adequate results. Pani et al. (2019) evaluated the impact of the MAUP on load generation models and trip generation models, and obtained a wide variation in the estimated coefficients in terms of magnitude, statistical significance, and direction of association. They concluded that an analyst who takes the effects of the urban environment on freight travel patterns as a reference can design different policy instruments, which can become counterproductive.

2. Methodology

Vector files at three different scales of analysis were used in order to compare the results of both the descriptive statistics and the spatial analysis techniques. The first unit of analysis is the Basic Geostatistical Area (BGA), which the National Institute of Statistics and Geography defines as a geographic area occupied by a set of blocks perfectly delimited by streets, avenues, walkways, or any other feature that is easy to identify on the ground, and whose land use is mainly residential, industrial, service, commercial, and so on. BGAs are assigned only within urban areas, that is, those with a population greater than or equal to 2,500 inhabitants, and in municipal capitals (INEGI, 2020). The second unit is the electoral section, defined as the territorial fraction of the single-member Electoral Districts used for the registration of citizens in the Electoral Register and in the Nominal Lists of Voters (National Electoral Institute [INE], 2020). The third is a regular 500-by-500-meter grid, following the recommendations of researchers on urban freight transport (Ducret et al., 2016).

The information used for the analysis was the retail businesses in the Metropolitan Area of the City of Toluca, available in vector format from the National Statistical Directory of Economic Units (NSDEU) for 2018. The variables selected were the number of retailers, chosen because retail is considered the main attractor of urban freight transport, together with income, population aged 15 to 64, density of primary roads, density of employment per hectare, and number of dwellings, as suggested by studies of urban freight transport that include spatial indicators (Ducret et al., 2016; Sánchez-Díaz et al., 2016; Alho & Silva, 2014; Kawamura & Miodonski, 2012). This information was obtained, and in some cases calculated, from the variables of the 2020 Population and Housing Census, from the 2018 NSDEU, and from vector information on the roads in the study area. See Table 1.

Table 1 Variables used

Variables	Author	Source
Retail trade (retail)	Ducret et al., 2016; Sánchez-Díaz et al., 2014	NSDEU 2018, INEGI
Income (income)	Ducret et al., 2016	Calculated with the 2020 Population and Housing Census, INEGI
Population aged 15 to 64 (pop_15_64)	Ducret et al., 2016; Kawamura et al., 2012	2020 Population and Housing Census, INEGI
Density of primary roads (dens_prim)	Kawamura et al., 2012	Calculated in QGIS from road vector files
Density of employment (dens_empl)	Sánchez-Díaz et al., 2014; Kawamura et al., 2012	Calculated with the 2018 NSDEU
Number of dwellings (NumDwe)	Ducret et al., 2016	2020 Population and Housing Census, INEGI

To analyze the effects of the MAUP on the analysis of the demand for merchandise in the Metropolitan Area of Toluca, several tests were carried out with the variables of interest. These tests consisted of descriptive statistics, correlation between variables, and global and local spatial autocorrelation (Moran's I and LISA).

2.1 Descriptive statistics

In this section, statistics of central tendency, dispersion and distribution were specifically analyzed.

Among the central tendency statistics, the mean was obtained, which is the sum of all the values of the variable divided by the number of cases, as well as the median, which is the value below which 50% of the cases fall.

Regarding the dispersion statistics, the following were considered: the minimum and maximum, which are the smallest and largest observed values, respectively; the variance, a measure of dispersion about the mean equal to the sum of the squared deviations from the mean divided by the number of cases minus one; and the standard deviation (SD), which measures the degree to which the values of the variable deviate from the mean.

Among the distribution statistics, skewness and kurtosis were obtained. Skewness expresses the degree of asymmetry of the distribution: positive values indicate that the most extreme values are above the mean, negative values indicate that the most extreme values are below the mean, and a value close to zero indicates symmetry. Kurtosis, in turn, measures the presence of outliers, that is, the degree to which a distribution accumulates cases in its tails compared with a normal distribution. A positive kurtosis indicates that the data exhibit more extreme outliers than a normal distribution, while a negative kurtosis indicates that they exhibit fewer.

The purpose of obtaining these descriptive statistics is to be able to explore the individual behavior of the data in the three units of analysis, their variation and distribution, which in turn helps to identify the way in which they can be treated later.
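As a reference for the definitions above, the following Python sketch computes the same parameters for a single variable. It is only an illustration: the function name `describe` and the sample values are hypothetical, and the skewness and excess kurtosis use the common bias-adjusted sample estimators, which is an assumption about, not a confirmation of, the exact formulas applied by the statistical software used in this study.

```python
from math import sqrt


def describe(values):
    """Central tendency, dispersion, and distribution statistics.

    Requires at least four values that are not all equal.
    """
    n = len(values)
    ordered = sorted(values)
    mean = sum(values) / n
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

    # Sample variance: squared deviations from the mean over (n - 1).
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    sd = sqrt(variance)

    # Sums of standardized deviations raised to the third and fourth power.
    z3 = sum(((v - mean) / sd) ** 3 for v in values)
    z4 = sum(((v - mean) / sd) ** 4 for v in values)

    # Bias-adjusted sample skewness and excess kurtosis (assumed estimators).
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

    return {
        "n": n, "mean": mean, "median": median, "sd": sd,
        "variance": variance, "minimum": ordered[0], "maximum": ordered[-1],
        "skewness": skewness, "kurtosis": kurtosis,
    }


# Hypothetical counts of retail stores in eight spatial units.
stores = [0, 2, 3, 5, 5, 8, 12, 40]
for name, value in describe(stores).items():
    print(f"{name}: {value:.3f}")
```

With these estimators a normal distribution has skewness and kurtosis close to zero, so values far from zero, like those of a sample with one very large unit, point to a non-normal distribution.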

2.2 Variables Correlation

Correlation makes it possible to determine the degree of association between variables, that is, the degree to which two variables tend to change together. There are different correlation coefficients, such as Pearson's and Spearman's. The first evaluates the linear relationship between two continuous variables and is used when the data are normally distributed. The second evaluates the monotonic relationship between two variables, that is, whether they tend to change together, but not necessarily at a constant rate, and is used when the data are not normally distributed. The correlation can be positive or negative. A positive correlation indicates that one variable increases as the other increases, while a negative correlation indicates that one variable decreases as the other increases. Both coefficients take values between -1 and 1, where zero, or values close to it, indicates an absence of association between the variables. Accordingly, the correlation between the variables of interest in the study area was analyzed in the different units of analysis.

Analyzing the correlation coefficient between the variables considered in this study makes it possible to determine how one variable is related to another, and to identify whether this correlation increases or decreases as the scale of analysis is modified, as stated by Openshaw and Taylor (1979).
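The difference between the two coefficients can be seen in a short Python sketch. The function names and the sample data are hypothetical and serve only as an illustration. Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks of the data, with tied values receiving their average rank.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but non-linear relationship.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]
print("Pearson: ", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

Because `y` always increases with `x`, the rank-based coefficient reaches its maximum, while the linear coefficient is lower because the rate of change is not constant.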

2.3 Spatial Autocorrelation

Spatial autocorrelation makes it possible to identify how a phenomenon varies across geographic space, in order to determine spatial patterns and the degree to which local elements are affected by their neighbors. It shows the correlation of a variable with itself through space, considering the values observed for a single study variable and their relationship with the values in the closest units (Siabato & Guzmán-Manrique, 2019; Goodchild, 1986). Vilalta (2005) defines it as the concentration or dispersion of the values of a variable on a map. This concept is related to the so-called first law of geography formulated by Waldo Tobler, which establishes that the closest things in space are more related than those that are more distant (Tobler, 1970).

Spatial autocorrelation can be positive (when clusters are formed), negative (when the values are dispersed), or absent (when the phenomenon behaves randomly). It is interpreted through a statistical index that measures the degree to which a geographic variable is correlated with itself in different zones within the study area.

The Moran Index (I) is one of the most widely used indices to identify the type of pattern that the study variable presents globally (Moran, 1950). The value of the index depends on the previously established neighborhood criterion, which can be of different types: queen, rook, and bishop. These criteria take as a reference the movements of the pieces in the game of chess. In this analysis, the queen-type neighborhood criterion was used.
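To make the role of the neighborhood criterion concrete, the following Python sketch computes a global Moran's I on a small regular grid under rook and queen contiguity. It is a simplified illustration, not the procedure of the software used in this study: the function names and the 4-by-4 example patterns are hypothetical, the weights are row-standardized (an assumption), and no significance test is performed.

```python
def grid_neighbors(n_rows, n_cols, criterion="queen"):
    """Map each cell index of a regular grid to its contiguous cells.

    Rook: cells sharing an edge. Queen: cells sharing an edge or a corner.
    """
    if criterion == "rook":
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif criterion == "queen":
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    else:
        raise ValueError("criterion must be 'rook' or 'queen'")
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors[r * n_cols + c] = [
                (r + dr) * n_cols + (c + dc)
                for dr, dc in steps
                if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
            ]
    return neighbors


def morans_i(values, neighbors):
    """Global Moran's I with row-standardized weights.

    Every unit must have at least one neighbor. With row-standardized
    weights the sum of all weights equals n, so the usual n / S0 factor is 1.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    # Sum over units of (deviation) x (average deviation of its neighbors).
    cross = sum(z[i] * sum(z[j] for j in nbrs) / len(nbrs)
                for i, nbrs in neighbors.items())
    return cross / sum(d * d for d in z)


# Hypothetical 4 x 4 patterns, listed row by row.
clustered = [1, 1, 0, 0] * 4  # high values on the left half
checkerboard = [(r + c) % 2 for r in range(4) for c in range(4)]

for name, pattern in [("clustered", clustered), ("checkerboard", checkerboard)]:
    for criterion in ("rook", "queen"):
        w = grid_neighbors(4, 4, criterion)
        print(name, criterion, round(morans_i(pattern, w), 3))
```

The same values yield different indices under the two criteria, which illustrates why the neighborhood definition has to be fixed before values are compared across scales.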

Because the Moran Index yields only a global value, this limitation is addressed by using local indicators of spatial association, better known as LISA, which allow the detection of clusters and quantify the degree of significant grouping of similar values around an observation. The sum of the LISA values for all observations is proportional to the global indicator of spatial association, so it is useful for measuring the contribution of each observation to the global value (Anselin, 1995).

Spatial autocorrelation, as mentioned by Goodchild (2008), is susceptible to the MAUP. Analyzing this index makes it possible to identify which of the units of analysis provides the most relevant information to be represented cartographically or to be considered in subsequent studies.
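The relationship between the local and global indicators can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `local_moran`, the six-unit chain of neighbors, and the values are hypothetical, the weights are row-standardized, and the permutation test normally used to decide which local values are significant is omitted.

```python
def local_moran(values, neighbors):
    """Local Moran statistic and quadrant label for every unit.

    Uses row-standardized weights, so the spatial lag of a unit is the
    average deviation from the mean among its neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # second moment of the deviations
    results = []
    for i in range(n):
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        if z[i] >= 0:
            quadrant = "High-High" if lag >= 0 else "High-Low"
        else:
            quadrant = "Low-Low" if lag < 0 else "Low-High"
        results.append((z[i] * lag / m2, quadrant))
    return results


# Hypothetical units along a corridor; each one neighbors the adjacent units.
values = [10, 9, 8, 1, 2, 1]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}

lisa = local_moran(values, neighbors)
for unit, (local_i, quadrant) in enumerate(lisa):
    print(unit, round(local_i, 3), quadrant)

# With row-standardized weights the mean of the local values is the global index.
global_i = sum(local_i for local_i, _ in lisa) / len(lisa)
print("global Moran's I:", round(global_i, 3))
```

The quadrant labels correspond to the categories usually drawn on a LISA cluster map, where High-High and Low-Low units form clusters and the mixed categories mark spatial outliers.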

2.4 Software

The statistical information was joined to the vector files with QGIS, version 3.10. The statistical and correlation analyses were performed in SPSS, version 25, and the spatial autocorrelation analysis was performed in GeoDa, version 1.16.0.12.

3. Study zone

The study area is the Metropolitan Area of Toluca (MAT), which is the fifth most populous metropolitan area in Mexico and the second at the state level, with a population of approximately two million inhabitants (National Population Council [CONAPO], 2018). It is located in the central portion of the State of Mexico, to the west of Mexico City, and is made up of 15 municipalities. Toluca is the most relevant municipality, given its status as the state capital. See Figure 1.

Figure 1 Study zone.

With the industrialization of the area, which began in the 1960s (Rendón & Godínez, 2016), and the growth of the tertiary sector in the 1980s, it is considered one of the most dynamic metropolitan areas in the country. Its economic dynamics are based mainly on the tertiary sector, to which practically 90% of the 95,000 economic units in the metropolitan area belong. The tertiary sector is made up of 48,000 units, of which 94% are retail businesses supplied through urban freight transport, a key aspect that points to a high demand for freight transport trips in the study area (INEGI, 2018).

4. Results and discussion

This section contains the results obtained that allow identifying the effects of the MAUP in the modeling of the demand for freight transport, using different scales of analysis: BGA, electoral section, and the 500-by-500-meter regular grids.

4.1 Descriptive Statistics

Descriptive statistics, specifically of central tendency, distribution and dispersion, were obtained for the six study variables, in each of the three analysis units.

For the retail trade variable, the central tendency and dispersion parameters have the lowest values in the grids and the highest values in the BGA, which indicates that the values are less dispersed in the regular grids than at the other two scales of analysis. As for the distribution measures, skewness and kurtosis are positive at all three scales, which indicates that the most extreme values are above the mean and that the data at each scale have more extreme outliers than a normal distribution. The scale at which this variable is closest to a normal distribution is the electoral section. See Table 2.

Table 2 Number of retail stores

Parameters	BGA	Electoral section	Grids 500x500m
N	530	660	8232
Mean	32.14	26.08	7.36
Median	25.00	20.00	5.18
SD	31.69	25.648	8.15
Variance	1004.409	657.796	66.49
Minimum	0.00	0.00	0.00
Maximum	250.00	176	142.74
Skewness	2.137	1.834	4.316
Kurtosis	8.302	5.647	47.360

The higher the level of disaggregation, the lower the dispersion of the number of stores, because the grids are smaller than the other two units, which makes it possible to notice the concentration of retail stores in the grids. This concentration cannot be appreciated at the BGA scale, since the extent of that unit of analysis is much greater.

For the other variables, the results obtained are shown in Table 3.
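The comparison above is expressed in absolute units (stores per spatial unit). Because the means also differ between scales, a relative measure can complement it. The following Python sketch computes the coefficient of variation (standard deviation divided by the mean) from the means and standard deviations reported in Table 2. The coefficient of variation is not reported in this study; it is added here only as an illustrative calculation, and the dictionary and function names are hypothetical.

```python
# Mean and standard deviation of the number of retail stores, copied from Table 2.
table_2 = {
    "BGA": {"mean": 32.14, "sd": 31.69},
    "Electoral section": {"mean": 26.08, "sd": 25.648},
    "Grids 500x500m": {"mean": 7.36, "sd": 8.15},
}


def coefficient_of_variation(mean, sd):
    """Relative dispersion: standard deviation expressed as a share of the mean."""
    return sd / mean


cv = {scale: coefficient_of_variation(stats["mean"], stats["sd"])
      for scale, stats in table_2.items()}

for scale, value in cv.items():
    print(f"{scale}: CV = {value:.3f}")
```

Under this relative measure the three scales are close to one another, so the lower dispersion of the grids noted above should be read as a statement about absolute values.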

Table 3 Descriptive statistics of the variables of interest

	Income			Population aged 15 to 64			Density of primary roads			Density of employment			Number of dwellings
Parameters	BGA	ES	GR	BGA	ES	GR	BGA	ES	GR	BGA	ES	GR	BGA	ES	GR
N	530	660	8232	530	660	8232	530	660	8232	530	660	8232	530	660	8232
Mean	342.4	320.7	78.3	1901.1	1921.2	434.4	9.4	10.7	1.9	12.4	21.2	10.1	694	698.1	157.4
Median	293.3	253.4	38.6	1752	1653	305	0	0	0	3.7	7.7	3	663	580	105
SD	291.9	336	104.5	1466	1417.3	446.6	21.3	24.3	8.3	26.4	45.1	22.1	523	552.3	170.5
Variance	85192	112887	10922	2149124	2008849	199454	454	590	68	697.3	2031	488.9	273571	305005.9	29070
Minimum	0	0.1	0	0	9	0	0	0	0	0	0	0.00	0	4	0
Maximum	1969.5	5959.1	843.9	8490	20075	3553	187	194	148	304.3	637.6	343	3328	8663	1327
Skewness	1.3	8.3	2.5	0.9	4.1	1.8	3.9	3.5	6.9	6.1	6.7	6.8	0.9	5.6	1.9
Kurtosis	2.6	123	7.8	1.1	41.8	4.4	20	15.5	66.9	52.7	68.2	73.6	1.2	67.5	4.8

BGA = Basic Geostatistical Area; ES = Electoral Section; GR = 500-by-500-meter grids

For income and the population aged 15 to 64, the measures of central tendency are similar at the BGA and electoral section scales, while in the grids the values are lower. At all three scales the data are dispersed, although to a lesser degree in the regular grids. Regarding the distribution statistics, the BGA scale and the grids show a distribution closer to normal, whereas the electoral sections show more extreme outliers above the mean.

Regarding the density of primary roads, the three scales have the same median, while the mean is lower in the grids. The values are most dispersed at the electoral section scale and least dispersed in the grids. This variable does not show a normal distribution at any of the three scales, and the regular grids have the most outliers above the mean.

For employment density, the measures of central tendency are similar at the BGA and regular grid scales. The variable is most dispersed at the electoral section scale, while the BGA and grid values are similar. The distribution is similar at the three scales of analysis, with outliers above the mean in all of them.

Finally, for the total number of dwellings, the BGA and electoral section scales have similar measures of central tendency, and the variable is least dispersed in the regular grids. At the BGA scale the data have a distribution close to normal. The values furthest from a normal distribution are found at the electoral section scale, which also has more outliers above the mean.

As the descriptive statistics show, the scales with the most similar measures of central tendency and dispersion are the BGA and the electoral section, while all the variables are less dispersed in the grids. The scale at which the data are closest to a normal distribution, in most cases, is the BGA. These results are important indicators of the type of correlation to be used in the next exercise.

4.2 Correlations

With the results of the descriptive statistics, specifically those of distribution, it was possible to identify that the data of each variable analyzed did not show a normal distribution, so the Spearman coefficient was used to obtain the correlation coefficient between variables.

At the BGA scale, the variables that show a strong correlation are retail trade with the population aged 15 to 64 and with total dwellings, and the population aged 15 to 64 with income and with total dwellings. Income and dwellings also show a high correlation. The remaining pairs of variables show weaker correlations, although they are statistically significant. See Table 4.

Table 4 Spearman correlation for BGA variables

	Retail	Income	Pop_15_64	Dens_prim	Dens_empl	NumDwe
Retail	1.000	.621**	.834**	.253**	.341**	.797**
Income	.621**	1.000	.877**	.246**	.355**	.921**
Pop_15_64	.834**	.877**	1.000	.165**	.258**	.989**
Dens_prim	.253**	.246**	.165**	1.000	.301**	.186**
Dens_empl	.341**	.355**	.258**	.301**	1.000	.271**
NumDwe	.797**	.921**	.989**	.186**	.271**	1.000

** The correlation is significant at the 0.01 level (bilateral).

In the electoral section scale, less correlation is observed between the variables. The only ones that show a considerable correlation are income with the population aged 15 to 64 and the total number of dwellings, as well as the population aged 15 to 64 with the total number of dwellings. See Table 5.

Table 5 Spearman Correlation for electoral section variables

	Retail	Income	Pop_15_64	Dens_prim	Dens_empl	NumDwe
Retail	1.000	.517**	.588**	.088*	.018	.567**
Income	.517**	1.000	.764**	.133**	.016	.814**
Pop_15_64	.588**	.764**	1.000	-.144**	-.420**	.988**
Dens_prim	.088*	.133**	-.144**	1.000	.454**	-.105**
Dens_empl	.018	.016	-.420**	.454**	1.000	-.377**
NumDwe	.567**	.814**	.988**	-.105**	-.377**	1.000

** The correlation is significant at the 0.01 level (two-tailed).

The 500-by-500-meter regular grids show correlations similar to those obtained on the BGA scale; however, one additional pair appears as correlated on this scale, namely retail trade with income. Notably, the correlation coefficients are generally higher than those on the BGA scale. See Table 6.

Table 6 Spearman correlation for 500-by-500-meter regular grid variables

	Retail	Income	Pop_15_64	Dens_prim	Dens_empl	NumDwe
Retail	1.000	.849**	.924**	.210**	.580**	.900**
Income	.849**	1.000	.961**	.210**	.579**	.962**
Pop_15_64	.924**	.961**	1.000	.195**	.563**	.985**
Dens_prim	.210**	.210**	.195**	1.000	.259**	.195**
Dens_empl	.580**	.579**	.563**	.259**	1.000	.564**
NumDwe	.900**	.962**	.985**	.195**	.564**	1.000

** The correlation is significant at the 0.01 level (two-tailed).

When the Spearman correlation matrices of the study variables are compared, the most similar correlation values are found on the BGA and grid scales, while the variables show less correlation in the electoral sections.
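
This comparison can be made explicit with a simple worked calculation. The sketch below takes the coefficients of retail trade with the other five variables from the Retail rows of Tables 4, 5, and 6 and computes the mean absolute difference between scales. The choice of the mean absolute difference as a summary is an assumption made here for illustration; it is not a measure reported by the authors.

```python
# Spearman coefficients of retail trade with the other five variables,
# copied from the Retail rows of Tables 4, 5 and 6.
variables = ["Income", "Pop_15_64", "Dens_prim", "Dens_empl", "NumDwe"]
retail_rho = {
    "BGA": [0.621, 0.834, 0.253, 0.341, 0.797],
    "Electoral section": [0.517, 0.588, 0.088, 0.018, 0.567],
    "Grid": [0.849, 0.924, 0.210, 0.580, 0.900],
}


def mean_abs_diff(a, b):
    """Mean absolute difference between two equally long lists of coefficients."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


gap_bga_grid = mean_abs_diff(retail_rho["BGA"], retail_rho["Grid"])
gap_bga_section = mean_abs_diff(retail_rho["BGA"], retail_rho["Electoral section"])
print(f"BGA vs grid: {gap_bga_grid:.4f}")
print(f"BGA vs electoral section: {gap_bga_section:.4f}")
```

For the retail row, the BGA coefficients differ less from the grid coefficients than from the electoral section coefficients, which is consistent with the statement above.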

4.3 Spatial Autocorrelation

Moran’s test - global spatial autocorrelation

The results of the Moran Index for the global spatial autocorrelation of the variables analyzed vary considerably among the analysis units.

The retail trade variable has a considerable spatial autocorrelation on the grids, where clusters are observed, while on the BGA scale the spatial distribution pattern is much closer to random. The income variable likewise shows a grouping pattern on the grid scale, while the electoral section has the lowest Moran's I, which reflects a more random pattern. The population aged 15 to 64 tends to appear grouped in the grids, while in BGA it has a practically random pattern. For the density of primary roads and the density of employment, the highest autocorrelation is observed in the electoral sections, while the other two scales show a lower positive autocorrelation. Finally, the total number of dwellings shows a grouping pattern in the grids and a practically random pattern in BGA. See Table 7.

Table 7 Moran Index

	BGA	Electoral section	500-by-500-meter grid
Retail	0.184	0.316	0.582
Income	0.255	0.179	0.692
Population aged 15 to 64	0.073	0.235	0.628
Density of primary roads	0.377	0.614	0.489
Density of employment	0.524	0.620	0.520
Number of dwellings	0.086	0.190	0.642

In conclusion, the global spatial autocorrelation differs considerably among the three analysis scales. The 500-by-500-meter grid presents the highest global autocorrelation index for most of the variables, while the BGA scale is where a pattern closest to random is observed in most of the variables.
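
For reference, the global Moran's I can be written as I = (n / S0) * sum_ij(w_ij * z_i * z_j) / sum_i(z_i^2), where z_i is the deviation of unit i from the mean and S0 is the sum of all weights. The following is a minimal sketch on a regular grid, assuming binary rook contiguity (cells sharing an edge are neighbors). The weights definition actually used in the study is not stated in this section, so this choice, the function names, and the toy values are assumptions.

```python
def rook_weights(n_rows, n_cols):
    """Binary rook-contiguity weights for a regular grid, cells indexed row-major."""
    weights = {}
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            weights[i] = {}
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    weights[i][rr * n_cols + cc] = 1.0
    return weights


def morans_i(values, weights):
    """Global Moran's I for a list of values and weights given as weights[i][j]."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(w for row in weights.values() for w in row.values())
    cross = sum(w * z[i] * z[j] for i, row in weights.items() for j, w in row.items())
    return (n / s0) * cross / sum(d * d for d in z)


w = rook_weights(4, 4)
# Toy pattern 1: high values grouped on the left half of the grid.
clustered = [10, 10, 0, 0] * 4
# Toy pattern 2: high and low values alternate like a checkerboard.
checkerboard = [10 * ((r + c) % 2) for r in range(4) for c in range(4)]

i_clustered = morans_i(clustered, w)
i_checkerboard = morans_i(checkerboard, w)
```

The grouped pattern yields a positive index and the alternating pattern a negative one, which is the contrast between grouping and dispersion that Table 7 summarizes for each scale.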

Local Indicators of Spatial Association - LISA

For the local spatial autocorrelation analysis, only retail stores were analyzed, since this is one of the variables of greatest interest because it is the pole of attraction of urban freight transport. The spatial groupings and the spatial outliers were obtained for the three scales analyzed. See Figure 2.

Spatial groupings (clusters) High-High

These clusters are units with high retail values that are surrounded by neighboring units with the same characteristic. In this case, significant differences were found among the three analysis scales. On the BGA scale, 5% of the units are in this group, located in the center of the city of Toluca and in practically all the municipal seats of the municipalities that make up the MAT. In the case of electoral sections, 9.5% of the polygons have this characteristic; these clusters are located on the periphery rather than in the center of the city of Toluca. Regarding the 500-by-500-meter grids, the high-high clusters represent 11% of the total and are mainly in the downtown area of the city of Toluca.

Low-Low

They are clusters characterized by having a low number of retail businesses, whose neighbors have this characteristic, too. In the case of BGAs, 2% of their polygons are in this category, located mainly in the southern part of the city of Toluca. In the case of the electoral sections, these low-low clusters can be seen to the East and West of the MAT, with 11% of the total polygons. Regarding the grids, 8% corresponds to this type of cluster, standing out in the peripheries of the downtown area of the city, as well as in the neighboring municipalities.

Spatial outliers Low-High

This spatial outlier is characterized by polygons with a low number of retail stores that are surrounded by neighboring polygons with a high number of retailers. The results were as follows: on the BGA scale, these polygons represent 2.6% of the total and are located mainly next to the historic center and some municipal capitals. In the electoral sections, 3.6% have these characteristics and are located on the outskirts of the city center. Finally, in the grids, 0.5% of the polygons are in this category, located mainly in the center of the city and in the surroundings of some municipal capitals.

High-Low

This category contains the polygons that have a high number of retail stores and whose contiguous neighbors have a low number of retailers.

In the case of BGA, the polygons with these characteristics represent 1% of the total. They are located to the East and South of the city of Toluca. Similarly, 1% of the polygons of the electoral sections belong to this category, located in the northeast and northwest portion. Regarding the grids, only one of its polygons belongs to this classification and is located to the north of the city.
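
The four categories described above follow from the sign of each unit's deviation from the mean and the sign of the average deviation of its neighbors. The sketch below computes the local Moran statistic with row-standardized weights and assigns each unit to a quadrant. It is a simplified illustration: the permutation-based significance test that normally determines which units are shown on a LISA cluster map is omitted, and the function name, the neighbor structure, and the values are hypothetical.

```python
def lisa_quadrants(values, neighbors):
    """Local Moran's I and quadrant label per unit (no significance testing).

    neighbors[i] is the list of indices contiguous to unit i; weights are
    row-standardized, so the spatial lag is the mean deviation of the neighbors.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n  # second moment used to scale the statistic
    labels = {
        (True, True): "High-High",
        (False, False): "Low-Low",
        (False, True): "Low-High",
        (True, False): "High-Low",
    }
    results = []
    for i in range(n):
        nbrs = neighbors[i]
        lag = sum(z[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        local_i = z[i] * lag / m2
        # Deviations exactly equal to zero are treated as "low" in this sketch.
        results.append((local_i, labels[(z[i] > 0, lag > 0)]))
    return results


# Hypothetical chain of six units, each contiguous to the previous and next one.
retail = [10, 10, 2, 1, 0, 7]
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
lisa = lisa_quadrants(retail, chain)
quadrants = [label for _, label in lisa]
```

Clusters (High-High, Low-Low) have a positive local statistic and spatial outliers (Low-High, High-Low) a negative one.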

In line with what other researchers have reported about the variability of results across analysis units, the results vary considerably among the scales used. This confirms the need to evaluate different work scales and to choose the one that best supports the formulation of proposals to improve urban freight transport, such as regulatory measures or the construction of specific infrastructure.

The south-central portion of the city of Toluca and some municipal capitals located to the east and northwest coincide on the BGA and grid scales. The sites located on the peripheries of the city center show greater discrepancies, so it is advisable to analyze them with other techniques to determine the optimal scale for proposing alternatives in these areas.

It can be concluded that the electoral section, although at first it seemed to be an ideal scale of analysis because of the continuity of its units, is not an adequate scale unless it is used as a complementary one, since the statistical results indicate greater variation. Furthermore, at this scale none of the variables analyzed shows a strong correlation with retail stores. Its cartographic representation reveals areas with high-high values that are not observed on the other scales, which can be a valuable contribution. However, some units, mainly those in the city center, are not identified as significant and could be underestimated, which would result in limited proposals if only this scale were used.

Figure 2 Local spatial autocorrelation of retail stores in the MAT, in three different analysis scales.

The BGA scale and the 500-by-500-meter grids may be the most appropriate to model the transport of merchandise in the study area, since they give the analyst similar results, highlighting the downtown area of the city of Toluca and the municipal capitals in most of the suburbs. These sites are the main concentrations of retail businesses and therefore poles of attraction for freight transport. Both scales also reflect the correlation of retail businesses with some socioeconomic variables, with data that can be empirically corroborated, which supports regulation proposals with a greater reach.

The choice between the scales suggested in this document will depend on the availability of data and on the level of specificity sought. However, each scale has advantages and disadvantages that are worth considering before making the final choice. See Table 8.

Table 8 Advantages and disadvantages of the BGA scale and the 500-by-500-meter grids

	BGA	500-by-500-meter grids
Advantages	Lower number of units of analysis. Areas perfectly delimited by main roads, which can be the benchmark for the application of specific regulations. Census data are at this scale, so socioeconomic variables can be integrated without any problem.	They have a greater degree of homogeneity within each analysis unit, as they have the same surface area. They allow the identification of high-demand spaces that the BGA scale does not, since the data of the establishments demanding urban freight transport are represented individually.
Disadvantages	Greater dispersion of the data. Discontinuity in the units. Surfaces do not have the same dimensions, so the analysis data would have to be homogenized to a unit area (m2, km2, hectares, etc.).	There are more analysis units to handle. There is no delimitation by factors of the urban environment that would facilitate the implementation of regulations. There are no census statistics at this level, so the data must be estimated in order to be assigned to each quadrant.
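
Two of the operations mentioned in Table 8 can be illustrated with a short sketch: assigning individual establishments to 500-by-500-meter cells, and homogenizing a count to a unit area. The sketch assumes that point coordinates are expressed in meters in a projected coordinate system; the function names and the coordinates are hypothetical and do not come from the study data.

```python
from collections import Counter

CELL_SIZE_M = 500  # side of the regular grid cells, in meters


def cell_of(x, y, cell_size=CELL_SIZE_M):
    """Return the (column, row) index of the grid cell containing a projected point."""
    return (int(x // cell_size), int(y // cell_size))


def count_per_cell(points, cell_size=CELL_SIZE_M):
    """Count how many points fall in each grid cell."""
    return Counter(cell_of(x, y, cell_size) for x, y in points)


def density_per_km2(count, area_m2):
    """Homogenize a count to units per square kilometer."""
    return count / (area_m2 / 1_000_000)


# Hypothetical establishment coordinates in meters (illustrative values only).
stores = [(120.0, 80.0), (480.0, 499.0), (510.0, 20.0), (1200.0, 760.0)]
counts = count_per_cell(stores)
density_first_cell = density_per_km2(counts[(0, 0)], CELL_SIZE_M * CELL_SIZE_M)
```

On a regular grid every cell has the same area, so counts are directly comparable; for irregular units such as BGAs, the same `density_per_km2` conversion would use the area of each polygon.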

The site that had no discrepancy, regardless of the analysis scale, was the historic center of the city of Toluca, characterized by a high density of employment that is closely related to the mobility of people and goods, as has been observed in other Mexican cities (^{Chaparro and Hernandez, 2020}). This makes it a priority place in which alternatives for the regulation and control of this transport can be proposed, as well as the creation of infrastructure to make deliveries more efficient and to mitigate negative impacts. See Figure 3.

Working with spatial indicators will always make the researcher wonder about the optimal work scale for a territorial study. That decision is often limited by the availability of spatial data. However, information can be aggregated or disaggregated to different work scales in order to compare the behavior of the data on each of them, determine the impact of the modifiable areal unit problem, and verify the sensitivity of the data to the choice of a given work scale.

The spatial indicators of the urban environment that were analyzed in this article, such as the number of retail stores, are often aggregated to scales of analysis that are not suitable for studies of this nature. It is evident that the modifiable areal unit problem results in a wide variation in the estimated parameters of the different variables analyzed on each of the scales used.

Figure 3 Local spatial autocorrelation results of retail stores in the historic downtown of the city of Toluca, with different analysis scales.

5. Conclusions

These results show that a researcher may propose different, or even counterproductive, alternatives depending on the scale at which the demand for freight transport (retail stores) and the socio-territorial characteristics of the study area (population, density of roads, employment, housing, income, among many others) are analyzed. The grid appears to be the most appropriate analysis scale for the variables analyzed in this document: its statistics show less variation, the correlation between variables is much higher, and the spatial autocorrelation, both global and local (specifically in the High-High clusters), is also much higher. However, it is advisable not to discard the results obtained on the other two scales, since they can be complementary and identify important sites not detected on the grid scale.

This corroborates what ^{Xu et al. (2014)}, ^{Biehl et al. (2018)}, and ^{Grasland and Madelin (2006)} have reported about the impact of the MAUP on correlation, univariate statistics, and spatial contiguity, since the data were found to behave differently when the unit of analysis changes. Likewise, we agree with ^{González and Sánchez (2019)} in concluding that the level of disaggregation influences the results obtained, although not proportionally, and that the units with greater disaggregation give more adequate results.

For this reason, it is advisable to analyze different analysis units for the object of study beforehand, in order to better understand the sensitivity of the parameters when changing from one scale to another, to work with the scale that best fits the characteristics of the study and, on that basis, to make proposals for improvement. The recommendation is to learn from the modifiable areal units using analysis and representation methods that integrate the multi-scalar dimension, seeing the MAUP not as a "problem" but as a tool to explore the multi-scalar structure of the object of study.

Cartographic representations can change when the scale of analysis is modified, and each map obtained can complement the others. The results of the statistical analysis, although different, provide knowledge about the behavior of the phenomenon studied.

The results shown in this document are intended to illustrate the need to reconsider work scales and zoning that may be affected by the Modifiable Areal Unit Problem. It is suggested that this research be continued by analyzing the results for the same analysis scales with other statistical and spatial analysis techniques, such as multiple regression and geographically weighted regression, including the use of Artificial Intelligence techniques, which are usually appropriate when there is spatial autocorrelation. Other units of analysis different from those analyzed here could also be considered, such as grids of different dimensions or areas obtained from the aggregation of data at the block level, to expand the units of analysis offered by official institutions.

Bibliography

Alho, A. R., & Silva, J. (2014). Analyzing the relation between land-use/urban freight operations and the need for dedicated infrastructure/enforcement - Application to the city of Lisbon. Research in Transportation Business & Management, 85-97. DOI: https://doi.org/10.1016/j.rtbm.2014.05.002

Anselin, L. (1988). Spatial Econometrics: Methods and Models. Boston: Kluwer Academic Publishers.

Anselin, L. (1995). Local indicators of spatial association-LISA. Geographical Analysis, 93-115. DOI: https://doi.org/10.1111/j.1538-4632.1995.tb00338.x

Betanzo, E. (2015). Prospects for urban growth: retail business activity and the transportation of goods in the Metropolitan Area of Querétaro (Mexico). Science Ergo-Sum, 63-74.

Biehl, A., Ermagun, A., & Stathopoulos, A. (2018). Community mobility MAUP-ing: A socio-spatial investigation of bikeshare demand in Chicago. Journal of Transport Geography, 80-90. DOI: https://doi.org/10.1016/j.jtrangeo.2017.11.008

Briz-Redón, Á., Martínez-Ruiz, F., & Montes, F. (2019). Investigation of the consequences of the modifiable areal unit problem in macroscopic traffic safety analysis: A case study accounting for scale and zoning. Accident Analysis and Prevention, 1-17. DOI: https://doi.org/10.1016/j.aap.2019.105276

Buzzelli, M. (2020). Modifiable Areal Unit Problem. International Encyclopedia of Human Geography, 169-173.

Cantillo, V., Jaller, M., & Holguín-Veras, J. (2014). The Colombian Strategic Freight Transport Model Based on Product Analysis. Promet-Traffic and Transportation, 487-496. DOI: https://doi.org/10.7307/ptt.v26i6.1460

Chaparro, I., & Hernández, V. (2020). La reconfiguración de los subcentros de empleo en Ciudad Juárez, Chihuahua, 2004-2014. Región y Sociedad. DOI: https://doi.org/10.22198/rys2020/32/1268

Clark, W., & Avery, K. (1976). The Effects of Data Aggregation in Statistical Analysis. Geographical Analysis, 428-438. DOI: https://doi.org/10.1111/j.1538-4632.1976.tb00549.x

Department Of Transport U.S. (May 22, 2021). Traffic Analysis Zone. https://www.transportation.gov/

Ducret, R. (2015). New organizations for urban parcel distribution over the last mile: innovating through a spatial approach. [PhD Thesis, National School of Mines of Paris].

Ducret, R., Lemarié, B., & Roset, A. (2016). Cluster analysis and spatial modeling for urban freight. Identifying homogeneous urban zones based on urban form and logistics characteristics. Transportation Research Procedia, 301-313. DOI: https://doi.org/10.1016/j.trpro.2016.02.067

Gonzalez-Feliu, J., & Sánchez-Díaz, I. (2019). The influence of aggregation level and category construction on estimation quality for freight trip generation models. Transportation Research Part E: Logistics and Transportation Review, 134-148. DOI: https://doi.org/10.1016/j.tre.2018.07.007

Goodchild, M. (1986). Spatial Analytical Perspective on Geographical Information Systems. International Journal of Geographical Information Systems, 327-334. DOI: https://doi.org/10.1080/02693798708927820

Goodchild, M. (2008). Geographic information science: the grand challenges. The Handbook of Geographic Information Science. Malden, Blackwell.

Gradilla, L., & Rico, O. (2005). Spatial analysis of the distribution of cargo transported by air in Mexico. Sanfandila, Querétaro: Mexican Institute of Transportation.

Grasland, C., & Madelin, M. (2006). The modifiable area unit problem (Final Report). ESPON European Commission, https://www.espon.eu/sites/default/files/attachments/espon343_maup_final_version2_nov_2006.pdf

Kawamura, K., & Miodonski, D. (2012). Examination of the Relationship between Built Environment Characteristics and Retail Freight Delivery. Transportation Research Board, 1-13.

Lechner, A. M., Langford, W., Jones, S., Bekessy, S., & Gordon, A. (2012). Investigating species-environment relationships at multiple scales: Differentiating between intrinsic scale and the modifiable areal unit problem. Ecological Complexity, 91-102. DOI: https://doi.org/10.1016/j.ecocom.2012.04.002

Mitra, R., & Buliung, R. (2012). Built environment correlates of active school transportation: neighborhood and the modifiable areal unit problem. Journal of Transport Geography, 51-61. DOI: https://doi.org/10.1016/j.jtrangeo.2011.07.009

Moran, P. (1950). Notes on Continuous Stochastic Phenomena. Biometrika, 17-23. DOI: https://doi.org/10.2307/2332142

Moreno, E., De la Torre, E., & Piña, J. (2021). O-D matrix estimation of freight transportation based on consignment notes. Sanfandila, Querétaro: Mexican Institute of Transportation.

National Institute of Statistics and Geography (December 14, 2018). Statistical Directory of Economic Units. INEGI. https://www.inegi.org.mx/app/mapa/denue/default.aspx

National Institute of Statistics and Geography (November 19, 2020). Glossary. INEGI. https://www.inegi.org.mx/app/glosario/default.html?p=localidades

National Electoral Institute (November 26, 2020). Electoral Glossary. INE. https://centralelectoral.ine.mx/2018/06/08/glosario-electoral-seccion-electoral/

National Population Council (November 25, 2018). Delimitation of the metropolitan areas of Mexico 2015. CONAPO. https://www.gob.mx/conapo/documentos/delimitacion-de-las-zonas-metropolitanas-de-mexico-2015

Nielsen, M., & Hennerdal, P. (2017). Changes in the residential segregation of immigrants in Sweden from 1990 to 2012: Using a multi-scalar segregation measure that accounts for the modifiable areal unit problem. Applied Geography, 73-84. DOI: https://doi.org/10.1016/j.apgeog.2017.08.004

Nouri, H., Anderson, S., Sutton, P., & Beecham, S. (2017). NDVI, scale invariance and the modifiable areal unit problem: An assessment of vegetation in the Adelaide Parklands. Science of The Total Environment, 11-18. DOI: https://doi.org/10.1016/j.scitotenv.2017.01.130

Openshaw, S., & Taylor, P. (1979). A Million or so Correlation Coefficients: Three Experiments on the Modifiable Areal Unit Problem. Statistical Applications in the Spatial Sciences, 127-144.

Pani, A., Prasanta, K., Chandra, A., & K. Sark, A. (2019). Assessing the extent of modifiable areal unit problem in modelling freight (trip) generation: Relationship between zone design and model estimation results. Journal of Transport Geography, 1-17. DOI: https://doi.org/10.1016/j.jtrangeo.2019.102524

Ravenel, L. (2003). La présence d'étrangers entraîne-t-elle le vote pour l'extrême droite? Espace Populations Sociétés, 541-547. DOI: https://doi.org/10.3406/espos.2003.2108

Rendón, L., & Godínez, J. (2016). Evolution and industrial change in the Metropolitan Zones of the Valley of Mexico and Toluca, 1993-2008. Economic Analysis Journal, 115-146.

Rodrigue, J. P., Comtois, C., & Slack, B. (2016). The Geography of Transport Systems. London: Routledge.

Sahu, P. K., & Pani, A. (2020). Freight generation and geographical effects: modelling freight needs of establishments in developing economies and analyzing their geographical disparities. Transportation, 2873-2902. DOI: https://doi.org/10.1007/s11116-019-09995-5

Sánchez-Díaz, I. (2017). Modeling urban freight generation: A study of commercial establishments’ freight needs. Transportation Research Part A: Policy and Practice, 3-17. DOI: https://doi.org/10.1016/j.tra.2016.06.035

Sánchez-Díaz, I. (2018). Potential of implementing urban freight strategies in the accommodation and food services sector. Transportation Research Record, 194-203. DOI: https://doi.org/10.1177/0361198118796926

Sánchez-Díaz, I., Holguín-Veras, J., & Wang, X. (2016). An exploratory analysis of spatial effects on freight trip attraction. Transportation, 177-196. DOI: https://doi.org/10.1007/s11116-014-9570-1

Siabato, W., & Guzmán-Manrique, J. (2019). Spatial autocorrelation and the development of quantitative geography. Notebooks of Geography: Colombian Journal of Geography, 1-22. DOI: https://doi.org/10.15446/rcdg.v28n1.76919

Sun, J. (May 12, 2007). Traffic collision analysis for Vancouver. https://ibis.geog.ubc.ca/courses/geob370/students/class07/accident_vancouver/index.html

Tobler, W. (1970). A Computer Movie Simulation Urban Growth in the Detroit Region. Economic Geography, 234-240. DOI: https://doi.org/10.2307/143141

Vela, H. M. (2016). Reviewing spatial unit aggregation methods: maup, algorithms, and a brief example. Demographic and Urban Studies, 385-411. DOI: https://doi.org/10.24201/edu.v31i2.1592

Viegas, J. M., Martínez, M., & Silva, E. (2009). Effects of the modifiable areal unit problem on the delineation of traffic analysis zones. Environment and Planning B: Planning and Design, 625-643. DOI: https://doi.org/10.1068/b34033

Vilalta y Pérdomo, C. (2005). How to teach spatial autocorrelation. Economy, Society and Territory, 323-333.

Wang, Y., & Di, Q. (2020). Modifiable areal unit problem and environmental factors of COVID-19 outbreak. Science of The Total Environment, 1-5. DOI: https://doi.org/10.1016/j.scitotenv.2020.139984

Xu, P., Huang, H., Dong, N., & Abdel-Aty, M. (2014). Sensitivity analysis in the context of regional safety modeling: Identifying and assessing the modifiable areal unit problem. Accident Analysis and Prevention, 110-120. DOI: https://doi.org/10.1016/j.aap.2014.02.012

Xu, P., Huang, H., & Dong, N. (2018). The modifiable areal unit problem in traffic safety: Basic issue, potential solutions and future research. Journal of Traffic and Transportation Engineering, 73-82. DOI: https://doi.org/10.1016/j.jtte.2015.09.010

Zhang, M., & Kukadia, N. (2005). Metrics of Urban Form and the Modifiable Areal Unit Problem. Transportation Research Record: Journal of the Transportation Research Board, 71-79. DOI: https://doi.org/10.1177/0361198105190200109

Received: April 20, 2022; Accepted: June 28, 2022

This is an open-access article distributed under the terms of the Creative Commons Attribution License
