Extreme Value Theory and Auction Models

Morganti, Paolo Riccardo

doi:10.21919/remef.v16i2.596


Revista mexicana de economía y finanzas

On-line version ISSN 2448-6795; print version ISSN 1665-5346

Rev. mex. econ. finanz vol.16 no.2 Ciudad de México Apr./Jun. 2021 Epub 08-Apr-2022

https://doi.org/10.21919/remef.v16i2.596

Research and review articles

Extreme Value Theory and Auction Models

Teoría de valores extremos y modelos de subastas

Paolo Riccardo Morganti¹*

¹Universidad Panamericana, México

Abstract

The objective of this article is to develop a parametric approach to estimating auctions with incomplete data using Extreme Value Theory (EVT). The methodology is mainly theoretical: we first review that, when only transaction prices can be observed, the distribution of private valuations is irregularly identified. The sample bias produced by nonparametric estimators will affect all functionals of practical interest. We provide simulations for a best-case scenario and a worst-case scenario. Our results show that, compared to nonparametric approaches, the approximation of such functionals developed using EVT produces more accurate results, is easy to compute, and does not require strong assumptions about the unobserved distribution of bidders' valuations. It is recommended that financial operators working with auctions use this parametric approach when facing incomplete datasets. Given the difficult nature of the analysis, this work does not provide large sample properties for the proposed estimators and recommends the use of bootstrapping. This article contributes originally to the literature of structural estimation of auction models providing a useful and robust parametric approximation.

JEL Classification: C53; C57; C65; D44

Keywords: Extreme Value Theory; Structural Estimation; Auctions; Transaction Prices; Irregular Identification

Resumen

El objetivo de este artículo es desarrollar un enfoque paramétrico para estimar subastas con datos incompletos utilizando la Teoría de los Valores Extremos (EVT). La metodología es principalmente teórica: primero revisamos que, cuando solo se pueden observar los precios de transacción, la distribución de las valoraciones privadas se identifica de manera irregular. El sesgo de la muestra producido por los estimadores no paramétricos afectará a todos las formas funcionales de interés práctico. Proporcionamos simulaciones para el mejor de los casos y el peor de los casos. Nuestros resultados muestran que, en comparación con los enfoques no paramétricos, la aproximación de tales formas funcionales desarrolladas usando EVT produce resultados más precisos, es fácil de calcular y no requiere fuertes suposiciones sobre la distribución no observada de las valoraciones de los oferentes. Se recomienda que los operadores financieros que trabajan con subastas utilicen este enfoque paramétrico cuando se enfrentan a conjuntos de datos incompletos. Dada la naturaleza difícil del análisis, este trabajo no proporciona propiedades de muestra grande para los estimadores propuestos y recomienda el uso de bootstrapping. Este artículo contribuye originalmente a la literatura de estimación estructural de modelos de subasta proporcionando una aproximación paramétrica útil y robusta.

Clasificación JEL: C53; C57; C65; D44

Palabras clave: Teoría de los Valores Extremos; Econometría Estructural; Subastas; Precios de Transacción; Identificación Irregular

1 Introduction

During the last few years Extreme Value Theory (henceforth EVT; see Fisher and Tippett, 1928; Gumbel, 1935) has proved its usefulness across various scientific fields, such as engineering, finance and even public health (Thomas et al., 2016). The new literature that has started to explore the potential of EVT in Economics has focused mostly on theoretical problems (Gabaix et al., 2003, 2006; Benhabib and Bisin, 2006).

In this article I address the use of EVT for the estimation of auction models within the framework presented by Haile and Tamer (2003), which restricts the focus to incomplete datasets and, in particular, to those environments where only transaction prices can be observed. This is the case, for instance, of descending-bid (Dutch) auctions, where only the winning bidder reveals information about his valuation.

Menzel and Morganti (2013) show that under such conditions the nonparametric estimator for the distribution function converges slowly, and that the small sample bias spreads to all the estimates of functionals of practical interest, such as the expected revenue or the optimal reserve price. In general, with small samples it is preferable to adopt a parametric approach. However, the choice of the parametric distributional form is usually arbitrary, as researchers typically do not have theories to guide their choice (see Mohlin et al., 2015; Takano et al., 2014 for studies under complete datasets). Luckily, this is not the case in the present context. EVT theoretically guides us toward a natural parametric assumption, allowing us to analytically approximate functionals of interest. de Haan et al. (2009, 2013) introduce EVT in the estimation of auction models, restricting their analysis to the expected value and to the number of active bidders. In this article, we extend the analysis to other functionals such as the optimal reserve price. We also analyze the performance of standard nonparametric estimators and quantify how the bias spreads across functionals of practical interest.

Over the years, the analysis of auctions has inspired one of the most successful marriages between theoretical and econometric models. Since the seminal work of Vickrey (1961), theorists have constructed a rich framework to map private valuations into bids. In their attempt to identify and estimate the distribution of these private values, econometricians (see, for instance, Guerre, Perrigne and Vuong, 2000; Aradillas-Lopez, Gandhi and Quint, 2013) have adapted the results from the theory as restrictions for these data (that is, the bids).

The general approach to nonparametric identification in auction models relies on this theoretical mapping between the distribution of bidders’ valuations - the object of interest - and the distribution of observed bids - the data. Given the latter, we can obtain the former by inverting the mapping.

When an econometrician has access to limited data - for instance, to a dataset reporting only transaction prices - Athey and Haile (2002) and Haile and Tamer (2003) show that it is still possible to recover the missing object of interest using a statistical mapping, which establishes a relationship between the distribution of any order statistic and the underlying distribution of the data. The use of such a mapping is justified by the observation that transaction prices are an order statistic of the bids, as explicitly described by the rules of the auction. For instance, in a second price auction, the transaction price is equal to the second highest bid. Given the distribution of any order statistic, it is possible to invert the statistical mapping to back out the underlying distribution.¹

However, as Menzel and Morganti (2013) pointed out, even though the statistical inversion preserves consistency, convergence of the estimated distribution to the true one with respect to an appropriate function norm fails to reach the root-n rate, affecting all subsequent computations. Moreover, the convergence rate is affected by the number of bidders, N: when the number of bidders diverges, the rate converges to zero, and the magnitude of the sample size becomes irrelevant.

Because an econometrician observes just an extreme (or a function of an extreme) of the parental distribution, the dataset is unbalanced - that is, observations on the lower part of the support will be undersampled, whereas observations on the higher portion of the support will be oversampled.

Consequently, inverting the distribution of an extreme imposes a downward bias around the left end of the support, and an upward bias on the right end. All the quantiles are thereby pushed to the right, and the estimates based on them will suffer as a result. The problem is particularly evident when the number of participants in an auction approaches infinity, as the distribution of transaction prices collapses to a degenerate one with a mass point at the upper extreme of the support. Monte Carlo experiments show that even when N is finite and small, the bias remains significant even in the presence of large samples.

In principle it is possible to attenuate the problems on the right tail by smoothing the nonparametric estimators with an appropriate Nearest Neighborhood Estimator, but in practice this will be difficult and time consuming. Trimming and smoothing procedures could, in theory, solve the problem on the left tail, though the choice of the regularization parameters is obstructed by several trade-offs, and the criteria for an efficient procedure are still not available.

Given these considerations, we suggest an alternative, practical approach based on EVT. This parametric method relies on well known convergence results concerning the extremes of a distribution (Fisher and Tippett, 1928; Gnedenko, 1943). Under very mild assumptions, the distribution of such extremes - appropriately normalized - converges uniformly to one of three possible distributions, the so-called Extreme Value Distributions (EVD). When we rely on these results, it is possible to obtain approximate estimates of functionals of practical interest, such as the expected revenue or the optimal reserve price, in two steps. First, we estimate the two normalizing constants by minimizing the distance between the normalized empirical distribution of an extreme and the corresponding EVD. Second, by applying a simple change of variable to the integral that expresses the expected revenue of the auction, we can rewrite everything in terms of EVDs and their transformations. EVT also suggests a natural approximation for the underlying distribution of bids: the Generalized Pareto distribution.

We present results from Monte Carlo simulations, which show that the approximation method performs better than the nonparametric one - even in cases where the convergence of the extreme to the limiting distribution happens at a very slow rate.²

Even though this extreme value estimator and its functionals suffer from the same limitations on the left tail as their nonparametric counterparts, they appear to be more robust. Moreover, as this relative advantage of EVT seems to hold also for those distributions with poor approximation, we can confidently count on the generality of this approach.

This approximation gets more precise as N increases, making EVT more appealing to estimation exactly in those instances where the nonparametric estimator gets less accurate. Finally, we observe that computation time is minimal, making this approach particularly attractive for applied works.

EVT provides a general framework that can be adapted to all problems in which an order statistic is observed. For instance, an interesting application for financial markets is the estimation of the unobserved distribution of valuations in multi-unit auctions with uniform price.

This article is structured as follows: Section 2 presents the nonparametric estimator and discusses its behavior on the tails. Section 3 introduces basic and general results from EVT. In Section 4 we apply EVT to the auction framework and show how it is possible to obtain useful results relying only on EVDs and their transformations. Finally, Section 5 presents results from Monte Carlo simulations.

2 Nonparametric Identification and Estimation

We restrict our attention to symmetric independent private value (IPV) auction models in which only the transaction price is observed. For expositional purposes we focus on the case of second price auctions³. The typical dataset consists of observations from $n$ identical and independent auctions, each with exactly $N$ bidders. We use the capital letter $N$ to denote the number of bidders, whereas lower case $n$ denotes the size of the sample. We assume that $N$ is exogenous (Athey and Haile, 2002; Haile and Tamer, 2003). Every bidder $i=1,\dots,N$ submits an offer, $b_i$, which depends on her own private value for the item, $v_i$, on the format of the auction, and on the game she is playing against all the other bidders. Private valuations are independently drawn from a common distribution, $F_V$. The distribution of the bids is denoted by $F$. The econometrician observes only the transaction price from each auction: this transaction price is an extreme of the parental distribution. For instance, in a second price auction, the transaction price corresponds to the $(N-1)$-th order statistic (the second maximum)⁴. The $k$-th order statistic of $N$ independent bids $\{b_1,\dots,b_N\}$ has distribution⁵

$$G_{k:N}(z)=\frac{N!}{(N-k)!\,(k-1)!}\int_0^{F(z)} t^{k-1}(1-t)^{N-k}\,dt \tag{1}$$
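As a quick numerical illustration of equation (1), the sketch below evaluates the order-statistic distribution in two ways: through the equivalent binomial sum (the probability that at least $k$ of the $N$ draws fall below $z$) and through a midpoint-rule approximation of the integral. The function names and the values of $F(z)$ are illustrative assumptions, not part of the original analysis.

```python
from math import comb

def order_stat_cdf(F, k, N):
    """G_{k:N} as a function of F = F(z), via the binomial-sum identity."""
    return sum(comb(N, j) * F**j * (1.0 - F)**(N - j) for j in range(k, N + 1))

def order_stat_cdf_integral(F, k, N, steps=20000):
    """Same quantity from the integral in equation (1), by the midpoint rule."""
    const = k * comb(N, k)  # equals N! / ((N-k)! (k-1)!)
    h = F / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += t**(k - 1) * (1.0 - t)**(N - k)
    return const * h * total

N = 5
for F in (0.3, 0.7, 0.9):
    # k = N - 1 is the second maximum, i.e. the price in a second price auction
    print(F, order_stat_cdf(F, N - 1, N), order_stat_cdf_integral(F, N - 1, N))
```

For the second maximum the sum collapses to $F^N + N F^{N-1}(1-F)$, which is the form used in the later sketches.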

Athey and Haile (2002) show that the mapping implicitly described above is always invertible: it is therefore possible to obtain the distribution of the bids, $F(z)=\phi(G_{k:N}(z),N)$, whenever we can estimate the distribution of the transaction prices, $G_{k:N}$ (statistical inversion). A simple nonparametric estimator for the distribution of the transaction prices is

$$\hat G_{k:N}(z)=\frac{1}{n}\sum_{j=1}^{n}\mathbf{1}\{P_j\le z\} \tag{2}$$

which⁶, by the Glivenko-Cantelli theorem, converges almost surely and uniformly to the true distribution, so that $\hat G_{k:N}(z)-G_{k:N}(z)=o_p(1)$. Here $P_j$ denotes the transaction price observed in auction $j$.

Following Haile and Tamer (2003), the Continuous Mapping Theorem gives $\phi(\hat G_{k:N},N)-\phi(G_{k:N},N)=\phi(\hat G_{k:N},N)-F(z)=o_p(1)$. The convergence of the last quantity is also uniform in $z$: because the mapping $\phi$ is continuous over a compact space, it is also uniformly continuous. This establishes uniform convergence.
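The statistical inversion has no closed form in general, but because $G_{k:N}$ is increasing in $F$ it can be computed numerically. The following is a minimal sketch that inverts the mapping by bisection; the function names and the tolerance are assumptions made for illustration, and bisection is only one of several root-finding methods that would work here.

```python
from math import comb

def order_stat_cdf(F, k, N):
    """G_{k:N} evaluated at F = F(z) (binomial-sum form of equation (1))."""
    return sum(comb(N, j) * F**j * (1.0 - F)**(N - j) for j in range(k, N + 1))

def phi(G, k, N, tol=1e-12):
    """Statistical inversion: find F in [0, 1] such that G_{k:N}(F) = G."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if order_stat_cdf(mid, k, N) < G:
            lo = mid  # the solution lies to the right of mid
        else:
            hi = mid  # the solution lies at or to the left of mid
    return 0.5 * (lo + hi)

N = 10
for F in (0.1, 0.4, 0.8):
    G = order_stat_cdf(F, N - 1, N)
    # Round trip: F -> G -> F
    print(F, G, phi(G, N - 1, N))
```

The round trip shows how small the values of $G$ are that correspond to low values of $F$: tiny errors in $\hat G$ in that region translate into large errors in the recovered $F$.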

However, as shown in Menzel and Morganti (2013), this mapping is not Lipschitz continuous, meaning that its derivative is unbounded at the critical points of the support, $\{\underline{z},\bar{z}\}$.⁷

This creates a serious problem in the estimation, because even small biases will be magnified in neighborhoods around these points. Moreover, it is possible to see that, as N increases, the problem becomes more severe on the lower tail, whereas it attenuates on the right end of the support.

The convergence of the estimated distribution to the true one will be slow and dependent on the number of bidders. When N grows indefinitely, identification is lost: the distribution of the extreme degenerates to a mass point at the upper bound of the support, and the rate of convergence becomes equal to zero.

This means that when the number of bidders is high we should expect nonparametric estimates to be a poor description of the behavior of the lower tail of the distribution. The distribution of the bids is irregularly identified. A similar problem with finite-dimensional parameters has been analyzed by Khan and Tamer (2010). Figure 1 shows how the nonparametric estimator of the cdf of a uniform fails to identify the lower tail of the distribution. Notice that increasing the number of observations from $n_1=25$ to $n_2=1000$ does not improve the quality of the estimation.

Figure 1 Nonparametric estimation of the CDF of a Uniform Distribution: $N=10$, $n_1=25$ and $n_2=1000$
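The setting of Figure 1 can be reproduced in a few lines. The sketch below is an illustrative reconstruction, not the code behind the figure: it draws second-price transaction prices with $N=10$ uniform bidders, builds the empirical distribution of equation (2), and inverts it by bisection. The seed, the evaluation points and all function names are assumptions.

```python
import random
from math import comb

def order_stat_cdf(F, k, N):
    return sum(comb(N, j) * F**j * (1.0 - F)**(N - j) for j in range(k, N + 1))

def phi(G, k, N, tol=1e-10):
    """Invert G_{k:N}(F) = G for F by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if order_stat_cdf(mid, k, N) < G:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simulate_prices(n, N, rng):
    """Second-highest of N uniform(0,1) valuations in each of n auctions."""
    return [sorted(rng.random() for _ in range(N))[-2] for _ in range(n)]

def f_hat(prices, z, N):
    """Nonparametric estimate of F(z): empirical cdf of prices, then inversion."""
    G_hat = sum(p <= z for p in prices) / len(prices)
    return phi(G_hat, N - 1, N)

rng = random.Random(0)
N = 10
results = {}
for n in (25, 1000):
    prices = simulate_prices(n, N, rng)
    # True values of F at the evaluation points are 0.15, 0.5 and 0.9
    results[n] = (min(prices), [f_hat(prices, z, N) for z in (0.15, 0.5, 0.9)])
    print(n, results[n])
```

Below the smallest observed price the empirical distribution is zero, so the inverted estimate is zero as well, whatever the true value of $F$ in that region.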

The rate of convergence of the nonparametric estimator $\phi(\hat G_{k:N}(z),N)$, with $F(z)\in(0,1)$, decreases in $N$ and approaches zero as $N$ goes to infinity.

Proof. Remark 1. For the kernel estimator defined above, $\sqrt{n}\,[\hat G_{k:N}(z)-G_{k:N}(z)]\longrightarrow\mathcal{N}(0,\sigma^2(z))$, where $\sigma^2(z)=F(z)[1-F(z)]$. Then, using the Delta Rule:

$$\sqrt{n}\,[\phi(\hat G_{k:N}(z),N)-\phi(G_{k:N}(z),N)]\longrightarrow\mathcal{N}\left(0,\sigma^2(z)\,[\phi'(G_{k:N}(z),N)]^2\right)$$

We need to show that $\phi'(G,N)$ diverges to infinity as $N$ increases. From the implicit definition of the mapping, we obtain

$$\phi'(G,N)\equiv\frac{\partial\phi}{\partial G}=\left[\frac{N!}{(N-k)!\,(k-1)!}\,\phi(G,N)^{k-1}\,(1-\phi(G,N))^{N-k}\right]^{-1}$$

We restrict our attention to the class of problems where $k/N\to1$⁸. First we show that $G(z,N)$ falls to 0 in the lower tail of the distribution⁹ as $N$ increases. When $k/N\to1$, we can denote $\frac{N!}{(N-k)!\,(k-1)!}$ as $P(N,q+1)$, a polynomial in $N$ of degree $q+1$, where $q=N-k$. Then $G_{k:N}(z)\le P(N,q+1)\int_0^{F(z)}t^{\frac{k-1}{N}N}\,dt$. The integrand is continuous over a compact set, therefore it is uniformly continuous. Riemann integrability applies to the limit of the sequence, $\lim_{N\to\infty}\int_0^{F(z)}t^{\frac{k-1}{N}N}\,dt=\int_0^{F(z)}\lim_{N\to\infty}t^{\frac{k-1}{N}N}\,dt=0$ for $z$ such that $F(z)<1$, and for $k/N\to1$. The integral falls to zero fast and dominates the diverging effect of the polynomial.

Because $G(z,N)$ falls to zero as $N$ increases to infinity when $z$ belongs to the lower tail of the distribution, $\phi(G,N)$ must fall to zero as well, in order to balance expression (1). This makes the derivative $\phi'$ unbounded.
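To get a feeling for how quickly the derivative of the inverse mapping grows with the number of bidders, the sketch below evaluates $\phi'$ for the second maximum ($k=N-1$) at a fixed point of the lower part of the support. The point $F(z)=0.3$ and the values of $N$ are arbitrary choices made for illustration.

```python
from math import comb

def phi_prime(F, k, N):
    """Derivative of the inverse map, 1 / (dG/dF), evaluated at F = phi(G, N)."""
    const = k * comb(N, k)  # equals N! / ((N-k)! (k-1)!)
    return 1.0 / (const * F**(k - 1) * (1.0 - F)**(N - k))

F = 0.3
slopes = {N: phi_prime(F, N - 1, N) for N in (5, 10, 20, 50)}
for N, slope in slopes.items():
    # The asymptotic variance of the inverted estimator scales with slope**2
    print(N, slope)
```

Since the Delta Rule multiplies the variance of $\hat G$ by the square of this derivative, the growth in $N$ shown here feeds directly into the precision of the inverted estimator in the lower tail.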

The typical dataset is necessarily unbalanced. Higher values of the support are oversampled, whereas lower values are undersampled to the point that entire portions of the lower tail might not be observed at all in finite samples. All the measures based on our nonparametric estimates will be distorted accordingly: for instance, both the expected revenue of the auction and the reserve price will be systematically upward biased. This problem becomes worse as $N$ grows, but it should eventually fade as the sample size increases. However, Monte Carlo simulations show that the effect of an increase in $N$ dominates the effect of an increase in $n$.

Nonparametric estimators perform poorly on both tails of the distribution: the bias fades slowly and, in general, affects all the measures of interest. Appropriate smoothing procedures might help reduce the bias, but they would require careful calibration of the regularization parameters, a task that is difficult and time consuming. No criterion is available to guide the researcher through the trade-offs that such regularizations imply.

In the next sections we introduce a new approach to estimation that requires minimal computation time, and we show that this parametric method produces better results than the nonparametric one. To discuss the method, we first need to introduce some basic concepts of EVT.

3 Extreme Value Theory

The fundamental result of EVT is the following: if the distribution of the maximum of N independent draws from F, appropriately normalized, converges to a distribution function G as N goes to infinity, then G must be one of the following three:

$$G_1(z)=\exp(-z^{-\alpha}),\quad z>0\quad\text{(Fréchet)}$$

$$G_2(z)=\exp(-(-z)^{\alpha}),\quad z\le0\quad\text{(Weibull)}$$

$$G_3(z)=\exp(-e^{-z}),\quad z\in\mathbb{R}\quad\text{(Gumbel)}$$

Formally, let $P$ be a probability measure with distribution function $F$. Denote by $Z_{i:N}$ the $i$-th order statistic for a sample of size $N$.

[Fisher-Tippett-Gnedenko] If there exist real numbers $a_N>0$ and $b_N$ such that $P^N\left\{\frac{Z_{N:N}-b_N}{a_N}\le z\right\}$¹⁰ tends to some nondegenerate limit $G(z)$, then either $G=G_1$, or $G=G_2$, or $G=G_3$.

If it is possible to find a shifting parameter and a scaling parameter such that the normalized distribution of the maximum converges, then the limiting distribution belongs to the Extreme Value family. The theorem grants a natural parametric approximation for the distribution of the maximum, up to two normalizing parameters. Gnedenko (1943) also gave necessary and sufficient conditions for $F$ to belong to the domain of attraction of any of the above limits (denoted $F\in D(G_h)$, $h=1,2,3$). Von Mises (1936) derived a set of sufficient conditions which are more easily testable.

It is possible to show that the class of distributions that satisfy the von Mises conditions is wide, and includes most of the continuous distributions commonly used in applied work. More interestingly for our purposes, Falk and Marohn (1993) rewrite the von Mises conditions in terms of convergence of the underlying distribution to a corresponding Generalized Pareto distribution (gPd). Falk (1985) shows that the von Mises conditions imply pointwise convergence of the density $f_N$ to $g_N$ as $N$ goes to infinity¹¹. This, by virtue of Scheffé's Lemma, in turn entails its uniform convergence over all Borel sets (convergence in Total Variation).

The rate of convergence of $\sup_x|F^N(a_Nx+b_N)-G(x)|$ to zero depends on the particular distribution $F$: for instance, it is of order $O(1/N)$ for the negative exponentials, and of order $O(1/\log N)$ for normal distributions¹². The fastest possible convergence rate is of order $O(1/N)$ and is achieved by members of the gPd family. We can only make conjectures about the quality of the approximation, because we do not have information about the underlying distribution. Since the normal is known to converge at a slow rate, we will use it as a worst-case scenario for our simulations. Because we obtained satisfactory results with the normal, we are optimistic about the robustness of the estimator to different distributions.
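The difference between the two rates can be checked numerically. The sketch below computes the sup-distance between the normalized distribution of the maximum and the Gumbel limit on a grid, for the negative exponential and for the standard normal. The normalizing constants used here are the usual textbook choices and are an assumption of this sketch (they are treated as known rather than estimated); the grid bounds are also arbitrary.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

def gumbel_cdf(x):
    return exp(-exp(-x))

def sup_distance(cdf, a_N, b_N, N, lo=-5.0, hi=10.0, steps=3000):
    """Grid approximation of sup_x |F^N(a_N x + b_N) - G_3(x)|."""
    worst = 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        worst = max(worst, abs(cdf(a_N * x + b_N) ** N - gumbel_cdf(x)))
    return worst

def exponential_cdf(x, lam=0.2):
    return 1.0 - exp(-lam * x) if x > 0.0 else 0.0

def exponential_constants(N, lam=0.2):
    """Assumed constants (a_N, b_N) for the negative exponential."""
    return 1.0 / lam, log(N) / lam

def normal_constants(N):
    """Assumed constants (a_N, b_N) for the standard normal."""
    c = sqrt(2.0 * log(N))
    return 1.0 / c, c - (log(log(N)) + log(4.0 * pi)) / (2.0 * c)

std_normal_cdf = NormalDist().cdf
d_exp, d_norm = {}, {}
for N in (10, 100, 1000):
    d_exp[N] = sup_distance(exponential_cdf, *exponential_constants(N), N)
    d_norm[N] = sup_distance(std_normal_cdf, *normal_constants(N), N)
    print(N, d_exp[N], d_norm[N])
```

A normal with mean $\mu$ and standard deviation $\sigma$ only rescales the constants to $\sigma a_N$ and $\mu+\sigma b_N$, so the standard normal is sufficient for comparing rates.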

The results of EVT presented so far are not limited to the first maximum: in fact, they extend to the whole joint distribution of the extremes. Define $m=N-k+1$. If $F$ satisfies one of the Gnedenko conditions, then $G_{k:N}(z)$ converges uniformly to $G_h^{(m)}(z)=G_h(z)\sum_{i=0}^{m-1}\frac{1}{i!}[-\log G_h(z)]^i$, where $h=1,2,3$ indicates the appropriate limiting EVD. For example, for the case of the second maximum ($m=2$), the limiting distribution becomes

$$G_h^{(2)}(z)=G_h(z)\,[1-\log G_h(z)] \tag{4}$$
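A small check of equation (4): the sketch below compares the exact distribution of the second maximum of $N$ negative exponential draws, evaluated at the normalized point $a_Nz+b_N$, with the limiting expression built from the Gumbel distribution. The constants $a_N=1/\lambda$ and $b_N=\log N/\lambda$ are assumed to be known, which makes $F(a_Nz+b_N)=1-e^{-z}/N$; the value $N=200$ and the grid are arbitrary.

```python
from math import exp, factorial, log

def limit_mth_max_cdf(G, m):
    """G_h^{(m)} = G * sum_{i<m} (-log G)^i / i!, given the value G = G_h(z)."""
    if G <= 0.0:
        return 0.0
    u = -log(G)
    return G * sum(u**i / factorial(i) for i in range(m))

def exact_second_max_cdf(F, N):
    """Equation (1) with k = N - 1, written in closed form."""
    return F**N + N * F**(N - 1) * (1.0 - F)

N = 200
worst = 0.0
for i in range(701):
    z = -2.0 + 0.01 * i
    G = exp(-exp(-z))      # Gumbel limit for the first maximum
    F = 1.0 - exp(-z) / N  # exponential cdf at a_N z + b_N (assumed constants)
    worst = max(worst, abs(exact_second_max_cdf(F, N) - limit_mth_max_cdf(G, 2)))
print(worst)
```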

4 EVT in the Estimation of Auction Models

We can now use the results of the previous section to approximate the distribution of the extreme with the appropriate EVD. We are going to show that objects of interest such as expected revenue and optimal reserve price can be easily obtained through a simple transformation.

We assume that $F$ possesses a derivative $f$. The expected revenue for First Price and Second Price auctions, corresponding to the expectation of the second maximum valuation, is given by the following integral (see, for instance, Krishna (2002))

$$E[R|N]=\int_0^{\bar w}N(N-1)\,x\,F(x)^{N-2}\,[1-F(x)]\,f(x)\,dx \tag{5}$$

We want to emphasize that, for the simple case we are considering, computing the integral is neither necessary nor advisable if the goal is only the expected revenue of the auction: for this purpose it is enough to find the expected value of the transaction prices. The expected value does not suffer from the bias and should therefore be used in estimation. However, for expositional purposes, we are going to refer to the integral as a benchmark for the heavy bias that affects the nonparametric estimator. Estimation of the distribution $F$ and computation of the integral will be required in order to compute the optimal reserve price and to perform counterfactual analysis. For this reason, it is important to understand how, and with what magnitude, the nonparametric estimator can affect our analysis. For simplicity, we will focus on Second Price auctions, so that the distribution of the bids corresponds to the distribution of the private values. Because $F$ is unknown we cannot directly compute the value of the integral.

We are going to show that the integral can be transformed and expressed in terms of EVDs, with no significant loss in precision.

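[Expected Revenue] If there exist $a_N>0$ and $b_N$ such that $P\left\{\frac{Z_{N:N}-b_N}{a_N}\le z\right\}$ converges to $G(z)$, then

$$E[R|N]\approx\int_{-b_N/a_N}^{(\bar w-b_N)/a_N}(N-1)(a_Nt+b_N)\left[-\frac{1}{N}\log G(t)\right]g(t)\,dt \tag{6}$$

where $g$ denotes the density of $G$. For instance, for the class of distributions $F\in D(G_3)$, the expression becomes

$$E_3[R|N]\approx\int_{-b_N/a_N}^{(\bar w-b_N)/a_N}(N-1)(a_Nt+b_N)\,\frac{e^{-2t-e^{-t}}}{N}\,dt \tag{7}$$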
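Equation (7) is straightforward to evaluate numerically. The sketch below is an illustration under stated assumptions rather than the procedure used in the simulations: it takes a negative exponential with $\lambda=0.2$, treats the normalizing constants as known ($a_N=1/\lambda$, $b_N=\log N/\lambda$) instead of estimating them, replaces the infinite upper limit with a large finite value, and uses the trapezoid rule. The benchmark is the exact mean of the second-highest of $N$ exponential draws, $\frac{1}{\lambda}\sum_{i=2}^{N}\frac{1}{i}$.

```python
from math import exp, log

def evt_expected_revenue(N, a_N, b_N, upper=40.0, steps=40000):
    """Trapezoid-rule evaluation of equation (7) (Gumbel case)."""
    lower = -b_N / a_N
    h = (upper - lower) / steps

    def integrand(t):
        return (N - 1) * (a_N * t + b_N) * exp(-2.0 * t - exp(-t)) / N

    total = 0.5 * (integrand(lower) + integrand(upper))
    total += sum(integrand(lower + i * h) for i in range(1, steps))
    return h * total

def true_expected_revenue(N, lam):
    """Exact mean of the second-highest of N exponential(lam) draws."""
    return sum(1.0 / i for i in range(2, N + 1)) / lam

lam = 0.2
bias = {}
for N in (5, 50):
    a_N, b_N = 1.0 / lam, log(N) / lam  # assumed known, not estimated
    approx = evt_expected_revenue(N, a_N, b_N)
    truth = true_expected_revenue(N, lam)
    bias[N] = (approx - truth) / truth
    print(N, truth, approx, 100.0 * bias[N])
```

With known constants the only source of error is the EVT approximation itself, so the output isolates how the quality of the approximation changes with $N$.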

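We construct the proof through a sequence of simple lemmas.

Lemma 1. $F^{N-2}(a_Nt+b_N)\approx G(t)$.

Proof: This comes directly from the assumption of the theorem.

Lemma 2. $1-F(a_Nt+b_N)\approx-\frac{1}{N}\log G(t)$.

Proof: If $F$ belongs to the domain of attraction of $G$, then, using $\log F\approx F-1$ as $F\to1$,

$$F^N(a_Nt+b_N)\longrightarrow G(t)\iff N\log F(a_Nt+b_N)\longrightarrow\log G(t)\iff$$

$$N[F(a_Nt+b_N)-1]\longrightarrow\log G(t)\iff N[1-F(a_Nt+b_N)]\longrightarrow-\log G(t)\iff$$

$$\frac{1-F(a_Nt+b_N)}{-\frac{1}{N}\log G(t)}\longrightarrow1$$

Lemma 3. $a_Nf(a_Nt+b_N)\approx\frac{1}{N}\frac{g(t)}{G(t)}$.

Proof: Because $F$ has a derivative $f$ near the right end of the support, the previous condition implies

$$\frac{a_Nf(a_N\theta+b_N)}{\frac{1}{N}\frac{g(\theta)}{G(\theta)}}=\frac{F(a_Nt+b_N)-F(a_Ny+b_N)}{\left[-\frac{1}{N}\log G(y)\right]-\left[-\frac{1}{N}\log G(t)\right]}\longrightarrow1$$

for some $\theta\in(t,y)$.

Proof of Theorem 2. The proof of the theorem is concluded by performing a simple change of variable in the original integral, $t=(x-b_N)/a_N$, and applying the previous lemmas.

$$\begin{aligned}E[R|N]&=\int_{-b_N/a_N}^{(\bar w-b_N)/a_N}N(N-1)(a_Nt+b_N)\,F(a_Nt+b_N)^{N-2}\,[1-F(a_Nt+b_N)]\,f(a_Nt+b_N)\,a_N\,dt\\&\approx\int_{-b_N/a_N}^{(\bar w-b_N)/a_N}N(N-1)(a_Nt+b_N)\,G(t)\left[-\frac{1}{N}\log G(t)\right]\frac{g(t)}{N\,G(t)}\,dt\\&=\int_{-b_N/a_N}^{(\bar w-b_N)/a_N}(N-1)(a_Nt+b_N)\left[-\frac{1}{N}\log G(t)\right]g(t)\,dt\end{aligned}$$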

The approximation does not depend on the unknown distribution F: the new expression depends entirely on the normalizing constants aN,bN and on the EVD, G. Procedures that test for the particular type of EVD to use have long existed in the literature.

The normalizing constants can be estimated through some standard minimum distance (MD) criterion¹³. A widely used criterion is the Cramér-von Mises, which uses the integral of the squared difference between the empirical and the estimated distribution functions. Among the estimators based on non-Hilbertian¹⁴ metrics, the most common is the Kolmogorov-Smirnov

$$\{\hat a_N,\hat b_N\}=\arg\min_{a_N,b_N}\ \sup_{x_n}\left|\hat G_{k:N}\!\left(\frac{x_n-b_N}{a_N}\right)-G^{(m)}(x_n)\right| \tag{8}$$

where $m=N-k+1$. It is well known that the Kolmogorov-Smirnov distance immediately provides a test for goodness of fit. This procedure is simple and avoids having to compute the maximum likelihood estimator of the generalized extreme value distribution.
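The minimum distance step can be sketched as follows. This is a hedged illustration, not the MATLAB routine used for the simulations: it reads criterion (8) as the sup-distance between the empirical distribution of the observed prices and the limiting distribution of the second maximum evaluated at the normalized prices, it starts from a moment-based guess (the Gumbel limit of the second maximum has mean $\gamma-1$ and variance $\pi^2/6-1$), and it minimizes with a simple compass search chosen only because it needs no external library. The data-generating values, seed and function names are assumptions.

```python
import random
from math import exp, log, pi, sqrt
from statistics import mean, stdev

EULER_GAMMA = 0.5772156649015329

def limit_second_max_cdf(t):
    """Equation (4) with the Gumbel limit: G(t) * (1 - log G(t))."""
    if t < -30.0:  # numerically zero; also avoids overflow in exp(-t)
        return 0.0
    u = exp(-t)
    return exp(-u) * (1.0 + u)

def ks_distance(sorted_prices, a, b):
    """Kolmogorov-Smirnov distance between the empirical cdf and the limit law."""
    n = len(sorted_prices)
    worst = 0.0
    for i, x in enumerate(sorted_prices):
        g = limit_second_max_cdf((x - b) / a)
        worst = max(worst, abs((i + 1) / n - g), abs(i / n - g))
    return worst

def fit_constants(prices, min_step=1e-4, max_iter=500):
    """Compass search over (a, b), started from moment-based guesses."""
    xs = sorted(prices)
    a = stdev(xs) / sqrt(pi**2 / 6.0 - 1.0)
    b = mean(xs) - a * (EULER_GAMMA - 1.0)
    best = ks_distance(xs, a, b)
    step = 0.5 * a
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for da, db in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand_a, cand_b = a + da, b + db
            if cand_a <= 0.0:
                continue
            d = ks_distance(xs, cand_a, cand_b)
            if d < best:
                a, b, best, improved = cand_a, cand_b, d, True
        if not improved:
            step *= 0.5
    return a, b, best

rng = random.Random(0)
lam, N, n = 0.2, 50, 1000
prices = [sorted(rng.expovariate(lam) for _ in range(N))[-2] for _ in range(n)]
a_hat, b_hat, ks_hat = fit_constants(prices)
print(a_hat, b_hat, ks_hat)
print(1.0 / lam, log(N) / lam)  # textbook constants for the exponential
```

The final value of the criterion can be compared with the usual Kolmogorov-Smirnov critical values as a rough goodness-of-fit check, keeping in mind that those values are not adjusted for estimated parameters.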

Optimal Reserve Value: Using a similar approach we can estimate the optimal Reserve Price (RP) of the auction, given a specific value for the seller, $x_0$¹⁵, through a numerical search over the parameter $\theta=\frac{RP-b_N}{a_N}$ that maximizes the expected revenue

$$\max_\theta E[R|N,\theta]=\int_\theta^{(\bar w-b_N)/a_N}(N-1)(a_Nt+b_N)\left[-\frac{1}{N}\log G(t)\right]g(t)\,dt+x_0\,G(\theta)+N(a_N\theta+b_N)\left[-\frac{1}{N}\log G(\theta)\right]G(\theta) \tag{9}$$
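The numerical search over $\theta$ can be sketched with a golden-section search, which is adequate when the objective has a single interior maximum. The sketch assumes the Gumbel case, a negative exponential with $\lambda=0.2$ and $x_0=1.25$, normalizing constants treated as known, an upper integration limit truncated at a large finite value, and the term $1-F$ written as $-\frac{1}{N}\log G(\theta)$ as in Lemma 2. None of these choices, nor the optimizer, is taken from the original simulations.

```python
from math import exp, log, sqrt

def revenue_with_reserve(theta, N, a_N, b_N, x0, upper=40.0, steps=4000):
    """Objective (9) for the Gumbel case, with a trapezoid-rule integral."""
    def integrand(t):
        return (N - 1) * (a_N * t + b_N) * exp(-2.0 * t - exp(-t)) / N

    h = (upper - theta) / steps
    integral = 0.5 * (integrand(theta) + integrand(upper))
    integral += sum(integrand(theta + i * h) for i in range(1, steps))
    integral *= h
    G = exp(-exp(-theta))
    reserve = a_N * theta + b_N
    # x0 * G: item unsold; reserve * exp(-theta) * G: sale exactly at the reserve
    return integral + x0 * G + reserve * exp(-theta) * G

def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal function."""
    r = (sqrt(5.0) - 1.0) / 2.0
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc > fd:  # maximum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - r * (hi - lo)
            fc = f(c)
        else:        # maximum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + r * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

lam, x0 = 0.2, 1.25
reserve_price = {}
for N in (5, 50):
    a_N, b_N = 1.0 / lam, log(N) / lam  # assumed known, not estimated
    objective = lambda th: revenue_with_reserve(th, N, a_N, b_N, x0)
    theta_star = golden_max(objective, (x0 - b_N) / a_N, 15.0)
    reserve_price[N] = a_N * theta_star + b_N
    print(N, reserve_price[N], objective(theta_star))
```

In applied work the constants would come from the minimum distance step, so the estimated reserve price inherits their sampling error in addition to the approximation error shown here.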

Notice that Lemma 2 suggests the possibility of approximating the right tail of the distribution¹⁶ $F$ with a Generalized Pareto distribution (see Pickands (1975), Balkema and de Haan (1974)).

Can we use what we learn from auctions with high participation (that is, with high $N$) to make inference about auctions with a low number of bidders? The theory proves that for second price IPV auctions the optimal reserve price does not depend on $N$: therefore, the reserve price computed in high-participation auctions holds for any possible $N$. On the other hand, the expected revenue from an auction increases with $N$. Given sufficient variation in $N$, we can estimate the sequence $\{\hat a_N,\hat b_N\}$ over the observed values of $N$ and interpolate its values for smaller $N$. Plugging the new values into the integral returns the expected revenue. The next section shows results from Monte Carlo simulations.

5 Monte Carlo Simulations

In this section we present some results from Monte Carlo simulations in support of the theory advanced in the previous sections. In order to simplify the discussion, we focus on the case of Second Price auctions: this implies that the bids drawn are also the valuations of the bidders. Using MATLAB, we draw $n$ observations from two distributions, chosen for their opposite $N$-asymptotic behavior: the first distribution is a Normal with parameters $\mu=10$ and $\sigma=2$. The second distribution is a Negative Exponential with parameter $\lambda=0.2$. The specific choice of the parameters does not affect the results. As discussed above, extremes of a normal distribution converge at a slow rate to the Gumbel family, whereas the negative exponential has the fastest possible rate of convergence. Ideally, the behavior of a general distribution will fall between these two. The normal distribution is used as a worst-case scenario, and the exponential as a best-case scenario. We consider asymptotic behavior by letting both $N$ (the number of bidders) and $n$ (the number of auctions, that is, the sample size) increase. While raising $n$ improves the precision of all estimators, increasing $N$ has opposite effects on EVT-based estimators and on standard nonparametric ones. In particular, higher values of $N$ make the EVD a better approximation to the true distribution, while nonparametric estimators move further away from it.

From equation (2), we estimate the nonparametric distribution of our set of random draws, which we then use to find the normalizing constants using the Kolmogorov-Smirnov measure (see equation 8).¹⁷ A useful outcome of the Kolmogorov-Smirnov criterion is the availability of a test for the goodness of fit: in all simulations, the normalized empirical distribution is not significantly different from the corresponding EVD, the Gumbel¹⁸.

Figure 2 and Figure 3 provide graphical representations of the goodness of fit of the nonparametric estimator and of the estimator based on EVT¹⁹. While we used different values for both $N$ and $n$ in our simulations, for brevity we only plot results for $N$ taking values 5 and 100, and $n$ taking values 50 and 5,000. Both $N=100$ and $n=5,000$ are good representations of a large sample for the purpose of asymptotic behavior. The remaining values define a realistic small sample. The approximate distribution is represented by the dashed curve; the continuous curve represents the nonparametric estimator. The dotted curve is the true CDF.

The figures immediately illustrate four points: first, as the number of bidders rises the bias of the nonparametric estimator increases. Second, the nonparametric estimator is biased in two different regions of the support: in the upper tail, because those observations are overweighted, and in the lower tail. Third, the size of the dataset seems to have very little effect on the quality of the estimates. Finally, for the case of the Negative Exponential the approximation performs well, whereas when we analyze the case of the normal distribution the fit is less satisfactory: as the number of bidders increases, EVT delivers better results than the nonparametric estimator, but the bias in the lower tail stays relevant.

Figure 2 CDF estimation: Normal, μ=10,σ=2.

Figure 3 CDF estimation: Negative Exponential, λ=0.2

Next, we are going to show how the different approaches perform in predicting the expected revenue from the auction, computed using equation (7). Rather than analysing asymptotic behavior, we here focus on plausible datasets of size 50 and 100, though in our simulations we produced results for a wide range of values. Obviously, increasing sample size makes all estimators more precise. However, we show that it is the impact of the number of bidders, N, that dominates on all functionals that we compute. In the next simulations, we let N vary between 5 and 50 to better represent realistic bidding environments. As the number of bidders increases from 5 to 50, the expected revenue from the auction increases correspondingly: this is intuitive, because the expectation of receiving a higher bid increases with the number of participants in the auction. Notice that the nonparametric estimator is systematically upward biased. The reason is that the revenue depends on the upper tail of the distribution which, as explained before, is upward biased because of oversampling of the large values of the support. The problem becomes more severe as the number of bidders grows.

EVT provides a good estimate of the expected revenue: the bias from the approximation is high for a small number of bidders, but it rapidly decreases. The sample size affects the precision of the estimation of the normalizing constants, $\hat a_N,\hat b_N$, and, with them, the precision of the fit. The nonparametric estimator, however, is severely affected by the number of bidders: in both cases its bias starts around 50% and rises above 1,000% when $N$ reaches 50. Increasing the sample size further does not significantly benefit the estimates.

Again, EVT performs slightly better when the parental distribution is the negative exponential, but the difference in the fit is small. The nonparametric approach favors distributions with slow rate of convergence, like the normal one; but still drastically underperforms compared to EVT.

Table 1 Expected Revenue - Normal distribution μ=10, σ=2

N. bidders	n. auctions	True Rev.	EVT Rev.	NonP Rev.	Bias EVT %	Bias NonP %
5	50	10.95	8.78	17.11	−19.80	56.30
5	100	10.95	8.79	17.40	−19.69	59.00
50	50	13.72	13.43	172.88	−2.14	1160.05
50	100	13.72	13.41	174.58	−2.25	1172.47

Table 2 Expected Revenue - Negative Exponential λ=0.2

N (bidders)	n (auctions)	True Rev.	EVT Rev.	NonP Rev.	Bias EVT (%)	Bias NonP (%)
5	50	6.44	4.81	9.43	−25.38	46.65
5	100	6.44	5.46	11.26	−15.16	74.87
50	50	17.33	17.27	231.63	−0.37	1236.60
50	100	17.33	16.48	221.44	−4.89	1177.81
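The claim that the approximation error is large for few bidders and shrinks quickly can be illustrated in a case where everything is available in closed form. The sketch below is not the estimation procedure used in the paper: it assumes the normalizing constants are known exactly ($a_N = 1/\lambda$, $b_N = \log N/\lambda$ for the negative exponential) and uses the Gumbel limit of the second-largest order statistic, whose mean is $\gamma - 1$, with $\gamma$ the Euler-Mascheroni constant. The function names are hypothetical. It therefore isolates the pure approximation error and ignores sampling error in the constants.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exact_second_highest_exp(n_bidders, lam):
    """Exact E[X_(N-1:N)] for i.i.d. Exponential(lam) valuations.

    By the Renyi representation, the expected second-highest draw
    equals (1/lam) * sum_{j=2}^{N} 1/j.
    """
    return sum(1.0 / j for j in range(2, n_bidders + 1)) / lam


def gumbel_second_highest_exp(n_bidders, lam):
    """Extreme-value approximation with exact constants (an assumption).

    a_N = 1/lam, b_N = log(N)/lam; the limit law of the normalized
    second-largest value has mean gamma - 1.
    """
    a_n = 1.0 / lam
    b_n = math.log(n_bidders) / lam
    return b_n + a_n * (EULER_GAMMA - 1.0)


def approximation_bias_pct(n_bidders, lam):
    """Percentage bias of the approximation relative to the exact value."""
    exact = exact_second_highest_exp(n_bidders, lam)
    approx = gumbel_second_highest_exp(n_bidders, lam)
    return 100.0 * (approx - exact) / exact


if __name__ == "__main__":
    for n in (5, 50):
        print(n,
              round(exact_second_highest_exp(n, 0.2), 3),
              round(gumbel_second_highest_exp(n, 0.2), 3),
              round(approximation_bias_pct(n, 0.2), 2))
```

With λ=0.2 the exact values are close to, but not identical with, the simulated "True Rev." figures in Table 2, and the approximation error is negative and much smaller in absolute value at N=50 than at N=5, which matches the qualitative pattern in the EVT columns.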

Next, we focus on the optimal reserve price of the auction when the seller has an outside value equal to x0. We initially assume x0=0 for both distributions; we then increase x0 to 1.25 for the negative exponential case and to 10.8 for the normal case. We report results only for the latter setting. We compute the optimal reserve price using equation (9). Tables 3 and 4 present the results for the two distributions.

Table 3 Reserve Price - Normal Distribution μ=10, σ=2, x0=10.8

N (bidders)	n (auctions)	True RP	EVT RP	NonP RP	Bias EVT (%)	Bias NonP (%)
5	50	12.08	13.23	12.31	9.52	1.90
5	100	12.08	13.16	12.27	8.94	1.57
50	50	12.08	12.34	10.8	2.15	−10.60
50	100	12.08	12.33	10.8	2.06	−10.60

Table 4 Reserve Price - Negative Exponential λ=0.2, x0=1.25

N (bidders)	n (auctions)	True RP	EVT RP	NonP RP	Bias EVT (%)	Bias NonP (%)
5	50	6.25	7.85	1.25	−25.6	−80
5	100	6.25	7.79	1.25	−24.64	−80
50	50	6.25	7.07	1.25	−13.12	−80
50	100	6.25	6.86	1.25	−9.76	−80

Auction theory shows that the true reserve price is affected neither by the number of bidders nor by the sample size: within the limits of numerical computation, the Monte Carlo exercise supports the theory. However, the number of bidders does affect the estimated reserve price under both approaches. The EVT estimator gets closer to the true value as N increases, whereas the nonparametric estimator gets worse. Moreover, the nonparametric estimator runs into computational problems: with the negative exponential, the numerical search for the optimum tends to get stuck in the initial region of the support. The optimization algorithm begins the search around x0, in an area where the objective function is flat, and after a few iterations it stops because it detects no further improvement. The nonparametric estimator severely underestimates the negative exponential on the lower tail, making the numerical search useless. One could use a different starting value for the numerical search, to the right of this flat area; however, the optimization algorithm is very sensitive to the spikes of the nonparametric estimator: results become very fragile, and computation time increases significantly.

Sample size affects the precision of the estimates only slightly, which is consistent with the argument that convergence occurs slowly.
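The independence of the true reserve price from N can be checked directly. Differentiating the expected-revenue expression with reserve price given in the footnotes with respect to RP and setting the result to zero yields the condition $RP - [1-F(RP)]/f(RP) = x_0$, in which N does not appear. The sketch below solves this condition by bisection for the two parental distributions used in the simulations. It is an illustration under the assumption that equation (9) takes this standard form; the helper names and search brackets are hypothetical choices, not taken from the paper.

```python
import math


def solve_reserve_price(cdf, pdf, x0, lo, hi, tol=1e-10):
    """Solve r - (1 - F(r)) / f(r) = x0 by bisection on [lo, hi].

    Assumes the left-hand side is increasing in r on the bracket,
    which holds for distributions with an increasing hazard rate.
    """
    def psi(r):
        return r - (1.0 - cdf(r)) / pdf(r) - x0

    if psi(lo) > 0 or psi(hi) < 0:
        raise ValueError("bracket does not contain the root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exponential(lam):
    """Return (cdf, pdf) of a negative exponential with rate lam."""
    return (lambda x: 1.0 - math.exp(-lam * x),
            lambda x: lam * math.exp(-lam * x))


def normal(mu, sigma):
    """Return (cdf, pdf) of a normal distribution."""
    return (lambda x: 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0)))),
            lambda x: math.exp(-0.5 * ((x - mu) / sigma) ** 2)
            / (sigma * math.sqrt(2.0 * math.pi)))


if __name__ == "__main__":
    exp_cdf, exp_pdf = exponential(0.2)
    norm_cdf, norm_pdf = normal(10.0, 2.0)
    # Brackets below are illustrative choices wide enough to contain the root.
    print(solve_reserve_price(exp_cdf, exp_pdf, x0=1.25, lo=0.0, hi=50.0))
    print(solve_reserve_price(norm_cdf, norm_pdf, x0=10.8, lo=5.0, hi=18.0))
```

For the negative exponential the condition reduces to $RP = x_0 + 1/\lambda$, so the solver should return 6.25 for λ=0.2 and x0=1.25; for the normal case it should return a value close to the 12.08 reported in Table 3.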

Last, from the estimates of the normalizing constants we try to make out-of-sample predictions about the expected revenue. As above, we take draws from a normal distribution and a negative exponential. We try to interpolate the expected revenue for N between 5 and 15. For each value of N we draw data from 50 auctions (n=50).

Table 5 Interpolation Expected Revenue - Normal Distribution μ=10,σ=0.2

N. bidders	Interpolated Revenue	True Revenue
15	11.92	11.92
10	11.80	11.3
5	11.74	9.54

Table 6 Interpolation Expected Revenue - Negative Exponential λ=0.2

N. bidders	Interpolated Revenue	True Revenue
15	11.82	11.82
10	10.62	10.2
5	8.92	7.23

As expected, the interpolation deteriorates the further we go out-of-sample. However, the expected revenue functional seems to mitigate the progressive bias of the normalizing constants: as far as this exercise is concerned, the results seem close to the true ones.

We have derived results for other distributions, such as the uniform, the lognormal and mixed distributions for which there is no analytical expression, and the evidence seems consistent. The approach based on EVT systematically provides better estimates than the nonparametric approach. Note also that the approximation method is computationally simpler, as it reduces to the estimation of only two normalizing constants: all the subsequent steps can be solved analytically, using the appropriate gPd or EVD.

6 Conclusions

Econometricians are usually left to make arbitrary parametric choices for the estimation of their models. In this article we showed how EVT guides us towards a natural parametric approximation in auction models with incomplete data.

We addressed the quality of nonparametric estimators in auction models with incomplete data, and we showed through simulations the magnitude of the bias that affects estimates of functionals of practical interest. Monte Carlo simulations show that, even when the sample size increases, the bias remains substantial and does not disappear fast enough. The number of bidders strongly affects the precision of the estimates, and its effect dominates the benefits of larger sample sizes.

The approximate distribution performs better than its nonparametric counterpart, even when the approximation is known to converge slowly, as in the case of the normal distribution. Increasing the value of N makes the EVT estimates more precise and, simultaneously, the nonparametric estimates worse.

Even though the form of the approximating distribution is analytical, the set of assumptions that justifies its use is very mild, and we could reasonably expect most existing distributions to satisfy it. The practical advantage of adopting analytical formulas lies in saving computational time, which makes the computation of the relevant measures a minor task.

References

Aradillas-Lopez, A., A. Gandhi and D. Quint (2013), “Identification and Inference in Ascending Auctions with Correlated Private Values”, Econometrica, 81 (2), 489-534. https://doi.org/10.3982/ECTA9431

Athey, S., and P.A. Haile (2002), “Identification of Standard Auction Models”, Econometrica, 70 (6), 2107-2140. https://doi.org/10.1111/1468-0262.00371

Balkema, A. and L. de Haan (1974), “Residual Life Time at Great Age”, Annals of Probability, 2, 792-804. https://doi.org/10.1214/aop/1176996548

Benhabib, J. and A. Bisin (2006), “The distribution of wealth and redistributive policies”, NYU Working Paper.

de Haan, L., de Vries, C.G. and Zhou, C. (2009), “The expected payoff to Internet auctions”, Extremes, 12, 219-238. https://doi.org/10.1007/s10687-008-0077-z

de Haan, L., de Vries, C.G. and Zhou, C. (2013), “The number of active bidders in internet auctions”, Journal of Economic Theory, 148 (4), 1726-1736. https://doi.org/10.1016/j.jet.2013.04.017

Donoho, D.L., and R.C. Liu (1988), “The “Automatic” Robustness of Minimum Distance Functionals”, The Annals of Statistics, 16 (2), 552-586. https://doi.org/10.1214/aos/1176350820

Falk, M. (1986), “Rates of Uniform Convergence of Extreme Order Statistics”, Annals of the Institute of Statistical Mathematics, 38 (2), 245-262. https://doi.org/10.1007/bf02482514

Falk, M. (1990), “A Note on Generalized Pareto Distributions and the k Upper Extremes”, Probability Theory and Related Fields, 85 (4), 499-503. https://doi.org/10.1007/bf01203167

Falk, M., and F. Marohn (1993), “Von Mises Conditions Revisited”, The Annals of Probability, 21 (3), 1310-1328. https://doi.org/10.1214/aop/1176989120

Fisher, R.A., and L.H.C. Tippett (1928), “Limiting forms of the frequency distribution of the largest and smallest member of a sample”, Proceedings of the Cambridge Philosophical Society, 24, 180-190. https://doi.org/10.1017/s0305004100015681

Gabaix, X., D. Laibson and H. Li (2005), “EVT and the Effects of Competition on Profits”, mimeo, MIT.

Gabaix, X., P. Gopikrishnan, V. Plerou and H.E. Stanley (2003), “A theory of power law distributions in financial market fluctuations”, Nature, 423, 267-270. https://doi.org/10.1038/nature01624

Gabaix, X., P. Gopikrishnan, V. Plerou and H.E. Stanley (2006), “Institutional Investors and Stock Market Volatility”, Quarterly Journal of Economics, 121 (2), 461-504. https://doi.org/10.1162/qjec.2006.121.2.461

Gnedenko, B. (1943), “Sur la distribution limite du terme maximum d’une série aléatoire”, Annals of Mathematics, 44, 423-453. https://doi.org/10.2307/1968974

Gumbel, E.J. (1935), “Les valeurs extrêmes des distributions statistiques”, Annales de l’Institut Henri Poincaré, 5 (2), 115-158.

Guerre, E., I. Perrigne, and Q. Vuong (2000), “Optimal Nonparametric Estimation of First-Price Auctions”, Econometrica, 68 (3), 525-574. https://doi.org/10.1111/1468-0262.00123

Haile, P.A., and E. Tamer (2003), “Inference with Incomplete Models of English Auctions”, Journal of Political Economy, 111 (1), 1-51. https://doi.org/10.1086/344801

Hall, P. (1979), “On the Rate of Convergence of Normal Extremes”, Journal of Applied Probability, 16 (2), 434-439. https://doi.org/10.2307/3212912

Hall, W.J. and J.A. Wellner (1979), “The Rate of Convergence in Law of the Maximum of an Exponential Sample”, Statistica Neerlandica, 33 (3), 151-154. https://doi.org/10.1111/j.1467-9574.1979.tb00671.x

Hayashi, F. (2000), “Econometrics”, Princeton University Press.

Ibragimov, R., D. Jaffee and J. Walden (2009), “Nondiversification traps in Catastrophe Insurance Markets”, Review of Financial Studies, 22 (3), 959-993. https://doi.org/10.1093/rfs/hhn021

Khan, S., and E. Tamer (2010), “Irregular Identification, Support Conditions, and Inverse Weight Estimation”, Econometrica, 78 (6), 2021-2042. https://doi.org/10.3982/ECTA7372

Krishna, V. (2002), “Auction Theory”, Academic Press.

Menzel, K. and P. Morganti (2013), “Large Sample Properties for Estimators Based on the Order Statistics Approach in Auctions”, Quantitative Economics, 4 (2), 329-375. https://doi.org/10.3982/qe177

Mohlin, E., R. Östling, and J.T. Wang (2015), “Lowest Unique Bid Auctions with Population Uncertainty”, Economics Letters, 134, 53-57. https://doi.org/10.1016/j.econlet.2015.06.009

Parr, C.W., and W.R. Schucany (1980), “Minimum Distance and Robust Estimation”, Journal of the American Statistical Association, 75 (371), 616-624. https://doi.org/10.1080/01621459.1980.10477522

Pickands, J. III (1975), “Statistical Inference Using Extreme Order Statistics”, Annals of Statistics, 3 (1), 119-131. https://doi.org/10.1214/aos/1176343003

Rao, P.V., E.F. Schuster and R.C. Littell (1975), “Estimation of Shift and Center of Symmetry Based on Kolmogorov-Smirnov Statistics”, The Annals of Statistics, 3 (4), 862. https://doi.org/10.1214/aos/1176343187

Reiss, R.D. (1981), “Uniform Approximation to Distribution of Extreme Order Statistics”, Advances in Applied Probability, 13, 533-547. https://doi.org/10.2307/1426784

Takano, Y., N. Ishii and M. Muraki (2014), “A Sequential Competitive Bidding Strategy Considering Inaccurate Cost Estimates”, Omega, 42 (1), 132-140. https://doi.org/10.1016/j.omega.2013.04.004

Thomas, M., M. Lemaitre, M.L. Wilson, C. Viboud, Y. Yordanov, H. Wackernagel and F. Carrat (2016), “Applications of Extreme Value Theory in Public Health”, PLoS ONE, 11 (7), e0159312. https://doi.org/10.1371/journal.pone.0159312

Vickrey, W. (1961), “Counterspeculation, Auctions, and Competitive Sealed Tenders”, Journal of Finance, 16 (1), 8-37. https://doi.org/10.1111/j.1540-6261.1961.tb02789.x

von Mises, R. (1936), “La distribution de la plus grande de n valeurs”. Reprinted (1954) in Selected Papers II, Amer. Math. Soc., Providence, RI, 271-294.

¹ Here it is important to stress the different roles taken by the theoretical mapping and the statistical mapping mentioned above. The theoretical mapping links bids to individual valuations, whereas the statistical mapping concerns the link between transaction prices (that is, order statistics) and bids. From now on, we are going to abstract from the first and focus on the second. The inversion problem that we will refer to goes from the distribution of transaction prices to the distribution of unobserved bids.

²The case of the normal distribution is one example of this. The rate of convergence for extremes drawn from a normal distribution is of order $O(1/\log N)$.

³As the dominant strategy in second-price auctions is to bid one’s private value, we can conveniently ignore the theoretical inversion and focus on what we called the statistical mapping.

⁴We define the k-th order statistic in the following way: given a set of N bids, we order them starting from the smallest and ending with the largest. The set $\{b_1,\dots,b_N\}$ denotes the ordered list. The first element of the list is the first order statistic, and corresponds to the minimum of the set. The N-th order statistic is the last element of the list, and corresponds to the maximum. The k-th order statistic is simply the element in the k-th position of the list.

⁵so that, for instance, the distribution of the second maximum (that is, the (N-1)-th order statistic) is $G_{N-1:N}(z) = N(N-1)\left[\frac{F(z)^{N-1}}{N-1} - \frac{F(z)^{N}}{N}\right] = N F(z)^{N-1} - (N-1) F(z)^{N}$.

⁶The symbol $1\{A\}$ denotes the indicator function, which takes the value 1 when A is true and 0 when A is false. $P_j$ denotes the transaction price from the j-th auction.

⁷ For k=N-1, the mapping ϕ is defined implicitly by $G = N(N-1)\left[\frac{\phi(G,N)^{N-1}}{N-1} - \frac{\phi(G,N)^{N}}{N}\right]$ (3). By the Implicit Function Theorem, we can obtain its derivative $\phi'(G,N) = \frac{1}{N(N-1)\,\phi(G,N)^{N-2}\,[1-\phi(G,N)]}$, which is unbounded on the lower tail of the distribution, where G goes to zero, and on the right end, where G goes to 1.
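The behavior of ϕ described in this footnote can be inspected numerically. The sketch below inverts $G = N\phi^{N-1} - (N-1)\phi^{N}$ by bisection and evaluates the derivative formula at a few values of G. It is only an illustration under the stated definition; the function names and the tolerance are hypothetical choices.

```python
def second_highest_cdf(f_value, n_bidders):
    """G = N * F^(N-1) - (N-1) * F^N, the cdf of the second-highest of N draws."""
    return (n_bidders * f_value ** (n_bidders - 1)
            - (n_bidders - 1) * f_value ** n_bidders)


def phi(g_value, n_bidders, tol=1e-12):
    """Invert second_highest_cdf on [0, 1] by bisection.

    The map F -> G is increasing on [0, 1], so bisection is sufficient.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if second_highest_cdf(mid, n_bidders) < g_value:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi_prime(g_value, n_bidders):
    """Derivative of phi from the Implicit Function Theorem."""
    p = phi(g_value, n_bidders)
    return 1.0 / (n_bidders * (n_bidders - 1) * p ** (n_bidders - 2) * (1.0 - p))


if __name__ == "__main__":
    for g in (0.001, 0.5, 0.999999):
        print(g, phi(g, 50), phi_prime(g, 50))
```

Evaluating `phi_prime` near G=0 and near G=1 returns values far larger than at G=0.5, which is the numerical counterpart of the unboundedness noted above: small errors in the estimated distribution of transaction prices are amplified in both tails.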

⁸We focus on the higher extremes of the distribution: the first maximum, the second maximum and so on. We do not consider the lower extremes of the distribution: the minimum, the second minimum... This assumption is consistent with the framework that we are using: auctions models will be involved with the former type of extremes.

⁹What we mean by lower tail of the distribution depends on the particular extreme that we are considering: for instance, if what we are considering is the maximum, the relevant range becomes the full support of the distribution, excluding the upper extreme.

¹⁰$P^N$ denotes the N-fold independent product of P.

¹¹The result presented in Falk (1985) extends to the generic k-th order statistic. We denote by $G_{k:N}$ the Extreme Value limit distribution for the k-th order statistic. Then, if one of the von Mises conditions is satisfied, $f_{k:N}$ converges pointwise to $g_{k:N}$, for any possible k.

¹²Finding the normalizing constants $a_N, b_N$ is not a straightforward task. In practice, for $F \in D(G_3)$, we might start with the following guess: $b_N$ that solves $F(b_N) = 1 - 1/N$.

¹³Let $\{P_\theta\}$ be a family of probabilities indexed by θ, and let μ be a metric between probabilities. Let $\hat{\theta}(P)$ be the corresponding minimum distance functional, i.e., the solution to $\mu(P, P_{\hat{\theta}}) = \min_{\theta} \mu(P, P_\theta)$. The MD functional is consistent and robust over μ-neighborhoods (see Rao, Schuster and Littell, 1975; Parr and Schucany, 1980; and Donoho and Liu, 1988).

¹⁴By Hilbertian we mean based on a quadratic measure of deviation.

¹⁵The expected revenue with reserve price is equal to $\max_{\theta} E[R \mid N, RP] = \int_{RP}^{\bar{w}} N(N-1)\, x\, F(x)^{N-2}\,[1-F(x)]\, f(x)\, dx + x_0\, F(RP)^{N} + N \cdot RP\,[1-F(RP)]\, F(RP)^{N-1}$.
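The starting guess in this footnote is simply the $(1-1/N)$ quantile of F. As a minimal sketch, assuming F is known (which is not the case in the estimation problem, where it serves only as an initial value), it can be computed for the two parental distributions used in the simulations as follows. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def b_n_exponential(n_bidders, lam):
    """Solve 1 - exp(-lam * b) = 1 - 1/N, i.e. b = log(N) / lam."""
    return math.log(n_bidders) / lam


def b_n_normal(n_bidders, mu, sigma):
    """Solve F(b) = 1 - 1/N for a normal F via the inverse cdf."""
    return NormalDist(mu, sigma).inv_cdf(1.0 - 1.0 / n_bidders)


if __name__ == "__main__":
    for n in (5, 50):
        print(n, b_n_exponential(n, 0.2), b_n_normal(n, 10.0, 2.0))
```

In both cases the guess grows with N, slowly for the normal distribution and logarithmically for the negative exponential.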

¹⁶The relative magnitude of this right tail depends on N and on the particular parental distribution F.

¹⁷We compared them with estimates obtained with the Cramér-von Mises criterion and found no significant differences.

¹⁸We can produce standard errors for expected revenue and reserve price through a bootstrapping procedure. However, given the erratic behavior of the nonparametric estimator for the reserve price in the next section, and the impossibility of drawing a comparison with the standard errors produced under EVT, we decided to leave them out.

¹⁹Because EVT is based on an approximation, we call this estimator the “approximate distribution”. As discussed in the previous section, the approximate distribution will be an appropriately normalized gPd.

Received: February 21, 2020; Accepted: June 25, 2020

*Corresponding author. Email: pmorganti@up.edu.mx

This is an open-access article distributed under the terms of the Creative Commons Attribution License