Non‐parametric Estimator for Conditional Mode with Parametric Features*

Published date: 01 February 2024
Author: Tao Wang
DOI: http://doi.org/10.1111/obes.12577
OXFORD BULLETIN OF ECONOMICS AND STATISTICS, 86, 1 (2024) 0305-9049
TAO WANG
Department of Economics, University of Victoria, Victoria, V8W 2Y2, BC, Canada
e-mail: taow@uvic.ca
Abstract
In this paper, we propose a new approach for estimating the conditional mode
non-parametrically, capturing the ‘most likely’ effect and built on local linear approximation,
in which a parametric pilot modal regression is locally adjusted through a kernel
smoothing fit to potentially reduce the bias asymptotically without affecting the variance
of the estimator. Specifically, we first estimate a parametric modal regression utilizing
prior information from initial studies or economic analysis, and then estimate the non-
parametric modal function based on the additive correction by eliminating the parametric
feature. We derive the asymptotic normal distribution of the proposed modal estimator for
both fixed and estimated parametric feature cases, and demonstrate that there is substantial
room for bias reduction under certain regularity conditions. We numerically estimate
the suggested modal regression model with the use of a modified modal-expectation-
maximization (MEM) algorithm. Monte Carlo simulations and one empirical analysis are
presented to illustrate the finite sample performance of the developed modal estimator.
Several extensions, including multiplicative correction, generalized guidance, modal-
based robust regression and the incorporation of categorical covariates, are also discussed
for the sake of completeness.
JEL Classification numbers: C01, C14, C50.
*I am grateful to the Editor Debopam Bhattacharya and three anonymous referees for their constructive comments,
which have greatly improved the previous version of the paper. I also thank Ivan Canay, Yanqin Fan, Jonathan
Hersh, Esfandiar Maasoumi, Aman Ullah, and Weixin Yao for the helpful comments and suggestions. The paper
also benefited from feedback received from seminar participants at UC Riverside, Bangor Business School, and
NEOMA Business School. This research is supported by the Social Sciences and Humanities Research Council of
Canada Insight Development Grant (SSHRC-IDG No. 430-2023-00149), UVic Research/Creative Project Grant
and UVic Research Start-up Grant.
©2023 Oxford University and John Wiley & Sons Ltd.
I. Introduction
Skewed or heavy-tailed data (e.g. wages, prices, scores on a difficult exam, movie ticket
sales, and expenditures) appear in a broad variety of practical applications, including
economic, statistical, social and educational research studies, among others. In such
instances, the mean estimate may not adequately describe the data’s characteristics, and
the mode estimate (a measure of central tendency) should be considered as a supplemental
measure to capture the ‘most likely’ element of the data. In light of the robustness of the mode,
a growing body of literature has turned to conditional modal regression,
which is defined as
\[
\mathrm{Mode}(Y \mid X) = \arg\max_{Y} f_{Y|X}(Y \mid X) = \arg\max_{Y} f_{Y,X}(Y, X), \tag{1}
\]
where $Y$ is the dependent variable, $X$ is the independent variable, $f_{Y|X}(Y \mid X)$ is the
conditional density of $Y$ given $X$, and $f_{Y,X}(Y, X)$ is the joint density. In contrast to
conditional mean regression, we do not need to impose any moment conditions when
conducting modal estimation. However, the density function-based estimating method
described in (1) has at least two drawbacks: one is the ‘curse of dimensionality’, which
occurs when the dimension of covariates is large, and the other is that it cannot accurately
estimate the marginal effect. To avoid these two drawbacks caused by maximizing
the density function, researchers impose different model structural assumptions (fully
parametric, semi-parametric and purely non-parametric) on Mode(Y|X) and achieve the
modal estimate by maximizing a kernel-based objective function (defined in section II);
see Lee (1989, 1993), Kemp and Santos Silva (2012), Yao and Li (2014), Chen
et al. (2016), Yao and Xiang (2016), Zhou and Huang (2016), Krief (2017), Chen (2018),
Kemp, Parente, and Santos Silva (2020), Ota, Kato, and Hara (2019), Ullah, Wang, and
Yao (2021, 2022, 2023), Wang (2023), among others. Despite the fact that parametric
modal regression is convenient to implement and interpret (as the ‘most likely’ effect), it can
yield inaccurate inference and a misleading relationship between variables when the
parametric form is misspecified (although the fit is smooth, with low variance). Non-parametric
modal regression, on the other hand, is appealing since it is more robust to model
misspecification and can be applied without imposing restrictive assumptions on the
form of the relationship between the response variable and the covariates (although the fit may
be more volatile, with high variance). Nevertheless, as in non-parametric mean regression,
with optimally chosen bandwidths, bias is present in the limiting distribution of
the non-parametric modal estimator and can be substantial even for moderate sample
sizes (see simulation results in section IV), which may distort the underlying dependency
between variables. Consequently, researchers might need to undersmooth the modal
estimator in order to asymptotically reduce the effect of bias, which will inevitably
increase the variance of the estimator as well as the length of the confidence intervals,
making subsequent statistical inference more challenging.
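As a rough illustration of the density-based definition in (1), the sketch below estimates the conditional mode at a single point by maximizing a product-kernel estimate of the joint density over a grid of response values. It is not the estimator proposed in this paper; the function name `conditional_mode_kde`, the Gaussian kernels, the bandwidths and the simulated design (a linear mode with a shifted Gamma error) are all assumptions made purely for illustration.

```python
import numpy as np


def conditional_mode_kde(x0, x, y, h_x, h_y, grid_size=400):
    """Estimate Mode(Y | X = x0) by maximizing a product-kernel estimate
    of the joint density f(y, x0) over a grid of y values."""
    y_grid = np.linspace(y.min(), y.max(), grid_size)
    # Gaussian kernel weights that localize the sample around x0.
    w_x = np.exp(-0.5 * ((x - x0) / h_x) ** 2)
    # Gaussian kernel in the response direction, evaluated on the grid
    # (rows index grid points, columns index observations).
    k_y = np.exp(-0.5 * ((y_grid[:, None] - y[None, :]) / h_y) ** 2)
    # Proportional to the joint density estimate at (y_grid, x0);
    # normalizing constants do not change the maximizer.
    joint = k_y @ w_x
    return y_grid[np.argmax(joint)]


# Hypothetical simulated design: the error is Gamma(2, 1) shifted by -1,
# so it has mode 0 but mean 1 (right-skewed).
rng = np.random.default_rng(0)
n = 10_000
x = rng.uniform(0.0, 1.0, size=n)
eps = rng.gamma(shape=2.0, scale=1.0, size=n) - 1.0
y = 1.0 + 2.0 * x + eps  # true conditional mode: 1 + 2x; true mean: 2 + 2x

x0 = 0.5
h_x, h_y = 0.1, 0.3  # illustrative bandwidths, not data-driven choices

mode_hat = conditional_mode_kde(x0, x, y, h_x, h_y)

# Kernel-weighted local mean at x0, for comparison with the mode.
w = np.exp(-0.5 * ((x - x0) / h_x) ** 2)
mean_hat = np.sum(w * y) / np.sum(w)

print(f"estimated conditional mode at x0={x0}: {mode_hat:.3f}")
print(f"estimated conditional mean at x0={x0}: {mean_hat:.3f}")
```

In this design the true conditional mode at x0 = 0.5 is 2 and the true conditional mean is 3, so the local mean should land well above the estimated mode. The mode estimate also carries some smoothing bias from the chosen bandwidths, which is the kind of bias discussed above.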
To achieve bias reduction without significantly affecting the variance of the non-parametric
modal estimator with local linear estimation, two approaches from the existing mean and
quantile regression literature are conceivable. One option
is to use higher-order local polynomial fitting to reduce the order of the bias.
According to the asymptotic theory presented in this paper, the bias
rate of non-parametric modal regression with local linear approximation is $O(h_1^2 + h_2^2)$
(see section II), where the bandwidths $h_1 = h_1(n) > 0$ and $h_2 = h_2(n) > 0$ are two sequences
of positive constants that converge to zero as the sample size $n$ approaches infinity. We
will suppress the dependence of the bandwidths $h_j$, $j = 1, 2$, on $n$ in what follows. With local
polynomial fitting techniques such as local cubic approximation, the bias rate can be
improved to $O(h_1^2 + h_2^4)$. A similar approach has been taken in non-parametric mean or