Confidence Set for Group Membership
Abstract
We develop new procedures to quantify the statistical uncertainty from sorting units in panel data into groups using data-driven clustering algorithms. In our setting, each unit belongs to one of a finite number of latent groups and its regression curve is determined by which group it belongs to. Our main contribution is a new joint confidence set for group membership. Each element of the joint confidence set is a vector of possible group assignments for all units. The vector of true group memberships is contained in the confidence set with a pre-specified probability. The confidence set inverts a test for group membership. This test exploits a characterization of the true group memberships by a system of moment inequalities. Our procedure solves a high-dimensional one-sided testing problem and tests group membership simultaneously for all units. We also propose a procedure for identifying units for which group membership is obviously determined. These units can be ignored when computing critical values. We justify the joint confidence set under N, T → ∞ asymptotics where we allow T to be much smaller than N. Our arguments rely on the theory of self-normalized sums and high-dimensional central limit theorems. We contribute new theoretical results for testing problems with a large number of moment inequalities, including an anti-concentration inequality for the quasi-likelihood ratio (QLR) statistic. Monte Carlo results indicate that our confidence set has adequate coverage and is informative. We illustrate the practical relevance of our confidence set in two applications.
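As a rough illustration of how a confidence set can invert a one-sided test based on moment inequalities, the sketch below keeps group g for unit i unless some alternative group h fits that unit significantly better. It is not the authors' procedure. It assumes the group-specific fitted values are already given, compares squared losses, and uses a Bonferroni-style self-normalized critical value; the function name `membership_confidence_set` and its arguments are hypothetical.

```python
import numpy as np
from statistics import NormalDist


def membership_confidence_set(y, fitted, alpha=0.05):
    """Unit-wise sets whose Cartesian product forms a joint confidence set.

    y      : (N, T) array of outcomes.
    fitted : (G, N, T) array, fitted[g, i, t] is the fitted value for unit i
             at time t if unit i belonged to group g (assumed given).
    Returns a boolean (N, G) array; entry [i, g] is True when group g is
    not rejected for unit i.
    """
    G, N, T = fitted.shape
    sq_loss = (y[None, :, :] - fitted) ** 2  # (G, N, T)

    # Assumption: Bonferroni correction over all N * (G - 1) inequalities,
    # combined with a self-normalized critical value.
    p = N * (G - 1)
    z = NormalDist().inv_cdf(1 - alpha / p)
    if z ** 2 >= T:
        raise ValueError("T is too small for this critical value")
    crit = z / np.sqrt(1 - z ** 2 / T)

    keep = np.zeros((N, G), dtype=bool)
    for g in range(G):
        others = [h for h in range(G) if h != g]
        # If g is the true group, the loss difference has mean <= 0.
        d = sq_loss[g][None, :, :] - sq_loss[others]  # (G-1, N, T)
        # Self-normalized statistic; assumes d varies over time (sd > 0).
        stat = np.sqrt(T) * d.mean(axis=2) / d.std(axis=2, ddof=1)
        # Keep g for unit i if no inequality is significantly violated.
        keep[:, g] = stat.max(axis=0) <= crit
    return keep
```

Each row of the returned array lists the groups that cannot be ruled out for one unit. A row with a single True entry corresponds to a unit whose membership is pinned down by the data at the chosen level.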
Other description
JEL: C23, C33, C38
Date
2018-03
Author
Dzemski, Andreas
Okui, Ryo
Keywords
Panel data
grouped heterogeneity
clustering
confidence set
machine learning
moment inequalities
joint one-sided tests
self-normalized sums
high-dimensional CLT
anti-concentration for QLR
Publication type
report
ISSN
1403-2465
Series/Report no.
Working Papers in Economics
727
Language
eng