ArXiv TLDR

Distributional Change in Ordinal Data with Missing Observations: Minimal Mobility and Partial Identification

2604.12611

Rami V. Tabri

econ.EM

TLDR

This paper introduces a framework using optimal transport and partial identification to analyze distributional changes in ordinal data with missing observations.

Key contributions

  • Framework for measuring and interpreting distributional change in ordinal data when only the marginal distributions are observed.
  • Represents the $L_1$ distance between cumulative distribution functions as an optimal transport problem: the minimal reallocation of probability mass across ordered categories.
  • Introduces "minimal-mobility configurations", a structured characterization of how the distributional change must occur.
  • Handles missing data through partial identification, yielding sharp bounds on the marginal distributions and, in turn, on the discrepancy measure and its configurations.
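
As a rough illustration of the second point, the sketch below computes the $L_1$ distance between two CDFs over the same ordered categories. It assumes unit spacing between adjacent categories, in which case this sum coincides with the one-dimensional optimal transport (Wasserstein-1) cost. The function names `cdf`, `boundary_flows`, and `l1_cdf_distance` are hypothetical and not taken from the paper, and the per-boundary flows are only one simple reading of "how mass must move", not the paper's definition of minimal-mobility configurations.

```python
from itertools import accumulate


def cdf(pmf):
    """Cumulative sums of a probability mass function over ordered categories."""
    return list(accumulate(pmf))


def boundary_flows(p, q):
    """Net mass crossing each boundary between adjacent categories.

    Entry k is F_p(k) - F_q(k). A positive value means that much mass must
    move upward across the boundary between category k and k+1 to turn p
    into q; a negative value means a downward move.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same categories")
    # The final CDF value is 1 for both distributions, so it is dropped.
    return [a - b for a, b in zip(cdf(p)[:-1], cdf(q)[:-1])]


def l1_cdf_distance(p, q):
    """L1 distance between the CDFs of p and q (assumes unit category spacing)."""
    return sum(abs(flow) for flow in boundary_flows(p, q))


# Illustrative numbers only (not from the paper): three ordered categories.
period_0 = [0.5, 0.3, 0.2]
period_1 = [0.2, 0.3, 0.5]
distance = l1_cdf_distance(period_0, period_1)  # approximately 0.6
```

Here a mass of about 0.3 has to cross each of the two category boundaries in the upward direction, so the total is about 0.6.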

Why it matters

The paper gives a way to analyze changes in ordinal data when the joint distribution across periods is unobserved and some responses are missing. It offers a transparent account of how mass shifts across ordered categories and of how sensitive the conclusions are to nonresponse. This matters for empirical research based on repeated cross-sectional surveys, where only the marginal distribution is observed in each wave.

Original Abstract

Empirical analyses of ordinal outcomes using repeated cross-sectional data rely on marginal distributions, leaving the joint distribution unobserved and the sources of distributional change unidentified. This paper develops a framework to measure and interpret such changes under limited information. The $L_1$ distance between cumulative distribution functions admits an optimal transport representation as the minimal reallocation of probability mass across ordered categories, which provides a foundation for the analysis. This yields both a scalar measure of discrepancy and a structured characterization of how distributional change must occur, which I term minimal-mobility configurations. To address missing data, I adopt a partial identification approach that delivers sharp bounds on the marginal distributions and, in turn, on both the discrepancy measure and its associated configurations. The resulting framework supports inference using standard resampling methods and provides a transparent basis for assessing sensitivity to nonresponse. An application to Arab Barometer data illustrates the approach.
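
To make the partial identification step concrete, here is a minimal sketch of the standard worst-case bounds on a marginal CDF when some responses are missing and nothing is assumed about the nonresponse mechanism. It is an assumption that the paper's sharp bounds take this form; the function name `worst_case_cdf_bounds` and the example counts are hypothetical.

```python
def worst_case_cdf_bounds(counts, n_missing):
    """Worst-case bounds on the CDF of an ordinal outcome with nonresponse.

    counts[k] is the number of observed responses in ordered category k,
    and n_missing is the number of nonrespondents. The lower bound places
    every missing response in the top category; the upper bound places
    every missing response in the bottom category.
    """
    if not counts:
        raise ValueError("counts must contain at least one category")
    n_total = sum(counts) + n_missing
    if n_total <= 0:
        raise ValueError("sample must contain at least one unit")

    lower, upper = [], []
    running = 0
    for count in counts:
        running += count
        lower.append(running / n_total)
        upper.append((running + n_missing) / n_total)

    # The CDF equals 1 at the top category whatever the missing values are.
    lower[-1] = 1.0
    return lower, upper


# Illustrative numbers only: 80 respondents over three categories, 20 missing.
lower, upper = worst_case_cdf_bounds([30, 40, 10], n_missing=20)
```

With these counts the width of each interval below the top category equals the nonresponse rate, 0.2. Combining such intervals pointwise across two periods gives valid bounds on each CDF gap, but not necessarily sharp bounds on the $L_1$ discrepancy as a whole.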
