Machine Learning from Big GPS Data about the Heterogeneous Costs of Congestion


We exploit GPS-coded vehicle movement data that track millions of road users second by second over a full year in Berlin, and we propose a novel approach to estimating the external costs of traffic congestion from revealed choices in a big data setting. To measure travel choices, we use unsupervised machine learning to assign anonymous commuting and non-commuting trips to individual drivers and track their repeated trip and route choices. We move beyond considering particular roads in isolation and quantify the externalities created by actual trips on routes that combine the diverse road technologies of main roads and side streets. Causal identification of the route-level congestion elasticity relies on exogenous increases in traffic density from rerouting induced by traffic incidents that occur on adjacent roads off a given trip’s route. Our findings suggest significant temporal heterogeneity in the marginal external costs of congestion, which range from 3.0 to 87.4 euro cents per vehicle kilometer during daytime. Hour-specific congestion taxes generate large welfare gains, while uniform taxes are only 78% as effective.
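
To make the link between a congestion elasticity and a marginal external cost concrete, the following is a minimal sketch of the textbook relation only, not the paper's estimation procedure. If each of q vehicles needs t(q) hours per kilometer, one additional vehicle imposes VOT * q * t'(q) on all others, which equals VOT * t * elasticity. The function name, the value of time, the travel time, and the elasticity below are hypothetical illustrations and are not taken from the paper.

```python
def marginal_external_cost(value_of_time, time_per_km, elasticity):
    """Textbook marginal external congestion cost per vehicle kilometer.

    value_of_time: cost of travel time, in euro cents per hour (assumed input)
    time_per_km:   travel time per kilometer at current density, in hours
    elasticity:    percent change in travel time per percent change in density

    Derivation: external cost = VOT * q * dt/dq = VOT * t * (dt/dq * q/t).
    """
    return value_of_time * time_per_km * elasticity


if __name__ == "__main__":
    # Hypothetical numbers: 10 euros per hour, 20 km/h, elasticity of 0.5.
    mec = marginal_external_cost(value_of_time=1000, time_per_km=0.05, elasticity=0.5)
    print(f"Marginal external cost: {mec:.1f} euro cents per vehicle km")
```

Because travel time per kilometer and the elasticity both vary over the day, applying this relation hour by hour is one way to see why marginal external costs, and hence the corresponding taxes, would differ across hours.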

Ben Thies
Strategy Consultant / Researcher