CEP/STICERD Applications Seminars

Deep learning methods to curate economic data at scale

Melissa Dell (Harvard University)

Monday 13 June 2022 16:00 - 17:30


To receive invites to online seminars, please join our mailing list for this series. Subscribe/Unsubscribe here (seminar_applications).

About this event

Vast amounts of data are trapped in non-computable formats, such as document image scans and text. Deep learning has the potential to greatly expand the questions that economists can study by providing rigorous methods for converting non-computable information into structured, computable data. Combined with advances in GPU compute and inexpensive cloud compute, this makes it feasible to process data on a massive scale. This talk will provide an overview of our work to develop deep learning methods and tools for creating computable social science data, with the aim of making structured digital data more representative of documentary history. This work emphasizes lower-resource contexts, for which there are few incentives for commercial technology, and encompasses novel approaches and tools for document layout analysis, OCR, and NLP pipelines.

Participants are expected to adhere to the CEP Events Code of Conduct.


This series is part of the CEP's Labour Markets programme.