Wednesday, January 10, 2018

Labor market regulations

For cross-country data, see Botero et al. (2004). The data can be downloaded from Andrei Shleifer's website. The World Bank's Doing Business survey has since updated the data; the latest release is for 2017. The reference worker is a 19-year-old full-time supermarket worker with one year of experience (see the methodology description).

For India, see Besley and Burgess (2004). The data can be downloaded from the website of LSE's EOPP research program.

Monday, December 18, 2017

Global Nonviolent Action Database


The Nonviolent and Violent Campaigns and Outcomes (NAVCO) Data Project

A dataset on "major nonviolent and violent resistance campaigns around the globe from 1900-2011."

Saturday, September 2, 2017

Ultra-violet Radiation across the world

"NASA produces daily satellite-based data for ambient UV-R. The UV index captures the strength of radiation at a particular location, and it is available in the form of geographic grids and daily rasters with pixel size of 1° latitude by 1° longitude." (Andersen et al. (2016), p. 1339)

Used by Andersen et al. (2016), who show that the country-level average of the 1990 and 2000 values correlates with per capita income in 2004, conditional on latitude and many other controls.

Friday, August 11, 2017

World Migration Matrix, 1500-2000

Constructed by Louis Putterman and his colleagues.

For each of 165 countries, the data provide the shares of the current (as of 2000) population's ancestors who lived, around 1500, in the area of each of today's 172 countries.

See this page for details. See also Putterman and Weil (2010).

Used by Andersen et al. (2016), among others.

Michalopoulos (2012) uses this data to identify countries in which more than 40% of the current population can trace its ancestry within the same country boundary back to 1500 AD. For these countries, variation in agricultural suitability and elevation is found to be a strong predictor of ethnic diversity.
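The 40% rule amounts to thresholding the diagonal of the migration matrix. Below is a minimal sketch of that idea, assuming the matrix has been loaded as a nested dictionary whose rows are present-day countries and whose columns are source countries in 1500. The country labels, the numbers, and the function names are all made up for illustration; they are not taken from the actual dataset or from Michalopoulos (2012).

```python
# Hypothetical toy matrix: matrix[c][s] is the share of country c's
# current population whose ancestors lived in the area of country s
# around 1500. Each row sums to 1. All values are made up.
matrix = {
    "A": {"A": 0.90, "B": 0.10, "C": 0.00},
    "B": {"A": 0.30, "B": 0.35, "C": 0.35},
    "C": {"A": 0.05, "B": 0.05, "C": 0.90},
}


def own_country_share(matrix, country):
    """Diagonal entry: ancestry share from within the same borders."""
    return matrix[country].get(country, 0.0)


def mostly_indigenous(matrix, threshold=0.40):
    """Countries whose own-country ancestry share exceeds the threshold."""
    return sorted(
        c for c in matrix if own_country_share(matrix, c) > threshold
    )


print(mostly_indigenous(matrix))
```

In this toy example, country B drops out because only 35% of its current population descends from people who lived within its borders in 1500.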

Sunday, July 23, 2017

Administrative Boundaries

There are several GIS datasets on administrative boundaries.

GADM is the most popular among economists. In my experience with both GADM and GAUL, GADM is indeed the more trustworthy of the two:
  • GAUL's parish (fourth level administrative boundary) data for Uganda shows multiple polygons for the same parish name. This is not the case for GADM.
  • GAUL's coastline south of Baku in Azerbaijan is drawn where elevation (according to SRTM30: see this post) is above zero meters, while GADM's coastline is drawn where elevation changes from positive to negative values.
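The duplicate-polygon problem can be checked from the attribute table alone. Here is a minimal sketch, assuming the attributes have been exported to one record per polygon. The field names (`district`, `subcounty`, `parish`) and the values are hypothetical, not the actual GAUL or GADM column names. Keying on the full hierarchy rather than the parish name alone avoids flagging parishes that merely share a name across districts.

```python
from collections import Counter

# Hypothetical attribute records, one per polygon. Field names and
# values are illustrative only.
records = [
    {"district": "D1", "subcounty": "S1", "parish": "P1"},
    {"district": "D1", "subcounty": "S1", "parish": "P1"},  # repeated unit
    {"district": "D1", "subcounty": "S1", "parish": "P2"},
    {"district": "D2", "subcounty": "S9", "parish": "P1"},  # same name, other district
]


def duplicated_units(records, keys):
    """Return units (as tuples of the key fields) that appear more than once."""
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return {unit: n for unit, n in counts.items() if n > 1}


dupes = duplicated_units(records, ["district", "subcounty", "parish"])
print(dupes)
```

A repeated unit is not always an error (an island belonging to a parish may be stored as a separate polygon), so the flagged cases still need to be inspected on a map.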

National Boundaries

the World Vector Shoreline (WVS)
  • If you're interested in the historical national boundaries since 1945

Sub-national Boundaries

Global Administrative Areas (GADM)
  • an alternative to GAUL; see the comparison above.
  • mentioned by Gleditsch and Weidmann (2012) in their review of spatial data analysis in political science.
  • Used by the Gridded Population of the World Version 4 (see here).
  • Used by Dreher et al. (2015).
    • They mention that GADM does not include the second level administrative boundaries (counties/districts) for Egypt, Equatorial Guinea, Lesotho, Libya, and Swaziland.
  • Also used by Alesina et al. (2016) to measure inequality across subnational administrative regions (which turns out to be negatively correlated with per capita GDP).

Global Administrative Unit Layers (GAUL)
  • Supposedly annual panel data from 1990 onward, but district boundary changes are properly tracked only for some countries.
  • Used by Briggs (2015).

The Second Administrative Level Boundaries (SALB) dataset
  • compiled by the United Nations
  • provides the GIS data on second-tier subnational administrative boundaries (i.e. district boundaries).
  • I'm not sure whether the GAUL dataset mentioned above incorporates this or the SALB dataset has its original data.
  • For subnational boundary changes during early years
  • This is the online updated version of the book Administrative Subdivisions of Countries by Gwillim Law (Jefferson, North Carolina: McFarland & Company, 1999).
  • provides the list of administrative regions for every country, past and present. Very useful if you need to match different sub-national or micro datasets by sub-national region, especially when a country of interest, such as Nigeria or Uganda, has changed its sub-national boundaries frequently.
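Matching datasets across boundary changes usually comes down to a crosswalk from each later unit to the unit it belonged to in a chosen base year. A minimal sketch follows, assuming such a crosswalk has been built by hand from a list of administrative regions. The district names, the values, and the function name are hypothetical.

```python
# Hypothetical crosswalk: each later district maps to the district it
# was part of in the base year. Names are made up.
crosswalk = {
    "North": "North",
    "North-East": "North",  # carved out of "North" after the base year
    "South": "South",
}

# Hypothetical later-year data: (district name, count of something).
survey = [("North", 120), ("North-East", 80), ("South", 50)]


def aggregate_to_base(rows, crosswalk):
    """Sum values over the base-year districts given by the crosswalk."""
    totals = {}
    for district, value in rows:
        base = crosswalk[district]  # a KeyError flags an unmatched name
        totals[base] = totals.get(base, 0) + value
    return totals


print(aggregate_to_base(survey, crosswalk))
```

This only handles districts that split. Mergers of parts of several districts, or shifted borders, would need area or population weights instead of a one-to-one mapping.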

Sunday, July 16, 2017

Raw Material Data (RMD)

Annual panel data on mines around the world since 1980, compiled by IntierraRMG, now part of S&P Global.

According to Berman et al. (2017), the dataset includes variables such as
  • whether a mine is active
  • the year production started
  • the specific minerals produced
  • the total production for each of them
  • ownership structure and characteristics of the mines
  • extraction methods

Small-scale mines, and those that are illegally operated, are not included.

Used by Berman et al. (2017) (see their footnote 12).