The CIA World Factbook is a great source of information on every country and territory in the world, from data such as land area or GDP to information on history and government. It makes for a rather interesting data set, but since the data isn’t available to download in a machine-readable format, I wrote a Python script that uses Beautiful Soup to scrape a selection of the data from the site and Pandas to write it to a file. (Here’s the resulting CSV file.)
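For a rough idea of what a scrape-and-save script of this kind looks like, here is a minimal sketch. It is not the script used for this post: the HTML snippet, the class and attribute names, the column names, and the numbers are all made up for illustration, and the real Factbook pages are laid out differently.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup and made-up values; the real Factbook pages differ.
SAMPLE_HTML = """
<div class="country" data-name="Country A">
  <span class="field" data-key="area_km2">1,000</span>
  <span class="field" data-key="population">250,000</span>
</div>
<div class="country" data-name="Country B">
  <span class="field" data-key="area_km2">316</span>
  <span class="field" data-key="population">40,500</span>
</div>
"""


def parse_countries(html):
    """Return one DataFrame row per country block found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for block in soup.find_all("div", class_="country"):
        row = {"country": block["data-name"]}
        for field in block.find_all("span", class_="field"):
            # Strip thousands separators before converting to a number.
            text = field.get_text(strip=True).replace(",", "")
            row[field["data-key"]] = float(text)
        rows.append(row)
    return pd.DataFrame(rows)


df = parse_countries(SAMPLE_HTML)
# With no path, to_csv returns the CSV as a string; pass a filename to write a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

In a real run the HTML would come from fetching each country page, and the selectors would have to match whatever markup the site actually uses.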

With the data in a useful format I can start to create plots with Bokeh. With every plot you can pan, zoom, and hide individual regions by clicking on the relevant legend entry. Hover over each point to see what country it represents.
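The interactions described above (pan, zoom, hover labels, click-to-hide legend entries) map onto a handful of Bokeh settings. The sketch below shows one way to wire them up. The region names, column names, colours, and values are placeholders, and the log axes are an assumption, not necessarily how the plots on this page are configured.

```python
from bokeh.models import ColumnDataSource, HoverTool
from bokeh.plotting import figure

# Made-up values grouped by region, standing in for the scraped CSV.
REGIONS = {
    "Region X": {
        "country": ["Country A", "Country B"],
        "area_km2": [1000.0, 316.0],
        "population": [250000.0, 40500.0],
    },
    "Region Y": {
        "country": ["Country C"],
        "area_km2": [52000.0],
        "population": [3100000.0],
    },
}
COLORS = {"Region X": "navy", "Region Y": "firebrick"}


def make_plot(regions):
    """Build a scatter plot with one legend entry per region."""
    p = figure(
        title="Population vs. land area",
        x_axis_type="log",  # assumption: log axes suit data spanning many magnitudes
        y_axis_type="log",
        x_axis_label="Land area (sq km)",
        y_axis_label="Population",
        tools="pan,wheel_zoom,box_zoom,reset",
    )
    # Show the country name when the cursor is over a point.
    p.add_tools(HoverTool(tooltips=[("Country", "@country")]))
    for region, columns in regions.items():
        p.scatter(
            "area_km2",
            "population",
            source=ColumnDataSource(columns),
            size=8,
            color=COLORS[region],
            legend_label=region,
        )
    # Clicking a legend entry hides that region's points.
    p.legend.click_policy = "hide"
    return p


plot = make_plot(REGIONS)
```

Calling `bokeh.plotting.show(plot)` opens the result in a browser, and `bokeh.embed.components(plot)` returns a script and a div that can be pasted into a web page without running a Bokeh server.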

Here’s the population against land area for 237 countries and territories:

And here’s the life expectancy as a function of GDP per capita (in purchasing power parity dollars):

There’s a huge amount of interesting data contained here, such as unemployment rate as a function of budget surplus:

This plot is dominated by outliers: a 100% budget deficit in Timor-Leste and 95% unemployment in Zimbabwe.

There are also some interesting cases where you might expect a correlation and don’t find one, such as life expectancy as a function of health spending:
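One way to put a number on how weak a relationship like this is would be a correlation coefficient. The sketch below uses pandas on made-up values with hypothetical column names. In practice the two columns would be read from the scraped CSV.

```python
import pandas as pd

# Made-up values for illustration only; load the scraped CSV in practice.
df = pd.DataFrame(
    {
        "health_spending_pct_gdp": [4.0, 6.5, 9.0, 11.5, 17.0],
        "life_expectancy_years": [74.0, 61.0, 82.0, 70.0, 79.0],
    }
)

# Pearson's r: close to 0 suggests little linear relationship.
r = df["health_spending_pct_gdp"].corr(df["life_expectancy_years"])
print(f"Pearson r = {r:.2f}")
```

Pearson's r only captures linear association, so a value near zero does not rule out a more complicated relationship between the two quantities.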

There’s a huge amount of data here, so let me know if there are any other plots you want to see!

  1. How about Oil Consumption vs. Life Expectancy, Oil Consumption vs. GDP per Capita, or Deaths per Year vs. Oil Exports?

  2. Excellent work, I love BeautifulSoup. But not as much as I am loving the embedded plots on this page. Beautiful.

    1. Yeah, Bokeh is really nice. You can add a lot more interactivity if you have a server, but even without one it’s a nice way to make use of the web format!

