Is it possible to retrieve data on all countries but not aggregates?

Open MaxGhenis opened this issue 6 years ago • 3 comments

I'd like to get data on all countries, and exclude aggregates. Is there a way to do this, e.g. with get_dataframe?

MaxGhenis avatar Mar 02 '19 20:03 MaxGhenis

Here's my workaround: since aggregate geos are returned first, I get the index of the final aggregate geo ("World") and remove all geos with that index or lower.

Example:

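import pandas as pd
import wbdata

df = wbdata.get_dataframe({'SP.POP.TOTL': 'pop'}).reset_index()
geos = pd.Series(df.country.unique())  # geo names in returned order
# Aggregates come first and end with "World", so drop everything up to it.
world_index = geos[geos == 'World'].index[0]
aggs = geos[:world_index + 1]
df[~df.country.isin(aggs)].head()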
country      date  pop
Afghanistan  2018  NaN
Afghanistan  2017  35530081.0
Afghanistan  2016  34656032.0
Afghanistan  2015  33736494.0
Afghanistan  2014  32758020.0

MaxGhenis avatar Mar 02 '19 20:03 MaxGhenis

Not at the moment, that's how the WB API handles things. I suppose we could build that in manually without too much trouble by indicating a special code that means "actually just countries". Or we could have a constant. The difficulty there is that we'd ideally want to be able to identify which "countries" are aggregates at runtime. I'll noodle on that.

OliverSherouse avatar Mar 09 '19 15:03 OliverSherouse

Another workaround is to use [i for i in wbdata.get_country() if i['incomeLevel']['value'] != "Aggregates"]; that seems to be fairly comprehensive. I'll consider adding that as a utility in the next version.

OliverSherouse avatar Dec 11 '19 20:12 OliverSherouse
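
The incomeLevel workaround above can be tried offline with a minimal sketch like the one below. The helper names (is_aggregate, country_names) and SAMPLE_RECORDS are hypothetical. The sample records are hand-written and only assume the record shape implied by the list comprehension in the comment (a 'name' key and a nested 'incomeLevel' dict with a 'value' key). They are not real API output.

```python
def is_aggregate(record):
    """Return True if a country record is marked as an aggregate.

    Assumes aggregates carry incomeLevel value "Aggregates", as in the
    list comprehension quoted in the thread.
    """
    return record.get("incomeLevel", {}).get("value") == "Aggregates"


def country_names(records):
    """Return the names of all records that are not aggregates."""
    return [r["name"] for r in records if not is_aggregate(r)]


# Hand-written illustrative records (hypothetical, not fetched from the API).
SAMPLE_RECORDS = [
    {"id": "WLD", "name": "World",
     "incomeLevel": {"id": "NA", "value": "Aggregates"}},
    {"id": "AFG", "name": "Afghanistan",
     "incomeLevel": {"id": "LIC", "value": "Low income"}},
    {"id": "ARB", "name": "Arab World",
     "incomeLevel": {"id": "NA", "value": "Aggregates"}},
    {"id": "BRA", "name": "Brazil",
     "incomeLevel": {"id": "UMC", "value": "Upper middle income"}},
]

names = country_names(SAMPLE_RECORDS)
print(names)
```

With a live connection, the records would presumably come from wbdata.get_country() as in the comment above, and the result could then filter the earlier frame with df[df.country.isin(names)]. That last step assumes the frame's country labels match the records' 'name' values, which is worth checking before relying on it.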