research potential data sources to add

Open derekeder opened this issue 11 years ago • 11 comments

  • [x] US Census American Community Survey: CensusReporter.org
  • [x] USASpending.org: http://www.usaspending.gov/data?tab=API
  • [x] Bureau of Economic Analysis: http://www.bea.gov/API/signup/index.cfm
  • [ ] Census: http://www.census.gov/developers/
  • [ ] BJS: http://www.bjs.gov/developer/ncvs/index.cfm
  • [ ] Dept. of Labor (especially BLS): http://developer.dol.gov/
  • [ ] EPA: http://developer.epa.gov/
  • [ ] PANDA: http://pandas.pydata.org/pandas-docs/stable/api.html

We should pick one of these to work on next, keeping in mind that we want to build an extensible API wrapper that plugs into Geomancer. This is similar to the approach Open Civic Data takes: http://opencivicdata.readthedocs.org/en/latest/scrape/index.html
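For illustration only, here is a minimal sketch of what an extensible wrapper could look like. Every name in it (`DataSource`, `register`, `REGISTRY`, `FakeSpending`, `sources_for`) is hypothetical and not taken from Geomancer's code; the idea is simply that each external API becomes one subclass declaring which geography types it supports, so adding a source does not mean touching the core.

```python
from abc import ABC, abstractmethod


class DataSource(ABC):
    """Hypothetical base class: one subclass per external API."""

    name = ""       # key used to look the source up in the registry
    geo_types = ()  # geography levels this source can be joined on

    @abstractmethod
    def lookup(self, geo_type, geo_ids):
        """Return {geo_id: {column_name: value}} for the requested geographies."""


# Maps source name -> source class (hypothetical, for illustration).
REGISTRY = {}


def register(cls):
    """Class decorator that makes a source discoverable by name."""
    REGISTRY[cls.name] = cls
    return cls


@register
class FakeSpending(DataSource):
    """Stand-in source with canned numbers, used only to exercise the interface."""

    name = "fake_spending"
    geo_types = ("state",)
    _totals = {"IL": 100, "NY": 250}

    def lookup(self, geo_type, geo_ids):
        if geo_type not in self.geo_types:
            raise ValueError(f"{self.name} does not support {geo_type!r}")
        # Unknown geographies come back as None rather than raising.
        return {g: {"total": self._totals.get(g)} for g in geo_ids}


def sources_for(geo_type):
    """List registered sources that can be joined on the given geography type."""
    return sorted(n for n, c in REGISTRY.items() if geo_type in c.geo_types)
```

With this shape, a caller could ask `sources_for("state")` to find candidate sources and then call `REGISTRY[name]().lookup("state", ["IL", "NY"])` on each one.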

derekeder avatar Sep 10 '14 21:09 derekeder

@evz let's pick the one of these that is most different from the CensusReporter API and work on incorporating it into the Geomancer data sources.

derekeder avatar Oct 07 '14 21:10 derekeder

PANDA would provide the greatest flexibility for the end user, but it may not be the best use case for developing the extensible API wrapper.

It does introduce issues the other APIs don't, though. For example, how do we identify data sets in PANDA that can be merged via Geomancer? (How can the PANDA user identify new data sets that should be made available to Geomancer without requiring a change to the API wrapper?)

tthibo avatar Oct 08 '14 14:10 tthibo

@tthibo PANDA may be the next best data source to integrate. However, we don't have a PANDA install available and it would take time to set one up. Does AP have one we could test with?

If not, the next best candidate would be USASpending, as the data comes in XML format, which we don't handle yet.
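As a rough sketch of what handling an XML response might involve, the snippet below flattens repeated elements into row dictionaries using only the standard library. The element names (`results`, `record`, `state`, `total`) are made up for the example and are not USASpending's actual schema; `xml_to_rows` is likewise a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Invented sample payload; the real API's element names will differ.
SAMPLE = """<results>
  <record><state>IL</state><total>100.50</total></record>
  <record><state>NY</state><total>250</total></record>
</results>"""


def xml_to_rows(xml_text, record_tag="record"):
    """Flatten each <record_tag> element into a dict of child tag -> text."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in rec}
        for rec in root.iter(record_tag)
    ]


rows = xml_to_rows(SAMPLE)
```

Values come back as strings, so any numeric conversion would still need to happen before merging them into a spreadsheet.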

derekeder avatar Oct 08 '14 14:10 derekeder

The AP's PANDA install is behind the firewall. We can set one up on a world-facing server, but that would take some time, as you mention. In the meantime we could consider using their public demo: http://demo.pandaproject.net/#login

Or if it makes more sense to give USASpending a shot, I'm fine with that. I'll be honest, that was one recommended by a reporter, but I'm not quite clear on the use case for it. Does that API provide data aggregated at the geographical level, or does it only provide data at the contract level, available by state, for example?

tthibo avatar Oct 08 '14 15:10 tthibo

@tthibo Looks like you can get the contracts summarized by vendor location or by performance location. Locations can be looked up by Congressional District, State, Zip Code, or City. There are varying degrees of detail that you can get back, the most general being totals for whatever your search criteria are.

The same is true of the Federal Assistance and Federal Sub-awards endpoints.

That aside, I am also looking at the Bureau of Labor Statistics stuff. That might be another good candidate for integration, mainly because getting to the numbers is a multistep process.

evz avatar Oct 08 '14 15:10 evz

@tthibo do you have a sense of which data sources would be the most valuable to add next?

  • Census: http://www.census.gov/developers/
  • BJS: http://www.bjs.gov/developer/ncvs/index.cfm
  • Dept. of Labor (especially BLS): http://developer.dol.gov/
  • EPA: http://developer.epa.gov/

We have a decent start on BLS and could wrap that one up pretty quickly. Any others you want us to investigate?

derekeder avatar Feb 13 '15 18:02 derekeder

Let's do BLS next, then. After that, I'd do BJS. I think decennial Census will be great to have, but since we already have ACS, it's a little less pressing. I do have other ideas, but those listed here would trump any additional sources.

tthibo avatar Feb 13 '15 21:02 tthibo

The only API that BJS lists in its data tools is the National Crime Victimization Survey (NCVS).

In the NCVS field descriptions (personal and household), the only geography in the data is region (i.e., Northeast, Midwest, South, West).

cathydeng avatar Feb 16 '15 16:02 cathydeng

Suggestion from the NICAR session: add a country geotype and international data from the World Bank, http://data.worldbank.org/

derekeder avatar Mar 06 '15 16:03 derekeder

Another suggestion from NICAR: OpenElections data, https://github.com/openelections

@zstumgoren would know something about this :smile_cat:

derekeder avatar Mar 06 '15 16:03 derekeder

We also had another vote for decennial Census at the NICAR session.

tthibo avatar Mar 06 '15 19:03 tthibo