draft JSON data schema, etc.
All right, per Issue #13, here's the JSON Schema and a sample transcription in JSON (Dell's 2010 EEO-1). However, I don't think the schema, validation tool(s), and raw JSON belong in this repository, so I don't actually think you should merge this ;)
Instead, I'd propose opening a separate repo (odd-data?) and publishing it to npm and/or bower and/or pip. This repo/site could then install the npm package to get the JSON files it links to for download, while anyone who wants all the data at once (along with the toolkit, i.e., schema, validation, etc.) can get it with a single npm install or similar. I came across this approach somewhere else recently, releasing data by publishing it as a package, but I can't remember where :/ I thought it was a cool idea though, and I think it makes sense, separation-of-concerns-wise.
I'll be able to take care of the PR on this repository to add the dependency, links to JSON (just Dell at the moment), etc. I've never (myself) published to npm or bower or pip before, but I'm quite familiar with the first two and pretty sure I can figure it out.
Thoughts?
@hypatia @jhlch I might have some time to work on this this weekend. How do you feel about the separate repo idea?
I like the idea of this being in a different repo. I'll have a close look at the JSON format today or tomorrow and comment on it. Thanks for the great work!
Awesome, thanks @jhlch. I may add a README with some notes about the schema (and the decisions I made, many/all of which could be debated). One of the really annoying things about JSON Schema is that it doesn't have a comment syntax.
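One partial workaround for the missing comment syntax is JSON Schema's `description` annotation keyword, which lets short notes live next to the properties they explain. A minimal sketch follows; the property names and wording are hypothetical and not taken from the draft schema in this PR.

```python
import json

# Hypothetical fragment: "year" and "company" are illustrative property
# names, not the fields of the actual draft schema.
schema_fragment = {
    "type": "object",
    "description": "One EEO-1 report for a single company and year.",
    "properties": {
        "company": {
            "type": "string",
            "description": "Company name as printed on the report.",
        },
        "year": {
            "type": "integer",
            "description": "Reporting year of the EEO-1 filing.",
        },
    },
}

# Serialize to the JSON text that would be stored in the schema file.
schema_text = json.dumps(schema_fragment, indent=2)
print(schema_text)
```

Longer design rationale would probably still fit better in the README, since `description` is meant for short annotations.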
I am pro a separate repo for the data!
This looks pretty good to me. My high-level comments are:
- Consider versioning the schema so that you can change it more easily in the future. This may or may not be worth the effort/overhead, but it is something to consider.
- Adding a readme that describes the schema and design decisions seems like a good idea and would probably be pretty instructive.
- How do you want to keep the data in sync between the data repo and the links to the data added here? Not sure what the best solution is, but we should think about it and talk about it.
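On the versioning point, one possible approach is to have each data file carry a version tag that tooling checks before reading the rest of the document. This is only a sketch: the `schemaVersion` field name, the version string, and the sample record are all assumptions, not part of the draft schema.

```python
import json

# Hypothetical: the versions this tooling knows how to read.
SUPPORTED_SCHEMA_VERSIONS = {"1.0.0"}


def check_schema_version(document: dict) -> str:
    """Return the document's schema version, or raise if missing or unsupported."""
    version = document.get("schemaVersion")  # hypothetical field name
    if version is None:
        raise ValueError("document has no schemaVersion field")
    if version not in SUPPORTED_SCHEMA_VERSIONS:
        raise ValueError(f"unsupported schemaVersion: {version}")
    return version


# Illustrative record only; not the real transcription format.
sample = json.loads('{"schemaVersion": "1.0.0", "company": "Dell", "year": 2010}')
print(check_schema_version(sample))
```

A check like this lets the schema change later without silently misreading files written against an older version.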
@jhlch I'm working on the README now and chilling in DU's IRC if you wanna chat about keeping the data in sync. The short answer: publish the separate repo to npm (which is super easy), add the npm package as a dependency of this website, and commit the data files to the appropriate directory here (linking to them where appropriate so people can download them, of course). Every time the data gets updated, rev the version of the npm package and re-publish, then rev the dependency version here, run an npm install to get the new version, and commit any new or changed data files, add links, etc.
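The "copy the data files into this repo after an npm install" step could be scripted. Here is a rough sketch, assuming the package ends up being named `odd-data` and keeps its JSON files in a `data/` subdirectory; both are guesses, since the package does not exist yet, and the function name is made up for illustration.

```python
import shutil
from pathlib import Path
from typing import List


def sync_data_files(package_data_dir: Path, site_data_dir: Path) -> List[str]:
    """Copy every .json file from the installed package into the site's data directory.

    Returns the names of the files that were copied, in sorted order.
    """
    site_data_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for source in sorted(package_data_dir.glob("*.json")):
        # copy2 preserves timestamps along with the file contents.
        shutil.copy2(source, site_data_dir / source.name)
        copied.append(source.name)
    return copied
```

After revving the dependency and running npm install, something like `sync_data_files(Path("node_modules/odd-data/data"), Path("data"))` would refresh the committed copies (both paths are hypothetical), leaving only the commit and the new links to do by hand.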
We can also publish to other repositories like bower or pip, but I think npm is a good place to start. I'd say it's the most standard for JS developers, including people who develop D3 visualizations.