Next steps for TaxData
Now that #314 has been merged I have a few ideas about the next updates to taxdata we can make. In no particular order:
- [ ] #333
- [ ] Refactor the repo to remove redundant code. We can reuse the code that creates tax units for the CPS file to create them for the statistical matching, and we can consolidate the stage 2 scripts.
- [ ] Re-write the make files and documentation to reflect the changes in PR #314
- [ ] Revisit how we handle imputations for the CPS
- [ ] Possibly make parts of taxdata into a standalone package
- [ ] Put together some scripts that produce a simple report detailing how our projections change when we update the CBO projections our extrapolations are based on, so that we have a log of those changes
- [ ] Work on making it as easy as possible for others to use taxdata to prepare taxcalc-ready microdata files using other years of the PUF and CPS. This could go hand in hand with making parts of taxdata a standalone package
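For the projection-change report item above, here is a minimal sketch of what such a script might look like. It is not taken from taxdata: the CSV layout (`variable`, `year`, `value`), the function names, and the placeholder variable names `VAR_A` and `VAR_B` are all assumptions made for illustration, and the numbers are made up. It uses only the Python standard library.

```python
import csv
import io


def load_projections(csv_text):
    """Parse CSV text with columns variable, year, value (an assumed layout)
    into a dict keyed by (variable, year)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row["variable"], int(row["year"])): float(row["value"])
        for row in reader
    }


def projection_changes(old, new):
    """Return (variable, year, old, new, pct_change) for every changed value.

    pct_change is None when a value was added, removed, or previously zero.
    """
    rows = []
    for key in sorted(set(old) | set(new)):
        old_val, new_val = old.get(key), new.get(key)
        if old_val == new_val:
            continue  # unchanged values stay out of the log
        if old_val is None or new_val is None or old_val == 0:
            pct = None
        else:
            pct = 100.0 * (new_val - old_val) / old_val
        rows.append((key[0], key[1], old_val, new_val, pct))
    return rows


def format_report(rows):
    """Render the changes as CSV text that can be committed as a change log."""
    lines = ["variable,year,old,new,pct_change"]
    for var, year, old_val, new_val, pct in rows:
        pct_text = "" if pct is None else f"{pct:.2f}"
        lines.append(f"{var},{year},{old_val},{new_val},{pct_text}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up numbers standing in for two vintages of projections.
    old = load_projections("variable,year,value\nVAR_A,2021,100\nVAR_A,2022,110\n")
    new = load_projections(
        "variable,year,value\nVAR_A,2021,100\nVAR_A,2022,121\nVAR_B,2022,5\n"
    )
    print(format_report(projection_changes(old, new)))
```

Keeping the comparison separate from the formatting would let the same diff feed a CSV log, a markdown table, or a PR comment, whichever turns out to be most useful.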
Would love to hear others' thoughts!
@andersonfrailey, these sound fantastic.
I have been thinking on similar lines and have a very rough draft of how a "Tax Data Generator" app might be structured. I have been thinking of this as an app on C/S available to people with access to the PUF, driven by a ParamTools-powered API within TaxData. But everything is highly speculative. This could be the same as the "taxdata" standalone package you describe, or it could be an additional API.
The order of adding different pieces (in particular, prioritizing the not-yet-ready 50-state file over the CPS file), indicated by color in the doc, is based on a goal of getting an alpha 50-state file as soon as possible to support election analyses and COVID response, but I am not at all averse to other contributors having other priorities!
This is somewhat duplicative of the doc linked above, but a few other steps that I'd add to Anderson's list in the top comment:
- [ ] Incorporate state targeting for the PUF. (@Peter-Metz, @donboyd5, and I are currently working on this)
- [ ] Revisit whether we can combine Stage 3 with Stage 2 (and add other distributional targets) thanks to the adoption of new solvers for the federal problem, CVXOPT (already included in TaxData) or IPOPT (used in the state work).
- [ ] Revisit the PUF-CPS match and filer/non-filer distinction. In particular, Perese's approach is quite elegant and worth considering as an enhancement or alternative option.
@MattHJensen
> I have been thinking on similar lines and have a very rough draft of how a "Tax Data Generator" app might be structured. I have been thinking of this as an app on C/S available to people with access to the PUF, driven by a ParamTools-powered API within TaxData. But everything is highly speculative. This could be the same as the "taxdata" standalone package you describe, or it could be an additional API.
I think this would line up perfectly with the standalone package I'm thinking of. It would also be part of the last bullet point I listed:
> Work on making it as easy as possible for others to use taxdata to prepare taxcalc-ready microdata files using other years of the PUF and CPS. This could go hand in hand with making parts of taxdata a standalone package
I'll look through your draft and open up a separate issue with some ideas for how we can put this together.
> Revisit the PUF-CPS match and filer/non-filer distinction
I really like Perese's approach you linked to. Adopting it would also solve issue #323.
Also on the list:
- [ ] fulfill requirements to go from `psl-incubating` to `psl-cataloged`
Spending program projections could probably use review given COVID.
- [ ] Consider whether to use (relatively-newly offered) CBO projections of spending by budget account for extrapolating C-TAM-imputed benefits, or document the issue for later consideration. https://www.cbo.gov/data/budget-economic-data#9
Just wanted to 👍 the below. UBI Center has a few potential analyses that would involve 2018 ASEC+taxcalc, including simulating Covid unemployment benefit changes in prior years.
> Work on making it as easy as possible for others to use taxdata to prepare taxcalc-ready microdata files using other years of the PUF and CPS. This could go hand in hand with making parts of taxdata a standalone package
@andersonfrailey These are great next steps. I'm going to add this to a "roadmap" document in PR #401