Relicence + REUSE compliance
Closes #74
Changes proposed in this Pull Request
- Change licence from GPLv3 to MIT. Requires confirmation by the following contributors (please provide below by answering "Yes" to this issue!)
  - @martavp ✔️
  - @nworbmot
  - @lisazeyen ✔️
  - @euronion ✔️
  - @millingermarkus ✔️
  - @pz-max ✔️
  - @fneum ✔️
- Also create REUSE compliance of repository
TODO
- [ ] How should we handle files in `docu` and `inputs` which are PDFs / XLSX files from others serving as input? Exclude them from licencing? Remove completely from repository and instead add them to the `Snakemake` workflow?
- [ ] Add http://precommit.ci/ check for repo
Checklist
- [x] Code changes are sufficiently documented; i.e. new functions contain docstrings and further explanations may be given in `doc`.
- [x] Data source for new technologies is clearly stated.
- [x] Newly introduced dependencies are added to `environment.yaml` (if applicable).
- [ ] A note for the release notes `doc/release_notes.rst` of the upcoming release is included.
- [x] I consent to the release of this PR's code under the MIT / CC-BY-4.0 license.
[RE relicense technology-data to MIT] Ping @martavp @nworbmot @millingermarkus @lisazeyen @pz-max
Yes!
I give my consent for relicensing to MIT
[RE relicense technology-data to MIT] Ping @martavp @nworbmot @lisazeyen
[RE relicense technology-data to MIT] Ping @martavp @nworbmot @lisazeyen
Yes!
Yes! Sorry for the delay :)
Thanks! Now only @nworbmot is missing.
Just going to give this a bump because it may impact our use case. I'd love to have insight into the licensing of the source and output files. Right now, it is not clear to me whether the data can be used commercially, for instance.
If you could use help finishing this PR, @euronion, let me know!
@nworbmot are you fine with changing the licence?
We now have all "ok"s that we need.
- I just updated the newest files to be compliant and resolved a licensing inconsistency on the `.py` files in `latex_tables`
- I noticed `latex_tables` does not get a lot of attention. I'm not using it, is anyone else? Or is this an artifact we could remove in the future? (Not relevant for this PR)
- Removed two files that were not supposed to be committed to the repo (@fneum @millingermarkus *wink* *wink*)
What still needs to be resolved
We have a lot of files in inputs/ and docu/ without licenses. We'd have to check each of them to see whether it actually comes with a license. We can't become REUSE compliant as long as the following files without license information are in our repository, so we would need to remove them or think of another solution:
The following files have no copyright and licensing information:
* docu/Anhang-Studie-Wege-zu-einem-klimaneutralen-Energiesystem.pdf
* docu/Appendix-Study-Paths-to-a-Climate-Neutral-Energy-System.pdf
* docu/DIW_cost.pdf
* docu/Fraunhofer-ISE-Studie-Wege-zu-einem-klimaneutralen-Energiesystem.pdf
* docu/NREL_H2 storage costs.pdf
* docu/True Competitiveness of Solar PV.pdf
* docu/Vartiainen_et_al-2019-Progress_in_Photovoltaics__Research_and_Applications.pdf
* docu/bp-stats-review-2019-approximate-conversion-factors.pdf
* docu/bp-stats-review-2019-full-report.pdf
* docu/eng_note_on_technology_costs_for_offshore_wind_turbines.pdf
* docu/lazards-levelized-cost-of-energy-version-130-vf.pdf
* docu/metodebeskrivelse_engelsk.pdf
* docu/opdatering_af_teknologidata_for_solceller_oktober_2017.pdf
* docu/technology_data_catalogue_for_el_and_dh_-_0009.pdf
* docu/technology_data_catalogue_for_energy_storage.pdf
* docu/technology_data_catalogue_for_individual_heating_installations.pdf
* docu/technology_data_catalogue_for_industrial_process_heat_-_0001.pdf
* docu/technology_data_for_energy_transport.pdf
* docu/technology_data_for_energy_transport_0321.pdf
* docu/technology_data_for_renewable_fuels_-_0003.pdf
* docu/update_of_financial_data_for_coal_fired_chp_plants_may17_july17.pdf
* inputs/EWG_costs.csv
* inputs/Fraunhofer_ISE_costs.csv
* inputs/Fraunhofer_ISE_energy_prices.csv
* inputs/Fraunhofer_ISE_vehicles_costs.csv
* inputs/data_sheets_for_renewable_fuels.xlsx
* inputs/energy_transport_data_sheet_dec_2017.xlsx
* inputs/energy_transport_datasheet.xlsx
* inputs/pnnl-energy-storage-database.xlsx
* inputs/technology_data_catalogue_for_energy_storage.xlsx
* inputs/technology_data_for_carbon_capture_transport_storage.xlsx
* inputs/technology_data_for_el_and_dh.xlsx
* inputs/technology_data_for_el_and_dh_-_0009.xlsx
* inputs/technology_data_for_industrial_process_heat.xlsx
* inputs/technologydatafor_heating_installations_marts_2018.xlsx
Thanks @thesethtruth for pinging us on this and bringing it back to our attention.
Awesome work @euronion, and no thanks to me, but thanks to all of you here! I'm grateful to be able to work with the high quality work you all produce.
Like I said before, if I can assist in the matter, let me know! For instance, would it help if I tackled some of the files that still have unknown licensing?
Thanks @thesethtruth . Yes, if you could help with those files it would be appreciated. Here's what we need:
- For each of the files we need to know under which license they are available and if possible who the copyright holder is.
- The license & copyright information then goes into the `.reuse/dep5` file
- If you can't find any copyright information, post the URL to the website from where we got the file here; then we can discuss how to proceed. We might be able to contact some of the authors directly, other files might no longer be needed.
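To make the second point concrete, here is a minimal sketch of what an entry for the `.reuse/dep5` file could look like, generated with a small helper. The helper name `dep5_stanza` is made up for illustration, and the copyright holder and license are deliberately left as placeholders because we do not know them yet. Replacing spaces in file names with the `?` wildcard is an assumption based on the Debian copyright format, where whitespace separates patterns; please double check that `reuse lint` accepts it.

```python
def dep5_stanza(paths, copyright_holder, license_id):
    """Render one 'Files' stanza in the Debian copyright (DEP-5) format.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    if not paths:
        raise ValueError("at least one path is required")
    # Whitespace separates patterns in DEP-5, so a literal space inside a
    # file name is replaced by the single-character wildcard '?' (assumption).
    patterns = [path.replace(" ", "?") for path in paths]
    # Continuation lines of a multi-line field start with a single space.
    files = "\n ".join(patterns)
    return (
        f"Files: {files}\n"
        f"Copyright: {copyright_holder}\n"
        f"License: {license_id}\n"
    )


# Placeholders only: the real holder and SPDX identifier still need to be found.
print(dep5_stanza(
    ["inputs/EWG_costs.csv"],
    "<year> <copyright holder>",
    "<SPDX license identifier>",
))
```

Each stanza is appended to `.reuse/dep5`, separated from the previous one by an empty line.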
But maybe first:
Check which files are actually needed and used in the repository right now, serving as input files to the snakemake workflow. We might have some legacy files still in the repo, especially files in docu/ which may no longer be needed. I'd prefer to remove these files if it is okay for the others as well (tagging @lisazeyen here primarily). If they are gone from the repository we would not need to find licensing information for them :)
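A rough way to do this check could look like the sketch below. It is only a plain text search: it takes the text of the Snakefile, ignores lines that are fully commented out, and reports which of the candidate paths still appear. The function name `referenced_files` and the example rule are made up for illustration and are not copied from our Snakefile. Files that are only opened inside the scripts, or whose paths are built from wildcards, would be missed, so the result is a starting point and not a final answer.

```python
def referenced_files(snakefile_text, candidates):
    """Split candidate paths into (used, unused) by searching active Snakefile lines.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    # Keep only lines that are not entirely commented out.
    active = "\n".join(
        line
        for line in snakefile_text.splitlines()
        if not line.lstrip().startswith("#")
    )
    used = [path for path in candidates if path in active]
    unused = [path for path in candidates if path not in active]
    return used, unused


# Made-up Snakefile fragment, only to show the behaviour.
example_snakefile = '''
rule compile_cost_assumptions:
    input:
        el_and_dh="inputs/technology_data_for_el_and_dh.xlsx",
#       diw="docu/DIW_cost.pdf",
'''

used, unused = referenced_files(
    example_snakefile,
    ["inputs/technology_data_for_el_and_dh.xlsx", "docu/DIW_cost.pdf"],
)
print("still referenced:", used)
print("candidates for removal:", unused)
```

For the real check, `snakefile_text` would be the content of the `Snakefile` and `candidates` the list of files reported as unlicensed above.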
Yes, I agree, better to remove them!
So that would leave us with https://github.com/PyPSA/technology-data/blob/1e6e79ae7a7815132f379b0e62b772a42ea94165/Snakefile#L5-L20
The EWG csv is derived from the PDF in a lower level rule. https://github.com/PyPSA/technology-data/blob/1e6e79ae7a7815132f379b0e62b772a42ea94165/Snakefile#L41-L45
Also, I don't know why master contains code that has been commented out, might be for a good reason. To me it seems that the top rule depends on that conversion of the Fraunhofer PDF to csv
https://github.com/PyPSA/technology-data/blob/1e6e79ae7a7815132f379b0e62b772a42ea94165/Snakefile#L29-L34
That makes things much easier. Let's remove all the PDF files then. Can you maybe add links to the original sources somewhere in the documentation, so we keep them saved for posterity?
LUT costs
The original document from which the csv is derived is not in the repo, is it? (that should be this one).
Three options:
- Leave as is and put a license claim on it (lazy variant, but I'm pretty sure we're not allowed to do that)
- This only affects the technologies `home battery inverter` and `home battery storage` and an option in the configuration file. You could consider reimplementing it and moving the input to `inputs/manual_input.csv`. Not sure how much the config option actually affects results; since it is a `True` or `False` option, one could just add the values to the `manual_input.csv` and read them from there for `True`, and for `False` the DEA standard assumptions should be used anyway.
- Download the file with a rule in snakemake (see here) and remove the `.csv` file. Then there is no file left to put a license on ;-)
- Ask the study authors from LUT about the license (none is mentioned in the document)
Fraunhofer ISE costs
Judging from the commit message, rule convert_fraunhofer broke at some point. If I see it correctly, we only use technology=Gasnetz from that study. Not sure if anyone is actually using that technology anymore.
Two options:
- Manually transfer the values from the study to `inputs/manual_input.csv` (-> backward compatibility)
- Remove the technology altogether (@lisazeyen : Do you know what technology specifically "Gasnetz" refers to? Distribution grid? Can we remove that technology or is it still useful for something? If it is still useful: Can we have a better description for it?)
In either case: We can remove the document from docu, remove the code block from Snakefile and the related code in compile_cost_assumptions.py
For LUT and Fraunhofer ISE costs I lean towards option 2.). Would you agree?