Download awards endpoint fails when specifying columns
The query fails, when I specify columns to download with the:
/api/v2/download/awards
endpoint. It works fine when I set:
"columns": []
Full JSON options:
{"columns":["obligated_amount","product_or_service_code","product_or_service_code_description","naics_code","naics_description"],"filters":{"keywords":["transportation"," aviation"],"time_period":[{"start_date":"2018-06-30","end_date":"2019-06-30"}]}}
I do not get any error message. Just 500 Internal Server Error.
Can I ask which tool you're using to run this query or are you using the default GUI that comes up when you visit the URL?
The online GUI and my code give me the same error. I use httpparty gem on RubyOnRails to access the API.
We are currently experiencing an issue with the online GUI. If you're using an external tool, be sure to set the Content-Type header to application/json.
I did set it to JSON. The following works just fine:
{"columns":[],"filters":{"keywords":["transportation"," aviation"],"time_period":[{"start_date":"2018-06-30","end_date":"2019-06-30"}]}}
I'm trying to reduce the size of the download.
Have you tried your original query recently? It seems to be working for me this morning.
Still not working. The status is:
running running running running running running running failed
However, I don't get anything in the error message telling me what's wrong. I'm going to try the same query in postman.
Please (re)post the exact query and URL you are using. I will try to reproduce again.
The file should be:
https://files.usaspending.gov/generated_downloads/all_prime_awards_subawards_20190701135515927430.zip
The options in JSON:
{"columns":["obligated_amount","product_or_service_code","product_or_service_code_description","naics_code","naics_description"],"filters":{"keywords":["transportation","aviation"],"time_period":[{"start_date":"2018-07-01","end_date":"2019-07-01"}]}}
The response of the call to: 'https://api.usaspending.gov/api/v2/download/awards'
{"status":"ready","total_columns":null,"total_rows":null,"seconds_elapsed":null,"url":"https://files.usaspending.gov/generated_downloads/all_prime_awards_subawards_20190701135642182405.zip","total_size":null,"file_name":"all_prime_awards_subawards_20190701135642182405.zip","message":null} https://files.usaspending.gov/generated_downloads/all_prime_awards_subawards_20190701135642182405.zip
The call to the API:
self.class.post('/download/awards', :body => options.to_json, :headers => { 'Content-Type' => 'application/json' })
When I query https://api.usaspending.gov/api/v2/download/awards with
{
"columns": [
"obligated_amount",
"product_or_service_code",
"product_or_service_code_description",
"naics_code",
"naics_description"
],
"filters": {
"keywords": [
"transportation",
"aviation"
],
"time_period": [
{
"start_date": "2018-07-01",
"end_date": "2019-07-01"
}
]
}
}
in Postman, I get
{
"url": "https://files.usaspending.gov/generated_downloads/all_prime_awards_subawards_20190701140105877504.zip",
"file_name": "all_prime_awards_subawards_20190701140105877504.zip",
"total_columns": null,
"message": null,
"status": "ready",
"total_rows": null,
"seconds_elapsed": null,
"total_size": null
}
Have you had a chance to try it in Postman yet so we can eliminate other issues?
It get the same, what does the status give you? If I try to download the file of your query, it is not working:
https://files.usaspending.gov/generated_downloads/all_prime_awards_subawards_20190701140105877504.zip
Trying the status of you query, I get the more detailed error message:
https://api.usaspending.gov/api/v2/download/status/?file_name=all_prime_awards_subawards_20190701140105877504.zip
{
"url": "https://files.usaspending.gov/generated_downloads/all_prime_awards_subawards_20190701140105877504.zip",
"file_name": "all_prime_awards_subawards_20190701140105877504.zip",
"total_columns": 5,
"message": "An exception was raised while attempting to process the DownloadJob:\nTraceback (most recent call last):\n File \"/data-act/backend/usaspending_api/download/filestreaming/csv_generation.py\", line 58, in generate_csvs\n parse_source(source, columns, download_job, working_dir, piid, zip_file_path, limit)\n File \"/data-act/backend/usaspending_api/download/filestreaming/csv_generation.py\", line 165, in parse_source\n temp_file, temp_file_path = generate_temp_query_file(source_query, limit, source, download_job, columns)\n File \"/data-act/backend/usaspending_api/download/filestreaming/csv_generation.py\", line 290, in generate_temp_query_file\n csv_query_annotated = apply_annotations_to_sql(generate_raw_quoted_query(source_query), source.columns(columns))\n File \"/data-act/backend/usaspending_api/download/filestreaming/csv_generation.py\", line 337, in apply_annotations_to_sql\n raise Exception(\"Length of alises doesn't match the columns in selects\")\nException: Length of alises doesn't match the columns in selects\n",
"status": "failed",
"total_rows": 296369,
"seconds_elapsed": "11.680957",
"total_size": null
}
Ok. You are correct. There does, indeed, seem to be an issue with downloads. With the information you provided I was able to nail down exactly where things are going awry. Thank you very much for your help.
I will discuss this issue with the project lead and post back here when I know something.
I have created two bug tickets that have been added to the prioritized bug list. I do not have an ETA.
https://federal-spending-transparency.atlassian.net/browse/DEV-2998 https://federal-spending-transparency.atlassian.net/browse/DEV-2999
Thank you again!
This should now be resolved