ckanext-cloudstorage icon indicating copy to clipboard operation
ckanext-cloudstorage copied to clipboard

[WIP] Py3

Open smotornyuk opened this issue 6 years ago • 19 comments

Support of upcomming CKAN v2.9 and Python3(https://github.com/ckan/ckan/wiki/Migration-guide-for-extensions#step-by-step-workflow)

smotornyuk avatar Dec 19 '19 12:12 smotornyuk

Thanks, however I won't accept this with the black-ified styling. Black's styling is terrible and I do not use it in my projects.

TkTech avatar Dec 19 '19 13:12 TkTech

No problem, I have no particular preferences in terms of formatting as long as there is a way not to do it manually:) I've reverted commit with formatting and added a new one, only with futurize -w. Till CKAN v2.9 is announced, this PR still in-progress, so any suggestions are welcome.

smotornyuk avatar Dec 19 '19 13:12 smotornyuk

What is the status of this PR? I have checked out this code and ran it on my ckan 2.9/python 3 setup with xloader, and I get two errors in /etc/ckan/default/uwsgi.ERR when trying to upload a file:

DEBUG [ckanext.xloader.action] Enqueued xloader job=85fb45c0-35b2-4fab-9fdc-172332b442ff res_id=cd4a0668-5c76-4ebe-b9aa-2ffe163b5b84
2022-05-12 13:10:13,096 ERROR [ckan.config.middleware.flask_app] 'NoneType' object has no attribute 'transaction'
Traceback (most recent call last):
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/views.py", line 89, in view
    return self.dispatch_request(*args, **kwargs)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/views.py", line 163, in dispatch_request
    return meth(*args, **kwargs)
  File "/usr/lib/ckan/default/src/ckan/ckan/config/middleware/../../views/resource.py", line 252, in post
    get_action(u'resource_create')(context, data)
  File "/usr/lib/ckan/default/src/ckan/ckan/logic/__init__.py", line 504, in wrapped
    result = _action(context, data_dict, **kw)
  File "/usr/lib/ckan/default/src/ckan/ckan/logic/action/create.py", line 332, in resource_create
    model.repo.commit()
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/sqlalchemy/orm/scoping.py", line 162, in do
    return getattr(self.registry(), name)(*args, **kwargs)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 1027, in commit
    self.transaction.commit()
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 494, in commit
    self._prepare_impl()
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/sqlalchemy/orm/session.py", line 464, in _prepare_impl
stx = self.session.transaction
AttributeError: 'NoneType' object has no attribute 'transaction'

and

ERROR [ckan.config.middleware.flask_app] 'ResourceUpload' object has no attribute 'get_url_from_filename'
Traceback (most recent call last):
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/ckanext_cloudstorage-0.1.1-py3.8.egg/ckanext/cloudstorage/views.py", line 12, in download
    return utils.resource_download(id, resource_id, filename)
  File "/usr/lib/ckan/default/lib/python3.8/site-packages/ckanext_cloudstorage-0.1.1-py3.8.egg/ckanext/cloudstorage/utils.py", line 158, in resource_download
    uploaded_url = upload.get_url_from_filename(resource['id'],
AttributeError: 'ResourceUpload' object has no attribute 'get_url_from_filename'

KatiRG avatar May 12 '22 17:05 KatiRG

This branch is used in our environments and it somehow gets a couple of new features/bug fixes during this year. Because of this, it probably will not get merged into the original repo.

But I'm using it inside a dozen of environments and I'm sure that it works. As for your errors:

'NoneType' object has no attribute 'transaction'

This looks like a known issue of ckanext-xloader. Check this issue. Basically, it looks like you are not using the latest versions of software. Try CKAN v2.9.5 and the latest version of ckanext-xloader

AttributeError: 'ResourceUpload' object has no attribute 'get_url_from_filename'

This error was raised on L158. It reports that upload object from L150 doesn't have a particular method. Note, that its class is ResourceUpload. ckanext-cloudstorage, when properly configured, should use the ResourceCloudStorage class instead. This means, that either you misconfigured the plugin(unlikely, there is no chance that something like this happened), or some other plugin takes priority and intercepts the upload. Try using a pure CKAN setup with ckanext-cloudstorage, with no other plugins at all.

smotornyuk avatar May 12 '22 19:05 smotornyuk

This branch is used in our environments and it somehow gets a couple of new features/bug fixes during this year. Because of this, it probably will not get merged into the original repo.

Where's a good point to prune the commits back to?

TkTech avatar May 12 '22 19:05 TkTech

I think,88984c9 - Increase upload speed for multiline files

From this point following commits provide extra features:

  • Standard version check: fix for CKAN v2.10 support

  • Py2 support of max upload size: make sure that py2 instances apply WebUI restriction on max uploaded filesize

  • Update last_modified: uploads update last_modified date of the related resource object

  • Add ckanapi to requirements: restore previously removed ckanapi dependency

  • Set content-type for multipart uploads: try to set content-type during multipart uploads. Otherwise some features(updated pdfview that uses object tags) do not work.

  • Multipart allows subclassing of uploader: Instead of using ResourceCloudStorage class everywhere, try to get current uploader using CKAN IUploader interface. In this way, one can extend the original uploader and change some functionality. We used it in order to compute file paths in the cloud in a bit more complex way.

  • better multipart removal: fix for the bug, when files are not removed from the bucker during multipart upload.

  • Update test usite: align test configuration with the CKAN recommendations

  • Use custom uploader all the time: fixes for Multipart allows subclassing of uploader commit. Initially, I haven't changed all the places where the uploader class was hardcoded. This led to the usage of different uploader classes in different situations.

smotornyuk avatar May 13 '22 07:05 smotornyuk

Excellent, thank you! I was running xloader 0.8.0 (installed with pip--don't do this!) instead of 0.9.0 (installed with git clone instead).

For the ResourceUpload error, I discovered that in cloudstorage/utils.py L158 I had to change in function resource_download() the call to get_url_from_filename() from:

uploaded_url = upload.get_url_from_filename(resource['id'], filename, content_type=content_type)

to

storage = ResourceCloudStorage(resource)
uploaded_url = storage.get_url_from_filename(resource['id'], filename, content_type=content_type)

This branch is used in our environments and it somehow gets a couple of new features/bug fixes during this year. Because of this, it probably will not get merged into the original repo.

But I'm using it inside a dozen of environments and I'm sure that it works. As for your errors:

'NoneType' object has no attribute 'transaction'

This looks like a known issue of ckanext-xloader. Check this issue. Basically, it looks like you are not using the latest versions of software. Try CKAN v2.9.5 and the latest version of ckanext-xloader

AttributeError: 'ResourceUpload' object has no attribute 'get_url_from_filename'

This error was raised on L158. It reports that upload object from L150 doesn't have a particular method. Note, that its class is ResourceUpload. ckanext-cloudstorage, when properly configured, should use the ResourceCloudStorage class instead. This means, that either you misconfigured the plugin(unlikely, there is no chance that something like this happened), or some other plugin takes priority and intercepts the upload. Try using a pure CKAN setup with ckanext-cloudstorage, with no other plugins at all.

KatiRG avatar May 13 '22 12:05 KatiRG

@smotornyuk Are you by any chance using Azure blob storage? I found a bug with the way the URL is obtained in the libcloud driver which ended up causing xloader to fail in my setup with cloudstorage. It can be fixed by dynamically assigning the scheme based on if the storage is secure. I submitted an issue in libcloud about this.

KatiRG avatar Jun 03 '22 15:06 KatiRG

No, but I'll subscribe to the issues in the libcloud repo and, when it gets fixed, will update the required version of apache-libcloud to the one with the fix(the master branch of my fork uses latest stable version of the apache-libcloud) If I have spare time, I'll try to patch this case inside the cloud storage. But at the moment I'm too busy on projects so I can't work on this issue.

smotornyuk avatar Jun 05 '22 13:06 smotornyuk

Hi @TkTech , what is the status of this PR and the py3-fixes branch which is now gone? I am using the code and wonder how to continue. Thanks!

KatiRG avatar Jun 16 '22 17:06 KatiRG

No, but I'll subscribe to the issues in the libcloud repo and, when it gets fixed, will update the required version of apache-libcloud to the one with the fix(the master branch of my fork uses latest stable version of the apache-libcloud) If I have spare time, I'll try to patch this case inside the cloud storage. But at the moment I'm too busy on projects so I can't work on this issue.

@smotornyuk It has been merged :)

KatiRG avatar Jun 16 '22 17:06 KatiRG

There is one additional change needed for ckan 2.9: swap out this pylons import with ckantoolkit in both cloudstorage/logic/action/multipart.py and cloudstorage/storage.py:

-    from pylons import config
+    from ckantoolkit import config

Is this PR still under consideration?

KatiRG avatar Aug 24 '22 20:08 KatiRG

@TkTech @smotornyuk Hey guys, whats the status of this PR, it would be great if this get merged soon

tino097 avatar Oct 17 '22 17:10 tino097

This exact PR won't get merged, because it contains quite a lot of opinionated customizations. It hangs here just for reference and, as I understand, when Tyler has time for it, he is going to pick some changes from it and upgrade the master branch independently of this PR

smotornyuk avatar Oct 18 '22 10:10 smotornyuk

Thanks @smotornyuk Maybe @TomeCirun or me could try to do that in separate PR

tino097 avatar Oct 18 '22 12:10 tino097

I haven't received any funding to work on CKAN in quite awhile, so this stuff is on the absolute bottom of the list for me. I'm happy to add @smotornyuk and/or others on the team as maintainers if they're willing.

TkTech avatar Oct 18 '22 14:10 TkTech

@TkTech what about transfering the extension under CKAN org ?

cc @amercader @wardi

tino097 avatar Oct 18 '22 14:10 tino097

@TkTech if you support transferring the extension to the ckan org I think that would be great. Otherwise having a few more people as maintainers works.

wardi avatar Oct 18 '22 15:10 wardi

For now, I'd rather keep the repo where it is. I am definitely open to revisiting in a few months.

@wardi if you can bring this up during the Thursday dev meeting to find, say, 2 volunteers, I'd appreciate it. I will do some cleanups and add CI tasks for release management on the weekend.

TkTech avatar Oct 19 '22 08:10 TkTech