
Add support for saving screenshots, page source, and other arbitrary files to unstructured storage providers

Open englehardt opened this issue 7 years ago • 1 comments

Screenshots, page source, and other files collected in the browser manager process are currently written directly to disk. This worked when OpenWPM only saved data locally, but it will not work with the S3Aggregator. Instead, BaseAggregator should include a save_file method. LocalAggregator can implement it by saving to disk, and S3Aggregator by uploading to S3.
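
A minimal sketch of what that could look like, not the actual OpenWPM code. The save_file signature, the constructor arguments, and the key layout are all assumptions made for illustration; the S3 variant takes an injected client that is assumed to behave like a boto3 S3 client (only put_object is used).

```python
import abc
from pathlib import Path
from typing import Any


class BaseAggregator(abc.ABC):
    """Hypothetical base class; only the proposed save_file hook is shown."""

    @abc.abstractmethod
    def save_file(self, name: str, content: bytes) -> None:
        """Persist `content` under the relative name `name`."""


class LocalAggregator(BaseAggregator):
    def __init__(self, data_directory: str) -> None:
        # Assumed constructor argument: the directory files are written to.
        self.data_directory = Path(data_directory)

    def save_file(self, name: str, content: bytes) -> None:
        path = self.data_directory / name
        # Create intermediate directories such as "screenshots/" on demand.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(content)


class S3Aggregator(BaseAggregator):
    def __init__(self, client: Any, bucket: str, prefix: str) -> None:
        # `client` is assumed to expose put_object like a boto3 S3 client.
        self.client = client
        self.bucket = bucket
        self.prefix = prefix.strip("/")

    def save_file(self, name: str, content: bytes) -> None:
        # Hypothetical key layout: "<prefix>/<name>", or just "<name>".
        key = f"{self.prefix}/{name}" if self.prefix else name
        self.client.put_object(Bucket=self.bucket, Key=key, Body=content)
```

With this shape the browser manager would only ever call save_file("screenshots/a.png", data) and would not need to know which backend is in use.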

englehardt avatar Nov 09 '18 01:11 englehardt

Updating this comment as #753 removed everything mentioned in the original issue. Observations:

  • UnstructuredStorageProviders already have an interface suitable for storing a bunch of bytes under a user-defined name
  • The base path for storing is specified at time of object instantiation
  • => There is no longer any need for a data_directory in the manager params, similar to how database_name was removed in #753

Paths forward:

  1. Add a second UnstructuredStorageProvider to the StorageController that is responsible for saving unstructured platform data
  2. Expand the UnstructuredStorageProvider interface with a second method that is responsible for saving unstructured platform data

I prefer option 1 because it is inherently more flexible: for example, screenshots could be saved to the cloud while web content is just saved to disk.
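
Roughly, option 1 could look like the sketch below. This is illustrative only: the provider interface is reduced to a single synchronous store_blob(filename, blob) method, and InMemoryProvider, save_content, and save_platform_file are made-up names, not the existing StorageController API.

```python
from typing import Dict, Protocol


class UnstructuredStorageProvider(Protocol):
    """Simplified stand-in for the provider interface (an assumption)."""

    def store_blob(self, filename: str, blob: bytes) -> None: ...


class InMemoryProvider:
    """Hypothetical provider that keeps blobs in a dict, for demonstration."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}

    def store_blob(self, filename: str, blob: bytes) -> None:
        self.blobs[filename] = blob


class StorageController:
    """Sketch of option 1: one provider per kind of unstructured data."""

    def __init__(
        self,
        content_provider: UnstructuredStorageProvider,
        platform_provider: UnstructuredStorageProvider,
    ) -> None:
        self.content_provider = content_provider
        self.platform_provider = platform_provider

    def save_content(self, filename: str, blob: bytes) -> None:
        # Web content (e.g. response bodies) goes to the first provider.
        self.content_provider.store_blob(filename, blob)

    def save_platform_file(self, filename: str, blob: bytes) -> None:
        # Screenshots, page source, etc. go to the second provider.
        self.platform_provider.store_blob(filename, blob)
```

Because the two providers are independent objects, either one can be swapped for a cloud-backed or disk-backed implementation without touching the other.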

vringar avatar Feb 22 '21 12:02 vringar