matrix-react-sdk icon indicating copy to clipboard operation
matrix-react-sdk copied to clipboard

Setting to strip EXIF metadata from JPEG uploads

Open t3chguy opened this issue 8 years ago • 19 comments

This strips EXIF metadata from uploaded JPEGs by default, thus avoiding privacy sensitive metadata such as GPS and timestamps being leaked over Matrix.

Fixes https://github.com/vector-im/riot-web/issues/4426

t3chguy avatar Aug 15 '17 21:08 t3chguy

Leads to broken thumbnails: (but working and stripped EXIFs) image

{
  "origin_server_ts": 1502833036623,
  "sender": "@abc123abc:matrix.org",
  "event_id": "$15028330362519985ISVKc:matrix.org",
  "unsigned": {
    "age": 7299
  },
  "content": {
    "body": "12382975864_09e6e069e7_o.jpg",
    "info": {
      "mimetype": "image/jpeg",
      "thumbnail_info": {
        "mimetype": "image/jpeg",
        "h": 600,
        "w": 800,
        "size": 205399
      },
      "h": 2976,
      "thumbnail_url": "mxc://matrix.org/nNjRtwWgHjjLCvXCKuubQAfx",
      "w": 3968,
      "size": 6327505
    },
    "msgtype": "m.image",
    "url": "mxc://matrix.org/KPImtwzFwwJDXVSthmbyedRh"
  },
  "type": "m.room.message",
  "room_id": "!UInSXSErRtlgialnsk:riot.ovh"
}

t3chguy avatar Aug 15 '17 21:08 t3chguy

/me is unsure what is going on :/

t3chguy avatar Aug 21 '17 21:08 t3chguy

@ara4n any update on this?

lukebarnard1 avatar Jun 25 '18 13:06 lukebarnard1

not from me; it didn't work for @t3chguy and we haven't yet debugged it.

ara4n avatar Jun 25 '18 13:06 ara4n

Thanks for making this contribution a while back. Since the code base has changed since this was opened, it no longer applies cleanly, and I don't think there's a need to keep it open in this state, as we can always find the code here again if needed.

It looks like we should probably file an issue (if there isn't one already) to track the problem being solved here.

jryans avatar Mar 20 '19 18:03 jryans

https://github.com/vector-im/riot-web/issues/4426 is the bug.

ara4n avatar Mar 20 '19 19:03 ara4n

So the reason why the thumbnails weren't working is that synapse silently chokes on thumbnailing the exif-stripped image, and so the server-side generated thumbnails 404. looks like it's refusing to thumbnail an application/octet-stream upload, given we've lost the mime-type while stripping.

ara4n avatar May 10 '20 15:05 ara4n

empirically this seems to work now. tested in e2ee & non-e2ee rooms, with small & massive images.

It's worth noting that you have to try quite hard to make Riot/Web use server-generated thumbnails these days since https://github.com/matrix-org/matrix-react-sdk/pull/2439; images have to be bigger than 1MB and larger than 1600x1200 (on a retina display) before riot-web will actually ask synapse for a thumbnail.

ara4n avatar May 10 '20 15:05 ara4n

Thanks for making this contribution a while back. Since the code base has changed since this was opened, it no longer applies cleanly, and I don't think there's a need to keep it open in this state, as we can always find the code here again if needed. If you are still interested in pursuing this feature, please discuss with us in #element-dev:matrix.org to find a good approach forward.

jryans avatar Mar 04 '21 15:03 jryans

One deficiency with this PR is that it strips the Orientation EXIF which will cause images to rotate

image

top is with EXIF stripping bottom is without

image image

t3chguy avatar Jul 09 '21 19:07 t3chguy

So I don't see a sane way to retain the Orientation IFD0 - as for the colorimetry we definitely strip colourSpace and all the chromacities stuff, all residing in APP1 along with IFD0

t3chguy avatar Jul 09 '21 21:07 t3chguy

So whilst this won't work ideally in all cases, I think its good enough for 90% of cases. Alternative solutions of course are heavyweight library or canvas based approach.

t3chguy avatar Jul 09 '21 21:07 t3chguy

ugh. i failed to think through that this would break orientation and colorimetry. can we exclude those from being stripped? last thing we need is to make orientation even more flakey.

ara4n avatar Jul 09 '21 23:07 ara4n

can we exclude those from being stripped?

Not easily, would be saner to load re-export out of a canvas IMO

t3chguy avatar Jul 10 '21 16:07 t3chguy

@t3chguy this PR is here for a while now, does it still apply? If yes, wouldn't it make sense to make a new issue for it so we can properly prioritise it take care of it later? It seems that you've figured out some problems while working at it, I think it'd be a good idea to add those in the issue as the definition of done.

Palid avatar Sep 03 '21 13:09 Palid

does it still apply? If yes, wouldn't it make sense to make a new issue for it so we can properly prioritise it take care of it later?

@Palid what's wrong with the existing issue https://github.com/vector-im/element-web/issues/4426 ?

t3chguy avatar Sep 06 '21 10:09 t3chguy

does it still apply? If yes, wouldn't it make sense to make a new issue for it so we can properly prioritise it take care of it later?

@Palid what's wrong with the existing issue https://github.com/vector-im/element-web/issues/4426 ?

Like a lot of old issues it has spread thin in the details over the years and it's hard to see if it's still relevant. I think this issue is still relevant, but it made more sense to ping you. I think that details of the issue should be probably updated with some newer informations that we already have after initial implementations.

Palid avatar Sep 06 '21 10:09 Palid

I might be late on this, but...

While I couldn't definitively be against option to strip metadata form cerrtain files if user chooses to, I'm totally against matrix (umbrella) touching anything in uploaded media by default. That is so big can of worms that will never stop even if you could figure out to do prefect stripping for every file format you accept to touch... And not to say you even can't do it really selectively for images.. (rotation, colorspace, etc data)...

https://github.com/vector-im/element-web/issues/4426#issuecomment-886918321 is still spot on comment on the thing.. Specificly: "private and secure is not the same as anonymous", those who don't want to exif data to "leak", they shoudn't send images with exif data in first place... It can't be matrix task to decide for user, at the least not by default...

olmari avatar Oct 01 '21 21:10 olmari

Existing social media platforms (Twitter, discord, telegram) already do this, they already strip potentially user-compromising data out of media, element would only be following the trend.

I do agree that there'd be an option to disable it, but I think that element would be making a right choice for (at least) removing GPS data from images wherever possible.

ShadowJonathan avatar Oct 07 '21 00:10 ShadowJonathan