Upload image issue
My image seems to be corrupted when uploaded to the server. It only works with .txt files.
I'm using sls 1.27, ExpressJS and multiparty. Is there a way to upload images without any problems?
Thanks.
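function uploadImage(req, res) {
  let multiparty = require('multiparty');
  let fs = require('fs');
  let form = new multiparty.Form();
  form.parse(req, function(err, fields, files) {
    if (err) {
      // Parsing failed; report it instead of continuing.
      return res.status(500).send(err);
    }
    console.log(files);
    // uploading file... (upload code omitted in the original post; placeholder promise below)
    Promise.resolve().then(result => {
      res.status(200).send({ msj : 'Images were uploaded'});
    }).catch(err => {
      res.status(500).send(err);
    });
  });
}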
const serverless = require('serverless-http');
const express = require('express');
const app = express();
require('./middlewares/authenticated');
var bodyParser = require('body-parser');
var cors = require('cors');
app.use(cors({'origin' : '*'}));
app.use(bodyParser.urlencoded({ extended : false }))
   .use(bodyParser.json());
service: my-backend
provider:
  name: aws
  runtime: nodejs6.10
  stage: prod
  region: us-east-1
plugins:
  - serverless-offline
  - serverless-s3-local
functions:
  app:
    handler: index.handler
    events:
      - http: ANY /
      - http: 'ANY {proxy+}'
  upload:
    handler: index.handler
    events:
      - http: 'POST /api/registro/uploadImage/'
@dfloresgonz did you ever get this working? I'm running into the same issue.
@evangow I read that sls offline doesn't support binary types, so it will never work there (the limitation only applies offline). I used base64 to make it work in my local env.
@dfloresgonz Are you converting it to base64 on the client side, then sending it to the server? Or are you using something like the gist linked below to parse the event and pass it on to serverless-http?
Would you happen to have a code sample or gist you could share if you're doing it on the server side? I think I could probably figure it out on the client side, but I've been banging my head against the wall for hours looking through issues trying to sort this out.
Parser Gist: https://gist.github.com/lteacher/9ef1c7bc5908418b30a18719521ff3c7#file-parsers-js-L12-L41
^ Found in this issue: https://github.com/dherault/serverless-offline/issues/230
Same here, using:
curl -X POST -F 'file=@/home/jan/figub/pass.mp3;type=audio/mpeg' localhost:3333/upload -H "Content-Type: audio/mpeg" --verbose -H "Accept: audio/mpeg"
With both serverless-offline-python and serverless-offline on a Node.js environment, the body contains a payload which is ~30% larger than the original file. A diff between the binary files showed they were identical up to about two thirds of the way through, where the resulting file has a bunch of data appended.
Same here, cannot get it working, we looked through all the other code time after time without understanding what was happening. It would be good to at least throw something if it is not supported - took us more than a day to find this problem.
We would gladly accept a PR on this. It shouldn't be much, just a bit of Hapi.js tweaking.
FYI - I chose a different architecture. The lambda now just requests a temporary URL from my file bucket. The file bucket can be hidden behind an internal (transparent) redirect.
I have encountered this issue and after extensive investigation, I found that https://github.com/dherault/serverless-offline/blob/master/src/index.js#L490 causes the issue.
When I upload an image via multipart/form-data, the payload is converted with toString('binary') (which seems to be deprecated). It is then passed to express (in my case), where the request goes through https://www.npmjs.com/package/multer, is decoded incorrectly, and the image becomes corrupted. But if I comment out that toString call, express and its middleware handle the raw buffer correctly.
It was added in https://github.com/dherault/serverless-offline/pull/394, as a workaround? Previously there was behavior where binary data was converted to a utf8 string.
At least from initial testing, it seems that serverless-offline works fine without line 490. Maybe @dherault, @lteacher and @daniel-cottone could comment on whether it's still required?
If the line is removed then I assume it would work, because the line does some magic check to ensure the toString('utf8') doesn't happen for multipart/form-data. So if it never did that, the encoding detection thing wouldn't be needed.
Actually I haven't been using serverless for 2 years but we still have a service running that parses multi-part form data, though it was using an old branch.... I did switch to the latest serverless-offline just now and found that it worked ok still.
So regarding the above question, @arnas I also removed the line to see and as expected it worked just fine for me. That wasn't too surprising though because the check added was because of the toString('utf8') and that was to fix another issue so I have no idea if that issue would be a problem when removing that line. See #224
I also spent a bit of time checking out what the whole problem is around here and it was painful... I guess at the end of the day the issue seems to come down to the encoding (since I wasn't able to get an exact answer with code links to provide). So probably for most of the above, it's something like... the parsing magic is done on pipe, the encoding has been set to binary string, and it's not a buffer when it comes to your logic.
For sure it's just the above referenced line that converts to string, so if it's removed and it doesn't break whatever that old issue was then it would be ok. Otherwise you could try setting the encoding; there is some check for whether it is an actual buffer, so it would probably still work both online and offline...
So in summary, try setting the encoding locally... In this gist the write function is called and given the encoding. In the examples from above, the magical middleware you are all using is doing it in the pipe, so maybe try setting req.setEncoding('binary') before passing it to the parse... which is a random guess since I didn't test it.
Well, it seems that this issue only appears for images. My colleague has tested, and uploading .txt, for example, works fine. It's a bit strange, but from my findings only images are affected.
@lteacher I have checked PR #224 and, at least from a first look, it seems that toString() is unnecessary: to prevent hapi.js from parsing the payload you can simply set the payload parse option to false.
And I am assuming that if hapi parse is set to false, toString() should do nothing, as the payload is already a string.
@arnas sorry, I don't get exactly what you mean with the last sentence.
Problem History
- #224 added the toString for whatever reason. This change breaks everyone who has data for upload etc. as it converts to 'utf8'.
- #230 added a function to ensure that, just for multipart/form-data, an encoding is chosen that doesn't destroy the data, which is then 'binary'.
- In this issue thread, people are piping the req data without any set encoding but the data was converted to string, so that won't work. To resolve it without changing anything here they need to set the encoding, since otherwise it will find it's not a buffer, process it as a string, and use the wrong encoding because it defaults to utf8 (side note: I saw in the multiparty code that it will throw an error if you try to set the encoding).
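To illustrate the first two points, here is a minimal Node.js sketch of why a toString('utf8') conversion is lossy for binary data while toString('binary') can be reversed. The sample bytes are made up for illustration and nothing here is taken from the serverless-offline source.

```javascript
// Illustrative bytes only: the first four bytes of a typical JPEG file.
const jpegHeader = Buffer.from([0xff, 0xd8, 0xff, 0xe0]);

// 'utf8' decoding replaces invalid byte sequences with U+FFFD,
// so the original bytes cannot be recovered afterwards.
const viaUtf8 = Buffer.from(jpegHeader.toString('utf8'), 'utf8');

// 'binary' (latin1) maps every byte to exactly one character,
// so decoding with the same encoding restores the bytes.
const viaBinary = Buffer.from(jpegHeader.toString('binary'), 'binary');

console.log('utf8 round trip intact:  ', viaUtf8.equals(jpegHeader));
console.log('binary round trip intact:', viaBinary.equals(jpegHeader));
```

The 'binary' round trip only holds if whoever receives the string decodes it with the same encoding, which is the point of the third item above.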
Fixing in this repo
So to resolve the issue based on what you said above, the best thing is to not do any toString and remove this logic, but you might need to fix that issue from #224 some other way, maybe with that parse option or whatever.
Fixing original author issues (assuming the fix is not done here in the repo)
- Don't use multiparty.
- When using multer, set the req encoding.
- When using formidable, pass the encoding as an option.
- When using something else, I dunno, check the source code and find out how to set the encoding.
Also, for the original author: if you are using wrapping packages for this kind of thing (the file data etc.), you need to check closely everywhere in the packages you are using, like this doc over here. As I recall (but it might have changed over the last 2 years) you can only get binary content to AWS Lambda functions in base64 anyway (but of course even the base64 string is destroyed if it is encoded incorrectly to utf8).
@arnas I assume that you have success already on AWS and that you use something like serverless-http.
@lteacher I have updated the wording of the last sentence of my previous comment.
For fixing the repo, I believe that would be the best way. I am hoping that removing toString while the hapi payload parse option is set to false will be enough. I will look into it if I have time.
@lteacher I am using serverless-http, but I haven't deployed it to AWS, as I am assuming it will work without much of a problem: at least according to the documentation, AWS API Gateway supports that https://aws.amazon.com/about-aws/whats-new/2016/11/binary-data-now-supported-by-api-gateway/ .
@arnas oh well should be np if using that serverless-http
Actually I was thinking of this again for some reason and I had another look and noticed that the parse option is actually used already over here. However, I didn't notice before, when I added that detectEncoding, that the payload is just needed as a string for the JSON.parse in there. There is some createLambdaProxyContext which uses the payload, but someone haxed something in to not parse it if it's not a string there (using rawPayload).
There's a velocity template method too that seems to assume the payload is JSON, so dunno about that since sometimes it's not. Otherwise it might be possible to just move that stringifying stuff into the relevant scope where it's needed, i.e. where it needs to be parsed, per the last reference?
@lteacher I have done a bit of debugging and from a first look it seems that this issue is not only on the serverless-offline side.
Basically I have created plain handler.js which uses serverless-http to transfer requests to express and multer to deal with multipart data. https://gist.github.com/arnas/d8fed4b78ff2940a3b390e754090cea9
Without changing any dependencies I have confirmed that txt files were encoded and decoded correctly, but there was an issue with png files: they grew about 30 percent bigger (I think somebody mentioned that).
Also, I have confirmed that if I encode the buffer and decode it afterward via Buffer(binaryString, 'binary') everything is fine. But as we are giving serverless-http a string, it uses Buffer.from(binaryString, 'utf-8') and corrupts the file unless it was a text file :/ https://github.com/dougmoscrop/serverless-http/blob/c70957b47ac66e363c7179a6e0a4cd8c6d78a2ee/lib/request.js#L15.
tl;dr the bug happens because serverless-http wrongly encodes the request.
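A small sketch of the mismatch described above, using only Node's Buffer. The sample bytes are invented for illustration; the two decoding calls mirror the ones quoted in this comment. It also shows why ASCII-only text files come through unchanged while images grow.

```javascript
// Made-up sample data: 10 bytes, 5 of which are >= 0x80 (common in image data).
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0xff, 0xd8, 0x00, 0x7f, 0x80, 0xfe]);

// The payload as a "binary" string: one character per byte.
const binaryString = original.toString('binary');

// Decoding with the matching encoding restores the exact bytes.
const restored = Buffer.from(binaryString, 'binary');

// Decoding as UTF-8 writes every character >= 0x80 as two bytes, so the data grows.
const corrupted = Buffer.from(binaryString, 'utf-8');

// ASCII-only text has no bytes >= 0x80, so both paths give the same result.
const text = Buffer.from('plain text file', 'ascii');
const textViaUtf8 = Buffer.from(text.toString('binary'), 'utf-8');

console.log(restored.equals(original));          // true
console.log(original.length, corrupted.length);  // 10 15
console.log(textViaUtf8.equals(text));           // true
```

How much a real file grows depends on how many of its bytes are 0x80 or above, so the exact percentage will vary from file to file.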
Suggestions for solution
- I haven't looked into these velocity templates and I am not sure what they are for, but I believe the conversions to a binary string are not required. So I would suggest removing them and assuming that the lambda handler will be able to handle a raw buffer.
- Wait until (and if) serverless-http fixes it on their side. There were some issues with binary files (https://github.com/dougmoscrop/serverless-http/issues/50), but I will raise a new one.
I could create a PR for the first point if this approach is acceptable to the lib maintainers.
Side note: rawPayload is always equal to https://github.com/dherault/serverless-offline/blob/master/src/index.js#L490
Hey @arnas I think that you are missing some setup on AWS side. If you want to use binary content you need to do some different setup per these docs: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings.html
That's why I posted this thing from serverless-http
I knew about that Buffer.from(binaryString, 'utf-8') when I last looked and posted my response, but I didn't post it here because it's not relevant if the AWS setup is correct. That is because on the AWS side, if you are using binary content, it should be coming base64 encoded and there is a flag called isBase64Encoded. Actually I looked in the serverless-http code lol and I saw this flag so I assumed they used the flag but.... actually now I see they detect it or something, so instead you have to provide some option? to set binary: true
If you look here in this gist you can see the actual property that your aws lambda-proxy binary content enabled handler will receive on the event. That is not a magical value I added for the gist, sadly that is an AWS thing that gets attached to the event.
I actually don't use serverless-http so dunno about that option. I didn't think it would be nice to include express or koa etc. inside this stuff. I have a separate package I created to make it nice for adding parsing etc., then I use the exact gist I posted above for parsing before the file handler.
In summary, I think you can change your AWS setup to correctly handle binary content, then you are going to start receiving base64 encoded blobs in serverless-http. Then you need to be sure that serverless-http is expecting it to be isBase64Encoded. After that it should already work on AWS. Then getting serverless-offline working is where the actual issues are left, because it doesn't do the base64 encoding magic that the AWS stuff is doing, so that's all this entire issue is so far.
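A rough sketch of the flag handling described above, assuming a lambda-proxy style event with the body and isBase64Encoded properties. The helper name bodyToBuffer and the sample events are hypothetical, and this is not how serverless-http or serverless-offline implement it.

```javascript
// Hypothetical helper (not part of serverless-http or serverless-offline):
// returns the request body of a lambda-proxy event as a Buffer.
function bodyToBuffer(event) {
  if (event.body === undefined || event.body === null) {
    return Buffer.alloc(0);
  }
  // API Gateway sets isBase64Encoded when it passes binary content as base64.
  // Assumption: any other body is plain text and utf8 is the right encoding.
  const encoding = event.isBase64Encoded ? 'base64' : 'utf8';
  return Buffer.from(event.body, encoding);
}

// Example events, invented for illustration.
const pngSignature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const binaryEvent = { body: pngSignature.toString('base64'), isBase64Encoded: true };
const textEvent = { body: 'hello', isBase64Encoded: false };

console.log(bodyToBuffer(binaryEvent).equals(pngSignature));
console.log(bodyToBuffer(textEvent).toString('utf8'));
```

If the offline server delivers a binary string without setting the flag, a helper like this would fall into the utf8 branch, which is the corruption discussed earlier in the thread.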
I'm running into this issue using serverless-http + serverless-offline and multer-s3 for handling multipart/form-data when uploading images.
@lteacher do you have a recommendation for getting serverless-offline to correctly handle this? My service works when deployed on AWS through api-gateway. For local development it does not. Referring to your last comment above: "Then to get serverless-offline working is where the actual issues are left because it doesn't do the base64 encoding magic that the AWS stuff is doing so thats all this entire issue so far."
@hqnarrate Sorry, I'm just not up to date on this issue any more. There seems to be a PR, #784, that was working to address this somewhat through the config options.
Looking back over the comments it does seem like there is a way to refactor the offending line out but I just don't have any time to investigate that at the moment and I don't use any of the packages mentioned in these issues except serverless-offline so I can't really test any resolution.
Maybe @cmuto09 can give an update on where the PR referenced above is at, it looks like it needs rebasing at minimum.
@hqnarrate @lteacher a bit of a problem is that @cmuto09's PR depends on the https://www.npmjs.com/package/serverless-apigw-binary plugin, but it seems serverless supports this now as well.
Some references for serverless support:
https://serverless.com/framework/docs/providers/aws/events/apigateway/#binary-media-types
https://serverless.com/blog/framework-release-v142/
https://github.com/serverless/serverless/pull/6063
Additional issues:
https://github.com/serverless/serverless/issues/2797
https://forum.serverless.com/t/returning-binary-data-jpg-from-lambda-via-api-gateway/796
https://github.com/dougmoscrop/serverless-http/issues/88
@dnalborczyk - I was not clear. My service is working fine when deployed to the AWS environment with apiGateway.binaryMediaTypes set correctly. It is when I'm running in offline mode with 'serverless offline start' that my uploads become corrupted.
@hqnarrate sorry, I think I haven't been clear. 😄 I knew what you meant. The links above are pointers for me (or anyone else getting to it) for the implementation.
@dnalborczyk Any update or workarounds on this issue?
My solution: https://stackoverflow.com/a/61003498/9585130
Just add this to serverless.yml.
provider:
  apiGateway:
    binaryMediaTypes:
      - '*/*'
And I'm using "aws-serverless-express-binary": "^1.0.1"
Does anybody know a way around this in offline mode? I am trying to upload files / pdf and this does not work for me
@pvsvamsi @lassesteffen
Try aws-serverless-express-binary instead of aws-serverless-express
I am using apollo-server-lambda not express
Try upgrading to the latest versions of serverless and serverless-offline. I believe this was fixed, but I could not find anything about it in the changelog
Same issue there
@abinhho I tried your solution but it's not working. Binary Media Types seems to be for download, not upload. Not working in my case.
My solution was to convert my file to base64 before upload. Not really nice, but it's the only way I found to make it work.
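For anyone wanting to try the base64 workaround, a minimal sketch is below. The function names, the JSON field names, and the sample bytes are assumptions made for illustration; it shows only the encode/decode round trip, not the HTTP call. Base64 text is plain ASCII, so it survives string handling along the way, at the cost of a payload roughly a third larger.

```javascript
// Hypothetical client-side step: wrap the file bytes in a JSON body as base64.
function encodeUpload(fileName, fileBytes) {
  return JSON.stringify({ fileName, data: fileBytes.toString('base64') });
}

// Hypothetical server-side step: turn the base64 text back into the original bytes.
function decodeUpload(jsonBody) {
  const { fileName, data } = JSON.parse(jsonBody);
  return { fileName, bytes: Buffer.from(data, 'base64') };
}

// Made-up sample bytes that include values >= 0x80.
const fileBytes = Buffer.from([0x25, 0x50, 0x44, 0x46, 0xff, 0x00, 0x80]);
const decoded = decodeUpload(encodeUpload('report.pdf', fileBytes));

console.log(decoded.fileName, decoded.bytes.equals(fileBytes));
```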
@ajouve This upload function is still running in my previous project, so I'm sure it works. Maybe something is wrong in your code. If you can share your code, I think someone can help.