Regression in aws-sdk-v3 PutObject when using node.js streams
February 8, 2025
Update since posting
There is a simple fix for this. We can disable the new checksum features by using requestChecksumCalculation: 'WHEN_REQUIRED' when constructing the S3Client, e.g.
const client = new S3Client({
  apiVersion: '2006-03-01',
  region: 'eu-west-2',
  requestChecksumCalculation: 'WHEN_REQUIRED'
})
This appears to skip the manipulation of the headers and allows the stream to work correctly as it did before. More details on github.
The original post
When storing objects in S3 from a node.js application server, streams provide a way to store the body of a user’s request without buffering it to disk or holding it fully in memory. Consider the following example using express:
app.put('/example', (req, res, next) => {
  /* validation */
  client.send(new PutObjectCommand({
    Bucket: targetS3Bucket,
    Key: targetS3Key,
    ContentLength: Number(req.headers['content-length']),
    ContentMD5: req.headers['content-md5'],
    Body: req, // represents a stream of the request body
  }))
    .then(() => res.sendStatus(200))
    .catch(next);
});
This worked up to and including @aws-sdk/client-s3 version 3.726.1. In 3.729.0, however, the command fails when the user’s request is made over a WAN, presumably because the body then arrives in many small chunks rather than a few large ones. The following is seen written to the console:
Are you using a Stream of unknown length as the Body of a PutObject request? Consider using Upload instead from @aws-sdk/lib-storage.
The PutObjectCommand itself fails with the error:
InvalidChunkSizeError: Only the last chunk is allowed to have a size less than 8192 bytes
Isolating the problem
Are we using a stream of unknown length? No, we know the length: we provide it as the ContentLength field. Tracing that error message back to the aws-sdk-v3 middleware-sdk-s3 source suggests that the Content-Length header is being checked on the request object itself. But again, that header should be present, given it’s where we source ContentLength from.
Adding some diagnostic logging (a sketch of the kind of middleware used is shown after this list):

- In 3.726.1 the content-length header is present.
- In 3.729.0 the content-length header is absent, but a newly added x-amz-decoded-content-length header is present instead.
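For reference, the diagnostics above can be reproduced with a small piece of middleware that prints the headers of the outgoing HTTP request just before it is sent. This is only a sketch of the approach, not the exact logging used; the 'logHeaders' name is an arbitrary label and the finalizeRequest step is simply one reasonable place to hook in:

// Attach a one-off diagnostic middleware to the S3 client from earlier.
client.middlewareStack.add(
  (next) => async (args) => {
    // At this step args.request is the outgoing HTTP request, so its
    // headers show whether content-length survived the middleware chain.
    if (args.request && args.request.headers) {
      console.log('outgoing headers:', args.request.headers);
    }
    return next(args);
  },
  { step: 'finalizeRequest', name: 'logHeaders' }
);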
Searching the aws-sdk-v3 source code shows that the headers are being manipulated in the middleware-flexible-checksums module, which is newly added to the middleware applied to requests in version 3.729.0.
As an experiment, I altered the source code to retain the content-length header. This changes the error seen to one related to checksum validation.
Can we bypass the problematic code?
Seemingly not. Manipulating the arguments to PutObjectCommand does not seem to let us avoid going through the flexible checksums middleware, or the code path which changes the headers. As such we seem stuck: streaming HTTP requests into PutObjectCommand seems impossible from 3.729.0 onwards.
How to resolve then?
Consider using Upload instead from @aws-sdk/lib-storage.
Ok, let’s do that. The reason we historically opted against the Upload helper was consistency checking: when the upload is automatically split into parts, the user-provided Content-MD5 digest would not match the digest calculated AWS-side. We’ll mitigate that by setting the part size artificially high to avoid splitting.
import { Upload } from '@aws-sdk/lib-storage';

const upload = new Upload({
  client,
  partSize: 100 * 1024 * 1024,
  params: {
    Bucket: targetS3Bucket,
    Key: targetS3Key,
    ContentLength: Number(req.headers['content-length']),
    ContentMD5: req.headers['content-md5'],
    Body: req, // represents a stream of the request body
  },
});

upload.done()
  .then(() => res.sendStatus(200))
  .catch(next);
This resolves our problem while keeping the feature-set of the original.
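One caveat worth noting (my own observation rather than anything from the SDK): if a request body ever exceeds the configured part size, Upload will split it into parts and the Content-MD5 consistency check described above would no longer hold. It may therefore be worth rejecting oversized requests up front. A minimal sketch, reusing the route from the example above:

const MAX_SINGLE_PART_BYTES = 100 * 1024 * 1024; // keep in sync with partSize above

app.put('/example', (req, res, next) => {
  const contentLength = Number(req.headers['content-length']);
  if (!Number.isFinite(contentLength) || contentLength > MAX_SINGLE_PART_BYTES) {
    // A body larger than one part would be split, so the user-provided
    // Content-MD5 would no longer match the digest calculated AWS-side.
    return res.sendStatus(413);
  }
  /* ... proceed with the Upload as above ... */
});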