Download multiple Amazon S3 files

Situation

I have hundreds, sometimes thousands, of small files (~50 KB each) on Amazon S3, separated into one bucket per day.

Problem

My Java application needs to download all files for a given period and deliver them to the front end. My cloud machine is limited in memory and disk (2 GB of RAM and 5 GB of disk).

Solution 1

Download the files one by one and transfer them to the front end. This is somewhat inefficient, since it involves thousands of small files.

Solution 2

Download the files one by one, zip them (splitting the zip into parts if needed, given the machine's limits), upload the zip to Amazon S3, and deliver only the zip link to the front end.
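To make Solution 2 concrete, here is a minimal sketch of streaming files into zip parts without holding them all in memory or on disk. `ObjectSource` and `PartSink` are hypothetical interfaces invented for this example: in a real application the first would wrap an S3 download and the second an upload of each finished part. The size threshold is also an assumption and counts uncompressed bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

final class ZipBatcher {

    /** Hypothetical source of file bytes, e.g. a wrapper around an S3 download. */
    public interface ObjectSource {
        InputStream open(String key) throws IOException;
    }

    /** Hypothetical destination for each zip part, e.g. an upload back to S3. */
    public interface PartSink {
        OutputStream newPart(int partNumber) throws IOException;
    }

    /**
     * Streams every key into zip parts and returns the number of parts written.
     * A new part is started once the current one has received at least
     * maxInputBytesPerPart uncompressed bytes.
     */
    public static int zipInParts(List<String> keys, ObjectSource source,
                                 PartSink sink, long maxInputBytesPerPart)
            throws IOException {
        int part = 0;
        long bytesInPart = 0;
        ZipOutputStream zip = null;
        byte[] buffer = new byte[8192]; // the only per-file memory held
        try {
            for (String key : keys) {
                if (zip == null || bytesInPart >= maxInputBytesPerPart) {
                    if (zip != null) {
                        zip.close(); // finishes the part and closes its sink stream
                    }
                    part++;
                    zip = new ZipOutputStream(sink.newPart(part));
                    bytesInPart = 0;
                }
                zip.putNextEntry(new ZipEntry(key));
                try (InputStream in = source.open(key)) {
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        zip.write(buffer, 0, n);
                        bytesInPart += n;
                    }
                }
                zip.closeEntry();
            }
        } finally {
            if (zip != null) {
                zip.close();
            }
        }
        return part;
    }
}
```

Because the threshold is checked before each file is added, a part can exceed it by at most one file. With ~50 KB files that overshoot is negligible next to the 2 GB / 5 GB limits.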

Question

Is there another solution that someone has used, a native AWS feature, or a more efficient approach to this problem?

  • I believe your solution is here: https://stackoverflow.com/questions/41764836/amazon-s3-console-download-multiple-files-at-once. With the AWS CLI you can download multiple files at once.

  • Why not just give the front end S3 links to the files within the desired period? The application would not need to download anything, and the client could fetch whichever files it needs.

1 answer

If I understood correctly, the problem is performance. Some things I believe can help:

1- Use a Lambda function to prepare the zip: https://docs.aws.amazon.com/lambda/latest/dg/with-s3.html

2- Deliver to the client via CloudFront (CDN): https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/MigrateS3ToCloudFront.html

3- Deliver via BitTorrent: https://docs.aws.amazon.com/AmazonS3/latest/dev/S3Torrent.html

4- Use TransferManager to download in parallel: https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/examples-s3-transfermanager.html

5- Avoid such small files; consider aggregating them into larger batches with Lambda or Glue.

Direct delivery through S3/CloudFront is better in terms of cost, performance, and security.
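As a rough illustration of the parallel-download idea in point 4, the sketch below uses a plain `ExecutorService` rather than TransferManager itself, so it runs without the AWS SDK. The `fetchOne` function is a hypothetical stand-in for whatever call actually retrieves one object, and the thread count is an assumption to tune against the machine's limits.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ParallelFetcher {

    /**
     * Fetches all keys using a fixed-size thread pool and returns the results
     * in the same order as the input list. fetchOne is a hypothetical
     * placeholder for the real download call.
     */
    public static Map<String, byte[]> fetchAll(List<String> keys,
                                               Function<String, byte[]> fetchOne,
                                               int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one task per key; the pool bounds how many run at once.
            List<Future<byte[]>> futures = new ArrayList<>();
            for (String key : keys) {
                Callable<byte[]> task = () -> fetchOne.apply(key);
                futures.add(pool.submit(task));
            }
            // Collect results in input order; get() blocks until each is done.
            Map<String, byte[]> results = new LinkedHashMap<>();
            for (int i = 0; i < keys.size(); i++) {
                results.put(keys.get(i), futures.get(i).get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

This version keeps every result in memory, which is acceptable for a few thousand ~50 KB files but not for much larger sets. In that case each result should be written out (for example into a zip stream) as soon as it completes.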
