How to extract files from a zip archive in S3

Question:

Tag: amazon-web-services,amazon-s3,cloud

I have a zip archive uploaded to S3 at a certain location (say /foo/bar.zip). I would like to extract the files within bar.zip and place them under /foo without downloading the archive or re-uploading the extracted files. How can I do this, so that S3 is treated pretty much like a file system?


Answer:

S3 isn't really designed to allow this; normally you would have to download the file, process it and upload the extracted files.

However, there may be a few options:

  1. You could mount the S3 bucket as a local filesystem using s3fs and FUSE (see article and GitHub site). This still requires the files to be downloaded and uploaded, but it hides these operations behind a filesystem interface.

  2. If your main concern is to avoid downloading data out of AWS to your local machine, then of course you could download the data onto a remote EC2 instance and do the work there, with or without s3fs. This keeps the data within Amazon data centers.

  3. You may be able to perform remote operations on the files, without downloading them onto your local machine, using AWS Lambda.

You would need to create, package and upload a small program, written in Node.js or another runtime that Lambda supports, to access, decompress and upload the files. This processing takes place on AWS infrastructure, so you won't need to download any files to your own machine. See the FAQs.
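As a rough illustration of what such a program might do, here is a minimal sketch in Python rather than Node.js (a swapped-in language, chosen only for brevity). The function name `extract_zip_object` is hypothetical, and `s3` is assumed to behave like a boto3 S3 client, of which only `get_object` and `put_object` are used. It reads the whole archive into memory, so it is only suitable for archives that fit within the function's memory.

```python
import io
import posixpath
import zipfile


def extract_zip_object(s3, bucket, zip_key):
    """Read a zip object from S3 and write each member back next to it.

    `s3` is assumed to behave like a boto3 S3 client (get_object / put_object).
    Returns the list of keys that were written.
    """
    # The archive is still downloaded, but by code running inside AWS,
    # not by your own machine. It is held entirely in memory here.
    body = s3.get_object(Bucket=bucket, Key=zip_key)["Body"].read()
    prefix = posixpath.dirname(zip_key)  # "foo/bar.zip" -> "foo"

    written = []
    with zipfile.ZipFile(io.BytesIO(body)) as archive:
        for member in archive.infolist():
            if member.is_dir():
                continue  # S3 has no real directories, so skip folder entries
            target_key = posixpath.join(prefix, member.filename)
            s3.put_object(Bucket=bucket, Key=target_key, Body=archive.read(member))
            written.append(target_key)
    return written
```

Inside a real function you would pass in `boto3.client("s3")`; keeping the client as a parameter also makes the logic easy to exercise locally with a stand-in object.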

Finally, you need a way to trigger this code. In Lambda, this would typically happen automatically when the zip file is uploaded to S3. If the file is already there, you may need to trigger the function manually, via the invoke-async command provided by the AWS API (since deprecated in favor of invoke with an asynchronous invocation type). See the AWS Lambda walkthroughs and API docs.
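To make the triggering step more concrete, the sketch below shows how a Python handler might pick the bucket and key out of an S3 upload notification, and how an event of the same shape could be built by hand for an archive that is already in the bucket. The names `zip_objects_from_event`, `manual_event` and `handler` are hypothetical, and the handler only reports what it would extract instead of doing the extraction.

```python
from urllib.parse import quote_plus, unquote_plus


def zip_objects_from_event(event):
    """Return (bucket, key) pairs for every .zip object named in an S3 event."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded in S3 notifications (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.lower().endswith(".zip"):
            pairs.append((bucket, key))
    return pairs


def manual_event(bucket, key):
    """Build a minimal event of the same shape, for archives already in S3."""
    encoded_key = quote_plus(key, safe="/")
    return {"Records": [{"s3": {"bucket": {"name": bucket},
                                "object": {"key": encoded_key}}}]}


def handler(event, context):
    # Lambda entry point: lists the archives this invocation would extract.
    targets = zip_objects_from_event(event)
    return {"to_extract": [f"s3://{b}/{k}" for b, k in targets]}
```

One way to send the hand-built event is boto3's Lambda client, for example `boto3.client("lambda").invoke(FunctionName="unzip-s3-archive", InvocationType="Event", Payload=json.dumps(manual_event("my-bucket", "foo/bar.zip")))`, where the function name and bucket are placeholders.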

However, this is quite an elaborate way of avoiding downloads, and probably only worth it if you need to process large numbers of zip files. Note also that Lambda functions have a maximum execution time: the limit was 60 seconds when this answer was written (AWS has since raised it to 15 minutes), and the default timeout is 3 seconds, so a function may run out of time if your files are extremely large.

