Copy Contents of one S3 Bucket to Another.

Need to automate copying files from one Amazon S3 bucket to another? So did I. Everything I found on Google was useless. Most of the scripts I found download the objects to the local machine first and then re-upload them to the destination bucket. That is unacceptable, especially if you are dealing with large files or a lot of them.

I’ve never written a line of Ruby before, but it seems to have some great AWS libraries, so I decided to give it a shot. There is a cool library out there called right_aws. You can install it with "gem install right_aws" (as root, if your gems are installed system-wide). Then simply copy this script:

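#!/usr/bin/env ruby
require 'right_aws'

S3ID = "Your AWS ID Here"
S3KEY = "Your AWS secret key"
SRCBUCKET = "Source Bucket"
DESTBUCKET = "Destination Bucket"

# Connect to S3 with your credentials.
s3 = RightAws::S3Interface.new(S3ID, S3KEY)

# list_bucket makes a single listing request, which S3 caps at 1,000 keys.
objects = s3.list_bucket(SRCBUCKET)
objects.each do |o|
  puts("Copying " + o[:key])
  # Server-side copy: the object never passes through this machine.
  s3.copy(SRCBUCKET, o[:key], DESTBUCKET, o[:key])
end
puts("Done.")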

Make sure the file is executable and you should be able to run it from the command line on any Unix system. If you would rather pass the file to the ruby interpreter yourself, you can drop the first line.

I know it's pretty brutish and probably sucks in more ways than one, but for now it works. And I think I like Ruby :D


This entry was posted on Wednesday, February 16th, 2011 at 10:11 pm and is filed under Linux, Programming, Technology.

13 Responses to “Copy Contents of one S3 Bucket to Another.”

  1. Rohan Dey says:

    Thanks for the quick script; it works like a charm. All the other tools I tried were junk.

  2. Awesome. This works really well. Hooked it up to “whenever” schedule so it just syncs production assets to staging bucket every day.

  3. Geo says:

    Great script, man! Very useful ;)

  4. Tim Olsen says:

    Your script copied only the first 1,000 items in my bucket (I have about 100,000 items). Here is a modified version of your script which copies everything:

    require 'right_aws'

    S3ID = "Your AWS ID Here"
    S3KEY = "Your AWS secret key"
    SRCBUCKET = "Source Bucket"
    DESTBUCKET = "Destination Bucket"

    s3 = RightAws::S3Interface.new(S3ID, S3KEY)
    s3.incrementally_list_bucket(SRCBUCKET) do |h|
      h[:contents].each do |o|
        puts("Copying " + o[:key])
        s3.copy(SRCBUCKET, o[:key], DESTBUCKET, o[:key])
      end
    end
    puts("Done.")

  5. [...] reasonable time. So, I started looking around for software tools to help. I found a helpful post on someone’s blog which suggested the right_aws ruby gem to handle interfacing with AWS, and included a standalone [...]

  6. Thanks! Just what I was looking for. The only modification I had to make to the script to get it to run on my Mac was to add the line:

    require 'rubygems'

    before the

    require 'right_aws'

    This may be obvious to ruby users, but for us non-rubies, a little googling was required.

  7. Walter says:

    Great program – works like a charm. Does anyone know a way, or the syntax, to have the program sync only the files that have been added or changed?
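
One way to approach this, as a minimal sketch rather than a tested tool: list the destination bucket first, remember each key's ETag, and copy only the source keys that are missing or whose ETag differs. The helper name sync_changed is hypothetical, and the sketch assumes each listing entry carries an :e_tag field next to :key. It does not delete anything from the destination, and it holds the whole destination listing in memory.

```ruby
# Hypothetical helper: copy only keys that are new or whose ETag differs.
# `s3` is assumed to respond to incrementally_list_bucket and copy, the
# same two calls the scripts above use on RightAws::S3Interface.
def sync_changed(s3, src_bucket, dest_bucket)
  # Build a key => ETag map of what the destination already holds.
  dest_etags = {}
  s3.incrementally_list_bucket(dest_bucket) do |page|
    page[:contents].each { |o| dest_etags[o[:key]] = o[:e_tag] }
  end

  copied = []
  s3.incrementally_list_bucket(src_bucket) do |page|
    page[:contents].each do |o|
      # Same ETag on both sides: treat the object as unchanged and skip it.
      next if dest_etags[o[:key]] == o[:e_tag]
      s3.copy(src_bucket, o[:key], dest_bucket, o[:key])
      copied << o[:key]
    end
  end
  copied # keys that were actually copied
end
```

With the constants from the script above, the call would be sync_changed(s3, SRCBUCKET, DESTBUCKET). ETags are not always a plain checksum of the content (multipart uploads are one exception), so some unchanged objects may be copied again; that costs time but does not lose data.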

  8. What about permissions? says:

    This works great. But for me the permissions on the copied files are not set to anything, and I need them to be READ. I could not find any way to add that to the copy command… has anyone else solved this?
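
A possible approach, offered as a sketch under assumptions: S3 accepts an x-amz-acl request header on a copy, and as far as I recall the right_aws copy method takes a directive and a headers hash after the destination key. If that signature holds for your version of the gem, a small wrapper like the hypothetical copy_with_acl below would request a canned ACL on each copied object. The 'public-read' default is only a guess at what READ means here.

```ruby
# Hypothetical helper: server-side copy that also requests a canned ACL.
# Assumes s3.copy accepts (src_bucket, src_key, dest_bucket, dest_key,
# directive, headers); check this against your installed right_aws.
def copy_with_acl(s3, src_bucket, key, dest_bucket, acl = 'public-read')
  headers = { 'x-amz-acl' => acl } # canned ACL for the newly created object
  s3.copy(src_bucket, key, dest_bucket, key, :copy, headers)
end
```

Inside the loop of the script above, the existing s3.copy line would become copy_with_acl(s3, SRCBUCKET, o[:key], DESTBUCKET).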

  9. Johannes says:

    If you need to copy large amounts of files,
    take a look at s3funnel (python)

    https://github.com/sstoiana/s3funnel

    which lets you copy using multiple threads, speeding up the process considerably.

    #git clone https://github.com/sstoiana/s3funnel
    #easy_install ./s3funnel

  10. Raghav says:

    I just tried your script, but I am getting the error below.

    I, [2013-01-07T09:20:29.339407 #8862] INFO -- : New RightAws::S3Interface using shared connections mode
    I, [2013-01-07T09:20:29.339988 #8862] INFO -- : Opening new HTTPS connection to sandeepactiance.s3.amazonaws.com:443
    W, [2013-01-07T09:20:29.736446 #8862] WARN -- : ##### RightAws::S3Interface returned an error: 404 Not Found
    <Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>13-01-07</Key><RequestId>81C70126EE4E9F16</RequestId><HostId>eFgm9Vw1aWzdbbZaAb0twLmgWJoIKg9lbQTwGzL9pW76V1WmHP/htlWon5zMCLZb</HostId></Error> #####
    W, [2013-01-07T09:20:29.736515 #8862] WARN -- : ##### RightAws::S3Interface request: https://sandeepactiance.s3.amazonaws.com:443/13-01-07 ####
    /usr/local/lib/ruby/gems/1.8/gems/right_aws-3.0.4/lib/awsbase/right_awsbase.rb:562:in `request_info_impl': NoSuchKey: The specified key does not exist. (RightAws::AwsError)
    from /usr/local/lib/ruby/gems/1.8/gems/right_aws-3.0.4/lib/s3/right_s3_interface.rb:203:in `request_info'
    from /usr/local/lib/ruby/gems/1.8/gems/right_aws-3.0.4/lib/s3/right_s3_interface.rb:324:in `list_bucket'
    from test33.rb:12

    Please look into this and update
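
One possible reading of the log above, and it is only a guess: the request went to the bucket host with the path /13-01-07, which is what I would expect if the bucket constant held a bucket name plus a folder. S3 has no real folders, only key prefixes, so the bucket name has to stay bare and the folder has to be passed as a listing option. The sketch below assumes incrementally_list_bucket accepts an options hash with a 'prefix' entry; the helper name copy_prefix is hypothetical.

```ruby
# Hypothetical helper: copy only the keys that start with a given prefix.
# Assumes incrementally_list_bucket takes an options hash as its second
# argument and forwards 'prefix' to the S3 listing request.
def copy_prefix(s3, src_bucket, prefix, dest_bucket)
  copied = 0
  s3.incrementally_list_bucket(src_bucket, 'prefix' => prefix) do |page|
    page[:contents].each do |o|
      # Keys keep their full name, prefix included, in the destination.
      s3.copy(src_bucket, o[:key], dest_bucket, o[:key])
      copied += 1
    end
  end
  copied # number of objects copied
end
```

For the bucket in the log, the call might look like copy_prefix(s3, "sandeepactiance", "13-01-07/", DESTBUCKET), with the trailing slash keeping the match to keys inside that folder.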

