Copy files between cloud storage services

Introduction

In this tutorial, you’ll learn how to create a serverless function that copies files between cloud storage services.

You’ll learn how to:

  • Connect to your cloud storage accounts
  • Get a list of files to copy
  • Select which files to copy based on filetype
  • Copy the files from one cloud storage account to the other

Setup

To complete this tutorial, you’ll need a Flex.io account and two cloud storage accounts: one to copy from and one to copy to.

Create connections to your cloud storage accounts

First, you’ll need to connect to both the cloud storage service you’d like to copy from and the one you’d like to copy to.

  1. To connect to these services, navigate to the Connection page and add your Connections.
  2. For the service you’ll copy from, give the connection the alias example-cloud-storage-copy-source. This alias is used as part of the path in the code that performs the transfer, much like a mapped network drive.
  3. For the service you’ll copy to, give the connection the alias example-cloud-storage-copy-target.
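
As a rough illustration of how an alias acts like a drive mapping, the sketch below joins an alias and a path in the `alias:/path` form that appears in the sample output later in this tutorial. The helper name `full_path` is hypothetical and is not part of Flex.io; the path syntax the platform actually expects in code may differ.

```python
# Hypothetical helper (not a Flex.io API): prefix a path with a connection alias.
SOURCE = "example-cloud-storage-copy-source"
TARGET = "example-cloud-storage-copy-target"


def full_path(alias: str, path: str) -> str:
    """Return 'alias:/path', normalising to exactly one leading slash."""
    return f"{alias}:/{path.lstrip('/')}"


print(full_path(SOURCE, "/temp/files/file1.txt"))
print(full_path(TARGET, "temp/files/file1.txt"))
```

Because only the alias differs between the two results, the same relative path can be reused on both sides of a copy.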

Get a list of files to copy

Next, get a list of files you want to copy. To do this:

  1. Create a new pipe in Flex.io.

  2. In the pipe, you’ll see an initial Execute Task; make sure the language option is set to Python, and then overwrite the example “Hello, World” code with the following:

  3. After saving your changes, press the “TEST” button in the upper right of the Pipe Builder. You should see a list of files in your Output.

  4. Once you’ve got a basic list, change the path to the folder that has the files you’d like to copy. For example, if you want to copy files in a folder called /temp/files, you’d change the path to:
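
The idea of listing one folder and capping the result can be sketched in plain Python, independent of Flex.io. The example below assumes listing entries are dictionaries with `type`, `path`, and `name` keys, as in the sample output shown later in this tutorial; the function `list_folder` and the sample entries are hypothetical stand-ins for a real directory listing.

```python
import posixpath


def list_folder(entries, folder, limit=5):
    """Return up to `limit` entries whose parent directory is `folder`."""
    folder = posixpath.normpath(folder)  # "/temp/files/" -> "/temp/files"
    children = [e for e in entries if posixpath.dirname(e["path"]) == folder]
    return children[:limit]


# Hypothetical listing data standing in for a cloud storage account
entries = [
    {"type": "DIR", "path": "/temp/files/archive", "name": "archive"},
    {"type": "FILE", "path": "/temp/files/file1.txt", "name": "file1.txt"},
    {"type": "FILE", "path": "/temp/other/notes.txt", "name": "notes.txt"},
]

for entry in list_folder(entries, "/temp/files"):
    print(entry["type"], entry["path"])
```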

Select files based on filetype

In the previous step of this tutorial, the output list was limited to 5 items. Depending on the folder you listed, you may have also noticed that the list includes directories as well as files.

You’ll likely want to filter the list so that it only includes files, and you may want to add other criteria, such as file extension or filename. For now, update the directory listing to only include text files.
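
A minimal sketch of this kind of filter, written as plain Python rather than Flex.io-specific code: it keeps only entries whose `type` is `FILE` and whose name ends in `.txt`. The function name `select_text_files` and the sample items are assumptions for illustration; the `type` and `name` keys follow the sample output in this tutorial.

```python
import posixpath


def select_text_files(items, extensions=(".txt",)):
    """Keep entries that are files and whose extension is in `extensions`."""
    selected = []
    for item in items:
        if item.get("type") != "FILE":
            continue  # skip directories and anything that is not a file
        ext = posixpath.splitext(item["name"])[1].lower()
        if ext in extensions:
            selected.append(item)
    return selected


# Hypothetical listing entries
items = [
    {"type": "DIR", "name": "archive"},
    {"type": "FILE", "name": "file1.txt"},
    {"type": "FILE", "name": "image.png"},
    {"type": "FILE", "name": "NOTES.TXT"},
]

print([item["name"] for item in select_text_files(items)])
```

Lower-casing the extension before comparing means that `NOTES.TXT` is treated the same as `notes.txt`; drop the `.lower()` call if the match should be case-sensitive.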

  1. Update your execute step with the following:
  2. Run the pipe, and you should see something like:
[
  {
    "type": "FILE",
    "full_path": "example-cloud-storage-copy-source:/temp/files/file1.txt",
    "path": "/temp/files/file1.txt",
    "name": "file1.txt",
    "size": 128,
    "modified": "2018-06-01T12:00:00Z"
  },
  {
    "type": "FILE",
    "full_path": "example-cloud-storage-copy-source:/temp/files/file2.txt",
    "path": "/temp/files/file2.txt",
    "name": "file2.txt",
    "size": 256,
    "modified": "2018-06-01T12:00:00Z"
  },
  ...

Copy the files from one cloud storage account to the other

Finally, copy the files that you’ve listed.

  1. First, add another Execute Task. Make sure the language option is set to Python and then update the execute code with the following placeholder logic that simply echoes the output of the previous step:
  2. Run the pipe, and you should see the filtered list of items in the Output.

  3. Now, update your new Execute Task with code to copy the items in the list to your second cloud storage account:
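
The shape of the copy logic can be sketched without relying on any particular storage API. In the example below, `read` and `write` are caller-supplied callables standing in for whatever functions actually fetch and store file contents; they, the function `copy_items`, and the in-memory dictionaries are all hypothetical. Each item is written to the same path under the target alias.

```python
TARGET = "example-cloud-storage-copy-target"


def copy_items(items, read, write, target_alias=TARGET):
    """Copy each listed item to the same path under `target_alias`.

    `read(full_path)` returns the content; `write(full_path, content)` stores it.
    """
    copied = []
    for item in items:
        # item["path"] already starts with "/", giving "alias:/temp/files/..."
        destination = f"{target_alias}:{item['path']}"
        write(destination, read(item["full_path"]))
        copied.append(destination)
    return copied


# In-memory stand-ins for the two storage accounts
source_store = {
    "example-cloud-storage-copy-source:/temp/files/file1.txt": b"hello",
}
target_store = {}

items = [
    {
        "full_path": "example-cloud-storage-copy-source:/temp/files/file1.txt",
        "path": "/temp/files/file1.txt",
    }
]

copied = copy_items(items, source_store.__getitem__, target_store.__setitem__)
print(copied)
```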

This copy logic could be included in the Execute Task that lists the files. However, by breaking out the copy and list logic, you gain significant flexibility in changing either the list or copy functionality without affecting the other.

For example, you may want to copy files based on a list you supply when you call the pipe from an API endpoint, or based on a file that you read in, or some combination. In that case, you’d only need to update the list logic to use that list and pass the new list on to the copy step.
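
One way to picture a caller-supplied list is as a JSON array of filenames that narrows down the directory listing. This is a sketch under that assumption only: the function `build_copy_list` and the JSON input format are hypothetical, and the way a pipe actually receives input from an API call is not shown here.

```python
import json


def build_copy_list(listed, supplied_json=None):
    """Return all listed items, or only those named in a supplied JSON array."""
    if not supplied_json:
        return list(listed)  # no caller input: copy everything that was listed
    wanted = set(json.loads(supplied_json))
    return [item for item in listed if item["name"] in wanted]


# Hypothetical listing entries
listed = [
    {"name": "file1.txt", "path": "/temp/files/file1.txt"},
    {"name": "file2.txt", "path": "/temp/files/file2.txt"},
]

print(build_copy_list(listed, '["file2.txt"]'))
```

The copy step does not need to change: it receives a list of items either way.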

Other options for the copy step include applying some type of transformation or conversion. You may also want to add a task after the copy to send a notification that the files have been copied or to report an error. Chaining together distinct execute steps tends to offer better reusability.
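
To illustrate the chaining idea in plain Python, the sketch below passes the output of each step to the next: a listing step, a copy step that records successes and errors, and a notification step that summarises the result. All function names are hypothetical stand-ins and nothing here calls Flex.io or a real storage service.

```python
def run_steps(steps, value=None):
    """Feed the output of each step into the next one, in order."""
    for step in steps:
        value = step(value)
    return value


def list_step(_):
    # Stand-in for the task that lists files
    return ["/temp/files/file1.txt", "/temp/files/file2.txt"]


def copy_step(paths):
    # Stand-in for the copy task: record successes and failures separately
    result = {"copied": [], "errors": []}
    for path in paths:
        try:
            if not path.endswith(".txt"):
                raise ValueError(f"unsupported file: {path}")
            result["copied"].append(path)
        except ValueError as exc:
            result["errors"].append(str(exc))
    return result


def notify_step(result):
    # Stand-in for a notification task: build a one-line summary
    return f"Copied {len(result['copied'])} file(s), {len(result['errors'])} error(s)"


summary = run_steps([list_step, copy_step, notify_step])
print(summary)
```

Swapping `list_step` for a different source of paths, or appending another step after `notify_step`, leaves the remaining steps untouched.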