Process data in Google Sheets using Pandas

Introduction

In this tutorial, you’ll learn how to process data in Google Sheets Using Python Pandas

Specifically, you’ll learn how to:

  • Connect to Google Sheets
  • Read data from a Google Sheet
  • Clean the Google Sheets data using the pandas library in Python
  • Create an API endpoint that returns the processed data

Setup

To complete this tutorial, you’ll need:

Connect to Google Sheets

  1. To connect to Google Sheets, navigate to the Connection page and add your Connection.

  2. Give the connection the alias example-google-sheets-process-with-pandas-source. This will be used as part of the path in the code that reads from the Google Sheets table, similar to a network drive mapping.

Read data from a Google Sheet

Next, you’ll be reading from the Google Sheet. You may have data that looks like the following:

Account,Name,Rep,Manager,Product,Quantity,Price,Status
882939,Melon Productions,Craig Booker,Joan Henley,Computer,1,32500,presented
882939,Melon Productions,Craig Booker,Joan Henley,Software,1,12000,presented
882939,Melon Productions,Craig Booker,Joan Henley,Maintenance,2,4000,pending
747551,"Hamel, Johnson and Ellers",Craig Booker,Joan Henley,Computer,1,35000,declined
221923,Ridgeco,Jonathan Hilton,Joan Henley,Computer,2,64000,won
228894,"Deserprises, Inc.",Jonathan Hilton,Joan Henley,Computer,2,45000,pending
228894,"Deserprises, Inc.",Jonathan Hilton,Joan Henley,Software,1,10000,presented
512291,Jenkins-Hamlin,John Smith,Joan Henley,Maintenance,2,5000,pending
740150,Voidscape LLC,John Smith,Joan Henley,Computer,1,35000,declined
141962,Kopp LLC,Kevin Moss,Fred Ellers,Computer,2,65000,won
163416,Dragger-Keeling,Kevin Moss,Fred Ellers,Computer,1,30000,presented
239344,Ice Arts LLC,Kevin Moss,Fred Ellers,Maintenance,1,5000,pending
239344,Ice Arts LLC,Kevin Moss,Fred Ellers,Software,1,10000,presented
307599,"Metz, Hendricks and Walford",Wendy Yates,Fred Ellers,Maintenance,3,7000,won
688981,Morton-Ellis LLC,Wendy Yates,Fred Ellers,Computer,5,110000,won
729855,Wavigations Ltd.,Wendy Yates,Fred Ellers,Computer,2,65000,declined
729855,Wavigations Ltd.,Wendy Yates,Fred Ellers,Monitor,2,5000,presented

To read this data:

  1. Create a new pipe in Flex.io

  2. In the pipe, you’ll see an initial Execute Task; make sure the language option is set to Python, and then overwrite the example “Hello, World” code with the following:

def flex_handler(flex):

# open the table in Google Sheets
input_path = 'example-google-sheets-process-with-pandas-source:/sales-funnel/sheet1'
input_file = flex.fs.open(input_path)

# read the data
data = input_file.read()

# write out the data
flex.output.write(data)

Change the input_path value to the path of the sheet you’d like to read. For example, if you have the sheet sheet1 inside a sales-funnel spreadsheet, you’d set the value of the input path to example-google-sheets-process-with-pandas-source:/sales-funnel/sheet1

  1. After saving your changes, press the “TEST” button in the upper right of the Pipe Builder. You should see the data in the pipe output, that looks something like:
[
{
"d": "Manager",
"f": "Quantity",
"b": "Name",
"h": "Status",
"e": "Product",
"a": "Account",
"c": "Rep",
"g": "Price"
},
{
"d": "Joan Henley",
"f": "1",
"b": "Melon Productions",
"h": "presented",
"e": "Computer",
"a": "882939",
"c": "Craig Booker",
"g": "32500"
},
...

Clean the Google Sheets data using the pandas library in Python

When data is returned from Google Sheets, the columns are listed as “a”, “b”, “c”, etc; however, for tabular data, the first row of data is often the header, so you’ll likely want to change the column names to be the first row of the data. In addition, columns are returned as text, so columns containing numeric data often need to be converted to a numeric data type. To perform these operations, do the following:

  1. Add a new Execute Task after the task that reads the table data from the sheet. In this task, select ‘Python’ as the language, then replace the “Hello World” example with the following code:

Create an API endpoint that returns the processed data

Finally, get an API endpoint to accessed the processed table from Google Sheets:

  1. First, turn on your pipe to activate it.

  2. Then, scroll down to the Deployment section and check the box next to Run using an API endpoint to see your API endpoint and API key.

  3. Next to the endpoint, which by default uses the unique pipe EID as a reference, click the Edit button to add an alias to your pipe as follows: example-process-data-in-google-sheets-using-pandas.

  4. To quickly test the endpoint, copy the HTTPS endpoint and paste it into your browser to see your data; the HTTPS endpoint will look like this (where {token} is your unique API key):

https://api.flex.io/v1/me/pipes/example-process-data-in-google-sheets-using-pandas/run?flexio_api_key={token}

Going further

Of course, now that you have your endpoint, you’re now ready to invoke it in any server or client code that allows you to make an HTTP request. Here are two examples, where {token} is your API Key.

HTML Script:

<script src="https://cdnjs.cloudflare.com/ajax/libs/axios/0.18.0/axios.min.js"></script>
<script>
axios({
method: 'post',
url: 'https://api.flex.io/v1/me/pipes/example-process-data-in-google-sheets-using-pandas/run',
headers: {
Authorization: 'Bearer {token}'
}
})
.then(response => {
// var converted_data = response.data
})
.catch(response => {
})
</script>

CURL:

curl -X POST 'https://api.flex.io/v1/me/pipes/example-process-data-in-google-sheets-using-pandas/run'
--header "Authorization: Bearer {token}"
'