Some weeks ago I wanted to analyze my spending habits using the data of one of the supermarket chains I shop at frequently. Each purchase at this particular chain sends a receipt to my email address. By a happy accident, I’ve been archiving these receipts for years. Therefore, I could download them from my Gmail account and run some code to analyze them.
There was a small problem: I had collected more than 300 receipts over the years. As much as I like splicing random data, one thing I knew for sure was that I did not want to download receipts manually. Fortunately, Gmail has an API that allows you to read emails programmatically. Instead of downloading the receipts manually, I could take twice the amount of time and write a script to do it for me. Perfect.
Accessing said API was not entirely straightforward. Given that I’m pretty sure I’ll think of other uses for this API in the near future, I wanted to document the process in detail. After all, I tend to forget things the moment I turn away from them, so some guidance for the next time I get around to it would be nice. Hopefully, it will turn out useful for others, too.
In short, the process of downloading email messages consists of a few steps:
- Setting up a Google Cloud project
- Setting up OAuth and getting credentials for the project
- Writing some code
Naturally, the first two steps are the most complicated.
Setting Up the Project
Right off the bat, I recommend going to Google’s official guide. It should be considered the source of truth. This post will eventually get outdated, whereas Google (presumably) will keep their guides up to date.
On the other hand, you’ll find a step-by-step guide here, which can be quicker than wading through reams of documentation.
That being said, let’s begin!
Setting Up Google Cloud Project
- The first step you’ll likely need to do is to create a Google Cloud Project. There are a couple of ways to do it:
  - Navigate to Google Cloud Console, click on the burger menu, hover over IAM & Admin, and select Create a Project (near the bottom of the list).
  - Use a direct create a project link (the same one that the guide has on the Go to Create a Project button). This is my preferred option since finding anything on Google Cloud Console is a Herculean task.
- Enter the project name (e.g., gmail-example) on the next page. Location is not mandatory.
- Click Create and wait for everything to initialize.
If all goes well, you’ll see a bunch of lists, graphs, tables, and whatnot on your Dashboard. Congratulations! The Google Cloud Project is set up.
Enabling Gmail API
Now that we have a project to play with, we need to enable Gmail API for it. If you like feeling your brain slowly oozing through your eyes, you can try to find it in a list of APIs on the API dashboard. For the rest of us who are too weak of spirit for such endeavors:
- Click APIs & Services > Library
- Search for gmail
- Click on the Gmail API result
- Click Enable and wait for a few seconds
Great success, we’re ready for the meat of the configuration.
Setting Up Credentials
Now that the API has been enabled for the project, you’ll be urged to create credentials.
- If you get a notification that you need to create credentials, click the Create credentials button. If you did not, you can navigate to Credentials on the menu on the left side, and click Create credentials at the top of the page.
- We’ll be accessing user emails, so choose User data
- In scopes, click Add or remove scopes
  - Go to the page that has Gmail scopes
  - Select View your email messages and settings and click Update.
  - You can add more scopes. However, it’s a good idea to keep the permissions as restrictive as possible. This helps to prevent disasters, such as deleting emails when you thought you were just reading them, or allowing other people using your computer (or malware) to do so.
- The next step will ask you to set up an OAuth Client ID
  - Choose Desktop app
  - Enter the client’s name, for example, python-client. (Note that this is the name used to identify the client in the console; users will not see it.)
  - Click Create
- If all goes well, the credentials should be generated and you’ll be presented with an option to download them. Do so by clicking Download. Credentials will be downloaded as a JSON file.
Now that you have the credentials generated, you can find them under the Credentials option on the menu. If you ever lose the file, you can download them again from there.
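Before moving on, it can be worth a quick sanity check that the downloaded file is what the later steps expect. The sketch below assumes the usual layout of a client secrets file, where a Desktop app client sits under a top-level "installed" key (a web client would use "web"). The function name client_type is made up for this example.

```python
import json
from pathlib import Path


def client_type(secrets: dict) -> str:
    """Return "installed" (Desktop app) or "web" for a parsed client secrets file."""
    for kind in ("installed", "web"):
        if kind in secrets:
            # Both fields are needed to start the OAuth flow.
            missing = {"client_id", "client_secret"} - set(secrets[kind])
            if missing:
                raise ValueError(f"missing fields: {sorted(missing)}")
            return kind
    raise ValueError("this does not look like an OAuth client secrets file")


def check_file(path: str) -> str:
    """Load the JSON file at `path` and report which kind of client it describes."""
    return client_type(json.loads(Path(path).read_text()))
```

Calling check_file on the downloaded file should return "installed" if you chose Desktop app above.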
Setting Up the OAuth Consent Screen
The final step is to set up the OAuth consent screen. This is what the users will see when prompted to give access to their Gmail. For local projects it’s probably not as relevant since you’ll be the only one using it, but set it we must nevertheless.
To do so:
- Select OAuth consent screen on the left side menu
- Depending on when you read this, you might get prompted to try the new experience. Or you might get it immediately. In the former case, click Go to new experience - no reason to learn the old UI.
- Click on Get Started
- Enter app information:
  - App name: {your app's name}
  - User support email: {your email}
- In the Audience step, select the External audience. I believe Internal would also work.
- In the Contact Information screen, enter your contact details
- In the Finish screen, agree to the usage policy
Once the creation process finishes, you can inspect the related information. The part we’re most interested in now is Audience, and more specifically, the Test Users section at the bottom of the screen. We want to add a test user - this will be the Gmail account that you want to get email messages from.
With all that done, what remains is a small matter of programming the email download logic.
Invoking Gmail API
The code is based on Python 3.13, although it should run even on much older Python versions. Before starting, I recommend having a look at Python quickstart from Google itself, which has a basic setup outlined. A summary of how to make the first contact with Gmail is provided below.
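First of all, we need to pull some libraries from Google using pip: google-api-python-client, google-auth-httplib2, and google-auth-oauthlib (the packages that Google’s Python quickstart lists).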
Then, create an auth.py file. Copy and paste the code from Google’s example:
Inspecting the code, you’ll see that it expects either a token.json or a credentials.json file. Remember the credentials file you downloaded a few minutes ago? This will be the credentials.json file. Find the downloaded file, rename it and move it to the same directory as auth.py. If you’re wondering about token.json, it will be created automatically once we run the code and allow our code to access our Gmail account.
Speaking of running the code, run it, for example with python auth.py.
This should open a browser window. Choose the account with the email address which you’ve added to Test users when setting up the OAuth consent screen. Google will warn you that the app has not yet been verified. This is expected, as we indeed have not verified the app. We’re just using it for our own needs, so there’s no need to do so.
Click Continue. On the next page, permissions requested by the app will be detailed, urging you to either cancel or continue. Click Continue once more. You should see the message "The authentication flow has completed. You may close this window". Feel free to close the window.
Note that you’ll see token.json created in the same folder where auth.py resides. It will be used until the token within expires. Until it does, you won’t need to re-authenticate with Google. Obviously, this file should be kept secret, along with credentials.json.
Aside from the generated token file, you should see Gmail labels printed out in the console. If that is so - congratulations are in order. You’ve just accessed your Gmail inbox!
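The label listing itself comes down to a single API call. Here is a minimal sketch, assuming service is the object returned by googleapiclient.discovery.build("gmail", "v1", credentials=creds). The helper name label_names is invented for illustration.

```python
def label_names(service, user_id="me"):
    """Return the names of all labels in the mailbox of `user_id`."""
    # "me" is the Gmail API shorthand for the authenticated user.
    response = service.users().labels().list(userId=user_id).execute()
    # The "labels" key may be absent, so fall back to an empty list.
    return [label["name"] for label in response.get("labels", [])]
```

Taking the service as a parameter keeps the function easy to try out against a stub, without touching the network.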
Reading Emails
Now that we know the code works, we can improve it to read emails, not labels. To do so, we’ll first extract authentication code into a separate module. Then, we’ll code a basic Gmail client, and use it to read some emails.
Let’s start with authentication. First, let’s create an auth package. Then, let’s move token.json and credentials.json to it. Finally, create an auth.py file and paste the following code there:
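A minimal sketch of what such a module could look like, modeled on the flow in Google’s quickstart. It is not necessarily the exact code from the repository. The function name get_credentials and its auth_dir parameter are assumptions made for this example. The Google libraries are imported inside the function so that the module itself can be imported even where they are not installed.

```python
from pathlib import Path

# Read-only access; delete token.json if you ever change the scopes.
SCOPES = ["https://www.googleapis.com/auth/gmail.readonly"]


def get_credentials(auth_dir: Path, scopes=SCOPES):
    """Load cached credentials from auth_dir, or run the browser flow to get new ones."""
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    token_path = auth_dir / "token.json"
    secrets_path = auth_dir / "credentials.json"

    creds = None
    if token_path.exists():
        creds = Credentials.from_authorized_user_file(str(token_path), scopes)
    if creds and creds.valid:
        return creds

    if creds and creds.expired and creds.refresh_token:
        # Silently renew the access token using the stored refresh token.
        creds.refresh(Request())
    else:
        # First run (or unusable token): open the browser consent screen.
        flow = InstalledAppFlow.from_client_secrets_file(str(secrets_path), scopes)
        creds = flow.run_local_server(port=0)

    # Cache the result so the next run can skip the browser step.
    token_path.write_text(creds.to_json())
    return creds
```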
This will let us keep the authentication separate from the Gmail client. This way, if authentication is broken, we’ll know where to look for issues.
With this done, we can start implementing Gmail client. For this demonstration, we’ll assume we want to find emails sent by a specific sender (for more options, see Gmail’s API).
Create a file client.py in the top level directory. Paste the following code there:
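As a rough sketch of such a client (the class and method names here are illustrative, not taken from the repository), the search can be expressed with the from: operator of Gmail’s query syntax. The service argument is assumed to be the object returned by googleapiclient.discovery.build("gmail", "v1", credentials=creds).

```python
class GmailClient:
    """Thin wrapper around a Gmail API service object."""

    def __init__(self, service, user_id="me"):
        self._service = service
        self._user_id = user_id

    def list_message_ids(self, sender):
        """Return the ids of all messages sent by `sender`, following pagination."""
        ids = []
        page_token = None
        while True:
            params = {"userId": self._user_id, "q": f"from:{sender}"}
            if page_token:
                params["pageToken"] = page_token
            response = self._service.users().messages().list(**params).execute()
            # "messages" is missing entirely when nothing matches.
            ids.extend(message["id"] for message in response.get("messages", []))
            page_token = response.get("nextPageToken")
            if not page_token:
                return ids
```

Results come back in pages, so with a few hundred receipts the nextPageToken loop matters. Without it only the first page of ids would be returned.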
Then, create a main.py file:
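One possible shape for this entry point is sketched below. It assumes the sender is passed as a command-line argument. It also assumes that the auth package exposes a function returning credentials and that client.py exposes a client class. The names get_credentials, GmailClient and list_message_ids are hypothetical placeholders, so adjust them to your own modules.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the command line; the only argument is the sender to search for."""
    parser = argparse.ArgumentParser(description="List Gmail messages from a sender.")
    parser.add_argument("sender", help="sender name or email address to search for")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)

    # Hypothetical names: replace with whatever your own modules expose.
    from auth.auth import get_credentials
    from client import GmailClient
    from googleapiclient.discovery import build

    creds = get_credentials(Path("auth"))
    service = build("gmail", "v1", credentials=creds)
    for message_id in GmailClient(service).list_message_ids(args.sender):
        print(message_id)
```

With a call to main() added at the bottom of the file, this would be run as python main.py followed by the sender you are interested in.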
Now you can run it with some sender name/email that you have in your inbox. In your console you should see a list of message ids.
Great success! But why do we see only some obscure ids? Well, that’s because we’re listing messages. To actually get their content, we’ll need to call a different API with a desired message id. This means we need to adjust our Gmail client a bit.
Replace the client.py with the following code:
Note two things:
- First, when retrieving an email message, you get a pretty complicated object. We need to reach quite far down into it to retrieve the email message. You may want to explore the API and data structure to understand what’s happening there.
- Second, the content that we get is encoded in Base64, but in its URL-safe variant (base64url), which uses - and _ in place of + and /. Python’s base64.b64decode expects the standard alphabet, so it cannot decode the data right off the bat. The fix is simple: either replace - with + and _ with / so that Python can understand it, or use base64.urlsafe_b64decode, which handles this alphabet directly.
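Both points can be sketched with the standard library alone. The example assumes the message was fetched in the default full format, so that it has a payload with a mimeType, a body and possibly nested parts. The helper names decode_body_data and find_body are made up for illustration.

```python
import base64
from typing import Optional


def decode_body_data(data: str) -> str:
    """Decode a base64url string as found in a Gmail message body."""
    # Restore any stripped padding so the length is a multiple of four.
    padded = data + "=" * (-len(data) % 4)
    return base64.urlsafe_b64decode(padded).decode("utf-8", errors="replace")


def find_body(payload: dict, mime_type: str = "text/plain") -> Optional[str]:
    """Return the first decoded body of the given MIME type, searching nested parts."""
    data = payload.get("body", {}).get("data")
    if payload.get("mimeType") == mime_type and data:
        return decode_body_data(data)
    # Multipart messages keep their content in a (possibly nested) list of parts.
    for part in payload.get("parts", []):
        found = find_body(part, mime_type)
        if found is not None:
            return found
    return None
```

Given a message returned by the API, find_body(message["payload"]) would yield the plain-text body, and find_body(message["payload"], "text/html") the HTML one, if present.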
Now that we have that out of the way, adjust the main.py:
And there you have it! You should now see the content of your email printed to the console. Note that for more complicated emails you may need to use something like beautifulsoup to parse the HTML and/or extract data from it.
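If you would rather avoid an extra dependency for simple cases, the standard library’s html.parser can pull the visible text out of an HTML body. Note that this is a different technique from beautifulsoup, shown only as a lightweight sketch. The names TextExtractor and html_to_lines are invented for this example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text chunks, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self._chunks.append(text)

    def lines(self):
        return list(self._chunks)


def html_to_lines(html: str) -> list:
    """Return the non-empty text fragments of an HTML document, in order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.lines()
```

For anything beyond flat text, such as pulling prices out of a specific table, a proper HTML library is likely the more comfortable choice.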
Source Code
One GitHub repo is worth a thousand words. With that in mind, you can find fully working code on GitHub. Note that you’ll need to provide your own credentials.json file.