I’ll show you a meticulous process involving Mozilla Thunderbird and its AttachmentExtractor Continued addon to extract and remove attachments from emails. No email needs to be deleted, and you’ll still be able to see which email had what attachment. However, the payload will no longer take up space in your account, allowing you to reclaim space. Once done, you can look up attachments locally where storage is more abundant. The rest of your emailing experience remains at Gmail, so you don’t need to switch to a desktop email client at all.
My free Gmail space dropped to 0 of the 15GB available. The service didn’t exactly live up to its tagline, or to my expectation that I would “never delete another email”. I got invited to Gmail in 2005, when it was providing 2GB of space. The number slowly climbed in front of our eyes to its current level. The day came when, instead of being pressured into upgrading storage, I chose to reclaim the space. I’d rather not pay for storing trash. I tried deleting some colorful newsletters, but soon realized I needed to remove attachments because they eat up all the space. If they weren’t around, 15GB would be enough for a lifetime’s worth of emails.
As a starting point, I’m assuming you know how to, and have already performed these:
- Install Thunderbird and the AttachmentExtractor Continued addon.
- Create an empty local folder in Thunderbird.
- Add your Gmail to it via IMAP.
- Wait till it syncs emails.
Set up AttachmentExtractor Continued like this
- Set a Default Save Path, where your attachments will accumulate.
- Choose: Always replace existing file with the new attachment.
- Tick: Delete attachments from the message / Delete without confirmation.
- Tick: Notify me when all the attachments have been extracted.
- Untick: Ask always for filename pattern.
- Pattern used for dates (#date#):
- Pattern used to generate filenames:
- Choose: Extract the Attachments.
- Tick: Show message subjects and attachment names on the progress dialog.
- Untick: Also extract embedded (‘inline’) images too.
Tips before you remove attachments
- If you have the space and time, copy/download everything to local folders as a backup.
- Do this on the weekend or during off-peak hours to avoid dealing with new messages mid-flight.
- Create a spreadsheet to keep track of message counts. Without this, you risk losing some. Always make a note of how many emails you are processing, what label you are currently doing, whether you are doing Inbox or Sent, and what step you are at. Long waiting times will make you forget, especially if you multitask.
- Expect to have your patience tested, but it works.
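If a spreadsheet feels too manual, the same bookkeeping can be sketched in a few lines of Python. This is only an illustration of the note-taking idea above: the `PassNote` fields and the `mismatches` helper are names I made up, and the counts still have to be read off Thunderbird by hand.

```python
import csv
from dataclasses import dataclass, asdict, fields

@dataclass
class PassNote:
    label: str   # Gmail label being processed, or "" when doing no label
    pool: str    # "Inbox" or "Sent Mail"
    step: str    # free-text description of the step you are at
    count: int   # message count you saw in Thunderbird at that step

def write_notes(notes, out):
    """Write the notes as CSV rows to any text file-like object."""
    writer = csv.DictWriter(out, fieldnames=[f.name for f in fields(PassNote)])
    writer.writeheader()
    for note in notes:
        writer.writerow(asdict(note))

def mismatches(notes):
    """Return (label, pool) pairs whose recorded counts are not all equal."""
    seen = {}
    for note in notes:
        seen.setdefault((note.label, note.pool), set()).add(note.count)
    return [key for key, counts in seen.items() if len(counts) > 1]
```

Calling `mismatches` after each pass points at the label and pool where a count changed between steps, which is the moment to stop and investigate.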
I’ll use this to refer to Gmail’s and Thunderbird’s organizing constructs:
- Label: Gmail labels that are not IMAP-standard, yet show up as folders in desktop clients.
- Tag: Thunderbird’s marking feature that doesn’t show up in Gmail.
- All Mail: the pool of Gmail where everything valuable sits.
- Inbox: This is what you primarily use. It’s just like a Gmail label, meaning an email does not have to be in either Inbox or Sent.
- Sent Mail: Gmail’s outbox, consisting of emails with foreign recipient addresses (not your own).
- Local: Thunderbird’s local storage, we’ll work in this.
About the filtering/tagging workflow, keep in mind
- The lower, black bar does the filtering (Tags button) after opening it with Quick Filter.
- Pitfall alert! If you merely click the Tags button but no specific tag, it’ll filter to emails having any tag. So it won’t show you all email.
- You can Pin the filter choices. It might seem convenient, but resetting/changing them takes just as long.
- The top grey bar’s Tag dropdown is for applying the tags.
- Emails can take more than one tag at a time. Don’t rely on color.
- There is an option to remove all tags at once (can also press 0).
- If you remove all tags, a filtered list of emails won’t change, allowing you to add a new tag.
Create these Thunderbird tags
- Detached: These are emails we successfully processed.
- FreshlyDetached: The result of the current pass.
- CurrentLabel: Keeps track of emails from the Label you are processing in the current pass.
- PendingDeletion: These are the original emails with the full-size attachments.
- PendingDetach: Mail marked for extraction and eventual attachment-detach.
Since emails in multiple Labels are not duplicates, just references, if you tag a mail in one Label, you mark its reference in the other one as well.
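A toy model may help picture this. It is not how IMAP or Gmail is implemented, just a sketch of the "one message, many references" idea; the `Message` class and the label names are made up for illustration.

```python
class Message:
    """One underlying email; label folders only hold references to it."""
    def __init__(self, subject):
        self.subject = subject
        self.tags = set()

msg = Message("Quarterly report")

# Two label "folders" that both point at the same message object.
labels = {"Work": [msg], "Inbox": [msg]}

# Tagging through one label...
labels["Work"][0].tags.add("CurrentLabel")

# ...is visible through the other, because it is the same message.
print(sorted(labels["Inbox"][0].tags))  # ['CurrentLabel']
```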
Step by step
You need to follow the steps VERY very carefully, in this precise order. If you mess up, you’ll create chaos. These all take place in Thunderbird, mainly in the IMAP Gmail account. The first pass removes attachments from Inbox; a second pass does the same for Sent Mail. Start with this only if you don’t have any meaningful Gmail labels to keep on attachment-containing mail.
Core steps for Inbox OR Sent Mail
- Go to Inbox OR Sent Mail.
- Filter to attachment-containing mail.
- Tag them with PendingDetach.
- Move them to the local folder with a simple drag and drop. It’ll take a while to download. This step removes them from Inbox or Sent Mail but leaves them in All Mail.
- Extract to remove attachments (right-click AE Extract from Messages -> To Default Folder). Handle the downloaded files.
- Add the FreshlyDetached tag to all local emails.
- Remove the PendingDetach tag.
- Leave local alone, and go to All Mail.
- Filter by PendingDetach tag.
- Remove all tags and add/ensure all have the PendingDeletion tag and nothing else.
- Go back to local.
- Copy (hold down Ctrl) the FreshlyDetached from local to Inbox OR Sent Mail (wherever they came from).
- Only if you are doing Labels, skip to Step #9 below. Otherwise:
- Filter in Inbox OR Sent Mail for FreshlyDetached (verify that message count is correct).
- Remove all tags and add/ensure all have the Detached tag and nothing else.
- Clean up local if all emails appropriately transferred.
Once you are done with the two passes, go to All Mail, filter for PendingDeletion, and delete those. The number of PendingDeletion emails should be less than or equal to the number of Detached ones.
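To see how the tags move through one core pass, here is a small simulation. It is a sketch under my own simplifications, not a script that talks to Gmail or Thunderbird: `Mail`, `core_pass`, and `purge` are hypothetical names, All Mail is modeled as a plain list, and moving to local is modeled as "make a local copy and drop the Inbox or Sent Mail label from the original".

```python
from dataclasses import dataclass, field

@dataclass
class Mail:
    subject: str
    has_attachment: bool
    labels: set = field(default_factory=set)  # e.g. {"Inbox"}; All Mail is implicit
    tags: set = field(default_factory=set)

def core_pass(all_mail, pool):
    """Simulate one core pass over `pool` ("Inbox" or "Sent Mail")."""
    # Filter to attachment-containing mail and tag it PendingDetach.
    todo = [m for m in all_mail if pool in m.labels and m.has_attachment]
    for m in todo:
        m.tags.add("PendingDetach")

    # Move to local: a local copy appears, the original stays in All Mail only.
    local = []
    for m in todo:
        local.append(Mail(m.subject, m.has_attachment, set(), set(m.tags)))
        m.labels.discard(pool)

    # Extract attachments locally, then swap PendingDetach for FreshlyDetached.
    for m in local:
        m.has_attachment = False
        m.tags.add("FreshlyDetached")
        m.tags.discard("PendingDetach")

    # In All Mail, the originals end up with PendingDeletion and nothing else.
    for m in all_mail:
        if "PendingDetach" in m.tags:
            m.tags = {"PendingDeletion"}

    # Copy the stripped mail back to where it came from.
    for m in local:
        all_mail.append(Mail(m.subject, False, {pool}, set(m.tags)))

    # Back in the pool, FreshlyDetached becomes Detached and nothing else.
    for m in all_mail:
        if pool in m.labels and "FreshlyDetached" in m.tags:
            m.tags = {"Detached"}

    local.clear()  # clean up local

def purge(all_mail):
    """Final cleanup: delete everything tagged PendingDeletion."""
    all_mail[:] = [m for m in all_mail if "PendingDeletion" not in m.tags]
```

Running `core_pass(mail, "Inbox")` on a list where two of three Inbox messages have attachments leaves two PendingDeletion originals and two Detached copies, so the count check from the paragraph above holds before `purge` is called.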
Label-preserving pre/post-process steps
- Select a Label on the left side of Thunderbird.
- Filter to attachment-containing mail.
- If you see PendingDeletion mail, copy them to Inbox or Sent Mail regardless of which pass you are doing, but according to the sender/recipient (these are previously processed emails with more than one label).
- Select the messages shown after filtering.
- Apply the CurrentLabel tag to mark them.
- Go to Inbox OR Sent Mail.
- Filter for the CurrentLabel tag (note how many).
- Do the Core steps, starting from #3, then come back.
- Filter in Inbox OR Sent Mail for FreshlyDetached (verify the message count).
- Remove all tags and add/ensure all have the Detached tag and nothing else.
- Copy (hold down Ctrl) these emails from Inbox OR Sent Mail (whichever you just processed) to the original Label.
- Empty the local folder, so it’s clean for further passes.
If you did this for Inbox, repeat one more time for Sent Mail from Step #6 here. Once you are done with labels, go to All Mail, filter for PendingDeletion, and delete those. The number of PendingDeletion emails should be less than or equal to the number of Detached ones. You can change the Detached tag to something else to identify and separate them from what you’ll do next, which is processing Inbox and then Sent Mail for the unlabeled leftover bunch.
Which approach should I use?
Each will preserve Inbox and Sent status and the stars, but will remove attachments. However, if you have any meaningful Gmail labels that you don’t wish to re-apply, go for the Label-preserving route. It’s a lot of extra work, mind you. If you don’t do it, all processed email will lose their Gmail labels.
Yes, the steps could be optimized into obscurity, some safety tags skipped, but it requires a large mental capacity to follow what’s going on, so stick to baby steps.
A Web Applications Stack Exchange answer inspired this method, but I modified it to my needs. It’s safer, more comprehensive, and supports preserving all associated Gmail data.
Q&A, concerns, and gotchas
About the process
Will this get rid of inline images?
Inline images can easily bloat the size of a mail, yet they are not detached as attachments. What’s worse, when someone replies to a mail and quotes the previous one, these images are copied over and over. Even a 1MB inline screenshot can take up 100MB of space if an email thread is 100 messages long, because people quote the whole thread when replying.
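The arithmetic behind that estimate is simple enough to write down. This sketch assumes every message in the thread carries exactly one copy of the image (the original plus one quoted copy per reply); `thread_storage_mb` is just an illustrative name.

```python
def thread_storage_mb(image_mb, messages):
    """Storage used by one inline image if every message in the thread carries a copy."""
    return image_mb * messages

print(thread_storage_mb(1, 100))  # 100
```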
However, I don’t want to remove all images from email contents as that would break the majority. I can always look up attachments, but let’s not impair the re-reading flow that much. Feel free to try that option in the plugin, though. It works fine, but the end result looks so broken because of missing images!
The solution is to only enable their removal on a select few where the sender clearly meant to attach and not inline them. This requires cherry-picking messages and might not be necessary (unless you are a perfectionist). Just remove attachments and see if you are happy with the results. I freed up more than half of the space.
What if I encounter an email with more than one Label?
Don’t delete PendingDeletion emails from All Mail until you’re done with everything. Please note that you might not have enough Gmail space to keep two copies of each mail: one PendingDeletion (with attachment), and one Detached, just to carry the multi-Label information on them. Eventually, mail with 2+ Labels will be all right too.
Signs you are running into a multi-Label situation:
- Attachment already exists among downloaded files upon extraction. Extract anyway. It’s ok to overwrite.
- You’ll find that you are working with PendingDeletion emails. Proceed as usual; don’t panic.
- Finding some CurrentLabel emails in All Mail that you already touched for a previous Label? These have to be cleaned up: do two more rounds on them (for Inbox and Sent Mail).
What if upon uploading back to Gmail, not all my emails arrive?
When there is a mismatch in message count, you can usually re-copy the same bunch of emails to the same destination; Gmail should de-duplicate them all. That’s why you keep a note in that spreadsheet and compare it with the results you are getting.
If you consistently get a bit fewer emails, it’s not a transmission error but a de-duplication. In the original set, some emails could have been identical.
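One way to check the "identical emails" theory before uploading is to count repeated Message-ID headers in the local folder. The sketch below uses Python’s standard `mailbox` module and assumes the Thunderbird local folder is stored as an mbox file (the default, as far as I know). Treating Message-ID as the thing Gmail de-duplicates on is my assumption, not something I have confirmed, and `duplicate_message_ids` is a made-up helper name.

```python
import mailbox
from collections import Counter

def duplicate_message_ids(mbox_path):
    """Return {Message-ID: occurrences} for IDs that appear more than once."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        ids = Counter((msg.get("Message-ID") or "").strip() for msg in box)
    finally:
        box.close()
    ids.pop("", None)  # ignore messages without a Message-ID header
    return {mid: n for mid, n in ids.items() if n > 1}
```

If the number of surplus copies reported here matches the shortfall you see in Gmail after copying, the missing emails are most likely duplicates rather than lost uploads.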
What about the email I sent to myself?
These appear in both Sent Mail and Inbox, but they are the same message: if you remove attachments from one, it’ll affect the other. I don’t think these even leave the server. If they do but are redirected back by something like Mailgun, Gmail still knows and creates a reference, so they are special snowflakes. Anyway, you should be using PushBullet to communicate with yourself 😀
What happens on Gmail with PendingDeletion emails?
- With the Gmail conversation view on, you’ll see the duplicate emails in the threads.
- If conversation view is off: you won’t see the to-be-deleted originals unless you go to All Mail.
- Only delete them (with Thunderbird) once you went through everything.
Why not do everything on All Mail and the Labels directly?
Don’t do it… There is a reason why I separate Inbox and Sent Mail. Google might not re-apply the Inbox and Sent Mail “labels”. It could have a hard time re-associating your sent mail as Sent Mail, mainly if you use more than one email address as an alias, or used to. If the processed mail only sits in All Mail but is part of a conversation, you’ll likely still see them. But let’s preserve proper Inbox and Sent Mail status!
Regarding the downloaded files
Then how do I handle all that mess of downloaded attachments?
They are just a repository of the files. For example, put them into year-based folders and archive them. I hardly ever look at old emails, let alone their attachments, but I do want to keep them just in case. Going down memory lane with your favorite image viewer on the desktop is probably smoother than fishing for image-attachment-containing emails in Gmail.
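Sorting into year-based folders can be scripted. The sketch below assumes each file name starts with a four-digit year, which depends entirely on the filename and date patterns you configured in the addon; adjust `YEAR_PREFIX` to your own pattern. `sort_by_year` is a hypothetical helper, and it defaults to a dry run so nothing moves until you have checked the plan.

```python
import re
import shutil
from pathlib import Path

# Assumption: file names begin with a 4-digit year such as "2012...".
YEAR_PREFIX = re.compile(r"^(19|20)\d{2}")

def sort_by_year(save_path, dry_run=True):
    """Move files in save_path into subfolders named after their year prefix.

    Returns the list of (source, target) pairs; with dry_run=True nothing is moved.
    """
    save_path = Path(save_path)
    moves = []
    for f in sorted(save_path.iterdir()):  # sorted() snapshots the listing first
        if not f.is_file():
            continue
        match = YEAR_PREFIX.match(f.name)
        if not match:
            continue  # leave files without a recognizable year where they are
        target = save_path / match.group(0) / f.name
        moves.append((f, target))
        if not dry_run:
            target.parent.mkdir(exist_ok=True)
            shutil.move(str(f), str(target))
    return moves
```

Run it once with the default `dry_run=True`, look over the returned pairs, and only then call it again with `dry_run=False`.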
Should I delegate downloaded attachments to filesystem folders that resemble the Label names?
Unless you have a perfect auto-labeled inbox, don’t do it. It’s pointless because:
- In Gmail, entire conversations appear labeled in a certain way, even if only one email in them is labeled. This means your Sent Mail is likely not labeled if you manually dropped the labels on conversations started by your pal. You’d end up with only a handful of attachments belonging to a folder you created for such a label. Every other single message will be unlabeled and will come from the Inbox and Sent Mail pools.
- Imagine that after you remove attachments later you assign labels to everything (probably driven by OCD), every email in each conversation. Then you aren’t able to freshly sort attachments to folders according to labels because they no longer exist in Gmail. So if you really really want to do this, label everything first, even Sent Mail, then circle back to this article. It requires a ton of auto-labeling by filters, and I don’t think it’s worth it. But I certainly didn’t want to destroy existing labels as I did spend some time adding them in the past.
So how about delegating attachments to Inbox/Sent folders?
That works just fine. Move the files after each pass, according to which (Inbox or Sent Mail) you just processed. I use these folder names to keep my sanity when moving attachments to them from the default save path:
- Outgoing from my address
- Incoming from foreign address
Since I save attachments with that peculiar file name pattern, the sender address is apparent in it. I look at the folder name vs. the file name, and if I see my address, it belongs to Outgoing, but attachments with foreign addresses belong to Incoming.
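That eyeball check can be expressed as a tiny function. It assumes the sender address appears verbatim somewhere in the file name, which is true only if your filename pattern includes it; `target_folder` and the example addresses are made up for illustration.

```python
OUTGOING = "Outgoing from my address"
INCOMING = "Incoming from foreign address"

def target_folder(filename, my_addresses):
    """Pick the folder for one downloaded attachment based on its file name."""
    name = filename.lower()
    # Any of my own addresses (including aliases) in the name means I sent it.
    if any(addr.lower() in name for addr in my_addresses):
        return OUTGOING
    return INCOMING
```

For example, `target_folder("2014 me@example.com invoice.pdf", ["me@example.com"])` returns the Outgoing folder name, while a file carrying someone else’s address lands in Incoming.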
Put attachments in folders based on recipients/partners?
It works, but you need to have the foresight, and it’s unfeasible to do it for everyone. Half of your correspondence with a person will have your address on it as the recipient. Recipient addresses cannot be in the filename of a downloaded attachment, so you won’t be able to find your attachments to the person based on the filename. If you are collecting attachments you exchanged with a special someone, do their emails first (before doing passes on Labels), and create a filesystem folder for them. This is worthwhile even if your labeling is useless, because it can be based on a simple Thunderbird search.
What do I do with too long filenames on the attachments?
Rename and shorten the part that is quoted from the subject. It’s rare, but you can help the system by making the path of the parent folder as short as possible. See more info on the 260-character path limit of Windows.
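To find the offenders up front, a short script can list every file whose full path reaches the limit. This is a sketch: `too_long` is a made-up name, and the default of 260 counts the terminating character, so as I understand the classic Windows limit, a path string of 260 or more characters is already too long.

```python
import os

MAX_PATH = 260  # classic Windows limit; treat the exact boundary as an assumption

def too_long(root, limit=MAX_PATH):
    """Return absolute paths under root whose length is at or above the limit."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.abspath(os.path.join(dirpath, name))
            if len(full) >= limit:
                hits.append(full)
    return hits
```

Point it at the default save path after a pass and shorten the subject part of whatever it reports.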
How will I quickly look up the attachment by name?
With the search feature of your OS. If you disable such indexing, you can use Everything on Windows. If you need to look manually, the filename begins with the date of the mail (using my pattern).
What if I don’t want to keep some attachments?
You can delete the files from the computer, and that’s it. Maybe it’s faster to identify what you don’t need once it’s on the computer, in bulk. The addon can remove attachments without downloading, but it can be a pain if you don’t just blanket-delete everything.
Cut-off at a specific size and skip detaching small attachments?
What would be the point of that? 90%+ of mail with attachment probably has just a small file, yet those also take up a large part of your storage. The method above will ensure you don’t have any attachments in Gmail, period. If people were exchanging pure text/HTML emails and used 3rd party services to share files, we would be able to store hundreds of thousands of emails in Gmail!
One could argue, Pareto-style, for detaching only the top 20% of attachments by size. But then the account would become inconsistent: you’d need to look up some attachments locally, but not others. Just remove attachments from everything. The only exception is inline images; they shouldn’t all be blindly detached.
What if I want to take some messages off Gmail?
You can create a local backup in Thunderbird. And you can actually put some of them back onto the server later if you prefer to read them in that conversation view.
Thoughts about Labels and maintenance in Gmail
Can I organize Gmail Labels with Thunderbird?
It is not compatible with Gmail’s label construct, but you can try. I would be afraid to mass move messages this way, as I guess it’s not the best interface for this task.
Nested labels? Importance markers? Automatic category labels (such as Promotions)?
Sorry, I have no idea how this process applies to them. Call me old-fashioned, but I don’t use these features. Gmail used to look like this and I haven’t been keeping up with their innovations that much.
What about Labels with only a handful of emails to process?
It’s a pain in the butt, but you need to do the same steps on them. Resist the temptation to remove attachments in-place without downloading the emails to a local folder first. If you touch them with the extractor and attack without Thunderbird tags, you’ll quickly see duplicates spawning. It’s easy to lose your mind without the tags, even on a handful of emails. If you must take a shortcut, relabel these in Gmail first. If that made sense, try closing a few labels and merging them into bigger ones. Perhaps you don’t need a new label for every major sender, even though it’s easy to create auto-labeling filters for those.
What should my auto-label filters do?
That’s entirely up to you, but I can share one key idea. Whenever you’d add a Gmail label to a conversation manually, go the extra mile and create a filter based on that decision, and retroactively apply it. You’ll only have a perfect inbox if you auto-label everything with filters, including Sent Mail. In a way, you train the system, but it’s no AI, just a simple set of rules.
Why (auto)label at all?
Manual labeling, then looking up old mail based on that label is one thing. I think the most significant benefit of automatic labeling is that you quickly see what new mail is all about. Labels have a little counter in the sidebar, and you can see how many unread awaits you in which one. Also, your inbox becomes much more colorful and less boring. They also appear in the Gmail app. Nice!
Should I massively reorganize labels in Gmail BEFORE I remove attachments?
Up to you, but even just preserving the current Label system and “just detaching” attachments is a daunting task by itself. Reorganizing and issuing labels beforehand could save you a few passes of work, but that’s it. It has little effect on the end result. Unless, of course, you go as far as deleting individual mail, removing superfluous labels, creating automatic filters, and labeling literally everything. The only benefit of doing detaching on a “perfect” collection of mail is that you could put attachments from certain Labels in their own filesystem folders.
Furthermore, I’m assuming making space in Gmail is a priority (if not an SOS) task. Creating a perfect Gmail account may be pointless and way too much OCD to bother with. I’m happy if most new email shows up auto-labeled, with less and less junk. Am I a marketer’s worst nightmare when I routinely unsub from every newsletter I see and recommend that you do so as well? I’ll gladly delete junk that I used to keep.
How do I make this a better world and prevent clogging up my Gmail again?
Do these yourself, and ask/educate friends to:
- Avoid attachments when possible, use Dropbox or Tresorit.
- Don’t inline images, use Snipboard.io for screenshots and stuff.
- Don’t attach videos, upload them to YouTube unlisted or Screencast.com.
- When replying, remove large inline images in the quoted part.
- Delete all but the most recent quote when you are replying to an email thread that is going to be long and there are images involved.
- Use Google Drive shared folders to temporarily share a number of photos instead of a series of emails. Remove these when everyone downloaded them and/or lost interest, as it counts toward the same space as Gmail.
Can I get rid of Thunderbird?
You can make a Gmail Label for marking all your processed, Detached mail. Copy the messages in question from All Mail to a new Label. This way Gmail will store the information that you already processed these, and you can uninstall Thunderbird as you no longer rely on the Tag.