I'm currently downloading all my Gmail emails
One of the things I want to do is search for specific emails and put them into a separate folder
This is great because it creates a new email file that I can then use to parse and get information out of
BUT, I've noticed something that seems odd: the size of the 'All Mail' file does not change
The size of all emails I have is 3GB
If the total size of the new folder emails is 2GB, then this adds a 2GB file
So, the total space used on my machine is now 5GB
matt: that's awesome - thank u
question: where's the trade off?
i assume there's a reason why it's not compacted in the first place?
does it take longer to search?
the only thing for me, though, is that i downloaded the mail so i could get at the mbox file - i'm trying to find someone to write a script that will parse about 2000 emails and pull data out of them
so... compacting may not be good in this instance? since i assume the file format will change?
for backing up, i'm zipping up and storing onto a dvd anyway - if i compact, will this help or change anything - i.e. if compacted, then zip has little left to compress?
The trade-off is speed, which is why almost all mail programs do it. By simply marking the entry in the file as deleted or moved and ignoring it, the very slow disk I/O is minimized, so the program is more responsive. The first time I ran across this was in the DOS days with dBase II; it is far more common than you might think wherever flat-file databases are used. Even Microsoft Word does it, appending changes to the file rather than writing a new one.
Compacting is supposed to be set up by default to run once 100 KB of space will be returned to the file system. In reality, the actual compact may be delayed for quite some time, as it has a whole raft of triggers designed to keep it from running while you are using the program.
Obviously, the larger the underlying file, the longer it takes to search.
The space gained by compacting is simply the file being rewritten with the deleted mail finally removed from disk (moved = deleted for a single folder).
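To make the "mark it deleted now, rewrite the file later" idea concrete, here is a minimal Python sketch of a toy flat-file store. It is not how Thunderbird stores mail; the record layout (a one-byte status flag plus a four-byte length) and all the function names are made up for illustration. Deleting only flips one byte in place, so the file stays the same size until `compact` rewrites it without the dead records.

```python
import os
import struct
import tempfile

# Hypothetical record layout for this sketch: 1-byte status flag + 4-byte payload length.
LIVE, DELETED = b"L", b"D"
HEADER = struct.Struct("<cI")


def append_record(path, payload):
    """Append a live record and return the offset where it starts."""
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    with open(path, "ab") as f:
        f.write(HEADER.pack(LIVE, len(payload)) + payload)
    return offset


def mark_deleted(path, offset):
    """Cheap delete: overwrite the status byte in place, leave the data on disk."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(DELETED)


def live_records(path):
    """Yield the payload of every record that is not marked deleted."""
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break
            flag, length = HEADER.unpack(header)
            payload = f.read(length)
            if flag == LIVE:
                yield payload


def compact(path):
    """Slow part: rewrite the whole file, keeping only the live records."""
    tmp = path + ".compacting"
    with open(tmp, "wb") as out:
        for payload in live_records(path):
            out.write(HEADER.pack(LIVE, len(payload)) + payload)
    os.replace(tmp, path)


if __name__ == "__main__":
    store = os.path.join(tempfile.mkdtemp(), "store.bin")
    first = append_record(store, b"x" * 100)
    append_record(store, b"y" * 50)
    mark_deleted(store, first)
    print("before compact:", os.path.getsize(store), "bytes")
    compact(store)
    print("after compact:", os.path.getsize(store), "bytes")
```

The delete touches one byte while the compact touches every live record, which is the speed trade-off in miniature.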
So compacting would be good, not bad, as it will reduce the data to back up and reduce duplication/redundancy. The file is still a plain mbox afterwards; only the deleted mail is gone.
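For the parsing script, a rough sketch using Python's standard `mailbox` module might look like the following. Treat it as a starting point under assumptions, not a finished tool: the `0x0008` "expunged" bit in the `X-Mozilla-Status` header is my understanding of how Thunderbird marks deleted mail in an uncompacted file, and the function names and the fields pulled out (from, subject, date) are just examples. Because it skips anything flagged as expunged, it should give the same result whether or not the folder has been compacted.

```python
import mailbox
import sys

# Assumption: this bit in X-Mozilla-Status means "deleted, waiting for compact".
EXPUNGED = 0x0008


def is_expunged(message):
    """Return True if the message carries the (assumed) expunged flag."""
    status = str(message.get("X-Mozilla-Status", "0000")).strip()
    try:
        return bool(int(status, 16) & EXPUNGED)
    except ValueError:
        # Header is missing or not hex: treat the message as live.
        return False


def extract_summaries(mbox_path):
    """Collect a few header fields from every live message in an mbox file."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        rows = []
        for message in box:
            if is_expunged(message):
                continue
            rows.append({
                "from": str(message.get("From", "")),
                "subject": str(message.get("Subject", "")),
                "date": str(message.get("Date", "")),
            })
        return rows
    finally:
        box.close()


if __name__ == "__main__":
    # Usage: python parse_mbox.py /path/to/folder-file
    for row in extract_summaries(sys.argv[1]):
        print(row["date"], "|", row["from"], "|", row["subject"])
```

Run it against a copy of the folder file rather than the one inside the live profile, so the mail program and the script are never working on the same file at once.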
Note: unless you really need to zip the files, I would suggest you use the ImportExportTools extension to export the data uncompressed, and also use it to export an HTML listing of the mails in the files. That way you can save the list on the DVD and scan through it to see if the mail you want is there.
It can also export the mail as a folder of EML files, in which case you would have your listing and a whole DVD of emails, one to a file, named from the subject, that open with a double click.
The import export tools can be found here