MESSAGE
| DATE | 2003-05-22 |
| FROM | Marco Scoffier |
| SUBJECT | [hangout] mbox_splitter.py |
Well, this is the first time I have done this, and I am a little wary of the scrutiny that comes with wide usage. But I wrote a small script for my own use that I think might be useful to many of you.
mbox_splitter.py: splits the mailboxes it receives as arguments into pieces of a user-defined size (say -S=300000 bytes, or about 300K).
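For readers who want a feel for the splitting step, here is a minimal sketch using Python's standard `mailbox` module. It is not the code from mbox_splitter.py; the function name `split_mbox`, the `max_bytes` parameter, and the numbered part-file naming are all assumptions made for illustration.

```python
import mailbox


def split_mbox(path, max_bytes):
    """Copy the messages in the mbox at `path` into numbered part files.

    A new part is started whenever adding the next message would push the
    current part past `max_bytes`. A single message larger than `max_bytes`
    still gets a part of its own. Returns the list of part file paths.
    (Hypothetical helper; the naming scheme path.000, path.001, ... is assumed.)
    """
    parts = []
    current = None
    current_size = 0
    source = mailbox.mbox(path, create=False)
    try:
        for message in source:
            size = len(message.as_bytes())
            too_big = current_size > 0 and current_size + size > max_bytes
            if current is None or too_big:
                if current is not None:
                    current.close()  # flush the finished part to disk
                part_path = "%s.%03d" % (path, len(parts))
                parts.append(part_path)
                current = mailbox.mbox(part_path)
                current_size = 0
            current.add(message)
            current_size += size
    finally:
        if current is not None:
            current.close()
        source.close()
    return parts
```

The size counted here is the serialized message without its mbox "From " separator line, so the part files come out slightly larger than `max_bytes` suggests.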
It also strips out attachments that are larger than a certain size and saves them to a separate file. It can also save only a specific set of headers, so you keep only the important ones if you wish (this can reduce the size of an mbox by 20-50%).
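The header filtering and attachment stripping described above could look roughly like the sketch below, built on the standard `email` package. This is an illustration only, not the script's actual code: the names `KEEP_HEADERS`, `keep_only_headers`, and `strip_large_attachments` are hypothetical, the header list is a guess at what "important" might mean, and only top-level attachments are handled.

```python
# Assumed set of headers worth keeping. The MIME headers must stay,
# otherwise multipart messages can no longer be parsed.
KEEP_HEADERS = (
    "From", "To", "Cc", "Subject", "Date", "Message-ID",
    "MIME-Version", "Content-Type", "Content-Transfer-Encoding",
)


def keep_only_headers(message, keep=KEEP_HEADERS):
    """Delete every header of `message` whose name is not in `keep`."""
    wanted = {name.lower() for name in keep}
    for name in set(message.keys()):
        if name.lower() not in wanted:
            del message[name]  # removes all occurrences of that header
    return message


def strip_large_attachments(message, max_bytes):
    """Remove top-level attachments bigger than `max_bytes` from `message`.

    Returns a list of (filename, data) pairs so the caller can write them
    to separate files. Nested multipart structures are left untouched.
    """
    saved = []
    if not message.is_multipart():
        return saved
    kept = []
    for part in message.get_payload():
        filename = part.get_filename()
        # decode=True undoes base64 / quoted-printable and yields bytes
        data = part.get_payload(decode=True) if filename else None
        if data is not None and len(data) > max_bytes:
            saved.append((filename, data))
        else:
            kept.append(part)
    message.set_payload(kept)
    return saved
```

Returning the attachment data instead of writing it directly keeps the sketch free of assumptions about where the separate files should go.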
Basically, I have procmail sort my incoming mail into several mboxes. Because I am on so many lists, these mboxes tend to get really large, to the point where it is tough to even open them. So I wrote this script, which I run in a batch:
mbox_splitter.py ~/Mail/*
To download the code and view some examples, which include the output of parsing the 19M of hangout mail we have received since Feb 12th, please visit:
http://marco.metm.org/code/
Oh, and it's Python, so it is pretty slow.
-- Marco
____________________________
NYLXS: New Yorker Free Software Users Scene
Fair Use - because it's either fair use or useless....
NYLXS is a trademark of NYLXS, Inc