|FROM ||Marco Scoffier
|SUBJECT ||[hangout] mbox_splitter.py
|From owner-hangout-desteny-at-mrbrklyn.com Thu May 22 19:14:56 2003
Date: Thu, 22 May 2003 19:17:50 -0400
List: New Yorker GNU Linux Scene
Well this is the first time I have done this, and I am a little wary
about the scrutiny of wide usage. But, I wrote a small script for my
own usage that I think might be useful for many of you.
It splits the mailboxes it receives as arguments into user-defined sizes
(say -S=300000 bytes, or about 300K).
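
For anyone curious how size-based splitting can work, here is a minimal sketch using Python's standard `mailbox` module. It is not the code from mbox_splitter.py; the function name `split_mbox`, its parameters, and the numbered output naming are all made up for illustration.

```python
import mailbox

def split_mbox(src_path, max_bytes, out_prefix):
    """Copy messages from src_path into numbered mboxes of roughly max_bytes each.

    Hypothetical helper, not the original script. A single message larger
    than max_bytes still gets a file of its own.
    """
    src = mailbox.mbox(src_path, create=False)
    out_paths = []
    out = None
    used = 0
    try:
        for msg in src:
            size = len(msg.as_bytes())
            # Start a new output file when the current one would overflow.
            if out is None or (used > 0 and used + size > max_bytes):
                if out is not None:
                    out.close()
                path = f"{out_prefix}.{len(out_paths):03d}"
                # Note: an existing file at this path would be appended to.
                out = mailbox.mbox(path)
                out_paths.append(path)
                used = 0
            out.add(msg)
            used += size
    finally:
        if out is not None:
            out.close()
        src.close()
    return out_paths
```

Messages are never cut in half here, so each output file may run slightly under the limit, or over it when one message alone exceeds it.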
It also strips out attachments that are larger than a certain
size and saves them to a separate file.
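
One possible way to do the attachment stripping with the standard `email` package is sketched below. Again, this is an illustration and not the original script: `strip_large_attachments`, its arguments, and the wording of the placeholder note are assumptions.

```python
import os

def strip_large_attachments(msg, max_bytes, save_dir):
    """Save oversized attachments to save_dir and leave a short note in their place.

    Hypothetical helper. msg is an email.message.Message; it is modified
    in place. Returns the list of file paths that were written.
    """
    saved = []
    for index, part in enumerate(msg.walk()):
        if part.is_multipart():
            continue  # containers hold no data of their own
        filename = part.get_filename()
        if filename is None:
            continue  # treat parts without a filename as body text
        data = part.get_payload(decode=True) or b""
        if len(data) <= max_bytes:
            continue
        # basename() guards against path components in the filename.
        path = os.path.join(save_dir, f"{index}-{os.path.basename(filename)}")
        with open(path, "wb") as fh:
            fh.write(data)
        saved.append(path)
        # Turn the part into a plain-text placeholder.
        del part["Content-Transfer-Encoding"]
        del part["Content-Disposition"]
        part.set_payload(
            f"[attachment {filename} ({len(data)} bytes) saved separately]\n"
        )
        part.set_type("text/plain")
    return saved
```

The index prefix on the saved file name is only there to avoid collisions when two attachments in one message share a name.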
It also saves only a specific set of headers, so you can keep only the important
ones if you wish (this can reduce the size of an mbox by 20-50%).
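
Header filtering along these lines could look roughly like the sketch below. The `KEEP` list and the `keep_headers` name are assumptions chosen for the example, not the set of headers the script actually keeps.

```python
# Hypothetical default set. The MIME headers are kept so that multipart
# and encoded messages still parse correctly afterwards.
KEEP = (
    "From", "To", "Cc", "Subject", "Date", "Message-ID",
    "In-Reply-To", "References",
    "MIME-Version", "Content-Type", "Content-Transfer-Encoding",
)

def keep_headers(msg, keep=KEEP):
    """Delete every top-level header of msg whose name is not in keep.

    msg is an email.message.Message and is modified in place.
    Header names are compared case-insensitively.
    """
    wanted = {name.lower() for name in keep}
    for name in set(msg.keys()):
        if name.lower() not in wanted:
            del msg[name]  # removes all occurrences of that header
    return msg
```

On list mail most of the savings would come from dropping the `Received:` chain and the various `X-` headers, which is where the bulk of the header bytes usually sits.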
Basically I have procmail sort my incoming mail into
several mboxes. Because I am on so many lists, these mboxes tend to get
really large, to the point where it is tough to even open them. So I
wrote this script, which I run in a batch:
To download the code and view some examples which include the output of
parsing the 19M of hangout we have received since Feb 12th. please
Oh, and it's Python, so it is pretty slow.
NYLXS: New Yorker Free Software Users Scene