1. stripMime.pl
   - pulls out all lines that contain no spaces (attachments)
   - input: filename
   - output: "test"
2. bowenFiles.pl
   - pulls out message text written by Bowen
   - input: "emails-text"
   - output: "bowen"
3. manually scan the output to make sure most of the crap is gone
4. run ispell to double-check
5. makelex.pl
   - builds various lexicons taken from specific files
   - input: filename
   - output:
       "lexC" complementizers
       "lexD" determiners
       "lexJ" adjectives
       "lexN" nouns
       "lexP" prepositions
       "lexV" verbs
       "lexX" unknowns
       "lexY" adverbs
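
The filter described for stripMime.pl in step 1 can be sketched roughly as follows. This is not the original Perl script, only an illustration in Python of the stated rule (lines with no spaces are treated as attachment data). It assumes that "pulls out" means the lines are removed from the output, and that blank lines should be kept as paragraph breaks. The function names strip_attachment_lines and strip_file are hypothetical; only the default output name "test" comes from the notes above.

```python
def strip_attachment_lines(lines):
    """Return the lines that do not look like attachment data.

    A line is treated as attachment data when it is non-blank and
    contains no space character (e.g. a base64-encoded line).
    """
    kept = []
    for line in lines:
        text = line.rstrip("\n")
        # Assumption: blank lines are paragraph breaks, so keep them.
        if text.strip() == "" or " " in text:
            kept.append(line)
    return kept


def strip_file(in_path, out_path="test"):
    """Filter in_path and write the surviving lines to out_path.

    The default output name "test" follows the notes; everything else
    about file handling here is an assumption.
    """
    with open(in_path, encoding="utf-8", errors="replace") as src:
        kept = strip_attachment_lines(src)
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.writelines(kept)
    return len(kept)
```

Note that a rule this simple also drops any one-word line of real text (a lone "Thanks", for instance), which is one reason the manual scan in step 3 is still worth doing.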