Extracting e-mails to Maltego
Another recipe in this book shows how to extract e-mails from a website. This recipe shows you how to create a local Maltego transform that does the same job, so that you can run it from within Maltego itself and get the results back as e-mail address entities. It can be used in conjunction with URL spidering transforms to pull e-mails from entire websites.
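Before walking through the recipe, it may help to see the overall shape of what a local transform prints. The following is only a minimal sketch under stated assumptions, not the recipe's code: the names `SIMPLE_EMAIL`, `find_emails`, and `build_response` are hypothetical, the pattern is deliberately simpler than the one used in this recipe, the sample text is made up, and the `<Value>` element and closing tags assume the usual Maltego local transform response layout.

```python
import re
from xml.sax.saxutils import escape

# Simplified illustrative pattern (an assumption, not the recipe's regex):
# local part, "@", then a dotted host name.
SIMPLE_EMAIL = re.compile(r"[a-z0-9._%+-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+",
                          re.IGNORECASE)


def find_emails(text):
    """Return the unique addresses found in text, in order of first appearance."""
    seen = []
    for match in SIMPLE_EMAIL.findall(text):
        if match not in seen:
            seen.append(match)
    return seen


def build_response(emails):
    """Wrap each address in an EmailAddress entity inside a response envelope."""
    lines = ["<MaltegoMessage>",
             "<MaltegoTransformResponseMessage>",
             "  <Entities>"]
    for email in emails:
        lines.append('    <Entity Type="maltego.EmailAddress">')
        # Escape the value so characters such as & cannot break the XML
        lines.append("      <Value>%s</Value>" % escape(email))
        lines.append("    </Entity>")
    lines.extend(["  </Entities>",
                  "</MaltegoTransformResponseMessage>",
                  "</MaltegoMessage>"])
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample text standing in for a downloaded page
    sample = "Contact alice@example.com or bob@example.org; alice@example.com again."
    print(build_response(find_emails(sample)))
```

The sketch keeps extraction and output formatting in separate functions so that each can be checked without a network connection. It avoids version-specific syntax, so it should behave the same on Python 2 and Python 3.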
How to do it…
The following code shows how to extract e-mails from a website through the use of regular expressions:
import urllib2
import re
import sys

# The target URL is passed as the first command-line argument
tarurl = sys.argv[1]
# Download the raw page source
url = urllib2.urlopen(tarurl).read()
# Match addresses written as user@host as well as "user at host dot com"
regex = re.compile((r"([a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`"
                    r"{|}~-]+)*(@|\sat\s)(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?(\.|"
                    r"\sdot\s))+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?)"))
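# Open the response envelope that Maltego expects from a local transform
print "<MaltegoMessage>"
print "<MaltegoTransformResponseMessage>"
print " <Entities>"
# The pattern contains groups, so findall returns one tuple per match
emails = re.findall(regex, url)
for email in emails:
    print " <Entity Type=\"maltego.EmailAddress\">"...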