Workflow - GMail Phishing

phil_pearce · November 24, 2020, 8:21am

Hi,
I'm trying to write a workflow that lets a user forward an email as an attachment (EML) for checking.
(I looked at the library workflow, but I need the headers of the suspected phishing email rather than those of the sender.)

What I want to separate out are the headers in the EML, the links, and the attachments, so each can be tested.

I've installed the EML plugin and the Gmail plugin.
Gmail collects the email with no problem.

When it reaches the EML extraction step I get an error:

rapid7/EML:1.1.3. Step name: parse
'utf-8' codec can't decode byte 0xba in position 2: invalid start byte
**********
{'result': {'date': None, 'from': None, 'to': '', 'subject': None, 'body': 'i(º{Hú,y!y(¾\\u2021§z¸2r\\u2030\\u0161¶\\u059c\\u2026^¨¥±«mi^u\\u968a[\x1a¶\\u059c\\u2026_W¬n\\u2021r»M7q\\u062fv*', 'attachments': [], 'headers': []}}
None is not of type 'string'

Failed validating 'type' in schema['properties']['result']['properties']['date']:
    {'order': 1, 'title': 'Date', 'type': 'string'}

On instance['result']['date']:
    None
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/komand-1.0.1-py3.7.egg/komand/plugin.py", line 311, in handle_step
    output = self.start_step(input_message['body'], 'action', logger, log_stream, is_test, is_debug)
  File "/usr/local/lib/python3.7/site-packages/komand-1.0.1-py3.7.egg/komand/plugin.py", line 426, in start_step
    step.output.validate(output)
  File "/usr/local/lib/python3.7/site-packages/komand-1.0.1-py3.7.egg/komand/variables.py", line 79, in validate
    validate(parameters, self.schema)
  File "/usr/local/lib/python3.7/site-packages/jsonschema-2.3.0-py3.7.egg/jsonschema/validators.py", line 432, in validate
    cls(schema, *args, **kwargs).validate(instance)
  File "/usr/local/lib/python3.7/site-packages/jsonschema-2.3.0-py3.7.egg/jsonschema/validators.py", line 117, in validate
    raise error
jsonschema.exceptions.ValidationError: None is not of type 'string'

Failed validating 'type' in schema['properties']['result']['properties']['date']:
    {'order': 1, 'title': 'Date', 'type': 'string'}

On instance['result']['date']:
    None

When I download the EML file manually, it just looks like a JSON file.
Any ideas on how to progress?

Any help would be great. Has anyone else tried this?

brandon_mcclure · November 24, 2020, 1:34pm

I haven’t worked with the Gmail plugin, but here is what I do with O365 emails for our phishing workflow.

We have a phishing button added to Outlook that sends us an email with the original email as an attachment.
InsightConnect triggers on a new email in the folder.
When the workflow starts I pull out information about the submitter, then loop through Trigger.Icon Email.flattened Attached Emails.
Inside the loop I get all the information about the attached email; the headers can be found at Loop.$item.headers.

https://extensions.rapid7.com/extension/Office_365_Analysis_with_Virus_Total might be a good starting point; you just have to swap out O365 for Gmail and look out for field name changes.

joey_mcadams · November 24, 2020, 2:53pm

What you’re getting back from the Gmail plugin is a JSON representation of the email. It’s not in the EML format, so when the EML plugin tries to parse it, it blows up.
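
One way to confirm this before handing a file to an EML parser is to check what the payload actually is. The following is a minimal sketch using only the Python standard library; `classify_payload` is a hypothetical helper name, and the header names it looks for are an assumption about what a typical forwarded message carries, not something either plugin does.

```python
import json
from email import message_from_bytes


def classify_payload(raw: bytes) -> str:
    """Return 'json', 'eml', or 'unknown' for a raw attachment payload."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = None

    # A JSON object or array is a strong sign this is not an EML file.
    if text is not None:
        try:
            parsed = json.loads(text)
            if isinstance(parsed, (dict, list)):
                return "json"
        except ValueError:
            pass

    # Assumption: a real message carries at least one of these headers.
    msg = message_from_bytes(raw)
    if any(msg.get(name) for name in ("From", "Date", "Message-ID", "Received")):
        return "eml"
    return "unknown"
```

If this returns "json" for the downloaded file, the EML parse step can be skipped and the fields read from the JSON directly.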

We don’t really have a good way to take something from one of our email plugins and combine it back into an EML format. The reason it’s like that is our use cases typically revolve around taking an email apart and analyzing each part for malicious indicators. We didn’t really anticipate someone trying to deal with the email as is in the EML format.

With all that said, can you describe your use case more? We can probably come up with something to satisfy it.

Why are you trying to forward the mail back out for example? What service are you trying to get it to?

phil_pearce · November 24, 2020, 3:21pm

Hi,
No problem.
We have a shared Gmail mailbox, and we’d like users to forward their suspected phishing emails into it.
Once it's in the mailbox, I would like to check the email (URLs in the body, attachments, etc.) against services like VirusTotal, PhishTank, IsItPhishing, OpenPhish, or similar.

The trouble is that if you forward an email, you get the headers from the user's email rather than the phishing email. So the only way to preserve the headers (unless you tell me differently) is to forward as an attachment (hence the EML).

Once this is all checked, I want certain headers posted into Slack with the results, so a security engineer can look over all the collated information and make a decision (via an interactive message) on whether it's malicious or not. The workflow then replies to the user via email with the result and the engineer's decision.
Then it deletes the email from the shared mailbox.

That make sense?

joey_mcadams · November 24, 2020, 3:54pm

Yup. So you should be able to pull the headers out of the attachment with something like
[Email Step].[attachment_email].[0].[headers]

Since you know the target email is attached, you can make the assumption that the first attached email is the one you’re looking for.

In your case, on the trigger, make sure “Flatten Emails” is set to false. This way if you have attachments in attachments, they aren’t all getting shoved in the same list.
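
As a rough illustration of the "first attached email" assumption, here is a small Python sketch. The key name `attached_emails` and the sample data are placeholders, not the plugin's confirmed output; use the field names your trigger actually returns.

```python
def first_attached_headers(email_output: dict, key: str = "attached_emails") -> list:
    """Return the headers of the first attached email, or [] if there is none.

    `key` is a hypothetical field name for the list of attached emails.
    """
    attached = email_output.get(key) or []
    if not attached:
        return []
    return attached[0].get("headers") or []


# Hypothetical unflattened output: the nested attachment stays under its parent.
sample = {
    "headers": [{"name": "from", "value": "user@example.com"}],
    "attached_emails": [
        {
            "headers": [{"name": "from", "value": "phisher@example.com"}],
            "attached_emails": [
                {"headers": [{"name": "from", "value": "inner@example.com"}]}
            ],
        }
    ],
}

suspect_headers = first_attached_headers(sample)
```

The guard matters because a user may forward the email inline by mistake, in which case there is no attached email to index into.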

Then, if you want particular headers out of that group, you can use a Python step with code like this to get them:

# Sample headers as a list of name/value pairs
headers = [
    {'name': 'device_vendor', 'value': 'microsoft'},
    {'name':'from', 'value': 'bob@example.com'},
    {'name':'to','value':'rick@example.com'},
    {'name':'client', 'value':'gmail 1.0.0'}
]

# Walk the list and keep the value of the header we care about
some_value = None
for header in headers:
    if header.get("name") == "client":
        some_value = header.get("value")

print(some_value)
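
If several headers are needed at once, the same idea can be generalised. This is only a sketch: `pick_headers` is a hypothetical helper, it matches names case-insensitively, and it collects every value because headers such as Received can appear more than once.

```python
from collections import defaultdict


def pick_headers(headers, wanted):
    """Collect values for the wanted header names (case-insensitive).

    Returns a dict mapping the lower-cased header name to a list of values.
    """
    wanted_lower = {name.lower() for name in wanted}
    found = defaultdict(list)
    for header in headers:
        name = str(header.get("name", "")).lower()
        if name in wanted_lower:
            found[name].append(header.get("value"))
    return dict(found)


# Sample data in the same name/value shape as above
sample_headers = [
    {"name": "From", "value": "bob@example.com"},
    {"name": "Received", "value": "from mail.example.com"},
    {"name": "Received", "value": "from relay.example.net"},
    {"name": "client", "value": "gmail 1.0.0"},
]

picked = pick_headers(sample_headers, ["from", "received"])
```

Headers that are not present simply do not appear in the result, so check with `picked.get("from")` rather than indexing directly.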

You can also get these out using a loop step or pattern match step. But that’s going to be kind of the long way around. Less code, but more steps.

I think that’s how I’d approach the problem.

joey_mcadams · November 24, 2020, 3:57pm

And one more trick, for when you don’t care about particular headers and are only looking for malicious indicators in them (URLs, hashes, etc.):

Take the whole headers object and run it through ExtractIt to get the indicators out. It’ll save a bunch of code, and it’s much more performant.
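
For anyone who wants to see the idea in code, below is a rough approximation in plain Python. It is not ExtractIt and does not reflect how that plugin works internally; the regular expressions are simplified assumptions that only catch http(s) URLs and MD5, SHA-1, or SHA-256 length hex strings.

```python
import re

# Simplified patterns, assumed for illustration only
URL_RE = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)
HASH_RE = re.compile(r"\b(?:[a-f0-9]{64}|[a-f0-9]{40}|[a-f0-9]{32})\b", re.IGNORECASE)


def extract_indicators(headers):
    """Scan all header values and return the unique URLs and hex hashes found."""
    text = "\n".join(str(h.get("value", "")) for h in headers)
    return {
        "urls": sorted(set(URL_RE.findall(text))),
        "hashes": sorted({m.lower() for m in HASH_RE.findall(text)}),
    }


# Hypothetical header values for demonstration
demo_headers = [
    {"name": "List-Unsubscribe", "value": "<http://example.com/unsub?id=1>"},
    {"name": "X-Hash", "value": "D41D8CD98F00B204E9800998ECF8427E"},
]

indicators = extract_indicators(demo_headers)
```

In a real workflow the ExtractIt step is the better choice; a sketch like this is mainly useful for testing what the headers contain outside of InsightConnect.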