I am building a workflow that monitors RSS feeds, with the goal of pulling IOCs from websites and from GitHub so I can search my environment for the IOC domains, IPs, hashes, etc.
I am able to get the URLs of the pages that contain the IOCs from the RSS XML, but the Get URL plugin is not pulling the pages down. I have hit a wall on how to actually pull the content so that ExtractIt can extract the IOCs.
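For comparison, this is all I am trying to achieve, expressed as a plain HTTP GET in Python (the URL is a placeholder for one of the report pages):

```python
# Minimal illustration of the intent: fetch the report page body so an
# extractor can run over it.
import requests

url = "https://example.com/ioc-report"  # placeholder for one of the report pages
resp = requests.get(url, timeout=30)
resp.raise_for_status()
print(resp.text[:500])  # first 500 characters of the page body
```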
I was wondering how others are doing this, since Get URL does not appear to be the right plugin for the job.
Here are two examples of the pages that I want to extract the IOCs from:
The Get URL plugin outputs the contents of the URL base64-encoded in the “bytes” field. You have to base64-decode it and then run the Extractor plugin to extract the IOCs.
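In plain Python the decode step looks like this (the dict here is a stand-in for the step's actual output):

```python
# The "bytes" field from Get URL is base64; decode it back to page text
# before handing it to the extractor.
import base64

get_url_output = {"bytes": "PGh0bWw+Li4uPC9odG1sPg=="}  # stand-in for the step output
page_text = base64.b64decode(get_url_output["bytes"]).decode("utf-8", errors="replace")
print(page_text)  # <html>...</html>
```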
I tested this out. Here’s the output from my workflow. The Extractor plugin cannot extract the IOCs from the list because the URLs/domains in the IOC list are defanged (brackets put around the “.”, e.g. example[.]com).
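A refang pass before extraction should fix that. Here is a minimal sketch; real feeds use more defanging variants (parenthesized dots, escaped protocols, etc.) than this handles:

```python
# Re-fang defanged indicators such as hxxp://evil[.]example[.]com so the
# extractor's patterns can match them. Minimal sketch only.
import re

def refang(text: str) -> str:
    text = text.replace("[.]", ".").replace("(.)", ".").replace("[:]", ":")
    text = re.sub(r"hxxp", "http", text, flags=re.IGNORECASE)
    return text

print(refang("hxxp://evil[.]example[.]com/payload"))  # http://evil.example.com/payload
```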
Also, there seems to be an issue with the Get URL plugin: if you try to pull the same URL twice, as I was doing while testing, the plugin errors out. I looked at the code and there is some caching implemented; it might not be implemented correctly.
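I have not traced the exact code path, so this is purely a hypothetical illustration rather than the plugin's actual code, but a cache bug along these lines would reproduce both symptoms: the first call succeeds, and the second raises "expected string or bytes-like object":

```python
# Hypothetical sketch (NOT the plugin's real code): a cache hit that looks
# up the wrong key returns None, and a downstream regex call then raises
# TypeError: expected string or bytes-like object.
import re

_cache = {}

def get_file(url: str):
    if url in _cache:
        return _cache.get("contents")  # bug: wrong key, so a cache hit yields None
    body = b"page body with IOCs"
    _cache[url] = body
    return body

for _ in range(2):
    content = get_file("https://example.com/report")
    re.search(rb"IOCs", content)  # second pass: TypeError, matching the plugin error
```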
Yes, this is the workflow I am using: Get-URL → Decode → ExtractIt (for IOCs).
And yes, “rapid7/Get URL:2.0.1. Step name: get_file expected string or bytes-like object” is the error I received as well. I observed that the plugin works the first time a URL is called; the error pops up when you call the same URL a second time.
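As a stopgap while testing, and assuming the cache is keyed on the exact URL string (I have not verified that), a throwaway query parameter keeps repeated runs from colliding on the same cache entry:

```python
# Hypothetical workaround, assuming the cache key is the full URL string:
# append a unique query parameter so each test run looks like a new URL.
import time

base_url = "https://example.com/ioc-report"  # placeholder
unique_url = f"{base_url}?cachebust={int(time.time())}"  # new cache key each run
```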