I am building a workflow that monitors RSS feeds, with the goal of pulling IOCs from websites and from GitHub so I can search my environment for the IOC domains, IPs, hashes, etc.
I am able to get the URLs of the pages that contain the IOCs from the RSS XML, but the Get URL plugin is not pulling the pages down. I have hit a wall on how to actually pull the content so that ExtractIt can extract the IOCs.
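For comparison, this is all I am trying to achieve, expressed as a plain HTTP GET in Python (the URL is a placeholder for one of the report pages):

```python
# Minimal illustration of the intent: fetch the report page body so an
# extractor can run over it.
import requests

url = "https://example.com/ioc-report"  # placeholder for one of the report pages
resp = requests.get(url, timeout=30)
resp.raise_for_status()
print(resp.text[:500])  # first 500 characters of the page body
```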
I was wondering how others are doing this, since Get URL does not appear to be the right plugin for the job.
Here are two examples of the pages that I want to extract the IOCs from:
The Get URL plugin outputs the contents of the URL base64-encoded in the “bytes” field. You have to base64-decode it and then run the Extractor plugin to extract the IOCs.
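In plain Python the decode step looks like this (the dict here is a stand-in for the step's actual output):

```python
# The "bytes" field from Get URL is base64; decode it back to page text
# before handing it to the extractor.
import base64

get_url_output = {"bytes": "PGh0bWw+Li4uPC9odG1sPg=="}  # stand-in for the step output
page_text = base64.b64decode(get_url_output["bytes"]).decode("utf-8", errors="replace")
print(page_text)  # <html>...</html>
```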
I tested this out. Here’s the output from my workflow. The Extractor plugin cannot extract the IOCs from the list because the URLs/domains in the IOC list are defanged (brackets put around the “.”, e.g. example[.]com).
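A refang pass before extraction should fix that. Here is a minimal sketch; real feeds use more defanging variants (parenthesized dots, escaped protocols, etc.) than this handles:

```python
# Re-fang defanged indicators such as hxxp://evil[.]example[.]com so the
# extractor's patterns can match them. Minimal sketch only.
import re

def refang(text: str) -> str:
    text = text.replace("[.]", ".").replace("(.)", ".").replace("[:]", ":")
    text = re.sub(r"hxxp", "http", text, flags=re.IGNORECASE)
    return text

print(refang("hxxp://evil[.]example[.]com/payload"))  # http://evil.example.com/payload
```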
Also, there seems to be an issue with the Get URL plugin: if you try to pull the same URL twice, as I was doing while testing, the plugin errors out. I looked at the code and there is some caching implemented; it might not be implemented correctly.
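I have not traced the exact code path, so this is purely a hypothetical illustration rather than the plugin's actual code, but a cache bug along these lines would reproduce both symptoms: the first call succeeds, and the second raises "expected string or bytes-like object":

```python
# Hypothetical sketch (NOT the plugin's real code): a cache hit that looks
# up the wrong key returns None, and a downstream regex call then raises
# TypeError: expected string or bytes-like object.
import re

_cache = {}

def get_file(url: str):
    if url in _cache:
        return _cache.get("contents")  # bug: wrong key, so a cache hit yields None
    body = b"page body with IOCs"
    _cache[url] = body
    return body

for _ in range(2):
    content = get_file("https://example.com/report")
    re.search(rb"IOCs", content)  # second pass: TypeError, matching the plugin error
```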
Yes, this is the workflow I am using: Get-URL → Decode → ExtractIt (for IOCs).
And yes, “rapid7/Get URL:2.0.1. Step name: get_file expected string or bytes-like object” is the error I received as well. I observed that the plugin works the first time a URL is called; the error pops up when you call the same URL a second time.
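As a stopgap while testing, and assuming the cache is keyed on the exact URL string (I have not verified that), a throwaway query parameter keeps repeated runs from colliding on the same cache entry:

```python
# Hypothetical workaround, assuming the cache key is the full URL string:
# append a unique query parameter so each test run looks like a new URL.
import time

base_url = "https://example.com/ioc-report"  # placeholder
unique_url = f"{base_url}?cachebust={int(time.time())}"  # new cache key each run
```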