Advice on wrangling the results from a SQL query

Greetings,

Sorry for the convoluted questions. I am not well versed with APIs or SQL so this may be a really backwards way to do things. Basically, I am trying to create a report that gives me the same (or similar) data that I get from the vulnerabilities section in the InsightVM portal.

So pretty basic: Title, CVSS, Risk, Date, Severity, and Instances.

What I have so far is a custom SQL query report with the following query:

SELECT Title, malware_kits AS Malware, exploits AS Exploits, round(cvss_score::numeric, 2) AS CVSS, round(cvss_v3_score::numeric, 2) AS CVSSv3, date_published AS PublishedOn, date_modified AS Modified, severity, affected_assets AS Instances, nexpose_id
FROM dim_vulnerability
JOIN fact_vulnerability fv USING (vulnerability_id)
WHERE fv.vulnerability_instances > 0
ORDER BY modified DESC

Now this does most of the job here, but this is where I get lost in the weeds a bit.

  1. When running the report via an API call, it dumps all of the data in a block and it is hard to manipulate from that point. It comes back as one large block of CSV.

I am using GET /api/3/reports/{id}/history/{instance}/output. I have also tried GET /api/3/reports but that did not seem to give me anything aside from the report formats.

Is there a better way to query this specific data for use in an API?
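For context, here is roughly the same call sketched in Python instead of Postman; the console address, credentials, and IDs below are placeholders, not real values:

```python
# Minimal sketch of the report-output call; everything here is a placeholder.

BASE_URL = "https://your-console:3780/api/3"  # placeholder console address

def output_url(report_id, instance="latest"):
    """Path for GET /api/3/reports/{id}/history/{instance}/output."""
    return f"{BASE_URL}/reports/{report_id}/history/{instance}/output"

def download_report_output(report_id, instance="latest",
                           auth=("user", "pass"), verify=True):
    """Fetch the raw report output; a SQL Query Export comes back as CSV text."""
    import requests  # third-party; pip install requests
    resp = requests.get(output_url(report_id, instance), auth=auth, verify=verify)
    resp.raise_for_status()
    return resp.text
```

Passing `"latest"` as the instance always grabs the most recent generation of the report.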

  2. When I compare the data that is in this report, it differs slightly from what I can see in the Vulnerabilities section in the console. There are just a handful of items that do not match up, and it's kinda weird. For example, there may be 1 or 2 items on MY report that are not in the vulns section.

Any advice/critiques are appreciated.

Thanks,
Aaron


I think your approach is fine. I ran the same test, and the same report pulled via both methods gave me the same results (number of columns and rows), as I expected.

I saved the contents to a csv and was able to open it in OpenOffice and work with it as needed. What exactly are you trying to do with the data that isn’t working?


John,

Thanks for the reply. I wanted to make sure this method was a good way to pull data similar to what we see in the vulnerabilities portal. Seeing the small discrepancy made me question if my method would work. Thank you for testing it out.

I guess my end goal was to have a way to export this data so team members can review and make notes (something that is oddly not an option within the portal). Previously, our team used Nessus and would get a weekly vulnerability export in an Excel document, then make changes or add notes for escalation if needed.

I want to be able to export the data via an API and move it into a SharePoint list or Power BI report so we can follow a similar process instead of having tons of Excel documents floating around. I am still working on that part, but I wanted to post in the forums to see if folks had a more elegant way of doing this.

Anyway, thanks again for your help.

Typically speaking, what I see in this type of scenario is that customers export the scan results to the Data Warehouse, which is a read-only database, and then hook Power BI up to it. I think that's probably the easiest and most straightforward method; however, it does involve standing up and maintaining a separate server just for the Data Warehouse.

I have also seen customers use Power BI to query the API as you mentioned, which worked fine for what they were doing. However, they were querying the Cloud Integrations API and only pulling the risk score per asset.

I think though that you could use Power BI to pull the report in the fashion that you are currently trying to do. Have Power BI pull the results into a Table and then build your dashboard cards off of that Table.

Until you’re able to get Power BI up and running, I think your best bet would be to save the results as a CSV to SharePoint like you mentioned and have everyone edit from there, avoiding the duplicate copies floating around.

Also, out of curiosity can you elaborate on what you would be looking for to make notes/comments on vulnerabilities within InsightVM?

Would Tags accomplish what you’re trying to do?

I would like to work my way up to using a Data Warehouse at some point, but for now we are all trying to become more familiar with the platform overall before we start to get fancy with the tools.

I’ll try and explain our process better haha. Myself and our sysadmin review the vulnerabilities from the InsightVM console.


From there we review the alerts and remediate what we can (for example, certain items are handled by our team leads). So, if there is a particular vulnerability that we either cannot resolve or that requires elevation/access we do not have, we notate it so that when the higher-ups meet, they can review what we have done and take it from there.

Essentially just logging our progress. Which is what they used to do with the excel reports from our previous solution.

It’s a little backwards in my mind but I am low on the list of making recommendations to a process.

I suppose creating tags for “completed” or “needs review” would make sense.

Ideally, I want to just pull the data via the API and use something like Power Automate to create a flow that adds this into a SharePoint list. Still noodling around with the API though. But I guess that could be another post/topic for another time haha.

Hopefully that makes sense. I feel like we are doing things a little bit backwards.

Ahh ok, I don’t think Tags would get you what you’re looking for then because Tags are applied to assets and not specific vulnerabilities.

I would suggest using Remediation Projects which may help. You can set a status on a solution, however the solution may apply to several Vulnerabilities/Assets so there may still be some inconsistencies with what you’re looking for.

The ability you’re looking for, making comments per vulnerability on specific assets, just doesn’t exist in the tool yet, but it’s worth putting in an IDEA ticket for. I’m not entirely sure what process the DEV team would need to go through to include that, because it would need a new database entry. But either way, if you put in an IDEA ticket, it very possibly could be something they put on the roadmap to get implemented.

Good deal. Thanks for the heads up. I will send one in just in case. Granted, this feels like an ultra-specific use case, but I’ll give it a go.

Not to burden you with any more questions, but for API related help would it be best to do another post? I am a bit confused on whether I need to be using the Report Download, Report, or Reports call. Report Download matches the data, but I am sure there is a better way. Unfortunately, I am also unskilled with APIs in general.

Anyway, I appreciate the help. Thank you!

Nah, this topic is fine to continue the thread in. The Report Download is what you want to use for your use case; given that you’re trying to pull specific fields from the reporting data model, you would need to pull from this. The other two endpoints you mentioned don’t pull the actual results of the report, just the metadata.

Depending on what workflow works best for you, you may want to look into POST Reports, which you could use to create a new report via the API with a new SQL query. Or you could just go into the UI and create the report, which honestly, for validation, may be easier.

If you decide to use the API to create the new report, use the ID returned to kick off a POST Report Generation, and then you can just use GET Report Download and specify “latest” as the instance to always get the results from the most recent generation of the report.
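That create/generate/download flow can be sketched roughly like this in Python; the endpoint paths follow the InsightVM v3 API, while the console address and IDs are placeholders. Error handling is minimal and this is a sketch, not a drop-in script:

```python
# Rough sketch of the workflow: create the report once (via API or UI),
# then kick off a generation and download the latest instance each run.

BASE = "https://your-console:3780/api/3"  # placeholder console URL

def report_paths(report_id):
    """Build the endpoint paths used in the create/generate/download flow."""
    return {
        "create": f"{BASE}/reports",
        "generate": f"{BASE}/reports/{report_id}/generate",
        "latest_output": f"{BASE}/reports/{report_id}/history/latest/output",
    }

def run_and_download(session, report_id):
    """POST a generation, then GET the newest output (CSV text).

    Note: generation is asynchronous, so in practice you may need to wait
    or poll the instance status before the "latest" output is ready.
    """
    paths = report_paths(report_id)
    session.post(paths["generate"]).raise_for_status()
    resp = session.get(paths["latest_output"])
    resp.raise_for_status()
    return resp.text
```

Here `session` would be something like a `requests.Session` with your console credentials attached.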


Awesome! Thank you for the advice. I was reading about the POST report, and I was kinda leaning towards that way.

I’ll give that a shot. Since I am calling the report download and it is only getting metadata, that might be why the formatting is a little goofy.

Thanks again, I will give POST a shot.

Just in case you wanted to see what I was trying. Using Postman I listed the following in the body:

{
    "name": "SQL Report API Test",
    "version":"1.2.0",
    "format": "sql-query",
    "query": "SELECT Title, malware_kits as Malware, exploits as Exploits, round(cvss_score::numeric, 2) AS CVSS, round(cvss_v3_score::numeric, 2) AS CVSSv3, date_published as PublishedOn, date_modified as Modified, severity, affected_assets as Instances, nexpose_id FROM dim_vulnerability JOIN fact_vulnerability fv USING (vulnerability_id) WHERE fv.vulnerability_instances > 0 ORDER BY modified DESC"
}

The response message is:

{
  "status": 400,
  "message": "The specified query is invalid.\n\nERROR: column \"cvss_v3_score\" does not exist\nColumn: 107\nState: undefined_column",
  "links": [ {
    "href": "serverURL/api/3/reports/",
    "rel": "self"
  } ]
}

Actually, I jumped the gun a bit. I fixed it but when posting it just created another report in the console similar to the one I created manually.

Now I am just trying to format the results better.

Ahh ok good deal, I was confused there because that column certainly exists. You should be able to convert the csv to json if that makes it easier to manipulate for what you’re doing.
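That CSV-to-JSON conversion can be done in a few lines of Python; a minimal sketch, where the column names simply follow whatever header row the SQL query produces:

```python
# Minimal sketch: convert CSV report output (header row first) to JSON.
import csv
import io
import json

def csv_to_json(csv_text):
    """Parse CSV text into a JSON array of objects keyed by the header row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)
```

`csv.DictReader` keys each row by the header line automatically, so nothing has to be specified by hand.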


Thank you for all your help and pointers. I have some follow up questions if you don’t mind. Or I can create a new topic.

I found this post Extracting Bulk Data with the InsightVM Console API - InsightVM / API - Rapid7 Discuss and it helped give me a better idea of generating, gathering data, and then deleting the report.

My struggle currently is the responses I am getting based off my SQL query. I am not sure if it pertains to my limited understanding of APIs, or if I am going about the report generation wrong.

I am using Postman to GET the generated report from a SQL query, but the responses are formatted like a CSV. I am not sure if I have to configure the body of the request to tie things together.

This is my return after the report generation, then report download. The results list the column info in the very first row, then the remaining content is just dumped out. My thought was I had to specify in the body what fields like “Title” or “malware” correspond to. That is where I am stuck haha.

title,malware,exploits,cvss,cvssv3,publishedon,modified,severity,instances,nexpose_id
ICMP timestamp response,0,0,0.00,,1997-08-01,2023-01-31,Moderate,184,generic-icmp-timestamp
Microsoft Edge Chromium: CVE-2023-0474 Use after free in GuestView,0,0,4.40,,2023-01-27,2023-01-30,Severe,1,microsoft-edge-cve-2023-0474

Anyway, this is more free form thought. I appreciate your help. I will keep plugging away at this.

So the response is going to be formatted as a CSV as that is the output format of an SQL Query. That’s easier to see when looking at the actual report template within IVM that you are pulling.

In the API, when pulling the results of the SQL Query you don’t specify anything. The only time you would specify anything within the SQL would be when you create the report (if done through API) which I really don’t suggest.

As far as working with those results that you have now, what are you trying to do with them? Are you trying to put them into a PowerBI dashboard, send it out as a report, use for ticketing?


That makes sense. So really, the best method for reporting with the API is to generate a report you have already created within the portal, then download the updated contents via the API.

Currently, I just get the contents in a CSV, add a few things, then send it to the team. My goal is to feed the data from the API into a Power Automate flow that will update a SharePoint list.

It’s good to know that any SQL export will always be in CSV format, because I thought I was going insane.

Thank you for the explanation.

I think I figured it all out. Well, for the most part. Had to sidestep a few things, but now I can export the data into a SharePoint list.

It isn’t pretty but it’s working. Thanks for all your help.
