SQL query equivalents of these API calls

hi! I am looking for the SQL-query equivalents of these simple Rapid7-API calls.

Could you please assist with this? As they are not as obvious as they seem:) thanks!




Query 1)

asset.groups

IN

[‘class - windows - wintel team’]

vulnerability.categories

IN

[‘microsoft’]

vulnerability.categories

NOT IN

[‘microsoft sql server’, ‘microsoft .net framework’, ‘microsoft asp’, ‘microsoft exchange’, ‘microsoft sharepoint’, ‘microsoft asp.net’, ‘microsoft iis’, ‘microsoft msxml’, ‘microsoft xml’]




Query 2)

asset.groups

IN

[‘class - aix - linux - unix team’]

vulnerability.categories

IN

[‘red hat enterprise linux’, ‘centos’, ‘suse’, ‘red hat’, ‘ibm aix’, ‘snmp’, ‘ssh’, ‘samba’]

–bump–

What type of data are you looking to retrieve with these SQL-query equivalents? For instance, your queries seem to address asset groups that have specific vulnerabilities categories, while including / excluding certain groups. Is there specific asset + asset group data you’d like to retrieve?

Here’s an example query relating to asset groups, which could be a good starting point here:

https://docs.rapid7.com/insightvm/sql-example-asset-groups/

Hi @holly_wilsey thanks for your response.

What we are trying to do is to retrieve the data sets behind a couple of our visual widgets (‘Cards’) in Exposure Analytics, by using plain SQL. However, this seems to be quite an effort as the data models between our PostgreSQL database and the ‘exposure-analytics.insight.rapid7.com’ visuals appear to be (very) different to each other. Can you please confirm whether this is indeed the case?

The major blocker is that there doesn’t seem to be a usable database column representing the equivalent of vulnerability.categories in the API. Is this assumption correct? We need to filter this column on partial text strings (wildcards), so pinpointing us to vulnerability.categories in PostgreSQL would be a great step forward.

FYI, we have also logged this issue with R7 support under case number 01391643. Any suggestions on how to speed up the handling of this case?

Any help to finally get around this problem, would be very… helpful!

That’s a good question. I’m not 100% sure on this one, as I’d need to double check with the team on what data is being used in those visuals and how exactly it’s being pulled on the backend. For what it’s worth, I just scrolled through some example dashboard cards in my lab instance, and they largely appear to use data that I’ve seen used in other SQL queries. I know that doesn’t answer 100% regarding the data models aligning, but just an observation.

One thing I will say is that there are slightly differing data models between the console and the data warehouse, if you happen to be using it. We’ve got some more documentation on the reporting data model here and here, and you can check out the data warehouse schema here:

https://help.rapid7.com/nexpose/en-us/warehouse/warehouse-schema.html

You’re looking to filter with partial strings on the vulnerability category name - is that correct?

We do have the dim_vulnerability_category table, which you can use to identify associated vuln categories, as well as the category name. You can see the table here:

https://docs.rapid7.com/insightvm/understanding-the-reporting-data-model-dimensions#dim_vulnerability_category

SQL can be a bit of a tricky one regarding support cases, because we can’t always support every level of custom query requests that we receive. That said, hopefully we can at least partially help you get what you’re looking for here on the forum.

Hi @holly_wilsey thanks, we have indeed queried ‘dim_vulnerability_category’ extensively without the intended result.

Please allow me to elaborate on the intended result: we need to create a PostgreSQL query which returns the exact same results as the attached widget in the insightVM console. Several request have been logged with your internal tech support, without the desired response.

unix_os

unix_os_query

As can be seen in this screen prints, the Unix OS vulnerability count is 5130 right now. The query statement has been attached as well for your convenience.

Could you please let us know the equivalent SQL query behind this count? Or point us to an internal resource that can help?

Is there anything in particular that wasn’t working with this table? If you have an existing query that shows any errors you’re seeing with this, that would be a good jumping off point here.

That said, here’s another query that could be good to start with and update based on what you’re looking for. It’s difficult to provide a query that’s guaranteed to be 100% identical to the vuln findings card due to the backend info needed for that, but the below query pulls data on findings based on the vuln categories + asset groups from your query builder screenshot.

select da.host_name, da.ip_address, dv.title, dv.severity,
       dv.date_published, dv.date_modified, dag.name as "Asset Group",
       dvc.category_name
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join dim_vulnerability dv using(vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where lower(dvc.category_name) in ('red hat enterprise linux',
      'centos', 'suse', 'red hat', 'ibm aix', 'snmp', 'ssh', 'samba')
and dag.name in ('class - aix - linux - unix team')
group by da.host_name, da.ip_address, dv.title, dv.severity,
         dv.date_published, dv.date_modified, dag.name, dvc.category_name

You would update the asset group and category names in the where clause to be whatever you want. If you’re only looking for a count of these vuln findings, then you would just remove the group by at the end and update the select statement to be something like:

select count(favf.vulnerability_id)

Thank you very much @holly_wilsey for your suggestions.
After a minor rewrite we can confirm that the counts from PostgreSQL are now very close to those reported in insightVM. Please find the amended queries below:

–windows os–
select count(favf.vulnerability_id)
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join fact_asset_date fa using (asset_id)
join dim_vulnerability dv using(vulnerability_id)
join fact_vulnerability fv using (vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where lower(dvc.category_name) in (‘microsoft’)
and lower(dvc.category_name) not in (‘microsoft sql server’, ‘microsoft .net framework’, ‘microsoft asp’, ‘microsoft exchange’, ‘microsoft sharepoint’, ‘microsoft asp.net’, ‘microsoft iis’, ‘microsoft msxml’, ‘microsoft xml’)
and dag.name = ‘Class - Windows - Wintel Team’
and fa.day IN (CURRENT_DATE-1)
and COALESCE(cvss_v3_score, cvss_score) < 9

–windows os critical–
select count(favf.vulnerability_id)
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join fact_asset_date fa using (asset_id)
join dim_vulnerability dv using(vulnerability_id)
join fact_vulnerability fv using (vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where lower(dvc.category_name) in (‘microsoft’)
and lower(dvc.category_name) not in (‘microsoft sql server’, ‘microsoft .net framework’, ‘microsoft asp’, ‘microsoft exchange’, ‘microsoft sharepoint’, ‘microsoft asp.net’, ‘microsoft iis’)
and dag.name = ‘Class - Windows - Wintel Team’
and fa.day IN (CURRENT_DATE-1)
and COALESCE(cvss_v3_score, cvss_score) >= 9

–unix os–
select count(favf.vulnerability_id)
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join fact_asset_date fa using (asset_id)
join dim_vulnerability dv using(vulnerability_id)
join fact_vulnerability fv using (vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where lower(dvc.category_name) in (‘red hat enterprise linux’,
‘centos’, ‘suse’, ‘ibm aix’, ‘snmp’, ‘ssh’, ‘samba’)
and dag.name = ‘Class - AIX - Linux - Unix Team’
and fa.day IN (CURRENT_DATE-1)
and COALESCE(cvss_v3_score, cvss_score) < 9

–unix os critical–
select count(favf.vulnerability_id)
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join fact_asset_date fa using (asset_id)
join dim_vulnerability dv using(vulnerability_id)
join fact_vulnerability fv using (vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where lower(dvc.category_name) in (‘red hat enterprise linux’,
‘centos’, ‘suse’, ‘ibm aix’, ‘snmp’, ‘ssh’, ‘samba’)
and dag.name = ‘Class - AIX - Linux - Unix Team’
and fa.day IN (CURRENT_DATE-1)
and COALESCE(cvss_v3_score, cvss_score) >= 9

The only queries where we still see a (large) variance between PostgreSQL and insightVM are the so-called ‘3rd parties’. Please find the insightVM screen prints attached and the queries as follows:

–windows 3rd–
select count(favf.vulnerability_id)
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join fact_asset_date fa using (asset_id)
join dim_vulnerability dv using(vulnerability_id)
join fact_vulnerability fv using (vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where lower(dvc.category_name) not in (‘microsoft’)
or lower(dvc.category_name) in (‘microsoft sql server’, ‘microsoft .net framework’, ‘microsoft asp’, ‘microsoft exchange’, ‘microsoft sharepoint’, ‘microsoft asp.net’, ‘microsoft iis’, ‘microsoft msxml’, ‘microsoft xml’)
and dag.name = ‘Class - Windows - Wintel Team’
and fa.day IN (CURRENT_DATE-1)
and COALESCE(cvss_v3_score, cvss_score) < 9

–unix 3rd–
select count(favf.vulnerability_id)
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join fact_asset_date fa using (asset_id)
join dim_vulnerability dv using(vulnerability_id)
join fact_vulnerability fv using (vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where lower(dvc.category_name) not in (‘red hat enterprise linux’,
‘centos’, ‘suse’, ‘red hat’, ‘ibm aix’, ‘snmp’, ‘ssh’, ‘samba’)
and dag.name = ‘Class - AIX - Linux - Unix Team’
and fa.day IN (CURRENT_DATE-1)
and COALESCE(cvss_v3_score, cvss_score) < 9

FYI, PostgreSQL is reporting many more 3rd party vulnerabilities than insightVM and it is unclear why. Could you please have a look at these last two queries to help us determine why we end up with such a difference in counts? Could there maybe be some syntax issues with ‘NOT IN’ and ‘OR’ ( ‘||’) ?

win_3rd

win3rd_widget

unix_3rd

unix3rd_widget

Hi @holly_wilsey any chance we can get a final response to this? Appreciate your efforts!

For the Windows third party query, one thing I would do is add parentheses around the first part of the where clause to group those two vuln category statements together. SQL can do some weird things when it comes to AND vs OR priorities. So you could try like this:

where (lower(dvc.category_name) not in (‘microsoft’)
or lower(dvc.category_name) in (‘microsoft sql server’, ‘microsoft .net framework’, 
‘microsoft asp’, ‘microsoft exchange’, ‘microsoft sharepoint’, ‘microsoft asp.net’, 
‘microsoft iis’, ‘microsoft msxml’, ‘microsoft xml’))

I also noticed that your SQL has two extra category names listed, compared to the query in the builder - microsoft msxml and microsoft xml. If you have the same category names in both the builder and the SQL, that could help align the numbers, as well.

The Unix third party one looks alright as far as SQL goes. How much of a discrepancy is there between that and the query builder count? I’m wondering if the difference would decrease if in the SQL you explicitly stated which vuln categories you want to include, rather than exclude. It might take some experimenting to figure out which vulnerabilities are being included in the SQL, and why they’re not shown in the case of the query builder.

1 Like

Thank you @holly_wilsey for your continued support. It is much appreciated. We will test the ‘include’ method as you suggest, and will report back to you in case of variances.

FYI, for both 3rd Parties it appears that the count in PostgreSQL is about three times as high as the count in the insightVM widgets. When running the below PSQL queries (counts divided by 3), we get an approximate count similar to those in insightVM. If you have any ideas why this could be the case, we’d be keen to hear!

–windows 3rd–
select count(favf.vulnerability_id)/3
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join fact_asset_date fa using (asset_id)
join dim_vulnerability dv using(vulnerability_id)
join fact_vulnerability fv using (vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where (lower(dvc.category_name) not in (‘microsoft’)
or lower(dvc.category_name) in (‘microsoft sql server’, ‘microsoft .net framework’, ‘microsoft asp’, ‘microsoft exchange’, ‘microsoft sharepoint’, ‘microsoft asp.net’, ‘microsoft iis’))
and dag.name = ‘Class - Windows - Wintel Team’
and fa.day IN (CURRENT_DATE-1)
and COALESCE(cvss_v3_score, cvss_score) < 9

–unix 3rd–
select count(favf.vulnerability_id)/3
from fact_asset_vulnerability_finding favf
join dim_asset da using(asset_id)
join fact_asset_date fa using (asset_id)
join dim_vulnerability dv using(vulnerability_id)
join fact_vulnerability fv using (vulnerability_id)
join dim_asset_group_asset daga using(asset_id)
join dim_asset_group dag using(asset_group_id)
join dim_vulnerability_category dvc using(vulnerability_id)
where lower(dvc.category_name) not in (‘red hat enterprise linux’,
‘centos’, ‘suse’, ‘red hat’, ‘ibm aix’, ‘snmp’, ‘ssh’, ‘samba’)
and dag.name = ‘Class - AIX - Linux - Unix Team’
and fa.day IN (CURRENT_DATE-1)
and COALESCE(cvss_v3_score, cvss_score) < 9

–update: literally including the categories does unfortunately not resolve the issue–

Interesting :thinking: That’s a bit tricky to figure out without seeing all the data in your environment. One of the things you could do to try to further troubleshoot is look at the data from several vulnerabilities that are included in the results of the SQL query, but are NOT included in the results of the query builder. To get that data, you could use the “select” statement from the suggested query that I originally posted and combine it with your own updated queries.

The goal would be to determine what differences exist between those vulnerabilities, and figure out what about them would result in the SQL including these particular vulnerabilities, but would lead the query builder to not include them. You’d also want to ensure the SQL query isn’t returning duplicate rows for these vulnerabilities (and it may help to add a “distinct” keyword to your “select” statements). I know that’s a bit of a manual process, but at the end of the day it’s difficult to correlate this type of data without seeing the queries that are used on the backend in the builder.

1 Like

I get an error when trying to validate this, did you ever get it to work?