SQL Aggregate Help

My team and I are trying to create a report that includes all IP’s that have the same vulnerability, solution, and URL together. Below is our script, however, we can’t seem to get the groupings to work. Can someone please assist? We are converting the IVM scoring of cvssv2 to Nessus scoring. Hopefully that isn’t getting in the way of this query. Thank you in advance!

SELECT
       STRING_AGG(DISTINCT(tul.ip_address), ', ' Order by tul.ip_address) AS ip_address,
       STRING_AGG(DISTINCT(vuln.title), ', ' Order by vuln.title) AS vulnerability_description,
       CAST(vuln.cvss_score AS DECIMAL(10,1)) AS cvssv2_score,
       CASE WHEN vuln.cvss_score = 10 THEN 'Critical'
            WHEN vuln.cvss_score BETWEEN 7 AND 9.9 THEN 'High'
            WHEN vuln.cvss_score BETWEEN 4.0 AND 6.9 THEN 'Medium'
            WHEN vuln.cvss_score BETWEEN 0.1 and 3.9 THEN 'Low'
       ELSE 'Info'
       END AS severity, 
       htmltotext(sol.fix) AS solution,
       STRING_AGG(DISTINCT(sol.url), ', ' Order by sol.url) as solution_url
       FROM fact_asset_vulnerability_finding AS avf
       JOIN dim_asset AS tul ON avf.asset_id = tul.asset_id
       JOIN dim_vulnerability AS vuln ON avf.vulnerability_id = vuln.vulnerability_id
       JOIN dim_vulnerability_solution AS vs ON avf.vulnerability_id = vs.vulnerability_id
       JOIN dim_solution AS sol ON vs.solution_id = sol.solution_id
       WHERE vuln.cvss_score >= 7
       GROUP by vuln.title, sol.fix, tul.ip_address, vuln.cvss_score, sol.url
       ORDER by vuln.cvss_score DESC

If I’m not mistaken, I think you’ll get what you’re looking for by removing the vuln.title and vuln.cvss_score from the GROUP BY statement. Column’s that are part of an aggregate function don’t need to be included in the GROUP BY statement.

If that doesn’t get you what you’re looking for let me know and we can dig into it more.

1 Like

Thanks, @john_hartman. I removed those fields, but I’m getting an error stating that I need vuln.cvss_score in the GROUP BY clause. I’m not great on the syntax of STRING_AGG + CAST for vuln.cvss.score. Is this even possible to do?

STRING_AGG(CAST(vuln.cvss_score AS DECIMAL(10,1)), ', ' Order by vuln.cvss_score) AS cvssv2_score

Error:

function string_agg(numeric, unknown) does not exist

My apologies, I told you to remove the wrong one. I meant to say you could remove vuln.title and tul.ip_address from the GROUP BY. Leave the vuln.cvss_score in there.

This is an example of the output I got in my lab.

Screen Shot 2022-08-04 at 2.33.25 PM

That works great. Is there a way to do STRING_AGG, DISTINCT, and HTMLtoText for solution?

Yea that should work too.

SELECT
STRING_AGG(DISTINCT(tul.ip_address), ', ' Order by tul.ip_address) AS ip_address,
STRING_AGG(DISTINCT(vuln.title), ', ' Order by vuln.title) AS vulnerability_description,
CAST(vuln.cvss_score AS DECIMAL(10,1)) AS cvssv2_score,
CASE WHEN vuln.cvss_score = 10 THEN 'Critical'
  WHEN vuln.cvss_score BETWEEN 7 AND 9.9 THEN 'High'
  WHEN vuln.cvss_score BETWEEN 4.0 AND 6.9 THEN 'Medium'
  WHEN vuln.cvss_score BETWEEN 0.1 and 3.9 THEN 'Low'
ELSE 'Info'
END AS severity, 
STRING_AGG(DISTINCT(htmltotext(sol.fix)), ', ' Order by htmltotext(sol.fix)) AS solution,
STRING_AGG(DISTINCT(sol.url), ', ' Order by sol.url) as solution_url


FROM fact_asset_vulnerability_finding AS avf

JOIN dim_asset AS tul ON avf.asset_id = tul.asset_id
JOIN dim_vulnerability AS vuln ON avf.vulnerability_id = vuln.vulnerability_id
JOIN dim_vulnerability_solution AS vs ON avf.vulnerability_id = vs.vulnerability_id
JOIN dim_solution AS sol ON vs.solution_id = sol.solution_id

WHERE vuln.cvss_score >= 7

GROUP by vuln.cvss_score, sol.fix, sol.url

ORDER by vuln.cvss_score DESC

So it’s really just making the solution line like this:

STRING_AGG(DISTINCT(htmltotext(sol.fix)), ', ' Order by htmltotext(sol.fix)) AS solution,

I’m not entire sure how different the results would actually look though as I have a very small lab to compare against but I would assume it’s going to give you what you want.

@john_hartman that did it! I had it almost complete before you messaged back. Didn’t realize I had to add htmltotext in the ‘Order By’ portion of the STRING_AGG function. That’s what kept throwing me off.

Do you know if there is a way to select the top solution, like the most recommended solution from the table dim_solution? My multiple advisories (Red Hat) vulnerabilities group all of the fixes into one cell for say, python3. Is there a way to include only the top solution, like say “Update python to the latest version available from Red Hat, using tools like yum or up2date.”? Because that basically trumps all of the other solutions. Just curious if this is possible with that table or another table. If not, no biggy.

Yea you could use the “dim_asset_vulnerability_best_solution” table to get the rollup solutions. It should be as simple as swapping out “dim_solution” for that other table and changing the select. If you need help i can try and build it out for you tomorrow.

https://docs.rapid7.com/insightvm/understanding-the-reporting-data-model-dimensions/#dim_asset_vulnerability_best_solution

Sweet! Yeah, that would be fantastic. I’m trying to find that table in the data warehouse schema, but it doesn’t appear to be available. Any idea what fields are selectable?

I think in the data warehouse schema it would be the dim_solution_highest_supercedence would be the table you’re looking for.
Screen Shot 2022-08-04 at 9.07.08 PM

@john_hartman,

These three joins seemed to get what I needed. Thank you for assisting with the “best solution” table:

   JOIN dim_vulnerability AS vuln ON avf.vulnerability_id = vuln.vulnerability_id
   JOIN dim_asset_vulnerability_best_solution AS vs ON avf.vulnerability_id = vs.vulnerability_id
   JOIN dim_solution AS sol ON vs.solution_id = sol.solution_id

Below is the query used. Is there anything in here that you recommend I change?

SELECT
       STRING_AGG(DISTINCT(tul.ip_address), ', ' Order by tul.ip_address) AS ip_address,
       STRING_AGG(DISTINCT(vuln.title), ', ' Order by vuln.title) AS vulnerability_description,
       
       -- Convert CVSSv2 rating to decimal with 1 digit after the decimal
       CAST(vuln.cvss_score AS DECIMAL(10,1)) AS cvssv2_score,

       -- Below is the rating system for Nessus based off of CVSSv2
       CASE WHEN vuln.cvss_score IS NULL THEN 'N/A'
            WHEN vuln.cvss_score = 10 THEN 'Critical'
            WHEN vuln.cvss_score BETWEEN 7 AND 9.9 THEN 'High'
            WHEN vuln.cvss_score BETWEEN 4.0 AND 6.9 THEN 'Medium'
            WHEN vuln.cvss_score BETWEEN 0.1 and 3.9 THEN 'Low'
       ELSE 'Info'
       END AS severity_v2, 


      -- Convert CVSSv3 rating to decimal with 1 digit after the decimal
      CAST(vuln.cvss_v3_score AS DECIMAL(10,1)) AS cvssv3_score,


       -- Below is the rating system for Nessus based off of CVSSv3
       CASE WHEN vuln.cvss_v3_score IS NULL THEN 'N/A'
            WHEN vuln.cvss_v3_score BETWEEN 9.0 AND 10.0 THEN 'Critical'
            WHEN vuln.cvss_v3_score BETWEEN 7 AND 8.9 THEN 'High'
            WHEN vuln.cvss_v3_score BETWEEN 4.0 AND 6.9 THEN 'Medium'
            WHEN vuln.cvss_v3_score BETWEEN 0.1 and 3.9 THEN 'Low'
       ELSE 'Info'
       END AS severity_v3, 

    
       STRING_AGG(DISTINCT(TO_CHAR(vuln.date_published, 'MM-DD-YYYY')), ', ' Order by TO_CHAR(vuln.date_published, 'MM-DD-YYYY')) AS date_published,
       STRING_AGG(DISTINCT(TO_CHAR(vuln.date_added, 'MM-DD-YYYY')), ', ' Order by TO_CHAR(vuln.date_added, 'MM-DD-YYYY')) AS date_added,
       STRING_AGG(DISTINCT(TO_CHAR(vuln.date_modified, 'MM-DD-YYYY')), ', ' Order by TO_CHAR(vuln.date_modified, 'MM-DD-YYYY')) AS date_modified,
       STRING_AGG(DISTINCT(htmltotext(sol.fix)), ', ' Order by htmltotext(sol.fix)) AS solution,
       STRING_AGG(DISTINCT(sol.solution_type), ', ' Order by sol.solution_type) AS solution_type,
       STRING_AGG(DISTINCT(sol.estimate), ', ' Order by sol.estimate) AS estimated_completion_time,
       STRING_AGG(DISTINCT(sol.url), ', ' Order by sol.url) as solution_url
       FROM fact_asset_vulnerability_finding AS avf
       JOIN dim_asset AS tul ON avf.asset_id = tul.asset_id
       JOIN dim_vulnerability AS vuln ON avf.vulnerability_id = vuln.vulnerability_id
       JOIN dim_asset_vulnerability_best_solution AS vs ON avf.vulnerability_id = vs.vulnerability_id
       JOIN dim_solution AS sol ON vs.solution_id = sol.solution_id

       -- Only obtain the High and Critical vulnerabilities (based on Nessus severity rating) - v2 only
       -- WHERE vuln.cvss_score >= 7

       GROUP by vuln.cvss_score, vuln.cvss_v3_score, sol.url, vuln.date_published, vuln.date_added, vuln.date_modified

       -- Order from highest priority vulnerabilities to lowest
       ORDER by vuln.cvss_score DESC

I just ran that query in my lab and it looks like the results are giving you what you want. I don’t have any recommendations for changes.

Good job putting this together!