SQL report that only contains latest scan data?

Hi,

I’ve created a SQL report containing only vulnerabilities with a CVSSv3 score of 7 or higher to make our auditors happy. However, I’m having a terrible time trying to narrow it down to only include the latest scan data. Or I would even be happy if I could force it to only look at scans in the last 24 hours. I’ve tried several iterations with no luck. Here is what I used for the report if anyone might have thoughts on how to tweak it? Thanks so much!

SELECT DISTINCT ON (da.asset_id, dv.nexpose_id)
da.ip_address,
da.host_name AS asset_name,
dos.description AS asset_os_name,
svc.name AS service_name,
fasvi.port AS service_port,
dv.nexpose_id,
dv.title,
dv.description,
proofAsText(fasvi.proof) AS proof,
ds.summary AS vulnerability_solution,
ds.fix AS remediation_steps,
dv.date_published,
dv.cvss_vector,
NULLIF(dv.cvss_v3_score, 0) AS cvss_v3,
dv.severity,
fas.pci_status,
round(fas.riskscore::numeric, 0) AS risk_score,
round(dv.cvss_score::numeric, 2) AS cvss_score,
dv.exploits,
dv.malware_kits
FROM fact_asset_scan_vulnerability_instance fasvi
JOIN dim_asset da USING (asset_id)
JOIN dim_operating_system dos USING (operating_system_id)
JOIN dim_vulnerability dv USING (vulnerability_id)
LEFT JOIN dim_vulnerability_solution dvs USING (vulnerability_id)
LEFT JOIN dim_solution ds USING (solution_id)
LEFT JOIN dim_service svc USING (service_id)
JOIN fact_asset fas USING (asset_id)
WHERE dv.cvss_score >= 7
ORDER BY
da.asset_id,
dv.nexpose_id,
dv.cvss_score DESC;

You can probably used the scan finished date code from here insightvm-sql-queries/sql-query-export/Vulnerabilities-Between-Two-Dates at master · rapid7/insightvm-sql-queries · GitHub

Try this on your target(s) and see if this is what you are looking for

One more thing to consider is (duplicates)

Even with “latest scan only,” you can still see “duplicates” because the same vulnerability can appear on multiple ports/services in that scan (that’s normal).

It can be condensed further if needed into a single aggregated column prob….

WITH latest_asset_scan AS (
  SELECT asset_id, scan_id, scan_finished
  FROM (
    SELECT
      das.asset_id,
      das.scan_id,
      das.scan_finished,
      row_number() OVER (
        PARTITION BY das.asset_id
        ORDER BY das.scan_finished DESC
      ) AS rn
    FROM dim_asset_scan das
    WHERE das.scan_finished IS NOT NULL
  ) x
  WHERE rn = 1
),
best_solution AS (
  SELECT
    davbs.asset_id,
    davbs.vulnerability_id,
    ds.summary,
    ds.fix
  FROM dim_asset_vulnerability_best_solution davbs
  JOIN dim_solution ds
    ON ds.solution_id = davbs.solution_id
)
SELECT
  da.ip_address,
  da.host_name AS asset_name,
  dos.description AS asset_os_name,
  svc.name AS service_name,
  fasvi.port AS service_port,
  dv.nexpose_id,
  dv.title,
  proofAsText(dv.description) AS vulnerability_description,
  proofAsText(fasvi.proof)    AS proof,
  bs.summary AS vulnerability_solution,
  bs.fix     AS remediation_steps,
  dv.date_published,
  dv.cvss_v3_score AS cvss_v3,
  dv.severity
FROM latest_asset_scan las
JOIN fact_asset_scan_vulnerability_instance fasvi
  ON fasvi.asset_id = las.asset_id
 AND fasvi.scan_id  = las.scan_id
JOIN dim_asset da
  ON da.asset_id = fasvi.asset_id
JOIN dim_operating_system dos
  ON dos.operating_system_id = da.operating_system_id
JOIN dim_vulnerability dv
  ON dv.vulnerability_id = fasvi.vulnerability_id
LEFT JOIN dim_service svc
  ON svc.service_id = fasvi.service_id
LEFT JOIN best_solution bs
  ON bs.asset_id = fasvi.asset_id
 AND bs.vulnerability_id = fasvi.vulnerability_id
WHERE dv.cvss_v3_score >= 7.0
ORDER BY
  da.ip_address,
  da.host_name,
  dv.nexpose_id,
  fasvi.port;
1 Like

That did the trick perfectly!! And it ran so much faster than the one I was using. You’re the best, thanks so much for all the help!!!

1 Like

Thanks, interestingly written - not the way we have our SQL query reports to provide the same type of result. I did try this one and I do see duplication of vulns - same vuln 2-3 times on each asset with exact same detail - port, service, proof, solution. What is causing this, do you know?

It’s almost always join fan-out in a one-to-many join somewhere or multiple identical instances in the fact table and what we have is two common culprits here the dim_asset_vulnerability_best_solution isnt always going to be 1 row per and the fact_asset_scan_vulnerability_instance can have multiple rows that look identical unless you include key and status_id

even though it sounds like it should be “one best solution,” in practice you can get multiple rows for the same asset+vuln content/superceding quirks, multiple “bestoption” rollups, or other data artifacts etc

If you want one row per (asset, vuln, port, service) no matter what, you can dedupe at the end with a window function and keep only the first row. It’s taking the hammer approach, but it works.

id have to dig deeper but That fact table’s “uniqueness” isn’t just asset, scan, vuln, port, service. It can also include the secondary discriminator keys and sometimes other attributes. If the key differs (or is NULL in multiple rows), you’ll see what looks identical in your selected columns.

it can be painfull for these

This is the one we use the most its for CVSS 6+ as thats what we focus on.

@Starshine @wwolczaski

WITH
   asset_ips AS (
      SELECT asset_id, ip_address, type
      FROM dim_asset_ip_address dips
   ),
   vulnerability_age AS (
     SELECT fvi1.asset_id, fvi1.vulnerability_id, min(fvi1.date) as vulnerability_age
     FROM fact_asset_scan_vulnerability_instance fvi1
     GROUP BY fvi1.asset_id, fvi1.vulnerability_id
   ),
   asset_addresses AS (
      SELECT da.asset_id,
         (SELECT array_to_string(array_agg(ip_address), ',') FROM asset_ips WHERE asset_id = da.asset_id AND type = 'IPv4') AS ipv4s,
         (SELECT array_to_string(array_agg(ip_address), ',') FROM asset_ips WHERE asset_id = da.asset_id AND type = 'IPv6') AS ipv6s,
         (SELECT array_to_string(array_agg(mac_address), ',') FROM dim_asset_mac_address WHERE asset_id = da.asset_id) AS macs
      FROM dim_asset da
         JOIN asset_ips USING (asset_id)
   ),
   asset_names AS (
      SELECT asset_id, array_to_string(array_agg(host_name), ',') AS names
      FROM dim_asset_host_name
      GROUP BY asset_id
   ),
   asset_facts AS (
      SELECT asset_id, riskscore, exploits, malware_kits
      FROM fact_asset
   ),
   vulnerability_metadata AS (
      SELECT *
      FROM dim_vulnerability dv
   ),
   vuln_cves_ids AS (
      SELECT vulnerability_id, array_to_string(array_agg(reference), ',') AS cves
      FROM dim_vulnerability_reference
      WHERE source = 'CVE'
      GROUP BY vulnerability_id
   ),
   vuln_reference_ids AS (
      SELECT vulnerability_id, array_to_string(array_agg(reference), ',') AS ids
      FROM dim_vulnerability_reference
      WHERE source <> 'CVE' AND source <> 'URL'
      GROUP BY vulnerability_id
   ),
   vuln_tags AS (
      SELECT vulnerability_id, array_to_string(array_agg(category_name ORDER BY category_name ASC), ',') AS tags
       FROM dim_vulnerability_category
      GROUP BY vulnerability_id
   ),
   vuln_minimum_skill AS (
      SELECT vulnerability_id, skill_level AS min_skill
      FROM (
         SELECT vulnerability_id,
            row_number() OVER (
               PARTITION BY vulnerability_id
               ORDER BY CASE
                 WHEN skill_level = 'Expert' THEN 1
                  WHEN skill_level = 'Intermediate' THEN 2
                  WHEN skill_level = 'Novice' THEN 3
                  ELSE 4
               END DESC
            ) AS row, skill_level
         FROM dim_vulnerability_exploit dve
      ) v
      WHERE row = 1
   ),
   vulnerability_proofs AS (
      SELECT asset_id, vulnerability_id, csv(htmlToText(proof, false)) AS proofs
      FROM fact_asset_vulnerability_instance
      GROUP BY asset_id, vulnerability_id
   ),
   vulnerability_solutions AS (
      SELECT asset_id, vulnerability_id, csv(htmlToText(ds.fix, false)) AS solutions
      FROM vulnerability_proofs
         JOIN dim_asset_vulnerability_solution USING (asset_id, vulnerability_id)
         JOIN dim_solution_highest_supercedence dshs USING (solution_id)
         JOIN dim_solution ds ON ds.solution_id = dshs.superceding_solution_id
      GROUP BY asset_id, vulnerability_id
   )
SELECT
   ds.name AS "SITENAME",
   da.ip_address AS "IPADDRESS",
   an.names AS "ASSETNAME",
   favi.port AS "PORT",
   dp.name AS "PROTOCOL",
   dsvc.name AS "SERVICE",
   dos.name AS "Asset_OS_Name",
   dos.family AS "Asset_OS_Family",
   dos.version AS "Asset_OS_Version",
   cves.cves AS "CVE",
   vm.nexpose_id AS "Nexpose ID",
   vm.severity_score AS "Severity",
   vm.cvss_score AS "CVSS SCORE",
   vm.title AS "TITLE",
   proofAsText(vm.description) AS "DESCRIPTION",
   proofAsText(favi.proof) AS "PROOF",
   solutions AS "SOLUTION",
   round(age(va.vulnerability_age, 'days')) AS "Vulnerability_Age",
   va.vulnerability_age::date AS "Vulnerable_Since",
   vtags.tags AS "Vulnerability_Tags",
   favi.date::date AS "Vulnerability_Test_Date",
   vuln_minimum_skill.min_skill AS "Exploit_Minimum_Skill",
   af.malware_kits AS "Malware_Kit_Count"
FROM fact_asset_vulnerability_instance favi
   JOIN dim_asset da USING (asset_id)
   LEFT OUTER JOIN asset_addresses aa USING (asset_id)
   LEFT OUTER JOIN asset_names an USING (asset_id)
   JOIN dim_operating_system dos USING (operating_system_id)
   JOIN asset_facts af USING (asset_id)
   JOIN dim_service dsvc USING (service_id)
   JOIN dim_protocol dp USING (protocol_id)
   JOIN dim_site_asset dsa USING (asset_id)
   JOIN dim_site ds USING (site_id)
   JOIN vulnerability_metadata vm USING (vulnerability_id)
   JOIN dim_vulnerability_status dvs USING (status_id)
   JOIN vulnerability_age va USING (vulnerability_id, asset_id)
   LEFT OUTER JOIN vuln_cves_ids cves USING (vulnerability_id)
   LEFT OUTER JOIN vuln_tags vtags USING (vulnerability_id)
   LEFT OUTER JOIN vuln_minimum_skill USING (vulnerability_id)
   LEFT OUTER JOIN vulnerability_solutions USING (asset_id, vulnerability_id)
WHERE vm.cvss_v2_score >= 6.0

Thanks, I’ll dig into tweaking your report. I wonder if your syntax report will be faster than ours (which are slow, but we’re talking about 5-10k asset groups to report on).