SQL Query runs very long. Any optimization advise appreciated

Hi all,

I created the following SQL query. The purpose of the query is to get a CSV file with all found vulnerabilities on an asset, with a separate line for each instance of that vulnerability on the asset. So, for example, if an asset has 4 log4j vulnerabilities, then there should be 4 lines in the CSV.
The query:
SELECT DISTINCT
    da.host_name AS "Asset Names",
    ds.name AS "Site Name",
    dv.title AS "Vulnerability Title",
    fasvi.port AS "Service Port",
    round(dv.cvss_score::numeric, 2) AS "Vulnerability CVSS Score",
    round(dv.riskscore::numeric, 1) AS "Vulnerability Risk Score",
    dv.severity AS "Vulnerability Severity Level",
    dv.nexpose_id AS "Vulnerability ID",
    dv.exploits AS "Exploit Count",
    dv.malware_kits AS "Malware Kit Count",
    fa.last_scan_id AS "Scan ID",
    da.ip_address AS "Asset IP Address",
    round(dv.cvss_v3_score::numeric, 2) AS "Vulnerability CVSSv3 Score",
    htmlToText(fasvi.proof) AS "Vulnerability Proof"
FROM dim_asset da
JOIN fact_asset AS fa ON fa.asset_id = da.asset_id
JOIN fact_asset_scan_vulnerability_finding AS fasvf ON fasvf.scan_id = fa.last_scan_id AND fasvf.asset_id = da.asset_id
JOIN fact_asset_scan_vulnerability_instance AS fasvi ON fasvi.scan_id = fa.last_scan_id AND fasvi.asset_id = da.asset_id AND fasvi.vulnerability_id = fasvf.vulnerability_id
JOIN dim_vulnerability AS dv ON dv.vulnerability_id = fasvf.vulnerability_id
JOIN dim_site_scan AS dss ON dss.scan_id = fa.last_scan_id
JOIN dim_site AS ds ON ds.site_id = dss.site_id

The query itself works and produces what I want. However, it runs for a very, very long time. Does anybody see anything that would improve the generation time?

Thanks.

The only recommendation I have as far as the query goes would be to start your FROM with a fact table instead of a dimension table and then JOIN your dimension tables (there's a rough sketch of that reordering below). But even that is probably not going to get you very far. The console reports are just slow, point blank. There are really two ways to speed them up:

  1. Increase resources on the console
  2. Scope the query to a smaller subset of assets (outside of the query)

Option 1 is pretty self-explanatory: if you can give the console more CPU, then it SHOULD run a little quicker.

For option 2, try to reduce the scope to a few sites or asset groups. Doing this outside of the query makes the report run quicker because only the necessary info is populated into the reporting data model for the query to run against. Otherwise it has to build the entire reporting data model database and then run the query against it.
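
In case it helps, here is a rough sketch of what the fact-table-first reordering could look like, keeping your columns and join conditions the same and starting from fact_asset_scan_vulnerability_instance. I haven't benchmarked it against your environment, so treat it as a starting point rather than a guaranteed improvement:

-- Sketch: same output, but FROM starts on a fact table and dimensions are joined afterwards
SELECT DISTINCT
    da.host_name AS "Asset Names",
    ds.name AS "Site Name",
    dv.title AS "Vulnerability Title",
    fasvi.port AS "Service Port",
    round(dv.cvss_score::numeric, 2) AS "Vulnerability CVSS Score",
    round(dv.riskscore::numeric, 1) AS "Vulnerability Risk Score",
    dv.severity AS "Vulnerability Severity Level",
    dv.nexpose_id AS "Vulnerability ID",
    dv.exploits AS "Exploit Count",
    dv.malware_kits AS "Malware Kit Count",
    fa.last_scan_id AS "Scan ID",
    da.ip_address AS "Asset IP Address",
    round(dv.cvss_v3_score::numeric, 2) AS "Vulnerability CVSSv3 Score",
    htmlToText(fasvi.proof) AS "Vulnerability Proof"
FROM fact_asset_scan_vulnerability_instance AS fasvi
-- restrict instances to each asset's last scan, as the original query does
JOIN fact_asset AS fa ON fa.asset_id = fasvi.asset_id AND fa.last_scan_id = fasvi.scan_id
JOIN fact_asset_scan_vulnerability_finding AS fasvf ON fasvf.scan_id = fasvi.scan_id AND fasvf.asset_id = fasvi.asset_id AND fasvf.vulnerability_id = fasvi.vulnerability_id
JOIN dim_asset AS da ON da.asset_id = fasvi.asset_id
JOIN dim_vulnerability AS dv ON dv.vulnerability_id = fasvi.vulnerability_id
JOIN dim_site_scan AS dss ON dss.scan_id = fasvi.scan_id
JOIN dim_site AS ds ON ds.site_id = dss.site_id

Since none of the selected columns come from fact_asset_scan_vulnerability_finding, it might also be worth testing whether that join can be dropped entirely, but only if it isn't filtering the instance rows in a way you rely on.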

Hi John,
Thanks for taking the time to look into this. Increasing the hardware is not an option at this time.
But I do run it against 30 sites and a DAG, so reducing the scope is worth investigating, and I will also look into the possibility of starting the FROM on a fact table.