Warehouse query - mean time to resolve

I’ve been struggling with this for some time now. Does any one know of a good way of getting the date field for a vulnerability from fact_asset_vulnerability_finding_date (or any other table) to fact_asset_vulnerability_remediation_date? The metric I’m trying to produce is ‘mean time to resolve’, so I need to know the date of remediation + first found date. Or maybe there is a way to produce it with first_value/last_value window functions against the finding_date table itself?

This is what I have so far. I spot checked a few assets and the logic seems to be okay. For each vulnerability/asset pair, we take the difference between MAX(day) and MIN(day) in the set of days to determine the age, then evaluate if MAX(day) is greater than or equal to dim_asset.last_assessed_for_vulnerabilities to determine if the vulnerability is still open or not.

SELECT vulnerability_id, asset_id, MAX(day) - MIN(day) AS "age",
CASE WHEN MAX(day) >= last_assessed_for_vulnerabilities THEN 'open' ELSE 'closed' END AS "status"
FROM fact_asset_vulnerability_finding_date
JOIN dim_asset USING(asset_id)
WHERE  day >= now() - '4 months'::interval 
GROUP BY  vulnerability_id, asset_id, last_assessed_for_vulnerabilities
WITH vuln_remediation_date AS (

SELECT 

	 favrd.asset_id
	,favrd.vulnerability_id
	,favrd.day AS vuln_remediation_date
		
FROM 
	fact_asset_vulnerability_remediation_date favrd
JOIN
	dim_vulnerability dv ON favrd.vulnerability_id = dv.vulnerability_id
-- WHERE
-- 	asset_id = ''

), vuln_first_found_date AS (

SELECT 

	 favf.asset_id
	,vulnerability_id
	,DATE(favf.date) AS vuln_first_found_date

FROM
	fact_asset_vulnerability_finding_date favf
-- WHERE
-- 	favf.asset_id = ''
GROUP BY
	 favf.asset_id
	,vulnerability_id
	,DATE(favf.date)
	
), details AS (

SELECT

	 vrd.asset_id
	,vrd.vulnerability_id
	,vrd.vuln_remediation_date
	,vffd.vuln_first_found_date
	,vrd.vuln_remediation_date - vffd.vuln_first_found_date AS days_to_remediation
	
FROM
	vuln_remediation_date vrd
JOIN
	vuln_first_found_date vffd ON vrd.asset_id = vffd.asset_id AND vrd.vulnerability_id = vffd.vulnerability_id
	
)
SELECT 

	  AVG(days_to_remediation) AS avg_time_to_remediation
	 ,PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY d.days_to_remediation) AS median_time_to_remediation
	 ,MIN(days_to_remediation) AS min_time_to_remediation
	 ,MAX(days_to_remediation) AS max_time_to_remediation 
	
FROM
	details d
1 Like

Thanks @bobsledtedd! I tried a similar approach at first. Where I see some issues with joining the remediation_date table with the finding_table is when there are multiple vulnerability instances or when a vulnerability is re-opened at a later date. For instance the value of min_time_to_remediation I get is -135, likely from a system that was restored from a backup at some time. I think the major problem witht he remediation_date table is the lacking of instance level data (filepath, port, etc) that would allow each vulnerability/asset pair to be unique.

My final query. Seems to be right. Takes 15 mins to execute, but I think its good enough for now.

WITH 
vuln_remediation_date AS (
  	SELECT vulnerability_id, asset_id, 
    DATE_PART('days', max(day) - date)  AS "days_to_remediation",
	CASE WHEN MAX(day) >= last_assessed_for_vulnerabilities THEN 'open' ELSE 'closed' END AS "status"
	FROM fact_asset_vulnerability_finding_date
	JOIN dim_asset USING(asset_id)
	GROUP BY  vulnerability_id, asset_id, last_assessed_for_vulnerabilities, date
)

SELECT 
  status, 
  AVG(days_to_remediation) AS avg_age,
  PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY d.days_to_remediation) AS median_age,
  MIN(days_to_remediation) AS min_age,
  MAX(days_to_remediation) AS max_age 
  FROM
  vuln_remediation_date d
  GROUP BY 1
)

@trevor_capps appreciate you providing your solution! Couple call outs after testing in my environment.

  • I changed DATE_PART(‘days’, max(day) - date) AS “days_to_remediation” to max(day) - date::date as some records where reporting 0 when they should be reporting 1, which skews the metrics.

  • Depending on needs, there might be a desire to have a minimum day of 1 instead of 0 for records that have the same max(day) / date::date.

  • Joining dim_vulnerability allows for metrics based on things like [severity], which might be handy.

1 Like