SQL query to export Policy scan results with Remediation, Rationale and Proof

@zyoutz Thanks for all your Help.

Regards,
Manny

@manny_singh - You will need to setup a Database Warehouse (external) to reference those values you are looking for. Instructions can be found here: Configure Data Warehouse

After that, you will need to construct the SQL query using the schema found here: Data Warehouse Schema

I think you will need to reference these tables in your SQL query to get the data you are looking for: fact_asset_policy_rule_check and fact_asset_policy_rule_test

1 Like

@Aniket-Menon
I looked at the database schema already , to me it looks like the table that needs to be referenced is dim_policy_rule_test , But i am totally confused how to construct this query.

I looked at the option of connecting to database using sql query editor tools. Can i not connect directly to the database ?

Regards,
Manny

@manny_singh - you can connect directly once you setup the Data Warehouse. It will have to be an external Postgres database. This will be a separate database from the one that comes with the Nexpose / InsightVM console.

1 Like

Manny, something that is important to remember is that for the reports within the ui you are essentially just working within a view. You can force access to the db directly through modifying authentication configurations but it’s likely not what you’re hoping for. If you are looking for full db query access the warehouse option that was mentioned by Aniket is one of your safest options and provides the _test tables you are trying to use. Otherwise using the reporting view you’ll be restricted to what is presented in the view’s datamodel which is different from what is available in the warehouse option. https://nexpose.help.rapid7.com/docs/understanding-the-reporting-data-model-dimensions for dimensions and https://nexpose.help.rapid7.com/docs/understanding-the-reporting-data-model-facts for facts.
The view can technically be modified on the console but it’s not something that I would recommend in most cases. It doesn’t look like the idea portal exists anymore but I would still suggest filing a feature request to have the information made available in the view for reports. https://kb.help.rapid7.com/docs/suggesting-a-feature-or-enhancement

2 Likes

Hi @a9315024_a9315024 @zyoutz ,

I modified the query to output clean data from rationale and remediation , currently the query that @zyoutz provided output xhtml strings and unnecessary information.
I hope @a9315024_a9315024 you can mark the question posted on another post you posted earlier as solution.

SELECT
  dp.policy_id, dp.title as policy_title, dpr.rule_id, dpr.title as policy_rule_title,
  dp.benchmark_name, da.ip_address, da.host_name, dpr.description, dp.category,
  fapr.date_tested, htmlToText(fapr.proof) as proof, fapr.compliance,
  dpr.severity, (regexp_replace(dpr.rationale,'<[^>]*>' , '','g')) AS rationale, (regexp_replace(dpr.remediation,'<[^>]*>' , '','g')) AS remediation
FROM fact_asset_policy_rule fapr
  JOIN dim_policy dp on dp.policy_id = fapr.policy_id
  JOIN dim_policy_rule dpr on dpr.policy_id = fapr.policy_id and fapr.rule_id = dpr.rule_id
  JOIN dim_asset da on da.asset_id = fapr.asset_id

We can still use WHERE clause if we need to filter the data for TRUE or FALSE
For Example :
WHERE fapr.compliance = false order by dp.title, dpr.title

1 Like

Hi @a9315024_a9315024 ,

I updated the query again to remove the spaces that was result of removing xhtml tagged data and also fixed the proof field which was not parsing out all text with the use of htmlToText.

SELECT
  dp.policy_id, dp.title as policy_title, dpr.rule_id, dpr.title as policy_rule_title,
  dp.benchmark_name, da.ip_address, da.host_name, dpr.description, dp.category,
  fapr.date_tested, (regexp_replace(fapr.proof,'<[^>]*>|^\s+|\s\s+','','g')) as proof, fapr.compliance,
  dpr.severity, (regexp_replace(dpr.rationale,'<[^>]*>|^\s+|\s\s+','','g')) AS rationale, (regexp_replace(dpr.remediation,'<[^>]*>|^\s+|\s\s+','','g')) AS remediation
FROM fact_asset_policy_rule fapr
  JOIN dim_policy dp on dp.policy_id = fapr.policy_id
  JOIN dim_policy_rule dpr on dpr.policy_id = fapr.policy_id and fapr.rule_id = dpr.rule_id
  JOIN dim_asset da on da.asset_id = fapr.asset_id

Hope this helps !
Manny

You should be able to use the proofAsText function to remove the HTML tags from the data.

Hi @jhaltorp ,

I am much aware of that , you should try using the query and figure out what proofastext can’t do or do in this case. I have already tried and using either of them and that’s a basic so i know but thanks. Furthermore proofastext is depreciated as well.

Test it yourself to see.

Regards,
Manny

Ah I see, I did not know that. Thanks :slight_smile:

Hi @zyoutz @holly_wilsey @tyler_schmidtke ,

Team , I have been trying to get the result from policy scan dated back in 2021-10-28 using the the query and tried using WHERE clause and it is still showing results from most recent date ( 2021-11-16.

Clause used – WHERE fapr.date_tested< ‘2021-10-30’

What do i need to change in the Query to get the result from 2021-10-28 ?
Manny

@zyoutz @Aniket-Menon @holly_wilsey @tyler_schmidtke , Anyone of you have any suggestion or hints for me?

Thanks

Could you post the full query? I double checked the tables/fields against the WHERE clause you posted, but it looks alright at a glance. Seeing the full query might provide some more insight.

@holly_wilsey Thank you for your response , below is the query : I have tested number of time and it is still showing most recent result , however within the console , I can see result of the policy scan that occured on 2021-10-28 but Query always output most recent scan results.

SELECT
  dp.policy_id, dp.title as policy_title, dpr.rule_id, dpr.title as policy_rule_title,
  dp.benchmark_name, da.ip_address, da.host_name, dpr.description, dp.category,
  fapr.date_tested, (regexp_replace(fapr.proof,'<[^>]*>|^\s+|\s\s+','','g')) as proof, fapr.compliance,
  dpr.severity, (regexp_replace(dpr.rationale,'<[^>]*>|^\s+|\s\s+','','g')) AS rationale, (regexp_replace(dpr.remediation,'<[^>]*>|^\s+|\s\s+','','g')) AS remediation
FROM fact_asset_policy_rule fapr
  JOIN dim_policy dp on dp.policy_id = fapr.policy_id
  JOIN dim_policy_rule dpr on dpr.policy_id = fapr.policy_id and fapr.rule_id = dpr.rule_id
  JOIN dim_asset da on da.asset_id = fapr.asset_id
WHERE fapr.date_tested< '2021-10-30'

Thank you

My initial thought is that you would need to include the full date + time in your comparison value for the timestamp, or else there may be results outside of your intended date range. It depends on how the timestamps in the fapr.date_tested field are actually formatted, though. You could try something like:

WHERE fapr.date_tested < '2021-10-30 12:00:00.000000'

Hi @holly_wilsey ,

I just tried your suggestion , It comes out with empty result , it is still not working , Any further suggestions ?
The scan however did happen :
image

Regards,
Manny

If you see no results, it sounds like it could be a date formatting issue. I would copy an example date from the date_tested column and just update the date to the desired year/month/day to ensure it retains the proper formatting. If that doesn’t work, I suspect it may be due to the table itself.

The description for the table says the following:

Hi @holly_wilsey ,

I do see result if the Date is WHERE fapr.date_tested > ‘2021-10-30 12:00:00.000000’ however there is 0 result for WHERE fapr.date_tested < ‘2021-10-30 12:00:00.000000’ , I tried copying example date from the date_tested column as you suggested however it still doesn’t work.

Do we have any other suggestion? to fix or workaround ? I also tried to use fact_asset_scan_policy table with another query both fact_asset_scan_policy and fact_asset_policy_rule are behaving exactly same.

Thank you.
Manny

Currently using the SQL query below to import the scan results into our ServiceNow custom table. This query also works with the sql export report in InsightVM, but you might need the rename “check_result” tot “compliance”. The WHERE gives me the scan results of yesterday.

SELECT dp.policy_id,
dp.title as policy_title,
dpr.rule_id,
dpr.title as policy_rule_title,
da.ip_address,
da.host_name,
fapr.asset_id,
dpr.description,
fapr.date_tested,
htmlToText(fc.proof) as proof,
fc.check_result,
dpr.severity,
(regexp_replace(dpr.rationale,’<[^>]>|^\s+|\s\s+’,’’,‘g’)) AS rationale,
(regexp_replace(dpr.remediation,’<[^>]
>|^\s+|\s\s+’,’’,‘g’)) AS remediation

FROM fact_asset_policy_rule fapr
JOIN dim_policy dp on dp.policy_id = fapr.policy_id
JOIN dim_policy_rule dpr on dpr.policy_id = fapr.policy_id and fapr.rule_id = dpr.rule_id
JOIN dim_asset da on da.asset_id = fapr.asset_id
JOIN fact_asset_policy_rule_check fc on fc.policy_id = fapr.policy_id and fc.rule_id = fapr.rule_id AND fapr.asset_id = fc.asset_id

WHERE date(fapr.date_tested) = CURRENT_DATE - 1

edit: you can change CURRENT_DATE - 1 to YYYY-MM-DD (between quotes) to get the scans for that date.

This doesn’t work , Thank you for suggestion though.