SQL query to export Policy scan results with Remediation, Rationale and Proof

Glad to help @manny_singh!

For your follow up question, can you expand a bit more on what you meant by “if want we want to pull out what labeled as Checks in console in this query what would i need to add”.

1 Like

checks2

@zyoutz
Hi Zac,
I looked at the dim_policy_rule table, I believe it is check_existence and
Check_results. Correct me if i am wrong? but cannot figure out how add to them to the query you gave us. I wanted to export the expected check value and the resulted value from policy scan.
Regards,

@zyoutz
Please help , Thank you. :worried:
Regards
Manny

@manny_singh Still not sure I completely follow; however, on the fact_asset_policy_rule table there is a compliance column that indicates whether the asset is compliant for the particular rule.

This will return either true or false and the table is already being used in the query above as fapr.compliance. I did notice in the example query I provided you it is actually filtering by fapr.compliance = false so only rules that assets aren’t compliant for are being returned at the moment.

If instead you are looking for more details about the rule (description, rationale, remediation) then the dim_policy_rule table will have more of those details.

1 Like

@manny_singh Unfortunately I’m not aware of a way to get the values of the checks through SQL query reports. I’ll see if anyone else has any suggestions for you or maybe someone else in the community might be able to chime in.

1 Like

@zyoutz Thanks for all your Help.

Regards,
Manny

@manny_singh - You will need to setup a Database Warehouse (external) to reference those values you are looking for. Instructions can be found here: Configure Data Warehouse

After that, you will need to construct the SQL query using the schema found here: Data Warehouse Schema

I think you will need to reference these tables in your SQL query to get the data you are looking for: fact_asset_policy_rule_check and fact_asset_policy_rule_test

1 Like

@Aniket-Menon
I looked at the database schema already , to me it looks like the table that needs to be referenced is dim_policy_rule_test , But i am totally confused how to construct this query.

I looked at the option of connecting to database using sql query editor tools. Can i not connect directly to the database ?

Regards,
Manny

@manny_singh - you can connect directly once you setup the Data Warehouse. It will have to be an external Postgres database. This will be a separate database from the one that comes with the Nexpose / InsightVM console.

1 Like

Manny, something that is important to remember is that for the reports within the ui you are essentially just working within a view. You can force access to the db directly through modifying authentication configurations but it’s likely not what you’re hoping for. If you are looking for full db query access the warehouse option that was mentioned by Aniket is one of your safest options and provides the _test tables you are trying to use. Otherwise using the reporting view you’ll be restricted to what is presented in the view’s datamodel which is different from what is available in the warehouse option. https://nexpose.help.rapid7.com/docs/understanding-the-reporting-data-model-dimensions for dimensions and https://nexpose.help.rapid7.com/docs/understanding-the-reporting-data-model-facts for facts.
The view can technically be modified on the console but it’s not something that I would recommend in most cases. It doesn’t look like the idea portal exists anymore but I would still suggest filing a feature request to have the information made available in the view for reports. https://kb.help.rapid7.com/docs/suggesting-a-feature-or-enhancement

2 Likes

Hi @a9315024_a9315024 ,

I updated the query again to remove the spaces that was result of removing xhtml tagged data and also fixed the proof field which was not parsing out all text with the use of htmlToText.

SELECT
  dp.policy_id, dp.title as policy_title, dpr.rule_id, dpr.title as policy_rule_title,
  dp.benchmark_name, da.ip_address, da.host_name, dpr.description, dp.category,
  fapr.date_tested, (regexp_replace(fapr.proof,'<[^>]*>|^\s+|\s\s+','','g')) as proof, fapr.compliance,
  dpr.severity, (regexp_replace(dpr.rationale,'<[^>]*>|^\s+|\s\s+','','g')) AS rationale, (regexp_replace(dpr.remediation,'<[^>]*>|^\s+|\s\s+','','g')) AS remediation
FROM fact_asset_policy_rule fapr
  JOIN dim_policy dp on dp.policy_id = fapr.policy_id
  JOIN dim_policy_rule dpr on dpr.policy_id = fapr.policy_id and fapr.rule_id = dpr.rule_id
  JOIN dim_asset da on da.asset_id = fapr.asset_id

Hope this helps !
Manny

You should be able to use the proofAsText function to remove the HTML tags from the data.

Ah I see, I did not know that. Thanks :slight_smile:

Hi @zyoutz @holly_wilsey @tyler_schmidtke ,

Team , I have been trying to get the result from policy scan dated back in 2021-10-28 using the the query and tried using WHERE clause and it is still showing results from most recent date ( 2021-11-16.

Clause used – WHERE fapr.date_tested< ‘2021-10-30’

What do i need to change in the Query to get the result from 2021-10-28 ?
Manny

@zyoutz @Aniket-Menon @holly_wilsey @tyler_schmidtke , Anyone of you have any suggestion or hints for me?

Thanks

Could you post the full query? I double checked the tables/fields against the WHERE clause you posted, but it looks alright at a glance. Seeing the full query might provide some more insight.

@holly_wilsey Thank you for your response , below is the query : I have tested number of time and it is still showing most recent result , however within the console , I can see result of the policy scan that occured on 2021-10-28 but Query always output most recent scan results.

SELECT
  dp.policy_id, dp.title as policy_title, dpr.rule_id, dpr.title as policy_rule_title,
  dp.benchmark_name, da.ip_address, da.host_name, dpr.description, dp.category,
  fapr.date_tested, (regexp_replace(fapr.proof,'<[^>]*>|^\s+|\s\s+','','g')) as proof, fapr.compliance,
  dpr.severity, (regexp_replace(dpr.rationale,'<[^>]*>|^\s+|\s\s+','','g')) AS rationale, (regexp_replace(dpr.remediation,'<[^>]*>|^\s+|\s\s+','','g')) AS remediation
FROM fact_asset_policy_rule fapr
  JOIN dim_policy dp on dp.policy_id = fapr.policy_id
  JOIN dim_policy_rule dpr on dpr.policy_id = fapr.policy_id and fapr.rule_id = dpr.rule_id
  JOIN dim_asset da on da.asset_id = fapr.asset_id
WHERE fapr.date_tested< '2021-10-30'

Thank you

My initial thought is that you would need to include the full date + time in your comparison value for the timestamp, or else there may be results outside of your intended date range. It depends on how the timestamps in the fapr.date_tested field are actually formatted, though. You could try something like:

WHERE fapr.date_tested < '2021-10-30 12:00:00.000000'

Hi @holly_wilsey ,

I just tried your suggestion , It comes out with empty result , it is still not working , Any further suggestions ?
The scan however did happen :
image

Regards,
Manny