Filtering xhtml tags


I’ve been using the sql query posted in SQL query to export Policy scan results with Remediation, Rationale and Proof - #2 by zyoutz.

SELECT da.host_name, da.ip_address, dp.policy_id, dp.title as policy_title, dpr.rule_id, dpr.title as rule_name, dpr.description, fapr.date_tested, htmlToText(fapr.proof) as proof, fapr.compliance, htmlToText(dpr.rationale) as rationale, htmlToText(dpr.remediation) as remediation
FROM fact_asset_policy_rule fapr
JOIN dim_policy dp on dp.policy_id = fapr.policy_id
JOIN dim_policy_rule dpr on dpr.policy_id = fapr.policy_id and fapr.rule_id = dpr.rule_id
JOIN dim_asset da on da.asset_id = fapr.asset_id

The method htmlToText was used to filter out the html tags from the policy compliance proof field. However this method doesnt seem to work on the policy compliance rationale or remediation fields. Probably because those fields contain xhtml.

Is there a simple method to filter the xhtml tags?

2021-04-08 10_21_14-Window