Remove HTML Tags from dim_vulnerability.description

Hello everyone
I need a hint on how to remove the HTML tags from dim_vulnerability.description.

So far, I have imported the entire dim_vulnerability table directly into PowerBI, then deleted the unnecessary columns and removed the HTML tags (e.g. in the description column) within PowerBI.

To optimise the performance of my PowerBI dashboard, I have started to build saved views in pgAdmin and then connect them to PowerBI.

Now I have read the following in the data warehouse schema for “description”:
A verbose description for the vulnerability. The description is represented using HTML markup that can be “flattened” using the htmlToText() function.

How do I integrate this directly into my simple query?
SELECT
dv.vulnerability_id,
dv.nexpose_id,
dv.title,
dv.description,
EXTRACT(YEAR FROM dv.date_published) AS published_year,
dv.severity,
dv.critical,
dv.severe,
dv.moderate,
dv.cvss_score,
dv.cvss_v3_score
FROM dim_vulnerability dv

Best regards
David


Have you tried replacing dv.description with htmlToText(dv.description) in your query?
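Applied to the query from the question, that could look roughly like the sketch below. It assumes htmlToText() is actually available in the database where the view is created, as the schema description quoted above suggests. The AS description alias is my addition: without it, PostgreSQL would name the output column after the function (htmltotext), which would change the column name PowerBI sees.

```sql
-- Sketch only: assumes htmlToText() exists in this database,
-- as stated in the schema description quoted in the question.
SELECT
    dv.vulnerability_id,
    dv.nexpose_id,
    dv.title,
    -- Flatten the HTML markup; the alias keeps the original column name.
    htmlToText(dv.description) AS description,
    EXTRACT(YEAR FROM dv.date_published) AS published_year,
    dv.severity,
    dv.critical,
    dv.severe,
    dv.moderate,
    dv.cvss_score,
    dv.cvss_v3_score
FROM dim_vulnerability dv;
```

If the function is not found when you run this, it is worth checking which schema it lives in before falling back to stripping the tags in PowerBI.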


Oh dear, why didn’t I figure that out on my own? And I thought I had to include some complicated functions.
Thanks for the quick help!
