I am using one of the SQL queries that are provided on github called Assets-By-Specified-Software. When I run it. Not output rows are shown.
SELECT da.ip_address, da.host_name, ds.vendor, ds.name as software_name, ds.family, ds.version
FROM dim_asset_software das
JOIN dim_software ds using (software_id)
JOIN dim_asset da on da.asset_id = das.asset_id
where ds.name ilike '%Rapid7 Insight Agent%' AND ds.name ilike '%CrowdStrike%'
Change:
where ds.name ilike â%Rapid7 Insight Agent%â AND ds.name ilike â%CrowdStrike%â
To:
where ds.name ilike '%Rapid7 Insight Agent%' OR ds.name ilike '%CrowdStrike%'
1 Like
Thanks for the help. I didnât word the question correctly. I am looking for host that contain both of the software.
Here is a shamefully dirty way of what that logic might look like using the data warehouse schema.
WITH r7 AS ( /* Distinct List of Assets with software ilike '%Rapid7 Insight Agent%' */
SELECT
da.host_name
,'r7' AS note
FROM
dim_asset da
JOIN
dim_asset_software das USING (asset_id)
WHERE
das.name ilike '%Rapid7 Insight Agent%'
GROUP BY
da.host_name
), cs AS ( /* Distinct List of Assets with software ilike '%CrowdStrike%' */
SELECT
da.host_name
,'cs' AS note
FROM
dim_asset da
JOIN
dim_asset_software das USING (asset_id)
WHERE
das.name ilike '%CrowdStrike%'
GROUP BY
da.host_name
), asset_list AS ( /* UNION the two CTEs two get a complete list of assets */
SELECT
host_name
,note
FROM
r7
UNION
SELECT
host_name
,note
FROM
cs
), final_asset_list AS (
SELECT
host_name
FROM
asset_list
GROUP BY
host_name
HAVING COUNT(*) > 1 /* If the host_name is counted more than once, then it shows up in both CTEs */
)
SELECT
da.ip_address
,da.host_name
,das.vendor
,das.name as software_name
,das.family
,das.version
FROM
dim_asset da
JOIN
dim_asset_software das USING (asset_id)
JOIN
final_asset_list fal ON da.host_name = fal.host_name /* Ultimately all the work above is just used to filter the original query */
WHERE
(das.name ilike '%Rapid7 Insight Agent%'
OR das.name ilike '%CrowdStrike%')
AND da.sites = 'Rapid7 Insight Agents'
So technically speaking, @bobsledtedd was sorta right the first time to the extent that it would give you the assets that have either of those two installed and then within the csv you could filter to see which assets duplicate.
However, the columns youâre selecting in the way that you are, kind of make it impossible to display that data. Iâm assuming you want one line per asset, so keeping those columns for vendor, software_name, family, and version are going to break that query.
@bobsledtedd was even closer on the second query although it was for the data warehouse. I would just create two temporary tables each with a where statement of the one software youâre checking for and then simply add the host info. Then you can just join the two tables together. The assets that overlap will be displayed so you will know which assets have both software (one line per asset).
Otherwise if you want to keep the software info then instead of joining the two tables you could just do UNION and then sort by ip_address or host_name. This would give you the same result but two lines for each asset displaying the software details.
I actually played with this a little more and it wasnât as simple as I initially thought. The having clause that @bobsledtedd put in his query can run into issues if you have multiple lines for the software. For example depending on how the agent was found you may have two lines.
This is the closest I could get on a whim to what I think youâre looking to see. You SHOULD only see one line per asset with both the cs and r7 info in that line. However if you have multiple lines for any given software you may have multiple lines per software. If for example you had two versions of each software in the database you could have up to 4 lines per asset. Thatâs going to come back on the data in your specific database though.
with r7 as (
SELECT das.asset_id, ds.vendor, ds.name, ds.family, ds.version
FROM dim_asset_software das
JOIN dim_software ds using (software_id)
where ds.name ilike '%Rapid7 Insight Agent%'
),
cs as (
SELECT das.asset_id, ds.vendor, ds.name, ds.family, ds.version
FROM dim_asset_software das
JOIN dim_software ds using (software_id)
where ds.name ilike '%CrowdStrike%'
)
select da.ip_address, da.host_name, r7.vendor, r7.name as r7_software_name, r7.family, r7.version, cs.vendor, cs.name as cs_software_name, cs.family, cs.version
from dim_asset da
join r7 using(asset_id)
join cs using(asset_id)
Same thing happened to me when I started working through it, ha.
Can you provide more context around this as I donât follow? When I check your query against mine, we are getting the same results with yours having a few less lines as you are putting the r7 and CS details on the same line.
However, Iâm curious what might cause it to break.
Maybe this is it what you are referencing?
-
You can have multiple host_names in dim_asset, each with a different asset_id. The reason for this is the asset belongs to different âsitesâ. Each asset will be tagged with the âRapid7 Insight Agentsâ site and potentially anotherâŚFor example depending on how the agent was found
-
In my envrionment I am seeing the results of this from both queries, which presents itself as seeing two entries for the R7 agent on the same host_name. Again because there are multiple asset_ids for the same host.
Solution: My initial thought was to add [WHERE da.sites = âRapid7 Insight Agentsâ] to both queries, which does technically fix this issue, however I noticed in some examples that assets assigned to sites = âRapid7 Insight Agentsâ are reporting an older version of the agent than the entry where the sites = âExampleSite,Rapid7 Insight Agentsâ.
âŚmight need to dig into that more
If this was a big concern, you can get fancy with a ROW_NUMBER() and order by agent version to grab the record you are interested in (the correct one).
So what I was referring to wasnât that there were separate asset_ids but that the Rapid7 Insight Agent was found in multiple ways. Typically speaking Iâve seen this more with Windows than Linux but here are the results Iâm referring to

Interesting. Iâll look to see if there are cases of this in my envrionment. I was seeing duplicated agents with your query and mine, but it was being caused by two entries in dim_asset for the same host_name (different asset_ids).
Thanks for the information John and helping me with solving this problem.
So that would definitely be another scenario that could duplicate lines given that my query is linking on asset_id. So like I said, the results are going to be as accurate as the data in your database.