Deleting Duplicate Assets in Nexpose

Does anyone have a decent way of filtering for and deleting duplicate assets by IP? Right now, I’m exporting my full list of assets, opening in Excel, and using conditional formatting to find the duplicate IP entries. That’s an OK way of doing it, but i haven’t found a great way to take that list and mass delete only one of the two (or three) duplicates within Nexpose. I could enter those IPs into the query builder, but how would i go about making that list deletable?

Thanks for any help.

@alan_sarz A couple quick questions so I can get you going in the right direction. First, how are you identifying which assets to be deleted out of the duplicates by IP? Second, are you trying to do this with an InsightConnect workflow?

We do have the ability to delete assets by ID with the following API endpoint: https://help.rapid7.com/insightvm/en-us/api/index.html#operation/deleteAsset. Making sure we have the list of IDs programmatically (or from your Excel file if you do manual filtering) would be a good place to start.

I am searching for an IP, and then deleting all entries except for the one with the most current scan date.

I am not using insight connect. Is that something only for InsightVM? We are using onsite Nexpose, not any cloud-based solution.

I would make a dynamic asset group, and use it to limit scans to not most recent (greater than last seven days or whatever), then simply review and delete. However, if the rule is that it MUST be a duplicate as well… that’ll be harder… For that type of stuff we use the InsightVM API and have it run a script overnight. Specifically, we do this with short-lived instances that use the InsightVM so they don’t keep cluttering our asset counts, but really can use for any type of filtering you can do in code.

1 Like

I use the API to do the same thing for our systems with hostnames. I pull all of the systems that have a hostname, find all of the duplicates, and then delete the duplicate that was missing a MAC address, wrong OS, etc… There may be 1 or 2 that I have to do manually but easier than doing 20+ individually through Excel and the GUI. I use Powershell for the code but it will work with any language. Hope that helps

1 Like

I ran across an article a while ago on a good way to do this in Nexpose.
I think this might be it https://blog.rapid7.com/2017/07/11/cleaning-house-maintaining-an-accurate-and-relevant-vulnerability-management-program/
The big takeaway is making sure you have asset linking on. It also talks about removing stale assets and “Ghost” assets (Asset risk score is 0 & OS is empty & Asset name is empty) by creating asset groups to tag these. This won’t get all the duplicates, but it will get quite a few making the rest of the cleanup easier because I think you’ll find a lot of the duplicates will fall under the Ghost category.

yes thank you. I did read that before and I’ve added ghost asset criteria to a deletion asset group. Without moving everything to InsightVM (which i’m not allowed to do), I’m afraid it’ll be a largely manual process for me.

Here’s the easiest way I’ve found to do this:

*export entire asset group to csv
*Open excel and highlight, filter, and sort duplicates with the IP column
*create a user made tag and import with file. Might have to copy and paste the excel list into notepad and upload the txt file
*create an asset group and make the only criteria assets with the tag you created earlier

This way, you have a list of duplicates and you can select and delete whichever ones you want.

Can you use the API with on-premises Nexpose? I dont have licenses for InsightVM.

yes, documentation is at https://NexposeServer.domain.com:3780/api/3/html
It supports basic auth. I’ve done Tag management, ad-hoc scans, and software inventory with it and it works well.
The InsightVM Plug-in in InsightConnects works against this. They recently added the ability to limit a scan to one host in a site and as soon as my test box gets moved to its new building I’ll be doing a PR request to add the options to define the other scan parameters.

The InsightVM v3 API is compatible with InsightVM and Nexpose. Public documentation for it can be found here: https://help.rapid7.com/insightvm/en-us/api/index.html. And as Brandon mentioned it is also available from the Nexpose host itself.

Out of this process, it would be possible to automate most of the steps with the API out side of the filtering by IP. If you add any “human” decisions as to which asset by IP you keep versus get rid of then yes this would need to be reviewed; however, if you simply take the latest scan ID (or some other criteria) this might also be possible with the use of the API.

I have had the same issue with duplicate assets. Here are some steps I’ve taken to cleanup.

I use the reports and create a SQL Export report with the SQL query. The query accounts for duplicate records where the same asset is listed multiple times with or without the FQDN. It strips the ‘corp.contoso.com’ from hostnames containing it.

SELECT da.host_name, lower(da.host_name), replace(lower(da.host_name),'.corp.contoso.com','') AS HostName, da.asset_id AS "Asset ID", da.ip_address AS "IP Address", da.mac_address AS "MAC Address", dau.unique_id AS "R7 Agent ID", da.sites AS "Sites", dos.description AS "Operating System", fad.first_discovered AS "First Discovered", fa.scan_finished AS "Last Scan Date", da.last_assessed_for_vulnerabilities AS "Last Assessed", fa.critical_vulnerabilities AS "Critical Vulnerabilities", fa.severe_vulnerabilities AS "Severe Vulnerabilities", fa.moderate_vulnerabilities AS "Moderate Vulnerabilities", fa.vulnerabilities AS "Total Vulnerabilities", fa.malware_kits AS "Malware Kits", fa.exploits AS "Exploits", to_char(round(fa.riskscore::numeric,0),'999G999G999') AS "Risk Score"
FROM dim_asset da 
JOIN fact_asset fa USING (asset_id)
JOIN fact_asset_discovery fad USING (asset_id)
JOIN dim_asset_unique_id dau USING (asset_id)
JOIN dim_operating_system dos USING (operating_system_id)
Where dau.source = 'R7 Agent'
And replace(lower(da.host_name),'.corp.contoso.com','') in (select replace(lower(da.host_name),'.corp.contoso.com','') from dim_asset da group by replace(lower(da.host_name),'.corp.contoso.com','')HAVING count(*) > 1)
ORDER BY replace(lower(da.host_name),'.corp.contoso.com','') ASC, da.asset_id ASC

After running the report, I download the CSV and examine the list. The report is sorted by hostname and has the oldest AssetID listed first. I use the AssetIDs in the report to run through the RestAPI and delete.

Add a list of AssetIDs to a text file called ‘dups.txt’ and execute the python script below with your username, password, and host settings.

from __future__ import print_function
import time
import rapid7vmconsole
from rapid7vmconsole.rest import ApiException
from pprint import pprint

import base64
import logging
import sys

config = rapid7vmconsole.Configuration(name='Rapid7')
config.username = 'nxadmin'
config.password = '**********'
config.host = 'https://insightvm.contoso.com'
config.verify_ssl = False
config.assert_hostname = False
config.proxy = None
config.ssl_ca_cert = None
config.connection_pool_maxsize = None
config.cert_file = None
config.key_file = None
config.safe_chars_for_path_param = ''

# Logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler(sys.stdout)
ch.setLevel(logging.INFO)
logger.addHandler(ch)
config.debug = False


auth = "%s:%s" % (config.username, config.password)
auth = base64.b64encode(auth.encode('ascii')).decode()
client = rapid7vmconsole.ApiClient(configuration=config)
client.default_headers['Authorization'] = "Basic %s" % auth

# create an instance of the API class
api_instance = rapid7vmconsole.AssetApi(client)

with open("dups.txt","r") as a_file:
    for asset_id in a_file:
        try:
            # Asset
            api_response = api_instance.delete_asset(asset_id)
            pprint(api_response)
        except ApiException as e:
            print("Exception when calling AssetApi->delete_asset: %s\n" % e)

I hope this is helpful to those who need it.

2 Likes