InsightVM crashing regularly

I’m having trouble keeping our InsightVM virtual machine running. It’s virtualized on Ubuntu 22.04.5 on an ESXi hypervisor, and the VM has 32 GB of RAM. I’ve scaled this up from 24 GB, but that hasn’t seemed to help. I’ve had a case open with our service provider for over a month, but we aren’t making any progress toward stable InsightVM performance. I don’t see other threads describing this type of problem, or memory issues in general. I’m interested in any suggestions we might try, or alternatively, what OS and memory allocation is working for others.

Thank you

Scott


From my experience running IVM in an environment of around 100k assets, we did have performance issues that forced us to scale the console VM up to an enormous size. That helped, but it didn’t solve all the problems. We discovered, and Rapid7 support confirmed, that heavy use of dynamic tags and dynamic asset groups makes the console chase its tail constantly recalculating those tags and AGs, which can cause instability and poor overall performance. We reached a point where we had given the console so much RAM and CPU that it couldn’t actually consume it; there seems to be an inherent processing limit within the IVM software.


Thank you. We are scanning fewer than 500 assets at the moment and still experiencing daily crashes of InsightVM. Do you mind sharing which OS you run InsightVM on and what your current memory allocation is?

Hey! Have you tried tailing the console logs to see why it keeps crashing? You can SSH into the machine and use this command: tail -f /opt/rapid7/nexpose/nsc/logs/nsc.log
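
If the live tail is too noisy, a rough filter along these lines (plain grep, adjust the patterns to whatever you’re chasing) will pull out just the recent errors and warnings:

grep -iE "error|warn|out of memory" /opt/rapid7/nexpose/nsc/logs/nsc.log | tail -n 50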

Maybe I can help you nail down the issue if you want to share some of the error messages you get.

If not, see my suggestions below.

Do you have enhanced logging enabled in any of your scan templates? Check that as well. Create dedicated templates that you only use for troubleshooting, and turn enhanced logging off in the templates you use regularly.

Also, is debug logging enabled on your console? Go to the administration area, find the run command area, and run this command: log list. If any of the levels show DEBUG, that could fill up your disk and crash your system. You can use variations of the following command, changing ‘default-level’ to whichever log area shows DEBUG, to set it back to INFO so it does not fill up your system: log set default-level info
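
So for example, if memory-level were the one showing DEBUG, the same pattern would be:

log set memory-level info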

See my stats below. You should not need nearly as much as I have with less than 500 assets.
Ubuntu 22.04.5
64GB RAM - 2k assets (46% memory used)
Disk space - 1.94TB (17% used)
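
(If you want to compare numbers, running free -h on the console will show how much of the allocated RAM is actually in use.)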

Happy hunting!

Thanks for your suggestions and for sharing your OS / resources config.

I don’t have enhanced logging enabled in any templates.

log list returns:

Name            Value
access-level    INFO
audit-level     INFO
auth-level      INFO
default-level   INFO
ingress-level   INFO
memory-level    INFO
saml-level      OFF
update-level    INFO

tail -f /opt/rapid7/nexpose/nsc/logs/nsc.log gives

root@mbinsight01:/home/r7admin# tail -f /opt/rapid7/nexpose/nsc/logs/nsc.log
2025-07-07T22:14:23 [INFO] [Thread: Scheduled Execution Thread: extensible-ingress-request-timer] Saving the timing metrics to the database.

When InsightVM goes down for us, it takes out the entire server. The server will not respond to ping, SSH connections, or the web console. It’s completely hung. I wind up power cycling the VM and it comes back up for a while, sometimes up to 36 hours but sometimes less than 10 minutes. Running a scan can take it out immediately.

We have support requests open with our MSP and they have been communicating with Rapid7. We’ve provided multiple logs, but the only finding so far is that the OOM killer keeps targeting the Rapid7 process, so it looks like some kind of memory management issue. I don’t think we have anything special going on, though, and there’s nothing else running on the VM other than the OS and Rapid7.
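
For anyone checking their own console for this, the OOM kills show up in the kernel log; something like dmesg -T | grep -i "out of memory" or journalctl -k | grep -i oom will list which processes the kernel killed and when.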

We identified a compatibility issue between open-vm-tools and our hypervisor and believe this was the source of the problem. We’ve since migrated the virtual machine to AWS and it’s running without issues now. Thank you for your response; it was helpful in validating that our allocated resources should have been sufficient.
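
For anyone hitting something similar: vmware-toolbox-cmd -v prints the running open-vm-tools version and apt policy open-vm-tools shows the installed package version, which you can compare against VMware’s compatibility notes for your ESXi release to spot a mismatch like ours.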