Store/retrieve JSON

dreadpir8robots · April 26, 2022, 2:25pm

How are people handling storage/retrieval of JSON within Insight Connect? I feel as though I’m missing something obvious here.

Requirement

Let’s say I want to build a workflow which runs once a day:

retrieves a large-ish array of JSON
does some processing
raises a Jira for a human to action
stores the processed JSON somewhere persistent/reliable, e.g. in case it’s required for an audit

I’d also want to be able to search back through the stored JSON using other Insight Connect workflows.

Over time, I’d want to store thousands of documents/objects.

Global Artifacts not suitable

Global Artifacts are limited to 1000 objects before they start dropping old entries, so they can be ruled out straight away for this use case.

Database plugins for Insight Connect

My first thought here was MongoDB but I can’t find any sign of a MongoDB plugin for Insight Connect, which surprised me a bit.

These are the database plugins I’ve been able to find by digging through the Extensions library:

Redis. There is a Redis plugin, but it only seems to support name/value pairs. Although the RedisJSON module adds JSON storage to Redis itself, I don’t think the Insight Connect Redis plugin is suitable for storage/retrieval of arbitrary JSON.
DynamoDB. Maybe a contender, although I’m hoping not to be locked in to AWS.
SQL. Useful, but for relational databases rather than NoSQL.
PrestoDB. Same as the SQL extension.

Storage plugin/roadmap

I’ve seen the thread Storing/managing JSON object by @ilyaaz_noerkhan, which asked a similar question back in October 2020. I’d be reluctant to commit to using the Storage plugin if it might arbitrarily wipe its data, and in any case I can’t see the Storage plugin in the Extension Library - has it been removed/superseded by Global Artifacts? @joey_mcadams suggests that the roadmap contained items for 2021 Q1 - was this Global Artifacts?

brandon_mcclure · April 27, 2022, 1:26pm

I am currently using Splunk because we have it. I use a combination of Splunk search queries, to store data in an index, and the Splunk KV store. You can store the JSON as a string, then use Splunk parsing to normalize the input to CIM (Common Information Model) for queries.
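
The "store the JSON as a string" step is not specific to Splunk and applies to any store that only accepts text values. A minimal sketch in Python follows, assuming nothing about the destination. The function names and the sample record are hypothetical, chosen only for illustration.

```python
import json


def to_stored_string(document):
    """Serialize a JSON-compatible object to a compact string."""
    # sort_keys makes the output stable for the same content, which helps
    # when comparing or de-duplicating stored documents later.
    return json.dumps(document, sort_keys=True, separators=(",", ":"))


def from_stored_string(text):
    """Parse a stored string back into Python objects."""
    return json.loads(text)


# Hypothetical record, for illustration only.
record = {"ticket": "SEC-123", "findings": [{"host": "srv01", "severity": 7}]}
stored = to_stored_string(record)
restored = from_stored_string(stored)
```

The trade-off is that the store sees an opaque string, so any searching has to happen either after parsing or with whatever text/JSON parsing the store itself offers.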

dreadpir8robots · April 27, 2022, 2:36pm

Thank you @brandon_mcclure. I’ve heard great things about Splunk - but we don’t have it.

holly_wilsey · May 9, 2022, 7:16pm

Another possible option here could be the AWS S3 plugin, where you can create and retrieve objects from buckets there. AWS has some more info on the requirements and permissions needed for those actions.

https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html
https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html
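
For reference, here is a rough sketch of what those two API calls look like from Python. It is not how the InsightConnect plugin itself is implemented. It assumes a boto3-style S3 client is passed in (for example one created with `boto3.client("s3")`), and the bucket name, key layout and helper names are all hypothetical.

```python
import json
from datetime import date


def build_key(prefix, run_date, name):
    """Build a date-partitioned key, e.g. audit/2022/05/09/results.json."""
    return f"{prefix}/{run_date:%Y/%m/%d}/{name}.json"


def store_json(s3_client, bucket, key, document):
    """Upload one JSON document with PutObject."""
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(document).encode("utf-8"),
        ContentType="application/json",
    )


def list_keys(s3_client, bucket, prefix):
    """List keys under a prefix with ListObjectsV2.

    A single call returns at most 1,000 keys, so follow the
    continuation token until the listing is no longer truncated.
    """
    keys, token = [], None
    while True:
        kwargs = {"Bucket": bucket, "Prefix": prefix}
        if token:
            kwargs["ContinuationToken"] = token
        page = s3_client.list_objects_v2(**kwargs)
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
        if not page.get("IsTruncated"):
            return keys
        token = page.get("NextContinuationToken")
```

Putting the date in the key means "everything from May 2022" becomes a prefix listing (`audit/2022/05/`). Note that S3 can list and fetch objects but does not query inside them, so searching the JSON content would still need a separate step.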

We’re also working on a new Azure Blob Storage plugin, which is extremely flexible in the type of data that it allows you to store/retrieve. It’s not complete yet, but we can definitely post about it once it’s been released.

dreadpir8robots · June 1, 2022, 3:24pm

I’m pursuing DynamoDB for NoSQL storage. It wouldn’t be my first choice for our circumstances - ideally we’d be able to host MongoDB on-prem and not be vendor-locked - but I’m hopeful that it will solve the big problem, i.e. not having a useful way to store JSON acquired as part of an InsightConnect workflow in a way which is easy to interrogate after the fact.
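
As a rough illustration of what storing workflow JSON in DynamoDB can involve, here is a minimal Python sketch. It assumes a boto3-style Table resource is passed in, and the attribute names `workflow_id` and `run_date` are a hypothetical key schema rather than anything InsightConnect or the DynamoDB plugin prescribes. Two things are worth knowing up front: boto3 rejects Python floats and expects `Decimal`, and a single DynamoDB item is limited to 400 KB, so a large array may need to be stored as one item per element.

```python
import json
from decimal import Decimal


def to_dynamodb_item(workflow_id, run_date, raw_json):
    """Wrap a JSON string in an item keyed by workflow and run date."""
    # Parse floats as Decimal because boto3 does not accept float values.
    document = json.loads(raw_json, parse_float=Decimal)
    return {
        "workflow_id": workflow_id,  # hypothetical partition key
        "run_date": run_date,        # hypothetical sort key (ISO 8601 string)
        "document": document,
    }


def store_item(table, item):
    """Write the item; `table` should behave like a boto3 Table resource."""
    table.put_item(Item=item)
```

With a key schema like this, "all runs of one workflow in June 2022" becomes a key-condition query on the sort key, which is the kind of after-the-fact interrogation described above.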