Have CAPE adopt MACO format #1037
Amazing work, thank you @cccs-rs. @kevoreilly it looks good to me; do you want to take a look or merge directly?
Thanks - looks like I should test these before merging.
For sure! The one thing I noticed is that the output is more nested under MACO, which I believe wasn't the case for CAPE, so maybe flattening is required?
I hope to test this over the weekend and merge if there are no problems.
Thank you, guys.
I reverted this due to undesirable changes in the representation of the parser output, for example QakBot: which should appear instead as: I am happy to work on this myself; I will just need to test each parser individually to ensure the new output is still acceptable in the web UI as well as in exported form.
What do you think about transforming the data back to a flattened state for presentation in the UI? I can amend my PR to include that change specifically for CAPE in This at least keeps the presentation in its original state while adopting the new format for the parsers.
@kevoreilly @doomedraven thoughts?
Sorry for the delay in my reply - I have been pondering this, and I think there is both an aesthetic and a technical point to be made here. I think QakBot serves as a good example: in the malware, the IP address and port are stored together. I think malware reverse engineers and other CAPE users will expect to see the config appear in the CAPE UI in the same form as in the malware, in what I would call 'raw' form. And from a purely aesthetic point of view, multiple repetitions of a label (e.g. "server_port") look bad. My feeling is that converting the config from the 'raw' malware form into MACO and then back into some third form for display in CAPE is not optimal. Ultimately, if the config can be represented concisely in the same way that it appears in the malware, then perhaps it does not matter, but I can't help feeling it might be better to do things the other way around: parse the data from the malware into 'raw' form, which can be represented as it is currently, then transform it into MACO for export/API and optional additional display in the web UI?
The problem I noticed with creating a translation layer from CAPE to MACO, let's say for an export API, is that there isn't a standard for the CAPE output (at least not one that I noticed). So it would be hard to map from raw to MACO; however, going from MACO to a flatter version of MACO is doable and should play well with the UI as it currently stands.
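A minimal sketch of what "MACO to a flatter version of MACO" could look like for connection data, assuming MACO-style entries with `server_ip` and `server_port` keys. The `tcp` key, the `address` label and the `flatten_connections` helper are assumptions for illustration, not code from this PR:

```python
def flatten_connections(maco_config, key="tcp", label="address"):
    """Collapse nested connection dicts into 'ip:port' strings for display.

    Hypothetical helper: the key names are assumptions modelled on the
    'server_port' label mentioned in this thread.
    """
    # Copy everything except the nested connection list.
    flat = {k: v for k, v in maco_config.items() if k != key}
    addresses = []
    for conn in maco_config.get(key, []):
        ip = conn.get("server_ip")
        port = conn.get("server_port")
        if ip is None:
            continue  # nothing sensible to display without an address
        addresses.append(f"{ip}:{port}" if port is not None else str(ip))
    if addresses:
        flat[label] = addresses
    return flat


# Placeholder values from the documentation IP range, not real C2 data.
example = {
    "family": "QakBot",
    "tcp": [
        {"server_ip": "192.0.2.10", "server_port": 443},
        {"server_ip": "192.0.2.11", "server_port": 995},
    ],
}
flattened = flatten_connections(example)
```

This keeps a single label with a list of combined values, which avoids repeating the same label for every connection in the UI.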
Well, I guess the 'standard' output is defined by the code in the CAPE processing module. The design is a list of dicts at the top level representing families. Each dict has only one key, the family name, whose value is another list of dicts holding config items and values. Then more nested dicts/lists as needed, basically. If you can map from this form to MACO, that could be a nice way to handle this. For example, here is a QakBot config in this form:
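The structure described above can be sketched as follows. The values are placeholders (documentation IP range, made-up item names), not an actual QakBot config, and `iter_config_items` is a hypothetical helper showing how the nesting might be walked:

```python
# Top level: a list of dicts, one per family.
# Each family dict has exactly one key (the family name) whose value is
# a list of single-item dicts mapping a config item name to its value.
raw_results = [
    {
        "QakBot": [
            {"address": ["192.0.2.10:443", "192.0.2.11:995"]},  # placeholder
            {"Campaign ID": "example"},  # placeholder item name and value
        ]
    }
]


def iter_config_items(results):
    """Yield (family, item_name, value) tuples from the nested structure."""
    for family_entry in results:
        # Only one key is expected per family dict.
        family, items = next(iter(family_entry.items()))
        for item in items:
            for name, value in item.items():
                yield family, name, value


items = list(iter_config_items(raw_results))
```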
Right, which is doable if, let's say, all parsers agreed on using the same set of fields rather than each parser defining fields at will (that is the aspect that makes mapping difficult, because it's unpredictable), even though they conform to the same format that the rest of the system accepts. An example of what I mean is Azorult using So for me, in that particular scenario, it becomes a guessing game: does 'address' refer to C2 URLs, IPs, domains, etc., or is there another usage associated with those connections, like downloading/uploading/etc.? So going from a structured output to a less structured one is easier than the reverse, but I do understand your point that CAPE users would like to see the data in its 'raw' form. As a possible solution, what if each parser had a function to convert from raw to MACO? From your perspective, you would run the parsers as-is and render the output as you originally did. In our case, we would run your parser, feed the raw output through the parser-specified conversion, and then take that output.
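A minimal sketch of the per-parser conversion idea, under stated assumptions: `extract_config` stands in for an existing parser entry point, `convert_to_maco` is a hypothetical hook name, and the output keys (`family`, `tcp`, `server_ip`, `server_port`, `usage`) are loosely modelled on MACO and should be checked against the actual MACO model:

```python
def extract_config(data: bytes) -> dict:
    """Stand-in for an existing parser entry point returning 'raw' output."""
    # Placeholder values from the documentation IP range.
    return {"address": ["192.0.2.10:443", "192.0.2.11:995"]}


def convert_to_maco(raw: dict) -> dict:
    """Hypothetical per-parser hook mapping raw output to a MACO-like dict.

    The parser author knows what 'address' means for this family, so the
    mapping is explicit here rather than guessed by a generic translator.
    """
    maco = {"family": "ExampleFamily", "tcp": []}
    for entry in raw.get("address", []):
        ip, sep, port = entry.rpartition(":")
        if not sep:
            # No port present: keep the whole string as the address.
            maco["tcp"].append({"server_ip": entry, "usage": "c2"})
            continue
        maco["tcp"].append(
            {"server_ip": ip, "server_port": int(port), "usage": "c2"}
        )
    return maco


raw = extract_config(b"")       # what CAPE would render as-is
structured = convert_to_maco(raw)  # what a MACO consumer would take
```

With this split, the raw output stays untouched for the web UI, and only consumers that want the structured form call the conversion.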
Yes, I really like this idea - there are already multiple 'entry points' in some parsers, where CAPE calls
Converts existing CAPE parsers to output according to MACO's output model.
This is just an initial run, made with limited knowledge of what the parsers are meant to extract and based on previous results from capesandbox.com.
Some refinement may be needed on both the model's and the parsers' end to make sure the result is beneficial for all!