Skip to content

feat: improved XMLArgs processing #3363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 29 commits into
base: v3/master
Choose a base branch
from

Conversation

airween
Copy link
Member

@airween airween commented Apr 20, 2025

what

This PR adds a new feature within XML processing. It's same as the #3358 but for v3.

why

See v2 patch.

references

See #3178 and #3358.

@airween airween added the 3.x Related to ModSecurity version 3.x label Apr 20, 2025
@airween airween requested review from theseion and fzipi April 26, 2025 09:40
Copy link
Contributor

@fzipi fzipi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see this being the same patch as v2. Here you implemented the extra ctl:parseXMLintoArgs: it is not in v2. My bad, this is indeed implemented.

Comment on lines +30 to +31
bool ParseXmlIntoArgs::init(std::string *error) {
std::string what(m_parser_payload, 17, m_parser_payload.size() - 17);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

17? 🤦 It looks like the lenght of the SecParseXMLIntoArgs, but alone is so arbitrary.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're definitely right, but I just wanted to follow the conventions (see other files) (even if it's not the prettiest solution). If we want to clean up the code, then we should do that as a sub project.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the project want this? It is worth tracking then (e.g. creating a new issue with the subproject?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the project want this?

Not necessarily. But we can live together with this.

Just a few example:

It is worth tracking then (e.g. creating a new issue with the subproject?)

Sure, it's been like this for almost ten years. Nobody reported, you are the first, so I can't say this has a priority.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds perfect. But (simple) tickets like this might get additional people into the dev side ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But (simple) tickets like this might get additional people into the dev side ;)

Amen! :)

Comment on lines +106 to +114
void MSC_startElement(void *userData,
const xmlChar *name,
const xmlChar *prefix,
const xmlChar *URI,
int nb_namespaces,
const xmlChar **namespaces,
int nb_attributes,
int nb_defaulted,
const xmlChar **attributes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here on this declaration. Is this the new standard?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, could you show a "reference"? What is the "standard"? What is the "old" standard?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature of the function is just in the same line, normally.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I had pointed out in the other PR, there are already several different ways of formatting long function signatures in the code base. This is one of them.

Comment on lines +121 to +124
void *userData,
const xmlChar *name,
const xmlChar* prefix,
const xmlChar* URI) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@airween
Copy link
Member Author

airween commented Apr 26, 2025

I don't see this being the same patch as v2. Here you implemented the extra ctl:parseXMLintoArgs: it is not in v2.

It's there.

airween and others added 16 commits April 27, 2025 20:18
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
airween and others added 2 commits April 27, 2025 20:28
Co-authored-by: Max Leske <[email protected]>
Co-authored-by: Max Leske <[email protected]>
@airween
Copy link
Member Author

airween commented Apr 27, 2025

A few tests were failed after some commit - I have to check the reason.

@airween
Copy link
Member Author

airween commented Apr 28, 2025

I added a few new commits here as at #3358:

  • bf707de - changed the directive SecXMLintoArgs to SecXmlIntoArgs
  • e8dc60e - there is a small change in the node value's parsing: if the node value contains a multi-byte character the SAX calls the function multiple times, so we need to concatenate those sub-strings, not creating a new string every time
  • 89442ed - changed the directives in tests and add a new test with a multibyte character

@airween airween requested review from theseion and fzipi April 28, 2025 20:55
@fzipi
Copy link
Contributor

fzipi commented Apr 29, 2025

Just out of curiosity, have you tried enabling this and sending a big xml file?

@airween
Copy link
Member Author

airween commented Apr 29, 2025

Just out of curiosity, have you tried enabling this and sending a big xml file?

What do you mean "a big xml file"?

Btw the size of the file is just one thing... There are couple of limits in the engine when you're sending an XML payload and want to process it as ARGS, like:

  • SecRequestBodyNoFilesLimit - the default value is 128kB
  • SecArgumentsLimit - default value is 1000

I sent an XML with size > 400kB (I increased the SecRequestBodyNoFilesLimit), then I get:

ModSecurity: XML parsing error: More than 1000 ARGS (GET + XML) [hostname "localhost"] [uri "/post"] [
ModSecurity: Access denied with code 400 (phase 2). Match of "eq 0" against "REQBODY_ERROR" required. [file "/etc/modsecurity/modsecurity.conf"] [line "78"] [id "200002"] [msg "Failed to parse request body."] [data "XML parsing error: More than 1000 ARGS (GET + XML)"] [severity "CRITICAL"] [hostname "localhost"] [uri "/post"]

Actually, the behavior of this function is exactly the same as that used for JSON. There is no difference. You could ask "have you tried enabling this and sending a big JSON file?". But the JSON works as is, and this feature is optional.

If you tell me what you're curious about, I'll try it - with an XML and a JSON too.

@fzipi
Copy link
Contributor

fzipi commented Apr 29, 2025

Just out of curiosity, have you tried enabling this and sending a big xml file?

The sentence starts with Just out of curiosity. So yes, just curious.

@@ -225,8 +225,7 @@ class RulesSetProperties {

/**
*
* The ConfigBoolean enumerator consists in mapping the different
* states of the configuration boolean values.
* The ConfigBoolean enumerator the states for configuration boolean values.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* The ConfigBoolean enumerator the states for configuration boolean values.
* The ConfigBoolean enumerator defines the states for configuration boolean values.

Comment on lines 184 to +185
m_secXMLExternalEntity(PropertyNotSetConfigBoolean),
m_secXMLParseXmlIntoArgs(PropertyNotSetConfigXMLParseXmlIntoArgs),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
m_secXMLExternalEntity(PropertyNotSetConfigBoolean),
m_secXMLParseXmlIntoArgs(PropertyNotSetConfigXMLParseXmlIntoArgs),
m_secXmlExternalEntity(PropertyNotSetConfigBoolean),
m_secXmlParseXmlIntoArgs(PropertyNotSetConfigXMLParseXmlIntoArgs),

Comment on lines +239 to +247
* The ConfigXMLParseXmlIntoArgs enumerator defines the states for the configuration
* XMLParseXmlIntoArgs values.
* The default value is PropertyNotSetConfigXMLParseXmlIntoArgs.
*/
enum ConfigXMLParseXmlIntoArgs {
TrueConfigXMLParseXmlIntoArgs,
FalseConfigXMLParseXmlIntoArgs,
OnlyArgsConfigXMLParseXmlIntoArgs,
PropertyNotSetConfigXMLParseXmlIntoArgs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* The ConfigXMLParseXmlIntoArgs enumerator defines the states for the configuration
* XMLParseXmlIntoArgs values.
* The default value is PropertyNotSetConfigXMLParseXmlIntoArgs.
*/
enum ConfigXMLParseXmlIntoArgs {
TrueConfigXMLParseXmlIntoArgs,
FalseConfigXMLParseXmlIntoArgs,
OnlyArgsConfigXMLParseXmlIntoArgs,
PropertyNotSetConfigXMLParseXmlIntoArgs
* The ConfigXmlParseXmlIntoArgs enumerator defines the states for the configuration
* XmlParseXmlIntoArgs values.
* The default value is PropertyNotSetConfigXmlParseXmlIntoArgs.
*/
enum ConfigXmlParseXmlIntoArgs {
TrueConfigXmlParseXmlIntoArgs,
FalseConfigXmlParseXmlIntoArgs,
OnlyArgsConfigXmlParseXmlIntoArgs,
PropertyNotSetConfigXmlParseXmlIntoArgs

/*
* XMLNodes for parsing XML into args
*/
XMLNodes::XMLNodes(Transaction *transaction)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
XMLNodes::XMLNodes(Transaction *transaction)
XmlNodes::XmlNodes(Transaction *transaction)

handler->onCharacters(userData, ch, len);
}
}

XML::XML(Transaction *transaction)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
XML::XML(Transaction *transaction)
Xml::Xml(Transaction *transaction)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.x Related to ModSecurity version 3.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants