Skip to content

Regular expression in CommandLineParser exhibits catastrophic backtracking #309

@mschwager

Description

@mschwager

Hi there! I've been working on a new Python static analysis tool called Dlint. Most recently I've been working on a rule that searches for regular expression denial-of-service: DUO138. When running this rule against your codebase I found the following violations:

$ python -m flake8 --select=DUO138 splunklib/
splunklib/searchcommands/internals.py:217:21: DUO138 catastrophic "re" usage - denial-of-service possible
splunklib/searchcommands/internals.py:235:22: DUO138 catastrophic "re" usage - denial-of-service possible
splunklib/searchcommands/internals.py:237:19: DUO138 catastrophic "re" usage - denial-of-service possible
splunklib/searchcommands/search_command.py:834:22: DUO138 catastrophic "re" usage - denial-of-service possible

After further investigation, it appears the violation in internals.py at line 235 is a true positive and the rest are false positives. If we dig in further:

class CommandLineParser(object):
    ...
    _fieldnames_re = re.compile(r"""("(?:\\.|""|[^"])+"|(?:\\.|[^\s"])+)""")

The violation occurs due to (?:\\.|""|[^"])+. This is due to mutually inclusive alternation within a quantifier. Since \\. and [^"] have character overlap. We can confirm the bug with a specially crafted string and the following code:

from splunklib.searchcommands import internals
internals.CommandLineParser._fieldnames_re.search('"' + r'\\.' * 64 + "'")
...Spins...

To fix the issue we should avoid the character overlap between \\. and [^"]. One way to accomplish this is with the following expression:

_fieldnames_re = re.compile(r"""("(?:\\.|""|[^"\\])+"|(?:\\.|[^\s"])+)""")

Note that the other false positives found contain similar expressions, and despite not being vulnerable to this issue, it may be worth updating them as well. This will avoid them becoming vulnerable in the future.

If you'd like to learn more about ReDoS and catastrophic backtracking I recommend checking out this blog post. I hope this helps, let me know if you have any questions!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions