Infinite loop or exception when trying to parse empty lines #593

@Kylotan

I have a grammar where I want to be able to parse an empty line as a "null statement", like Python's pass. It's probably not strictly necessary for my language but it will be helpful during development.

Problem is, the two approaches I have tried don't work.

The code below ends up in an infinite loop:

import unittest

class PyParsingTests(unittest.TestCase):
    def test_compound_statements(self):
        import pyparsing as pp
        # No warning is emitted
        pp.enable_all_warnings()
        # doesn't matter whether I remove newline from the set of skippable whitespace characters, or not
        # pp.ParserElement.set_default_whitespace_chars(' \t')

        empty_line = pp.rest_of_line
        null_statement = empty_line
        # Doesn't matter which of the two formulations below I use - same result in each case
        #compound_statement = pp.OneOrMore(null_statement)
        compound_statement = null_statement + null_statement[...]

        # I know this is deprecated, but using here just in case. No RecursiveGrammarException is raised
        #compound_statement.validate()

        # Expected result here - parses 3 'empty_line' elements.
        # Observed result - seems to loop forever
        compound_statement.parse_string("\n\n\n", parse_all=True)

        # Same happens even without the parse_all
        #compound_statement.parse_string("\n\n\n")

        # And with whitespace in each line
        #compound_statement.parse_string(" \n \n \n")


if __name__ == '__main__':
    unittest.main()

My guess is that this happens because pp.rest_of_line does not consume the end-of-line character, so the parser never makes progress. That makes sense, but I can't imagine the infinite loop is intended.
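To illustrate the guess about lack of progress, here is a minimal sketch that uses only Python's `re` module rather than pyparsing itself. The `.*` pattern is an assumed stand-in for what a "rest of line" matcher does, and `repeat_match` and its `max_iterations` cap are hypothetical names for this demo. It shows that a matcher which can succeed on an empty string at a newline never advances, while one that also consumes the newline does.

```python
import re

# Assumed stand-in for a "rest of line" matcher: '.' does not match '\n'
# by default, so at a newline this pattern matches the empty string.
REST_OF_LINE = re.compile(r".*")

# Variant that also consumes the trailing newline, so every match advances.
LINE_WITH_NEWLINE = re.compile(r"[^\n]*\n")


def repeat_match(pattern, text, max_iterations=5):
    """Apply pattern repeatedly from the last end position.

    Returns the (start, end) span of each successful match. The
    max_iterations cap exists only so this demo terminates; a naive
    "one or more" loop without it would spin forever on zero-width matches.
    """
    pos = 0
    spans = []
    for _ in range(max_iterations):
        m = pattern.match(text, pos)
        if m is None:
            break
        spans.append((pos, m.end()))
        pos = m.end()
    return spans


stuck = repeat_match(REST_OF_LINE, "\n\n\n")
advancing = repeat_match(LINE_WITH_NEWLINE, "\n\n\n")
print(stuck)      # every span is (0, 0): the position never moves
print(advancing)  # [(0, 1), (1, 2), (2, 3)]
```

In this model the first matcher hits the iteration cap without ever leaving position 0, whereas the second stops on its own after three lines because nothing is left to match.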

If I amend the empty_line definition to this: empty_line = pp.rest_of_line + "\n", then I get the following exceptions:

Error
Traceback (most recent call last):
  File "E:\Code\whatever\.venv\Lib\site-packages\pyparsing\core.py", line 846, in _parseNoCache
    loc, tokens = self.parseImpl(instring, pre_loc, do_actions)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "E:\Code\whatever\.venv\Lib\site-packages\pyparsing\core.py", line 2492, in parseImpl
    if instring[loc] == self.firstMatchChar:
       ~~~~~~~~^^^^^
IndexError: string index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\Code\whatever\tests\test_random.py", line 22, in test_compound_statements
    compound_statement.parse_string("\n\n\n", parse_all=True)
  File "E:\Code\whatever\.venv\Lib\site-packages\pyparsing\core.py", line 1212, in parse_string
    raise exc.with_traceback(None)
pyparsing.exceptions.ParseException: Expected '\n', found end of text  (at char 3), (line:4, col:1)

I am not sure how I would avoid the second exception, which seems to be complaining that it can't parse a 4th line even though the grammar only asks for "one or more". Separately, the first exception being unhandled before the second is thrown just looks like a bug.
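On the shape of the traceback: the line "During handling of the above exception, another exception occurred" is what Python prints when a new exception is raised inside an `except` block, so the IndexError may have been caught and converted rather than left unhandled. The sketch below reproduces that pattern in plain Python. `ParseError` and `match_char` are hypothetical names for this illustration and are not pyparsing's code.

```python
class ParseError(Exception):
    """Hypothetical stand-in for a parser's own exception type."""


def match_char(text, loc, expected):
    """Return loc + 1 if text[loc] equals expected, else raise ParseError."""
    try:
        if text[loc] == expected:
            return loc + 1
    except IndexError:
        # Raising inside the except block makes Python attach the
        # IndexError to the new exception as its __context__.
        raise ParseError(
            f"Expected {expected!r}, found end of text (at char {loc})"
        )
    raise ParseError(f"Expected {expected!r} (at char {loc})")


try:
    match_char("\n\n\n", 3, "\n")  # index 3 is past the end of the string
except ParseError as exc:
    caught = exc  # keep a reference; 'exc' is cleared after the block

print(type(caught.__context__).__name__)  # IndexError
print(caught.__cause__)                   # None (no explicit 'raise ... from')
```

If a library wanted to hide the inner IndexError from the printed traceback, `raise ParseError(...) from None` inside the handler would suppress the chained context.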
