tokenize.TokenError encountered during fuzzing #2746

Closed
@gabe-sherman

Description

While fuzzing astroid's harness in OSS-Fuzz, I hit the error below. astroid appears to delegate tokenization to Python's standard tokenize module, so this may not be a bug in astroid itself, but I am reporting it just in case. If this is expected behavior, adding tokenize.TokenError to the try/except block in the fuzzing harness would keep it from being reported as a crash in future runs. Thanks in advance!
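A minimal sketch of what the suggested harness change could look like. The actual OSS-Fuzz harness is not shown in this report, so `run_one_input`, `EXPECTED_ERRORS`, and the `parse` parameter are hypothetical names, and the set of exceptions the real harness already catches is an assumption.

```python
import tokenize

# Exception types treated as "expected" for malformed input.
# Assumption: the real harness may catch a different set; only
# tokenize.TokenError is the addition proposed in this report.
EXPECTED_ERRORS = (SyntaxError, UnicodeDecodeError, tokenize.TokenError)


def run_one_input(data: bytes, parse) -> bool:
    """Feed one fuzz input to `parse`.

    Returns True if the input was parsed, False if it was rejected with one
    of the expected errors. Any other exception propagates, so the fuzzer
    still reports it as a finding.
    """
    try:
        parse(data.decode("utf-8"))
    except EXPECTED_ERRORS:
        return False
    return True
```

With astroid installed, the call would presumably look like `run_one_input(raw_bytes, astroid.builder.parse)`. Passing the parser in as an argument is only a convenience for this sketch, so that it can be exercised without astroid.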

Error

tokenize.TokenError: ('EOF in multi-line string', (6, 4)), raised from _get_position_info at line 134 of rebuilder.py.

Reproducer

import astroid
import sys

d = open(sys.argv[1], "rb").read().decode("utf-8")
astroid.builder.parse(d)

POC File

https://github.com/FuturesLab/POC/blob/main/astroid/poc-01

Traceback Report

Traceback (most recent call last):
  File "harness.py", line 5, in <module>
    astroid.builder.parse(d)
  File "lib/python3.9/site-packages/astroid/builder.py", line 300, in parse
    return builder.string_build(code, modname=module_name, path=path)
  File "lib/python3.9/site-packages/astroid/builder.py", line 151, in string_build
    module, builder = self._data_build(data, modname, path)
  File "lib/python3.9/site-packages/astroid/builder.py", line 206, in _data_build
    module = builder.visit_module(node, modname, node_file, package)
  File "lib/python3.9/site-packages/astroid/rebuilder.py", line 173, in visit_module
    [self.visit(child, newnode) for child in node.body],
  File "lib/python3.9/site-packages/astroid/rebuilder.py", line 173, in <listcomp>
    [self.visit(child, newnode) for child in node.body],
  File "lib/python3.9/site-packages/astroid/rebuilder.py", line 488, in visit
    return visit_method(node, parent)
  File "lib/python3.9/site-packages/astroid/rebuilder.py", line 860, in visit_classdef
    position=self._get_position_info(node, newnode),
  File "lib/python3.9/site-packages/astroid/rebuilder.py", line 134, in _get_position_info
    for t in generate_tokens(StringIO(data).readline):
  File "/usr/lib/python3.9/tokenize.py", line 461, in _tokenize
    raise TokenError("EOF in multi-line string", strstart)
tokenize.TokenError: ('EOF in multi-line string', (6, 4))
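The last two frames of the traceback show the error coming from the standard library's `generate_tokens`, not from astroid's own code. The same class of error can be reproduced with the standard library alone, as in the sketch below. The helper name `first_token_error` and the sample source are made up for illustration; this is not the POC file, and it does not reproduce what astroid passes to the tokenizer.

```python
import tokenize
from io import StringIO


def first_token_error(source: str):
    """Tokenize `source`; return the TokenError raised, or None if there is none."""
    try:
        # Same call shape as the one shown in the traceback.
        for _ in tokenize.generate_tokens(StringIO(source).readline):
            pass
    except tokenize.TokenError as exc:
        return exc
    return None


# A class body whose triple-quoted string is never closed (illustrative input).
snippet = 'class A:\n    """never closed\n'
err = first_token_error(snippet)
print(type(err).__name__, err)
```

Since the exception type belongs to `tokenize`, a caller that only catches `SyntaxError` will not intercept it.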
