Skip to content

gh-95813: Improve HTMLParser from the view of inheritance #95874

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 18, 2022

Conversation

corona10
Copy link
Member

@corona10 corona10 commented Aug 11, 2022

@ezio-melotti
Copy link
Member

@corona10, thanks for the PR!
I think we should add a test that fails before your changes and passes afterwards.

@ezio-melotti ezio-melotti self-assigned this Aug 11, 2022
@corona10
Copy link
Member Author

corona10 commented Aug 11, 2022

I think we should add a test that fails before your changes and passes afterwards.

Updated!

Without patch:

0:00:00 load avg: 4.23 Run tests sequentially
0:00:00 load avg: 4.23 [1/1] test_htmlparser
test test_htmlparser failed -- Traceback (most recent call last):
  File "/Users/user/oss/cpython/Lib/unittest/mock.py", line 1359, in patched
    return func(*newargs, **newkeywargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/oss/cpython/Lib/test/test_htmlparser.py", line 800, in test_init
    self.assertTrue(super_init_method.called)
AssertionError: False is not true

test_htmlparser failed (1 failure)

== Tests result: FAILURE ==

1 test failed:
    test_htmlparser

@corona10
Copy link
Member Author

corona10 commented Aug 11, 2022

@ezio-melotti

diff --git a/Lib/_markupbase.py b/Lib/_markupbase.py
index 3ad7e27996..f8b2b901b9 100644
--- a/Lib/_markupbase.py
+++ b/Lib/_markupbase.py
@@ -7,6 +7,7 @@

 import re

+from abc import ABCMeta, abstractmethod
 _declname_match = re.compile(r'[a-zA-Z][-_.a-zA-Z0-9]*\s*').match
 _declstringlit_match = re.compile(r'(\'[^\']*\'|"[^"]*")\s*').match
 _commentclose = re.compile(r'--\s*>')
@@ -20,7 +21,7 @@
 del re


-class ParserBase:
+class ParserBase(metaclass=ABCMeta):
     """Parser base class which provides some common support methods used
     by the SGML/HTML and XHTML parsers."""

@@ -391,6 +392,6 @@ def _scan_name(self, i, declstartpos):
                 "expected name token at %r" % rawdata[declstartpos:declstartpos+20]
             )

-    # To be overridden -- handlers for unknown objects
+    @abstractmethod
     def unknown_decl(self, data):
         pass

One more thing, what about update ParserBase as abstract class?

@corona10
Copy link
Member Author

@ezio-melotti gentle ping

@bedevere-bot
Copy link

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

Copy link
Member Author

@corona10 corona10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ezio-melotti
I have made the requested changes; please review again.

@ezio-melotti
Copy link
Member

Thanks for the PR!

One more thing, what about update ParserBase as abstract class?

We could, but I don't think it's necessary.

tiran pushed a commit to tiran/cpython that referenced this pull request Aug 19, 2022
…on#95874)

* pythongh-95813: Improve HTMLParser from the view of inheritance

* pythongh-95813: Add unittest

* Address code review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants