Skip to content

Conversation

mirober
Copy link
Contributor

@mirober mirober commented Jan 15, 2022

I originally tried to implement the tests using the @td.check_file_leaks decorator but couldn't get this to cause a failing test. e.g. the following test:

import pandas.util._test_decorators as td

@td.check_file_leaks
def test_file_leaks(datapath):
    test2csv = datapath("io", "parser", "data", "test2.csv")
    h = open(test2csv, "r")

Run with:

python -W default -m pytest pandas/tests/io/parser/ -k test_file_leaks

Passes with the following output on Ubuntu 20 - note that the warning indicates that there is definitely a leak, but the decorator hasn't caught it:

=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.9.9, pytest-6.2.5, py-1.11.0, pluggy-1.0.0
rootdir: /home/ubuntu/pandas, configfile: pyproject.toml
plugins: hypothesis-6.35.0
collected 4651 items / 4650 deselected / 1 selected

pandas/tests/io/parser/test_parse_dates.py .

================================================================================================ warnings summary =================================================================================================
pandas/tests/io/parser/test_parse_dates.py::test_file_leaks
  /home/ubuntu/pandas/.venv/lib/python3.9/site-packages/_pytest/python.py:183: ResourceWarning: unclosed file <_io.TextIOWrapper name='/home/ubuntu/pandas/pandas/tests/io/parser/data/test2.csv' mode='r' encoding='UTF-8'>
    result = testfunction(**testargs)

-- Docs: https://docs.pytest.org/en/stable/warnings.html
------------------------------------------------------------------------------ generated xml file: /home/ubuntu/pandas/test-data.xml ------------------------------------------------------------------------------
============================================================================================== slowest 30 durations ===============================================================================================

(3 durations < 0.005s hidden.  Use -vv to show these durations.)
================================================================================== 1 passed, 4650 deselected, 1 warning in 0.97s ==================================================================================

@pep8speaks
Copy link

pep8speaks commented Jan 15, 2022

Hello @mrob95! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-01-15 21:27:57 UTC

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks a lot more than a minimal change

pls be very targeted

@twoertwein
Copy link
Member

this looks a lot more than a minimal change

Mostly whitespace changes, can skip them on github:
image

@mirober
Copy link
Contributor Author

mirober commented Jan 16, 2022

this looks a lot more than a minimal change

pls be very targeted

git made a bit of a meal of the diff but it is almost entirely indentation changes. Having one try/except around everything following _open_handles seems less error-prone than trying to wrap every fallible call individually.

#45389 looks like a cleaner fix anyway, so I'm happy for this to be closed.

@jreback jreback added the IO CSV read_csv, to_csv label Jan 16, 2022
@jreback
Copy link
Contributor

jreback commented Jan 16, 2022

thanks @mrob95 merged #45389

@jreback jreback closed this Jan 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: read_csv can leak file handles if validation fails
4 participants