@Kaos599 Kaos599 commented Oct 13, 2025

  • Introduced merge_csv_options_with_dialect function to handle merging CSV options with dialect settings.
  • Updated read_to_* functions to utilize the new merging logic, ensuring compatibility with user-defined CSV options.
  • Enhanced write functions to validate headers and merge options, providing warnings for conflicts between user options and dialect settings.
  • Added tests to verify correct handling of CSV dialects and option overrides in both reading and writing scenarios.
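
The merging behaviour described above could look roughly like the following. This is a minimal sketch, not the code under review: the function name comes from this PR, but the signature, the use of plain dictionaries for the dialect, and the rule that user options take precedence are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any


def merge_csv_options_with_dialect(
    csv_options: dict[str, Any] | None,
    dialect: dict[str, Any] | None,
) -> dict[str, Any]:
    """Return dialect settings overlaid with user options.

    Assumption: when both define the same key, the user option wins.
    """
    merged = dict(dialect or {})  # start from the dialect defaults
    merged.update(csv_options or {})  # user-supplied options override them
    return merged


merged = merge_csv_options_with_dialect(
    {"delimiter": ";"},
    {"delimiter": ",", "quotechar": '"'},
)
```

Here the user-supplied `delimiter` replaces the dialect's value, while `quotechar` is carried over from the dialect unchanged.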

Resolves #82

Kaos599 and others added 11 commits October 14, 2025 00:19
- Introduced `merge_csv_options_with_dialect` function to handle merging CSV options with dialect settings.
- Updated `read_to_*` functions to utilize the new merging logic, ensuring compatibility with user-defined CSV options.
- Enhanced `write` functions to validate headers and merge options, providing warnings for conflicts between user options and dialect settings.
- Added tests to verify correct handling of CSV dialects and option overrides in both reading and writing scenarios.
- Reformatted code in `merge_csv_options_with_dialect` and related functions for better readability by breaking long lines.
- Ensured consistent formatting in test files for CSV data and dialect definitions.
- Added missing newlines at the end of some files to adhere to coding standards.
@dalonsoa
Collaborator

Many thanks for working on this! I'll review and provide feedback within the next few days.

- Corrected assertion from len(data) == 1 to len(data) == 2
- Added proper assertions for data structure validation
- Test now correctly validates CSV dialect override behavior

codecov bot commented Oct 17, 2025

Codecov Report

❌ Patch coverage is 96.87500% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@013c6f4). Learn more about missing BASE report.

Files with missing lines | Patch % | Lines
csvy/readers.py | 94.28% | 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #361   +/-   ##
=======================================
  Coverage        ?   94.20%           
=======================================
  Files           ?        7           
  Lines           ?      483           
  Branches        ?        0           
=======================================
  Hits            ?      455           
  Misses          ?       28           
  Partials        ?        0           

Collaborator

@dalonsoa dalonsoa left a comment

Many thanks for putting this together. It looks really good. I have a few comments to polish the implementation, but the functionality is there.

Kaos599 and others added 3 commits October 17, 2025 15:53
- Add comprehensive docstring to merge_csv_options_with_dialect with parameter and return descriptions
- Reduce nesting by early return in merge_csv_options_with_dialect function
- Replace hardcoded dialect_mapping with direct iteration over CSVDialectValidator.model_fields
- Move merge_csv_options_with_dialect to new csvy/utils.py module
- Update readers.py and writers.py to import from utils.py
- Remove duplicate function definitions and unused imports
- Fix Pydantic deprecation warning by accessing model_fields from class instead of instance
- All pre-commit hooks pass with proper formatting and linting
- Move merge_csv_options_with_dialect to new csvy/utils.py module
- Add comprehensive docstring with args/returns description
- Reduce nesting by using early return pattern
- Remove dialect_mapping dict, iterate directly over validator fields
- Update readers.py and writers.py to import from utils
- Add tests/test_utils.py with full test coverage
@Kaos599
Author

Kaos599 commented Oct 19, 2025

@dalonsoa

I have tried to address all your comments from the code review!

  • I moved the merge_csv_options_with_dialect function to a new csvy/utils.py module with a proper docstring describing arguments and return values.

  • I also refactored it to reduce nesting by checking the opposite condition and returning early, and removed the unnecessary dialect mapping by getting info directly from the validator. Both readers.py and writers.py now import from utils.py.

  • I created tests/test_utils.py with comprehensive test cases for the utility function.

Collaborator

@dalonsoa dalonsoa left a comment

Many thanks for the changes. It looks much better. However, I've found a small issue that needs to be fixed before we can merge the PR.

- Add overrides parameter to merge_csv_options_with_dialect()
- Support library-specific option names (pandas: sep, polars: separator)
- Detect and resolve conflicts between user options and dialect settings
- Update CSVY headers when conflicts occur with user warnings
- Add comprehensive test coverage
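
To illustrate the commit notes above, here is a hedged sketch of how an overrides mapping might translate dialect field names into library-specific option names and flag conflicts. The function name `merge_with_overrides`, its return shape, and the choice to let the user option win and write it back into the dialect are assumptions for illustration; the actual implementation in this PR may differ.

```python
from __future__ import annotations

import warnings
from typing import Any


def merge_with_overrides(
    csv_options: dict[str, Any],
    dialect: dict[str, Any],
    overrides: dict[str, str] | None = None,
) -> tuple[dict[str, Any], dict[str, Any]]:
    """Merge dialect settings into user options (hypothetical helper).

    ``overrides`` maps a library-specific option name to the dialect field
    it corresponds to, e.g. ``{"sep": "delimiter"}`` for pandas.
    Returns the merged options and a dialect updated to reflect conflicts.
    """
    # Invert the mapping: dialect field name -> library option name.
    to_library = {field: name for name, field in (overrides or {}).items()}
    merged = dict(csv_options)
    updated_dialect = dict(dialect)

    for field, value in dialect.items():
        name = to_library.get(field, field)
        if name not in merged:
            # No user preference: take the dialect value.
            merged[name] = value
        elif merged[name] != value:
            # Conflict: warn, keep the user option, and update the dialect.
            warnings.warn(
                f"Option {name!r}={merged[name]!r} conflicts with dialect "
                f"{field!r}={value!r}; using the user option.",
                stacklevel=2,
            )
            updated_dialect[field] = merged[name]
    return merged, updated_dialect
```

Called with `{"sep": ";"}` as the user options, a dialect whose `delimiter` is `","`, and `{"sep": "delimiter"}` as the overrides, this sketch emits one warning, keeps `sep=";"`, and returns a dialect whose `delimiter` has been updated to `";"`.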
@Kaos599 Kaos599 requested a review from dalonsoa October 20, 2025 07:21
Collaborator

@dalonsoa dalonsoa left a comment

This looks really good, and I just have a couple of minor comments.

Comment on lines +67 to +86
# Determine overrides based on data type
overrides = {}
try:
    import pandas as pd

    if isinstance(data, pd.DataFrame):
        overrides = {"sep": "delimiter"}
except ImportError:
    pass

try:
    import polars as pl

    if isinstance(data, pl.DataFrame | pl.LazyFrame):
        overrides = {"separator": "delimiter"}
except ImportError:
    pass

# For numpy arrays and lists, use standard dialect names (no overrides needed)

Collaborator

Could you extract this into its own get_overrides function within utils.py?
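
One possible shape for the requested helper, reusing the logic from the snippet under review. The name `get_overrides` comes from the comment above; the signature, the return type, and the use of early returns are assumptions, so treat this as a sketch and not the final implementation.

```python
from typing import Any


def get_overrides(data: Any) -> dict[str, str]:
    """Map library-specific option names to dialect field names for ``data``."""
    try:
        import pandas as pd

        if isinstance(data, pd.DataFrame):
            return {"sep": "delimiter"}
    except ImportError:
        # pandas is optional; fall through to the next check.
        pass

    try:
        import polars as pl

        if isinstance(data, (pl.DataFrame, pl.LazyFrame)):
            return {"separator": "delimiter"}
    except ImportError:
        # polars is optional as well.
        pass

    # NumPy arrays and lists use the standard dialect names (no overrides).
    return {}
```

Keeping the optional imports inside the function means the helper still works when neither pandas nor polars is installed, in which case it simply returns an empty mapping.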

Collaborator

I guess there's something missing here? It should probably be a test within test_utils.py rather than its own file.

@dalonsoa
Collaborator

@all-contributors please add @Kaos599 for code, test

Contributor

@dalonsoa

I've put up a pull request to add @Kaos599! 🎉

@dalonsoa
Collaborator

@Kaos599 there are also some tests failing.

Development

Successfully merging this pull request may close these issues.

Use the CSV Dialect to read/write data using... csv
