CSV/TSV separators not guessed correctly for files with Byte Order Marks #6527

@tfmorris

Description

It looks like the fix for #1241 is incomplete: a new pseudo-encoding was introduced to handle UTF-8 with a Byte Order Mark (BOM), but not every place that needs to handle it was updated.

To Reproduce

Steps to reproduce the behavior:

  1. Create a project using this url: http://www.biodiversitylibrary.org/data/TSV/hosted/creator.txt

Current Results

An exception is logged on the console for an unsupported character encoding and the separator guessing process aborts.

By inspection, the fixed-width importer is susceptible to the same problem.
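The failure mode described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the class `BomAwareReader`, the label `UTF-8-BOM`, and the helper names are assumptions made for the example, and `guessSeparator` is a deliberately naive stand-in for the real separator guesser. The point is that a pseudo-encoding label has to be mapped to a real charset (and the BOM skipped) before a reader is opened; passing such a label straight to `Charset.forName` throws `UnsupportedCharsetException` on a standard JDK.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class BomAwareReader {
    // Hypothetical label for "UTF-8 with BOM"; the name used by the project may differ.
    public static final String UTF8_BOM_PSEUDO_ENCODING = "UTF-8-BOM";

    // Map the pseudo-encoding (or a missing encoding) to a real charset.
    public static Charset resolveCharset(String encoding) {
        if (encoding == null || UTF8_BOM_PSEUDO_ENCODING.equalsIgnoreCase(encoding)) {
            return StandardCharsets.UTF_8;
        }
        return Charset.forName(encoding);
    }

    // Consume a leading EF BB BF sequence if present; otherwise push the bytes back.
    public static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        while (n < 3) {
            int r = pushback.read(head, n, 3 - n);
            if (r < 0) {
                break; // stream shorter than three bytes
            }
            n += r;
        }
        boolean isBom = n == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!isBom && n > 0) {
            pushback.unread(head, 0, n);
        }
        return pushback;
    }

    // Open a reader that tolerates the pseudo-encoding label and a leading BOM.
    public static Reader openReader(InputStream in, String encoding) throws IOException {
        Charset charset = resolveCharset(encoding);
        if (StandardCharsets.UTF_8.equals(charset)) {
            in = skipUtf8Bom(in);
        }
        return new InputStreamReader(in, charset);
    }

    // Naive guess for illustration only: whichever of tab or comma occurs more often.
    public static char guessSeparator(String line) {
        long tabs = line.chars().filter(c -> c == '\t').count();
        long commas = line.chars().filter(c -> c == ',').count();
        return tabs >= commas ? '\t' : ',';
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a', '\t', 'b', '\t', 'c', '\n'};
        try (BufferedReader reader = new BufferedReader(
                openReader(new ByteArrayInputStream(data), UTF8_BOM_PSEUDO_ENCODING))) {
            String firstLine = reader.readLine();
            System.out.println(guessSeparator(firstLine) == '\t' ? "tab" : "comma");
        }
    }
}
```

Under these assumptions, routing every importer that opens a stream (separator guessing, fixed width, and so on) through one helper like `openReader` would keep the pseudo-encoding handling in a single place.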

Expected Behavior

Separators are guessed correctly for CSV/TSV files that begin with a BOM.

Labels: Type: Bug, import
