The data structure that echopype uses for converted raw data (the EchoData object) has undergone revisions over time. In most echopype releases these changes are relatively small, but versions 0.8.0 and 0.6.0 incorporated significant changes, with implications for backward compatibility. Here we describe the main changes in each of these major releases. Please refer to What’s new for more complete details on specific changes that impacted the data structure.
Changes introduced in version 0.8.0 were carried out to incorporate missing variables that are mandated by SONAR-netCDF4 v1 and to implement adaptations to the convention in a more consistent fashion across variables and instrument types. Some of these changes modified or reverted decisions implemented in version 0.6.0 (below) that were later found to have large impacts on performance and usability.
Highlights include:
Remove the beam and ping_time dimensions from a whole netCDF4 group or from individual variables when they were determined not to be required for specific instrument types. The beam dimension is now dropped from all Sonar/Beam_groupX groups except for EK80 complex samples, where both backscatter_r and backscatter_i exist and the beam dimension represents different sectors of split-beam transducers. The ping_time dimension is retained only with variables that are known to potentially vary with time in the instrument types supported by echopype.
In Sonar/Beam_groupX groups: Standardize the use of transmit_frequency_start and transmit_frequency_stop, where they were previously missing or where the names in use (frequency_start and frequency_end) were not the ones specified by the convention.
In the Platform group: Implement variables based on the convention more consistently across instrument types:
Assign default values to variables when no such variables are found in the raw data file
Revise the dimensions of each variable to be consistent across instrument types, with dimensions deemed unnecessary dropped from some variables.
In the Provenance group: Add new attributes combination_* to the “combined” EchoData object, mirroring the convention-based attributes conversion_*.
In the Vendor_specific group: Move filter coefficients and decimation factor from attributes to variables in EK80, to facilitate consistent provenance tracking during combine_echodata operations.
Improve the presence and use of variable attributes throughout EchoData groups.
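The renaming described for the Sonar/Beam_groupX groups can be sketched as a simple name mapping. This is only an illustration, not echopype's implementation: the function name is hypothetical, the plain dict stands in for a group's data variables, and the numeric values are made up.

```python
# Old and new names as given in the text above.
LEGACY_TO_CONVENTION = {
    "frequency_start": "transmit_frequency_start",
    "frequency_end": "transmit_frequency_stop",
}


def rename_legacy_variables(variables):
    """Return a copy of `variables` with pre-0.8.0 names replaced.

    `variables` is a plain dict standing in for a Beam group's data
    variables (a hypothetical stand-in, not an echopype object).
    """
    renamed = {}
    for name, values in variables.items():
        new_name = LEGACY_TO_CONVENTION.get(name, name)
        if new_name in renamed:
            # Both the old and the new name were present: refuse to guess.
            raise ValueError(f"duplicate variable after renaming: {new_name}")
        renamed[new_name] = values
    return renamed


# Illustrative values only; they do not come from a real file.
beam_group = {
    "frequency_start": [34000.0, 90000.0],
    "frequency_end": [45000.0, 170000.0],
}
print(sorted(rename_legacy_variables(beam_group)))
```

With an xarray Dataset, a mapping restricted to the names actually present could be passed to `Dataset.rename` instead.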
Version 0.8.0 does not incorporate the capability to read files converted by previous versions of echopype. We recommend using open_raw to re-convert the raw data files.
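A minimal sketch of such a re-conversion follows. It assumes that `open_raw` accepts a `sonar_model` keyword and that the returned EchoData object provides `to_netcdf(save_path=...)`, as in recent echopype releases; the helper names and the output naming scheme are hypothetical.

```python
from pathlib import Path


def converted_path(raw_file, out_dir):
    """Build an output path that keeps the raw file's stem (hypothetical naming)."""
    return Path(out_dir) / (Path(raw_file).stem + ".nc")


def reconvert(raw_file, sonar_model, out_dir):
    """Re-convert one raw file with the currently installed echopype."""
    # Imported lazily so that converted_path() can be used without echopype.
    import echopype as ep

    echodata = ep.open_raw(raw_file, sonar_model=sonar_model)
    target = converted_path(raw_file, out_dir)
    echodata.to_netcdf(save_path=str(target))
    return target


# Example call (needs echopype and a real raw file, so it is left commented out):
# reconvert("path/to/file.raw", sonar_model="EK60", out_dir="converted")
```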
In order to enhance the compliance of echopype-generated datasets to the SONAR-netCDF4 version 1 convention, a number of changes were introduced in echopype v0.6.0 that create incompatibilities with the data structure used in previous versions.
To ease the transition, the open_converted function is able to open files previously converted using echopype v0.5.x (0.5.0 to 0.5.6) into the v0.6.0 data format, encapsulated in the EchoData object.
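Whether a given file falls in that supported range can be checked from the version string recorded at conversion time; the v0.5.x sample printout in this section shows a `conversion_software_version` attribute with the value 0.5.6. The sketch below is a hypothetical helper, not part of echopype, and assumes the string starts with three dot-separated integers.

```python
import re


def parse_version(version):
    """Turn a string such as '0.5.6' into the tuple (0, 5, 6)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.groups())


def converted_by_v05x(conversion_software_version):
    """True if the file was converted by echopype 0.5.0 to 0.5.6."""
    return (0, 5, 0) <= parse_version(conversion_software_version) <= (0, 5, 6)


print(converted_by_v05x("0.5.6"))
print(converted_by_v05x("0.4.1"))
```

Files outside this range would need to be re-converted from the raw data.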
Key changes involved renaming and restructuring a couple of groups, and renaming some coordinates and data variables, as summarized below:
Type | v0.5.x | v0.6.0 | Rationale and notes |
---|---|---|---|
Group | | | Convention compliance |
Group | | | Convention compliance |
Group | | | Convention compliance |
Coordinate | | | Accommodate channels with duplicated frequencies. The new variable |
Coordinate | | | Better intuitive understanding of data |
Coordinate | | | Convention compliance. This |
Coordinate | | | Convention compliance. In |
Coordinate | | | Convention compliance. In |
Variable | | | Convention compliance. In |
Variable | | | Convention compliance. In |
Other changes included:
Adding previously missing, mandatory convention variables. When no data are available to populate them, these are filled with null (NaN) values.
Moving variables from one group to another, particularly from the Beam groups to Platform and Vendor. These variables were typically not part of the convention.
The Beam_groupX beamwidth_receive_athwartship and beamwidth_transmit_athwartship variables were consolidated into beamwidth_twoway_athwartship because the EK60 and EK80 echosounders do not store one-way transmit or receive beam widths. Likewise for beamwidth_receive_alongship and beamwidth_transmit_alongship.
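The beamwidth consolidation can be illustrated with a small sketch. It is not echopype's code: it assumes the receive and transmit variables hold identical values (as they do in the v0.5.x sample printout in this section) and that the alongship result follows the same `beamwidth_twoway_*` naming pattern. The function name and the plain dict are hypothetical stand-ins.

```python
import math


def consolidate_beamwidth(variables, direction):
    """Merge the receive/transmit beamwidth pair into one two-way variable.

    `variables` is a plain dict standing in for a Beam group;
    `direction` is "athwartship" or "alongship".
    """
    receive_name = f"beamwidth_receive_{direction}"
    transmit_name = f"beamwidth_transmit_{direction}"
    receive = variables[receive_name]
    transmit = variables[transmit_name]

    # Assumption: both variables carry the same numbers, so one copy suffices.
    same = len(receive) == len(transmit) and all(
        math.isclose(r, t) for r, t in zip(receive, transmit)
    )
    if not same:
        raise ValueError(f"{receive_name} and {transmit_name} differ")

    merged = {
        name: values
        for name, values in variables.items()
        if name not in (receive_name, transmit_name)
    }
    merged[f"beamwidth_twoway_{direction}"] = list(receive)
    return merged


# Values copied from the v0.5.x sample printout.
old_beam = {
    "beamwidth_receive_athwartship": [10.82, 6.85, 6.52],
    "beamwidth_transmit_athwartship": [10.82, 6.85, 6.52],
}
print(consolidate_beamwidth(old_beam, "athwartship"))
```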
More details, including Pull Requests and discussions related to these changes, can be found in the Release notes.
Below we provide a sample of the v0.5.x data format via a printout of the previous EchoData object. Compare this with the v0.6.0 EchoData object to see the changes listed in the table above.
<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    conventions:                 CF-1.7, SONAR-netCDF4-1.0, ACDD-1.3
    keywords:                    EK60
    sonar_convention_authority:  ICES
    sonar_convention_name:       SONAR-netCDF4
    sonar_convention_version:    1.0
    summary:                     EK60 raw file s3://ncei-wcsd-archive/data/ra...
    title:                       2017 Pacific Hake Acoustic Trawl Survey
    date_created:                2017-07-28T18:16:19Z
    survey_name:

<xarray.Dataset>
Dimensions:                 (frequency: 3, ping_time: 529)
Coordinates:
  * frequency               (frequency) float64 1.8e+04 3.8e+04 1.2e+05
  * ping_time               (ping_time) datetime64[ns] 2017-07-28T18:16:19.31...
Data variables:
    absorption_indicative   (frequency, ping_time) float64 0.002822 ... 0.03259
    sound_speed_indicative  (frequency, ping_time) float64 1.481e+03 ... 1.48...

<xarray.Dataset>
Dimensions:        (location_time: 2165, frequency: 3, ping_time: 529)
Coordinates:
  * location_time  (location_time) datetime64[ns] 2017-07-28T18:16:21.4759997...
  * frequency      (frequency) float64 1.8e+04 3.8e+04 1.2e+05
  * ping_time      (ping_time) datetime64[ns] 2017-07-28T18:16:19.313999872 ....
Data variables:
    latitude       (location_time) float64 dask.array<chunksize=(2165,), meta=np.ndarray>
    longitude      (location_time) float64 dask.array<chunksize=(2165,), meta=np.ndarray>
    sentence_type  (location_time) <U3 dask.array<chunksize=(2165,), meta=np.ndarray>
    pitch          (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>
    roll           (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>
    heave          (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>
    water_level    (frequency, ping_time) float64 dask.array<chunksize=(3, 529), meta=np.ndarray>
Attributes:
    platform_type:       Research vessel
    platform_name:       Bell M. Shimada
    platform_code_ICES:  315

<xarray.Dataset>
Dimensions:        (location_time: 22037)
Coordinates:
  * location_time  (location_time) datetime64[ns] 2017-07-28T18:16:19.3140003...
Data variables:
    NMEA_datagram  (location_time) <U73 '$SDVLW,5050.149,N,5050.149,N' ... '$...
Attributes:
    description:  All NMEA sensor datagrams

<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    conversion_software_name:     echopype
    conversion_software_version:  0.5.6
    conversion_time:              2022-05-26T18:01:56Z
    src_filenames:                s3://ncei-wcsd-archive/data/raw/Bell_M._Shi...
    duplicate_ping_times:         0

<xarray.Dataset>
Dimensions:  ()
Data variables:
    *empty*
Attributes:
    sonar_manufacturer:      Simrad
    sonar_model:             ER60
    sonar_serial_number:
    sonar_software_name:
    sonar_software_version:  2.4.3
    sonar_type:              echosounder

<xarray.Dataset>
Dimensions:                         (frequency: 3, ping_time: 529, range_bin: 3957)
Coordinates:
  * frequency                       (frequency) float64 1.8e+04 3.8e+04 1.2e+05
  * ping_time                       (ping_time) datetime64[ns] 2017-07-28T18:...
  * range_bin                       (range_bin) int64 0 1 2 3 ... 3954 3955 3956
Data variables: (12/30)
    channel_id                      (frequency) <U37 'GPT 18 kHz 009072058c8...
    beam_type                       (frequency) int64 1 1 1
    beamwidth_receive_alongship     (frequency) float64 10.9 6.81 6.58
    beamwidth_receive_athwartship   (frequency) float64 10.82 6.85 6.52
    beamwidth_transmit_alongship    (frequency) float64 10.9 6.81 6.58
    beamwidth_transmit_athwartship  (frequency) float64 10.82 6.85 6.52
    ...                             ...
    data_type                       (frequency, ping_time) float64 3.0 ... 3.0
    count                           (frequency, ping_time) float64 3.957e+03 ...
    offset                          (frequency, ping_time) float64 0.0 ... 0.0
    transmit_mode                   (frequency, ping_time) float64 0.0 ... 0.0
    angle_athwartship               (frequency, ping_time, range_bin) float64 ...
    angle_alongship                 (frequency, ping_time, range_bin) float64 ...
Attributes:
    beam_mode:              vertical
    conversion_equation_t:  type_3

<xarray.Dataset>
Dimensions:           (frequency: 3, pulse_length_bin: 5)
Coordinates:
  * frequency         (frequency) float64 1.8e+04 3.8e+04 1.2e+05
  * pulse_length_bin  (pulse_length_bin) int64 0 1 2 3 4
Data variables:
    sa_correction     (frequency, pulse_length_bin) float64 0.0 -0.7 ... -0.3
    gain_correction   (frequency, pulse_length_bin) float64 20.3 22.95 ... 26.55
    pulse_length      (frequency, pulse_length_bin) float64 0.000512 ... 0.00...