This repository contains the dataflows for processing the Saildrone 2023 raw data. The dataflows are written in Python using echopype and Prefect.

More information on how NOAA Fisheries uses Saildrones: https://www.fisheries.noaa.gov/feature-story/detecting-fish-ocean-going-robots-complement-ship-based-surveys
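
For orientation, here is a minimal sketch of what a Prefect-plus-echopype dataflow can look like. It is hypothetical: the file path, sonar model, and task breakdown are assumptions for illustration, not this repository's actual pipeline.

```python
# Minimal sketch of a Prefect + echopype dataflow. Hypothetical: the actual
# flows in this repo likely differ in structure, parameters, and sonar model.
from prefect import flow, task
import echopype as ep

@task
def convert_raw(raw_path: str) -> str:
    """Convert a raw sonar file to a standardized EchoData Zarr store."""
    # "EK80" is an assumed sonar model; use the one the Saildrone carries.
    ed = ep.open_raw(raw_path, sonar_model="EK80")
    out_path = raw_path.replace(".raw", ".zarr")
    ed.to_zarr(out_path, overwrite=True)
    return out_path

@flow
def process_file(raw_path: str) -> str:
    return convert_raw(raw_path)

if __name__ == "__main__":
    process_file("example.raw")  # hypothetical input file
```
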
To run the Dask cluster locally, follow these steps:
- First, activate the venv:
```
cd saildrone-dataflow
source venv/bin/activate
```
- Start the scheduler:
```
dask scheduler
```
This will output something like:
```
2025-03-07 12:16:05,158 - distributed.scheduler - INFO - -----------------------------------------------
2025-03-07 12:16:05,563 - distributed.scheduler - INFO - State start
2025-03-07 12:16:05,566 - distributed.scheduler - INFO - -----------------------------------------------
2025-03-07 12:16:05,567 - distributed.scheduler - INFO - Scheduler at: tcp://192.168.1.100:8786
2025-03-07 12:16:05,567 - distributed.scheduler - INFO - dashboard at: http://192.168.1.100:8787/status
2025-03-07 12:16:05,567 - distributed.scheduler - INFO - Registering Worker plugin shuffle
```
Copy the TCP address (`tcp://192.168.1.100:8786` in this example); you will pass it to the workers.
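
To confirm the scheduler is reachable, you can connect a Dask client to it from Python. This is a quick sanity check, not part of the dataflows; substitute your own scheduler address from the log above.

```python
from dask.distributed import Client

# Connect to the running scheduler (use the address from your own log output).
client = Client("tcp://192.168.1.100:8786")
print(client.scheduler_info()["workers"])  # empty until workers are started
client.close()
```
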
- Start the workers: open another terminal window, activate the venv again (step 1), then run the following, specifying the desired number of workers and threads. This starts 4 workers with 1 thread each; replace the address with the scheduler address you copied:
```
./start-dask-worker.sh tcp://192.168.1.100:8786 --nthreads 1 --nworkers 4
```
To specify the memory limit per worker, add the `--memory-limit` argument, e.g.:
```
./start-dask-worker.sh tcp://192.168.1.100:8786 --nthreads 1 --nworkers 4 --memory-limit 8G
```
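
If `start-dask-worker.sh` simply forwards its arguments to the standard Dask CLI (an assumption; check the script's contents), the equivalent direct command would be:

```
dask worker tcp://192.168.1.100:8786 --nthreads 1 --nworkers 4 --memory-limit 8G
```
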