Conversation

philippotto
Member

@philippotto philippotto commented Jul 12, 2025

This PR makes bucket requests more resilient. It automatically rejects responses with an unexpected returned data length. Quite a bit of test fixing was needed to get this working, but it should be fine now.

URL of deployed dev instance (used for testing):

  • https://___.webknossos.xyz

Steps to test:

  • I suggest testing locally.
  • Uncomment the code mentioned below.
  • View a dataset and observe the console. It should be spammed with messages about unexpected bucket response lengths.
  • Everything else should still work.
  • Comment the code out again.
  • Repeat the same steps. Such an error should no longer occur. If it does, that's a good hint for debugging. Please tell me how to reproduce it :D

TODOs:

  • fix that createBucketResponseFunction does not respect the dtype (e.g., 32-bit layers need 4 times the number of bytes); see the sketch below for the byte-count relationship
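
The byte-count relationship behind this TODO is straightforward; here is a minimal sketch of the expected bucket payload size per element class (the constant and function names are illustrative, and a bucket edge length of 32 voxels is assumed):

// Illustrative only: expected bucket payload size per element class.
const BYTES_PER_VOXEL: Record<string, number> = {
  uint8: 1,
  uint16: 2,
  uint32: 4, // a 32-bit layer needs 4x the bytes of a uint8 layer
  float32: 4,
};
const ASSUMED_BUCKET_WIDTH = 32; // assumed edge length of a bucket

function expectedBucketByteLength(elementClass: keyof typeof BYTES_PER_VOXEL): number {
  return BYTES_PER_VOXEL[elementClass] * ASSUMED_BUCKET_WIDTH ** 3;
}

// expectedBucketByteLength("uint8")  === 32768
// expectedBucketByteLength("uint32") === 131072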

Issues:

  • I think no issue exists for this.

(Please delete unneeded items, merge only when none are left open)

  • Added changelog entry (create a $PR_NUMBER.md file in unreleased_changes or use ./tools/create-changelog-entry.py)

@philippotto philippotto self-assigned this Jul 12, 2025
Contributor

coderabbitai bot commented Jul 12, 2025

📝 Walkthrough

Refactors test helper createBucketResponseFunction to accept per-layer element-class descriptors and updates tests accordingly. Adds strict bucket-response size validation and throws on mismatch, introduces per-bucket failure tracking and partial-batch retrying in pull queue, and adds a changelog entry documenting the bucket-reject behavior.
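
In essence, the new validation is a single length check before the response buffer is sliced into per-bucket pieces; below is a condensed sketch of the idea (not the exact implementation in wkstore_adapter.ts):

// Sketch: reject a batch response whose payload size does not match the
// number of non-missing buckets. The caller catches the error, marks the
// affected buckets as failed, and re-queues them with highest priority.
function sliceBufferIntoPiecesSketch(
  buffer: Uint8Array,
  batchSize: number,
  missingBucketIndices: number[],
  bucketByteLength: number,
): Array<Uint8Array | null> {
  const availableBucketCount = batchSize - missingBucketIndices.length;
  const expectedTotalByteLength = availableBucketCount * bucketByteLength;
  if (buffer.length !== expectedTotalByteLength) {
    throw new Error(
      `Expected ${expectedTotalByteLength} bytes, but received ${buffer.length}. Rejecting buckets.`,
    );
  }
  const pieces: Array<Uint8Array | null> = [];
  let offset = 0;
  for (let bucketIndex = 0; bucketIndex < batchSize; bucketIndex++) {
    if (missingBucketIndices.includes(bucketIndex)) {
      pieces.push(null);
    } else {
      pieces.push(buffer.subarray(offset, offset + bucketByteLength));
      offset += bucketByteLength;
    }
  }
  return pieces;
}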

Changes

Cohort / File(s) Summary of Changes
Test helper update
frontend/javascripts/test/helpers/apiHelpers.ts
Reworked createBucketResponseFunction to accept dataTypePerLayer: Record<string, ElementClass>; derives TypedArray constructors via getConstructorForElementClass, computes channel counts, validates layer types, allocates/fills typed arrays, and updated imports/signature.
Volumetracing saga tests
frontend/javascripts/test/sagas/volumetracing/*
Replaced first arg to createBucketResponseFunction from TypedArray constructors to per-layer descriptor objects (e.g., { volumeTracingId: "uint16", color: "uint8" }) in multiple mocks; test logic unchanged. Files: .../bucket_eviction_helper.ts, .../bucket_eviction_with_saving.spec.ts, .../bucket_eviction_without_saving.spec.ts, .../volumetracing_remote_bucket_updates.spec.ts, .../volumetracing_saga_integration_1.spec.ts, .../volumetracing_saga_integration_2.spec.ts.
Other tests updated
frontend/javascripts/test/sagas/proofreading.spec.ts, frontend/javascripts/test/api/api_volume_latest.spec.ts
Updated mocks to pass per-layer descriptors to createBucketResponseFunction; API test import/formatted import block updated to include the new helper.
Binary layer tests
frontend/javascripts/test/model/binary/layers/wkstore_adapter.spec.ts
Switched to colorLayer, compute bucket byte counts via getByteCountFromLayer(layer) and Constants.BUCKET_SIZE with a four-bit factor; updated test data allocation and base64 expectations.
Bucket data handling (validation & erroring)
frontend/javascripts/viewer/model/bucket_data_handling/wkstore_adapter.ts, frontend/javascripts/viewer/model/bucket_data_handling/bucket.ts
sliceBufferIntoPieces now validates total buffer length equals non-missing-buckets * BUCKET_BYTE_LENGTH and throws on mismatch. DataBucket.receiveData() now constructs an Error, notifies ErrorHandling, and throws when data length mismatches.
Pull queue resilience
frontend/javascripts/viewer/model/bucket_data_handling/pullqueue.ts
Adds per-bucket failure tracking, wraps per-bucket processing in try/catch, collects failed bucket addresses, throws after partial failures to route retry logic, and re-queues failed/dirty buckets with adjusted failure semantics.
Changelog
unreleased_changes/8776.md
New entry: "Reject bucket request responses with unexpected size and try to reload those buckets automatically."

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

testing

Suggested reviewers

  • daniel-wer

Poem

I nibble bytes and check each stack,
If sizes fail, I thump and jack.
Failed buckets hop back in the queue,
Colors and segs get types that are true.
A rabbit's cheer for tests made right — hop, byte! 🥕🐇

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between b7725cc and 6530d73.

📒 Files selected for processing (1)
  • frontend/javascripts/viewer/model/bucket_data_handling/wkstore_adapter.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • frontend/javascripts/viewer/model/bucket_data_handling/wkstore_adapter.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: frontend-tests
  • GitHub Check: backend-tests

@philippotto
Member Author

@MichaelBuessemeyer can you have a look if you have some spare time? Thank you!

Comment on lines 274 to 276
// if (Math.random() < 0.5) {
// buffer = buffer.subarray(1, buffer.length - 1);
// }
Contributor

Uncomment this for testing

Contributor

Will remove it before merging

@MichaelBuessemeyer MichaelBuessemeyer marked this pull request as ready for review August 22, 2025 15:44
@MichaelBuessemeyer
Copy link
Contributor

@normanrz please re-assign in case you want someone else to review this PR

@MichaelBuessemeyer MichaelBuessemeyer changed the title [WIP] Reject bucket responses that have unexpected byte count Reject bucket responses that have unexpected byte count Aug 22, 2025
Contributor

@coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (5)
frontend/javascripts/test/model/binary/layers/wkstore_adapter.spec.ts (2)

215-233: Color four-bit test: token expectation mismatch

Same as above: no refresh is configured here, so token should be "token".

Apply this diff:

-    const expectedUrl = "url/data/datasets/datasetId/layers/color/data?token=token2";
+    const expectedUrl = "url/data/datasets/datasetId/layers/color/data?token=token";

Also consider reducing brittleness like in the previous comment:

-      expect(RequestMock.sendJSONReceiveArraybufferWithHeaders).toHaveBeenCalledWith(
-        expectedUrl,
-        expectedOptions,
-      );
+      expect(RequestMock.sendJSONReceiveArraybufferWithHeaders).toHaveBeenCalledWith(
+        expectedUrl,
+        expect.objectContaining(expectedOptions),
+      );

239-252: Segmentation test: fix token expectation and reduce options brittleness

Mirror the same fixes here.

Apply this diff:

-    const expectedUrl = "url/data/datasets/datasetId/layers/segmentation/data?token=token2";
+    const expectedUrl = "url/data/datasets/datasetId/layers/segmentation/data?token=token";

And make the options check future-proof:

-      expect(RequestMock.sendJSONReceiveArraybufferWithHeaders).toHaveBeenCalledWith(
-        expectedUrl,
-        expectedOptions,
-      );
+      expect(RequestMock.sendJSONReceiveArraybufferWithHeaders).toHaveBeenCalledWith(
+        expectedUrl,
+        expect.objectContaining(expectedOptions),
+      );
frontend/javascripts/viewer/model/bucket_data_handling/bucket.ts (1)

617-633: Avoid serializing the entire buffer into the error; log only minimal, structured metadata

JSON-stringifying a (typed) array can explode logs and degrade performance. Also, it currently emits full debug info in production but omits it in tests, which is counterintuitive. Recommend logging only actual/expected lengths, channelCount, and the address, and pass structured metadata to ErrorHandling instead of embedding it in the message.

-    if (data.length !== channelCount * Constants.BUCKET_SIZE) {
-      const debugInfo = // Disable this conditional if you need verbose output here.
-        process.env.IS_TESTING
-          ? " (<omitted>)"
-          : {
-              arrayBuffer,
-              actual: data.length,
-              expected: channelCount * Constants.BUCKET_SIZE,
-              channelCount,
-            };
-      console.warn(`bucket.data for ${this.zoomedAddress} has unexpected length`, debugInfo);
-      const error = new Error(
-        `bucket.data has unexpected length. Details: ${JSON.stringify(debugInfo)}`,
-      );
-      ErrorHandling.notify(error);
-      throw error;
-    }
+    if (data.length !== channelCount * Constants.BUCKET_SIZE) {
+      const actual = data.length;
+      const expected = channelCount * Constants.BUCKET_SIZE;
+      const meta = {
+        address: this.zoomedAddress.slice(0, 4),
+        actual,
+        expected,
+        channelCount,
+        elementClass: this.elementClass,
+      };
+      console.warn(`bucket.data has unexpected length for ${meta.address} (actual=${actual}, expected=${expected})`, meta);
+      const error = new Error("bucket.data has unexpected length");
+      ErrorHandling.notify(error, meta);
+      throw error;
+    }
frontend/javascripts/viewer/model/bucket_data_handling/pullqueue.ts (1)

121-131: Failed buckets are not re-queued unless dirty — this breaks “auto-reload” for read-only buckets

If a bucket fails (e.g., unexpected byte length), non-dirty buckets are marked UNREQUESTED but never re-enqueued here. Unless some other subsystem re-adds them, they won’t be retried, contradicting the PR’s goal to auto-reload. Re-queue all failed data buckets with highest priority.

-      failedBucketAddresses = failedBucketAddresses.length === 0 ? batch : failedBucketAddresses;
-      for (const bucketAddress of failedBucketAddresses) {
+      failedBucketAddresses = failedBucketAddresses.length === 0 ? batch : failedBucketAddresses;
+      for (const bucketAddress of failedBucketAddresses) {
         const bucket = this.cube.getBucket(bucketAddress);

         if (bucket.type === "data") {
           bucket.markAsFailed(false);
-
-          if (bucket.dirty) {
-            bucket.addToPullQueueWithHighestPriority();
-          }
+          // Always re-queue to honor the "auto-reload" promise.
+          bucket.addToPullQueueWithHighestPriority();
         }
       }

Follow-up: if you’re concerned about hot loops under persistent failure, the existing backoff (getRetryDelay) and isRetryScheduled guards already mitigate that. You could additionally cap retries per bucket via a transient counter on the bucket, but that’s optional.
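
If such a cap were ever wanted, a transient counter would suffice; a hypothetical sketch (the failureCount field and MAX_BUCKET_RETRIES constant are made up and not part of the current bucket class):

// Hypothetical retry cap for failed buckets; illustrative only.
const MAX_BUCKET_RETRIES = 3;

interface RetryableBucket {
  failureCount?: number;
  addToPullQueueWithHighestPriority: () => void;
}

function requeueWithCap(bucket: RetryableBucket): void {
  bucket.failureCount = (bucket.failureCount ?? 0) + 1;
  if (bucket.failureCount <= MAX_BUCKET_RETRIES) {
    // The existing backoff (getRetryDelay) and isRetryScheduled guards still
    // throttle how quickly these retries are actually executed.
    bucket.addToPullQueueWithHighestPriority();
  }
  // Otherwise give up; the bucket stays in its failed state.
}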

frontend/javascripts/test/helpers/apiHelpers.ts (1)

253-263: Bug: overrides ignore channelCount and mis-index across buckets

indexInBucket computes a per-voxel index but:

  • Does not multiply the bucket stride by channelCount (breaks for multi-channel types like "uint24"),
  • Writes only the first channel for multi-channel data.

At minimum, fix the stride so overrides target the correct bucket region. Optionally, set all channels if overrides are intended to be per-voxel.

-    for (let bucketIdx = 0; bucketIdx < bucketCount; bucketIdx++) {
-      for (const { position, value } of overrides) {
-        const [x, y, z] = position;
-        const indexInBucket =
-          bucketIdx * Constants.BUCKET_WIDTH ** 3 +
-          z * Constants.BUCKET_WIDTH ** 2 +
-          y * Constants.BUCKET_WIDTH +
-          x;
-        typedArray[indexInBucket] = value;
-      }
-    }
+    const voxelsPerBucket = Constants.BUCKET_WIDTH ** 3;
+    for (let bucketIdx = 0; bucketIdx < bucketCount; bucketIdx++) {
+      const bucketBase = bucketIdx * channelCount * voxelsPerBucket;
+      for (const { position, value } of overrides) {
+        const [x, y, z] = position;
+        const voxelIndex =
+          z * Constants.BUCKET_WIDTH ** 2 + y * Constants.BUCKET_WIDTH + x;
+        const elementIndex = bucketBase + voxelIndex;
+        // Minimal fix: affect first channel
+        typedArray[elementIndex] = value;
+        // Optional: uncomment to assign all channels
+        // for (let c = 0; c < channelCount; c++) {
+        //   typedArray[bucketBase + voxelIndex + c * voxelsPerBucket] = value;
+        // }
+      }
+    }
♻️ Duplicate comments (1)
frontend/javascripts/viewer/model/bucket_data_handling/wkstore_adapter.ts (1)

274-276: Remove or gate the fuzzing snippet behind a build-time flag

These commented lines are a handy test hook but easy to forget and confusing in production code. Either remove them or gate via an env flag to allow opt-in local testing.

Option A — remove:

-  // if (Math.random() < 0.5) {
-  //   buffer = buffer.subarray(1, buffer.length - 1);
-  // }

Option B — gate behind an env flag:

+  if (process.env.WK_FUZZ_BUCKET_BYTES === "1") {
+    buffer = buffer.subarray(1, buffer.length - 1);
+  }
🧹 Nitpick comments (21)
frontend/javascripts/test/model/binary/layers/wkstore_adapter.spec.ts (8)

72-75: Tighten TestContext typing for clarity

Optional: Narrow the types to APIColorLayer and APISegmentationLayer to catch mismatches at compile time and make intent explicit.

Example:

// additionally import the concrete types:
// import type { APIColorLayer, APISegmentationLayer } from "types/api_types";

interface TestContext {
  colorLayer: APIColorLayer;
  segmentationLayer: APISegmentationLayer;
}

83-85: Avoid relying on fixture layer ordering; select by category/name instead

Index-based selection is brittle if the fixture changes order. Prefer a guarded lookup.

Apply this diff:

-    context.colorLayer = dataSource.dataLayers[0];
-    context.segmentationLayer = dataSource.dataLayers[1];
+    const color = dataSource.dataLayers.find((l) => l.category === "color");
+    const seg = dataSource.dataLayers.find((l) => l.category === "segmentation");
+    if (!color || !seg) {
+      throw new Error("Expected color and segmentation layers in fixture");
+    }
+    context.colorLayer = color;
+    context.segmentationLayer = seg;

93-103: Ensure byteCount is an integer to avoid range length off-by-one in four-bit mode

When fourBitFactor is 0.5, the product depends on BUCKET_SIZE being even. Make this explicit and future-proof by truncating to an integer.

Apply this diff:

-    const fourBitFactor = _fourBit && layer.category === "color" ? 0.5 : 1;
-    const byteCount = fourBitFactor * getByteCountFromLayer(layer) * Constants.BUCKET_SIZE;
+    const fourBitFactor = _fourBit && layer.category === "color" ? 0.5 : 1;
+    const voxelBytes = getByteCountFromLayer(layer);
+    const byteCount = Math.trunc(fourBitFactor * voxelBytes * Constants.BUCKET_SIZE);

Optional micro-optimization (not required): build bucketData1/2 as Uint8Array directly and assemble responseBuffer via set(...) to avoid large intermediate arrays. I can provide a focused patch if you’d like.


208-211: Reduce brittleness: allow extra option fields without breaking the test

If request options gain new non-essential keys, this strict equality will fail. Using objectContaining preserves intent while keeping future changes flexible.

Apply this diff:

-      expect(RequestMock.sendJSONReceiveArraybufferWithHeaders).toHaveBeenCalledWith(
-        expectedUrl,
-        expectedOptions,
-      );
+      expect(RequestMock.sendJSONReceiveArraybufferWithHeaders).toHaveBeenCalledWith(
+        expectedUrl,
+        expect.objectContaining(expectedOptions),
+      );

125-172: Optional: strengthen token handling assertions

You can assert the retry sequence more explicitly to document behavior: 403 triggers a single token refresh and exactly two send attempts.

Example additions:

expect(RequestMock.receiveJSON).toHaveBeenCalledTimes(2);
expect(RequestMock.sendJSONReceiveArraybufferWithHeaders).toHaveBeenCalledTimes(2);

254-256: Performance: full BUCKET_SIZE arrays increase test runtime

Allocating/compressing full-bucket data is realistic but can slow tests. Consider stubbing the worker or computing base64 once and reusing it.

Example (compute once):

const data = new Uint8Array(Constants.BUCKET_SIZE);
// ...
const encoded = byteArraysToLz4Base64([data])[0];
// reuse `encoded` below in both items

Or mock the worker to return a deterministic placeholder during tests. I can send a patch if you prefer that route.


282-306: DRY: avoid recomputing the same base64 twice

Compute once and reuse to shave time and make intent clearer.

Example patch (outside this exact range due to variable placement):

const encoded = byteArraysToLz4Base64([data])[0];

...
base64Data: encoded,
...
base64Data: encoded,

93-123: Add negative-path coverage for “reject unexpected bucket sizes”

Given the PR objective, consider a test where the response buffer is too short/long to ensure requestWithFallback rejects/marks buckets missing as intended. I can provide a ready-to-merge test.

Example (new test to add):

it<TestContext>("requestWithFallback: rejects responses with unexpected byte count", async ({ colorLayer }) => {
  const { batch } = ((): { batch: BucketAddress[] } => {
    const b: BucketAddress[] = [
      [0, 0, 0, 0],
      [1, 1, 1, 1],
    ];
    return { batch: b };
  })();

  // Build an undersized buffer (one bucket instead of two)
  const voxelBytes = getByteCountFromLayer(colorLayer);
  const bytesPerBucket = Math.trunc(voxelBytes * Constants.BUCKET_SIZE);
  const undersizedBuffer = new Uint8Array(bytesPerBucket); // should be 2 * bytesPerBucket

  vi.mocked(Request)
    .sendJSONReceiveArraybufferWithHeaders.mockReset()
    .mockResolvedValue({
      buffer: undersizedBuffer.buffer,
      headers: { "missing-buckets": "[]" },
    });

  await expect(requestWithFallback(colorLayer, batch)).rejects.toThrow(/unexpected/i);
});
unreleased_changes/8776.md (1)

2-2: Clarify wording and use “byte length” to match implementation

“Unexpected size” is a bit vague. Suggest making it explicit that the byte length is validated and that affected buckets are retried automatically.

-- Reject bucket request responses with unexpected size and try to reload those buckets automatically.
+- Reject bucket request responses with an unexpected byte length and automatically re-request the affected buckets.
frontend/javascripts/viewer/model/bucket_data_handling/wkstore_adapter.ts (2)

280-284: Improve error context; avoid leaking large buffers

The thrown error is good, but include layer name and counts to simplify triage and keep payload small. Do not embed or stringify the buffer.

-  if (expectedTotalByteLength !== buffer.length) {
-    throw new Error(
-      `Expected ${expectedTotalByteLength} bytes, but received ${buffer.length}. Rejecting buckets.`,
-    );
-  }
+  if (expectedTotalByteLength !== buffer.length) {
+    const msg = `Invalid bucket payload size for layer "${layerInfo.name}": expected ${expectedTotalByteLength} bytes (${availableBucketCount} buckets × ${BUCKET_BYTE_LENGTH}), received ${buffer.length}. Missing indices: [${missingBuckets.join(", ")}].`;
+    throw new Error(msg);
+  }

268-292: Type hygiene: narrow return element type and validate indices (optional)

  • Consider changing the return type to Array<Uint8Array | null> to match actual values; Uint8Array<ArrayBuffer> is not a valid TS generic and can confuse tooling.
  • Optionally assert that all values in missingBuckets are within [0, batch.length).
frontend/javascripts/viewer/model/bucket_data_handling/pullqueue.ts (1)

87-117: Per-bucket error isolation is good; also type failedBucketAddresses explicitly

Catching per-bucket failures is the right move. Minor: declare the array as BucketAddress[] for clarity and type-safety.

-    let hasErrored = false;
-    let failedBucketAddresses = [];
+    let hasErrored = false;
+    let failedBucketAddresses: BucketAddress[] = [];
frontend/javascripts/test/sagas/volumetracing/bucket_eviction_without_saving.spec.ts (1)

15-28: Consider adding a regression test for “unexpected byte length” auto-retry

To lock in the new behavior, add a spec that first returns a truncated buffer for one bucket, then a correct buffer, and asserts the bucket eventually loads without crashing. I can draft this if helpful.

frontend/javascripts/test/sagas/volumetracing/volumetracing_remote_bucket_updates.spec.ts (1)

35-35: DRY up the dtype map used in both mocks

You repeat the same per-layer map twice. Consider extracting a local const dtypeMap = { volumeTracingId: "uint16", color: "uint8" } and using it in both mockImplementation calls to reduce duplication and avoid drift in future edits.

@@
-    vi.mocked(mocks.Request).sendJSONReceiveArraybufferWithHeaders.mockImplementation(
-      createBucketResponseFunction({ volumeTracingId: "uint16", color: "uint8" }, oldCellId, 5),
-    );
+    const dtypeMap = { volumeTracingId: "uint16", color: "uint8" } as const;
+    vi.mocked(mocks.Request).sendJSONReceiveArraybufferWithHeaders.mockImplementation(
+      createBucketResponseFunction(dtypeMap, oldCellId, 5),
+    );
@@
-      vi.mocked(mocks.Request).sendJSONReceiveArraybufferWithHeaders.mockImplementation(
-        createBucketResponseFunction({ volumeTracingId: "uint16", color: "uint8" }, newCellId, 5),
-      );
+      vi.mocked(mocks.Request).sendJSONReceiveArraybufferWithHeaders.mockImplementation(
+        createBucketResponseFunction(dtypeMap, newCellId, 5),
+      );

Also applies to: 54-54

frontend/javascripts/test/sagas/volumetracing/bucket_eviction_with_saving.spec.ts (1)

21-21: Include volume tracing layer dtype to harden the initial mock

If any bucket requests for the volume tracing layer occur before testLabelingManyBuckets installs its own mock, the current map (missing volumeTracingId) would trigger “Layer Type … not provided.” Safer to include it here as well.

-      createBucketResponseFunction({ color: "uint8", segmentation: "uint16" }, 0, 0),
+      createBucketResponseFunction({ volumeTracingId: "uint16", color: "uint8", segmentation: "uint16" }, 0, 0),
frontend/javascripts/test/api/api_volume_latest.spec.ts (1)

56-56: Nice: deterministic bucket mock via per-layer dtype map; consider adding a negative-size test

The mock setup is correct and uses the new descriptor style properly. Given the PR’s goal (reject unexpected byte counts), consider adding a test that intentionally returns an undersized/oversized buffer and asserts that the client rejects it and recovers gracefully.

Here’s a minimal pattern you could adapt in this spec (or a new one) to simulate a truncated buffer:

@@
   it<WebknossosTestContext>("Data API: labelVoxels should label a list of voxels", async ({
     api,
     mocks,
   }) => {
     vi.mocked(mocks.Request.sendJSONReceiveArraybufferWithHeaders).mockImplementation(
       createBucketResponseFunction({ volumeTracingId: "uint16", color: "uint8" }, 0, 0),
     );
@@
   });
+
+  it<WebknossosTestContext>("Data API: rejects bucket responses with unexpected byte length", async ({ api, mocks }) => {
+    // Wrap the normal helper but tamper with the returned buffer length.
+    vi.mocked(mocks.Request.sendJSONReceiveArraybufferWithHeaders).mockImplementation(
+      async (url: string, payload: any) => {
+        const ok = await createBucketResponseFunction({ volumeTracingId: "uint16", color: "uint8" }, 0, 0)(url, payload);
+        // Return a buffer missing the last byte.
+        const full = new Uint8Array(ok.buffer);
+        const truncated = full.slice(0, Math.max(0, full.byteLength - 1)).buffer;
+        return { buffer: truncated, headers: ok.headers };
+      },
+    );
+    const volumeTracingId = api.data.getVolumeTracingLayerIds()[0];
+    // The actual assertion here should reflect the client behavior introduced in this PR,
+    // e.g., throwing, logging, and scheduling a retry. Adjust as needed:
+    await expect(api.data.reloadAllBuckets()).rejects.toBeTruthy();
+    // Optionally, verify the system remains responsive:
+    await expect(api.data.getDataValue(volumeTracingId, [0, 0, 0])).resolves.toBeDefined();
+  });

Also applies to: 58-60

frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_1.spec.ts (1)

88-93: Repeated descriptor usage is correct; consider DRY-ing the constant

The updated multi-line usage is clear. Since the same descriptor reappears across this file, consider factoring it into a local constant (e.g., LAYER_DTYPES) to reduce repetition and ease future edits.

+ const LAYER_DTYPES = { volumeTracingId: "uint16", color: "uint8" } as const;
 ...
- createBucketResponseFunction(
-   { volumeTracingId: "uint16", color: "uint8" },
-   newCellId + 1,
-   500,
- ),
+ createBucketResponseFunction(LAYER_DTYPES, newCellId + 1, 500),
frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_2.spec.ts (1)

139-140: Consider a shared constant for the dtype mapping

Same descriptor appears multiple times throughout this spec. Extracting a constant at the top keeps things consistent and reduces maintenance overhead.

+ const LAYER_DTYPES = { volumeTracingId: "uint16", color: "uint8" } as const;
 ...
- createBucketResponseFunction({ volumeTracingId: "uint16", color: "uint8" }, 0)
+ createBucketResponseFunction(LAYER_DTYPES, 0)

Also applies to: 226-227, 302-303, 377-378, 449-450, 509-510, 591-592

frontend/javascripts/test/helpers/apiHelpers.ts (3)

232-241: Layer-name parsing is brittle; resolve by scanning for 'layers' segment

Index-based extraction urlPathParts[4] vs urlPathParts[2] will break if the path layout changes (e.g., tracingstore vs datastore routes). Make this robust by finding the 'layers' segment and taking the next token. Fallback to last token if not found.

-    const requestedURL = new URL(_url);
-    // Removing first empty part as the pathname always starts with a /.
-    const urlPathParts = requestedURL.pathname.split("/").slice(1);
-    const requestedLayerName = urlPathParts[0] === "data" ? urlPathParts[4] : urlPathParts[2];
+    const requestedURL = new URL(_url);
+    const urlPathParts = requestedURL.pathname.split("/").filter(Boolean);
+    const layersIdx = urlPathParts.findIndex((p) => p === "layers");
+    const requestedLayerName =
+      layersIdx >= 0 && layersIdx + 1 < urlPathParts.length
+        ? urlPathParts[layersIdx + 1]
+        : urlPathParts.at(-1)!; // fallback: last segment

245-251: Typed array allocation/fill: OK; minor simplification possible

Logic correctly scales with element size and channel count, and handles BigInt arrays. Minor: returning typedArray.buffer directly below avoids the redundant new Uint8Array(...) wrapper.

-    return {
-      buffer: new Uint8Array(typedArray.buffer).buffer,
+    return {
+      buffer: typedArray.buffer,
       headers: {
         "missing-buckets": "[]",
       },
     };

232-241: Optional: add negative-test support to simulate malformed bucket byte lengths

Given the PR’s goal (reject unexpected byte counts), consider an overload/helper that intentionally skews the returned byte length (±N bytes) to make tests explicit and deterministic instead of relying on commented toggles elsewhere.

I can draft a createSkewedBucketResponseFunction(dataTypePerLayer, fill, { byteSkew }) that wraps this implementation and slices/extends the ArrayBuffer accordingly. Want me to propose it?
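
For illustration, such a wrapper might look roughly like the sketch below (the wrapper name, the byteSkew option, and the import paths are hypothetical; it only relies on the existing helper returning a { buffer, headers } object):

// Hypothetical helper that deliberately skews the payload size; illustrative only.
import { createBucketResponseFunction } from "test/helpers/apiHelpers"; // adjust path as needed
import type { ElementClass } from "types/api_types"; // adjust path as needed

export function createSkewedBucketResponseFunction(
  dataTypePerLayer: Record<string, ElementClass>,
  fillValue: number,
  options: { byteSkew: number },
) {
  const inner = createBucketResponseFunction(dataTypePerLayer, fillValue, 0);
  return async (url: string, payload: any) => {
    const ok = await inner(url, payload);
    const full = new Uint8Array(ok.buffer);
    const skewedLength = Math.max(0, full.byteLength + options.byteSkew);
    const skewed = new Uint8Array(skewedLength);
    // Copy as much of the original payload as fits; any extra bytes stay zero.
    skewed.set(full.subarray(0, Math.min(full.byteLength, skewedLength)));
    return { buffer: skewed.buffer, headers: ok.headers };
  };
}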

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

💡 Knowledge Base configuration:

  • MCP integration is disabled by default for public repositories
  • Jira integration is disabled by default for public repositories
  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between de553c2 and b7725cc.

📒 Files selected for processing (14)
  • frontend/javascripts/test/api/api_volume_latest.spec.ts (2 hunks)
  • frontend/javascripts/test/helpers/apiHelpers.ts (3 hunks)
  • frontend/javascripts/test/model/binary/layers/wkstore_adapter.spec.ts (10 hunks)
  • frontend/javascripts/test/sagas/proofreading.spec.ts (1 hunks)
  • frontend/javascripts/test/sagas/volumetracing/bucket_eviction_helper.ts (1 hunks)
  • frontend/javascripts/test/sagas/volumetracing/bucket_eviction_with_saving.spec.ts (1 hunks)
  • frontend/javascripts/test/sagas/volumetracing/bucket_eviction_without_saving.spec.ts (1 hunks)
  • frontend/javascripts/test/sagas/volumetracing/volumetracing_remote_bucket_updates.spec.ts (2 hunks)
  • frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_1.spec.ts (6 hunks)
  • frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_2.spec.ts (8 hunks)
  • frontend/javascripts/viewer/model/bucket_data_handling/bucket.ts (1 hunks)
  • frontend/javascripts/viewer/model/bucket_data_handling/pullqueue.ts (2 hunks)
  • frontend/javascripts/viewer/model/bucket_data_handling/wkstore_adapter.ts (1 hunks)
  • unreleased_changes/8776.md (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2024-11-22T17:18:04.217Z
Learnt from: dieknolle3333
PR: scalableminds/webknossos#8168
File: frontend/javascripts/oxalis/model/sagas/proofread_saga.ts:1039-1039
Timestamp: 2024-11-22T17:18:04.217Z
Learning: In `frontend/javascripts/oxalis/model/sagas/proofread_saga.ts`, when calling `getMagInfo`, the use of `volumeTracingLayer.resolutions` is intentional and should not be changed to `volumeTracingLayer.mags`.

Applied to files:

  • frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_1.spec.ts
🧬 Code graph analysis (11)
frontend/javascripts/test/api/api_volume_latest.spec.ts (1)
frontend/javascripts/test/helpers/apiHelpers.ts (1)
  • createBucketResponseFunction (224-272)
frontend/javascripts/test/sagas/proofreading.spec.ts (1)
frontend/javascripts/test/helpers/apiHelpers.ts (1)
  • createBucketResponseFunction (224-272)
frontend/javascripts/test/sagas/volumetracing/bucket_eviction_with_saving.spec.ts (1)
frontend/javascripts/test/helpers/apiHelpers.ts (1)
  • createBucketResponseFunction (224-272)
frontend/javascripts/viewer/model/bucket_data_handling/wkstore_adapter.ts (1)
frontend/javascripts/libs/DRACOWorker.worker.js (1)
  • buffer (28-28)
frontend/javascripts/test/sagas/volumetracing/bucket_eviction_helper.ts (1)
frontend/javascripts/test/helpers/apiHelpers.ts (1)
  • createBucketResponseFunction (224-272)
frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_2.spec.ts (1)
frontend/javascripts/test/helpers/apiHelpers.ts (1)
  • createBucketResponseFunction (224-272)
frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_1.spec.ts (1)
frontend/javascripts/test/helpers/apiHelpers.ts (1)
  • createBucketResponseFunction (224-272)
frontend/javascripts/test/sagas/volumetracing/bucket_eviction_without_saving.spec.ts (1)
frontend/javascripts/test/helpers/apiHelpers.ts (1)
  • createBucketResponseFunction (224-272)
frontend/javascripts/test/model/binary/layers/wkstore_adapter.spec.ts (3)
frontend/javascripts/types/api_types.ts (1)
  • APIDataLayer (113-113)
frontend/javascripts/test/fixtures/hybridtracing_object.ts (1)
  • colorLayer (12-31)
frontend/javascripts/viewer/model/bucket_data_handling/wkstore_adapter.ts (1)
  • requestWithFallback (95-170)
frontend/javascripts/test/sagas/volumetracing/volumetracing_remote_bucket_updates.spec.ts (1)
frontend/javascripts/test/helpers/apiHelpers.ts (1)
  • createBucketResponseFunction (224-272)
frontend/javascripts/test/helpers/apiHelpers.ts (2)
frontend/javascripts/types/api_types.ts (1)
  • ElementClass (35-46)
frontend/javascripts/viewer/model/helpers/typed_buffer.ts (2)
  • TypedArrayConstructor (4-13)
  • getConstructorForElementClass (15-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-smoketest-push
  • GitHub Check: backend-tests
🔇 Additional comments (14)
frontend/javascripts/test/model/binary/layers/wkstore_adapter.spec.ts (3)

2-2: Good call: use per-layer byte width instead of fixed sizes

Importing getByteCountFromLayer ensures the test data reflects dtype (e.g., uint32 vs uint8). This aligns the test with the new validation logic.


16-16: Appropriate use of shared constants

Importing Constants (for BUCKET_SIZE) removes magic numbers and keeps the test consistent with production defaults.


87-91: Sanity check on color layer looks good

Verifying name/category/bit depth is a solid invariant check before deeper tests run.

frontend/javascripts/test/sagas/volumetracing/bucket_eviction_without_saving.spec.ts (1)

19-20: LGTM: test helper migration to per-layer descriptors

Switching to { volumeTracingId: "uint16", color: "uint8" } aligns the spec with the new helper API and mixed-layer scenarios. No issues spotted.

frontend/javascripts/test/sagas/proofreading.spec.ts (1)

65-65: Switched to per-layer dtype map — aligned with new createBucketResponseFunction signature

Providing { color: "uint8", segmentation: "uint16" } matches the helper’s expected Record<string, ElementClass> and should prevent “Layer Type … not provided” errors for these two layers. Looks good.

frontend/javascripts/test/api/api_volume_latest.spec.ts (1)

1-5: Importing createBucketResponseFunction — consistent with new mocking approach

The import changes are correct and consistent with the updated test helper API.

frontend/javascripts/test/sagas/volumetracing/bucket_eviction_helper.ts (1)

35-39: Good: per-layer dtype map covers all likely requests

Including { volumeTracingId: "uint16", color: "uint8", segmentation: "uint16" } avoids missing-type errors during the heavy labeling flow. Looks solid.

frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_1.spec.ts (2)

54-55: Switch to per-layer descriptor for createBucketResponseFunction: looks good

Passing { volumeTracingId: "uint16", color: "uint8" } aligns with the new helper signature and clarifies intent per layer. No issues spotted.


113-114: Layer name consistency verified

  • The volumetracing_server_objects.ts fixture defines TRACING_ID = "volumeTracingId" and exposes it on each annotationLayer.tracingId field (frontend/javascripts/test/fixtures/volumetracing_server_objects.ts lines 8–16).
  • All volumetracing saga tests invoke createBucketResponseFunction({ volumeTracingId: "uint16", color: "uint8" }, …) (e.g., frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_1.spec.ts lines 113–114), matching the fixture layer key.
  • The viewer’s data API builds bucket URLs via
    ${dataset.dataStore.url}/data/datasets/${…}/layers/${layerName}/data?…
    where layerName is sourced from getVolumeTracingLayerIds()—i.e., the tracingId field ("volumeTracingId")—ensuring full alignment between tests and runtime endpoints.

No discrepancies found; no changes needed.

frontend/javascripts/test/sagas/volumetracing/volumetracing_saga_integration_2.spec.ts (1)

49-50: Descriptor-based mock update: OK

Using { volumeTracingId: "uint16", color: "uint8" } matches the refactored helper. Good update.

frontend/javascripts/test/helpers/apiHelpers.ts (4)

53-55: New types/imports fit the refactor

Importing ElementClass and getConstructorForElementClass is appropriate for per-layer dtype handling.


83-85: Default mock wiring is sensible

Providing a default mapping { color: "uint8", segmentation: "uint16" } for tests that don’t override the mock keeps backwards compatibility.


224-229: Function signature: clear and future-proof

Switching to dataTypePerLayer: Record<string, ElementClass> is the right abstraction for multi-layer scenarios.


232-243: Confirmed: “double”/Float64Array is intentionally unsupported
The WebKnossos elementClass only includes uint8, uint16, uint32, and float32. An ElementClass of "double" (Float64Array) isn’t documented or implemented, so getConstructorForElementClass throwing for "double" is the expected behavior. No changes to this helper are required unless future support for float64 is added.

Comment on lines 197 to 205
  it<TestContext>("requestWithFallback: Request Handling: should pass the correct request parameters", async ({
-   layer,
+   colorLayer,
  }) => {
-   const { batch } = prepare();
+   const { batch } = prepare(colorLayer);
    const expectedUrl = "url/data/datasets/datasetId/layers/color/data?token=token2";
    const expectedOptions = createExpectedOptions();

-   await requestWithFallback(layer, batch).then(() => {
+   await requestWithFallback(colorLayer, batch).then(() => {
      const RequestMock = vi.mocked(Request);
Contributor

⚠️ Potential issue

Expected URL uses token2 but no refresh occurs in this test scenario

In this test, Request.receiveJSON isn’t configured to return a second token. With the current setup (no 403), requestWithFallback should use "token". The assertion for token2 will fail spuriously.

Apply this diff:

-    const expectedUrl = "url/data/datasets/datasetId/layers/color/data?token=token2";
+    const expectedUrl = "url/data/datasets/datasetId/layers/color/data?token=token";
📝 Committable suggestion


Suggested change

  it<TestContext>("requestWithFallback: Request Handling: should pass the correct request parameters", async ({
    colorLayer,
  }) => {
    const { batch } = prepare(colorLayer);
-   const expectedUrl = "url/data/datasets/datasetId/layers/color/data?token=token2";
+   const expectedUrl = "url/data/datasets/datasetId/layers/color/data?token=token";
    const expectedOptions = createExpectedOptions();

    await requestWithFallback(colorLayer, batch).then(() => {
      const RequestMock = vi.mocked(Request);
🤖 Prompt for AI Agents
In frontend/javascripts/test/model/binary/layers/wkstore_adapter.spec.ts around
lines 197 to 205, the test asserts a URL containing "token2" but the mocked
Request.receiveJSON does not simulate a token refresh (no 403 path), so the test
should expect the original token; update the expectedUrl to use "token" (or
alternatively mock Request.receiveJSON to return a refreshed token and simulate
the 403 path) so the assertion matches the actual test setup.

} else {
this.handleBucket(bucket, bucketBuffer);
if (bucketBuffer == null && !renderMissingDataBlack) {
bucket.markAsFailed(true);
Member

What is the difference between bucket.markAsFailed(true) and failedBucketAddresses.push(bucketAddress)?

Contributor

failedBucketAddresses is an array of bucket addresses that stores buckets which have no or broken data and thus should be immediately re-requested. This is done a few lines below (lines 121ff.):

failedBucketAddresses = failedBucketAddresses.length === 0 ? batch : failedBucketAddresses;
      for (const bucketAddress of failedBucketAddresses) {
        const bucket = this.cube.getBucket(bucketAddress);

        if (bucket.type === "data") {
          bucket.markAsFailed(false);

          if (bucket.dirty) {
            bucket.addToPullQueueWithHighestPriority();
          }
        }
      }

And bucket.markAsFailed(true) sets the state of the data bucket in the FE to "missing", leading to some other side effects (if the user annotated something while the bucket was still loading, it will be added to the push queue). At least that's how I read the code.

@MichaelBuessemeyer MichaelBuessemeyer merged commit 9d6fe6e into master Aug 25, 2025
5 checks passed
@MichaelBuessemeyer MichaelBuessemeyer deleted the reject-unexpected-bucket-lengths branch August 25, 2025 11:59