[GH-25] Fix persistent split/merged flags causing data corruption #41

Merged: zhangfengcdt merged 2 commits into main from fix.node.mergeandsplit on Jul 13, 2025

zhangfengcdt
Owner

Summary

Fixes a critical bug where the split and merged boolean flags in ProllyNode were being persisted and never reset, causing severe data corruption during tree operations.

Problem

The split and merged flags were:

  • Being serialized/deserialized with the node data
  • Never reset to false after operations completed
  • Causing incorrect tree restructuring on subsequent operations
  • Leading to catastrophic data loss when merged nodes had siblings incorrectly removed
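
The failure mode in the last bullet can be illustrated with a toy model. This is a minimal sketch, not the project's code: `Child`, `rebalance`, and the rule that a `merged` child has absorbed its right-hand sibling are all assumptions made for illustration.

```rust
/// Hypothetical stand-in for a child entry seen by its parent.
#[derive(Debug, Clone)]
pub struct Child {
    pub id: u32,
    /// Transient hint: "this child just absorbed its right-hand sibling".
    pub merged: bool,
}

/// Toy parent-side restructuring step (assumed behavior, not the real one):
/// for every child flagged `merged`, drop the sibling to its right.
/// Note that the flag is never cleared here, mirroring the bug described above.
pub fn rebalance(children: &mut Vec<Child>) {
    let mut i = 0;
    while i < children.len() {
        if children[i].merged && i + 1 < children.len() {
            children.remove(i + 1);
        }
        i += 1;
    }
}

/// Helper that builds three children, the first one flagged as merged.
pub fn sample_children() -> Vec<Child> {
    vec![
        Child { id: 1, merged: true },
        Child { id: 2, merged: false },
        Child { id: 3, merged: false },
    ]
}
```

In this model the first `rebalance` call correctly removes child 2. Because the flag survives, a second call also removes child 3, which was never merged; that is the kind of sibling loss the bullets describe.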

Solution

  1. Made flags transient: Added #[serde(skip)] to prevent serialization
  2. Reset on retrieval: Both storage implementations now reset flags to false when nodes are loaded
  3. Added tests: Comprehensive tests verify the fix and prevent regressions
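
Step 1 can be sketched without any dependencies. In the PR the two fields carry `#[serde(skip)]`, which makes serde omit them when writing and fill them with `Default::default()` (that is, `false`) when reading. The hand-rolled `to_bytes`/`from_bytes` pair below only imitates that behavior; `Node` and its `keys` field are hypothetical simplifications of `ProllyNode`.

```rust
/// Simplified, hypothetical stand-in for ProllyNode.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Node {
    /// Durable state that must survive a round trip.
    pub keys: Vec<u8>,
    // Transient bookkeeping. In the PR these two fields are marked
    // #[serde(skip)]; here the encoder below simply ignores them.
    pub split: bool,
    pub merged: bool,
}

impl Node {
    /// Encode only the durable state; the flags are deliberately left out.
    pub fn to_bytes(&self) -> Vec<u8> {
        self.keys.clone()
    }

    /// Decode the durable state; the flags come from `Default`, so both
    /// start out as `false` regardless of what the writer had set.
    pub fn from_bytes(bytes: &[u8]) -> Node {
        Node {
            keys: bytes.to_vec(),
            ..Node::default()
        }
    }
}
```

A node that is written with `split` or `merged` set to `true` comes back from `from_bytes` with both flags cleared, while `keys` is preserved.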

Changes

  • src/node.rs: Added #[serde(skip)] to split/merged fields, added verification tests
  • src/storage.rs: Modified both InMemoryNodeStorage and FileNodeStorage to reset flags on retrieval
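
The storage-side reset might look roughly like the sketch below. It assumes an in-memory store that keeps node structs directly, in which case skipping the fields during serialization would not help on its own and the flags have to be cleared explicitly on read. `StoredNode`, `MemStore`, and the `u64` key are hypothetical names, not the signatures of `InMemoryNodeStorage` or `FileNodeStorage`.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified node type.
#[derive(Debug, Clone, Default)]
pub struct StoredNode {
    pub keys: Vec<u8>,
    pub split: bool,
    pub merged: bool,
}

/// Hypothetical in-memory store keyed by a numeric id.
#[derive(Default)]
pub struct MemStore {
    nodes: HashMap<u64, StoredNode>,
}

impl MemStore {
    /// Store the node exactly as given, flags included.
    pub fn insert(&mut self, id: u64, node: StoredNode) {
        self.nodes.insert(id, node);
    }

    /// Return a copy of the node with the transient flags cleared, so a
    /// caller never observes split/merged state left over from an earlier
    /// operation.
    pub fn get(&self, id: u64) -> Option<StoredNode> {
        self.nodes.get(&id).map(|stored| {
            let mut node = stored.clone();
            node.split = false;
            node.merged = false;
            node
        })
    }
}
```

Resetting at the read boundary means every caller starts from a clean node, whichever code path last wrote it.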

Testing

  • All existing tests pass (34/34)
  • New tests verify flags are not serialized and are properly reset
  • Manual testing confirms normal tree operations without corruption

Impact

This fix prevents data corruption and ensures tree integrity across all operations involving node splits and merges.

🤖 Generated with Claude Code

The bool-typed fields `merged` and `split` in the ProllyNode struct were being
persisted through serialization and never reset to false, causing severe data
corruption during tree operations.

This commit fixes the issue by:
1. Making split/merged flags transient using #[serde(skip)]
2. Ensuring flags are always reset to false when nodes are retrieved from storage
3. Adding comprehensive tests to verify the fix and prevent regressions

The fix prevents catastrophic data corruption where:
- Split nodes would have their edges incorrectly hoisted to parent on every operation
- Merged nodes would have their siblings incorrectly dropped from parent

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
zhangfengcdt changed the title from "Fix persistent split/merged flags causing data corruption" to "[GH-25] Fix persistent split/merged flags causing data corruption" on Jul 13, 2025
zhangfengcdt merged commit 93b5e28 into main on Jul 13, 2025
4 checks passed
zhangfengcdt deleted the fix.node.mergeandsplit branch on Jul 13, 2025 at 01:27