A Rust implementation of a time-decaying Bloom filter with multiple storage backends and a high-performance HTTP API server.
This crate provides a Bloom filter implementation that automatically expires elements after a configurable time period using a sliding window approach. It's particularly useful for rate limiting, caching, and tracking recently seen items where older data becomes less relevant over time.
- Time-based automatic element expiration - items naturally age out without manual intervention
- Multiple storage backends:
- In-memory for maximum performance
- ReDB persistence for durability across restarts
- Configurable false positive rate - tune memory usage vs. accuracy tradeoffs
- Multi-level sliding window design with timestamp-based expiration
- Complete API ecosystem:
- HTTP server with Swagger UI documentation
- CLI with interactive TUI mode
- Programmatic Rust API
The time-decaying Bloom filter uses a sliding window approach with the following characteristics:
- Sub-Filters: The main Bloom filter is divided into N sub-filters (BF_1, BF_2, …, BF_N)
- Time Windows: Each sub-filter corresponds to a fixed time window T (e.g., 1 minute)
- Rotation Mechanism: Sub-filters are rotated in a circular manner to represent sliding time intervals
- Insertion: Elements are added to the current active sub-filter with timestamps
- Query: Checks for element presence across all non-expired sub-filters
- Cleanup: Automatically removes expired elements based on configured time windows
Add this to your Cargo.toml
:
[dependencies]
expiring-bloom-rs = "0.2"
use expiring_bloom_rs::{FilterConfigBuilder, InMemorySlidingBloomFilter, SlidingBloomFilter};
use std::time::Duration;
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Configure the filter
let config = FilterConfigBuilder::default()
.capacity(1000)
.false_positive_rate(0.01)
.level_duration(Duration::from_secs(60))
.max_levels(3)
.build()?;
// Create an in-memory filter
let mut filter = InMemorySlidingBloomFilter::new(config)?;
// Insert and query items
filter.insert(b"test_item")?;
assert!(filter.query(b"test_item")?);
Ok(())
}
use expiring_bloom_rs::{FilterConfigBuilder, RedbSlidingBloomFilter, SlidingBloomFilter};
use std::{path::PathBuf, time::Duration};
fn main() -> Result<(), Box<dyn std::error::Error>> {
let config = FilterConfigBuilder::default()
.capacity(10000)
.false_positive_rate(0.01)
.level_duration(Duration::from_secs(60))
.max_levels(5)
.build()?;
// Filter state persists across program restarts
let mut filter = RedbSlidingBloomFilter::new(
Some(config),
PathBuf::from("bloom_database.redb")
)?;
filter.insert(b"persistent_item")?;
// Later or in another process:
// let filter = RedbSlidingBloomFilter::new(None, PathBuf::from("bloom_database.redb"))?;
// filter.query(b"persistent_item")?; // Returns true if not expired
Ok(())
}
use expiring_bloom_rs::{ServerConfigBuilder, FilterConfigBuilder};
use std::time::Duration;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Configure the server
let server_config = ServerConfigBuilder::default()
.server_host("127.0.0.1".to_string())
.server_port(3000)
.bloom_db_path("bloom.redb".to_string())
.bloom_capacity(10000)
.bloom_false_positive_rate(0.01)
.bloom_level_duration(Duration::from_secs(60))
.bloom_max_levels(3)
.build()?;
// Start the server
expiring_bloom_rs::run_server(server_config).await?;
Ok(())
}
The crate includes a command-line interface with both command mode and an interactive TUI:
# Create a new filter
expblf create --db-path myfilter.redb --capacity 10000 --fpr 0.01
# Insert an element
expblf load --db-path myfilter.redb insert --element "example-key"
# Check if an element exists
expblf load --db-path myfilter.redb check --element "example-key"
# Start interactive TUI
expblf tui --db-path myfilter.redb
The HTTP server provides the following REST endpoints:
GET /health
- Health check endpointPOST /items
- Insert an item into the filterGET /items/{value}
- Query if an item exists in the filterPOST /cleanup
- Manually trigger cleanup of expired items/swagger-ui
- Interactive API documentation
The filter can be configured with the following parameters:
Parameter | Description | Default |
---|---|---|
capacity |
Maximum number of elements | 1,000,000 |
false_positive_rate |
Desired false positive rate | 0.01 (1%) |
level_duration |
Duration before level rotation | 60 seconds |
max_levels |
Number of filter levels | 3 |
hash_function |
Custom hash function | Combined FNV-1a/Murmur3 |
Bro, it's 🦀🦀🦀 RUST 🦀🦀🦀 and its BLAZINGLY FAST 🚀🚀🚀
Memory usage is calculated as:
total_bits = capacity * max_levels
memory_bytes = total_bits * 8
Since i use u8
to store bool
.
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
This project is licensed under the MIT License - see the LICENSE file for details.