bloom
bloom : bloom access method - signature file based index
Overview
| ID | Extension | Package | Version | Category | License | Language |
|---|---|---|---|---|---|---|
| 2990 | bloom
|
bloom
|
1.0 |
FEAT
|
PostgreSQL
|
C
|
| Attribute | Has Binary | Has Library | Need Load | Has DDL | Relocatable | Trusted |
|---|---|---|---|---|---|---|
--s-d--
|
No
|
Yes
|
No
|
Yes
|
no
|
no
|
| Relationships | |
|---|---|
| See Also | hll
age
rum
pg_graphql
pg_jsonschema
jsquery
pg_hint_plan
hypopg
|
Packages
| PG18 | PG17 | PG16 | PG15 | PG14 |
|---|---|---|---|---|
1.0
|
1.0
|
1.0
|
1.0
|
1.0
|
This is a built-in contrib extension ship with the PostgreSQL kernel
Install
Create this extension with:
CREATE EXTENSION bloom;Usage
The bloom extension provides an index access method based on Bloom filters. A Bloom filter is a space-efficient data structure that tests whether an element is a member of a set, with possible false positives but no false negatives.
Bloom indexes are particularly useful for tables with many columns where queries test arbitrary combinations of columns. A single bloom index can replace multiple btree indexes while using significantly less space.
Create Bloom Index
CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6);With custom parameters:
CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3)
WITH (length=80, col1=2, col2=2, col3=4);Parameters
| Parameter | Default | Max | Description |
|---|---|---|---|
length |
80 | 4096 | Length of each signature in bits (rounded to nearest 16) |
col1 - col32 |
2 | 4095 | Number of bits generated for each index column |
Operator Class Support
Bloom indexes only support the equality operator (=). Built-in operator classes exist for int4 and text. You can define custom operator classes for other types that have a hash function:
CREATE OPERATOR CLASS text_ops
DEFAULT FOR TYPE text USING bloom AS
OPERATOR 1 =(text, text),
FUNCTION 1 hashtext(text);Example
-- Create a table with many columns
CREATE TABLE tbloom AS
SELECT
(random() * 1000000)::int as i1,
(random() * 1000000)::int as i2,
(random() * 1000000)::int as i3,
(random() * 1000000)::int as i4,
(random() * 1000000)::int as i5,
(random() * 1000000)::int as i6
FROM generate_series(1, 10000000);
-- A single bloom index covers all column combinations
CREATE INDEX bloomidx ON tbloom USING bloom (i1, i2, i3, i4, i5, i6);
-- Queries on any subset of columns can use the index
SELECT * FROM tbloom WHERE i2 = 898732 AND i5 = 123451;Limitations
- Only equality (
=) queries are supported (no range queries) - Does not support
UNIQUEindexes - Does not support searching for
NULLvalues - Results require recheck against heap due to false positives