pg_bulkload
pg_bulkload: a high-speed data loading utility for PostgreSQL
Overview
| ID | Extension | Package | Version | Category | License | Language |
|---|---|---|---|---|---|---|
| 9830 | pg_bulkload | pg_bulkload | 3.1.23 | ETL | BSD 3-Clause | C |
| Attribute | Has Binary | Has Library | Need Load | Has DDL | Relocatable | Trusted |
|---|---|---|---|---|---|---|
| --s-d-- | Yes | Yes | No | Yes | No | No |
| Relationships | |
|---|---|
| See Also | file_fdw, aws_s3, db_migrator, pg_fact_loader, mysql_fdw, oracle_fdw, postgres_fdw, pglogical |
pg18 fixed by vonng
Packages
| Type | Repo | Version | PG Major Compatibility | Package Pattern | Dependencies |
|---|---|---|---|---|---|
| EXT | PIGSTY | 3.1.23 | 18, 17, 16, 15, 14 | pg_bulkload | - |
| RPM | PGDG | 3.1.23 | 18, 17, 16, 15, 14 | pg_bulkload_$v | - |
| DEB | PIGSTY | 3.1.23 | 18, 17, 16, 15, 14 | postgresql-$v-pg-bulkload | - |
| Linux / PG | PG18 | PG17 | PG16 | PG15 | PG14 |
|---|---|---|---|---|---|
| el8.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| el8.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| el9.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| el9.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| el10.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| el10.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| d12.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| d12.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| d13.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| d13.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| u22.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| u22.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| u24.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
| u24.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 |
Source
pig build pkg pg_bulkload; # build rpm/deb
Install
Make sure the PGDG and PIGSTY repos are available:
pig repo add pgsql -u # add both repos and update the cache
Install this extension with pig:
pig install pg_bulkload; # install via package name, for the active PG version
pig install pg_bulkload -v 18; # install for PG 18
pig install pg_bulkload -v 17; # install for PG 17
pig install pg_bulkload -v 16; # install for PG 16
pig install pg_bulkload -v 15; # install for PG 15
pig install pg_bulkload -v 14; # install for PG 14
Create this extension with:
CREATE EXTENSION pg_bulkload;
Usage
A high-speed data loading tool for PostgreSQL that bypasses shared buffers for massive data loads, with built-in ETL features for input validation and data transformation.
Basic Usage
Load data using a control file:
pg_bulkload sample_csv.ctl
Output:
NOTICE: BULK LOAD START
NOTICE: BULK LOAD END
0 Rows skipped.
8 Rows successfully loaded.
0 Rows not loaded due to parse errors.
0 Rows not loaded due to duplicate errors.
0 Rows replaced with new rows.
Control File Example
# sample_csv.ctl
OUTPUT = my_table
INPUT = /path/to/data.csv
TYPE = CSV
DELIMITER = ,
QUOTE = "\""
ESCAPE = "\""
NULL = ""
SKIP = 1 # skip header row
PARSE_ERRORS = 100 # allow up to 100 parse errors
DUPLICATE_ERRORS = 0 # abort the load on any duplicate-key violation
ON_DUPLICATE_KEEP = NEW # or OLD; matters only when duplicates are tolerated
TRUNCATE = NO
Loading Modes
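- DIRECT: bypasses shared buffers and writes directly to data files (fastest); selected with the WRITER option
- PARALLEL: uses multiple processes for loading; also selected with the WRITER option
- CSV/BINARY/FIXED: input formats rather than loading modes, selected with the TYPE option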
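As a rough illustration of how a loading mode might be chosen, the sketch below generates two control files that differ only in their WRITER line. The helper name write_ctl, the temporary directory, and the table and input paths are placeholders invented for this example; WRITER is assumed to be the upstream option name for picking DIRECT or PARALLEL, so check the pg_bulkload documentation before relying on it.

```shell
#!/bin/sh
# write_ctl WRITER FILE: emit a minimal control file using the given writer.
# Hypothetical helper; OUTPUT and INPUT values are placeholders.
write_ctl() {
  writer=$1
  ctl=$2
  cat > "$ctl" <<EOF
# generated control file (all values are placeholders)
OUTPUT = my_table
INPUT = /path/to/data.csv
TYPE = CSV
WRITER = $writer
SKIP = 1
EOF
}

ctl_dir=$(mktemp -d)
write_ctl DIRECT   "$ctl_dir/direct.ctl"
write_ctl PARALLEL "$ctl_dir/parallel.ctl"

# Show one of the generated files; it would then be passed to pg_bulkload
# in the same way as sample_csv.ctl in Basic Usage.
cat "$ctl_dir/parallel.ctl"
```

Generating the file from a template keeps the input format (TYPE) and the loading mode (WRITER) as separate, independently adjustable settings.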
SQL Interface
-- Load data from within SQL
SELECT pg_bulkload(
'OUTPUT = my_table, INPUT = /path/to/data.csv, TYPE = CSV'
);
Key Features
- Bypasses PostgreSQL shared buffers for maximum throughput
- Input data validation with configurable error thresholds
- Duplicate key handling (keep new, keep old, or reject)
- CSV, fixed-length, and binary input formats
- Row skipping and filter functions for data transformation
- Parallel loading support
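The validation and duplicate-handling features surface in the load summary shown in Basic Usage, so a wrapper script can inspect that summary after a load. The following is a minimal sketch, assuming the summary has been saved to a file and keeps the wording shown in the sample output; rows_for and check_load are hypothetical helper names, not part of pg_bulkload.

```shell
#!/bin/sh
# rows_for FILE PHRASE: print the leading count from the first summary line
# that contains PHRASE.
rows_for() {
  awk -v phrase="$2" 'index($0, phrase) { print $1; exit }' "$1"
}

# check_load FILE: report the counts and return non-zero if any row was
# rejected for a parse error or a duplicate key.
check_load() {
  log=$1
  loaded=$(rows_for "$log" "Rows successfully loaded")
  parse_err=$(rows_for "$log" "parse errors")
  dup_err=$(rows_for "$log" "duplicate errors")
  echo "loaded=$loaded parse_errors=$parse_err duplicate_errors=$dup_err"
  [ "$parse_err" -eq 0 ] && [ "$dup_err" -eq 0 ]
}

# Demo on a copy of the sample summary from Basic Usage.
log=$(mktemp)
cat > "$log" <<'EOF'
NOTICE: BULK LOAD START
NOTICE: BULK LOAD END
0 Rows skipped.
8 Rows successfully loaded.
0 Rows not loaded due to parse errors.
0 Rows not loaded due to duplicate errors.
0 Rows replaced with new rows.
EOF
check_load "$log"   # prints: loaded=8 parse_errors=0 duplicate_errors=0
```

In a real pipeline the saved summary would come from redirecting the output of a pg_bulkload run; which stream the summary is written to is not stated here, so capture both if unsure.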
Documentation
Full documentation: http://ossc-db.github.io/pg_bulkload/index.html