zhparser

zhparser: a parser for Chinese full-text search

Overview

ID	Extension	Package	Version	Category	License	Language
2130	zhparser	zhparser	`2.3`	FTS	PostgreSQL	C

Attribute	Has Binary	Has Library	Need Load	Has DDL	Relocatable	Trusted
--s-d-r	No	Yes	No	Yes	Yes	No

Relationships
See Also	pg_trgm rum pg_search pgroonga pgroonga_database pg_bigm pg_tokenizer vchord_bm25

Packages

Type	Repo	Version	PG Major Compatibility	Package Pattern	Dependencies
EXT	PIGSTY	`2.3`	18 17 16 15 14	`zhparser`	-
RPM	PIGSTY	`2.3`	18 17 16 15 14	`zhparser_$v`	-
DEB	PIGSTY	`2.3`	18 17 16 15 14	`postgresql-$v-zhparser`	-

Linux / PG	PG18	PG17	PG16	PG15	PG14
el8.x86_64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
el8.aarch64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
el9.x86_64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
el9.aarch64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
el10.x86_64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
el10.aarch64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
d12.x86_64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
d12.aarch64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
d13.x86_64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
d13.aarch64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
u22.x86_64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
u22.aarch64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
u24.x86_64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3
u24.aarch64	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3	PIGSTY 2.3

Package	Version	OS	ORG	SIZE	File URL
`zhparser_18`	`2.3`	el8.x86_64	pigsty	4.7 MiB	zhparser_18-2.3-1PIGSTY.el8.x86_64.rpm
`zhparser_18`	`2.3`	el8.aarch64	pigsty	4.7 MiB	zhparser_18-2.3-1PIGSTY.el8.aarch64.rpm
`zhparser_18`	`2.3`	el9.x86_64	pigsty	4.3 MiB	zhparser_18-2.3-1PIGSTY.el9.x86_64.rpm
`zhparser_18`	`2.3`	el9.aarch64	pigsty	4.3 MiB	zhparser_18-2.3-1PIGSTY.el9.aarch64.rpm
`zhparser_18`	`2.3`	el10.x86_64	pigsty	4.3 MiB	zhparser_18-2.3-1PIGSTY.el10.x86_64.rpm
`zhparser_18`	`2.3`	el10.aarch64	pigsty	4.3 MiB	zhparser_18-2.3-1PIGSTY.el10.aarch64.rpm
`postgresql-18-zhparser`	`2.3`	d12.x86_64	pigsty	4.0 MiB	postgresql-18-zhparser_2.3-1PIGSTY~bookworm_amd64.deb
`postgresql-18-zhparser`	`2.3`	d12.aarch64	pigsty	4.0 MiB	postgresql-18-zhparser_2.3-1PIGSTY~bookworm_arm64.deb
`postgresql-18-zhparser`	`2.3`	d13.x86_64	pigsty	4.0 MiB	postgresql-18-zhparser_2.3-1PIGSTY~trixie_amd64.deb
`postgresql-18-zhparser`	`2.3`	d13.aarch64	pigsty	4.0 MiB	postgresql-18-zhparser_2.3-1PIGSTY~trixie_arm64.deb
`postgresql-18-zhparser`	`2.3`	u22.x86_64	pigsty	4.3 MiB	postgresql-18-zhparser_2.3-1PIGSTY~jammy_amd64.deb
`postgresql-18-zhparser`	`2.3`	u22.aarch64	pigsty	4.3 MiB	postgresql-18-zhparser_2.3-1PIGSTY~jammy_arm64.deb
`postgresql-18-zhparser`	`2.3`	u24.x86_64	pigsty	4.3 MiB	postgresql-18-zhparser_2.3-1PIGSTY~noble_amd64.deb
`postgresql-18-zhparser`	`2.3`	u24.aarch64	pigsty	4.3 MiB	postgresql-18-zhparser_2.3-1PIGSTY~noble_arm64.deb

Package	Version	OS	ORG	SIZE	File URL
`zhparser_17`	`2.3`	el8.x86_64	pigsty	4.7 MiB	zhparser_17-2.3-1PIGSTY.el8.x86_64.rpm
`zhparser_17`	`2.3`	el8.aarch64	pigsty	4.7 MiB	zhparser_17-2.3-1PIGSTY.el8.aarch64.rpm
`zhparser_17`	`2.3`	el9.x86_64	pigsty	4.3 MiB	zhparser_17-2.3-1PIGSTY.el9.x86_64.rpm
`zhparser_17`	`2.3`	el9.aarch64	pigsty	4.3 MiB	zhparser_17-2.3-1PIGSTY.el9.aarch64.rpm
`zhparser_17`	`2.3`	el10.x86_64	pigsty	4.3 MiB	zhparser_17-2.3-1PIGSTY.el10.x86_64.rpm
`zhparser_17`	`2.3`	el10.aarch64	pigsty	4.3 MiB	zhparser_17-2.3-1PIGSTY.el10.aarch64.rpm
`postgresql-17-zhparser`	`2.3`	d12.x86_64	pigsty	4.0 MiB	postgresql-17-zhparser_2.3-1PIGSTY~bookworm_amd64.deb
`postgresql-17-zhparser`	`2.3`	d12.aarch64	pigsty	4.0 MiB	postgresql-17-zhparser_2.3-1PIGSTY~bookworm_arm64.deb
`postgresql-17-zhparser`	`2.3`	d13.x86_64	pigsty	4.0 MiB	postgresql-17-zhparser_2.3-1PIGSTY~trixie_amd64.deb
`postgresql-17-zhparser`	`2.3`	d13.aarch64	pigsty	4.0 MiB	postgresql-17-zhparser_2.3-1PIGSTY~trixie_arm64.deb
`postgresql-17-zhparser`	`2.3`	u22.x86_64	pigsty	4.3 MiB	postgresql-17-zhparser_2.3-1PIGSTY~jammy_amd64.deb
`postgresql-17-zhparser`	`2.3`	u22.aarch64	pigsty	4.3 MiB	postgresql-17-zhparser_2.3-1PIGSTY~jammy_arm64.deb
`postgresql-17-zhparser`	`2.3`	u24.x86_64	pigsty	4.3 MiB	postgresql-17-zhparser_2.3-1PIGSTY~noble_amd64.deb
`postgresql-17-zhparser`	`2.3`	u24.aarch64	pigsty	4.3 MiB	postgresql-17-zhparser_2.3-1PIGSTY~noble_arm64.deb

Package	Version	OS	ORG	SIZE	File URL
`zhparser_16`	`2.3`	el8.x86_64	pigsty	4.7 MiB	zhparser_16-2.3-1PIGSTY.el8.x86_64.rpm
`zhparser_16`	`2.3`	el8.aarch64	pigsty	4.7 MiB	zhparser_16-2.3-1PIGSTY.el8.aarch64.rpm
`zhparser_16`	`2.3`	el9.x86_64	pigsty	4.3 MiB	zhparser_16-2.3-1PIGSTY.el9.x86_64.rpm
`zhparser_16`	`2.3`	el9.aarch64	pigsty	4.3 MiB	zhparser_16-2.3-1PIGSTY.el9.aarch64.rpm
`zhparser_16`	`2.3`	el10.x86_64	pigsty	4.3 MiB	zhparser_16-2.3-1PIGSTY.el10.x86_64.rpm
`zhparser_16`	`2.3`	el10.aarch64	pigsty	4.3 MiB	zhparser_16-2.3-1PIGSTY.el10.aarch64.rpm
`postgresql-16-zhparser`	`2.3`	d12.x86_64	pigsty	4.0 MiB	postgresql-16-zhparser_2.3-1PIGSTY~bookworm_amd64.deb
`postgresql-16-zhparser`	`2.3`	d12.aarch64	pigsty	4.0 MiB	postgresql-16-zhparser_2.3-1PIGSTY~bookworm_arm64.deb
`postgresql-16-zhparser`	`2.3`	d13.x86_64	pigsty	4.0 MiB	postgresql-16-zhparser_2.3-1PIGSTY~trixie_amd64.deb
`postgresql-16-zhparser`	`2.3`	d13.aarch64	pigsty	4.0 MiB	postgresql-16-zhparser_2.3-1PIGSTY~trixie_arm64.deb
`postgresql-16-zhparser`	`2.3`	u22.x86_64	pigsty	4.3 MiB	postgresql-16-zhparser_2.3-1PIGSTY~jammy_amd64.deb
`postgresql-16-zhparser`	`2.3`	u22.aarch64	pigsty	4.3 MiB	postgresql-16-zhparser_2.3-1PIGSTY~jammy_arm64.deb
`postgresql-16-zhparser`	`2.3`	u24.x86_64	pigsty	4.3 MiB	postgresql-16-zhparser_2.3-1PIGSTY~noble_amd64.deb
`postgresql-16-zhparser`	`2.3`	u24.aarch64	pigsty	4.3 MiB	postgresql-16-zhparser_2.3-1PIGSTY~noble_arm64.deb

Package	Version	OS	ORG	SIZE	File URL
`zhparser_15`	`2.3`	el8.x86_64	pigsty	4.7 MiB	zhparser_15-2.3-1PIGSTY.el8.x86_64.rpm
`zhparser_15`	`2.3`	el8.aarch64	pigsty	4.7 MiB	zhparser_15-2.3-1PIGSTY.el8.aarch64.rpm
`zhparser_15`	`2.3`	el9.x86_64	pigsty	4.3 MiB	zhparser_15-2.3-1PIGSTY.el9.x86_64.rpm
`zhparser_15`	`2.3`	el9.aarch64	pigsty	4.3 MiB	zhparser_15-2.3-1PIGSTY.el9.aarch64.rpm
`zhparser_15`	`2.3`	el10.x86_64	pigsty	4.3 MiB	zhparser_15-2.3-1PIGSTY.el10.x86_64.rpm
`zhparser_15`	`2.3`	el10.aarch64	pigsty	4.3 MiB	zhparser_15-2.3-1PIGSTY.el10.aarch64.rpm
`postgresql-15-zhparser`	`2.3`	d12.x86_64	pigsty	4.0 MiB	postgresql-15-zhparser_2.3-1PIGSTY~bookworm_amd64.deb
`postgresql-15-zhparser`	`2.3`	d12.aarch64	pigsty	4.0 MiB	postgresql-15-zhparser_2.3-1PIGSTY~bookworm_arm64.deb
`postgresql-15-zhparser`	`2.3`	d13.x86_64	pigsty	4.0 MiB	postgresql-15-zhparser_2.3-1PIGSTY~trixie_amd64.deb
`postgresql-15-zhparser`	`2.3`	d13.aarch64	pigsty	4.0 MiB	postgresql-15-zhparser_2.3-1PIGSTY~trixie_arm64.deb
`postgresql-15-zhparser`	`2.3`	u22.x86_64	pigsty	4.3 MiB	postgresql-15-zhparser_2.3-1PIGSTY~jammy_amd64.deb
`postgresql-15-zhparser`	`2.3`	u22.aarch64	pigsty	4.3 MiB	postgresql-15-zhparser_2.3-1PIGSTY~jammy_arm64.deb
`postgresql-15-zhparser`	`2.3`	u24.x86_64	pigsty	4.3 MiB	postgresql-15-zhparser_2.3-1PIGSTY~noble_amd64.deb
`postgresql-15-zhparser`	`2.3`	u24.aarch64	pigsty	4.3 MiB	postgresql-15-zhparser_2.3-1PIGSTY~noble_arm64.deb

Package	Version	OS	ORG	SIZE	File URL
`zhparser_14`	`2.3`	el8.x86_64	pigsty	4.7 MiB	zhparser_14-2.3-1PIGSTY.el8.x86_64.rpm
`zhparser_14`	`2.3`	el8.aarch64	pigsty	4.7 MiB	zhparser_14-2.3-1PIGSTY.el8.aarch64.rpm
`zhparser_14`	`2.3`	el9.x86_64	pigsty	4.3 MiB	zhparser_14-2.3-1PIGSTY.el9.x86_64.rpm
`zhparser_14`	`2.3`	el9.aarch64	pigsty	4.3 MiB	zhparser_14-2.3-1PIGSTY.el9.aarch64.rpm
`zhparser_14`	`2.3`	el10.x86_64	pigsty	4.3 MiB	zhparser_14-2.3-1PIGSTY.el10.x86_64.rpm
`zhparser_14`	`2.3`	el10.aarch64	pigsty	4.3 MiB	zhparser_14-2.3-1PIGSTY.el10.aarch64.rpm
`postgresql-14-zhparser`	`2.3`	d12.x86_64	pigsty	4.0 MiB	postgresql-14-zhparser_2.3-1PIGSTY~bookworm_amd64.deb
`postgresql-14-zhparser`	`2.3`	d12.aarch64	pigsty	4.0 MiB	postgresql-14-zhparser_2.3-1PIGSTY~bookworm_arm64.deb
`postgresql-14-zhparser`	`2.3`	d13.x86_64	pigsty	4.0 MiB	postgresql-14-zhparser_2.3-1PIGSTY~trixie_amd64.deb
`postgresql-14-zhparser`	`2.3`	d13.aarch64	pigsty	4.0 MiB	postgresql-14-zhparser_2.3-1PIGSTY~trixie_arm64.deb
`postgresql-14-zhparser`	`2.3`	u22.x86_64	pigsty	4.3 MiB	postgresql-14-zhparser_2.3-1PIGSTY~jammy_amd64.deb
`postgresql-14-zhparser`	`2.3`	u22.aarch64	pigsty	4.3 MiB	postgresql-14-zhparser_2.3-1PIGSTY~jammy_arm64.deb
`postgresql-14-zhparser`	`2.3`	u24.x86_64	pigsty	4.3 MiB	postgresql-14-zhparser_2.3-1PIGSTY~noble_amd64.deb
`postgresql-14-zhparser`	`2.3`	u24.aarch64	pigsty	4.3 MiB	postgresql-14-zhparser_2.3-1PIGSTY~noble_arm64.deb

Source

Repository

github.com/amutu/zhparser

Source Tarball

zhparser-2.3.tar.gz

pig build pkg zhparser;		# build rpm/deb

Install

Make sure the PGDG and PIGSTY repositories are available:

pig repo add pgsql -u   # add both repos and update the cache

Install this extension with pig:

pig install zhparser;		# install via package name, for the active PG version

pig install zhparser -v 18;   # install for PG 18
pig install zhparser -v 17;   # install for PG 17
pig install zhparser -v 16;   # install for PG 16
pig install zhparser -v 15;   # install for PG 15
pig install zhparser -v 14;   # install for PG 14

Create this extension with:

CREATE EXTENSION zhparser;
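
To confirm that the extension was created in the current database, the standard PostgreSQL catalogs can be queried. This is a minimal check that uses only the built-in `pg_extension` and `pg_ts_parser` catalogs; the version it reports depends on the package you installed.

```sql
-- The extension should be listed with its installed version
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'zhparser';

-- The parser should be registered for use in text search configurations
SELECT prsname
FROM pg_ts_parser
WHERE prsname = 'zhparser';
```

If either query returns no rows, `CREATE EXTENSION zhparser;` was probably run in a different database.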

Usage

GitHub: amutu/zhparser

zhparser is a PostgreSQL extension for full-text search of Chinese, based on the Simple Chinese Word Segmentation (SCWS) library.

Features

Chinese text segmentation for PostgreSQL full-text search
Built on the SCWS (Simple Chinese Word Segmentation) library
Supports custom dictionaries (TXT and XDB formats)
Database-level custom word tables (since v2.1)
Multiple tunable parameters for segmentation behavior

Quick Start

-- Create the extension
CREATE EXTENSION zhparser;

-- Create a text search configuration using zhparser
CREATE TEXT SEARCH CONFIGURATION chinese (PARSER = zhparser);

-- Add token type mappings
ALTER TEXT SEARCH CONFIGURATION chinese ADD MAPPING FOR n,v,a,i,e,l WITH simple;

-- Test Chinese text segmentation
SELECT to_tsvector('chinese', '小明硕士毕业于中国科学院计算所，后在日本京都大学深造');

-- Create a table and index for Chinese full text search
CREATE TABLE articles (id serial PRIMARY KEY, title text, body text);

CREATE INDEX articles_body_idx ON articles
  USING gin (to_tsvector('chinese', body));

-- Query with Chinese full text search
SELECT * FROM articles
  WHERE to_tsvector('chinese', body) @@ to_tsquery('chinese', '中国');
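
As a possible variation on the expression index above, the tsvector can be stored in a generated column so it is computed once per row, and matches can be ordered with the built-in `ts_rank` function. This is a sketch that assumes the `chinese` configuration and the `articles` table from the Quick Start; the names `body_tsv` and `articles_body_tsv_idx` are illustrative and not part of zhparser.

```sql
-- Store the segmented text in a generated column (hypothetical name: body_tsv)
ALTER TABLE articles
  ADD COLUMN body_tsv tsvector
  GENERATED ALWAYS AS (to_tsvector('chinese', coalesce(body, ''))) STORED;

-- Index the stored column instead of the expression
CREATE INDEX articles_body_tsv_idx ON articles USING gin (body_tsv);

-- Return the best-ranked matches first
SELECT id, title, ts_rank(body_tsv, q) AS rank
FROM articles, to_tsquery('chinese', '中国') AS q
WHERE body_tsv @@ q
ORDER BY rank DESC
LIMIT 10;
```

The stored column costs extra disk space, but queries no longer need to repeat the exact index expression to use the index.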

Configuration Parameters

zhparser provides several GUC parameters to control segmentation behavior:

Parameter	Default	Description
`zhparser.punctuation_ignore`	`off`	Ignore all punctuation
`zhparser.seg_with_duality`	`off`	Automatically group stray single characters into two-character (duality) segments
`zhparser.dict_in_memory`	`off`	Load the whole dictionary into memory
`zhparser.multi_short`	`off`	Short word compound segmentation
`zhparser.multi_duality`	`off`	Duality compound segmentation
`zhparser.multi_zmain`	`off`	Compound segmentation of important single characters
`zhparser.multi_zall`	`off`	Compound segmentation of all single characters
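
The effect of a parameter is easiest to judge by comparing segmentation output before and after changing it. The sketch below assumes the `chinese` configuration from the Quick Start and assumes that `zhparser.multi_short` can be changed per session with `SET`, as the upstream README indicates for most parameters; verify this against your installed version. The sample phrase is arbitrary.

```sql
-- Segment once with the current settings (this also loads the zhparser library)
SELECT to_tsvector('chinese', '中华人民共和国');

-- Inspect the current value
SHOW zhparser.multi_short;

-- Enable short-word compound segmentation for this session only
SET zhparser.multi_short = on;
SELECT to_tsvector('chinese', '中华人民共和国');

-- Return to the configured default
RESET zhparser.multi_short;
```

Because these parameters change how text is tokenized, an index built under one setting may not match queries run under another, so it is safest to keep the settings consistent between indexing and querying.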

Token Types

zhparser supports the following token types from SCWS:

Code	Description
`a`	Adjective
`b`	Distinguishing word (区别词)
`c`	Conjunction
`d`	Adverb
`e`	Interjection
`f`	Position word (方位词)
`g`	Root word (词根)
`h`	Prefix
`i`	Idiom
`j`	Abbreviation
`k`	Suffix
`l`	Temporary idiom
`m`	Numeral
`n`	Noun
`o`	Onomatopoeia
`p`	Preposition
`q`	Classifier
`r`	Pronoun
`s`	Place word (处所词)
`t`	Time word
`u`	Auxiliary
`v`	Verb
`w`	Punctuation
`x`	Unknown
`y`	Modal particle
`z`	Status word (状态词)
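
To see which token type each segment of a sentence receives, PostgreSQL's built-in `ts_token_type`, `ts_parse`, and `ts_debug` functions can be pointed at the parser. This is a minimal sketch; the `ts_debug` call assumes the `chinese` configuration from the Quick Start exists, and the sample sentence is the one used there.

```sql
-- List the token types reported by the parser
SELECT tokid, alias, description
FROM ts_token_type('zhparser');

-- Split a sentence and label each piece with its token type
SELECT t.alias, p.token
FROM ts_parse('zhparser', '小明硕士毕业于中国科学院计算所') AS p
JOIN ts_token_type('zhparser') AS t USING (tokid);

-- Show which dictionaries handle each token under the "chinese" configuration
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('chinese', '小明硕士毕业于中国科学院计算所');
```

Tokens whose type has no mapping in the configuration show an empty dictionary list in `ts_debug` and are left out of the resulting tsvector, which helps when deciding which codes to pass to `ADD MAPPING FOR`.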

Custom Dictionaries

File-based Dictionaries

Place custom dictionary files in the share directory (typically $SHAREDIR/tsearch_data/):

TXT format: one word per line
XDB format: compiled SCWS dictionary format

Custom dictionaries take precedence over built-in dictionaries.
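
The files also have to be named in zhparser's configuration before they are used. The sketch below assumes the `zhparser.extra_dicts` parameter described in the upstream README, which is not covered on this page, and uses hypothetical file names `dict_extra.txt` and `mydict.xdb`; check the parameter name and behavior against your installed version. It must be run as a superuser.

```sql
-- Load the library so that the zhparser.* parameters are registered in this session
LOAD 'zhparser';

-- Assumed parameter: comma-separated file names relative to tsearch_data
-- (dict_extra.txt and mydict.xdb are placeholder names)
ALTER SYSTEM SET zhparser.extra_dicts = 'dict_extra.txt,mydict.xdb';

-- Reload the configuration; open a new connection afterwards to pick it up
SELECT pg_reload_conf();
```

The same setting can be written directly in `postgresql.conf` instead of using `ALTER SYSTEM`.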

Database-level Custom Words (v2.1+)

-- Add custom words via zhparser's built-in table
INSERT INTO zhparser.zhprs_custom_word VALUES ('中国科学院计算所');

-- Reload custom dictionary (reconnect after sync to take effect)
SELECT sync_zhprs_custom_word();

-- Verify segmentation with custom word
SELECT to_tsvector('chinese', '小明硕士毕业于中国科学院计算所');

Docker Quick Start

docker run --name pgzhparser -d \
  -e POSTGRES_PASSWORD=somepassword \
  zhparser/zhparser:bookworm-16

Last updated on 2026-03-24
