GitHub - nk2028/opencc-js: The JavaScript version of Open Chinese Convert (OpenCC)



The Pure JavaScript version of Open Chinese Convert (OpenCC)

opencc-js is a pure JavaScript implementation of OpenCC for both browsers and Node.js. It bundles dictionary data generated from opencc-data at build time, and no native binary is required.

The conversion pipeline aligns with the official OpenCC implementation, including phrase-level segmentation for the built-in converters, verified against upstream OpenCC test cases and golden outputs. Exact parity with the official OpenCC output is not guaranteed for all inputs.

opencc-js supports the OpenCC mmseg-style segmentation used by the built-in converters, but does not support extended segmenters such as jieba.
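To illustrate why phrase-level matching matters, here is a minimal sketch of greedy longest-match (forward maximum matching) conversion over a toy dictionary. It is not the opencc-js implementation: `buildToyConverter` and the three dictionary entries are hypothetical, and the real pipeline segments text and chains several dictionaries.

```javascript
// Illustrative sketch only: greedy longest-match conversion over a toy dictionary.
// `buildToyConverter` is a hypothetical helper, not part of the opencc-js API.
function buildToyConverter(pairs) {
  const dict = new Map(pairs);
  // Longest key length, measured in code points.
  let maxLen = 0;
  for (const key of dict.keys()) {
    maxLen = Math.max(maxLen, Array.from(key).length);
  }
  return function convert(text) {
    const chars = Array.from(text);
    let out = '';
    let i = 0;
    while (i < chars.length) {
      let matched = false;
      // Try the longest candidate first, then shrink one character at a time.
      for (let len = Math.min(maxLen, chars.length - i); len >= 1; len--) {
        const candidate = chars.slice(i, i + len).join('');
        if (dict.has(candidate)) {
          out += dict.get(candidate);
          i += len;
          matched = true;
          break;
        }
      }
      if (!matched) {
        // No entry starts here: copy the character unchanged.
        out += chars[i];
        i += 1;
      }
    }
    return out;
  };
}

// Toy data: the phrase entry must win over the single-character entries.
const toyConvert = buildToyConverter([
  ['头发', '頭髮'],
  ['发', '發'],
  ['头', '頭'],
]);
console.log(toyConvert('头发')); // phrase entry applies
console.log(toyConvert('出发')); // falls back to the single-character entry
```

With character-by-character replacement alone, 头发 would become 頭發; matching the phrase first yields 頭髮.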

Note: For a comparison with the opencc and opencc-wasm packages, see "Differences between various opencc npm packages" below.

Data

Dictionary data is generated from opencc-data at build time and bundled in the published package. Browser usage does not fetch extra dictionary text files at runtime.

To avoid producing tofu boxes for glyphs that are often missing from browser and system fonts, opencc-js does not bundle OpenCC's TSCharactersExt tofu-risk mappings. A small number of rare Traditional-to-Simplified extension-character conversions may therefore intentionally differ from the upstream OpenCC test data.

Usage

Choose the installation method that matches your environment.

Important: Version 1.3.2-next.0 contains a critical bugfix. If you are using a CDN or self-hosted build, use this prerelease until the next stable release is published.

Install opencc-js for Node.js or a bundler

ES modules:

import OpenCC from 'opencc-js';

CommonJS:

const OpenCC = require('opencc-js');

Use opencc-js in a browser

Self-hosted ES module:

CDN ES module:

UMD build for plain script tags:

Basic usage

// Convert Traditional Chinese (Hong Kong) to Simplified Chinese (Mainland China)
const converter = OpenCC.Converter({ from: 'hk', to: 'cn' });
console.log(converter('漢語')); // output: 汉语

Custom Converter

const converter = OpenCC.CustomConverter([
  ['香蕉', 'banana'],
  ['蘋果', 'apple'],
  ['梨', 'pear'],
]);
console.log(converter('香蕉 蘋果 梨')); // output: banana apple pear

Alternatively, pass a single string in which a space separates each source word from its replacement and a vertical bar separates the entries.

const converter = OpenCC.CustomConverter('香蕉 banana|蘋果 apple|梨 pear');
console.log(converter('香蕉 蘋果 梨')); // output: banana apple pear
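The string form can be read as a compact encoding of the same pair list. The sketch below shows one way such a string could be split into pairs. `parseDictString` is a hypothetical helper written for illustration, and the library's own parsing may differ in edge cases.

```javascript
// Illustrative sketch only: `parseDictString` is hypothetical, not opencc-js API.
// Splits 'source target|source target|...' into [source, target] pairs.
function parseDictString(dictString) {
  return dictString
    .split('|')
    .filter((entry) => entry.length > 0) // ignore empty entries, e.g. a trailing '|'
    .map((entry) => {
      const sep = entry.indexOf(' ');
      if (sep === -1) {
        throw new Error(`Missing space delimiter in entry: "${entry}"`);
      }
      // Everything before the first space is the source, the rest is the target.
      return [entry.slice(0, sep), entry.slice(sep + 1)];
    });
}

const parsedPairs = parseDictString('香蕉 banana|蘋果 apple|梨 pear');
console.log(parsedPairs);
```

The result has the same shape as the array passed to CustomConverter in the first example.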

Add words

const customDict = [
  ['“', '「'],
  ['”', '」'],
  ['‘', '『'],
  ['’', '』'],
];
const converter = OpenCC.ConverterFactory(
  OpenCC.Locale.from.cn, // Simplified Chinese (Mainland China) => OpenCC standard
  OpenCC.Locale.to.tw.concat([customDict]) // OpenCC standard => Traditional Chinese (Taiwan) with custom words
);
console.log(converter('悟空道:“师父又来了。怎么叫做‘水中捞月’?”'));
// output: 悟空道:「師父又來了。怎麼叫做『水中撈月』?」

The following produces the same result by applying the custom dictionary as an extra conversion step after the Taiwan conversion.

const customDict = [
  ['“', '「'],
  ['”', '」'],
  ['‘', '『'],
  ['’', '』'],
];
const converter = OpenCC.ConverterFactory(
  OpenCC.Locale.from.cn, // Simplified Chinese (Mainland China) => OpenCC standard
  OpenCC.Locale.to.tw, // OpenCC standard => Traditional Chinese (Taiwan)
  [customDict] // Traditional Chinese (Taiwan) => custom words
);
console.log(converter('悟空道:“师父又来了。怎么叫做‘水中捞月’?”'));
// output: 悟空道:「師父又來了。怎麼叫做『水中撈月』?」
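The two variants differ in where the custom dictionary sits: merged into an existing step, or applied as a step of its own. The sketch below models that idea with toy data, assuming each argument is a group of dictionaries applied in sequence. `makeStep`, `chain`, `toyTw`, and `toyQuotes` are hypothetical names, and this is not how opencc-js is necessarily implemented internally.

```javascript
// Illustrative sketch only: hypothetical helpers, not the opencc-js internals.
// One step merges a group of dictionaries and applies greedy longest-match.
function makeStep(dictGroup) {
  // Assumption of this sketch: later dictionaries override earlier ones.
  const merged = new Map(dictGroup.flat());
  // Lengths are counted in UTF-16 code units, which is enough for this toy data.
  const maxLen = Math.max(0, ...[...merged.keys()].map((k) => k.length));
  return (text) => {
    let out = '';
    for (let i = 0; i < text.length; ) {
      let len = Math.min(maxLen, text.length - i);
      while (len > 0 && !merged.has(text.slice(i, i + len))) len--;
      if (len > 0) {
        out += merged.get(text.slice(i, i + len));
        i += len;
      } else {
        out += text[i];
        i += 1;
      }
    }
    return out;
  };
}

// Apply each group as its own step, feeding the output of one into the next.
function chain(...dictGroups) {
  const steps = dictGroups.map(makeStep);
  return (text) => steps.reduce((acc, step) => step(acc), text);
}

const toyTw = [['师父', '師父'], ['来', '來']];
const toyQuotes = [['“', '「'], ['”', '」']];

const oneStep = chain([toyTw, toyQuotes]); // custom words merged into the same step
const twoSteps = chain([toyTw], [toyQuotes]); // custom words as an extra step

console.log(oneStep('“师父来了”'));
console.log(twoSteps('“师父来了”'));
```

Both converters agree here because the quote entries do not overlap with the other dictionary. When entries do overlap, the two arrangements can give different results, since a later step sees the output of the earlier one.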

DOM operations

The HTML lang attribute selects which elements are converted.

<span lang="zh-HK">漢語</span>

// Convert from Traditional Chinese (Hong Kong) to Simplified Chinese (Mainland China)
const converter = OpenCC.Converter({ from: 'hk', to: 'cn' });
// Start the conversion at the root node, i.e. convert the whole page
const rootNode = document.documentElement;
// Convert all elements with lang='zh-HK' and change the attribute value to lang='zh-CN'
const HTMLConvertHandler = OpenCC.HTMLConverter(converter, rootNode, 'zh-HK', 'zh-CN');
HTMLConvertHandler.convert(); // Convert -> 汉语
HTMLConvertHandler.restore(); // Restore -> 漢語
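The convert/restore pair implies that the original text is remembered so it can be put back. The sketch below shows that pattern on a toy tree of plain objects instead of a real DOM, so it runs without a browser. The node shape (`lang`, `text`, `children`), `makeToyHandler`, and the inheritance rule are assumptions made for illustration and may not match HTMLConverter's actual behavior.

```javascript
// Illustrative sketch only: a DOM-free model of the convert/restore pattern.
// `makeToyHandler` and the { lang, text, children } node shape are hypothetical.
function makeToyHandler(convertFn, root, fromLang, toLang) {
  let saved = []; // original { node, text, lang } records, used by restore()

  function walk(node, inheritedLang) {
    // Assumption: a node without its own lang inherits the nearest ancestor's.
    const effectiveLang = node.lang ?? inheritedLang;
    if (effectiveLang === fromLang) {
      saved.push({ node, text: node.text, lang: node.lang });
      if (node.text !== undefined) node.text = convertFn(node.text);
      if (node.lang === fromLang) node.lang = toLang;
    }
    for (const child of node.children ?? []) walk(child, effectiveLang);
  }

  return {
    convert() {
      // Skip if already converted, so the saved originals are not overwritten.
      if (saved.length === 0) walk(root, undefined);
    },
    restore() {
      for (const { node, text, lang } of saved) {
        if (text !== undefined) node.text = text;
        if (lang !== undefined) node.lang = lang;
      }
      saved = [];
    },
  };
}

// Toy page and a toy one-phrase converter standing in for OpenCC.Converter.
const toyPage = {
  lang: 'en',
  children: [{ lang: 'zh-HK', text: '漢語' }, { text: 'hello' }],
};
const toyHandler = makeToyHandler(
  (s) => s.replace(/漢語/g, '汉语'),
  toyPage,
  'zh-HK',
  'zh-CN'
);

toyHandler.convert();
const convertedText = toyPage.children[0].text;
const convertedLang = toyPage.children[0].lang;
toyHandler.restore();
const restoredText = toyPage.children[0].text;
const restoredLang = toyPage.children[0].lang;
```

Only the node tagged zh-HK changes; the untagged sibling inherits en from the root and is left alone.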

API

Bundle optimization

import * as OpenCC from 'opencc-js/core'; // primary code
import * as Locale from 'opencc-js/preset'; // dictionary

const converter = OpenCC.ConverterFactory(Locale.from.hk, Locale.to.cn);
console.log(converter('漢語'));

Differences between various opencc npm packages

There are three related npm packages for OpenCC conversion. They differ in runtime environment, implementation approach, and segmentation support.

opencc-js is a pure JavaScript implementation for browsers and Node.js. It bundles dictionary data generated from opencc-data at build time, requiring no native binaries and no runtime file fetching. Its conversion pipeline aligns with the official OpenCC implementation, including mmseg-style phrase segmentation for built-in converters, verified against upstream OpenCC test cases and golden outputs. Exact parity with the official OpenCC output is not guaranteed for all inputs. Extended segmenters such as Jieba are not supported.

opencc is the official Node.js native binding for the OpenCC C++ project. It depends on native or prebuilt binaries and follows the official OpenCC engine. Extended segmentation algorithms such as Jieba are supported when the official OpenCC configuration and runtime allow it.

opencc-wasm is another browser-capable implementation using WebAssembly. Its configuration and conversion logic stay aligned with the official opencc package, and it can support Jieba segmentation through the official OpenCC runtime.

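| | opencc-js | opencc | opencc-wasm |
| --- | --- | --- | --- |
| Browser | Yes | No | Yes |
| Node.js | Yes | Yes | |
| Implementation | Pure JavaScript | Native C++ binding | WebAssembly |
| Native binary required | No | Yes | No |
| Dictionary source | Bundled at build time | Loaded at runtime | Loaded at runtime |
| Aligned with official OpenCC | Approximately | Yes | Yes |
| mmseg segmentation | Yes | Yes | Yes |
| Jieba segmentation available | No | Yes | Yes |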