GitHub - ikegami-yukino/jaconv: Pure-Python Japanese character interconverter for Hiragana, Katakana, Hankaku, and Zenkaku

jaconv (Japanese Converter) is an interconverter for Hiragana, Katakana, Hankaku (half-width characters), and Zenkaku (full-width characters).

A Japanese README is also available.

INSTALLATION

$ pip install jaconv

USAGE

See also the documentation.

import jaconv

Hiragana to Katakana

jaconv.hira2kata('ともえまみ')

=> 'トモエマミ'

Hiragana to half-width Katakana

jaconv.hira2hkata('ともえまみ')

=> 'ﾄﾓｴﾏﾐ'

Katakana to Hiragana

jaconv.kata2hira('巴マミ')

=> '巴まみ'
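
The Hiragana/Katakana conversions above can be illustrated with the standard library alone. This is a minimal sketch, not jaconv's implementation: it assumes only the fixed Unicode offset of 0x60 between the Hiragana block (U+3041 to U+3096) and the matching Katakana code points, and the function names are hypothetical.

```python
# Hypothetical helpers (not part of jaconv): a sketch of Hiragana <-> Katakana
# conversion based on the fixed code point offset between the two blocks.
KANA_OFFSET = 0x60  # U+3041 (HIRAGANA SMALL A) + 0x60 == U+30A1 (KATAKANA SMALL A)

# Translation tables keyed by code point, as expected by str.translate.
HIRA_TO_KATA = {cp: cp + KANA_OFFSET for cp in range(0x3041, 0x3097)}
KATA_TO_HIRA = {cp + KANA_OFFSET: cp for cp in range(0x3041, 0x3097)}


def hira_to_kata_sketch(text):
    """Replace each Hiragana character with its Katakana counterpart."""
    return text.translate(HIRA_TO_KATA)


def kata_to_hira_sketch(text):
    """Replace each Katakana character with its Hiragana counterpart."""
    return text.translate(KATA_TO_HIRA)


print(hira_to_kata_sketch('ともえまみ'))
print(kata_to_hira_sketch('巴マミ'))
```

Characters outside the mapped range, such as the kanji in the second call, pass through unchanged.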

half-width character to full-width character

default parameters are as follows: kana=True, ascii=False, digit=False

jaconv.h2z('ﾃｨﾛ･ﾌｨﾅｰﾚ')

=> 'ティロ・フィナーレ'
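
For intuition about the kana case, half-width Katakana can be widened with the standard library by applying NFKC normalization only to runs of half-width kana (U+FF61 to U+FF9F). This is a sketch under that assumption, with a hypothetical function name; it is not how jaconv is implemented and may differ from it on edge cases such as a stray voiced-sound mark.

```python
import re
import unicodedata

# Runs of half-width Katakana and half-width CJK punctuation.
_HALFWIDTH_KANA_RUN = re.compile('[\uFF61-\uFF9F]+')


def halfwidth_kana_to_fullwidth_sketch(text):
    """Widen half-width kana only; ASCII letters and digits are untouched.

    NFKC also composes a base kana followed by a half-width voiced mark
    (U+FF9E) into a single full-width character.
    """
    return _HALFWIDTH_KANA_RUN.sub(
        lambda match: unicodedata.normalize('NFKC', match.group()),
        text,
    )


print(halfwidth_kana_to_fullwidth_sketch('ﾃｨﾛ･ﾌｨﾅｰﾚabc123'))
```

Restricting normalization to the matched runs is what keeps the ASCII part of the string as it is, mirroring the kana-only default described above.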

half-width character to full-width character

but only ascii characters

jaconv.h2z('abc', kana=False, ascii=True, digit=False)

=> 'ａｂｃ'

half-width character to full-width character

but only digit characters

jaconv.h2z('123', kana=False, ascii=False, digit=True)

=> '１２３'

half-width character to full-width character

except half-width Katakana

jaconv.h2z('ｱabc123', kana=False, digit=True, ascii=True)

=> 'ｱａｂｃ１２３'

an alias of h2z

jaconv.hankaku2zenkaku('ﾃｨﾛ･ﾌｨﾅｰﾚabc123')

=> 'ティロ・フィナーレabc123'

full-width character to half-width character

default parameters are as follows: kana=True, ascii=False, digit=False

jaconv.z2h('ティロ・フィナーレ')

=> 'ﾃｨﾛ･ﾌｨﾅｰﾚ'

full-width character to half-width character

but only ascii characters

jaconv.z2h('ａｂｃ', kana=False, ascii=True, digit=False)

=> 'abc'

full-width character to half-width character

but only digit characters

jaconv.z2h('１２３', kana=False, ascii=False, digit=True)

=> '123'

full-width character to half-width character

except full-width Katakana

jaconv.z2h('アａｂｃ１２３', kana=False, digit=True, ascii=True)

=> 'アabc123'
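
The ascii and digit flags of h2z and z2h can be pictured as selecting which code points go into a translation table. The sketch below assumes only the Unicode layout, where full-width ASCII (U+FF01 to U+FF5E) sits exactly 0xFEE0 above printable ASCII (U+0021 to U+007E). The function names are hypothetical, kana and the space character are not handled, and the exact set of characters that jaconv treats as "ascii" may differ.

```python
# Hypothetical helpers (not part of jaconv): width conversion for ASCII
# letters/symbols and digits, selected by flags.
WIDTH_OFFSET = 0xFEE0  # ord('Ａ') - ord('A')
DIGITS = range(0x30, 0x3A)  # '0'..'9'
ASCII_NON_DIGITS = [cp for cp in range(0x21, 0x7F) if cp not in DIGITS]


def _build_table(ascii, digit, to_full):
    """Collect the selected code points and map them in the requested direction."""
    code_points = []
    if ascii:
        code_points += ASCII_NON_DIGITS
    if digit:
        code_points += DIGITS
    if to_full:
        return {cp: cp + WIDTH_OFFSET for cp in code_points}
    return {cp + WIDTH_OFFSET: cp for cp in code_points}


def h2z_ascii_digit_sketch(text, ascii=False, digit=False):
    """Half-width to full-width for the selected character classes."""
    return text.translate(_build_table(ascii, digit, to_full=True))


def z2h_ascii_digit_sketch(text, ascii=False, digit=False):
    """Full-width to half-width for the selected character classes."""
    return text.translate(_build_table(ascii, digit, to_full=False))


print(h2z_ascii_digit_sketch('abc123', ascii=True))
print(z2h_ascii_digit_sketch('ａｂｃ１２３', ascii=True, digit=True))
```

With both flags left at False the table is empty and the text is returned unchanged, which matches the documented defaults where only kana is converted.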

an alias of z2h

jaconv.zenkaku2hankaku('ティロ・フィナーレａｂｃ１２３')

=> 'ﾃｨﾛ･ﾌｨﾅｰﾚａｂｃ１２３'

normalize

jaconv.normalize('ティロ・フィナ〜レ', 'NFKC')

=> 'ティロ・フィナーレ'

Convert small Hiragana or Katakana to normal size

jaconv.enlarge_smallkana('わぁい')

=> 'わあい'

jaconv.enlarge_smallkana('きょういっぱい', ignore='っ')

=> 'きよういっぱい'
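
The effect of the ignore parameter can be sketched with a translation table that skips the listed characters. This is an illustrative approximation with a hypothetical function name and a hand-picked list of common small kana; the set of characters jaconv actually covers may be larger.

```python
# Hypothetical helper (not part of jaconv): enlarge common small kana,
# leaving any character listed in `ignore` untouched.
SMALL_KANA = 'ぁぃぅぇぉっゃゅょゎァィゥェォッャュョヮ'
LARGE_KANA = 'あいうえおつやゆよわアイウエオツヤユヨワ'


def enlarge_small_kana_sketch(text, ignore=''):
    """Map each small kana to its normal-size form unless it is ignored."""
    table = {
        ord(small): large
        for small, large in zip(SMALL_KANA, LARGE_KANA)
        if small not in ignore
    }
    return text.translate(table)


print(enlarge_small_kana_sketch('わぁい'))
print(enlarge_small_kana_sketch('きょういっぱい', ignore='っ'))
```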

Hiragana to alphabet

jaconv.kana2alphabet('じゃぱん')

=> 'japan'

Alphabet to Hiragana

jaconv.alphabet2kana('japan')

=> 'じゃぱん'

Katakana to Alphabet

jaconv.kata2alphabet('ケツイ')

=> 'ketsui'

Alphabet to Katakana

jaconv.alphabet2kata('namba')

=> 'ナンバ'

Hiragana to Julius's phoneme format

jaconv.hiragana2julius('てんきすごくいいいいいい')

=> 't e N k i s u g o k u i:'

NOTE

The jaconv.normalize method extends unicodedata.normalize for Japanese language processing by also applying the following character replacements:

'〜' => 'ー'
'～' => 'ー'
"’" => "'"
'”' => '"'
'“' => '``'
'―' => '-'
'‐' => '-'
'˗' => '-'
'֊' => '-'
'‐' => '-'
'‑' => '-'
'‒' => '-'
'–' => '-'
'⁃' => '-'
'⁻' => '-'
'₋' => '-'
'−' => '-'
'﹣' => 'ー'
'－' => 'ー'
'—' => 'ー'
'―' => 'ー'
'━' => 'ー'
'─' => 'ー'
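
The idea of layering replacements on top of unicodedata.normalize can be sketched as follows. The function name is hypothetical, only a handful of the replacements listed in this note are included, and applying them before the Unicode normalization step is an assumption of this sketch rather than something this README states.

```python
import unicodedata

# A small subset of the replacements listed in the note (illustrative only).
REPLACEMENTS = {
    '\u301c': '\u30fc',  # wave dash -> prolonged sound mark
    '\u2019': "'",       # right single quotation mark -> apostrophe
    '\u201d': '"',       # right double quotation mark -> quotation mark
    '\u2010': '-',       # hyphen -> hyphen-minus
    '\u2013': '-',       # en dash -> hyphen-minus
    '\u2212': '-',       # minus sign -> hyphen-minus
    '\u2014': '\u30fc',  # em dash -> prolonged sound mark
    '\u2501': '\u30fc',  # box drawing heavy horizontal -> prolonged sound mark
}


def normalize_sketch(text, mode='NFKC'):
    """Apply the replacements (assumed to run first), then Unicode normalization."""
    for source, target in REPLACEMENTS.items():
        text = text.replace(source, target)
    return unicodedata.normalize(mode, text)


print(normalize_sketch('ティロ・フィナ〜レ'))
```

The order matters for characters that NFKC would otherwise fold on its own, which is why the sketch fixes it explicitly.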