Computer-coding the IPA: a proposed extension of SAMPA
Summary version, not requiring an IPA character set. (Full version)
John Wells
Department of Phonetics and Linguistics, University College London
What follows is a proposed keyboard-compatible coding for the entire set of IPA symbols. It covers everything on the 1993 IPA Chart, including diacritics and tone marks, and is put forward as a proposed standard way to transmit IPA-transcribed material by e-mail and for similar purposes. It is an extension of the SAMPA standard, with which colleagues may be familiar. The most frequently used symbols are mapped onto single keystrokes in the ASCII range 33..126. Less frequently used symbols are mapped onto a single keystroke plus the backslash, \. Diacritics (other than those already catered for in SAMPA) are mapped onto a keystroke with a preceding underscore, _. Thus for example the voiced velar fricative (gamma) becomes G, the voiced uvular plosive G\, and the velarization diacritic _G (so that for example velarized d appears as d_G). Note that upper-case must be distinguished from lower-case, but that there is no need to separate successive symbols by spaces: X-SAMPA symbol strings are uniquely parsable.
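The three rules just stated (single keystroke, keystroke plus backslash, underscore plus keystroke) are enough to split a string into symbols without separators. The following is a minimal Python sketch of such a splitter, not part of the proposal itself; the function name `tokenize` is hypothetical, and it applies only the three rules named in this paragraph, so any other multi-character conventions are not given special treatment.

```python
from typing import List


def tokenize(xsampa: str) -> List[str]:
    """Split an X-SAMPA string into symbol tokens (illustrative sketch).

    Assumed rules, taken from the paragraph above:
      * a symbol is one keystroke, optionally followed by a backslash;
      * a diacritic is an underscore followed by one keystroke
        (again optionally followed by a backslash).
    """
    tokens = []
    i = 0
    n = len(xsampa)
    while i < n:
        start = i
        if xsampa[i] == "_" and i + 1 < n:
            i += 2  # underscore plus the keystroke it introduces
        else:
            i += 1  # a plain single keystroke
        # A backslash always belongs to the keystroke before it.
        while i < n and xsampa[i] == "\\":
            i += 1
        tokens.append(xsampa[start:i])
    return tokens
```

With this sketch, `tokenize("d_G")` yields the two tokens `d` and `_G`, and `tokenize("G\\")` (the Python spelling of the two keystrokes G and backslash) yields a single token.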
These proposals are fully set out with a reasoned explanation, and all the correct IPA symbols, in my 7000-word draft article "Computer-coding the IPA: a proposed extension of SAMPA". If you can't read it here (using Acrobat Reader standalone or your browser's plug-in), it is also available as a PostScript file and can be downloaded by anonymous ftp from ftp.phon.ucl.ac.uk, internet address 128.40.52.11, in directory /pub/sam, file name ipasam-x.ps. Log in with username ftp, password ftp. The file should be fetched in ASCII mode and sent to a PostScript printer.
Using these codes, you can for example include IPA-phonetic transcriptions of all kinds in e-mail messages or other forms of electronic exchange. Wherever an IPA character set is not available, X-SAMPA will provide a workable alternative. Any reactions from colleagues to these proposals will be very welcome. Feel free to pass this file on to anyone interested.
This summary is in the form of two columns. In the first is a phonetic label (since this is a simple ASCII file, I don't show IPA symbols); in the second is the proposed coding, which we can refer to as X-SAMPA (extended SAMPA). The listing follows the order of the Chart, and should be read in conjunction with it.
It is assumed that the reader is familiar with terms used for the classification of sound-types and with the IPA Chart and the symbols shown on it.
Note that IPA symbols belonging to the ordinary Roman lower-case alphabet (e.g. u, x) remain the same. They are not listed below.
X-SAMPA IPA Unicode (hex, dec)
Consonants (pulmonic)
retroflex plosive, voiceless t` (` = ASCII 096) 0288, 648
retroflex plosive, voiced d` 0256, 598
labiodental nasal F 0271, 625
retroflex nasal n` 0273, 627
palatal nasal J 0272, 626
velar nasal N 014B, 331
uvular nasal N\ 0274, 628
bilabial trill B\ 0299, 665
uvular trill R\ 0280, 640
alveolar tap 4 027E, 638
retroflex flap r` 027D, 637
bilabial fricative, voiceless p\ 0278, 632
bilabial fricative, voiced B 03B2, 946
dental fricative, voiceless T 03B8, 952
dental fricative, voiced D 00F0, 240
postalveolar fricative, voiceless S 0283, 643
postalveolar fricative, voiced Z 0292, 658
retroflex fricative, voiceless s` 0282, 642
retroflex fricative, voiced z` 0290, 656
palatal fricative, voiceless C 00E7, 231
palatal fricative, voiced j\ 029D, 669
velar fricative, voiced G 0263, 611
uvular fricative, voiceless X 03C7, 967
uvular fricative, voiced R 0281, 641
pharyngeal fricative, voiceless X\ 0127, 295
pharyngeal fricative, voiced ?\ 0295, 661
glottal fricative, voiced h\ 0266, 614
alveolar lateral fricative, vl. K
alveolar lateral fricative, vd. K\
labiodental approximant P (or v\)
alveolar approximant r\
retroflex approximant r`
velar approximant M\
retroflex lateral approximant l`
palatal lateral approximant L
velar lateral approximant L\
Clicks
bilabial O\ (O = capital letter)
dental |\
(post)alveolar !\
palatoalveolar =\
alveolar lateral |\|\
Ejectives, implosives
ejective _> e.g. ejective p p_>
implosive _< e.g. implosive b b_<
Vowels
close back unrounded M
close central unrounded 1
close central rounded }
lax i I
lax y Y
lax u U
close-mid front rounded 2
close-mid central unrounded @\
close-mid central rounded 8
close-mid back unrounded 7
schwa @
open-mid front unrounded E
open-mid front rounded 9
open-mid central unrounded 3
open-mid central rounded 3\
open-mid back unrounded V
open-mid back rounded O
ash (ae digraph) {
open schwa (turned a) 6
open front rounded &
open back unrounded A
open back rounded Q
Other symbols
voiceless labial-velar fricative W
voiced labial-palatal approx. H
voiceless epiglottal fricative H\
voiced epiglottal fricative <\
epiglottal plosive >\
alveolo-palatal fricative, vl. s\
alveolo-palatal fricative, voiced z\
alveolar lateral flap l\
simultaneous S and x x\
tie bar _
Suprasegmentals
primary stress "
secondary stress %
long :
half-long :\
extra-short _X
linking mark -\
Tones and word accents
level extra high _T
level high _H
level mid _M
level low _L
level extra low _B
downstep !
upstep ^ (caret, circumflex)
contour, rising _R
contour, falling _F
contour, high rising _H_T
contour, low rising _B_L
contour, rising-falling _R_F
(NB Instead of being written as diacritics with _, all prosodic marks can alternatively be placed in a separate tier, set off by < >, as recommended for the next two symbols.)
global rise <R>
global fall <F>
Diacritics
voiceless _0 (0 = figure), e.g. n_0
voiced _v
aspirated _h
more rounded _O (O = letter)
less rounded _c
advanced _+
retracted _-
centralized _"
syllabic = (or _=) e.g. n= (or n_=)
non-syllabic _^
rhoticity `
breathy voiced _t
creaky voiced _k
linguolabial _N
labialized _w
palatalized ' (or _j) e.g. t' (or t_j)
velarized _G
pharyngealized _?\
dental _d
apical _a
laminal _m
nasalized ~ (or _~) e.g. A~ (or A_~)
nasal release _n
lateral release _l
no audible release _}
velarized or pharyngealized _e
velarized l, alternatively 5
raised _r
lowered _o
advanced tongue root _A
retracted tongue root _q
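Where a Unicode-capable display is available, the hex values given in the pulmonic consonant listing can be used to turn X-SAMPA codes back into IPA characters. The following Python sketch is an illustration only: the names `XSAMPA_TO_CODEPOINT` and `to_ipa` are hypothetical, the table covers only those rows above for which a Unicode value is listed, and longest-match lookup is an assumed strategy so that, for example, N\ is not misread as N followed by a stray backslash. Anything not in the table (including ordinary lower-case Roman letters) is passed through unchanged.

```python
# X-SAMPA code -> Unicode code point, copied from the consonant listing above.
XSAMPA_TO_CODEPOINT = {
    "t`": 0x0288, "d`": 0x0256, "F": 0x0271, "n`": 0x0273,
    "J": 0x0272, "N": 0x014B, "N\\": 0x0274, "B\\": 0x0299,
    "R\\": 0x0280, "4": 0x027E, "r`": 0x027D, "p\\": 0x0278,
    "B": 0x03B2, "T": 0x03B8, "D": 0x00F0, "S": 0x0283,
    "Z": 0x0292, "s`": 0x0282, "z`": 0x0290, "C": 0x00E7,
    "j\\": 0x029D, "G": 0x0263, "X": 0x03C7, "R": 0x0281,
    "X\\": 0x0127, "?\\": 0x0295, "h\\": 0x0266,
}

# Try longer codes first so that "N\" wins over "N".
_KEYS_LONGEST_FIRST = sorted(XSAMPA_TO_CODEPOINT, key=len, reverse=True)


def to_ipa(xsampa: str) -> str:
    """Convert the listed X-SAMPA codes to IPA characters (partial sketch)."""
    out = []
    i = 0
    while i < len(xsampa):
        for key in _KEYS_LONGEST_FIRST:
            if xsampa.startswith(key, i):
                out.append(chr(XSAMPA_TO_CODEPOINT[key]))
                i += len(key)
                break
        else:
            # Not in the table: keep the character as it is.
            out.append(xsampa[i])
            i += 1
    return "".join(out)
```

For instance, `to_ipa("SiN")` replaces S and N with the characters at U+0283 and U+014B and leaves the lower-case i untouched.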
For queries please contact John Wells by e-mail or at
Department of Phonetics and Linguistics, University College London, Gower Street, London WC1E 6BT.
+44 171 380 7175
last revised 2000 May 03 (Unicode values added)
http://www.phon.ucl.ac.uk/home/sampa/home.htm