UTF-1
UTF-1 is a method of transforming ISO/IEC 10646/Unicode into a stream of bytes. Its design does not provide self-synchronization, which makes searching for substrings and error recovery difficult. It reuses the ASCII printing characters for multi-byte encodings, making it unsuited for some uses (for instance Unix filenames cannot contain the byte value used for forward slash). UTF-1 is also slow to encode or decode due to its use of division and multiplication by a number which is not a power of 2. Due to these issues, it did not gain acceptance and was quickly replaced by UTF-8.
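The byte reuse and the non-power-of-2 arithmetic described above can be illustrated with a minimal encoder sketch. It assumes the commonly cited UTF-1 layout (as registered in ISO-IR 178): code points below U+00A0 are single bytes, and larger values are written as a lead byte followed by base-190 digits mapped onto the bytes 0x21..0x7E and 0xA0..0xFF. The function names `utf1_trail` and `utf1_encode` are hypothetical and do not come from any library.

```python
def utf1_trail(digit: int) -> int:
    """Map a base-190 digit (0..189) to a UTF-1 trail byte (assumed layout)."""
    # Digits 0..93 land on printable ASCII 0x21..0x7E,
    # digits 94..189 land on 0xA0..0xFF.
    return digit + 0x21 if digit < 0x5E else digit + 0x42


def utf1_encode(cp: int) -> bytes:
    """Encode one code point as UTF-1 (illustrative sketch, not a reference)."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("code point out of range")
    if cp < 0xA0:                      # ASCII, C0 and C1 controls: unchanged
        return bytes([cp])
    if cp < 0x100:                     # two bytes: 0xA0 prefix plus the value
        return bytes([0xA0, cp])
    if cp < 0x4016:                    # two bytes: lead 0xA1..0xF5
        y = cp - 0x100
        return bytes([0xA1 + y // 190, utf1_trail(y % 190)])
    if cp < 0x38E2E:                   # three bytes: lead 0xF6..0xFB
        y = cp - 0x4016
        return bytes([0xF6 + y // 190**2,
                      utf1_trail(y // 190 % 190),
                      utf1_trail(y % 190)])
    y = cp - 0x38E2E                   # five bytes: lead 0xFC..0xFF
    return bytes([0xFC + y // 190**4,
                  utf1_trail(y // 190**3 % 190),
                  utf1_trail(y // 190**2 % 190),
                  utf1_trail(y // 190 % 190),
                  utf1_trail(y % 190)])


# U+010E is a non-ASCII letter, yet its encoding contains the slash byte 0x2F.
assert utf1_encode(0x010E) == b"\xa1/"
assert b"/" in utf1_encode(0x010E)
```

Under these assumptions, every multi-byte character costs one or more divisions by 190, and a trail byte such as 0x2F cannot be told apart from a real ASCII slash without decoding from the start of the stream.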
Property | Value |
---|---|
dbo:abstract | UTF-1 was the first UCS Transformation Format for Unicode and ISO 10646. It was published in 1993 in Annex G of the original version of ISO 10646 but is no longer part of that standard. UTF-1 is compatible with ISO 2022. ASCII characters and the C0 and C1 control characters are encoded unchanged (1:1), as in ISO 8859. Other characters are encoded, using relatively expensive modulo-190 arithmetic, as sequences of 2, 3, or 5 bytes. ASCII characters can also appear within these sequences. The drawback is that, for example, the slash can occur inside such a sequence, so the encoding cannot be used for file names. Because of this drawback, another encoding for Unicode was later developed, initially called "UTF-FSS" (file system safe), which has since become widely established under the name UTF-8. (translated from de) UTF-1 is a method of transforming ISO/IEC 10646/Unicode into a stream of bytes. Its design does not provide self-synchronization, which makes searching for substrings and error recovery difficult. It reuses the ASCII printing characters for multi-byte encodings, making it unsuited for some uses (for instance Unix filenames cannot contain the byte value used for forward slash). UTF-1 is also slow to encode or decode due to its use of division and multiplication by a number which is not a power of 2. Due to these issues, it did not gain acceptance and was quickly replaced by UTF-8. (en) UTF-1 is a method of converting the international character set/Unicode into a byte stream. Because of its design, resynchronization is impossible if decoding starts in the middle of a character, and search routines cannot be used reliably with it. UTF-1 is also quite slow because it uses division by a number that is not a power of 2. Because of these problems, UTF-1 was not widely adopted and was replaced by UTF-8. (translated from ko) UTF-1 is a format for transforming ISO 10646/Unicode into byte streams for serialization. Because of its format, it is not possible to resynchronize if decoding starts in the middle of a character (which makes truncation difficult), and character-search routines cannot be used reliably. Given these problems, the standard never gained wide adoption and was almost completely replaced by UTF-8. (translated from pt) UTF-1 is a way of converting ISO 10646/Unicode into a byte stream. Because of problems in its design, UTF-1 cannot resynchronize if decoding starts in the middle of a character (which makes extracting substrings difficult), and reliable byte searching is not possible either. Because the divisor UTF-1 uses is not a power of 2, conversion is also rather slow. Owing to these problems, UTF-1 was never widely adopted and has been replaced by UTF-8. (translated from zh) |
dbo:wikiPageExternalLink | http://kikaku.itscj.ipsj.or.jp/ISO-IR/178.pdf https://web.archive.org/web/20150318032101/http:/kikaku.itscj.ipsj.or.jp/ISO-IR/178.pdf https://web.archive.org/web/20160607111732/http:/czyborra.com/utf/%23UTF-1 http://czyborra.com/utf/%23UTF-1 https://www.rfc-editor.org/rfc/rfc3629.html https://www.unicode.org/versions/Unicode1.1.0/appF.pdf |
dbo:wikiPageID | 2535122 (xsd:integer) |
dbo:wikiPageLength | 5498 (xsd:nonNegativeInteger) |
dbo:wikiPageRevisionID | 1092940825 (xsd:integer) |
dbo:wikiPageWikiLink | dbr:UTF-8 dbr:Unicode dbr:Unicode_Transformation_Format dbr:Modulo_operation dbr:Comparison_of_Unicode_encodings dbr:Byte dbr:Byte-order_mark dbr:C0_and_C1_control_codes dbr:UCS-4 dbr:US-ASCII dbr:ASCII dbr:Hexadecimal dbc:Unicode_Transformation_Formats dbr:Binary_Ordered_Compression_for_Unicode dbr:Code_point dbr:ISO/IEC_2022 dbr:MIME dbr:Universal_Character_Set dbr:Variable-width_encoding dbr:Extended_ASCII dbr:ISO/IEC_10646 dbr:Self-synchronizing_code dbr:Substring |
dbp:classification | dbr:Unicode_Transformation_Format dbr:Variable-width_encoding dbr:Extended_ASCII |
dbp:encodes | ISO/IEC 10646 (en) |
dbp:extends | dbr:US-ASCII |
dbp:lang | International (en) |
dbp:mime | ISO-10646-UTF-1 (en) |
dbp:name | UTF-1 (en) |
dbp:next | dbr:UTF-8 |
dbp:status | Obscure, of mainly historical interest. (en) |
dbp:wikiPageUsesTemplate | dbt:Cite_document dbt:Cite_web dbt:Notelist dbt:Short_description dbt:Unicode_navigation dbt:Character_encoding dbt:Infobox_character_encoding |
dct:subject | dbc:Unicode_Transformation_Formats |
gold:hypernym | dbr:Way |
rdf:type | yago:WikicatUnicodeTransformationFormats yago:Abstraction100002137 yago:Communication100033020 yago:Format106636806 yago:Information106634376 yago:Message106598915 |
rdfs:comment | UTF-1 is a method of transforming ISO/IEC 10646/Unicode into a stream of bytes. Its design does not provide self-synchronization, which makes searching for substrings and error recovery difficult. It reuses the ASCII printing characters for multi-byte encodings, making it unsuited for some uses (for instance Unix filenames cannot contain the byte value used for forward slash). UTF-1 is also slow to encode or decode due to its use of division and multiplication by a number which is not a power of 2. Due to these issues, it did not gain acceptance and was quickly replaced by UTF-8. (en) UTF-1 is a method of converting the international character set/Unicode into a byte stream. Because of its design, resynchronization is impossible if decoding starts in the middle of a character, and search routines cannot be used reliably with it. UTF-1 is also quite slow because it uses division by a number that is not a power of 2. Because of these problems, UTF-1 was not widely adopted and was replaced by UTF-8. (translated from ko) UTF-1 is a format for transforming ISO 10646/Unicode into byte streams for serialization. Because of its format, it is not possible to resynchronize if decoding starts in the middle of a character (which makes truncation difficult), and character-search routines cannot be used reliably. Given these problems, the standard never gained wide adoption and was almost completely replaced by UTF-8. (translated from pt) UTF-1 is a way of converting ISO 10646/Unicode into a byte stream. Because of problems in its design, UTF-1 cannot resynchronize if decoding starts in the middle of a character (which makes extracting substrings difficult), and reliable byte searching is not possible either. Because the divisor UTF-1 uses is not a power of 2, conversion is also rather slow. Owing to these problems, UTF-1 was never widely adopted and has been replaced by UTF-8. (translated from zh) UTF-1 was the first UCS Transformation Format for Unicode and ISO 10646. It was published in 1993 in Annex G of the original version of ISO 10646 but is no longer part of that standard. UTF-1 is compatible with ISO 2022. Because its multi-byte sequences can contain ASCII bytes such as the slash, which rules it out for file names, another encoding for Unicode was later developed, initially called "UTF-FSS" (file system safe), which has since become widely established under the name UTF-8. (translated from de) |
rdfs:label | UTF-1 (de) UTF-1 (ko) UTF-1 (pt) UTF-1 (en) UTF-1 (zh) |
owl:sameAs | freebase:UTF-1 yago-res:UTF-1 wikidata:UTF-1 dbpedia-de:UTF-1 dbpedia-ko:UTF-1 dbpedia-pt:UTF-1 dbpedia-zh:UTF-1 https://global.dbpedia.org/id/4wYs1 |
prov:wasDerivedFrom | wikipedia-en:UTF-1?oldid=1092940825&ns=0 |
foaf:isPrimaryTopicOf | wikipedia-en:UTF-1 |
is dbo:wikiPageDisambiguates of | dbr:UTF |
is dbo:wikiPageRedirects of | dbr:CsISO10646UTF1 dbr:ISO-10646-UTF-1 |
is dbo:wikiPageWikiLink of | dbr:UTF-8 dbr:UTF-EBCDIC dbr:Unicode dbr:Comparison_of_Unicode_encodings dbr:CsISO10646UTF1 dbr:Binary_Ordered_Compression_for_Unicode dbr:Byte_order_mark dbr:ISO/IEC_2022 dbr:UTF dbr:Variable-width_encoding dbr:Universal_Coded_Character_Set dbr:ISO-10646-UTF-1 |
is dbp:prev of | dbr:UTF-8 |
is foaf:primaryTopic of | wikipedia-en:UTF-1 |