SOAP Version 1.2 Part 2: Adjuncts (Second Edition)

A. The "application/soap+xml" Media Type

The original contents of this section have been superseded by RFC 3902 [RFC 3902].

B. Mapping Application-Defined Names to XML Names

This appendix details an algorithm for taking an application-defined name, such as the name of a variable or field in a programming language, and mapping it to the Unicode characters that are legal in the names of XML elements and attributes as defined in Namespaces in XML [Namespaces in XML].

Hex Digits
[5] hexDigit ::= [0-9A-F]

B.1 Rules for Mapping Application-Defined Names to XML Names

  1. An XML Name has two parts: Prefix and LocalPart. Let Prefix be determined per the rules and constraints specified in Namespaces in XML [Namespaces in XML].
  2. Let T be a name in an application, represented as a sequence of characters encoded in a particular character encoding.
  3. Let M be the implementation-defined function for transcoding of the characters used in the application-defined name to an equivalent string of Unicode characters.
    Note:
    Ideally, if this transcoding is from a non-Unicode encoding, it should be both reversible and Unicode Form C normalizing (that is, combining sequences will be in the prescribed canonical order). It should be noted that some transcodings cannot be perfectly reversible and that Normalization Form C (NFC) normalization may alter the original sequence in a few cases (see Character Model for the World Wide Web [CharMod]). To ensure that matching names continue to match after mapping, Unicode sequences should be normalized using Unicode Normalization Form C.
    Note:
    This transcoding is explicitly to Unicode scalar values ("code points") and not to any particular character encoding scheme of Unicode, such as UTF-8 or UTF-16.
    Note:
    Properly formed surrogate pair sequences must be converted to their respective scalar values ("code points"); that is, the sequence U+D800 U+DC00 should be transcoded to the character U+10000. If the transcoding begins with a Unicode encoding, non-conforming (non-shortest form) UTF-8 and UTF-16 sequences must be converted to their respective scalar values.
    Note:
    The number of characters in T is not necessarily the same as the number of characters in M, because transcoding may be one-to-many or many-to-one. The details of transcoding may be implementation-defined. There may be (very rarely) cases where there is no equivalent Unicode representation for T; such cases are not covered here.
  4. Let C be the sequence of Unicode scalar values (characters) represented by M(T).
  5. Let N be the number of characters in C. Let C1, C2, ..., CN be the characters of C, in order from most to least significant (logical order).
  6. For each i between 1 (one) and N, let Xi be the Unicode character string defined by the following rules:
    Case:
    1. If Ci is undefined (that is, some character or sequence of characters as defined in the application's character sequence T contains no mapping to Unicode), then Xi is implementation-defined.
    2. If i<=N-1 and Ci is "_" (U+005F LOW LINE) and Ci+1 is "x" (U+0078 LATIN SMALL LETTER X), then let Xi be "_x005F_".
    3. If i=1, and N>=3, and C1 is "x" (U+0078 LATIN SMALL LETTER X) or "X" (U+0058 LATIN CAPITAL LETTER X), and C2 is "m" (U+006D LATIN SMALL LETTER M) or "M" (U+004D LATIN CAPITAL LETTER M), and C3 is "l" (U+006C LATIN SMALL LETTER L) or "L" (U+004C LATIN CAPITAL LETTER L) (in other words, a string three letters or longer starting with the text "xml" or any re-capitalization thereof), then if C1 is "x" (U+0078 LATIN SMALL LETTER X) then let X1 be "_x0078_"; otherwise, if C1 is "X" (U+0058 LATIN CAPITAL LETTER X) then let X1 be "_x0058_".
    4. If Ci is not a valid XML NCName character (see Namespaces in XML [Namespaces in XML]) or if i=1 (one) and C1 is not a valid first character of an XML NCName then:
      Let U1, U2, ... , U6 be the six hex digits [PROD: 5] such that Ci is "U+" U1 U2 ... U6 in the Unicode scalar value.
      Case:
      1. If U1=0 and U2=0, then let Xi = "_x" U3 U4 U5 U6 "_".
        This case implies that Ci is a character in the Basic Multilingual Plane (Plane 0) of Unicode and can be wholly represented by a single UTF-16 code point sequence U+U3U4U5U6.
      2. Otherwise, let Xi be "_x" U1 U2 U3 U4 U5 U6 "_".
    5. Otherwise, let Xi be Ci. That is, any character in C that is a valid character in an XML NCName is simply copied.
  7. Let LocalPart be the character string concatenation of X1, X2, ... , XN in order from most to least significant.
  8. Let XML Name be the QName per Namespaces in XML [Namespaces in XML].
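
The rules above can be sketched in Python as follows. This is a minimal, non-normative illustration under stated assumptions, not a reference implementation. The names to_local_part, is_ncname_start, is_ncname_char and escape are hypothetical. The transcoding function M is taken to be NFC normalization of a Python str, which is already a sequence of Unicode scalar values. The NCName character classes are assumed to follow the XML 1.0 Fifth Edition ranges. BMP characters are escaped with four hex digits, as in the B.2 examples. Under the Fifth Edition ranges, the Tagalog and Cherokee characters that B.2 shows escaped count as valid name characters and would be copied unchanged; a caller who needs the narrower tables of earlier editions can pass its own predicates. Rule 6.1 (characters with no Unicode mapping) is implementation-defined and is not modeled.

```python
import unicodedata

# Assumption: XML 1.0 Fifth Edition NameStartChar ranges, without ":".
_START_RANGES = [
    (0x41, 0x5A), (0x5F, 0x5F), (0x61, 0x7A), (0xC0, 0xD6), (0xD8, 0xF6),
    (0xF8, 0x2FF), (0x370, 0x37D), (0x37F, 0x1FFF), (0x200C, 0x200D),
    (0x2070, 0x218F), (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]
# Additional characters allowed after the first position (NameChar).
_EXTRA_RANGES = [
    (0x2D, 0x2E), (0x30, 0x39), (0xB7, 0xB7), (0x300, 0x36F), (0x203F, 0x2040),
]


def _in_ranges(cp, ranges):
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_ncname_start(ch):
    """True if ch may begin an NCName (under the assumed ranges)."""
    return _in_ranges(ord(ch), _START_RANGES)


def is_ncname_char(ch):
    """True if ch may appear anywhere in an NCName (under the assumed ranges)."""
    cp = ord(ch)
    return _in_ranges(cp, _START_RANGES) or _in_ranges(cp, _EXTRA_RANGES)


def escape(ch):
    """Rule 6.4: four hex digits for BMP characters, six otherwise."""
    cp = ord(ch)
    return "_x%04X_" % cp if cp <= 0xFFFF else "_x%06X_" % cp


def to_local_part(name, is_start=is_ncname_start, is_char=is_ncname_char):
    """Map an application-defined name to the LocalPart of an XML Name."""
    c = unicodedata.normalize("NFC", name)  # steps 3-4: C = M(T)
    n = len(c)                              # step 5
    out = []
    for i, ch in enumerate(c):              # step 6 (i is zero-based here)
        if ch == "_" and i + 1 < n and c[i + 1] == "x":
            out.append("_x005F_")           # rule 6.2
        elif (i == 0 and n >= 3
              and c[0] in "xX" and c[1] in "mM" and c[2] in "lL"):
            out.append(escape(ch))          # rule 6.3: _x0078_ or _x0058_
        elif not is_char(ch) or (i == 0 and not is_start(ch)):
            out.append(escape(ch))          # rule 6.4
        else:
            out.append(ch)                  # rule 6.5
    return "".join(out)                     # step 7
```

For example, to_local_part("Hello world") returns "Hello_x0020_world" and to_local_part("xml") returns "_x0078_ml".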

B.2 Examples

Hello world -> Hello_x0020_world
Hello_xorld -> Hello_x005F_xorld
Helloworld_ -> Helloworld_

      x -> x
    xml -> _x0078_ml
   -xml -> _x002D_xml
   x-ml -> x-ml

 Ælfred -> Ælfred

άγνωστος -> άγνωστος
ᜉᜅᜎᜈ -> _x1709__x1705__x170E__x1708_
ᏙᏚᎥ -> _x13D9__x13DA__x13A5_

C. Using W3C XML Schema with SOAP Encoding (Non-Normative)

As noted in 3.1.4 Computing the Type Name Property, SOAP graph nodes are labeled with type names, but conforming processors are not required to perform validation of encoded SOAP messages.

These sections describe techniques that can be used when validation with W3C XML schemas is desired for use by SOAP applications. Any errors or faults resulting from such validation are beyond those covered by the normative Recommendation; from the perspective of SOAP, such faults are considered to be application-level failures.

C.1 Validating Using the Minimum Schema

Although W3C XML schemas are conventionally exchanged in the form of schema documents (see XML Schema [XML Schema Part 1]), the schema Recommendation is built on an abstract definition of schemas, to which all processors need to conform. The schema Recommendation provides that all such schemas include definitions for a core set of built-in types, such as integers, dates, and so on (see XML Schema [XML Schema Part 1], Built-in Simple Type Definition). Thus, it is possible to discuss validation of a SOAP message against such a minimal schema, which is the one that would result from providing no additional definitions or declarations (i.e., no schema document) to a schema processor.

The minimal schema provides that any well-formed XML document will validate, except that where an xsi:type is provided, the type named must be built in, and the corresponding element must be valid per that type. Thus, validation of a SOAP 1.2 message using a minimal schema approximates the behavior of the built-in types of SOAP 1.1.
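
The xsi:type condition described above can be illustrated with a short Python sketch. It is not a schema processor and makes several assumptions: the names check_minimal and BUILT_IN_CHECKS are hypothetical, only three built-in types are covered, each lexical check is simplified, and the input is assumed to be well-formed XML with no encoding declaration that conflicts with UTF-8. Elements without xsi:type are accepted as-is, mirroring the statement that any well-formed document validates.

```python
import io
import re
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# Assumption: an illustrative subset of the built-in simple types, each with
# a simplified lexical check. A real schema processor knows all of them.
BUILT_IN_CHECKS = {
    "string": lambda text: True,
    "boolean": lambda text: text.strip() in {"true", "false", "1", "0"},
    "integer": lambda text: re.fullmatch(r"[+-]?[0-9]+", text.strip()) is not None,
}


def check_minimal(xml_text):
    """Return a list of xsi:type problems; an empty list means none were found."""
    problems = []
    in_scope = []  # stack of (prefix, uri) namespace declarations
    pending = {}   # element -> (uri, local name) named by its xsi:type
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ("start-ns", "end-ns", "start", "end")
    for event, item in ET.iterparse(source, events):
        if event == "start-ns":
            in_scope.append(item)
        elif event == "end-ns":
            in_scope.pop()
        elif event == "start":
            qname = item.get(XSI_TYPE)
            if qname is not None:
                # Resolve the QName prefix against the innermost declaration.
                prefix, _, local = qname.strip().rpartition(":")
                uri = next((u for p, u in reversed(in_scope) if p == prefix), None)
                pending[item] = (uri, local)
        elif item in pending:  # "end" event: the element text is now complete
            uri, local = pending.pop(item)
            check = BUILT_IN_CHECKS.get(local) if uri == XSD_NS else None
            if check is None:
                problems.append(f"{item.tag}: type {local!r} is not a known built-in type")
            elif not check(item.text or ""):
                problems.append(f"{item.tag}: {item.text!r} is not a valid {local}")
    return problems
```

Because only a subset of types is listed, this sketch reports legitimate built-in types outside that subset (for example xs:date) as unknown; extending BUILT_IN_CHECKS is left to the reader.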

C.2 Validating Using the SOAP Encoding Schema

Validation against the minimal schema (see C.1 Validating Using the Minimum Schema) will not succeed where encoded graph nodes have multiple inbound edges. This is because elements representing such graph nodes will carry id attribute information items which are not legal on elements of type "xs:string", "xs:integer", etc. The SOAP Encoding of such graphs MAY be validated against the SOAP Encoding schema. In order for the encoding to validate, edge labels, and hence the [local name] and [namespace name] properties of the element information items, need to match those defined in the SOAP Encoding schema. Validation of the encoded graph against the SOAP Encoding schema would result in the type name property of the nodes in the graph being assigned the relevant type name.

C.3 Validating Using More Specific Schemas

It may be that schemas could be constructed to describe the encoding of certain graphs. Validation of the encoded graph against such a schema would result in the type name property of the graph nodes being assigned the relevant type name. Such a schema can also supply default or fixed values for one or more of the itemType, arraySize or nodeType attribute information items; the values of such defaulted attributes affect the deserialized graph in the same manner as if the attributes had been explicitly supplied in the message. Errors or inconsistencies thus introduced (e.g. if the value of the attribute is erroneous or inappropriate) should be reported as application-level errors; faults from the "http://www.w3.org/2003/05/soap-encoding" namespace should be reported only if the normative parts of this specification are violated.

D. Acknowledgements (Non-Normative)

This document is the work of the W3C XML Protocol Working Group.

Participants in the Working Group are (at the time of writing, and by alphabetical order): Glen Daniels (Sonic Software, formerly of Macromedia), Vikas Deolaliker (Sonoa Systems, Inc.), Chris Ferris (IBM, formerly of Sun Microsystems), Marc Hadley (Sun Microsystems), David Hull (TIBCO Software, Inc.), Anish Karmarkar (Oracle), Yves Lafon (W3C), Jonathan Marsh (WSO2), Jeff Mischkinsky (Oracle), Eric Newcomer (IONA Technologies), David Orchard (BEA Systems, formerly of Jamcracker), Seumas Soltysik (IONA Technologies), Davanum Srinivas (WSO2), Pete Wenzel (Sun Microsystems, formerly of SeeBeyond).

Previous participants were: Yasser alSafadi (Philips Research), Bill Anderson (Xerox), Vidur Apparao (Netscape), Camilo Arbelaez (webMethods), Mark Baker (Idokorro Mobile, Inc., formerly of Sun Microsystems), Philippe Bedu (EDF (Electricite De France)), Olivier Boudeville (EDF (Electricite De France)), Carine Bournez (W3C), Don Box (Microsoft Corporation, formerly of DevelopMentor), Tom Breuel (Xerox), Dick Brooks (Group 8760), Winston Bumpus (Novell, Inc.), David Burdett (Commerce One), Charles Campbell (Informix Software), Alex Ceponkus (Bowstreet), Michael Champion (Software AG), David Chappell (Sonic Software), Miles Chaston (Epicentric), David Clay (Oracle), David Cleary (Progress Software), Dave Cleary (webMethods), Ugo Corda (Xerox), Paul Cotton (Microsoft Corporation), Fransisco Cubera (IBM), Jim d'Augustine (Excelon Corporation), Ron Daniel (Interwoven), Doug Davis (IBM), Ray Denenberg (Library of Congress), Paul Denning (MITRE Corporation), Frank DeRose (TIBCO Software, Inc.), Mike Dierken (DataChannel), Andrew Eisenberg (Progress Software), Brian Eisenberg (DataChannel), Colleen Evans (Sonic Software), John Evdemon (XMLSolutions), David Ezell (Hewlett Packard), James Falek (TIBCO Software, Inc.), David Fallside (IBM), Eric Fedok (Active Data Exchange), Daniela Florescu (Propel), Dan Frantz (BEA Systems), Michael Freeman (Engenia Software), Dietmar Gaertner (Software AG), Scott Golubock (Epicentric), Tony Graham (Sun Microsystems), Mike Greenberg (IONA Technologies), Rich Greenfield (Library of Congress), Martin Gudgin (Microsoft Corporation, formerly of DevelopMentor), Hugo Haas (W3C), Mark Hale (Interwoven), Randy Hall (Intel), Bjoern Heckel (Epicentric), Frederick Hirsch (Zolera Systems), Gerd Hoelzing (SAP AG), Erin Hoffmann (Tradia Inc.), Steve Hole (MessagingDirect Ltd.), Mary Holstege (Calico Commerce), Jim Hughes (Fujitsu Limited), Oisin Hurley (IONA Technologies), Yin-Leng Husband (Hewlett Packard, formerly of Compaq), John Ibbotson (IBM), Ryuji Inoue (Matsushita Electric Industrial Co., Ltd.), Scott Isaacson (Novell, Inc.), Kazunori Iwasa (Fujitsu Limited), Murali Janakiraman (Rogue Wave), Mario Jeckle (DaimlerChrysler Research and Technology), Eric Jenkins (Engenia Software), Mark Jones (AT&T), Jay Kasi (Commerce One), Jeffrey Kay (Engenia Software), Suresh Kodichath (IONA Technologies), Richard Koo (Vitria Technology Inc.), Jacek Kopecky (Systinet), Alan Kropp (Epicentric), Julian Kumar (Epicentric), Peter Lecuyer (Progress Software), Tony Lee (Vitria Technology Inc.), Michah Lerner (AT&T), Bob Lojek (Intalio Inc.), Henry Lowe (OMG), Brad Lund (Intel), Matthew MacKenzie (XMLGlobal Technologies), Michael Mahan (Nokia), Murray Maloney (Commerce One), Richard Martin (Active Data Exchange), Noah Mendelsohn (IBM, formerly of Lotus Development), Alex Milowski (Lexica), Kevin Mitchell (XMLSolutions), Nilo Mitra (Ericsson), Ed Mooney (Sun Microsystems), Jean-Jacques Moreau (Canon), Dean Moses (Epicentric), Highland Mary Mountain (Intel), Don Mullen (TIBCO Software, Inc.), Rekha Nagarajan (Calico Commerce), Raj Nair (Cisco Systems), Masahiko Narita (Fujitsu Limited), Mark Needleman (Data Research Associates), Art Nevarez (Novell, Inc.), Henrik Nielsen (Microsoft Corporation), Mark Nottingham (BEA Systems, formerly of Akamai Technologies), Conleth O'Connell (Vignette), Kevin Perkins (Compaq), Doug Purdy (Microsoft Corporation), Jags Ramnaryan (BEA Systems), Andreas Riegg (DaimlerChrysler Research and Technology), Vilhelm Rosenqvist (NCR), Herve Ruellan (Canon), Marwan Sabbouh (MITRE Corporation), Waqar Sadiq (Vitria Technology Inc.), Rich Salz (Zolera Systems), Krishna Sankar (Cisco Systems), Jeff Schlimmer (Microsoft Corporation), George Scott (Tradia Inc.), Shane Sesta (Active Data Exchange), Lew Shannon (NCR), John-Paul Sicotte (MessagingDirect Ltd.), Miroslav Simek (Systinet), Simeon Simeonov (Macromedia), Aaron Skonnard (DevelopMentor), Nick Smilonich (Unisys), Soumitro Tagore (Informix Software), James Tauber (Bowstreet), Anne Thomas Manes (Sun Microsystems), Lynne Thompson (Unisys), Patrick Thompson (Rogue Wave), Jim Trezzo (Oracle), Asir Vedamuthu (webMethods), Mike Vernal (Microsoft Corporation), Randy Waldrop (WebMethods), Fred Waskiewicz (OMG), David Webber (XMLGlobal Technologies), Ray Whitmer (Netscape), Volker Wiechers (SAP AG), Stuart Williams (Hewlett Packard), Yan Xu (DataChannel), Amr Yassin (Philips Research), Susan Yee (Active Data Exchange), Jin Yu (MartSoft Corp.).

The people who have contributed to discussions on xml-dist-app@w3.org are also gratefully acknowledged.