xquery version "1.0"; (: : Copyright 2006-2010 The FLWOR Foundation. : : Licensed under the Apache License, Version 2.0 (the "License"); : you may not use this file except in compliance with the License. : You may obtain a copy of the License at : : http://www.apache.org/licenses/LICENSE-2.0 : : Unless required by applicable law or agreed to in writing, software : distributed under the License is distributed on an "AS IS" BASIS, : WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. : See the License for the specific language governing permissions and : limitations under the License. :) (:~ : <p> : This module provides functions for reading XML files from string inputs. : It allows reading of well-formed XML documents as well as well-formed : external parsed entities, described by : <a href="http://www.w3.org/TR/xml/#wf-entities">XML 1.0 Well-Formed : Parsed Entities</a>. The functions can also perform Schema and DTD : validation of the input documents. : </p> : : @see <a href="http://www.w3.org/TR/xml/#wf-entities">XML 1.0 Well-Formed : Parsed Entities</a> : @see <a href="http://www.w3.org/TR/xpath-functions-30/#func-parse-xml"> : fn:parse-xml() function in XPath and XQuery Functions and Operators 3.0</a> : : @author Nicolae Brinza : @project data processing/data converters : :) module namespace parse-xml = "http://www.zorba-xquery.com/modules/xml"; declare namespace zerr = "http://www.zorba-xquery.com/errors"; declare namespace err = "http://www.w3.org/xqt-errors"; declare namespace ver = "http://www.zorba-xquery.com/options/versioning"; declare option ver:module-version "2.0"; (:~ : A function to parse XML files and fragments (i.e. : <a href="http://www.w3.org/TR/xml/#wf-entities">external general parsed : entities</a>). The functions takes two arguments: the first one is the : string to be parsed and the second argument is a flags string : (eEdDsSlLwWfF]*(;[\p{L}]*)?) selecting the options described below. : <br/> : <br/> : : The convention for the flags is that a lower-case letter enables : an option and the corresponding upper-case letter disables it; specifying : both is an error; specifying neither leaves it implementation-defined : whether the option is enabled or disabled. Specifying the same option twice : is not an error, but specifying inconsistent options (for example "eE") is : a dynamic error. The options are: : : <ul> : <li> : eE - enables or disables processing of XML external entities. If the option : is enabled, the input must conform to the syntax extParsedEnt (production : [78] in XML 1.0, see <a href="http://www.w3.org/TR/xml/#wf-entities"> : Well-Formed Parsed Entities</a>). The result of the function call is a list : of nodes corresponding to the top-level components of the content of the : external entity: that is, elements, processing instructions, comments, and : text nodes. CDATA sections and character references are expanded, and : adjacent characters are merged so the result contains no adjacent text : nodes. If this option is enabled, none of the options d, s, or l may be : enabled. If the option is disabled, the input must be a well-formed XML : document conforming to the Document production : (<a href="http://www.w3.org/TR/xml/#sec-well-formed">production [1] in XML 1.0</a>). : </li> : : <li> : dD - enables or disables DTD-based validation. If this option is enabled and : the input references a DTD, then the input must be a well-formed and : DTD-valid XML document. If the option is enabled and the input does not : reference a DTD then the option is ignored. If the option is disabled, the : input is not required to reference a DTD and if it does reference a DTD then : the DTD is ignored for validation purposes (though it will still be read for : purposes such as expanding entity references and identifying ID attributes). : </li> : : <li> : sS - enables or disables strict XSD-based validation. If this option is : enabled, the result is equivalent to processing the input with the option : disabled, and then copying the result using the XQuery "validate strict" : expression. : </li> : : <li> : lL - enables or disables lax XSD-based validation. If this option is enabled, : the result is equivalent to processing the input with the option disabled, : and then copying the result using the XQuery "validate lax " expression. : </li> : : <li> : wW - enables or disables whitespace stripping. If the option is enabled, : any whitespace-only text nodes that remain after any DTD-based or XSD-based : processing are stripped from the input; if it is disabled, such : whitespace-only text nodes are retained. : </li> : : <li> : fF - enables or disables fatal error processing. If fatal error processing : is enabled, then any failure to parse the input in the manner requested : results in a dynamic error. If fatal error processing is disabled, then any : failure to parse the input (and also, in the case of fn:doc, a failure to : obtain the input by dereferencing the supplied URI) results in the function : returning an empty sequence and raising no error. : </li> : </ul> : : @param $xml-string The string that holds the XML to be parsed. If empty, : the function will return an empty sequence : @param $options The options for the parsing : @return The parsed XML as a document node or a list of nodes, or an empty : sequence. : : @error zerr:ZXQD0003 The error will be raised if the options to the function : are inconsistent. : : @error err:FODC0006 The error will be raised if the input string is not a : valid XML document or fragment (external general parsed : entity) or if DTD validation was enabled and the : document has not passed it. : : @error err:XQDY0027 The error will be raised if schema validation was enabled : and the input document has not passed it. : :) declare function parse-xml:parse-xml-fragment( $xml-string as xs:string?, $options as xs:string) as node()* external; (:~ : A function to parse XML files and fragments. The behavior is the : same as the parse-xml-fragment with two arguments. : : @param $xml-string The string that holds the XML to be parsed. If empty, : the function will return an empty sequence : @param $base-uri The baseURI that will be used as the baseURI for every : node returned by this function. : @param $options The options for the parsing (see parse-xml-fragment#2) : @return The parsed XML as a document node or a list of nodes, or an empty : sequence. : : @error zerr:ZXQD0003 The error will be raised if the options to the function : are inconsistent. : : @error err:FODC0006 The error will be raised if the input string is not a : valid XML document or fragment (external general parsed : entity) or if DTD validation was enabled and the : document has not passed it. : : @error err:XQDY0027 The error will be raised if schema validation was enabled : and the input document has not passed it. : : @error err:FODC0007 This error will be raised if $base-uri parameter passed : to the function is not a valid absolute URI. : :) declare function parse-xml:parse-xml-fragment( $xml-string as xs:string?, $base-uri as xs:string, $options as xs:string) as node()* external;