Querying Multiple Files in a Directory
DataDirect XQuery supports the use of fn:collection to query multiple XML and non-XML files in a specified directory.
XML Files
In the following example, suppose you have a number of XML files stored in the directory books. Each of the files contains information about one book, and you want to create a single XML document that contains a list of all your books.
<books>{
for $book in collection("file:///c:/books?select=*.xml")
return
<myBook>{$book/book/title}</myBook>
}</books>
The result would look something like this:
<books>
  <myBook>
    <title>Emma</title>
  </myBook>
  <myBook>
    <title>Pride and Prejudice</title>
  </myBook>
  . . .
</books>
The function’s declaration for this feature is:
where:
directory_uri
is a URI referencing a directory. The URI must use the file:// scheme.
option
is {(select="REGEX") | (recurse={yes | no}) | (sort=[a,t,r]+) | (xquery-regex=(yes|no))}
where:
select
contains a regular expression (REGEX), which determines which files in the directory are selected. If select is not specified, any file is assumed.
sort
determines how the retrieved files are sorted, using one or more of the flags a, t, and r.
recurse
determines whether subdirectories are searched. The default is no. To search subdirectories, set this option to yes, for example:
<books>{
for $book in collection("file:///c:/books?select=*.xml;recurse=yes")
return
<myBook>{$book/book/title}</myBook>
}</books>
xquery-regex
determines what type of regular expression syntax is specified in select.
- If set to no (the default), the select pattern syntax takes the conventional form. For example, *.xml selects all files with an xml extension. More generally, the select pattern is converted to a regular expression by prepending "^", appending "$", replacing "." with "\.", and replacing "*" with ".*". Then, the select pattern is used to match the file names appearing in the directory using the XQuery regular expression rules. So, for example, you can specify *.(xml|xhtml) to match files with either of these two file extensions.
Note, however, that special characters used in the URL may need to be escaped using the %HH convention, which can be achieved using the iri-to-uri function.
- If set to yes, the select pattern syntax as supported by XQuery is assumed. In this case, some characters may need to be escaped, such as the backslash character (\) in a file name, for example:
select=.*\.xml$
must be
select=.*%5C.xml$
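The conversion rules given for xquery-regex=no, and the %HH escaping needed for xquery-regex=yes, can be sketched in a few lines. The following Python is only an illustration of the rules as stated above, not the product's implementation: the function names are hypothetical, Python's re module stands in for XQuery regular expression matching (the two differ in some details), and the set of characters left unescaped is an assumption.

```python
import re
from urllib.parse import quote


def select_to_regex(pattern: str) -> str:
    """Convert a conventional select pattern (xquery-regex=no) to a regex."""
    # Escape dots first so the dots introduced by ".*" are left alone.
    converted = pattern.replace(".", r"\.").replace("*", ".*")
    # Anchor the expression so it must match the whole file name.
    return "^" + converted + "$"


def matches(pattern: str, file_name: str) -> bool:
    """Check a file name against a conventional select pattern."""
    # Python's re is a stand-in for the XQuery regular expression rules.
    return re.search(select_to_regex(pattern), file_name) is not None


def escape_select(pattern: str) -> str:
    """Percent-encode an XQuery-style select pattern (xquery-regex=yes)."""
    # Assumption: the characters in `safe` may appear unescaped in the URI.
    return quote(pattern, safe="*$()|")


if __name__ == "__main__":
    print(select_to_regex("*.xml"))             # ^.*\.xml$
    print(matches("*.(xml|xhtml)", "a.xhtml"))  # True
    print(escape_select(r".*\.xml$"))           # .*%5C.xml$
```

Replacing the dots before the asterisks matters: done in the other order, the dot in each generated ".*" would itself be escaped.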
Non-XML Files
The collection function supports the use of the converter URI, which allows you to use DataDirect XML Converters to query non-XML files, such as EDI, binary, and tab- and comma-separated files. For example, this XQuery uses the EDI XML Converter to return a sequence in which each item is an EDI file contained in the directory C:/myfolder:
DataDirect XQuery also supports additional arguments in fn:collection to tune navigation of the specified directory:
fn:collection("converter:name:[property_name=value: | property_name=value: | ...]?directory_url(?option(;option)*)?")
where:
name
is the name of the XML Converter. There are converters for numerous non-XML file types such as EDI, CSV, dBase, and more.
property_name=value
are used to specify the properties you want the conversion engine to use when converting a non-XML file to XML. Some properties are shared across converters; others are specific to the converter for a given file type.
directory_url and option
are the same as those described in XML Files.
The following examples show how fn:collection can be used to query a directory containing EDI files, using the converter URI to specify the EDI files to be converted to XML and the properties to be used by the conversion engine.
In this example, X12 elements from all files in the directory C:\myfolder are retrieved.
In this example, X12 elements from all files in the directory C:\myfolder are retrieved, including the ones in sub-folders.
In this example, X12 elements from all files with extension .x12 in directory C:\myfolder are retrieved, including the ones in sub-folders, and they are sorted in ascending order.
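As a rough illustration of how the parts of the converter URI syntax fit together, the following Python sketch assembles a URI string for files with extension .x12, including sub-folders. It is not taken from the product documentation. The helper name build_converter_uri is hypothetical, the converter name EDI and the file:///c:/myfolder form of the directory are assumed from the surrounding text, and the colon-separated placement of properties is one reading of the syntax shown earlier. The sort option is left out because the meaning of its flags is not described here.

```python
def build_converter_uri(name, directory_url, properties=None, options=None):
    """Assemble converter:name[:prop=value...]?directory_url[?opt;opt...]."""
    uri = "converter:" + name
    # Assumed reading of the syntax: each property follows the converter
    # name as ":property_name=value".
    for prop, value in (properties or {}).items():
        uri += f":{prop}={value}"
    # The directory URL follows the first "?".
    uri += "?" + directory_url
    # Directory options follow a second "?", separated by ";".
    if options:
        uri += "?" + ";".join(f"{key}={val}" for key, val in options.items())
    return uri


uri = build_converter_uri(
    "EDI",
    "file:///c:/myfolder",
    options={"select": "*.x12", "recurse": "yes"},
)
print(uri)  # converter:EDI?file:///c:/myfolder?select=*.x12;recurse=yes
```

The resulting string is what would be passed as the argument to fn:collection in the query.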
For More Information
To learn more about DataDirect XML Converters, the converter URI, and conversion properties, see the DataDirect XML Converters User’s Guide and Reference manual. DataDirect XML Converters documentation is installed as part of the DataDirect Data Integration Suite, of which DataDirect XQuery is a part; you can also find DataDirect XML Converters product documentation on the DataDirect Web site.
See also Collection URI Resolvers.