The openxml procedure is used in the FROM clause of a query to generate a result set from an XML document. openxml uses a subset of the XPath query language to select nodes from an XML document.
When you use openxml, the XML document is parsed and the result is modeled as a tree. The tree is made up of nodes. XPath expressions are used to select nodes in the tree. The following list describes some commonly-used XPath expressions:
/ indicates the root node of the XML document
. (single period) indicates the current node of the XML document
// indicates all descendants of the current node, including the current node
.. indicates the parent node of the current node
./@attributename indicates the attribute of the current node having the name attributename
./childname indicates the children of the current node that are elements having the name childname
Consider the following XML document:
<inventory> <product ID="301" size="Medium">Tee Shirt <quantity>54</quantity> </product> <product ID="302" size="One Size fits all">Tee Shirt <quantity>75</quantity> </product> <product ID="400" size="One Size fits all">Baseball Cap <quantity>112</quantity> </product> </inventory>
The <inventory> element is the root node. You can refer to it using the following XPath expression:
Suppose that the current node is a <quantity> element. You can refer to this node using the following XPath expression:
To find all the <product> elements that are children of the <inventory> element, use the following XPath expression:
If the current node is a <product> element and you want to refer to the size attribute, use the following XPath expression:
For a complete list of XPath syntax supported by openxml, see openxml system procedure.
For information about the XPath query language, see http://www.w3.org/TR/xpath.
Each match for the first xpath-query argument to openxml generates one row in the result set. The WITH clause specifies the schema of the result set and how the value is found for each column in the result set. For example, consider the following query:
SELECT * FROM openxml( '<inventory> <product>Tee Shirt <quantity>54</quantity> <color>Orange</color> </product> <product>Baseball Cap <quantity>112</quantity> <color>Black</color> </product> </inventory>', '/inventory/product' ) WITH ( Name CHAR (25) './text()', Quantity CHAR(3) 'quantity', Color CHAR(20) 'color')
The first xpath-query argument is /inventory/product, and there are two <product> elements in the XML, so two rows are generated by this query.
The WITH clause specifies that there are three columns: Name, Quantity, and Color. The values for these columns are taken from the <product>, <quantity> and <color> elements. The query above generates the following result:
For more information, see openxml system procedure.
The openxml procedure can be used to generate an edge table, a table that contains a row for every element in the XML document. You may want to generate an edge table so that you can query the data in the result set using SQL.
The following SQL statement creates a variable, x, that contains an XML document. The XML generated by the query has a root element called <root>, which is generated using the XMLELEMENT function, and elements are generated for each column in the Employees, SalesOrders, and Customers tables using FOR XML AUTO with the ELEMENTS modifier specified.
For information about the XMLELEMENT function, see XMLELEMENT function [String].
For information about FOR XML AUTO, see Using FOR XML AUTO.
CREATE VARIABLE x XML; SET x=(SELECT XMLELEMENT( NAME root, (SELECT * FROM Employees KEY JOIN SalesOrders KEY JOIN Customers FOR XML AUTO, ELEMENTS))); SELECT x;
The generated XML looks as follows (the result has been formatted to make it easier to read—the result returned by the query is one continuous string):
<root> <Employees> <EmployeeID>299</EmployeeID> <ManagerID>902</ManagerID> <Surname>Overbey</Surname> <GivenName>Rollin</GivenName> <DepartmentID>200</DepartmentID> <Street>191 Companion Ct.</Street> <City>Kanata</City> <State>CA</State> <Country>USA</Country> <PostalCode>94608</PostalCode> <Phone>5105557255</Phone> <Status>A</Status> <SocialSecurityNumber>025487133</SocialSecurityNumber> <Salary>39300.000</Salary> <StartDate>1987-02-19</StartDate> <BirthDate>1964-03-15</BirthDate> <BenefitHealthInsurance>Y</BenefitHealthInsurance> <BenefitLifeInsurance>Y</BenefitLifeInsurance> <BenefitDayCare>N</BenefitDayCare> <Sex>M</Sex> <SalesOrders> <ID>2001</ID> <CustomerID>101</CustomerID> <OrderDate>2000-03-16</OrderDate> <FinancialCode>r1</FinancialCode> <Region>Eastern</Region> <SalesRepresentative>299</SalesRepresentative> <Customers> <ID>101</ID> <Surname>Devlin</Surname> <GivenName>Michael</GivenName> <Street>114 Pioneer Avenue</Street> <City>Kingston</City> <State>NJ</State> <PostalCode>07070</PostalCode> <Phone>2015558966</Phone> <CompanyName>The Power Group</CompanyName> </Customers> </SalesOrders> </Employees> ...
The following query uses the descendant-or-self (//*) XPath expression to match every element in the above XML document, and for each element the id metaproperty is used to obtain an ID for the node, and the parent (../) XPath expression is used with the ID metaproperty to get the parent node. The localname metaproperty is used to obtain the name of each element. Note that metaproperty names are case sensitive, thus ID or LOCALNAME cannot be used as metaproperty names.
SELECT * FROM openxml( x, '//*' ) WITH (ID INT '@mp:id', parent INT '../@mp:id', name CHAR(25) '@mp:localname', text LONG VARCHAR 'text()' ) ORDER BY ID;
The result set generated by this query shows the ID of each node, the ID of the parent node, and the name and content for each element in the XML document.
So far, XML that was generated with a procedure like XMLELEMENT has been used. You can also read XML from a file and parse it using the xp_read_file procedure. Suppose the file c:\inventory.xml has the following contents:
<inventory> <product>Tee Shirt <quantity>54</quantity> <color>Orange</color> </product> <product>Baseball Cap <quantity>112</quantity> <color>Black</color> </product> </inventory>
You can use the following statement to read and parse the XML in the file:
CREATE VARIABLE x XML; SELECT xp_read_file( 'c:\\inventory.xml' ) INTO x; SELECT * FROM openxml( x, '//*' ) WITH (ID INT '@mp:id', parent INT '../@mp:id', name CHAR(128) '@mp:localname', text LONG VARCHAR 'text()' ) ORDER BY ID;
If you have a table with a column that contains XML, you can use openxml to query all the XML values in the column at once. This can be done using a lateral derived table.
The following statements create a table with two columns, ManagerID and Reports. The Reports column contains XML data generated from the Employees table.
CREATE TABLE test (ManagerID INT, Reports XML); INSERT INTO test SELECT ManagerID, XMLELEMENT( NAME reports, XMLAGG( XMLELEMENT( NAME e, EmployeeID))) FROM Employees GROUP BY ManagerID;
Execute the following query to view the data in the test table:
SELECT * FROM test ORDER BY ManagerID;
This query produces the following result:
<reports> <e>102</e> <e>105</e> <e>160</e> <e>243</e> ... </reports>
<reports> <e>191</e> <e>750</e> <e>868</e> <e>921</e> ... </reports>
<reports> <e>129</e> <e>195</e> <e>299</e> <e>467</e> ... </reports>
<reports> <e>148</e> <e>390</e> <e>586</e> <e>757</e> ... </reports>
The following query uses a lateral derived table to generate a result set with two columns: one that lists the ID for each manager, and one that lists the ID for each employee that reports to that manager:
SELECT ManagerID, EmployeeID FROM test, LATERAL( openxml( test.Reports, '//e' ) WITH (EmployeeID INT '.') ) DerivedTable; ORDER BY ManagerID, EmployeeID;
This query generates the following result:
For more information about lateral derived tables, see FROM clause.