1. CodersCay »
  2. SQL Server »
  3. Extract the data from XML using OPENXML and XQuery in SQL Server

Published On: 6/30/2022

CodersCay Logo

Extract the data from XML using OPENXML and XQuery in SQL Server

This post describes to read the data content from different XML elements and attributes by using OPENXML and XQuery methods. The XML file format is one of the best way to handle the large volume of data as single file.


T-SQL exclusively provides OPENXML and XQuery methods to extract or read the data content from XML file or XML text format. Below XML samples illustrate to extract the data from XML Elements and Attributes by using OPENXML and XQuery in SQL Server.

For example, Let's say we have customers data in XML file and the customer's different information is in XML elements looks like below. We could see how to extract this customer information from XML Elements to data result set using OPENXML and XQuery methods.
<Customers>
<customer>
<ID>111589</ID>
<FirstName>name1</FirstName>
<LastName>Lname1</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>12345</ID>
<FirstName>name2</FirstName>
<LastName>Lname2</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>14567</ID>
<FirstName>name3</FirstName>
<LastName>Lname3</LastName>
<Company>DEF</Company>
</customer>
</Customers>

Scenario 1: XML Elements

Method 1: Read XML Elements data using OPENXML()

What is OPENXML?

The OPENXML method provides a row set view over an XML document. The OPENXML primarily gives the ability to insert XML data to the relational database. OPENXML function used to retrieve metadata from an XML document. To write queries using OPENXML, we must first create an internal representation of the XML document by calling sp_xml_preparedocument. The internal representation of XML document variable removed from memory by calling sp_xml_remove_document system stored procedure.

Syntax:

exec sp_xml_preparedocument @idoc out, @impxml

SELECT * FROM openxml(@idoc,'Customers/customer',2)
with(ID int,FirstName varchar(100),LastName varchar(100),Company varchar(100))

exec sp_xml_removedocument @idoc

What is sp_xml_preparedocument?

This is a system defined stored procedure. Reads the XML text provided as input, parses the text by using the MSXML parser (Msxmlsql.dll) and provides the parsed document in a state ready for consumption. This parsed document is a tree representation of the various nodes in the XML document such as elements, attributes, text and so on.

Arguments


ArgumentDescription
hdoc Handle to the newly created document by ID value. hdoc is an integer.
[xmltext]XML data as input. Parameter can be char, nchar, varchar, nvarchar, text, ntext or xml.
[xpath_namespaces]Specifies the namespace declarations that are used in row and column XPath expressions in OPENXML. Parameter can be char, nchar, varchar, nvarchar, text, ntext or xml.

What is sp_xml_remove_document?

A system defined proedure that removes the internal representation of the XML document specified by the document handle and invalidates the document handle.

Arguments


ArgumentDescription
hdoc Removing the reference of document ID value.

Source Code:
DECLARE @idoc INT
DECLARE @impxml XML ='<Customers>
<customer>
<ID>111589</ID>
<FirstName>name1</FirstName>
<LastName>Lname1</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>12345</ID>
<FirstName>name2</FirstName>
<LastName>Lname2</LastName>
<Company>ABC</Company>
</customer>
<customer>
<ID>14567</ID>
<FirstName>name3</FirstName>
<LastName>Lname3</LastName>
<Company>DEF</Company>
</customer>
</Customers>'

exec sp_xml_preparedocument @idoc out, @impxml

SELECT * FROM openxml(@idoc,'Customers/customer',2)
with(ID int,FirstName varchar(100),LastName varchar(100),Company varchar(100))

exec sp_xml_removedocument @idoc

Method 2: Read XML Elements data using XQuery

What is XQuery?

XQuery is a language used to finds and extracts the data on structured or semi-structured XML documents. SQL Server supports the xml data type in the Database Engine and the documents can be stored in a database and then queried by using XQuery.

XQuery is based on the existing XPath query language and this supports to Iterates, Sorts and the ability to construct the necessary XML. XQuery operates on the XQuery Data Model. This is an abstraction of XML documents and the XQuery results that can be typed or untyped. The type information is based on the types provided by the W3C XML Schema language. If no typing information is available then XQuery handles the data as untyped format.

XQuery Code:
SELECT
x.value('(ID)[1]','INT') AS FirstName,
x.value('(FirstName)[1]','VARCHAR(100)') AS FirstName,
x.value('(LastName)[1]','VARCHAR(100)') AS LastName,
x.value('(Company)[1]','VARCHAR(100)') AS Company
FROM @impxml.nodes('/Customers/customer') as XMLtbl(x)

Output of Scenario 1:
OPENXML and XQuery Output Scenario 1


Scenario 2: XML Attributes

XML attribute is a part of an XML element and by the use of attributes we can add the information about the element. The XML attributes enhance the properties of the elements and values must always be quoted in XML element.

Let's take below XML file which is having products information and we will go through how to extract the XML elements by using OPENXML and XQuery. In below example, GroupID and Description are in XML attributes.

declare @idoc int, @impxml XML = '<BudgetGroups>
<BudgetGroup GroupID="001" Description="Soft costs">
<LineItems>
<LineItem LineItemID="01" PercentComplete="100" Drawn="200000" />
<LineItem LineItemID="02" PercentComplete="200" Drawn="300000" />
<LineItem LineItemID="03" PercentComplete="300" Drawn="400000" />
</LineItems>
</BudgetGroup>
<BudgetGroup GroupID="02" Description="Hard Costs">
<LineItems>
<LineItem LineItemID="01" PercentComplete="20" Drawn="40000" />
<LineItem LineItemID="02" PercentComplete="20" Drawn="40000" />
<LineItem LineItemID="03" PercentComplete="20" Drawn="40000" />
<LineItem LineItemID="04" PercentComplete="40" Drawn="80000" />
</LineItems>
</BudgetGroup>
</BudgetGroups>'

Method 1: Read the XML Elements with Attributes data using OPENXML


exec sp_xml_preparedocument @idoc out, @impxml

select * from openxml(@idoc,'BudgetGroups/BudgetGroup/LineItems/LineItem',1)
with (GroupID varchar(100) '../../@GroupID', Description varchar(100) '../../@Description',
LineItemID varchar(100),PercentComplete varchar(100), Drawn varchar(100))

exec sp_xml_removedocument @idoc out

Method 1: Read the XML Elements with Attributes data using XQuery


SELECT m.n.value('@GroupID','int') AS GroupID,
m.n.value('@Description','varchar(100)') AS [Description],
p.q.value('@LineItemID','varchar(10)') AS LineItemID,
p.q.value('@PercentComplete','varchar(3)') AS PercentComplete,
p.q.value('@Drawn','varchar(10)') AS Drawn
FROM @impxml.nodes('/BudgetGroups/BudgetGroup')m(n)
CROSS APPLY n.nodes('LineItems/LineItem')p(q)

Output of Scenario 2:
OPENXML and XQuery Output Scenario 2

Conclusion

There are two ways to read the XML Elements and Attributes in SQL Server by using OPENXML and XQuery methods. Compare to XQuery method OPENXML is slightly faster in performance and also easy to use. We could choose XQuery for small size of XML file and can be used to retrieve both hierarchal and tabular data.

No comments:

Post a Comment