XMLBeans provides a simple, easy-to-use means for accessing XML. With
XMLBeans, you can compile schema to generate Java types that can be
bound to XML based on the schema. You can also access XML with schema by
using an XML cursor, which provides an access model that is an
alternative to DOM-based models.
Getting Started with XMLBeans
XMLBeans provides intuitive ways to handle XML that make it easier for you
to access and manipulate XML data and documents in Java.
Characteristics of the XMLBeans approach to XML:
It provides a familiar Java object-based view of XML data without losing
access to the original, native XML structure.
The XML's integrity as a document is not lost with XMLBeans.
XML-oriented APIs commonly take the XML apart in order to bind to its
parts. With XMLBeans, the entire XML instance document is handled as a
whole. The XML data is stored in memory as XML. This means that the
document order is preserved, as well as the original element content.
With types generated from schema, access to XML instances is through
JavaBean-like accessors, with get and set methods.
It is designed with XML schema in mind from the beginning — XMLBeans
supports all XML schema definitions.
Access to XML is fast.
The starting point for XMLBeans is XML schema. A schema (contained in an XSD
file) is an XML document that defines a set of rules to which other XML
documents must conform. The XML Schema specification provides a rich data
model that allows you to express sophisticated structure and constraints on
your data. For example, an XML schema can enforce control over how data is
ordered in a document, or constraints on particular values (for example, a
birth date that must be later than 1900). Unfortunately, the ability to
enforce rules like this is typically not available in Java without writing
custom code. XMLBeans honors schema constraints.
Note: Where an XML schema defines rules for an XML document, an XML
instance is an XML document that conforms to the schema.
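As a rough illustration of the kind of value constraint described above, the following sketch validates a birth date against a small schema. It uses only the JDK's javax.xml.validation API rather than XMLBeans. The schema, the birth-date element name, the class name, and the reading of "later than 1900" as "on or after 1901-01-01" are all assumptions made for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

final class BirthDateConstraintSketch {

    // Hypothetical schema: a birth-date element restricted to dates
    // on or after 1901-01-01 (an assumed reading of "later than 1900").
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='birth-date'>"
      + "<xs:simpleType>"
      + "<xs:restriction base='xs:date'>"
      + "<xs:minInclusive value='1901-01-01'/>"
      + "</xs:restriction>"
      + "</xs:simpleType>"
      + "</xs:element>"
      + "</xs:schema>";

    /** Returns true if the XML instance conforms to the sketch schema. */
    static boolean isValid(String xml) {
        Schema schema;
        try {
            schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            // The schema itself failed to compile, which is a bug in the sketch.
            throw new IllegalStateException(e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // With the default error handler, a violation is reported by throwing.
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<birth-date>1975-06-15</birth-date>")); // true
        System.out.println(isValid("<birth-date>1899-12-31</birth-date>")); // false
    }
}
```

The rule lives in the schema rather than in Java code, which is the point the paragraph above makes: a schema-aware tool can enforce it without custom checks.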
You compile a schema (XSD) file to generate a set of Java interfaces that
mirror the schema. With these types, you process XML instance documents that
conform to the schema. You bind an XML instance document to these types;
changes made through the Java interface change the underlying XML.
Previous options for handling XML include using XML programming
interfaces (such as DOM or SAX) or an XML marshalling/binding tool (such as
JAXB). Because it lacks strong schema-oriented typing, navigation in a
DOM-oriented model is more tedious and requires an understanding of the
complete object model. JAXB provides support for the XML schema
specification, but handles only a subset of it; XMLBeans supports all of it.
Also, by storing the data in memory as XML, XMLBeans is able to reduce the
overhead of marshalling and unmarshalling.
Accessing XML Using Its Schema
To get a glimpse of the kinds of things you can do with XMLBeans, take a
look at an example using XML for a purchase order. The purchase order XML
contains data exchanged by two parties, such as two companies. Both parties
need to be able to rely on a consistent message shape, and a schema
specifies the common ground.
Here's what a purchase order XML instance might look like.
This XML includes a root element, purchase-order,
that has four kinds of child elements: customer, date,
line-item, and shipper. An intuitive, object-based view
of this XML would provide an object representing the
purchase-order element, and it would have
methods for getting the date and for getting subordinate objects for
the customer, line-item, and shipper elements. Each of the last three
would have its own methods for getting the data inside them as well.
Looking at the Schema
The following XML is the schema for the preceding purchase order XML.
It defines the XML's "shape" — what its elements are, what order they appear
in, which are children of which, and so on.
This schema describes the purchase order XML instance by defining the following:
Definitions for three complex types — customer, line-item, and
shipper. These are the types used for the children of the
purchase-order element. In schema, a complex type is one that
defines an element that may have child elements and attributes. The
sequence element nested in the complex type lists its child elements.
These are also global types. They are global because they
are at the top level of the schema (in other words, just beneath the
schema root element). This means that
they may be referenced from anywhere else in the schema.
Use of simple types within the complex types. The name, address, and
description elements (among others) are typed as simple types. As it
happens, these are also built-in types. A built-in type (here,
one with the "xs" prefix) is part of the schema specification. (The
specification defines 46 built-in types.)
A global element called purchase-order. This element definition
includes a nested complex type definition that specifies the child
elements for a purchase-order element. Notice that the complex type
includes references to the other complex types defined in this schema.
In other words, the schema defines types for the child elements and
describes their position as subordinate to the root element, purchase-order.
When you use the XMLBeans compiler with an XSD file such as this one, you
generate a JAR file containing the interfaces generated from the schema.
Writing Java Code That Uses the Interfaces
With the XMLBeans interfaces in your application, you can write code that
uses the new types to handle XML based on the schema. Here's an example that
extracts information about each of the ordered items in the purchase order
XML, counts the items, and calculates a total of their prices. In
particular, look at the use of types generated from the schema and imported
as part of the org.openuri.easypo package.
The printItems method receives a
File object containing the purchase order XML.
import java.io.File;
import org.openuri.easypo.LineItem;
import org.openuri.easypo.PurchaseOrderDocument;
import org.openuri.easypo.PurchaseOrderDocument.PurchaseOrder;

public class POHandler
{
    public static void printItems(File po) throws Exception
    {
        /*
         * All XMLBeans schema types provide a nested Factory class you can
         * use to bind XML to the type, or to create new instances of the type.
         * Note that a "Document" type such as this one is an XMLBeans
         * construct for representing a global element. It provides a way
         * for you to get and set the contents of the entire element.
         *
         * Also, note that the parse method will only succeed if the
         * XML you're parsing appears to conform to the schema.
         */
        PurchaseOrderDocument poDoc =
            PurchaseOrderDocument.Factory.parse(po);

        /*
         * The PurchaseOrder type represents the purchase-order element's
         * complex type.
         */
        PurchaseOrder purchaseOrder = poDoc.getPurchaseOrder();

        /*
         * When an element may occur more than once as a child element,
         * the schema compiler will generate methods that refer to an
         * array of that element. The line-item element is defined with
         * a maxOccurs attribute value of "unbounded", meaning that
         * it may occur as many times in an instance document as needed.
         * So there are methods such as getLineItemArray and setLineItemArray.
         */
        LineItem[] lineitems = purchaseOrder.getLineItemArray();
        System.out.println("Purchase order has " + lineitems.length + " line items.");

        double totalAmount = 0.0;
        int numberOfItems = 0;

        /*
         * Loop through the line-item elements, using generated accessors to
         * get values for child elements such as description, quantity, and
         * price.
         */
        for (int j = 0; j < lineitems.length; j++)
        {
            System.out.println(" Line item: " + j);
            System.out.println(
                " Description: " + lineitems[j].getDescription());
            System.out.println(" Quantity: " + lineitems[j].getQuantity());
            System.out.println(" Price: " + lineitems[j].getPrice());
            numberOfItems += lineitems[j].getQuantity();
            totalAmount += lineitems[j].getPrice() * lineitems[j].getQuantity();
        }
        System.out.println("Total items: " + numberOfItems);
        System.out.println("Total amount: " + totalAmount);
    }
}
Notice that types generated from the schema reflect what's in the XML:
A PurchaseOrderDocument represents the
global root element.
A getPurchaseOrder method returns a PurchaseOrder object
that contains child elements, including
line-item. A getLineItemArray
method returns a LineItem array
containing the line-item elements.
Other methods, such as getQuantity,
getPrice, and so on, follow naturally from
what the schema describes, returning corresponding children of the
line-item element.
The name of the package containing these types is derived from the
schema's target namespace.
Capitalization and punctuation for generated type names follow Java
convention. Also, while this example parses the XML from a file, other
parse methods support a Java
InputStream object, a
Reader object, and so on.
Java code prints the following to the console:
Purchase order has 3 line items.
Line item 0
Description: Burnham's Celestial Handbook, Vol 1
Line item 1
Description: Burnham's Celestial Handbook, Vol 2
Total items: 4
Total amount: 41.68
Creating New XML Instances from Schema
As you've seen, XMLBeans provides a "factory" class you can use to create
new instances. The following example creates a new
purchase-order element and adds a customer
child element. It then inserts name and
address child elements, creating the
elements and setting their values with a single call to their set methods.
The following is the XML that results. Note that XMLBeans assigns the
correct namespace based on the schema, using an "ns1" (or, "namespace 1")
prefix. For practical purposes, the prefix itself doesn't really matter —
it's the namespace URI (http://openuri.org/easypo) that defines the
namespace. The prefix is merely a marker that represents it.
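To see why the prefix is only a marker, the following sketch parses two hypothetical root elements that differ only in their prefix and compares the namespace URI and local name of each. It uses the JDK's namespace-aware DOM parser rather than XMLBeans, and the class and method names are made up for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class NamespacePrefixSketch {

    /** Returns the root element's expanded name as "{namespace-uri}local-name". */
    static String describeRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Without this setting the parser does not resolve prefixes to URIs.
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same namespace URI, two different (arbitrary) prefixes.
        String withNs1 = "<ns1:purchase-order xmlns:ns1='http://openuri.org/easypo'/>";
        String withPo = "<po:purchase-order xmlns:po='http://openuri.org/easypo'/>";
        System.out.println(describeRoot(withNs1));
        System.out.println(describeRoot(withNs1).equals(describeRoot(withPo))); // true
    }
}
```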
Note that all types (including those generated from schema) inherit from
XmlObject, and so provide a
Factory class. For an overview of the type
system in which XmlObject fits, see
XMLBeans Support for Built-In Schema Types.
The generated types you saw used in the preceding example are actually
part of a hierarchy of XMLBeans types. This hierarchy is one of the ways in
which XMLBeans presents an intuitive view of schema. At the top of the
hierarchy is XmlObject, the base interface
for XMLBeans types. Beneath this level, there are two main type categories:
generated types that represent user-derived schema types, and included types
that represent built-in schema types.
This topic has already introduced generated types. For more information, see
Java Types Generated from User-Derived Schema Types.
Built-In Type Support
In addition to types generated from a given schema, XMLBeans provides 46
Java types that mirror the 46 built-in types defined by the XML schema
specification. Where schema defines xs:string and
xs:int, for example, XMLBeans provides XmlString and
XmlInt. Each of these also inherits from
XmlObject, which corresponds to the built-in schema type
xs:anyType.
XMLBeans provides a way for you to handle XML data as these built-in
types. Where your schema includes an element whose type is, for example,
xs:int, XMLBeans will provide a generated
method designed to return an XmlInt. In
addition, as you saw in the preceding example, for most types there will
also be a method that returns a natural Java type such as
int. The following two lines of code return
the quantity element's value, but return it
as different types.
// Methods that return simple types begin with an "x".
XmlInt xmlQuantity = lineitems[j].xgetQuantity();
// Methods that return a natural Java type are unadorned.
int javaQuantity = lineitems[j].getQuantity();
In a sense both get methods navigate to the
quantity element; the getQuantity
method goes a step further and converts the element's value to the most
appropriate natural Java type before returning it. (XMLBeans also provides a
means for validating the XML as you work with it.)
If you know a bit about XML schema, XMLBeans types should seem fairly
intuitive. If you don't, you'll learn a lot by experimenting with XMLBeans
using your own schemas and XML instances based on them.
For more information on the methods of types generated from schema, see
Methods for Types Generated From Schema. For more about how XMLBeans
represents built-in schema types, see
XMLBeans Support for Built-In Schema Types.
Using XQuery Expressions
With XMLBeans you can use XQuery to query XML for specific pieces of
data. XQuery is sometimes referred to as "SQL for XML" because it provides a
mechanism to access data directly from XML documents, much as SQL provides a
mechanism for accessing data in traditional databases.
XQuery borrows some of its syntax from XPath, a syntax for specifying
nested data in XML. The following example returns all of the
line-item elements whose
price child elements have values less than
or equal to 20.00:
PurchaseOrderDocument doc = PurchaseOrderDocument.Factory.parse(po);

/*
 * The XQuery expression is the following two strings combined. They're
 * declared separately here for convenience. The first string declares
 * the namespace prefix that's used in the query expression; the second
 * declares the expression itself. The semicolon ends the namespace
 * declaration, as the XQuery 1.0 prolog syntax requires.
 */
String nsText = "declare namespace po = 'http://openuri.org/easypo'; ";
String pathText = "$this/po:purchase-order/po:line-item[po:price <= 20.00]";
String queryText = nsText + pathText;

XmlCursor itemCursor = doc.newCursor().execQuery(queryText);
This code creates a new cursor at the start of the document. From there,
it uses the XmlCursor interface's
execQuery method to execute the query
expression. In this example, the method's parameter is an XQuery expression
that simply says, "From my current location, navigate through the
purchase-order element and retrieve those
line-item elements whose price value is less than
or equal to 20.00." The $this variable means
"the current position."
For more information about XQuery, see
XQuery 1.0: An XML
Query Language at the W3C web site.
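The path part of the query above is also valid XPath, so the same selection can be tried without XMLBeans. The following sketch uses the JDK's javax.xml.xpath API. A NamespaceContext plays the role of the "declare namespace" string, and an absolute path replaces $this, which is an XMLBeans convention. The sample purchase order, its item descriptions, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class LineItemPriceXPathSketch {
    static final String PO_NS = "http://openuri.org/easypo";

    // Hypothetical instance data; only the namespace URI comes from the text.
    static final String SAMPLE_PO =
        "<po:purchase-order xmlns:po='http://openuri.org/easypo'>"
      + "<po:line-item><po:description>Item A</po:description><po:price>21.79</po:price></po:line-item>"
      + "<po:line-item><po:description>Item B</po:description><po:price>19.89</po:price></po:line-item>"
      + "<po:line-item><po:description>Item C</po:description><po:price>20.00</po:price></po:line-item>"
      + "</po:purchase-order>";

    // Maps the "po" prefix used in the path to the namespace URI.
    static final NamespaceContext PO_CONTEXT = new NamespaceContext() {
        @Override public String getNamespaceURI(String prefix) {
            return "po".equals(prefix) ? PO_NS : XMLConstants.NULL_NS_URI;
        }
        @Override public String getPrefix(String uri) {
            return PO_NS.equals(uri) ? "po" : null;
        }
        @Override public Iterator<String> getPrefixes(String uri) {
            List<String> prefixes = PO_NS.equals(uri)
                    ? Collections.singletonList("po") : Collections.<String>emptyList();
            return prefixes.iterator();
        }
    };

    /** Returns descriptions of line items whose price is at most 20.00. */
    static List<String> descriptionsPricedAtMost20(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(PO_CONTEXT);
            NodeList hits = (NodeList) xpath.evaluate(
                    "/po:purchase-order/po:line-item[po:price <= 20.00]/po:description",
                    doc, XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                result.add(hits.item(i).getTextContent());
            }
            return result;
        } catch (ParserConfigurationException | SAXException | IOException
                | XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(descriptionsPricedAtMost20(SAMPLE_PO)); // [Item B, Item C]
    }
}
```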
Using XML Cursors
In the preceding example you may have noticed the
XmlCursor interface. In addition to providing a way to execute
XQuery expressions, an XML cursor offers a fine-grained model for
manipulating data. The XML cursor API, analogous to the DOM's object API, is
simply a way to point at a particular piece of data. So, just as a cursor
helps you navigate through a word processing document, the XML cursor defines a
location in XML where you can perform actions on the selected XML.
Cursors are ideal for moving through an XML document when there's no
schema available. Once you've got the cursor at the location you're
interested in, you can perform a variety of operations with it. For example,
you can set and get values, insert and remove fragments of XML, copy
fragments of XML to other parts of the document, and make other fine-grained
changes to the XML document.
The following example uses an XML cursor to navigate to the
name child element.
What's happening here? As with the earlier example, the code loads the
XML from a File object. After loading the
document, the code creates a cursor at its beginning. Moving the cursor a
few times takes it to the nested name
element. Once there, the getTextValue method retrieves the element's value.
This is just an introduction to XML cursors. For more information about
using cursors, see
Navigating XML with Cursors.
Where to Go Next
XMLBeans provides intuitive ways to handle XML, particularly if you're
starting with schema. If you're accessing XML that's based on a schema,
you'll probably find it most efficient to access the XML through
generated types specific to the schema. To do this, you begin by
compiling the schema to generate interfaces. For more information on
using XMLBeans types generated by compiling schema, see
Java Types Generated From User-Derived Schema Types and
Methods for Types Generated From Schema.
You might be interested in reading more about the type system on which
XMLBeans is based, particularly if you're using types generated from
schema. XMLBeans provides a hierarchical system of types that mirror
what you find in the XML schema specification itself. If you're working
with schema, you might find it helps to understand how these types work.
For more information, see
XMLBeans Support for Built-In Schema Types and
Introduction to Schema Type Signatures.
XMLBeans provides access to XML through XQuery, which borrows path
syntax from XPath. With XQuery, you can select specific fragments of
XML data with or without schema. To learn more about using XQuery and
XPath in XMLBeans, see
Selecting XML with XQuery and XPath.
You can use the XmlCursor interface for
fine-grained navigation and manipulation of XML. For more information, see
Navigating XML with Cursors.
Selecting XML with XQuery and XPath
You can use XQuery and XPath to retrieve specific pieces of XML as you might
retrieve data from a database. XQuery and XPath provide a syntax for
specifying which elements and attributes you're interested in. The XMLBeans
API provides two methods for executing XQuery and XPath expressions, and two
differing ways to use them. The methods are
selectPath and execQuery, and you can
call them from
XmlObject (or an object inheriting from it) or
XmlCursor. The results for the methods differ depending on which of the two you call them from.
Using the selectPath Method
The selectPath method is the most efficient
way to execute XPath expressions. The selectPath
method is optimized for XPath. When you use XPath with the
selectPath method, the value returned is an
array of values from the current document. In contrast, when you
use execQuery, the value returned is a copy of the matching values rather than values from the current document.
Calling from XmlObject
When called from XmlObject (or a type
that inherits from it), this method returns an array of objects. If the
expression is executed against types generated from schema, then the type
for the returned array is one of the Java types corresponding to the schema.
For example, imagine you have the following XML containing employee
information. You've compiled the schema describing this XML and the types
generated from schema are available to your code.
Fred Jones
900 Aurora Ave., Seattle, WA 98115
2011 152nd Avenue NE, Redmond, WA 98052
(425)555-5665
(206)555-5555
(206)555-4321
If you wanted to find the phone numbers whose area code was 206, you could
capture the XPath expression in this way:
Notice in the query expression that the variable
$this represents the current context node (the
XmlObject that you are querying from). In
this example you are querying from the document level.
You could then print the results with code such as the following:
// Retrieve the matching phone elements and assign the results to the
// corresponding generated type.
PhoneType[] phones = (PhoneType[])empDoc.selectPath(queryExpression);
// Loop through the results, printing the value of each phone element.
for (int i = 0; i < phones.length; i++)
{
    XmlCursor phoneCursor = phones[i].newCursor();
    System.out.println(phoneCursor.getTextValue());
    phoneCursor.dispose();
}
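The area-code selection described above can be approximated with the JDK alone. The following sketch uses javax.xml.xpath in place of selectPath, so it returns plain strings rather than generated types. The element names, the location attribute values, the absence of a namespace, and the starts-with predicate are all assumptions made for this example. Only the phone numbers come from the text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class AreaCodeXPathSketch {

    // Hypothetical, namespace-free employee document.
    static final String EMPLOYEES_XML =
        "<employees>"
      + "<employee>"
      + "<name>Fred Jones</name>"
      + "<phone location='work'>(425)555-5665</phone>"
      + "<phone location='home'>(206)555-5555</phone>"
      + "<phone location='mobile'>(206)555-4321</phone>"
      + "</employee>"
      + "</employees>";

    // An absolute path stands in for $this, which the JDK XPath engine
    // does not define.
    static final String AREA_206 =
        "/employees/employee/phone[starts-with(., '(206)')]";

    /** Returns the text of every phone element in the 206 area code. */
    static List<String> phonesInArea206(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                    AREA_206,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                result.add(hits.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Matches come back in document order.
        System.out.println(phonesInArea206(EMPLOYEES_XML)); // [(206)555-5555, (206)555-4321]
    }
}
```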
Calling from XmlCursor
When called from an XmlCursor instance,
the selectPath method retrieves a list of
selections, or locations in the XML. The selections are remembered by the
cursor instance. You can use methods such as
toNextSelection to navigate among them.
The selectPath method takes an XPath
expression. If the expression returns any results, each of those results
is added as a selection to the cursor's list of selections. You can move
through these selections in the way you might use
java.util.Iterator methods to move
through a collection.
For example, for a path such as
$this/employees/employee, the results
would include a selection for each employee element found by the
expression. Note that the variable $this
is always bound to the current context node, which in this example is
the document. After calling the selectPath
method, you would use various "selection"-related methods to work with
the results. These methods include:
getSelectionCount() to retrieve the
number of selections resulting from the query.
toNextSelection() to move the cursor
to the next selection in the list (such as to the one pointing at
the next employee element found).
toSelection(int) to move the cursor
to the selection at the specified index (such as to the third
employee element in the selection).
hasNextSelection() to find out if
there are more selections after the cursor's current position.
clearSelections() clears the
selections from the current cursor. This doesn't modify the document
(in other words, it doesn't delete the selected XML); it merely
clears the selection list so that the cursor is no longer keeping
track of those positions.
The following example shows how you might use
selectPath, in combination with the push
and pop methods, to maneuver through
XML, retrieving specific values.
public void printZipsAndWorkPhones(XmlObject xml)
{
    // Declare the namespace that will be used.
    String xqNamespace =
        "declare namespace xq='http://openuri.org/selectPath'; ";

    // Insert a cursor and move it to the first element.
    XmlCursor cursor = xml.newCursor();
    cursor.toFirstChild();

    // Save the cursor's current location by pushing it
    // onto a stack of saved locations.
    cursor.push();
    // Query for zip elements.
    cursor.selectPath(xqNamespace + "$this//xq:zip");

    // Loop through the list of selections, getting the value of
    // each element.
    while (cursor.toNextSelection())
    {
        System.out.println(cursor.getTextValue());
    }
    // Pop the saved location off the stack.
    cursor.pop();
    // Query again from the top, this time for work phone numbers.
    cursor.selectPath(xqNamespace + "$this//xq:phone[@location='work']");

    // Move the cursor to the first selection, then print that element's
    // value.
    cursor.toNextSelection();
    System.out.println(cursor.getTextValue());
    // Dispose of the cursor.
    cursor.dispose();
}
Using selections is somewhat like tracking the locations of multiple
cursors with a single cursor. This becomes especially clear when you
remove the XML associated with a selection. When you do so the selection
itself remains at the location where the removed XML was, but now the
selection's location is immediately before the XML that was after the
XML you removed. In other words, removing XML created a kind of vacuum
that was filled by the XML after it, which shifted up into the space —
up into position immediately after the selection location. This is
exactly the same as if the selection had been another cursor.
Finally, when using selections keep in mind that the list of
selections is in a sense "live". The cursor you're working with is
keeping track of the selections in the list. In other words, be sure to
call the clearSelections method when
you're finished with the selections, just as you should call the
XmlCursor.dispose() method when you're
finished using the cursor.
Using the execQuery Method
Use the execQuery method to execute
XQuery expressions that are more sophisticated than paths. These expressions
include more sophisticated loops and FLWR (For, Let, Where, and Return) expressions.
Note: Be sure to see the
simpleExpressions sample in the SamplesApp application for a sampling of
XQuery expressions in use.
Calling from XmlObject
As with selectPath, calling
execQuery from an
XmlObject instance will return an XmlObject
array. If the XmlObject instances resulting
from the XQuery match a recognized XMLBeans type (the namespace and top
level element name match up with an XMLBeans type) then the
XmlObject will be typed; otherwise the
XmlObject will be untyped.
Calling from XmlCursor
Calling execQuery from an
XmlCursor instance returns a new
XmlCursor instance. The cursor returned is
positioned at the beginning of a new XML document representing the query
results, and you can use it to move through the results, cursor-style (for
more information, see
Navigating XML with Cursors). If the document resulting from the query
execution represents a recognized XMLBeans type (the namespace and top level
element name match up with an XMLBeans type) then the document resulting
from the XQuery will have that Java type; otherwise the resulting document
will be untyped.