Editing document with XML Schema

We shall assume that you have already read the chapters "Mapping between HTML form and XML data", "Modify the Structure of XML Data" and "Editing document with DTD". If you are interested in using XML Schema rather than DTD, you should still read the DTD chapter first. That chapter is really about schema languages in general, and there is very little to add here.

Again, let us look at the simplified Form Generator below, or at the real Form Generator page. We shall use the same W3C PO (purchase order) example data, together with the sample schema from the W3C Schema primer. You can get the sample data and schema from the "Get Sample" selection lists.

How Schema affects the generated HTML form

Automatic generation of HTML Form from XML data

- Style (or enter a file name if you want an external style sheet)
- Put the XML data here (or the URL of your XML file)
- Get Sample Data
- Put the XML Schema here (or the URL of your schema)

Options:

- Form output will be in a popup window.
- Form will output XML, plain text or HTML, using script on the server or the client.
- Label will be the same as the tag, or separated into words.
- Form fields will be in outline form to reflect the XML structure.
- Form will be used in older browsers that do not support fieldset, legend and label.
- Allow insertion and deletion (requires a browser supporting DOM).
- Check Schema during editing.
- Always conform to Schema during editing.

When we generate the form, we get the following. The forms in the rest of this page were generated using Mozilla; they are mainly for display and are not really "live". If you want to play around with the form, you should generate it yourself.

This form is identical in appearance to the one generated with DTD; we put it here just for easy reference. You can also omit the XML data and generate a shell document from the Schema alone. Again, it is identical in appearance to the one from DTD, so we shall not repeat it here.

- purchaseOrder
- shipTo

- billTo

- items
- item

- item

The appearance may be the same, but under the hood the two can be quite different. With Schema we have a much better description of the elements' datatypes. The DTD form passes verification only because a DTD cannot specify datatypes; the Schema form passes verification because the XML document really is valid.

This becomes quite clear when we generate the shell document with empty fields. Verification fails all over the place, with errors including invalid date, decimal, positiveInteger and SKU (a datatype defined in the Schema).
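A small sketch may help to show why an empty shell document fails. The Python below is not the editor's code: it approximates the lexical rules of the four datatypes with regular expressions and shows that an empty string satisfies none of them. The `CHECKS` table and the `failing_types` helper are names made up for this illustration, and the date pattern is simplified (it does not check month or day ranges).

```python
import re

# Simplified lexical checks for the datatypes used in the PO schema.
# These only approximate the XML Schema rules; this is not a full validator.
CHECKS = {
    "date": re.compile(r"-?\d{4,}-\d{2}-\d{2}(Z|[+-]\d{2}:\d{2})?"),
    "decimal": re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)"),
    "positiveInteger": re.compile(r"\+?0*[1-9]\d*"),
    "SKU": re.compile(r"\d{3}-[A-Z]{2}"),  # pattern copied from the schema
}

def failing_types(value):
    """Return the names of the datatypes that reject the given field value."""
    return [name for name, rx in CHECKS.items() if rx.fullmatch(value) is None]

print(failing_types(""))        # an empty field is rejected by all four types
print(failing_types("926-AA"))  # accepted by SKU only
```

With a DTD there is nothing comparable to check, which is why the same empty shell passes verification there.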

Schema annotation as information about the elements

Documentation in schema annotations can be used to give the user information about the elements. It can be accessed through "Help" in the internal menu. In the future it will probably also be available from the Help key in IE. Of course, most schemas do not have any annotations, in which case a "No help information is available." alert is shown.

If nothing is selected, the information is about the schema itself, taken from the top-level annotation in the schema. So in the example PO schema, you get the following alert:

In general, two pieces of information are available for an element: information about the element itself, and information about the element's datatype. For example, the shipTo element is the address to ship the product to, and its datatype is USAddress, a complex type that contains all the address content. You should put the element information in an annotation right after the opening tag of the element declaration, and the type information in an annotation right after the opening tag of the type definition.

As an example, let us look at the PO schema and use a color code to show where the element information and type information annotations should go.

Color code: schema information, element information, type information

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<!-- Sample schema from XML Schema Part 0: Primer -->
<!-- Schema modified to have extra attributes and annotations for testing. -->

<xsd:annotation> <xsd:documentation xml:lang="en"> Purchase order schema for Example.com. Copyright 2000 Example.com. All rights reserved. </xsd:documentation> </xsd:annotation>
<xsd:element name="purchaseOrder" type="PurchaseOrderType"/> <xsd:element name="comment" type="xsd:string"/> <xsd:complexType name="PurchaseOrderType"> <xsd:sequence> <xsd:element name="shipTo" type="USAddress">
<xsd:annotation> <xsd:documentation xml:lang="en"> Ship the purchase to this address </xsd:documentation> </xsd:annotation>
</xsd:element> <xsd:element name="billTo" type="USAddress" nillable="true">
<xsd:annotation> <xsd:documentation xml:lang="en"> Bill the purchase to this address </xsd:documentation> </xsd:annotation>
</xsd:element> <xsd:element ref="comment" minOccurs="0"/> <xsd:element name="items" type="Items"/> </xsd:sequence> <xsd:attribute name="orderDate" type="xsd:date"/> </xsd:complexType> <xsd:complexType name="USAddress">
<xsd:annotation> <xsd:documentation xml:lang="en"> the address type for USA Should always has country="US" Other fields are name, street, city, state and zip code </xsd:documentation> </xsd:annotation>
<xsd:sequence> <xsd:element name="name" type="xsd:string"/> <xsd:element name="street" type="xsd:string"/> <xsd:element name="city" type="xsd:string"/> <xsd:element name="state" type="xsd:string"/> <xsd:element name="zip" type="xsd:decimal"/> </xsd:sequence> <xsd:attribute fixed="US" name="country" type="xsd:NMTOKEN"/> </xsd:complexType> <xsd:complexType name="Items"> <xsd:sequence> <xsd:element maxOccurs="unbounded" minOccurs="0" name="item">
<xsd:annotation> <xsd:documentation xml:lang="en"> Item purchased </xsd:documentation> </xsd:annotation>
<xsd:complexType> <xsd:annotation> <xsd:documentation xml:lang="en"> Information includes part number, product name, quantity, price, comment(optional), ship date(optional) </xsd:documentation> </xsd:annotation>
<xsd:sequence> <xsd:element name="productName" type="xsd:string"/> <xsd:element name="quantity">
<xsd:annotation> <xsd:documentation xml:lang="en"> How many was ordered </xsd:documentation> </xsd:annotation>
<xsd:simpleType> <xsd:annotation> <xsd:documentation xml:lang="en"> must be between 1 and 99 </xsd:documentation> </xsd:annotation>
<xsd:restriction base="xsd:positiveInteger"> <xsd:maxExclusive value="100"/> </xsd:restriction> </xsd:simpleType> </xsd:element> <xsd:element name="USPrice" type="xsd:decimal"/> <xsd:element ref="comment" minOccurs="0"/> <xsd:element minOccurs="0" name="shipDate" type="xsd:date"/> </xsd:sequence> <xsd:attribute use="required" name="partNum" type="SKU">
<xsd:annotation> <xsd:documentation xml:lang="en"> The part number of the product </xsd:documentation> </xsd:annotation>
</xsd:attribute> </xsd:complexType> </xsd:element> </xsd:sequence> </xsd:complexType> <!-- Stock Keeping Unit, a code for identifying products --> <xsd:simpleType name="SKU">
<xsd:annotation> <xsd:documentation xml:lang="en"> Stock Keeping Unit, a code for identifying products. It should be in the form ddd-xx </xsd:documentation> </xsd:annotation>
<xsd:restriction base="xsd:string"> <xsd:pattern value="\d{3}-[A-Z]{2}"/> </xsd:restriction> </xsd:simpleType> </xsd:schema>

One question is why we cannot put annotations at lower levels of the schema, for example with each enumeration value. We may be able to do this in the future, but it takes a lot of work, and the annotations have to be written in a certain way for the combined message to read naturally. Even now, we need to combine element and type information and use it in error messages. As the examples show, the two do not yet work well together; we will be fine-tuning them in coming versions.

Another question is why we cannot combine the element information and the type information into a single annotation on the element. One reason is that we would be duplicating the message: the shipTo and billTo elements are both of type USAddress, and we do not want to repeat the type description for each. Another reason is that the type information is used not only in the help message but also in the error message during verification. So we should keep the element information and the type information separate.
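To make the separation concrete, here is a minimal sketch of how element information and type information could be looked up independently. It is written in Python with the standard `xml.etree.ElementTree` module and is only an illustration, not the editor's implementation. The helper names `doc_of` and `help_info` are hypothetical, the schema is a cut-down version of the PO schema, and only named complex types are handled (anonymous types such as the one inside item are ignored).

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

# A cut-down version of the PO schema, just enough for the lookup.
SCHEMA = """
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="PurchaseOrderType">
    <xsd:sequence>
      <xsd:element name="shipTo" type="USAddress">
        <xsd:annotation><xsd:documentation>Ship the purchase to this address</xsd:documentation></xsd:annotation>
      </xsd:element>
      <xsd:element name="billTo" type="USAddress">
        <xsd:annotation><xsd:documentation>Bill the purchase to this address</xsd:documentation></xsd:annotation>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="USAddress">
    <xsd:annotation><xsd:documentation>The address type for USA</xsd:documentation></xsd:annotation>
  </xsd:complexType>
</xsd:schema>
"""

def doc_of(node):
    """Text of the annotation that is a direct child of node, or None."""
    doc = node.find(f"{XSD}annotation/{XSD}documentation")
    return doc.text.strip() if doc is not None and doc.text else None

def help_info(root, element_name):
    """Return (element information, type information) for a named element."""
    element = next(e for e in root.iter(f"{XSD}element")
                   if e.get("name") == element_name)
    type_decl = next((t for t in root.iter(f"{XSD}complexType")
                      if t.get("name") == element.get("type")), None)
    type_doc = doc_of(type_decl) if type_decl is not None else None
    return doc_of(element), type_doc

root = ET.fromstring(SCHEMA)
print(help_info(root, "shipTo"))
print(help_info(root, "billTo"))
```

Both elements get their own element information, while the USAddress description is written once and shared.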

Now let us look at how the annotations are used by the editor.

Here is the help information on the item element:

The type information is also used in the verification error message. Notice that we use the type information instead of the internally generated message:

Here is the error message on verification of the part number:

Compare the last two messages with the corresponding messages in the previous chapter.

For a predefined datatype, there will be a generated message. Here is the help alert for the zip code; only the type information is shown because no element information has been defined:

Here is the error message on verification of the zip code:

In conclusion, annotations can be used to enhance the user experience in the editor, but they are not a requirement. The basic philosophy of the editor is that you should be able to use an existing schema unchanged, but you can always improve the editor's behavior if you are willing to enhance the schema.

Validity of the Schema

The form generator assumes that the schema supplied by the user is valid. Very little effort is spent on checking the semantic validity of the schema, and often the checking is delegated to the client. Adding more checking would make the generator more useful for users who want to experiment with their schemas. This will be worked on in the future, but for now you have to depend on other tools.

In some cases, an invalid Schema is accepted because it does not present a problem for our web XML editor. For example, a schema that violates the Unique Particle Attribution rule would still run in the editor.

The bottom line is that if you want a schema that is definitely valid, you have to verify it elsewhere.

Elements with the nillable attribute

If an element has the nillable attribute in the schema, then the element can be nil in the data. You can tell whether an element is nillable by using help. In the example schema, we added the nillable attribute to billTo. Here is the information about billTo:

The message states that billTo is nillable. If you really want to nil it, you can choose the nil command from the menu. All the elements in billTo are then removed. If you now use the help command, it will state that billTo is nil:

If the nilled element is a simple type element, the value field is cleared. If the input field is a select list, it will show a value of undefined. A radio button list will have all its buttons deselected.
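In XML Schema, a nil element is marked in the instance data with the `xsi:nil` attribute from the XML Schema instance namespace. The Python sketch below shows roughly what the nil command described above amounts to for billTo: the child elements are removed, the attributes are kept, and `xsi:nil` is set. It is an illustration only, not the editor's implementation; `nil_element` and `is_nil` are hypothetical helper names, and the billTo data is abbreviated.

```python
import xml.etree.ElementTree as ET

XSI_NIL = "{http://www.w3.org/2001/XMLSchema-instance}nil"

def nil_element(element):
    """Mark an element as nil: drop its content but keep its attributes."""
    for child in list(element):
        element.remove(child)
    element.text = None
    element.set(XSI_NIL, "true")

def is_nil(element):
    """True if the element carries xsi:nil with a true value."""
    return element.get(XSI_NIL) in ("true", "1")

bill_to = ET.fromstring(
    '<billTo country="US"><name>Robert Smith</name><zip>95819</zip></billTo>'
)
nil_element(bill_to)
print(len(bill_to), bill_to.get("country"), is_nil(bill_to))
```

After the call, billTo has no child elements, still carries country="US", and reports itself as nil.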

The information about a nilled element suggests using the clear command to edit the element. This is one way to "un-nil" an element. So if we use the clear command on the nilled billTo element, we get:

This is equivalent to making a new billTo element, with the attributes kept the same.

For a nilled simple type element, clearing it may show no visible change, but if you use the help command you will see that it is indeed no longer nil.

Another way of "un-nilling" a nilled element is simply to edit it. If you put some data in a text field, or make a selection in a select list or radio button list, then when you move to another field the element will no longer be nilled. For a complex type element, you can "un-nil" it by adding elements (but not attributes) to it.

Of course, with complex type elements it is much easier to use the clear command, because you know the resulting elements will conform to the schema. For simple type elements, if you want to "un-nil" one but keep an empty string as its value, the clear command is again the most convenient way to do it.

Whether an element is nillable is specified in the schema, and this is enforced only when validation is on. As long as you turn off validation, any element can be nilled, even if there is only a DTD and no W3C Schema. If validation is on, you cannot nil an element that is not nillable.
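The rule above can be written down as a tiny decision function. This is a sketch in Python of the behavior as described, not code taken from the editor; `can_nil` and its parameter names are made up for the illustration.

```python
def can_nil(nillable_in_schema, validation_on):
    """Return True if the nil command should be allowed for an element.

    nillable_in_schema: True only if a schema declares the element nillable
    (always False when there is just a DTD, which has no such declaration).
    validation_on: whether the editor is enforcing the schema.
    """
    # With validation off nothing is enforced, so any element can be nilled.
    if not validation_on:
        return True
    # With validation on, only elements declared nillable can be nilled.
    return nillable_in_schema

# Print the full truth table: validation_on, nillable, result.
for validation_on in (False, True):
    for nillable in (False, True):
        print(validation_on, nillable, can_nil(nillable, validation_on))
```

The only combination that is refused is validation on with an element that is not declared nillable.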

Supported and unsupported Schema features

There are many features in XML Schema and it takes time to implement all of them.

Right now I have implemented an important subset of the features. They include most of the predefined datatypes, and the derivation of simple datatypes by list, union and restriction, including patterns. Complex types with simple content allow extension or restriction. Complex content support includes sequence, choice, all, group, attributeGroup, extension and restriction. Substitution groups and the nillable attribute are supported after pre-release 4.

The list of unsupported features is also long, including namespaces, mixed content, key, unique, redefinition, referencing other schemas, plus a whole list of features that I do not even know about. I will work on some of these in future versions.

In summary, this is an incomplete implementation, which is why it is still a pre-release. Even so, there are enough features for it to be useful in the more common cases.
