Editing document with DTD

We shall assume that you have already read the chapters on "Mapping between HTML form and XML data" and "Modify the Structure of XML Data". If you are interested in using XML Schema rather than DTD, you should still read this chapter. The client side editor code and its functions are the same regardless of whether a DTD, an XML Schema, no schema, or (in the future) some other schema language is used. Much of this chapter is really about the client side editor, and you need to know it before you go on to the next chapter on XML Schema.

Again let us look at the simplified Form Generator below, or at the real Form Generator page. We shall use the same W3C purchase order (PO) example data from their schema primer, and put an equivalent DTD in the schema textarea. The Get Sample Data selection lists will put the content in for you.

How DTD affects the generated HTML form

Automatic generation of HTML Form from XML data

Title:

Style (or enter file name if you want an external style sheet):

Put the XML data here (or the URL of your XML file):


Get Sample Data:
Put the XML Schema here (or the URL of your schema):

Form output will be in popup window.
Form will output [XML | plain text | HTML], using script on [server | client].
Label will be [same as tag | separate words].
Form fields will be in outline form to reflect the XML structure.
Form will be used in older browsers that do not support fieldset, legend and label.
Allow insertion and deletion, require browser supporting DOM.
Check Schema during editing.
Always conform to Schema during editing.

Here we put the DTD in a schema area separate from XML data in a rather unusual manner. We do this so that we can contrast between using DTD and XML Schema for the same XML data. Of course you can embed the DTD with the XML data in the usual manner or put the DTD in an external file.

When we generate the form, we get the following. The forms in the rest of this page were generated using Mozilla and are mainly for display; they are not really "live". If you want to play around with one, you should generate the form yourself.

- purchaseOrder
  - shipTo
  - billTo
  - items
    - item
    - item

This looks very similar to the form generated without the DTD in the previous chapter. The only noticeable difference is that the country attribute of the shipTo and billTo elements is now a select list with only one item. We know from the DTD that the attribute has a fixed value, and this is how we show it: you can have any color as long as it is black. If the DTD declares the attribute as a choice of enumerated items, it is shown as a select list with the valid options.
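As a rough illustration of the rule above, here is a minimal Python sketch of how a generator might pick the form control for an attribute. The function name attribute_control and its parameters are made up for this example and are not taken from the Form Generator itself.

```python
from html import escape

def attribute_control(name, decl_type, default=None, fixed=False):
    """Return HTML for one attribute field (hypothetical helper).

    decl_type is either a DTD type keyword such as "CDATA", or a list
    of allowed values for an enumerated attribute.
    """
    if fixed:
        # A #FIXED attribute has exactly one legal value: a one-item select.
        options = [default]
    elif isinstance(decl_type, list):
        # An enumerated attribute offers each declared value.
        options = decl_type
    else:
        # Anything else is edited as free text.
        return '<input name="%s" value="%s">' % (escape(name), escape(default or ""))
    items = "".join(
        "<option%s>%s</option>" % (" selected" if value == default else "", escape(value))
        for value in options
    )
    return '<select name="%s">%s</select>' % (escape(name), items)
```

With a fixed value of "US", the country attribute comes out as a select list holding the single option US, which is the behaviour described above.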

Very often when the form is first generated, it shows a JavaScript confirm alert indicating that the XML document is not valid. This is because when the form starts up, we verify the document against the DTD/Schema. If the XML document does not conform, the error alert comes up.

The verification code is on the client side, and the equivalent code is not duplicated on the server side. The client side in general assumes the DTD/Schema is correct. Therefore errors in the DTD/Schema are often caught on the client side and not the server side, and sometimes they are not caught at all. Although this will improve in the future, you should try to validate the DTD/Schema with some other tool.

Generate XML document from DTD

If you provide a DTD but no XML data, the generator will generate a blank XML instance document from the DTD. Using the same DTD as in the previous example, we get the following XML instance document.

<purchaseOrder orderDate="">
  <shipTo country="US">
    <name></name>
    <street></street>
    <city></city>
    <state></state>
    <zip></zip>
  </shipTo>
  <billTo country="US">
    <name></name>
    <street></street>
    <city></city>
    <state></state>
    <zip></zip>
  </billTo>
  <comment></comment>
  <items>
    <item partNum="">
      <productName></productName>
      <quantity></quantity>
      <USPrice></USPrice>
      <comment></comment>
      <shipDate></shipDate>
    </item>
    <item partNum="">
      <productName></productName>
      <quantity></quantity>
      <USPrice></USPrice>
      <comment></comment>
      <shipDate></shipDate>
    </item>
  </items>
</purchaseOrder>
The XML document will show up in the XML editor as below. However, we also get an error saying that partNum is invalid. Remember the verification during startup that we talked about: fields in these generated XML documents are usually empty, and empty content often does not conform to the schema.

- purchaseOrder
  - shipTo
  - billTo
  - items
    - item
    - item

Note that item* in the DTD generates two item elements. This is a hint that there can be multiple item elements.
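To make the idea concrete, here is a small Python sketch of generating a blank instance from a content model, emitting two copies for a child marked with *. The CONTENT and ATTRIBUTES tables are a hand-written, hypothetical stand-in for part of the purchase order DTD (the occurrence marks on comment and shipDate are assumed), and the generator's real code may work quite differently.

```python
# Hypothetical in-memory form of part of the purchase order DTD:
# element name -> list of (child name, occurrence indicator).
CONTENT = {
    "items": [("item", "*")],
    "item": [("productName", ""), ("quantity", ""), ("USPrice", ""),
             ("comment", "?"), ("shipDate", "?")],
}
ATTRIBUTES = {"item": ["partNum"]}

def blank_instance(name, indent=0):
    """Return the lines of an empty instance rooted at element `name`."""
    pad = "  " * indent
    attrs = "".join(' %s=""' % attr for attr in ATTRIBUTES.get(name, []))
    children = CONTENT.get(name)
    if not children:
        # Leaf element: empty text content.
        return ["%s<%s%s></%s>" % (pad, name, attrs, name)]
    lines = ["%s<%s%s>" % (pad, name, attrs)]
    for child, occurs in children:
        # A repeatable child is emitted twice as a hint that it can repeat.
        # Emitting every other child exactly once is an assumption.
        copies = 2 if occurs == "*" else 1
        for _ in range(copies):
            lines.extend(blank_instance(child, indent + 1))
    lines.append("%s</%s>" % (pad, name))
    return lines

print("\n".join(blank_instance("items")))
```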

If elements in the DTD are defined recursively, we only go down one level in the recursion to avoid a document of infinite size. For example, take the following DTD:

<!DOCTYPE folder [
<!ELEMENT folder (folder*, file*)>
<!ATTLIST folder folderName CDATA #REQUIRED>
<!ELEMENT file (#PCDATA)>
<!ATTLIST file fileDescription CDATA #REQUIRED>
]>

We get the following instance document. It shows you that there can be folders within a folder, while keeping the document to a reasonable size rather than letting it run wild. Just for fun, we shall use multi-word labels.

- folder
  - folder
  - folder
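The one-level limit on recursion can be sketched in Python as follows. This is only an illustration under our own assumptions: CONTENT is a hand-written stand-in for the folder DTD, MAX_RECURSION and expand are hypothetical names, and the sketch emits each child once rather than doubling the repeatable ones.

```python
# Hypothetical in-memory form of the folder DTD: element -> child names.
CONTENT = {"folder": ["folder", "file"], "file": []}

# How many times an element may reappear beneath itself.
MAX_RECURSION = 1

def expand(name, path=()):
    """Return (name, children), refusing to nest an element too deeply."""
    path = path + (name,)
    children = []
    for child in CONTENT[name]:
        # Skip a child that already occurs too often on the way down.
        if path.count(child) > MAX_RECURSION:
            continue
        children.append(expand(child, path))
    return (name, children)

tree = expand("folder")
# The root folder gets a nested folder, but that folder gets no further folder.
print(tree)
```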

Verify the XML document with DTD

The XML document can either be verified against the DTD during editing, or verified on demand by the user.

If every change in the data is verified during editing, the editor will try to make the XML data agree with the DTD all the time. Sometimes this may not be a good idea. Suppose you have a content model (alpha, beta)*. Currently the editor only lets you insert or delete one element at a time. If you add the alpha element, the data no longer agrees with the DTD, and you have to add the beta element to bring it into agreement again. However, if the data has to agree with the DTD at all times, you can never add the alpha beta pair of elements.
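The (alpha, beta)* problem is easy to see with a small checker. The Python sketch below turns an element-only content model into a regular expression over the list of child names. It is our own simplification (it does not handle #PCDATA or mixed content), and the names model_to_regex and conforms are hypothetical, not part of the editor.

```python
import re

def model_to_regex(model):
    """Translate a DTD content model such as "(alpha, beta)*" to a regex."""
    compact = re.sub(r"\s+", "", model)
    # Each element name becomes a group matching that name plus one space.
    named = re.sub(r"[A-Za-z_][\w.\-]*",
                   lambda m: "(?:" + re.escape(m.group()) + " )", compact)
    # A DTD sequence "," is plain concatenation; "|", "?", "*" and "+"
    # already mean the same thing in a regex.
    return named.replace(",", "")

def conforms(model, children):
    """Check a list of child element names against a content model."""
    text = "".join(name + " " for name in children)
    return re.fullmatch(model_to_regex(model), text) is not None

print(conforms("(alpha, beta)*", []))                 # empty content
print(conforms("(alpha, beta)*", ["alpha"]))          # after inserting alpha
print(conforms("(alpha, beta)*", ["alpha", "beta"]))  # after inserting beta
```

The middle state, with alpha alone, does not conform. An editor that refuses every non-conforming intermediate state can therefore never get from the first state to the third by inserting one element at a time.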

Therefore we give you preference flags to control how much verification you want.

Check Schema during editing: if this flag is off, there is absolutely no verification during editing. If it is on, items that are changed are checked to see whether the result agrees with the DTD. The check gives you a warning, but it may or may not prevent you from making the change.

Always conform to Schema during editing: if this flag is on, the editor will try to prevent you from making the change. So if you try to insert an element but the result does not agree with the content model, you cannot do it. This also means that you cannot create a new class of elements from the menu. If this flag is off, you get a warning; if you insist, you can still make the change. You can also create new classes of elements. However, currently the existing schema is not updated when you violate it, so even if you are allowed to make these non-conforming changes, the document will not pass verification.
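The interplay of the two flags can be summarized as a small decision function. This Python sketch is only our reading of the description above. The function decide and its return values are hypothetical, and it assumes that the second flag has no effect when the first one is off.

```python
def decide(change_is_valid, check_schema, always_conform, user_insists=False):
    """Say what happens to one edit under the two preference flags."""
    if not check_schema or change_is_valid:
        # No checking at all, or nothing to complain about.
        return "apply"
    if always_conform:
        # The editor tries to prevent the non-conforming change.
        return "reject"
    # Warning only: the user may still go ahead.
    return "apply with warning" if user_insists else "cancel"

print(decide(False, check_schema=True, always_conform=True))
print(decide(False, check_schema=True, always_conform=False, user_insists=True))
```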

Verification checks the DTD content model, and it also checks the data. Suppose an attribute is specified as NMTOKEN and verification is on. After you modify the attribute and move on to the next field, the editor checks whether you did enter an NMTOKEN. Currently it is possible to get around this check, but that will be fixed later.
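A data check of this kind can be very small. The Python sketch below tests a value against an ASCII-only approximation of the XML Nmtoken production (one or more name characters); the real production also allows many non-ASCII characters, and the name is_nmtoken is ours, not the editor's.

```python
import re

# ASCII-only approximation: letters, digits, ".", "_", ":" and "-".
NMTOKEN = re.compile(r"[A-Za-z0-9._:\-]+")

def is_nmtoken(value):
    """Return True if value looks like a single NMTOKEN."""
    return NMTOKEN.fullmatch(value) is not None

print(is_nmtoken("872-AA"))     # a plausible part number
print(is_nmtoken(""))           # an empty field is not a token
print(is_nmtoken("two words"))  # whitespace is not allowed
```

This also shows why a freshly generated document with an empty NMTOKEN attribute fails verification at startup.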

If no error is found, we finish with nothing selected. If there is an error, there will be an alert dialog and the guilty element is selected (because of a browser bug, the selection often does not show up while the alert is up). Since the DTD does not provide any message, the editor generates an error message that tries to describe the problem. Here is an example.



It is not difficult to see that the problem is that the quantity is missing.

Better message may be provided with annotation in schema. This will be discussed in the schema chapter.

If you hit cancel, then the verification stops and the last error element is selected. If you hit OK, then the editor would look for more errors in the originally selected element. Following the same example, it would find the error that the part number is not a valid token, select the part number field and show the error dialog.


The two verification flags can be set from the form generator. You can also change them during editing. Both flags are on the preference dialog (in fact, currently they are the only flags there). Here is what the preference dialog looks like.

Check to see if data conforms to schema.
Try to disallow data that does not follow the schema.

Hitting the Set Preference button would change the preference, but the window will stay open so you can change it again quickly.

You may also verify XML elements on demand using the menu. Only the selected element (and its child elements) will be verified, but of course you can select the root element and verify the whole document. It is a good idea to do the verification before you try to generate the XML document, in particular if you have turned off verification during editing.

Verification of ID/IDREF/IDREFS is different from other verifications. These are only verified when the whole document is verified. In that case, duplicate IDs are reported. IDs in IDREF and IDREFS are checked for existence, but missing IDs are only reported after the whole document has been verified, and the offending elements are not selected as they are for other verification errors.
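The reason references can only be reported at the end is that an ID may be declared anywhere in the document, including after the element that refers to it. A minimal Python sketch of such a whole-document pass follows. It simplifies by identifying ID, IDREF and IDREFS attributes by name alone (a DTD declares them per element), and check_ids is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

def check_ids(root, id_attrs, idref_attrs, idrefs_attrs=()):
    """Return (duplicate IDs, referenced IDs that do not exist)."""
    seen, duplicates, referenced = set(), [], []
    for element in root.iter():
        for name, value in element.attrib.items():
            if name in id_attrs:
                if value in seen:
                    duplicates.append(value)
                seen.add(value)
            elif name in idref_attrs:
                referenced.append(value)
            elif name in idrefs_attrs:
                # IDREFS holds a whitespace-separated list of IDs.
                referenced.extend(value.split())
    # Existence can only be judged once every ID has been collected.
    missing = [ref for ref in referenced if ref not in seen]
    return duplicates, missing

doc = ET.fromstring('<doc><a id="x"/><a id="x"/><b ref="y"/><c refs="x z"/></doc>')
print(check_ids(doc, {"id"}, {"ref"}, {"refs"}))
```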

Unsupported DTD features

I believe that most DTD features are already supported. One exception is notation, which is rarely used, so I am in no hurry to implement it. Entities are also not yet supported because I have not decided how much support is necessary.

Schema features beyond DTD

The schema verification in the editor is meant to be general and not specific to any particular schema language. Of course it must be general enough to support DTD now and XSD later, and maybe other schema languages after that. The current implementation is not sufficient for XSD, and that is one reason why only DTD is supported now. However, many features beyond DTD are already implemented. In particular, support for most of the predefined XSD datatypes is already in the editor.

Since it will be a while before XSD is supported, in the meantime I have been using an extended DTD to test some of these features. This extension will go away when XSD is supported. These extensions should only be used when using a DTD to generate an XML document, because no other parser, including the parser in the cgi of the form generator, supports them.

On the bounds of XML elements, besides the usual ?+* used by DTD, you can also use {n}, {n,}, {n,m}.
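One way to handle both kinds of indicator uniformly is to reduce each to a (minimum, maximum) pair. The Python sketch below does that. It assumes the brace forms mean what they mean in regular expressions (exactly n, at least n, between n and m), which the text above does not spell out, and occurrence_bounds is a hypothetical name.

```python
import re

def occurrence_bounds(indicator):
    """Map an occurrence indicator to (min, max); max None means unbounded."""
    simple = {"": (1, 1), "?": (0, 1), "*": (0, None), "+": (1, None)}
    if indicator in simple:
        return simple[indicator]
    match = re.fullmatch(r"\{(\d+)(,(\d*))?\}", indicator)
    if match is None:
        raise ValueError("unknown occurrence indicator: %r" % indicator)
    low = int(match.group(1))
    if match.group(2) is None:       # {n}
        return (low, low)
    if match.group(3) == "":         # {n,}
        return (low, None)
    return (low, int(match.group(3)))  # {n,m}

for text in ["?", "*", "{2}", "{2,}", "{2,5}"]:
    print(text, occurrence_bounds(text))
```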

DTD has predefined datatypes such as ID, IDREF, NMTOKEN. You can also replace them with most of the XSD predefined datatypes, such as Name, NCName, boolean, base64Binary, hexBinary, float, double, anyURI, decimal, integer, nonPositiveInteger, negativeInteger, long, int, short, byte, nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, unsignedByte, positiveInteger, duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay and gMonth. You can also have a list datatype by appending * to the predefined datatype. For example, integer* stands for a list of integers.
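The list convention is simple to check once the item type can be checked. Here is a minimal Python sketch covering just two of the datatypes named above; the CHECKS table and the function is_valid are hypothetical, and the patterns follow the XSD lexical forms for integer and boolean.

```python
import re

# Hypothetical checkers for two of the datatypes listed above.
CHECKS = {
    "integer": re.compile(r"[+-]?\d+"),
    "boolean": re.compile(r"true|false|1|0"),
}

def is_valid(datatype, value):
    """Check a value against a datatype name, where "type*" means a list."""
    if datatype.endswith("*"):
        item_type = datatype[:-1]
        # List items are separated by whitespace.
        return all(is_valid(item_type, item) for item in value.split())
    return CHECKS[datatype].fullmatch(value.strip()) is not None

print(is_valid("integer*", "1 -2 +3"))
print(is_valid("integer*", "1 two"))
```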

There are also other features used to support XSD, but I don't want to make a lot of changes to the DTD parser, so they are not tested from the DTD.

