Overview of the XML and binary converter
How to use the data converter.
Storage format commands for the converter.
The actual data converter page.
What is new.
This is a companion project to the XML form editor. In fact, it was envisioned
before the form editor. I figured that the form editor would be useful to this
project, so I wrote that first. It turned out that the editor took up so much
time that only now am I working on the XML and binary conversion.
The idea is not to have a binary representation of XML data; rather, it is to
have an XML representation of binary data. The reason is that there are lots of
legacy files out there that need a special program to read their content. If we
have an XML representation, then the data can be read without special software. In
order to map the binary data to XML, we need a description of the
data so it can be converted. This is known as the Data Format Description Language
(DFDL), and the specification is being worked on by the
DFDL-WG. However, I have not
seen much progress. A partial implementation of a variant that tries to stay close
to the latest DFDL draft can be found in
"Virtual XML Garden".
However, Microsoft has its own specification.
A few other companies are probably doing it their own way.
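To make the idea of a description-driven conversion concrete, here is a minimal sketch in Python. It is not DFDL and not the rule set used by this converter; the `DESCRIPTION` table, the field names, and the `binary_to_xml` helper are all made up for illustration. It assumes a fixed-layout, little-endian record with no padding.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical format description: (element name, struct format code).
# The field names and layout are invented for this sketch.
DESCRIPTION = [
    ("version", "H"),  # unsigned 16-bit integer
    ("width", "I"),    # unsigned 32-bit integer
    ("height", "I"),   # unsigned 32-bit integer
]

def binary_to_xml(data: bytes, root_name: str = "config") -> str:
    """Convert one fixed-layout binary record to XML using DESCRIPTION."""
    # "<" selects little-endian byte order with no alignment padding.
    layout = "<" + "".join(code for _, code in DESCRIPTION)
    values = struct.unpack(layout, data[:struct.calcsize(layout)])
    root = ET.Element(root_name)
    for (name, _), value in zip(DESCRIPTION, values):
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

# Ten bytes of sample binary data: version 1, width 640, height 480.
sample = struct.pack("<HII", 1, 640, 480)
xml_text = binary_to_xml(sample)
print(xml_text)
```

The point of the sketch is only that the description, not the program, carries the knowledge of the file layout; changing `DESCRIPTION` changes what is read without touching the conversion code.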
My own interest did not start from this DFDL work.
Rather, I got my inspiration from the Macintosh world, where binary resource
data are described in a somewhat similar manner. However, there are lots of
problems: you have Rez/DeRez if you want to do it in batch mode, and you have to
learn a new language. If you want to do it interactively, you used to have the
Mac OS 8 ResEdit program, which has a completely different template system.
Years ago I wrote some software that can be used to map binary data into
scripting-language data structures. So I have been thinking that if I can map
binary data to XML, it will be in a language familiar to many people and supported by
many tools, including XML-specific editors. By combining XML binary conversion and
an interactive XML editor, you get an easy way to edit a binary file. So before I
do a web-based XML binary conversion tool, I should do the web-based XML editor.
Now that the editor is running for the most part, I turn my attention to the data
converter.
However, there is one problem. With the editor, I just accepted XML Schema as the
standard. It is a very complicated standard and not easy to implement in full,
but at least it is an obvious standard. With data conversion, there is no
standard that I am willing to commit to at this time. This is one reason I was
reluctant to start the project. Finally I decided to use my own rules so I could
proceed. The goal is not to establish yet another standard. Rather, it is to
establish what I consider to be the basic features; then I can see whether I
can fit them inside any of the standards. It seems that most work out there is
mostly interested in legacy business data, mostly written by mainframes using
COBOL. Obviously that is where the money is. However, this is not my target.
Would any company want to post their customer records to my web site to convert
them to XML data? This is not very likely. And would I want to have megabytes of data
posted to me? I would abort as soon as anyone tried to do so. I am mainly interested
in converting and editing small configuration files. Therefore I will
concentrate on certain types of data files. The converter will not work well, or
work at all, with many other types of data files, at least for the immediate
future. For example, I don't deal with EBCDIC, and the support for decimal types is
very weak. So this is not the place to deal with business data files. The
target audience is people who occasionally need to convert small amounts of data.
They do not want to download any converter. They want a converter that gives
instant feedback so they can experiment with it.
One should keep in mind that this is about an XML representation of binary data, not
the other way around. This means that we try to have a design in which most binary
files can have an XML representation. Failure means that there is a need for improvement.
However, we do not strive for a binary representation of every XML file. It is
perfectly acceptable that there are some XML data that we fail to convert to binary.
If the purpose were only to have an XML representation of legacy data, then
conversion in one direction would be enough. However, since my goal is to be able to
edit binary files by XML editing, we need conversion in the other direction. So the
rule is that any XML data generated from a binary file must be convertible back
to binary. For any other XML data there can be no guarantee. Having said that, I will
try to have a binary representation for most XML files, even if the generated binary
is unlikely to correspond to any legacy file.
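The round-trip rule can be illustrated with a small Python sketch. This is only an illustration under assumed names, not the converter's actual code: the `FIELDS` layout, the `settings` element, and the `to_xml` and `to_binary` helpers are hypothetical. It shows that XML produced from binary converts back to the same bytes, while hand-written XML may not convert at all (here, a value that does not fit in 16 bits).

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical layout of a small configuration record (names are made up).
FIELDS = [("port", "H"), ("timeout", "I")]  # 16-bit and 32-bit unsigned
LAYOUT = "<" + "".join(code for _, code in FIELDS)  # little-endian, no padding

def to_xml(data: bytes) -> str:
    """Binary -> XML, one child element per described field."""
    root = ET.Element("settings")
    for (name, _), value in zip(FIELDS, struct.unpack(LAYOUT, data)):
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

def to_binary(xml_text: str) -> bytes:
    """XML -> binary; raises struct.error if a value does not fit its field."""
    root = ET.fromstring(xml_text)
    values = [int(root.findtext(name)) for name, _ in FIELDS]
    return struct.pack(LAYOUT, *values)

# XML generated from binary must convert back to the same bytes.
original = struct.pack(LAYOUT, 8080, 30)
round_trip_ok = to_binary(to_xml(original)) == original

# Arbitrary XML carries no such guarantee: 70000 does not fit in 16 bits.
hand_written = "<settings><port>70000</port><timeout>30</timeout></settings>"
try:
    to_binary(hand_written)
    hand_written_ok = True
except struct.error:
    hand_written_ok = False

print(round_trip_ok, hand_written_ok)
```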
I also believe that the generated XML data should look like XML written by someone
knowledgeable in XML. It should carry little baggage from the original binary file.
An analogy is that an English essay should look like a written essay, and not what
you see when you hit "translate this page" in Google search. Here is
an example of the sort of translation we want to avoid. It is the goal of this
project to enable conversion to XML data with as good a style as possible.
When we are converting XML data, we assume that the XML data are valid.
We do some checking of the data, but it is minimal.
Therefore the user should make sure the XML data are correct,
perhaps by validating with some other tool; there are quite a few
validation tools out there. More data checking will be added
in future implementations, but that is not a goal at this design stage.
Anyway, this is version 0.1. There is a lot that is unimplemented, and there are quite
a few bugs. The design is not yet stable. It is meant to be a platform to experiment with.
While the data converter would be a great working partner for the form editor,
the converter at this stage is in a state of flux, so no attempt will be made to
tightly integrate the two now. Still, there is enough to see what this project is about.
Binary data can also be edited by using both the converter and the form editor. The
process goes like this:
converter -> form generator -> editor -> converter -> newly edited binary data.
First you have the binary data in the converter, and you choose the
binary to form generator option. The generated XML and the schema
are put into the form generator. You can then choose the options
in the form generator, although most likely you can just accept the
defaults. Then you can generate the editor and start editing the XML data.
When you are done with the editing, you click "generate XML". Instead of
being returned to you directly, the XML data go to the converter. And
from there you can choose the options and get back the newly edited binary data.
See here for more details.
A tighter integration would eliminate two of the steps and look like this:
converter -> editor -> newly edited binary data.
That would require more work, and then there is the issue of how to
choose options that differ from the defaults. This is to be solved
in the future.
As we are still in the design stage, any suggestions are appreciated.
Send feedback to feedback AT datamech DOT com
Send bug reports to bugs AT datamech DOT com
Allows the conversion result to go back to the converter textbox. Note that 64-bit
integers do not work because my ISP is still using Perl 5.6.1 while my test
machine is using 5.8. I am looking into a workaround.
Allows multiple occurrences of elements in the last element of a sequence stored as CSV.
Allows transfer of data between the converter and the form editor generator.