DTD manipulation

26 June, 2008 at 20:28 1 comment

Document Type Definition (DTD) is one of several SGML and XML schema languages; the term also refers to a document, or part of one, written in the DTD language. A DTD primarily expresses a schema through a set of declarations that conform to a particular markup syntax and describe a class, or type, of document in terms of constraints on its structure. A DTD may also declare constructs that are not always required to establish document structure but that may affect the interpretation of some documents. XML documents are described using a subset of DTD that imposes a number of restrictions on the document’s structure, as required by the XML standard (XML is itself an application of SGML optimized for automated parsing).

DTD is native to the SGML and XML specifications, and since its introduction other specification languages such as XML Schema and RELAX NG have been released with additional functionality.

As an expression of a schema, a DTD specifies, in effect, the syntax of an “application” of SGML or XML, such as the derivative language HTML or XHTML. This syntax is usually a less general form of the syntax of SGML or XML.

In a DTD, the structure of a class of documents is described via element and attribute-list declarations. Element declarations name the allowable set of elements within the document, and specify whether and how declared elements and runs of character data may be contained within each element. Attribute-list declarations name the allowable set of attributes for each declared element, including the type of each attribute value, if not an explicit set of valid value(s).
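The two kinds of declaration can be seen in a small example. The sketch below embeds a hypothetical "memo" DTD (invented for illustration, not taken from any real schema) as an internal subset and checks two documents against it with the JDK's standard JAXP parser. The class and method names are assumptions of this sketch.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

public class DtdDeclarationsDemo {

    // Hypothetical DTD: three element declarations and one attribute-list declaration.
    static final String DOCTYPE =
          "<!DOCTYPE memo [\n"
        + "  <!ELEMENT memo (to+, body)>\n"      // one or more <to>, then one <body>
        + "  <!ELEMENT to   (#PCDATA)>\n"        // character data only
        + "  <!ELEMENT body (#PCDATA)>\n"
        + "  <!ATTLIST memo priority (low|normal|high) \"normal\">\n" // explicit set of values
        + "]>\n";

    // Returns the validity errors reported for the given root element (empty list = valid).
    static List<String> validate(String rootElement) {
        List<String> errors = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override public void warning(SAXParseException e) { }
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
                @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(DOCTYPE + rootElement)));
        } catch (Exception e) {
            // Not well-formed, or a parser configuration problem.
            errors.add("fatal: " + e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        // Conforms to the declarations above.
        System.out.println(validate("<memo priority=\"high\"><to>Ann</to><body>Hi</body></memo>"));
        // Breaks both the content model (no <to>) and the attribute's value set.
        System.out.println(validate("<memo priority=\"urgent\"><body>Hi</body></memo>"));
    }
}
```

The first call should report no errors; the second should report at least one validity error, with message text that depends on the parser.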

Converting a DTD to an XML Schema with dtd2xs

dtd2xs converts complex, modularized XML DTDs and DTDs with
namespaces to XML Schemas. For examples of a dtd2xs conversion, see
the DocBook XML Schema generated from the DocBook XML DTD V4.2 and the
XSL-FO Schema generated from the XSL-FO DTD.

Platforms:
Win32, Linux

License:
Free for non-profit activities

dtd2xs example

Using dtd2xs to Convert DTDs with Namespaces

To handle namespaces correctly, dtd2xs must be given the namespace mappings and the target namespace prefix on the command line (they cannot be deduced from the DTD itself). By default, dtd2xs ignores elements with unmapped prefixes.

The following examples use the XSL Formatting Objects (XSL-FO) DTD. The DTD file, fo.dtd, contains the definition of standard XSL-FO plus extensions for the RenderX XSL-FO renderer. Standard elements and attributes are defined with the fo prefix; RenderX extensions use a different namespace and have the rx prefix.

To make a ‘pure’ XSL-FO Schema, use:

dtd2xs -t fo -m fo:http://www.w3.org/1999/XSL/Format fo.dtd

Note that the -t option specifies the target namespace prefix (fo), and the -m option maps this prefix to a namespace URI.

To make a ‘full’ Schema, which defines both the standard XSL-FO constructs and the RenderX extensions, use:

dtd2xs -t fo -m fo:http://www.w3.org/1999/XSL/Format -m rx:http://www.renderx.com/XSL/Extensions -i rx:rx.xsd fo.dtd fo+rx.xsd
dtd2xs -t rx -m rx:http://www.renderx.com/XSL/Extensions fo.dtd rx.xsd

Two commands are needed here. The first defines mappings for two namespace prefixes (fo and rx); fo is used as the target namespace prefix, and the additional option -i rx:rx.xsd means “import components with the rx prefix from schemaLocation rx.xsd”. The second command extracts the RenderX extensions into a separate schema file, rx.xsd, which is imported by the first schema file, fo+rx.xsd. It may look cumbersome, but the W3C XML Schema standard does not allow elements for different namespaces to be defined in a single Schema file.

Another caveat is the handling of attributes whose namespace differs from the namespace of the element they belong to. Since a validator must be able to reference the definitions of such attributes from an imported schema, dtd2xs generates those definitions at the top level of the schema.

Note

You must specify the target prefix and the prefix mappings if your DTD contains namespaces. Without them, dtd2xs will probably generate an incorrect Schema.

Converting a DTD to POJOs with PlainXML

PlainXML is a pure Java library that includes various lightweight XML processing tools.

Major features are:

  • Generation of POJOs from a DTD;
  • Use of typed POJOs to manipulate XML documents in Java;
  • XML-POJO mapping using either Java 5 annotations or a DTD with processing instructions;
  • Ability to access and modify an XML document using POJOs instead of SAX or DOM;
  • Custom preprocessing of XML documents using an expression language;
  • Support for a “binary” XML format;
  • RMI-friendly XML marshalling;
  • Export of a JavaBeans tree to JSON.

PlainXML Examples follow this

Other DTD Tools:

Visual DTD Editor

Design DTDs easily and intuitively with a built-in DTD Editor! The DTD Editor lets you work with text and tree views to review and modify DTD elements, attributes, and properties.

DTD Validator

Validate XML using any DTD processor (MSXML, Microsoft .NET, Xerces, and others)! A DTD is easily associated, internally or externally, with any XML document, making DTD validation and syntax-checking a snap.

DTD Generator

Auto-generate DTDs from any XML document using the Stylus Studio DTD generator. DTDs can be automatically inserted using the !DOCTYPE declaration, or you can generate external DTDs for validating multiple XML documents.
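To give a rough idea of what generating a DTD from an instance document involves, here is a deliberately naive sketch using only the JDK. It is not how Stylus Studio does it (that is not documented here); the class name, method names and the inference rules are all assumptions made for illustration. Every observed child becomes part of a repeatable choice group, and every observed attribute is declared as optional CDATA, so the result is a loose DTD that accepts the sample document.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class NaiveDtdGenerator {

    // What has been observed for one element name across the whole document.
    static final class Seen {
        final Set<String> children = new LinkedHashSet<>();
        final Set<String> attributes = new LinkedHashSet<>();
        boolean hasText;
    }

    // Infers a permissive DTD (external-subset form) from one sample document.
    static String inferDtd(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            Map<String, Seen> seen = new LinkedHashMap<>();
            collect(root, seen);
            StringBuilder dtd = new StringBuilder();
            for (Map.Entry<String, Seen> e : seen.entrySet()) {
                dtd.append("<!ELEMENT ").append(e.getKey()).append(' ')
                   .append(contentModel(e.getValue())).append(">\n");
                for (String attr : e.getValue().attributes) {
                    dtd.append("<!ATTLIST ").append(e.getKey()).append(' ')
                       .append(attr).append(" CDATA #IMPLIED>\n");
                }
            }
            return dtd.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse the sample document", e);
        }
    }

    // Walks the tree, recording child names, attribute names and non-blank text.
    static void collect(Element el, Map<String, Seen> seen) {
        Seen s = seen.computeIfAbsent(el.getTagName(), k -> new Seen());
        NamedNodeMap attrs = el.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            s.attributes.add(attrs.item(i).getNodeName());
        }
        for (Node n = el.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                s.children.add(((Element) n).getTagName());
                collect((Element) n, seen);
            } else if (n.getNodeType() == Node.TEXT_NODE && !n.getNodeValue().trim().isEmpty()) {
                s.hasText = true;
            }
        }
    }

    // EMPTY, text only, a repeatable choice of children, or mixed content.
    static String contentModel(Seen s) {
        if (s.children.isEmpty()) {
            return s.hasText ? "(#PCDATA)" : "EMPTY";
        }
        String names = String.join("|", s.children);
        return s.hasText ? "(#PCDATA|" + names + ")*" : "(" + names + ")*";
    }

    public static void main(String[] args) {
        System.out.print(inferDtd(
            "<order><item sku=\"A1\">Pen</item><item sku=\"B2\">Ink</item></order>"));
    }
}
```

A real generator would also have to infer ordering, cardinality and attribute types, which usually needs more than one sample document.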

Convert DTD to Schema

Convert your legacy DTD data models to the official W3C XML Schema format using built-in DTD to XSD conversion tools. Converting DTD to XML Schema provides support for both built-in and custom data types.

DTD to XML

Use a DTD to create XML documents! A simple DTD-to-XML document wizard gives you a leg-up on even the most complex XML.

DTD Parsers

DTD parsers supported in Stylus Studio include both DOM and SAX-based XML DTD parsing components. An internal DTD parser helps you quickly validate XML documents, or you can select one of several supported DTD validation parsers for XML, such as MSXML, .NET, and Xerces.

Using DTD in Java

Use the DTDs that you develop in Stylus Studio inside your Java applications. Using Xerces-J, you can easily parse DTDs as part of your XML validation.
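As a minimal sketch of DTD validation from Java, the example below uses the standard JAXP SAX API, which the JDK implements with a parser derived from Xerces. With Xerces-J on the classpath the same code should apply, though that is an assumption rather than something tested here. The DTD name order.dtd, its content and the class name are hypothetical; the DTD is served from memory through an EntityResolver so that the example does not depend on a file on disk.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ExternalDtdDemo {

    // Stand-in for an external DTD file; name and content are hypothetical.
    static final String ORDER_DTD =
          "<!ELEMENT order (item+)>\n"
        + "<!ELEMENT item (#PCDATA)>\n"
        + "<!ATTLIST item sku CDATA #REQUIRED>\n";

    // Returns the validity errors reported for the document (empty list = valid).
    static List<String> validate(String xml) {
        List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            XMLReader reader = factory.newSAXParser().getXMLReader();
            // Answer requests for order.dtd from memory; anything else is left to the parser.
            reader.setEntityResolver((publicId, systemId) ->
                systemId != null && systemId.endsWith("order.dtd")
                    ? new InputSource(new StringReader(ORDER_DTD))
                    : null);
            // Collect validity errors; fatal (well-formedness) errors are rethrown by DefaultHandler.
            reader.setErrorHandler(new DefaultHandler() {
                @Override public void error(SAXParseException e) { errors.add(e.getMessage()); }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            errors.add("fatal: " + e.getMessage());
        }
        return errors;
    }

    public static void main(String[] args) {
        String doctype = "<!DOCTYPE order SYSTEM \"order.dtd\">";
        // Valid: every item carries the required sku attribute.
        System.out.println(validate(doctype + "<order><item sku=\"A1\">Pen</item></order>"));
        // Invalid: the required sku attribute is missing.
        System.out.println(validate(doctype + "<order><item>Pen</item></order>"));
    }
}
```

In a real application the resolver would typically map the system identifier to a DTD bundled with the application, so that validation works without network or file-system lookups.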

Using DTD in Microsoft .NET

The DTDs that you develop in Stylus Studio can be used inside both traditional Microsoft COM-based applications and newer Microsoft .NET applications. DTD parsing can be implemented in .NET languages such as C# and Visual Basic.

References:

[http://www.lumrix.net/]
[http://www.soft-amis.com]
[wikipedia.org]
[http://www.stylusstudio.com/]



1 Comment

  • 1. Peter Hickman  |  16 July, 2008 at 14:03

    Hey..this could be what I’m looking for! (Trying to read old SGML from the FDA into SAS.)
