Tip: Convert from HTML to XML with HTML Tidy

by Benoit Marchal
Marchal.com

Listing 3. index-transform.xml (an excerpt)

<?xml version="1.0" encoding="MacRoman"?>
<gl:gallery xmlns:gl="http://ananas.org/2003/tips/gallery">
<gl:title>Journey to Windsor</gl:title>
<gl:photo>
<gl:title>Windsor Castle</gl:title>
<gl:date>July 2003</gl:date>
<gl:image>dscn0824.jpg</gl:image>
<gl:description>A bright, red mailbox inside the castle.
  It seems oddly familiar in an historic setting.</gl:description>
</gl:photo>
</gl:gallery>

Listing 4. cleanup.xsl

<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:gl="http://ananas.org/2003/tips/gallery"
                xmlns:html="http://www.w3.org/1999/xhtml"
                exclude-result-prefixes="html">
 
<xsl:output method="xml" indent="yes" encoding="MacRoman"/>
 
<xsl:template match="html:html">
  <!-- The date is read once, outside the loop, and reused for every photo. -->
  <xsl:variable name="date"
                select="html:body/html:table/html:tr/html:td[2]
                        /html:font/html:br[3]
                        /preceding-sibling::text()[1]"/>
  <gl:gallery>
    <gl:title>
      <xsl:value-of select="html:head/html:title"/>
    </gl:title>
    <xsl:for-each select="html:body/html:center/html:table
                          /html:tr/html:td">
      <!-- Each field is the nearest text node that precedes a given
           br element inside the cell's font element. -->
      <xsl:variable name="title"
                    select="html:font/html:br[3]
                            /preceding-sibling::text()[1]"/>
      <xsl:variable name="image"
                    select="html:font/html:br[1]
                            /preceding-sibling::text()[1]"/>
      <xsl:variable name="description"
                    select="html:font/html:br[2]
                            /preceding-sibling::text()[1]"/>
      <gl:photo>
        <gl:title><xsl:value-of
          select="normalize-space($title)"/></gl:title>
        <gl:date><xsl:value-of
          select="normalize-space($date)"/></gl:date>
        <gl:image><xsl:value-of
          select="normalize-space($image)"/></gl:image>
        <gl:description><xsl:value-of
          select="normalize-space($description)"/></gl:description>
      </gl:photo>
    </xsl:for-each>
  </gl:gallery>
</xsl:template>
</xsl:stylesheet>
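
To make the XPath expressions in Listing 4 easier to follow, here is a sketch of the kind of table cell they assume. This fragment is hypothetical: it is not taken from the gallery page, and the exact whitespace and the placement of the namespace declaration are assumptions. Only the element nesting and the order of the text runs follow from the stylesheet.

```xml
<!-- Hypothetical cell, shaped to match the paths in Listing 4.
     In a real Tidy document the namespace would normally be declared
     on the root html element rather than on the cell itself. -->
<td xmlns="http://www.w3.org/1999/xhtml">
  <font>dscn0824.jpg<br />
  A bright, red mailbox inside the castle.<br />
  Windsor Castle<br /></font>
</td>
```

Against a cell shaped like this, html:font/html:br[1]/preceding-sibling::text()[1] selects the file name, the same expression with br[2] selects the description, and with br[3] the title. The calls to normalize-space() then trim the line breaks and indentation around each text run.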


About The Author:

Benoit Marchal is a Belgian consultant. He is the author of XML by Example and other XML books. Benoit is available to help you with XML projects. You can contact him at bmarchal@pineapplesoft.com or through his personal site at marchal.com.
