XML Notes

XML -> XSL perspective

The following XML processing overview article is excellent and (mercifully) succinct:
          http://www.zend.com/php5/articles/php5-xmlphp.php
For XML tutorials, read:
          http://www.w3schools.com/xml/default.asp
For an XML (and supporting component technologies) overview, see:
          http://www.w3.org/XML/1999/XML-in-10-points

Acronyms

DOM

DOM (Document Object Model) is a W3C standard for representing XML data as a hierarchy of objects.  The PHP5 DOM implementation corresponds to the W3C DOM object-based API specification, available at http://www.w3.org/DOM.

The DOM standard is verbose, trying to anticipate every imaginable situation.  “Nodes” are loosely analogous to rows in database parlance.
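
As an illustration of the object hierarchy, here is a minimal sketch that builds a small document with PHP's DOMDocument class and then walks it. The element names (library, book, title) and the sample titles are made up for the example.

```php
<?php
// Build a small document with the DOM API, then walk it.
$dom = new DOMDocument('1.0', 'UTF-8');
$dom->formatOutput = true;

$library = $dom->createElement('library');   // hypothetical element names
$dom->appendChild($library);

foreach (['Dune', 'Emma'] as $title) {
    $book = $dom->createElement('book');
    $book->setAttribute('lang', 'en');
    $book->appendChild($dom->createElement('title', $title));
    $library->appendChild($book);
}

// Every element, attribute and text run is a node in the tree.
$titles = [];
foreach ($dom->getElementsByTagName('title') as $node) {
    $titles[] = $node->textContent;
}

echo $dom->saveXML();
echo implode(', ', $titles), "\n";
```

The verbosity is visible even here: each element is created, populated and attached in separate calls.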

DTD

DTD = Document Type Definition
A DTD defines the structure of an XML document with a list of legal elements and attributes.
A DTD can be declared inline in your XML document or as an external reference.
See: http://www.w3schools.com/dtd/dtd_intro.asp
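
A minimal sketch of an inline DTD, checked with DOMDocument::validate(). The note/to/body vocabulary is invented for the example; the second document is the same one with its body element removed, which the DTD should reject.

```php
<?php
// An XML document carrying its DTD inline (internal subset).
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to, body)>
  <!ELEMENT to   (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note><to>Alice</to><body>Hello</body></note>
XML;

libxml_use_internal_errors(true);   // collect validation errors instead of emitting warnings

$dom = new DOMDocument();
$dom->loadXML($xml);
$isValid = $dom->validate();        // checks the tree against the DTD

// The same DTD applied to a document whose <body> is missing.
$broken = new DOMDocument();
$broken->loadXML(str_replace('<body>Hello</body>', '', $xml));
$brokenIsValid = $broken->validate();

var_dump($isValid, $brokenIsValid);
```

Note that validity is a stricter property than well-formedness: the second document still parses, it just does not match the declared structure.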

HTML 4

In addition to being able to load HTML documents, the PHP DOM extension can also save them as HTML 4.  Use $dom->saveHTML() once you have built up your DOM document.
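
A short sketch of the round trip, assuming a trivial input page; the class name "greeting" is made up for the example.

```php
<?php
// Load lenient HTML, modify the tree, and serialize it back as HTML.
libxml_use_internal_errors(true);   // real-world HTML often triggers parser warnings

$dom = new DOMDocument();
$dom->loadHTML('<html><body><p>Hello<br>world</p></body></html>');

$p = $dom->getElementsByTagName('p')->item(0);
$p->setAttribute('class', 'greeting');   // hypothetical class name

$html = $dom->saveHTML();
echo $html;
```

Unlike saveXML(), saveHTML() writes void elements such as br in HTML style rather than as self-closing XML tags.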

XML Namespaces

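A namespace is declared on an element with an attribute of the form:
xmlns:namespacePrefix="namespaceURI"
The prefix then qualifies element and attribute names that belong to that namespace.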
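
A minimal sketch of how a prefix maps to a namespace URI when reading a document with DOM. The bk prefix and the http://example.com/books URI are placeholders chosen for the example.

```php
<?php
$xml = <<<'XML'
<catalog xmlns:bk="http://example.com/books">
  <bk:title>Dune</bk:title>
  <title>Not in the books namespace</title>
</catalog>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

// Look elements up by namespace URI + local name; the prefix is only shorthand.
$matches = $dom->getElementsByTagNameNS('http://example.com/books', 'title');

echo $matches->length, "\n";
echo $matches->item(0)->textContent, "\n";
echo $dom->documentElement->lookupNamespaceURI('bk'), "\n";
```

Only the prefixed title element is matched, because the unprefixed one is in no namespace.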

XML-RPC

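XML-RPC (XML Remote Procedure Call) is gaining popularity relative to SOAP, particularly in multi-server environments: it lets components distributed across several servers call one another over HTTP using a simpler message format, which can yield performance advantages. For example (using the PEAR XML_RPC package):
<?php
require_once 'XML/RPC.php';   // PEAR XML_RPC package

// Parameters are typed; a text value must be declared as 'string', not 'int'.
$sParameters = array(new XML_RPC_Value("Hello World", 'string'));
$msg = new XML_RPC_Message("MyWebService.MyFunction", $sParameters);

// Path, host and port of the XML-RPC endpoint.
$client = new XML_RPC_Client("/RPC2", "localhost", 80);
$response = $client->send($msg);

// The returned value, still wrapped as an XML_RPC_Value.
$v = $response->value();
?>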

SAX

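… Simple API for XML, an event-based API: the parser reports each start tag, end tag and run of text to callback functions as it reads the document, rather than building a tree in memory.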
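
A minimal sketch of event-based parsing using PHP's xml extension (xml_parser_create and friends), which follows the SAX model. The sample document and the $events log are made up for the example.

```php
<?php
$xml = '<library><book>Dune</book><book>Emma</book></library>';

$events = [];   // log of parser events, in the order they fire

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // keep element names as written

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$events) { $events[] = "start:$name"; },
    function ($parser, $name) use (&$events) { $events[] = "end:$name"; }
);
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$events) { $events[] = "text:$data"; }
);

xml_parse($parser, $xml, true);   // true = this is the final chunk of input

echo implode("\n", $events), "\n";
```

Because nothing is retained unless a handler stores it, this style suits documents too large to load as a DOM tree.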

well-formed XML

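… means that each opening tag must be accompanied by a matching closing tag, elements must be properly nested, and the document must have a single root element.  XML is case-sensitive, so <To> and </to> do not match.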
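
One way to test well-formedness in PHP is simply to try parsing the text. The sketch below assumes a helper named isWellFormed(), which is not part of PHP and is defined here only for illustration.

```php
<?php
// Returns true when $xml parses without errors, i.e. is well-formed.
// (Hypothetical helper for this example.)
function isWellFormed(string $xml): bool
{
    $previous = libxml_use_internal_errors(true);   // suppress parser warnings
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok;
}

var_dump(isWellFormed('<note><to>Alice</to></note>'));   // properly nested and closed
var_dump(isWellFormed('<note><to>Alice</note>'));        // <to> is never closed
var_dump(isWellFormed('<note><To>Alice</to></note>'));   // case mismatch: <To> vs </to>
```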

XPath

XPath is analogous to SQL for XML.  XPath allows querying an XML document for specific nodes (rows) that match some criteria.  XPath is usually much easier to use, often faster in execution and requires less code than the standard DOM methods.  Using XPath with SimpleXML gives particularly succinct code.
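
To make the SQL comparison concrete, here is a minimal sketch using SimpleXML's xpath() method. The library document and its year attribute are invented for the example.

```php
<?php
$xml = <<<'XML'
<library>
  <book year="1965"><title>Dune</title></book>
  <book year="1815"><title>Emma</title></book>
  <book year="1984"><title>Neuromancer</title></book>
</library>
XML;

$library = simplexml_load_string($xml);

// Roughly "SELECT title FROM book WHERE year > 1900".
$titles = array_map('strval', $library->xpath('/library/book[@year > 1900]/title'));

print_r($titles);
```

The equivalent with plain DOM methods would need an explicit loop over the book elements and a manual attribute comparison.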

XSL

XSL (the eXtensible Stylesheet Language) was designed to fill the need for an XML-based stylesheet language, as CSS fills that need for HTML.  XSL is the preferred stylesheet language for XML and describes how an XML document should be displayed; it is far more sophisticated (and complicated and verbose) than CSS.  It is also more than a stylesheet language.
XSL consists of:

  1. XSLT - a language for transforming XML documents
  2. XPath - a language for navigating in XML documents
  3. XSL-FO - a language for formatting XML documents

A brief XSL tutorial is available at:
          http://www.w3schools.com/xsl/default.asp

XSLT

XSLT is the XML transformation portion of the XSL specification (the remaining XSL specifications are XSL-FO and XPath).  XSLT 1.0 has been a W3C Recommendation since 1999.
An XSLT stylesheet is used to transform an XML document into another document, e.g. another XML document, PDF, HTML, RTF, TeX, delimited files, binary files or any other format that the XSLT processor is capable of producing.
Normally, XSLT does this by transforming each XML element into an (X)HTML element.  XSLT processing usually begins by reading a serialized XML input document into the source tree and ends by writing the result tree to an output document, i.e. XSLT transforms an XML source tree into a result tree.
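
The source-tree to result-tree flow can be sketched with PHP 5's XSLTProcessor class (this assumes the xsl extension is enabled). The library/book input and the stylesheet are made up for the example; each book element is turned into an li element.

```php
<?php
// Requires PHP's xsl extension (libxslt).
$source = new DOMDocument();
$source->loadXML('<library><book>Dune</book><book>Emma</book></library>');

$xsl = <<<'XSL'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="no"/>
  <xsl:template match="/">
    <ul>
      <xsl:apply-templates select="library/book"/>
    </ul>
  </xsl:template>
  <!-- each <book> element becomes an <li> element -->
  <xsl:template match="book">
    <li><xsl:value-of select="."/></li>
  </xsl:template>
</xsl:stylesheet>
XSL;

$stylesheet = new DOMDocument();
$stylesheet->loadXML($xsl);

$processor = new XSLTProcessor();
$processor->importStylesheet($stylesheet);

$result = $processor->transformToXml($source);   // result tree serialized as a string
echo $result;
```

Both the input document and the stylesheet are loaded as DOM trees first, which is why the stylesheet itself must be well-formed XML.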

XSLT PHP history

Two XSLT processors were implemented in PHP 4:
Sablotron (using the more widely known xslt extension) and libxslt (using the domxml extension). 
The two APIs were not compatible; their feature sets were different.
In PHP 5, only the libxslt processor is supported.  libxslt was chosen because it is based on libxml2 and so fits the XML design of PHP 5.  libxslt is also one of the fastest XSLT implementations available (execution speed can be double that of Sablotron).

Sablotron

Sablotron is an XSLT, DOM and XPath processor, incorporated in PHP since version 0.36 in 2000.  The Sablotron project aims to create a lightweight, reliable and fast XML processor library, conforming to the W3C specifications and freely available for public use as a base for multi-platform XML applications.  Sablotron is written in C++ to keep it as portable as possible.

XSLT elements are defined here:
          http://www.w3schools.com/xsl/xsl_w3celementref.asp
         
XSLT functions are defined here:
          http://www.w3schools.com/xsl/xsl_functions.asp
          http://www.w3.org/TR/xquery-operators/

The default prefix for the function namespace is fn.

Putting it all together

Since an XSL stylesheet is itself an XML document, it normally begins with an XML declaration:
  <?xml version="1.0"  encoding="ISO-8859-1"?>
Next, an <xsl:stylesheet> element identifies the document as an XSLT stylesheet (it carries the version number and XSLT namespace attributes).
An <xsl:template> element defines a template.  The match="/" attribute associates the template with the root of the XML source document.
The content inside the <xsl:template> element defines HTML to output.

XSL components … in brief

To access XSLT elements, attributes and features we must declare the XSLT namespace at the top of the .xsl document, as follows:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
Elements in the xsl namespace include stylesheet and template.
Reference the .xsl document near the top of the .xml document (to be styled):
  <?xml-stylesheet type="text/xsl"  href="chosenStyle.xsl"?>
An XSLT-compliant browser will then transform the XML into XHTML.

The select attribute of the <xsl:value-of> element is an XPath expression; the element outputs the value of the first node in the selected node-set.
The <xsl:for-each> element iterates over every node in the node-set (row set) selected by its XPath expression.
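
The difference between the two elements can be sketched as follows, again assuming PHP's xsl extension. The catalog/cd/title input is invented for the example, and text output is used to keep the result easy to read.

```php
<?php
// Requires PHP's xsl extension (libxslt).
$source = new DOMDocument();
$source->loadXML('<catalog><cd><title>Blue</title></cd><cd><title>Kind</title></cd></catalog>');

$xsl = <<<'XSL'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- value-of: string value of the FIRST node in the node-set -->
    <xsl:text>first=</xsl:text>
    <xsl:value-of select="catalog/cd/title"/>
    <!-- for-each: visits every node in the node-set -->
    <xsl:text>;all=</xsl:text>
    <xsl:for-each select="catalog/cd">
      <xsl:value-of select="title"/>
      <xsl:if test="position() != last()">,</xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
XSL;

$stylesheet = new DOMDocument();
$stylesheet->loadXML($xsl);

$processor = new XSLTProcessor();
$processor->importStylesheet($stylesheet);

$result = $processor->transformToXml($source);
echo $result, "\n";
```

Inside the for-each body the context node is the current cd element, so select="title" is resolved relative to it.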

XSL parameters

To pass HTML as a parameter to XSL, use <xsl:copy-of …> rather than <xsl:value-of …> (value-of outputs only the text of the parameter, dropping its markup), as follows:
<xsl:template name="errorbox">
  <xsl:param name="message" select="''" />

  <div class="errorbox">
    <h2><xsl:text>Error: </xsl:text></h2>
    <div class="boxcontent">
      <xsl:copy-of select="$message"/>
    </div>
  </div>
</xsl:template>
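
A sketch of calling such a template with markup in the parameter, run through PHP's XSLTProcessor (xsl extension assumed). The template here is a cut-down variant of errorbox with a second, hypothetical "plain" div added to show what value-of would produce from the same parameter.

```php
<?php
// Requires PHP's xsl extension (libxslt).
$xsl = <<<'XSL'
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="no"/>

  <xsl:template match="/">
    <xsl:call-template name="errorbox">
      <xsl:with-param name="message">Disk <b>full</b></xsl:with-param>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="errorbox">
    <xsl:param name="message" select="''"/>
    <!-- copy-of keeps the markup inside the parameter -->
    <div class="errorbox"><xsl:copy-of select="$message"/></div>
    <!-- value-of keeps only its text (hypothetical div, for contrast) -->
    <div class="plain"><xsl:value-of select="$message"/></div>
  </xsl:template>
</xsl:stylesheet>
XSL;

$stylesheet = new DOMDocument();
$stylesheet->loadXML($xsl);

$source = new DOMDocument();
$source->loadXML('<root/>');   // the input is irrelevant to this demonstration

$processor = new XSLTProcessor();
$processor->importStylesheet($stylesheet);

$result = $processor->transformToXml($source);
echo $result;
```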

Templates

Attempt to generate all HTML from template files.

Separate logic from presentation – to allow output in any format. 

PHP makes it fast and easy to mix code and HTML, which is why it gained popularity as a templating language.  PHP code in templates is fine, as long as it deals only with presentation (the view).

The problem with many open source PHP projects is that they are coded with no clear separation between logic and content.  At some point they grow unmanageable.

Bulletin Board system implementation

BB code is presentation text in encoded form that needs to be converted to HTML before output. That involves some logic to match tags, deal with badly formed tags and, of course, check for any nasties (such as injected HTML or script).
Custom template tags, such as those of Smarty or WACT, can be replaced with PHP code blocks.  That way, presentation designers can use WYSIWYG editors and pages can be served efficiently.
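
The BB code steps above (escape first, then match tags, leave badly formed ones alone) can be sketched as follows. The function name bbcodeToHtml() and the two-tag subset ([b] and [i]) are assumptions made for the example; a real board would support many more tags and handle nesting more carefully.

```php
<?php
// Convert a tiny, hypothetical subset of BB code ([b], [i]) to HTML.
function bbcodeToHtml(string $text): string
{
    // 1. Neutralize "nasties": escape any raw HTML before adding our own tags.
    $html = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // 2. Replace only properly paired tags; unmatched ones stay as literal text.
    $rules = [
        '~\[b\](.*?)\[/b\]~s' => '<strong>$1</strong>',
        '~\[i\](.*?)\[/i\]~s' => '<em>$1</em>',
    ];
    return preg_replace(array_keys($rules), array_values($rules), $html);
}

echo bbcodeToHtml('[b]Hi[/b] <script>alert(1)</script> [i]unclosed'), "\n";
```

Escaping before substitution matters: done in the other order, the generated strong and em tags would themselves be escaped.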
