Found at: http://publish.ez.no/article/articleprint/76/

Parsing XML using PHP



Parsing XML is a very common task when programming for the web. This article will show you how to use the eZ xml parser to handle XML documents.

Installing eZ xml

eZ xml is an XML parser written in pure PHP; it doesn't need any external libraries. You can always get the latest version of eZ xml from http://developer.ez.no/developer/download/ezxml/stable

After downloading the tarball, unpack it in your source directory. Then include the eZ xml class and you're ready to go.


//  include eZ xml class
include_once( "ezxml/classes/ezxml.php" );


The XML file

Below is the XML file we want to parse. It is fairly simple, but it contains both attributes and subnodes, so you should be able to adapt this example to your specific needs.


<?xml version="1.0"?>
<document version="42">
  <person>
    <firstname value="Bård" />
    <lastname value="Farstad" />
    <description>
    Coder.
    </description>
  </person>
  <person>
    <firstname value="Christoffer A." />
    <lastname value="Elo" />
    <description>
    Coder.
    </description>
  </person>
</document>


Create the document object tree

First we need to get the XML document into a PHP string. Here the string is initialized directly in the code; normally it would be read from a database, file, socket or similar. Once we have the string with the XML document, we pass it to the eZXML::domTree() function, which returns a document object tree.


// Initialize the XML string
$xmlDocument =
"<?xml version=\"1.0\"?>
<document version=\"42\">
  <person>
    <firstname value=\"Bård\" />
    <lastname value=\"Farstad\" />
    <description>
    Coder.
    </description>
  </person>
  <person>
    <firstname value=\"Christoffer A.\" />
    <lastname value=\"Elo\" />
    <description>
    Coder.
    </description>
  </person>
</document>";

// create the document object tree
$tree =& eZXML::domTree( $xmlDocument, array( "TrimWhiteSpace" => true  ) );
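
When the XML lives in a file rather than in the source code, the string can be filled with PHP's file_get_contents() function. The following is a minimal sketch, not part of the eZ xml distribution: the file name is hypothetical, and the file is written first only so that the example runs on its own.

```php
<?php
// Hypothetical file name, used only for this sketch.
$xmlFile = sys_get_temp_dir() . "/persons-example.xml";

// Write a small document first so the sketch is self-contained.
file_put_contents( $xmlFile, "<?xml version=\"1.0\"?>\n<document version=\"42\" />\n" );

// Read the whole file into a string; file_get_contents() returns false on failure.
$xmlDocument = file_get_contents( $xmlFile );
if ( $xmlDocument === false )
{
    die( "Could not read $xmlFile" );
}

// $xmlDocument now holds the XML text and can be handed to eZXML::domTree().
print( strlen( $xmlDocument ) . " bytes read\n" );

// Clean up the temporary file.
unlink( $xmlFile );
```

Checking the return value with === false matters here, because an empty file yields an empty string, which a loose comparison would also treat as a failure.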

Traverse the document tree

In this example we have an XML file which consists of a document with a specific version. The document contains persons, each with the subnodes firstname, lastname and description. The code snippet below finds the document version and prints it. It's pretty straightforward: you just check all the top nodes of the tree and look for the one named document.


foreach ( $tree->children as $document )
{
    // parse the document
    if ( $document->name == "document" )
    {
        // get the document version attribute
        foreach ( $document->attributes as $documentAttr )
        {
            if ( $documentAttr->name == "version" )
            {
                print( "Found document with version: " . $documentAttr->content . "<br>" );
            }
        }

        // find persons here
    }
}


When you've found the document node you can start looking for persons. This is done in the same manner: check the child nodes and look for person.

To make the process of getting the attribute values simpler we write a helper function to fetch the attribute value from a node.


function getAttrValue( $node, $attrName )
{
    $ret = false;
     
    foreach ( $node->attributes as $nodeAttr )
    {
        if ( $nodeAttr->name == $attrName )
        {
            $ret = $nodeAttr->content;
        }
    }
    return $ret;
}
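
The helper can be tried without a parsed document. The sketch below is an illustration only: findAttrValue() is a hypothetical variant that stops at the first match and accepts a fallback value, and the stdClass objects are stand-ins that mimic just the properties used above (name, content, attributes), not real eZ xml nodes.

```php
<?php
// Hypothetical variant of getAttrValue(): returns at the first match
// and lets the caller choose the fallback value.
function findAttrValue( $node, $attrName, $default = false )
{
    foreach ( $node->attributes as $nodeAttr )
    {
        if ( $nodeAttr->name == $attrName )
        {
            return $nodeAttr->content;
        }
    }
    return $default;
}

// Stand-in attribute object with the two properties the helper reads.
$attr = new stdClass();
$attr->name = "value";
$attr->content = "Farstad";

// Stand-in node carrying that single attribute.
$node = new stdClass();
$node->name = "lastname";
$node->attributes = array( $attr );

$lastName = findAttrValue( $node, "value" );
$missing = findAttrValue( $node, "id", "n/a" );

// Compare with !== false: a value such as "0" or "" is a valid attribute value.
$hasId = ( findAttrValue( $node, "id" ) !== false );

print( "lastname: $lastName, id: $missing\n" );
```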


Now we're ready to parse the information describing the persons in this example. When we've found the person node we check all the subnodes and look for the nodes we want and fetch the information from these nodes.


// parse all persons
foreach ( $document->children as $person )
{
    if ( $person->name == "person" )
    {
        print( "Found a new person <br>" );

        $firstName = "";
        $lastName = "";
        $description = "";

        // get the name and description
        foreach ( $person->children as $personAttribute )
        {
            switch ( $personAttribute->name )
            {
                case "firstname" :
                {
                    $firstName = getAttrValue( $personAttribute, "value" );
                }break;

                case "lastname" :
                {
                    $lastName = getAttrValue( $personAttribute, "value" );
                }break;

                case "description" :
                {
                    // get the description text from the text child node
                    foreach ( $personAttribute->children as $descriptionNode )
                    {
                        // type 3 marks a text node
                        if ( $descriptionNode->type == 3 )
                        {
                            $description = $descriptionNode->content;
                        }
                    }
                }break;
            }
        }

        print( "The persons firstname is: $firstName <br>" );
        print( "The persons lastname is: $lastName <br>" );
        print( "The persons description is: $description <br>" );
    }            
}
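
The text lookup in the description case can be moved into a helper of its own, in the same spirit as getAttrValue(). This is a sketch under assumptions: getNodeText() is a hypothetical name, and the stdClass objects only mimic the children, type and content properties used in the snippet above.

```php
<?php
// Hypothetical helper: joins the content of all text children of a node.
function getNodeText( $node )
{
    $text = "";
    foreach ( $node->children as $child )
    {
        // 3 is the node type the snippet above checks for text content
        if ( $child->type == 3 )
        {
            $text .= $child->content;
        }
    }
    // Strip the whitespace that surrounds the text in the XML source.
    return trim( $text );
}

// Stand-in text node.
$textNode = new stdClass();
$textNode->type = 3;
$textNode->content = "\n    Coder.\n    ";

// Stand-in description node holding the text node as its only child.
$descriptionNode = new stdClass();
$descriptionNode->name = "description";
$descriptionNode->children = array( $textNode );

$descriptionText = getNodeText( $descriptionNode );
print( "The persons description is: $descriptionText <br>\n" );
```

With such a helper the description case shrinks to a single assignment, just like the firstname and lastname cases.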

Complete code listing

Below you will find the complete code listing for this example.


include_once( "ezxml/classes/ezxml.php" );

$xmlDocument =
"<?xml version=\"1.0\"?>
<document version=\"42\">
  <person>
    <firstname value=\"Bård\" />
    <lastname value=\"Farstad\" />
    <description>
    Coder.
    </description>
  </person>
  <person>
    <firstname value=\"Christoffer A.\" />
    <lastname value=\"Elo\" />
    <description>
    Coder.
    </description>
  </person>
</document>";

$tree =& eZXML::domTree( $xmlDocument, array( "TrimWhiteSpace" => true  ) );

foreach ( $tree->children as $document )
{
    // parse the document
    if ( $document->name == "document" )
    {
        // get the document version attribute
        foreach ( $document->attributes as $documentAttr )
        {
            if ( $documentAttr->name == "version" )
            {
                print( "Found document with version: " . $documentAttr->content . "<br>" );
            }
        }

        // parse all persons
        foreach ( $document->children as $person )
        {
            if ( $person->name == "person" )
            {
                print( "Found a new person <br>" );

                $firstName = "";
                $lastName = "";
                $description = "";

                // get the name and description
                foreach ( $person->children as $personAttribute )
                {
                    switch ( $personAttribute->name )
                    {
                        case "firstname" :
                        {
                            $firstName = getAttrValue( $personAttribute, "value" );
                        }break;

                        case "lastname" :
                        {
                            $lastName = getAttrValue( $personAttribute, "value" );
                        }break;

                        case "description" :
                        {
                            // get the description text from the text child node
                            foreach ( $personAttribute->children as $descriptionNode )
                            {
                                // type 3 marks a text node
                                if ( $descriptionNode->type == 3 )
                                {
                                    $description = $descriptionNode->content;
                                }
                            }
                        }break;
                    }
                }

                print( "The persons firstname is: $firstName <br>" );
                print( "The persons lastname is: $lastName <br>" );
                print( "The persons description is: $description <br>" );
            }            
        }
    }
}

/*!
  Function to fetch an attribute value.
  Will return the value of the attribute if found. False if not found.
*/
function getAttrValue( $node, $attrName )
{
    $ret = false;
     
    foreach ( $node->attributes as $nodeAttr )
    {
        if ( $nodeAttr->name == $attrName )
        {
            $ret = $nodeAttr->content;
        }
    }
    return $ret;
}


This code will produce the following output:

Found document with version: 42
Found a new person 
The persons firstname is: Bård 
The persons lastname is: Farstad 
The persons description is: Coder. 
Found a new person 
The persons firstname is: Christoffer A. 
The persons lastname is: Elo 
The persons description is: Coder. 


Using XML is simple and straightforward with the eZ xml class. You don't need any external libraries, and the class produces the same document tree as you would get from the XML functions in PHP (which all need external libraries). It is also the only PHP XML parser class which returns the same object tree as the library functions, making it easy to run your programs both on sites where XML is compiled into PHP and on sites where it isn't.
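
For comparison, here is a rough sketch of the same lookup with the DOM extension bundled with current PHP versions. That choice is an assumption: it may not be the set of library functions this article had in mind, and its property names differ from the ones used in the eZ xml examples. The XML is shortened to one person to keep the example brief.

```php
<?php
$xmlDocument =
"<?xml version=\"1.0\"?>
<document version=\"42\">
  <person>
    <firstname value=\"Christoffer A.\" />
    <lastname value=\"Elo\" />
    <description>
    Coder.
    </description>
  </person>
</document>";

// Parse the string with the DOM extension; loadXML() returns false on failure.
$dom = new DOMDocument();
if ( !$dom->loadXML( $xmlDocument ) )
{
    die( "Could not parse the XML document" );
}

// The root element carries the version attribute.
$version = $dom->documentElement->getAttribute( "version" );
print( "Found document with version: $version <br>\n" );

// Collect name and description for every person element.
$persons = array();
foreach ( $dom->getElementsByTagName( "person" ) as $person )
{
    $persons[] = array(
        "firstname" => $person->getElementsByTagName( "firstname" )->item( 0 )->getAttribute( "value" ),
        "lastname" => $person->getElementsByTagName( "lastname" )->item( 0 )->getAttribute( "value" ),
        "description" => trim( $person->getElementsByTagName( "description" )->item( 0 )->textContent )
    );
}

foreach ( $persons as $p )
{
    print( "The persons firstname is: " . $p["firstname"] . " <br>\n" );
    print( "The persons lastname is: " . $p["lastname"] . " <br>\n" );
    print( "The persons description is: " . $p["description"] . " <br>\n" );
}
```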

You can learn more about XML and related technologies at http://www.w3.org/. The XML specification is located at http://www.w3.org/TR/REC-xml

