php - Combine two elements in XML file


Keywords:php 


Question: 

First off thanks for any help!

I have an xml that looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE tv SYSTEM "xmltv.dtd">

<tv source-info-url="http://tvschedule.zap2it.com/" source-info-name="zap2it.com" generator-info-name="zap2xml" generator-info-url="zap2xml@gmail.com">
<channel id="I16689330.labs.zap2it.com">
    <display-name>502 WCBSDT</display-name>
    <display-name>502</display-name>
    <display-name>WCBSDT</display-name>
    <icon src="" />
</channel>
<programme start="20180303203000 -0500" stop="20180303230000 -0500" channel="I20453335.labs.zap2it.com">
    <title lang="en">NBA Basketball</title>
    <sub-title lang="en">Boston Celtics at Houston Rockets</sub-title>
    <desc lang="en">From the Toyota Center in Houston.</desc>
    <category lang="en">Sports</category>
    <category lang="en">Basketball</category>
    <length units="minutes">120</length>
    <icon src="" />
    <url>https://tvlistings.zap2it.com//overview.html?programSeriesId=SP00371600&amp;tmsId=SP003716000000</url>
    <episode-num system="dd_progid">SP00371600.0000</episode-num>
    <new />
    <subtitles type="teletext" />
</programme>
</tv>

I would like to generate something like this that combines title with sub-title:

<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE tv SYSTEM "xmltv.dtd">

<tv source-info-url="http://tvschedule.zap2it.com/" source-info-name="zap2it.com" generator-info-name="zap2xml" generator-info-url="zap2xml@gmail.com">
<channel id="I16689330.labs.zap2it.com">
    <display-name>502 WCBSDT</display-name>
    <display-name>502</display-name>
    <display-name>WCBSDT</display-name>
    <icon src="" />
</channel>
<programme start="20180303203000 -0500" stop="20180303230000 -0500" channel="I20453335.labs.zap2it.com">
    <title lang="en">NBA Basketball: Boston Celtics at Houston Rockets</title>
    <desc lang="en">From the Toyota Center in Houston.</desc>
    <category lang="en">Sports</category>
    <category lang="en">Basketball</category>
    <length units="minutes">120</length>
    <icon src="" />
    <url>https://tvlistings.zap2it.com//overview.html?programSeriesId=SP00371600&amp;tmsId=SP003716000000</url>
    <episode-num system="dd_progid">SP00371600.0000</episode-num>
    <new />
    <subtitles type="teletext" />
</programme>
</tv>

If it can be done with a PHP script, that would be preferable.


2 Answers: 

Assuming $string holds the XML of a single programme element (so programme is the root), we can parse it into a SimpleXMLElement object with simplexml_load_string:

$xml = simplexml_load_string($string);

And then access elements as object properties:

> echo $xml->title;
NBA Basketball
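
If $string holds the whole file rather than a single programme element, the root is tv and the programme has to be reached first. A minimal sketch of that access pattern, using a trimmed-down copy of the question's XML (the variable names are only illustrative):

```php
<?php
// Trimmed-down copy of the XMLTV document from the question (root is <tv>).
$string = <<<'XML'
<tv>
<channel id="I16689330.labs.zap2it.com">
    <display-name>502 WCBSDT</display-name>
</channel>
<programme start="20180303203000 -0500" channel="I20453335.labs.zap2it.com">
    <title lang="en">NBA Basketball</title>
    <sub-title lang="en">Boston Celtics at Houston Rockets</sub-title>
</programme>
</tv>
XML;

$xml = simplexml_load_string($string);

// Child elements are reached as properties; [0] picks the first <programme>.
$programme = $xml->programme[0];

$title    = (string) $programme->title;          // element text
$subTitle = (string) $programme->{'sub-title'};  // hyphenated name needs {'...'}
$start    = (string) $programme['start'];        // attributes use array syntax
$lang     = (string) $programme->title['lang'];

// prints: NBA Basketball | Boston Celtics at Houston Rockets | 20180303203000 -0500 | en
echo "$title | $subTitle | $start | $lang\n";
```

The (string) casts matter because each property access returns a SimpleXMLElement object, not a plain string.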

Building the combined title is then a one-liner (note that the hyphen in sub-title is not valid in a plain PHP property name, so the name has to be wrapped in {'...'}):

$xml->title .= ': '.$xml->{'sub-title'};

Because we have combined the sub-title element into title, we no longer need it:

unset($xml->{'sub-title'});

And then print the whole object:

> echo $xml->asXML();
<?xml version="1.0"?>
<programme start="20180303203000 -0500" stop="20180303230000 -0500" channel="I20453335.labs.zap2it.com">
    <title lang="en">NBA Basketball: Boston Celtics at Houston Rockets</title>

    <desc lang="en">From the Toyota Center in Houston.</desc>
    <category lang="en">Sports</category>
    <category lang="en">Basketball</category>
    <length units="minutes">120</length>
    <icon src=""/>
    <url>https://tvlistings.zap2it.com//overview.html?programSeriesId=SP00371600&amp;tmsId=SP003716000000</url>
    <episode-num system="dd_progid">SP00371600.0000</episode-num>
    <new/>
    <subtitles type="teletext"/>
</programme>

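Sample complete execution (the root of the full file is tv, so loop over each programme):

<?php
$string = file_get_contents('xmltv.xml');
$xml = simplexml_load_string($string);
// The root element is <tv>; visit every <programme> child in turn.
foreach ($xml->programme as $programme) {
    $programme->title .= ': ' . $programme->{'sub-title'};
    unset($programme->{'sub-title'});
}
file_put_contents('xmltv.xml', $xml->asXML());
?>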
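
One caveat with the script above: a programme that has no sub-title would end up with a title like "News: ". A minimal sketch of a guarded version follows; mergeSubTitles is a hypothetical helper name, and the inline sample assumes each programme carries a single title element:

```php
<?php
// Hypothetical helper: merge <sub-title> into <title> for every <programme>.
function mergeSubTitles(string $xmlString): string
{
    $xml = simplexml_load_string($xmlString);
    if ($xml === false) {
        throw new InvalidArgumentException('Input is not well-formed XML');
    }

    foreach ($xml->programme as $programme) {
        $subTitle = trim((string) $programme->{'sub-title'});

        // Only append the separator when there is something to append.
        if ($subTitle !== '') {
            $programme->title = (string) $programme->title . ': ' . $subTitle;
        }

        // Remove the sub-title element (harmless if it does not exist).
        unset($programme->{'sub-title'});
    }

    return $xml->asXML();
}

// Small inline sample: one programme with a sub-title, one without.
$input = <<<'XML'
<tv>
<programme channel="a">
    <title lang="en">NBA Basketball</title>
    <sub-title lang="en">Boston Celtics at Houston Rockets</sub-title>
</programme>
<programme channel="b">
    <title lang="en">News</title>
</programme>
</tv>
XML;

$output = mergeSubTitles($input);
echo $output;
```

Here the first title becomes "NBA Basketball: Boston Celtics at Houston Rockets", while the second stays "News". Assigning to title replaces only its text, so the lang attribute is kept.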
 

Alternatively, consider XSLT, the special-purpose language designed to transform XML files. PHP can run XSLT 1.0 through its xsl extension (php-xsl; be sure it is enabled in your php.ini). XSLT is also portable and does not need PHP to run: most other languages (Java, Python, Perl, VB) and standalone XSLT processors can execute the same script.

Specifically, the XSLT script below uses the identity transform to copy the document as is, then overrides the programme node to write a single title built with concat() from title and sub-title, and finally copies all of the programme's other child nodes and attributes unchanged. While this may seem like overkill, if your XML is much larger and contains many programme nodes, this XSLT will combine ALL title and sub-title pairs without any explicit looping.

XSLT (save as .xsl file, a special .xml file)

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- IDENTITY TRANSFORM -->
    <xsl:template match="node()|@*">  
        <xsl:copy> 
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- REWRITE PROGRAMME -->
    <xsl:template match="programme"> 
        <xsl:copy>     
            <xsl:copy-of select="@*"/>
            <title>
                <xsl:copy-of select="title/@*"/>
                <xsl:value-of select="concat(title, ': ', sub-title)" />
            </title>
            <xsl:apply-templates select="*[name()!='title' and name()!='sub-title']" />
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

PHP

$xml = new DOMDocument;
$xml->load('Input.xml');

$xsl = new DOMDocument;
$xsl->load('XSLT_Script.xsl');

// Configure transformer
$proc = new XSLTProcessor;
$proc->importStylesheet($xsl);

// Transform XML source (transformToXml() returns the result as a string)
$newXML = $proc->transformToXml($xml);

// Output to console
echo $newXML;

// Output to file
file_put_contents('Output.xml', $newXML);

Output (programme node only; the tv and channel nodes are copied through unchanged)

<programme start="20180303203000 -0500" stop="20180303230000 -0500" channel="I20453335.labs.zap2it.com">
  <title lang="en">NBA Basketball: Boston Celtics at Houston Rockets</title>
  <desc lang="en">From the Toyota Center in Houston.</desc>
  <category lang="en">Sports</category>
  <category lang="en">Basketball</category>
  <length units="minutes">120</length>
  <icon src=""/>
  <url>https://tvlistings.zap2it.com//overview.html?programSeriesId=SP00371600&amp;tmsId=SP003716000000</url>
  <episode-num system="dd_progid">SP00371600.0000</episode-num>
  <new/>
  <subtitles type="teletext"/>
</programme>
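
One thing to note about the programme template: it fires for every programme, including those with no sub-title, where concat() would leave a trailing separator. A possible variation is sketched below. It assumes the xsl extension is enabled and keeps the stylesheet inline only so the example is self-contained. It overrides just the title elements that have a sibling sub-title and drops sub-title with an empty template, leaving everything else to the identity transform:

```php
<?php
// Stylesheet variation (an illustrative sketch, not the stylesheet above).
$xsl = <<<'XSL'
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Identity transform: copy everything as is -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- A title with a sibling sub-title: keep attributes, append the sub-title -->
    <xsl:template match="programme/title[../sub-title]">
        <xsl:copy>
            <xsl:copy-of select="@*"/>
            <xsl:value-of select="concat(., ': ', ../sub-title)"/>
        </xsl:copy>
    </xsl:template>

    <!-- Drop sub-title elements -->
    <xsl:template match="programme/sub-title"/>
</xsl:stylesheet>
XSL;

// Small inline sample: one programme with a sub-title, one without.
$input = <<<'XML'
<tv>
<programme channel="a">
    <title lang="en">NBA Basketball</title>
    <sub-title lang="en">Boston Celtics at Houston Rockets</sub-title>
    <desc lang="en">From the Toyota Center in Houston.</desc>
</programme>
<programme channel="b">
    <title lang="en">News</title>
</programme>
</tv>
XML;

$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($input);

$xslDoc = new DOMDocument();
$xslDoc->loadXML($xsl);

$proc = new XSLTProcessor();
$proc->importStylesheet($xslDoc);

// transformToXml() returns the transformed document as a string.
$result = $proc->transformToXml($xmlDoc);
echo $result;
```

Because the more specific templates only touch title and sub-title, the remaining children of each programme keep their original order, and a programme such as "News" passes through untouched.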