The identity transform is a data transformation that copies the source data into the destination data without change.
The identity transformation is considered an essential process in creating a reusable transformation library. By creating a library of variations of the base identity transformation, a variety of data transformation filters can be easily maintained. These filters can be chained together in a format similar to UNIX shell pipes.
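The pipe analogy can be sketched in a few lines of Python. This is an illustration only, not part of any XSLT library: the names `identity`, `rename`, `upper_text`, and `pipeline`, and the sample document, are all hypothetical. Each filter is the identity copy plus one local change, and `pipeline` chains filters left to right.

```python
import copy
import xml.etree.ElementTree as ET
from functools import reduce
from typing import Callable

# A filter takes a tree and returns a new tree.
Filter = Callable[[ET.Element], ET.Element]


def identity(root: ET.Element) -> ET.Element:
    """Base filter: return an unchanged deep copy of the tree."""
    return copy.deepcopy(root)


def rename(old: str, new: str) -> Filter:
    """Variation: identity, except elements named `old` become `new`."""
    def apply(root: ET.Element) -> ET.Element:
        out = identity(root)
        for elem in out.iter(old):
            elem.tag = new
        return out
    return apply


def upper_text(tag: str) -> Filter:
    """Variation: identity, except the text of `tag` elements is upper-cased."""
    def apply(root: ET.Element) -> ET.Element:
        out = identity(root)
        for elem in out.iter(tag):
            if elem.text:
                elem.text = elem.text.upper()
        return out
    return apply


def pipeline(*filters: Filter) -> Filter:
    """Chain filters left to right, like `f1 | f2 | f3` in a shell."""
    return lambda root: reduce(lambda tree, f: f(tree), filters, root)


doc = ET.fromstring("<people><name>ada</name><nick>al</nick></people>")
run = pipeline(identity, rename("nick", "alias"), upper_text("name"))
print(ET.tostring(run(doc), encoding="unicode"))
# <people><name>ADA</name><alias>al</alias></people>
```

Because every filter starts from a copy, the input tree is left untouched and the filters can be reordered or reused independently.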
Examples of recursive transforms
A recursive copy can be turned into a filter or an updater by changing only small portions of its code, producing entirely new output from the same input. Understanding "identity by recursion" is therefore the key to understanding the filters built on it.
Using XSLT
The most frequently cited example of the identity transform (for XSLT version 1.0) is the "copy.xsl" transform. It uses the xsl:copy instruction[1] to perform the identity transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
This template works by matching all attributes (@*) and all other nodes (node()), copying each matched node, and then applying the identity transformation to the attributes and child nodes of the context node. The stylesheet thus descends the element tree recursively and outputs every structure in the same arrangement as in the original file, within the limits of what the XPath data model considers significant. Since node() matches text, processing instructions, and comments as well as elements, all of these are copied. The root node is not matched by this pattern; the built-in template rule handles it by applying templates to its children.
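The same copy-then-recurse shape can be sketched in Python with the standard library's ElementTree, purely as an analogy to the template rule. The function name `identity` and the sample document are made up for the example. Note one assumption: ElementTree's default parser discards comments and processing instructions, so this sketch covers only elements, attributes, and text.

```python
import xml.etree.ElementTree as ET


def identity(node: ET.Element) -> ET.Element:
    """Return a copy of `node`, built one node at a time by recursion."""
    # Counterpart of xsl:copy: a new element with the same name and attributes.
    out = ET.Element(node.tag, dict(node.attrib))
    # ElementTree stores text on the element (text/tail), not as child nodes.
    out.text = node.text
    out.tail = node.tail
    # Counterpart of xsl:apply-templates: recurse into children in document order.
    for child in node:
        out.append(identity(child))
    return out


source = ET.fromstring('<order id="7"><item qty="2">pen</item><note>rush</note></order>')
result = identity(source)
print(ET.tostring(result, encoding="unicode"))
# <order id="7"><item qty="2">pen</item><note>rush</note></order>
```

The result is a distinct tree that serializes identically to the input, which is the property every variation of the identity transform starts from.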
A more explicit version of the identity transform is:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|*|processing-instruction()|comment()">
    <xsl:copy>
      <xsl:apply-templates select="*|@*|text()|processing-instruction()|comment()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
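This version is equivalent to the first, but explicitly enumerates the types of XML nodes that it will process. Text nodes are selected but not matched by the template; they are copied by the built-in template rule for text. Both versions copy data that is unnecessary for most XML usage (e.g., comments).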
XSLT 3.0
XSLT 3.0[2] specifies an on-no-match attribute of the xsl:mode declaration that allows the identity transform to be declared rather than implemented as an explicit template rule. Specifically:
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:mode on-no-match="shallow-copy"/>
</xsl:stylesheet>
is essentially equivalent to the earlier template rules. See the XSLT 3.0 standard's description of shallow-copy[3] for details.
Finally, note that markup details, such as the use of CDATA sections or the order of attributes, are not necessarily preserved in the output, since this information is not part of the XPath data model. To show CDATA markup in the output, the XSLT stylesheet that contains the identity transform template (not the identity transform template itself) should use the cdata-section-elements attribute of xsl:output. This attribute specifies a list of the names of elements whose text node children should be output using CDATA sections.[1] For example:
<xsl:output method="xml" encoding="utf-8" cdata-section-elements="element-name-1 element-name-2"/>
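The loss of CDATA markup is not specific to XSLT; any processor whose tree model keeps only character data behaves the same way. As a small illustration, Python's ElementTree stands in for an XSLT processor here (the sample document is made up):

```python
import xml.etree.ElementTree as ET

source = "<script><![CDATA[if (a < b) run();]]></script>"
root = ET.fromstring(source)

# After parsing, only the character data survives; the tree has no record
# that it was originally written as a CDATA section.
print(root.text)
# if (a < b) run();

# Serializing therefore falls back to ordinary escaping.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
# <script>if (a &lt; b) run();</script>

# The content is still the same once the output is parsed again.
print(ET.fromstring(serialized).text == root.text)
# True
```

The output is equivalent XML, just written differently, which is why a serialization hint such as cdata-section-elements is needed when the CDATA form itself matters.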
Using XQuery
XQuery can define recursive functions. The following example XQuery function copies the input directly to the output without modification.
(: rebuild the element, copy its attributes, and recurse into child elements :)
declare function local:copy($element as element()) as element() {
  element {node-name($element)}
    {$element/@*,
     for $child in $element/node()
     return if ($child instance of element())
            then local:copy($child)
            else $child
    }
};
The same result can also be achieved using a typeswitch-style transform.
xquery version "1.0";

(: copy the input to the output without modification :)
declare function local:copy($input as item()*) as item()* {
  for $node in $input
  return
    typeswitch ($node)
      case document-node()
        return document { local:copy($node/node()) }
      case element()
        return
          element { node-name($node) } {
            (: output each attribute in this element :)
            for $att in $node/@*
            return attribute { node-name($att) } { $att },
            (: output all the child nodes of this element recursively :)
            local:copy($node/node())
          }
      (: otherwise pass it through. Used for text(), comments, and PIs :)
      default return $node
};
The typeswitch transform is sometimes preferable, since it can easily be modified by adding a case clause for any element that needs special processing.
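The "add a case for the special element" idea can be sketched in Python as a lookup table of handlers keyed by element name, with the identity copy as the default branch. The names `transform`, `price_to_cents`, and the `<price>`/`<cents>` elements are hypothetical and chosen only for this illustration.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict

Handler = Callable[[ET.Element], ET.Element]


def transform(node: ET.Element, handlers: Dict[str, Handler]) -> ET.Element:
    """Identity copy, except where a handler is registered for the tag."""
    handler = handlers.get(node.tag)
    if handler is not None:
        # Special case, playing the role of an extra `case` clause.
        return handler(node)
    # Default case: copy the element and recurse into its children.
    out = ET.Element(node.tag, dict(node.attrib))
    out.text, out.tail = node.text, node.tail
    for child in node:
        out.append(transform(child, handlers))
    return out


def price_to_cents(node: ET.Element) -> ET.Element:
    """Hypothetical special case: rewrite <price> (dollars) as <cents>."""
    out = ET.Element("cents", dict(node.attrib))
    out.text = str(round(float(node.text) * 100))
    out.tail = node.tail
    return out


doc = ET.fromstring("<cart><item><name>pen</name><price>1.50</price></item></cart>")
result = transform(doc, {"price": price_to_cents})
print(ET.tostring(result, encoding="unicode"))
# <cart><item><name>pen</name><cents>150</cents></item></cart>
```

With an empty handler table the function is a plain identity transform; each new entry changes one kind of element and leaves everything else untouched.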
Non-recursive transforms
The following two simple, illustrative "copy all" transforms copy the whole input in a single step, without recursion.
Using XSLT
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:copy-of select="."/>
  </xsl:template>
</xsl:stylesheet>
Using XProc
<p:pipeline name="pipeline" xmlns:p="http://www.w3.org/ns/xproc">
  <p:identity/>
</p:pipeline>
Note that the XProc p:identity step accepts either a single document, as in this example, or a sequence of documents as input.
More complex examples
Generally the identity transform is used as a base on which one can make local modifications.
Remove named element transform
Using XSLT
The identity transformation can be modified to copy everything from an input tree to an output tree except a given node. For example, the following will copy everything from the input to the output except the social security number:
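<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- identity template: copy everything by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
  <!-- remove all social security numbers -->
  <xsl:template match="PersonSSNID"/>
</xsl:stylesheet>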
Using XQuery
declare function local:copy-filter-elements($element as element(),
    $element-name as xs:string*) as element() {
  element {node-name($element)}
    {$element/@*,
     for $child in $element/node()[not(name(.)=$element-name)]
     return if ($child instance of element())
            then local:copy-filter-elements($child, $element-name)
            else $child
    }
};
To call this function, one would add a let clause such as the following to a FLWOR expression:
let $filtered-output := local:copy-filter-elements($input, 'PersonSSNID')
Using XProc
<p:pipeline name="pipeline" xmlns:p="http://www.w3.org/ns/xproc">
  <p:identity/>
  <p:delete match="PersonSSNID"/>
</p:pipeline>
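For comparison, the same "copy everything except one named element" filter can be sketched in Python with ElementTree. The function name `copy_without` and the sample record are made up for the example. One detail specific to ElementTree is assumed here: text that follows an element is stored on that element as its tail, so the sketch reattaches the tail of a dropped element to keep the surrounding text, as the XSLT and XQuery versions do.

```python
import xml.etree.ElementTree as ET


def copy_without(node: ET.Element, drop: str) -> ET.Element:
    """Identity copy of `node` that omits descendant elements named `drop`."""
    out = ET.Element(node.tag, dict(node.attrib))
    out.text, out.tail = node.text, node.tail
    last = None  # most recently copied child, if any
    for child in node:
        if child.tag == drop:
            # Keep the text that followed the dropped element.
            if child.tail:
                if last is None:
                    out.text = (out.text or "") + child.tail
                else:
                    last.tail = (last.tail or "") + child.tail
            continue
        last = copy_without(child, drop)
        out.append(last)
    return out


record = ET.fromstring(
    "<Person><PersonName>Ann</PersonName>"
    "<PersonSSNID>000-00-0000</PersonSSNID>, <City>Oslo</City></Person>"
)
print(ET.tostring(copy_without(record, "PersonSSNID"), encoding="unicode"))
# <Person><PersonName>Ann</PersonName>, <City>Oslo</City></Person>
```

The root element itself is always copied; only descendants are tested against the name to drop.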
Further reading
- Sal Mangano, XSLT Cookbook, O'Reilly Media, Inc., December 1, 2002, ISBN 0-596-00372-2
- Priscilla Walmsley, XQuery, O'Reilly Media, Inc., Chapter 8 Functions – Recursive Functions – page 109