XPath String Functions: Mastering Text Manipulation in XML Processing

Learn how to effectively use XPath string functions for powerful text manipulation within XML documents. This tutorial provides a comprehensive guide with examples demonstrating data extraction, cleaning, and transformation using XPath functions in XSLT or XQuery.



XPath String Functions

XPath provides a rich set of functions for manipulating strings extracted from XML documents: testing, slicing, measuring, normalizing, and joining text. They are essential for data extraction, cleaning, and transformation.

| Index | Function | Description |
|---|---|---|
| 1 | `starts-with(string1, string2)` | Checks if `string1` starts with `string2`. Returns `true` or `false`. The comparison is case-sensitive. |
| 2 | `contains(string1, string2)` | Checks if `string1` contains `string2`. Returns `true` or `false`. The comparison is case-sensitive. |
| 3 | `substring(string, start, length?)` | Extracts a substring of `length` characters starting at position `start`. Positions are 1-based. `length` is optional; if omitted, extracts to the end of the string. |
| 4 | `substring-before(string1, string2)` | Returns the portion of `string1` before the first occurrence of `string2`, or an empty string if `string2` does not occur. |
| 5 | `substring-after(string1, string2)` | Returns the portion of `string1` after the first occurrence of `string2`, or an empty string if `string2` does not occur. |
| 6 | `string-length(string)` | Returns the length of the string (number of characters). |
| 7 | `normalize-space(string)` | Removes leading and trailing whitespace and collapses each run of internal whitespace into a single space. |
| 8 | `translate(string1, string2, string3)` | Replaces each character in `string1` that appears in `string2` with the character at the same position in `string3`. Characters in `string2` with no counterpart in `string3` are removed. |
| 9 | `concat(string1, string2, ...)` | Concatenates two or more strings. |
| 10 | `format-number(number, picture, decimal-format?)` | Formats a number according to a picture string such as `#,##0.00`. `decimal-format` is optional; it names an `xsl:decimal-format` declaration. In XSLT 1.0 this function is defined by XSLT itself rather than by core XPath. |
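
To make the table concrete, here is a minimal XSLT 1.0 sketch that calls each function on literal strings and writes one result per line. The literals are illustrative values chosen for this sketch, not data from the tutorial's files, and the stylesheet can be run against any well-formed XML input because it never reads the source document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Each xsl:value-of prints one result; &#10; is a line break. -->
  <xsl:template match="/">
    <!-- true -->
    <xsl:value-of select="starts-with('Abhiram', 'Abh')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- true -->
    <xsl:value-of select="contains('Kushwaha', 'wah')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Kush (positions are 1-based) -->
    <xsl:value-of select="substring('Kushwaha', 1, 4)"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Singh -->
    <xsl:value-of select="substring-before('Singh, Akash', ', ')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Akash -->
    <xsl:value-of select="substring-after('Singh, Akash', ', ')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- 5 -->
    <xsl:value-of select="string-length('Bunty')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Akash Singh -->
    <xsl:value-of select="normalize-space('  Akash   Singh ')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- BUNTY (a common XPath 1.0 idiom for upper-casing ASCII text) -->
    <xsl:value-of select="translate('bunty',
        'abcdefghijklmnopqrstuvwxyz', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- Akash Singh -->
    <xsl:value-of select="concat('Akash', ' ', 'Singh')"/>
    <xsl:text>&#10;</xsl:text>
    <!-- 15,000.00 -->
    <xsl:value-of select="format-number(15000, '#,##0.00')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

The `translate()` call shows why the function is often used for case conversion in XPath 1.0, which has no dedicated upper-case or lower-case function.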

Example: Calculating Name Lengths

This example uses `string-length()` to calculate and display the length of each employee's name. It assumes an XML file (`Employee.xml`) containing the employee data and an XSLT stylesheet (`Employee.xsl`) that processes it. The stylesheet applies XPath functions to the extracted values and produces an HTML table of employee names and name lengths.

1. Sample XML Data (`Employee.xml`)

<Employees>
  <Employee id="1">
    <FirstName>Abhiram</FirstName>
    <LastName>Kushwaha</LastName>
    <NickName>Manoj</NickName>
    <Salary>15000</Salary>
  </Employee>
  <Employee id="2">
    <FirstName>Akash</FirstName>
    <LastName>Singh</LastName>
    <NickName>Bunty</NickName>
    <Salary>25000</Salary>
  </Employee>
  <!-- ... more employees ... -->
</Employees>

2. XSLT Stylesheet (Illustrative)

The stylesheet would use `xsl:for-each` to iterate over the `Employee` elements, `xsl:value-of` to write each value, and XPath string functions such as `string-length()` to compute the figures shown in the table.
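
The original does not include the stylesheet itself, so the following is only one possible sketch of `Employee.xsl`. It assumes that "name" means the first name and last name joined by a single space; the variable `fullName`, the column headings, and the page heading are choices made for this sketch.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <xsl:template match="/Employees">
    <html>
      <body>
        <h2>Employee Name Lengths</h2>
        <table border="1">
          <tr>
            <th>ID</th>
            <th>Name</th>
            <th>Name Length</th>
          </tr>
          <!-- One table row per Employee element -->
          <xsl:for-each select="Employee">
            <!-- Assumption: the name is FirstName + space + LastName.
                 normalize-space() guards against stray whitespace. -->
            <xsl:variable name="fullName"
                select="normalize-space(concat(FirstName, ' ', LastName))"/>
            <tr>
              <td><xsl:value-of select="@id"/></td>
              <td><xsl:value-of select="$fullName"/></td>
              <!-- The length counts the separating space as well -->
              <td><xsl:value-of select="string-length($fullName)"/></td>
            </tr>
          </xsl:for-each>
        </table>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 1.0 processor such as `xsltproc` (assuming libxslt is installed), the transformation can be run as `xsltproc Employee.xsl Employee.xml`. For the two employees in the sample data, the rows should show "Abhiram Kushwaha" with a length of 16 and "Akash Singh" with a length of 11.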

Conclusion

XPath's string functions cover most text processing tasks within XML documents, from testing and slicing values to normalizing whitespace and joining strings. Understanding them is key to writing concise, correct XSLT stylesheets and XQuery expressions.