Skip to main content

SharePoint Taxonomies - Labels with forbiden characters

·2 mins

Recently, when I was working on mechanism of automatic synchronization of tree structures provided by web service to SharePoint taxonomies, I came across an error like this:

The value '(<= 0320-775, 0550-5/7)' is invalid. It probably contains invalid characters or is too long.  
Parameter name: name  

with the following stacktrace fragment:

at Microsoft.SharePoint.Taxonomy.Internal.CommonValidator.ValidateName(String name, String parameterName)  
at Microsoft.SharePoint.Taxonomy.TermSetItem.CreateTerm(String name, Int32 lcid, Guid newTermId)  

Looking for some help in SharePoint API documentation for method TermSetItem.CreateTerm() I found the explanation for this error:

The labelName value will be normailized to trim consecutive spaces into one and replace the & character with the wide character version of the character (\uFF06). It must be non-empty, cannot exceed 255 characters, and cannot contain any of the following characters ;"< >|&tab

Yeah, this is SharePoint… No surprise, I would say ;)

Since I had to put the forbidden characters into the term labels somehow, my first thought was to encode them as HTML entities. But as you may noticed it is impossible - the ampersand character is replaced by similar looking unicode character with code \uFF06.

My second try - why not to try the same as MS SharePoint developers? You will find many UTF-16 characters that look almost exactly the same as the characters from ASCII set. The replacements for the forbidden characters in UTF-8 are:

Original ASCII charUTF-8 replacement codeUTF-8 char

Unfortunetly I found no replacement for semicolon character so I’m replacing it with coma. You can find more information about these characters on FileFormat.

Of course you will have to create the term grammatically to put them into term label (or copy them from some unicode table, such as FileFormat). Here is a utility method that should “normalize” the strings for CreateTerm() method:

public static string ReplaceIllegalCharacters(string termLabel)  
    return termLabel.  
        Replace("\t", " ").  
        Replace(";", ",").  
        Replace("\"", "\uFF02").  
        Replace("<", "\uFF1C").  
        Replace(">", "\uFF1E").  
        Replace("|", "\uFF5C");  

This works fine either in web interface and other clients (I’ve tested it with MS Word 2010 and Lotus Notes widget for SharePoint integration - Harmon.IE). I will put here some screenshots soon.

SharePoint web UI term picker
SharePoint web UI term picker

MS Word term picker
MS Word term picker

Harmon.IE term picker
Harmon.IE term picker

SharePoint TermStore management tool
SharePoint TermStore management tool

Happy SharePointing!

A notice for archeologists...🏺 This post was originaly published on my previous blog and moved here. Some links and resources might not be up to date.