Decoding unicode decimal codes


We have built a solution for a customer in CRMScript that reads an RSS feed from a WordPress site. The titles of the articles in this RSS feed contain codes such as '&#821;', '&#8216;', '&#8217;', etc.

I have tried the various .%decode() methods that are available on a string but none can decode these values. Is there support for this in CRMScript?


RE: Decoding unicode decimal codes

Hi David, 

Could you elaborate a bit more on how you get the RSS feed?

I guess you're using the HTTP class to get it? Could you provide a small code example?


If I try to print the codes, I get the values perfectly, so there might be some encoding issues.


String text = "&#821;, &#8216;, &#8217;";


Gives me the values:

̵, ‘, ’


Have you tried specifying UTF-8 encoding with the HTTP class?

By: Simen Mostuen Iversen, 18 Jan 2021

RE: Decoding unicode decimal codes

Hello Simen,

We retrieve the RSS feed using the following code:

    try {
        HTTP http;

        NSStream data = http.openAsStream(feedUrl);

        // Stop here if the request failed; the braces keep the throw inside the check
        if (http.hasError()) {
            log("Error getting feed:");
            throw "Error getting feed: " + http.getErrorMessage();
        }

        XMLNode xml = parseXML(String(data.GetStream()));

        // parse xml
    } catch {
        printLine("Exception caught: " + error);
        printLine(" " + errorLocation);
    }
By: David Hollegien, 19 Jan 2021

RE: Decoding unicode decimal codes

Is there any functionality to decode these codes in CRMScript?

Since my last post I have added the following:

http.addHeader("Content-Type", "application/rss+xml; charset=UTF-8");

// and then decode the utf8 characters
String text = String(data.GetStream());
XMLNode xml = parseXML(text.utf8Decode());

This fixes some of the other encoding issues, but the output still shows values like '&#8211;'.


Note: this also happens when you request the RSS feed manually in a browser, so I don't think we are retrieving it incorrectly.

By: David Hollegien, 2 Feb 2021

RE: Decoding unicode decimal codes

&#8211; is an HTML entity encoding of a Unicode code point (U+2013, the en dash). Rendering it is up to the browser.

If you want to normalize it, you need to replace it with the corresponding Unicode character.

By: Christian Mogensen, 2 Feb 2021
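
The replacement suggested above could be sketched roughly as follows. This is only an illustration, not a general decoder. It assumes that String.substitute() replaces every occurrence and returns the result, and that the script source can hold literal Unicode characters. The sample title and the list of entities are made up for the example, based on the codes mentioned in this thread.

```crmscript
// Sketch only: the sample title is hypothetical, and the entity list covers
// just the codes mentioned in this thread. Extend it for other codes.
String title = "It&#8217;s here &#8211; &#8216;quoted&#8217;";

// Replace each numeric entity with the Unicode character it stands for
title = title.substitute("&#8216;", "‘"); // U+2018 left single quotation mark
title = title.substitute("&#8217;", "’"); // U+2019 right single quotation mark
title = title.substitute("&#8211;", "–"); // U+2013 en dash

printLine(title);
```

A fixed list like this only handles the entities you already know about. A feed that emits arbitrary numeric codes would need a real code-point-to-character conversion, and this thread does not confirm that CRMScript offers one.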