Archive for June 7th, 2009

Jun 07, 2009

Strange character sequence when parsing XML

When XML files are read from RSS feeds or other sources, the feed and the page that displays it need to use the same character encoding. When the two conflict, strange character sequences get shown in the browser in place of characters like ‘ and ’.

RSS feeds normally use UTF-8. If a page that is displayed as ISO-8859-1 outputs the feed data unchanged, the browser decodes each multi-byte UTF-8 character as several separate ISO-8859-1 characters, and those are what end up on the display.
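To see where the stray characters come from, here is a minimal sketch that misreads the UTF-8 bytes of a single curly quote. It assumes the browser decodes a page labelled ISO-8859-1 as Windows-1252 (which is what browsers commonly do), so the code uses the `CP1252` name; the variable names are only for illustration.

```php
<?php
// The right single quotation mark (U+2019) is three bytes in UTF-8.
$utf8 = "\xE2\x80\x99";

// Treat those same three bytes as Windows-1252 text (assumption: this is
// how the browser reads an ISO-8859-1 page) and re-encode the result as
// UTF-8 so it can be printed on a UTF-8 terminal.
$misread = iconv('CP1252', 'UTF-8', $utf8);

echo bin2hex($utf8), "\n"; // the original bytes: e28099
echo $misread, "\n";       // three separate characters instead of one quote
```

One character in the feed becomes three on the page, which is why the junk always shows up in short, repeating sequences.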

To keep these strange characters from being displayed on screen (where ISO-8859-1 encoding is used), PHP's iconv function comes in handy, and you can use it like this:

echo iconv("UTF-8", "ISO-8859-1//TRANSLIT", $temp_item['encoded']);

This converts the text to ISO-8859-1 and, during transliteration, replaces each character that has no ISO-8859-1 equivalent with the nearest possible character, so the strange sequences no longer appear.
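The conversion can also be wrapped in a small helper that checks for failure, since iconv returns false when it cannot convert the input. This is only a sketch: the function name `feed_text_to_latin1` and the sample text are made up for illustration, and the exact substitute characters that //TRANSLIT picks depend on the iconv library PHP was built against.

```php
<?php
// Hypothetical helper (not from any library): converts UTF-8 feed text
// for output on a page that uses ISO-8859-1.
function feed_text_to_latin1(string $utf8): string
{
    // //TRANSLIT asks iconv to approximate characters that have no
    // ISO-8859-1 equivalent. The @ hides the notice iconv raises on
    // failure, because the return value is checked right below.
    $converted = @iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $utf8);

    if ($converted === false) {
        throw new RuntimeException('Could not convert feed text to ISO-8859-1');
    }

    return $converted;
}

// Sample item (illustrative data): a curly apostrophe and an e-acute,
// both written as UTF-8 byte sequences.
$temp_item = ['encoded' => "It\xE2\x80\x99s caf\xC3\xA9 time"];

$latin1 = feed_text_to_latin1($temp_item['encoded']);

// Print as hex so the result is readable on any terminal encoding.
echo bin2hex($latin1), "\n";
```

The e-acute exists in ISO-8859-1, so it becomes the single byte E9, while the curly apostrophe has no equivalent there and gets replaced by whatever the iconv library considers the closest match.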
