
PHP: Decode WINDOWS-1252 character encoding

Recently I came across a character encoding that I didn’t recognise: WINDOWS-1252. Being a Windows character encoding, it’s not at all surprising that I didn’t recognise it (or that I don’t like it).

Decoding strings in this encoding is actually pretty simple, although it took me a little while to grasp, which is why I’m sharing the information with you all.

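$str = '=?WINDOWS-1252?Q?Use_of_=93Quotes=94?=';

// Set the internal character encoding; the decoded string is returned in it
mb_internal_encoding("UTF-8");

// The Q encoding writes spaces as underscores, so convert them back first
// Outputs: Use of “Quotes”
echo mb_decode_mimeheader(str_replace('_', ' ', $str));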

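The string is a MIME encoded-word, the format used in email headers: =?charset?encoding?text?=, where Q stands for a quoted-printable style encoding. First set the internal character encoding to UTF-8 using mb_internal_encoding(), since mb_decode_mimeheader() returns its result in that encoding, and then decode using mb_decode_mimeheader(). The underscores come from the Q encoding rather than from WINDOWS-1252 itself: it writes a space as an underscore (a literal underscore is written as =5F), and str_replace('_', ' ', $str) turns them back into spaces before decoding. Note that this replaces every underscore in the string, so it is only safe when the whole string is a single encoded-word, as it is here.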
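If the header can also contain underscores outside the encoded part, one option is to decode each encoded-word separately and leave the rest alone. The following is only a minimal sketch of that idea, not a full RFC 2047 decoder: decode_encoded_words() is a hypothetical helper name, the regular expression is a simplification, and it assumes the mbstring extension recognises the declared charset. Where the iconv extension is available, PHP's built-in iconv_mime_decode() may be worth trying first.

```php
<?php
// Hypothetical helper (not from the original post): decodes encoded-words of
// the form =?charset?Q?text?= or =?charset?B?text?= and leaves everything
// outside them untouched.
function decode_encoded_words(string $header): string
{
    $result = preg_replace_callback(
        '/=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=/',
        function (array $m): string {
            [, $charset, $scheme, $text] = $m;

            if (strtoupper($scheme) === 'Q') {
                // Q encoding: "_" means a space, "=XX" is a hex-encoded byte.
                $bytes = quoted_printable_decode(str_replace('_', ' ', $text));
            } else {
                // B encoding is plain base64.
                $bytes = base64_decode($text);
            }

            // Convert from the declared charset to UTF-8. An unknown charset
            // name raises an error here; this sketch does not handle that.
            return mb_convert_encoding($bytes, 'UTF-8', $charset);
        },
        $header
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $header;
}

echo decode_encoded_words('file_name: =?WINDOWS-1252?Q?Use_of_=93Quotes=94?='), "\n";
// Outputs: file_name: Use of “Quotes”
```

The underscore in file_name survives because only the text inside the encoded-word is touched, and an underscore that was encoded as =5F comes back as a real underscore.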

Posted in PHP on the 7th October 2010

One person has spoken their mind!

  1. MAAD says:

    This is not a good solution… What happens if you actually have underscores in your string?
