Recently I came across a character encoding that I didn't recognise: WINDOWS-1252. Being a Windows character encoding, it's not at all surprising that I didn't recognise it (or that I don't like it).
Decoding strings in this encoding is actually pretty simple, although it took me a little while to grasp, which is why I'm sharing it with you all.
$str = '=?WINDOWS-1252?Q?Use_of_=93Quotes=94?=';

// Set the internal character encoding
mb_internal_encoding("UTF-8");

// Outputs: Use of “Quotes”
echo mb_decode_mimeheader(str_replace('_', ' ', $str));
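First, set the internal character encoding to UTF-8 using mb_internal_encoding(), so that the decoded text comes out as UTF-8, and then decode it using mb_decode_mimeheader(). The underscores are not really part of WINDOWS-1252 itself: they come from the "Q" encoding used for MIME headers (RFC 2047), which represents spaces as underscores. mb_decode_mimeheader() didn't turn them back into spaces for me, so str_replace('_', ' ', $str) sorts that issue for us.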
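To see what is going on under the hood, here is a rough sketch of decoding a single Q-encoded word by hand. The function name decode_q_encoded_word() is made up for illustration and is not part of PHP. It assumes the whole string is exactly one encoded word using the "Q" encoding, and that mbstring recognises the charset name:

```php
<?php
// Hypothetical helper (not a PHP built-in): decodes one "Q"-encoded word.
function decode_q_encoded_word(string $word): string
{
    // Split "=?charset?Q?text?=" into its charset and text parts.
    if (!preg_match('/^=\?([^?]+)\?Q\?([^?]*)\?=$/i', $word, $m)) {
        return $word; // Not a single Q-encoded word; return it unchanged.
    }
    [, $charset, $text] = $m;

    // Underscores stand for spaces, and "=XX" is a hex-encoded byte.
    $bytes = quoted_printable_decode(str_replace('_', ' ', $text));

    // Convert the raw bytes from the declared charset to UTF-8.
    return mb_convert_encoding($bytes, 'UTF-8', $charset);
}

// Outputs: Use of “Quotes”
echo decode_q_encoded_word('=?WINDOWS-1252?Q?Use_of_=93Quotes=94?=');
```

Here =93 and =94 are the WINDOWS-1252 bytes for the opening and closing curly quotes. For real headers, which can mix several encoded words with plain text, the built-in decoder is the safer choice.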