Yesterday, I came across this interesting table, which tells me what conversions I need to do when I paste text from Word into a textarea and then want to use that text on the web…
To be accurate, this table is useful for converting from the default Windows charset (Windows-1252, aka CP1252) to the default web charset (ISO-8859-1, aka Latin-1). Nevertheless, it allowed me to check the conversion in my b2evolution software, and I noticed that it was missing one conversion (out of a total of 27).
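To make the "27" concrete, here is a minimal Python sketch that lists the CP1252 characters with no Latin-1 equivalent, using only the standard codecs. The helper names (`cp1252_extras`, `to_latin1`), the small `FALLBACKS` table and the choice of falling back to an HTML numeric character reference are my own assumptions for illustration. This is not how b2evolution does it.

```python
def cp1252_extras():
    """Map each assigned CP1252 byte in 0x80-0x9F to its character.

    Latin-1 reserves this byte range for control codes, so these are
    the characters that need special handling.
    """
    extras = {}
    for byte in range(0x80, 0xA0):
        try:
            extras[byte] = bytes([byte]).decode("cp1252")
        except UnicodeDecodeError:
            continue  # byte is unassigned in Python's cp1252 codec
    return extras


# Hypothetical ASCII stand-ins for the most common Word characters.
FALLBACKS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "--",  # en dash, em dash
    "\u2026": "...",                # ellipsis
}


def to_latin1(text):
    """Encode text as Latin-1, substituting what Latin-1 cannot hold."""
    out = []
    for ch in text:
        try:
            ch.encode("latin-1")
            out.append(ch)
        except UnicodeEncodeError:
            # Assumption: the output is HTML, so a numeric character
            # reference is an acceptable last resort.
            out.append(FALLBACKS.get(ch, "&#%d;" % ord(ch)))
    return "".join(out).encode("latin-1")


if __name__ == "__main__":
    for byte, ch in cp1252_extras().items():
        print("0x%02X  U+%04X  %s" % (byte, ord(ch), ch))
```

With Python's built-in codec, `cp1252_extras()` returns 27 entries, which matches the count above.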
Anyway, the world actually extends way beyond cp1252 and Latin-1, so how would one deal with other languages? :?:
For example, how do I convert Latvian from Windows-1257 to ISO-8859-13 (close match)? Or Russian from KOI8-R to ISO-8859-5 (funky match)? Check out this awesome character set database provided by the Institute of the Estonian Language. (Wouldn’t it make sense if unicode.org provided this? :crazy:)
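If a scripting language is at hand, one way to do such a conversion is to go through Unicode: decode the bytes with the source charset, then re-encode them with the target one. Below is a rough Python sketch under that assumption. The `transcode` and `encodable` helpers are hypothetical names of mine; the codec names (`cp1257`, `iso8859_13`, `koi8_r`, `iso8859_5`) are the ones shipped with Python's standard library.

```python
def encodable(ch, charset):
    """Return True if the character exists in the given charset."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def transcode(data, source, target):
    """Convert bytes from one charset to another via Unicode.

    Returns the converted bytes plus a sorted list of the characters
    that the target charset cannot represent (emitted as '?').
    """
    text = data.decode(source)
    lost = sorted({ch for ch in text if not encodable(ch, target)})
    return text.encode(target, errors="replace"), lost


if __name__ == "__main__":
    latvian = "R\u012bga".encode("cp1257")  # "Rīga"
    print(transcode(latvian, "cp1257", "iso8859_13"))

    # "Привет" written with escapes so the source file stays ASCII
    russian = "\u041f\u0440\u0438\u0432\u0435\u0442".encode("koi8_r")
    print(transcode(russian, "koi8_r", "iso8859_5"))
```

The list of lost characters is the interesting part for a "funky match": KOI8-R, for instance, has box-drawing characters that ISO-8859-5 simply lacks, so they come out as `?` and get reported.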
By the way, how do I know what charsets are to be used for a particular language? Here’s a page by the W3C, but it’s a little sparse… Another one.
Comments from long ago:
Comment from: J.o.sue
How do I convert Hebrew?
2006-03-03 18-24