On 8/22/2014 8:51 AM, Oleg Broytman wrote:
> What encoding does have a text file (an HTML, to be precise) with text in utf-8, ads in cp1251 (ad blocks were included from different files) and comments in koi8-r? Well, I must admit the HTML was rather an exception, but having a text file with some strange characters (binary strings, or paragraphs in different encodings) is not that exceptional.

That's not a text file. That's a binary file containing (hopefully delimited, and documented) sections of encoded text in different encodings.
If it is named .html and served by the server as UTF-8, then the server is misconfigured, or the file is incorrectly populated.
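
To make the distinction concrete, here is a rough sketch of treating such a file as binary data made of separately encoded sections. The section layout (a list of encoding/bytes pairs), the sample strings, and the helper name decode_sections are all assumptions for illustration; a real file would need its own documented delimiters.

```python
# Hypothetical layout: each section is an (encoding, raw bytes) pair.
# In a real file the boundaries and encodings would have to be documented.
sections = [
    ("utf-8", "Привет, ".encode("utf-8")),          # main text
    ("cp1251", "реклама, ".encode("cp1251")),       # included ad block
    ("koi8-r", "комментарий".encode("koi8-r")),     # comment
]

def decode_sections(sections):
    """Decode every section with its own codec and join the results."""
    return "".join(data.decode(encoding) for encoding, data in sections)

text = decode_sections(sections)

# The same bytes, concatenated, are not valid as a single UTF-8 text file.
blob = b"".join(data for _, data in sections)
try:
    blob.decode("utf-8")
    whole_file_is_utf8 = True
except UnicodeDecodeError:
    whole_file_is_utf8 = False
```

Decoded section by section, the content round-trips cleanly; decoded as one UTF-8 stream, it raises UnicodeDecodeError, which is the sense in which the whole thing is not a text file in any single encoding.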