This tutorial assumes the following 3 things:
1. That extensions for known file types are not hidden. (To check: click Start / My Documents, click Tools / Folder Options…, click the View tab, and make sure that Hide extensions for known file types is unchecked.)
2. That you have Notepad2 or a shortcut to it on your desktop.
3. That you have installed Perl according to the tutorial at:
http://llbest.com/PerlProgramming.htm.
On this web page:
A. Introduction
B. Main Features
C. Demo #1: ANSI to UTF-8
D. Demo #2: UTF-8 to ANSI
E. Conclusion
The advantages of converting your HTML files from ANSI to UTF-8 are as follows:
1. UTF-8 files are more “what you see is what you get” (WYSIWYG). For example, instead of character codes such as &#9834; and &copy;, you will see ♪ and ©.
2. UTF-8 files can be more compact, because a character stored directly in UTF-8 usually takes fewer bytes than its HTML character code. Smaller files take up less space on your local hard drive, upload faster, and, more important to your website visitors, load faster.
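To see where the space saving in point 2 comes from, here is a small Python sketch (Python is used only for illustration; it is not part of the download). It compares the number of bytes taken by an HTML character code with the number of bytes taken by the same character stored directly as UTF-8. The codes &copy; and &#9834; are assumed examples.

```python
def code_vs_utf8(code, char):
    """Return (bytes used by the HTML code, bytes used by the UTF-8 character)."""
    return len(code.encode("ascii")), len(char.encode("utf-8"))


# Assumed example codes and the characters they stand for.
samples = {"&copy;": "\u00a9", "&#9834;": "\u266a"}

for code, char in samples.items():
    code_bytes, utf8_bytes = code_vs_utf8(code, char)
    print(f"{code}: {code_bytes} bytes as a code, {utf8_bytes} bytes as UTF-8")
# &copy;: 6 bytes as a code, 2 bytes as UTF-8
# &#9834;: 7 bytes as a code, 3 bytes as UTF-8
```

Keep in mind that a character which already occupies a single byte in an ANSI file, such as é, takes 2 bytes in UTF-8, so the saving depends on how many character codes a page contains.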
1. _ConvertHTMLfilesFromANSItoUTF8.bat.
2. _ConvertHTMLfilesFromUTF8toANSI.bat.
3. _Test.htm.
1. The file format is changed from ANSI to UTF-8 and the UTF-8 signature is added.
2. The following meta tag is added following the <head> tag:
This will only be done to files which were originally ANSI.
3. HTML special character codes are converted to the actual Unicode characters. For example, &#9834; is converted to ♪ and &copy; is converted to ©. This will be done to all files, even if they were already UTF-8 files.
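The 3 changes above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the Perl code inside the .bat file: the text of the meta tag (META_TAG) is assumed, a file is treated as already being UTF-8 only when it starts with the UTF-8 signature, and the function names are made up for this example. Codes for characters that have a meaning in HTML markup (such as &lt; and &amp;) are deliberately left alone so that the page stays valid.

```python
import codecs
import html
import re

# Assumption: the exact meta tag added by the .bat file is not shown here.
META_TAG = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'

ENTITY = re.compile(r"&(#[0-9]+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")
MARKUP_CHARS = set("<>&\"'")


def decode_character_codes(text):
    """Replace HTML character codes with the actual Unicode characters."""
    def replace(match):
        char = html.unescape(match.group(0))
        # Leave unknown codes and markup-significant characters untouched.
        if char == match.group(0) or char in MARKUP_CHARS:
            return match.group(0)
        return char
    return ENTITY.sub(replace, text)


def ansi_to_utf8(raw):
    """Return (new_bytes, was_ansi) for the bytes of one HTML file."""
    if raw.startswith(codecs.BOM_UTF8):
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
        was_ansi = False
    else:
        # Windows-1252 leaves a few byte values undefined, hence errors="replace".
        text = raw.decode("cp1252", errors="replace")
        was_ansi = True
    if was_ansi:
        # Change 2: add the meta tag right after the first <head> tag.
        text = re.sub(r"(<head(\s[^>]*)?>)",
                      lambda m: m.group(1) + "\n" + META_TAG,
                      text, count=1, flags=re.IGNORECASE)
    # Change 3: done for every file, even one that was already UTF-8.
    text = decode_character_codes(text)
    # Change 1: write UTF-8 with the signature in front.
    return codecs.BOM_UTF8 + text.encode("utf-8"), was_ansi
```

Running the function a second time on its own output leaves the bytes unchanged, which matches the statement that the meta tag is only added to files that were originally ANSI.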
1. The file format is changed from UTF-8 to ANSI and the UTF-8 signature is deleted.
2. The following meta tag is deleted:
3. Unicode characters are converted to their corresponding HTML ANSI character codes. For example, ♪ is converted to &#9834; and © is converted to &copy;.
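The reverse direction can be sketched in Python as well. Again, this is an illustration under assumptions rather than the Perl code in the .bat file: the pattern used to find the meta tag (META_PATTERN) is a guess at what the tag looks like, and the sketch writes numeric character codes (for example &#169; for ©), which may differ from the codes that the author's program chooses.

```python
import codecs
import re

# Assumption: matches a meta tag that declares charset=utf-8; the exact tag
# deleted by the .bat file is not shown here.
META_PATTERN = re.compile(
    r'\s*<meta[^>]*charset\s*=\s*["\']?utf-8["\']?[^>]*>', re.IGNORECASE)


def utf8_to_ansi(raw):
    """Return ANSI-compatible bytes for the UTF-8 bytes of one HTML file."""
    # Change 1: delete the UTF-8 signature if it is present.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    text = raw.decode("utf-8")
    # Change 2: delete the first meta tag that declares UTF-8.
    text = META_PATTERN.sub("", text, count=1)
    # Change 3: every non-ASCII character becomes a numeric character code.
    return text.encode("ascii", errors="xmlcharrefreplace")
```

Because the result contains only ASCII bytes, it reads the same whether it is opened as ISO-8859-1 or as Windows-1252.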
1. Multiple HTML files can be converted in one fell swoop.
2. A mixture of ANSI (ANSI_X3.4-1986), ISO-8859-1, and Windows-1252 (CP1252) format HTML files can be converted to UTF-8.
3. UTF-8 format HTML files can be converted to ISO-8859-1 / Windows-1252 compatible ANSI format.
4. A log file is created which contains detailed statistics on all of the changes.
5. Changed and unchanged files may be easily separated via sorting by date of last update.
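Features 1, 4, and 5 fit together in a simple way, which the following Python sketch illustrates. It is not the author's program: the function name, the log file name, and the list of extensions are all assumptions made for this example. The key idea is that a file is rewritten only when its content actually changes, so unchanged files keep their old date of last update.

```python
from pathlib import Path

# Assumption: only these extensions are converted in this sketch.
EXTENSIONS = {".htm", ".html"}


def convert_folder(folder, convert, log_name="_conversion_log.txt"):
    """Apply convert(bytes) -> bytes to every matching file in folder."""
    changed, unchanged = [], []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() not in EXTENSIONS:
            continue
        old = path.read_bytes()
        new = convert(old)
        if new == old:
            # Not rewritten, so its date of last update stays the same.
            unchanged.append(path.name)
        else:
            path.write_bytes(new)
            changed.append(path.name)
    # Write a short log listing what happened to each file.
    log_lines = [f"changed: {name}" for name in changed]
    log_lines += [f"unchanged: {name}" for name in unchanged]
    log_lines.append(f"{len(changed)} changed, {len(unchanged)} unchanged")
    log_path = Path(folder) / log_name
    log_path.write_text("\n".join(log_lines) + "\n", encoding="utf-8")
    return changed, unchanged
```

Any function that takes the bytes of a file and returns the converted bytes can be passed in as convert.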
1. Download __HTMLtoUTF8.rar (5.00 KB)
2. Use WinRAR to uncompress it to a folder called __HTMLtoUTF8. Here is how the _Test.htm file looks:
3. Drag and drop the _Test.htm file onto the Notepad2 icon and click File / Encoding. Then you will see that it is an ANSI file:
4. Double click the _ConvertHTMLfilesFromANSItoUTF8.bat file’s icon. If Perl was installed correctly, here’s what you will see:
A log file containing the following is also created:
5. Drag and drop the _Test.htm file onto the Notepad2 icon again and again click File / Encoding. Now you will see that it has been converted to a UTF-8 file with signature:
Here is what the newly created UTF-8 version of the _Test.htm file looks like:
Please note that for this short example, the file has actually gotten larger, because the UTF-8 signature and the meta tag were added. For a normal-size web page, especially one with lots of special character codes, converting it to UTF-8 will usually make it smaller.
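If you would like to confirm the result without Notepad2, the UTF-8 signature is simply the 3 bytes EF BB BF at the very start of the file. The following Python sketch checks for them; the function name is made up for this example, and the file name in the usage lines is the test file from the download.

```python
import codecs


def has_utf8_signature(raw):
    """Return True if the bytes start with the UTF-8 signature (EF BB BF)."""
    return raw.startswith(codecs.BOM_UTF8)


# Example use, assuming _Test.htm is in the current folder:
#     with open("_Test.htm", "rb") as f:
#         print(has_utf8_signature(f.read()))
```

A result of True corresponds to what Notepad2 reports as UTF-8 with signature. A result of False does not prove that the file is ANSI, since a UTF-8 file may also be saved without the signature.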
1. Assuming that you’ve already completed Demo #1 above, double click the _ConvertHTMLfilesFromUTF8toANSI.bat file’s icon. Here is what you should see:
A log file containing the following is also created:
Now the _Test.htm file is back to the way it was originally:
2. Again, drag and drop the _Test.htm file onto the Notepad2 icon. Again click File / Encoding, and you will see that it is back to being an ANSI file:
Now you are ready to convert any number of HTML files:
1. Be sure to keep backup copies of all of the HTML files until you are sure that the conversion was done correctly.
2. Copy the HTML files to be converted to a temporary folder such as C:\Temp2.
3. Copy the appropriate .bat file to the same folder.
4. Double click the .bat file's icon. If all goes according to plan, all of the HTML files will now be converted automatically!
The above will match .htm, .html, .php, .php3, etc. files. It may be changed to suit your needs. For example, if all that you want to modify are .txt files, then it could be changed as follows:
THIS WEB PAGE URL: http://llbest.com/__HTMLtoUTF8.htm
A. Introduction
The free download contains the following 3 files:
_ConvertHTMLfilesFromANSItoUTF8.bat is a Perl program which converts ANSI format HTML files to the UTF-8 (Unicode) format. In order to accomplish this, 3 types of changes are made to each of the HTML files:
_ConvertHTMLfilesFromUTF8toANSI.bat is a second Perl program which reverses each of the above 3 types of changes:
_Test.htm is an extremely short, simple HTML file used to test the 2 Perl programs.
Caution: Be sure to keep unconverted backup copies of all of the HTML files that you convert. The backups should be in a separate folder, or, better yet, on a different drive.
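One way to make such a backup is sketched below in Python. The function name and the folder names are assumptions for illustration only. The whole folder is copied to a new, time-stamped folder under a backup location of your choice, which should be outside the folder being copied.

```python
import shutil
from datetime import datetime
from pathlib import Path


def back_up_folder(source, backup_root):
    """Copy the source folder to a new time-stamped folder under backup_root."""
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{Path(source).name}-{stamp}"
    # copytree creates the target folder and fails if it already exists,
    # so an earlier backup is never overwritten.
    shutil.copytree(source, target)
    return target


# Example use with hypothetical paths:
#     back_up_folder(r"C:\MyWebSite", r"D:\Backups")
```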
B. Main Features
C. Demo #1: ANSI to UTF-8


D. Demo #2: UTF-8 to ANSI

E. Conclusion
Note: Near the beginning of the _ConvertHTMLfilesFromANSItoUTF8.bat and _ConvertHTMLfilesFromUTF8toANSI.bat files is the following:
Note: Another way to convert HTML from ANSI to UTF-8 and vice versa, albeit one file at a time, is to do it on-line using:
http://llbest.com/HTMLtoUTF8.htm.