Write text files without Byte Order Mark (BOM)?

I am trying to create a text file in VB.NET with UTF-8 encoding but without a BOM. How can I do this? I can write the file with UTF-8 encoding, but how do I remove the byte order mark from it?

Edit 1: I have tried code like this:

    Dim utf8 As New UTF8Encoding()
    Dim utf8EmitBOM As New UTF8Encoding(True)
    Dim strW As New StreamWriter("c:\temp\bom\1.html", True, utf8EmitBOM)
    strW.Write(utf8EmitBOM.GetPreamble())
    strW.WriteLine("hi there")
    strW.Close()

    Dim strw2 As New StreamWriter("c:\temp\bom\2.html", True, utf8)
    strw2.Write(utf8.GetPreamble())
    strw2.WriteLine("hi there")
    strw2.Close()

1.html gets created with UTF-8 encoding only, and 2.html gets created in ANSI encoding format.

Simplified approach - http://whatilearnttuday.blogspot.com/2011/10/write-text-files-without-byte-order.html

Answers


In order to omit the byte order mark (BOM), your stream must use an instance of UTF8Encoding other than System.Text.Encoding.UTF8 (which is configured to generate a BOM). There are two easy ways to do this:

1. Explicitly specifying a suitable encoding:

  1. Call the UTF8Encoding constructor with False for the encoderShouldEmitUTF8Identifier parameter.

  2. Pass the UTF8Encoding instance to the stream constructor.

    ' VB.NET:
    Dim utf8WithoutBom As New System.Text.UTF8Encoding(False)
    Using sink As New StreamWriter("Foobar.txt", False, utf8WithoutBom)
        sink.WriteLine("...")
    End Using

    // C#:
    var utf8WithoutBom = new System.Text.UTF8Encoding(false);
    using (var sink = new StreamWriter("Foobar.txt", false, utf8WithoutBom))
    {
        sink.WriteLine("...");
    }
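One way to confirm the result is to read the file back and inspect its first three bytes. The following is a minimal sketch rather than a definitive implementation: the module name, the `StartsWithUtf8Bom` helper, and the temporary file name are hypothetical and chosen only for illustration. It writes the same text once without and once with a BOM and reports whether the file starts with the UTF-8 BOM bytes EF BB BF.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BomCheck
    ' Hypothetical helper: True when the file starts with the UTF-8 BOM bytes EF BB BF.
    Function StartsWithUtf8Bom(filePath As String) As Boolean
        Dim bytes As Byte() = File.ReadAllBytes(filePath)
        Return bytes.Length >= 3 AndAlso
               bytes(0) = &HEF AndAlso bytes(1) = &HBB AndAlso bytes(2) = &HBF
    End Function

    Sub Main()
        ' Assumed scratch file in the temp directory, used only for this demonstration.
        Dim filePath As String = Path.Combine(Path.GetTempPath(), "bom-check.txt")

        ' UTF8Encoding(False): the encoder does not emit the UTF-8 identifier.
        Using sink As New StreamWriter(filePath, False, New UTF8Encoding(False))
            sink.WriteLine("hi there")
        End Using
        Console.WriteLine(StartsWithUtf8Bom(filePath)) ' False

        ' UTF8Encoding(True): the file is overwritten and now starts with the BOM.
        Using sink As New StreamWriter(filePath, False, New UTF8Encoding(True))
            sink.WriteLine("hi there")
        End Using
        Console.WriteLine(StartsWithUtf8Bom(filePath)) ' True
    End Sub
End Module
```

Checking the raw bytes avoids relying on what a text editor reports, since an editor may label a BOM-less file that contains only ASCII characters as ANSI.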

2. Using the default encoding:

If you do not supply an Encoding to StreamWriter's constructor at all, StreamWriter will by default use a UTF-8 encoding without a BOM, so the following should work just as well:

    ' VB.NET:
    Using sink As New StreamWriter("Foobar.txt")
        sink.WriteLine("...")
    End Using

    // C#:
    using (var sink = new StreamWriter("Foobar.txt"))
    {
        sink.WriteLine("...");
    }
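The default can be checked by asking the writer for its encoding and looking at the length of that encoding's preamble. This is a small sketch under the assumption that a scratch file in the temp directory is acceptable; the module and file names are hypothetical.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module DefaultEncodingCheck
    Sub Main()
        ' Assumed scratch file, used only for this demonstration.
        Dim filePath As String = Path.Combine(Path.GetTempPath(), "default-encoding.txt")

        Using sink As New StreamWriter(filePath)
            ' The writer's default encoding is UTF-8 with an empty preamble (no BOM).
            Console.WriteLine(sink.Encoding.WebName)              ' utf-8
            Console.WriteLine(sink.Encoding.GetPreamble().Length) ' 0
        End Using

        ' For comparison, Encoding.UTF8 carries the three-byte BOM as its preamble.
        Console.WriteLine(Encoding.UTF8.GetPreamble().Length)     ' 3
    End Sub
End Module
```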

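Finally, note that omitting the BOM is generally only safe for UTF-8. In UTF-16 the BOM indicates the byte order, so a reader that is not told the byte order by other means needs it.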
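To illustrate why the BOM carries more information in UTF-16, the sketch below prints the preamble bytes of the little-endian and big-endian UTF-16 encodings next to the UTF-8 one. The module name is hypothetical; the encodings are the standard System.Text classes.

```vbnet
Imports System
Imports System.Text

Module PreambleDemo
    Sub Main()
        ' UnicodeEncoding(bigEndian, byteOrderMark): the two byte orders have different BOMs.
        Dim utf16Le As New UnicodeEncoding(False, True)
        Dim utf16Be As New UnicodeEncoding(True, True)

        Console.WriteLine(BitConverter.ToString(utf16Le.GetPreamble()))       ' FF-FE
        Console.WriteLine(BitConverter.ToString(utf16Be.GetPreamble()))       ' FE-FF

        ' UTF-8 has a single byte order, so its BOM only acts as a signature.
        Console.WriteLine(BitConverter.ToString(Encoding.UTF8.GetPreamble())) ' EF-BB-BF
    End Sub
End Module
```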

