UTF-8

A popular multibyte encoding of Unicode. For example, the locale data

  $codecvt_wide
    UTF-8
  

specifies that codecvt_byname<wchar_t, char, mbstate_t> will implement the UTF-8 encoding scheme. If this data is in a file called "en_US", then the following program can be used to output a wchar_t string in UTF-8 to a file:

Listing: Example of Writing a wchar_t String in UTF-8 to a File
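#include <locale>
#include <fstream>

int main()
{
   // Load the locale data file named "en_US"
   std::locale loc("en_US");
   std::wofstream out;
   // Imbue before opening so the file uses the locale's codecvt facet
   out.imbue(loc);
   out.open("test.dat");
   // \x00DF is U+00DF (LATIN SMALL LETTER SHARP S)
   out << L"This is a test \x00DF";
}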

The binary contents of the file are (in hex):

  54 68 69 73 20 69 73 20 61 20 74 65 73 74 20 C3 9F
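The last two bytes, C3 9F, are the UTF-8 form of the single character U+00DF, while the ASCII characters before it each take one byte. As a minimal sketch of how that mapping works (not the library's actual codecvt implementation), the function below encodes one code point by hand. The name encode_utf8 is hypothetical and used only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of the library): encode one Unicode
// code point as a sequence of 1 to 4 UTF-8 bytes.
std::string encode_utf8(unsigned long cp)
{
    assert(cp <= 0x10FFFFul);  // outside the Unicode range otherwise
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

Calling encode_utf8(0xDF) yields the two bytes C3 9F, and encode_utf8(0x54) yields the single byte 54, matching the file contents shown above.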
  

Without the UTF-8 encoding, the default encoding takes over (every byte of each wchar_t is written, in native byte order):

  #include <fstream>
  int main()
  {
     std::wofstream out("test.dat");
     out << L"This is a test \x00DF";
  }
  

On a big-endian machine with a 2-byte wchar_t, the resulting file in hex is:

  00 54 00 68 00 69 00 73 00 20 00 69 00 73 00 20
  00 61 00 20 00 74 00 65 00 73 00 74 00 20 00 DF
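To make the byte layout concrete, here is a small sketch that reproduces it by hand. It assumes each character fits in 16 bits and always writes the high byte first, as a big-endian machine would. The name to_big_endian_16 is hypothetical, and this illustrates the described output only, not the library's default codecvt facet.

```cpp
#include <string>

// Hypothetical helper (not part of the library): serialize each wide
// character as two bytes, high byte first, keeping only the low 16 bits.
std::string to_big_endian_16(const std::wstring& text)
{
    std::string bytes;
    for (std::wstring::size_type i = 0; i < text.size(); ++i) {
        unsigned long unit = static_cast<unsigned long>(text[i]) & 0xFFFFul;
        bytes += static_cast<char>((unit >> 8) & 0xFF);  // high byte
        bytes += static_cast<char>(unit & 0xFF);         // low byte
    }
    return bytes;
}
```

For the string L"This is a test \x00DF" this produces 32 bytes: each ASCII character is preceded by a 00 byte, and the final character becomes 00 DF rather than the two-byte UTF-8 sequence C3 9F.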