Case Transformation

Case transformation is usually handled by a table that maps each character to itself, except for the characters being transformed, which map to their transformed counterparts. For example, a lower case map might look like:

  lower['a'] == 'a'
  lower['A'] == 'a'
  

Case mappings are represented in the ctype data as two tables: lower and upper. You can start a map by first specifying that all characters map to themselves:

  lower['\0' - '\xFF'] = '\0' - '\xFF'
  

You can then override a subrange of this table to specify that 'A' - 'Z' maps to 'a' - 'z':

  lower['A' - 'Z']     = 'a' - 'z'
  

These two statements completely specify the lower case mapping for an 8-bit char. The upper case table is similar. For example, here is the specification for upper case mapping of a 16-bit wchar_t in the "C" locale:

  upper['\0' - '\xFFFF'] = '\0' - '\xFFFF'
  upper['a' - 'z']       = 'A' - 'Z'
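
The identity-then-override pattern used in these statements can be sketched in C++ as follows. This is only an illustration, not the library's implementation: CaseMap, identity_map, map_range, make_lower, and make_upper are hypothetical names, the table covers an 8-bit char only, and the code assumes an ASCII-compatible character set in which 'A' - 'Z' and 'a' - 'z' are contiguous.

```cpp
#include <array>
#include <cstddef>

// Hypothetical type: one table entry per 8-bit character value.
using CaseMap = std::array<unsigned char, 256>;

// Equivalent of  map['\0' - '\xFF'] = '\0' - '\xFF':
// every character initially maps to itself.
inline CaseMap identity_map() {
    CaseMap m{};
    for (std::size_t i = 0; i < m.size(); ++i)
        m[i] = static_cast<unsigned char>(i);
    return m;
}

// Equivalent of  map[src_first - src_last] = dst_first - ...:
// override a subrange so it maps onto the range starting at dst_first.
inline void map_range(CaseMap& m, unsigned char src_first,
                      unsigned char src_last, unsigned char dst_first) {
    for (unsigned c = src_first; c <= src_last; ++c)
        m[c] = static_cast<unsigned char>(dst_first + (c - src_first));
}

// lower: identity, then 'A' - 'Z' mapped to 'a' - 'z' (assumes ASCII).
inline CaseMap make_lower() {
    CaseMap m = identity_map();
    map_range(m, 'A', 'Z', 'a');
    return m;
}

// upper: identity, then 'a' - 'z' mapped to 'A' - 'Z' (assumes ASCII).
inline CaseMap make_upper() {
    CaseMap m = identity_map();
    map_range(m, 'a', 'z', 'A');
    return m;
}
```

With these tables, a lookup such as make_lower()['A'] yields 'a', while any character outside the overridden subrange is returned unchanged.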
  

Below is the complete "C" locale specification for both ctype_byname<char> and ctype_byname<wchar_t>. Note that no "C" data file actually exists. However, a locale data file containing this information would produce the same behavior as the "C" locale.

Listing: Example of "C" Locale
$ctype_narrow
ctype['\x00' - '\x08'] = cntrl
ctype['\x09']          = cntrl | space | blank
ctype['\x0A' - '\x0D'] = cntrl | space
ctype['\x0E' - '\x1F'] = cntrl
ctype['\x20']          = space | blank | print
ctype['\x21' - '\x2F'] = punct | graph | print
ctype['\x30' - '\x39'] = digit | xdigit | graph | print
ctype['\x3A' - '\x40'] = punct | graph | print
ctype['\x41' - '\x46'] = xdigit | upper | alpha | graph | print
ctype['\x47' - '\x5A'] = upper | alpha | graph | print
ctype['\x5B' - '\x60'] = punct | graph | print
ctype['\x61' - '\x66'] = xdigit | lower | alpha | graph | print
ctype['\x67' - '\x7A'] = lower | alpha | graph | print
ctype['\x7B' - '\x7E'] = punct | graph | print
ctype['\x7F']          = cntrl

lower['\0' - '\xFF'] = '\0' - '\xFF'
lower['A' - 'Z']     = 'a' - 'z'

upper['\0' - '\xFF'] = '\0' - '\xFF'
upper['a' - 'z']     = 'A' - 'Z'

$ctype_wide
ctype['\x00' - '\x08'] = cntrl
ctype['\x09']          = cntrl | space | blank
ctype['\x0A' - '\x0D'] = cntrl | space
ctype['\x0E' - '\x1F'] = cntrl
ctype['\x20']          = space | blank | print
ctype['\x21' - '\x2F'] = punct | graph | print
ctype['\x30' - '\x39'] = digit | xdigit | graph | print
ctype['\x3A' - '\x40'] = punct | graph | print
ctype['\x41' - '\x46'] = xdigit | upper | alpha | graph | print
ctype['\x47' - '\x5A'] = upper | alpha | graph | print
ctype['\x5B' - '\x60'] = punct | graph | print
ctype['\x61' - '\x66'] = xdigit | lower | alpha | graph | print
ctype['\x67' - '\x7A'] = lower | alpha | graph | print
ctype['\x7B' - '\x7E'] = punct | graph | print
ctype['\x7F']          = cntrl

lower['\0' - '\xFFFF'] = '\0' - '\xFFFF'
lower['A' - 'Z']       = 'a' - 'z'

upper['\0' - '\xFFFF'] = '\0' - '\xFFFF'
upper['a' - 'z']       = 'A' - 'Z'
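
One way to compare a custom data file against the "C" locale is to query the standard ctype facet of the classic locale and check a few entries from the listing above. The sketch below uses only standard C++ facilities (std::locale::classic, std::use_facet, std::ctype<char>). The wrapper names c_ctype, c_lower, c_upper, and c_is are hypothetical, and the code assumes an ASCII-compatible execution character set.

```cpp
#include <locale>

// Fetch the narrow ctype facet of the classic ("C") locale.
inline const std::ctype<char>& c_ctype() {
    return std::use_facet<std::ctype<char>>(std::locale::classic());
}

// Case mapping through the facet; corresponds to the lower and upper tables.
inline char c_lower(char c) { return c_ctype().tolower(c); }
inline char c_upper(char c) { return c_ctype().toupper(c); }

// Classification through the facet; corresponds to the ctype table.
// Returns true when c has at least one of the bits set in mask.
inline bool c_is(std::ctype_base::mask mask, char c) {
    return c_ctype().is(mask, c);
}
```

For example, c_lower('A') is expected to return 'a', c_is(std::ctype_base::xdigit, 'F') to be true, and c_is(std::ctype_base::xdigit, 'G') to be false, matching the '\x41' - '\x46' and '\x47' - '\x5A' rows of the listing.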