u: Unicode


Unicode	u: _ _ _	Unicode

Unicode (2-byte) characters are a new datatype. Arrays of them are created by the verb u: , and existing verbs are extended to work on them.

The monad u: applies to several kinds of arguments:

Argument	Result
1-byte characters	same as `2&u:`
2-byte characters	copy of argument
integers	same as `4&u:`

The inverse of the monad u: is 3&u: .
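
The dispatch in the monad table can be modeled outside J. The following Python sketch is an illustration only, not J's implementation: the class `Wide` and the functions `u_monad` and `u3` are hypothetical names, 1-byte characters are represented as `bytes`, and integers as a list of ints.

```python
class Wide(tuple):
    """Hypothetical stand-in for an array of 2-byte characters."""


def u_monad(y):
    """Model of monad u: following the argument/result table."""
    if isinstance(y, Wide):
        # 2-byte characters: copy of argument
        return Wide(y)
    if isinstance(y, (bytes, bytearray)):
        # 1-byte characters: same as 2&u: (high-order bytes are 0)
        return Wide(y)
    # integers: same as 4&u: (must be from 0 to 65535)
    if any(not 0 <= i <= 65535 for i in y):
        raise ValueError("integers must be from 0 to 65535")
    return Wide(y)


def u3(w):
    """Model of 3&u: 2-byte characters to integers."""
    return list(w)


w = u_monad(b"We the people")
# Applying the monad to the result of 3&u: gives back the same array.
assert u_monad(u3(w)) == w
```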

The dyad u: takes a scalar integer left argument and applies to several kinds of arguments:

Left	Right	Result
`1`	2-byte characters	1-byte characters; high order bytes are discarded
`2`	1-byte characters	2-byte characters; high order bytes are 0
`3`	2-byte characters	integers
`4`	integers	2-byte characters; integers must be from 0 to 65535
`5`	2-byte characters	1-byte characters; high order bytes must be 0 (and are discarded)
`6`	1-byte characters	2-byte characters; pairs of 1-byte characters are converted to 2-byte characters

1&u: and 2&u: are an inverse pair, as are 3&u: and 4&u: .
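
The six conversions and the inverse pairs can be checked with a small Python model. This is a sketch under stated assumptions, not J's implementation: the names `u1` to `u6` are hypothetical, 1-byte characters are `bytes`, and 2-byte characters and integers are both lists of ints. The byte order used by `u6` and its rejection of an odd number of bytes are assumptions, since the table states neither.

```python
def u1(wide):
    """1 u: keep the low-order byte of each character."""
    return bytes(c & 0xFF for c in wide)


def u2(narrow):
    """2 u: widen each byte; the high-order byte becomes 0."""
    return list(narrow)


def u3(wide):
    """3 u: 2-byte characters to integers."""
    return list(wide)


def u4(ints):
    """4 u: integers to 2-byte characters, with a range check."""
    if any(not 0 <= i <= 65535 for i in ints):
        raise ValueError("integers must be from 0 to 65535")
    return list(ints)


def u5(wide):
    """5 u: like 1 u: but a nonzero high-order byte is an error."""
    if any(c > 0xFF for c in wide):
        raise ValueError("high order bytes must be 0")
    return bytes(wide)


def u6(narrow, byteorder="little"):
    """6 u: fuse each pair of bytes into one 2-byte character.

    Assumptions: little-endian pairs, odd lengths rejected.
    """
    if len(narrow) % 2:
        raise ValueError("need an even number of 1-byte characters")
    return [int.from_bytes(narrow[i:i + 2], byteorder)
            for i in range(0, len(narrow), 2)]


# 1 u: undoes 2 u: for any 1-byte characters.
assert u1(u2(b"abc")) == b"abc"
# 2 u: undoes 1 u: only when the high-order bytes were already 0.
assert u2(u1([97, 98])) == [97, 98]
assert u2(u1([97 + 256])) != [97 + 256]
# 3 u: and 4 u: undo each other over the whole range 0 to 65535.
assert u3(u4([0, 97, 65535])) == [0, 97, 65535]
```

In this model the first pair is an inverse in one direction only: 1 u: loses information whenever a high-order byte is nonzero, which is the case that 5 u: rejects.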

2-byte characters cannot be entered from the keyboard. An array x of 2-byte characters is displayed as 1 u: x , that is, with the high-order byte of each character discarded.
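
A consequence of this display rule is that distinct 2-byte characters can look identical. The Python sketch below models only the rule as stated; the function name `display` is hypothetical, 2-byte characters are represented as a list of ints, and decoding the bytes as Latin-1 is an assumption made so that every byte value has a printable stand-in.

```python
def display(wide):
    """Model of the display rule: show 1 u: x, dropping high-order bytes."""
    return bytes(c & 0xFF for c in wide).decode("latin-1")


# Four different 2-byte characters whose low-order byte is 97 ('a').
codes = [97 + k for k in (0, 256, 512, 1024)]

shown = display(codes)                          # all four display as 'a'
equal_a = [int(c == ord("a")) for c in codes]   # only the first equals 'a'

assert shown == "aaaa"
assert equal_a == [1, 0, 0, 0]
```

Equality compares the full 2-byte value, so characters that display alike need not match.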

Examples:

   ] t=: u: 'We the people' 
We the people
   3!:0 t
131072                         The numeric code of the unicode datatype is 131072.

   u: 97 98 99 +/ 0 256 512 1024
aaaa                           2-byte characters have the same
bbbb                           display as 1-byte characters
cccc 

   'a' = u: 97 + 0 256 512 1024
1 0 0 0

   ] t=: (2 4$'abcdefgh') , u: 'wxyz'
abcd                           1- and 2-byte characters can be catenated together.
efgh                           The 1-byte characters are promoted.
wxyz
   3!:0 t
131072
