Charsets and Surfaces

Charsets and Surfaces — Functions and macros for getting something sensible from EncaEncoding.

Synopsis

const char*         enca_charset_name                   (int charset,
                                                         EncaNameStyle whatname);
const char**        enca_get_charset_aliases            (int charset,
                                                         size_t *n);
char*               enca_get_surface_name               (EncaSurface surface,
                                                         EncaNameStyle whatname);
EncaEncoding        enca_parse_encoding_name            (const char *name);
EncaSurface         enca_charset_natural_surface        (int charset);
EncaCharsetFlags    enca_charset_properties             (int charset);
#define             enca_charset_is_7bit                (cs)
#define             enca_charset_is_8bit                (cs)
#define             enca_charset_is_16bit               (cs)
#define             enca_charset_is_32bit               (cs)
#define             enca_charset_is_fixed               (cs)
#define             enca_charset_is_variable            (cs)
#define             enca_charset_is_binary              (cs)
#define             enca_charset_is_regular             (cs)
#define             enca_charset_is_multibyte           (cs)
#define             enca_charset_is_known               (cs)

Description

Details

enca_charset_name ()

const char*         enca_charset_name                   (int charset,
                                                         EncaNameStyle whatname);

Translates the numeric charset id charset to a name in the requested naming style.

charset :

A charset id.

whatname :

The type of name you request.

Returns :

The requested charset name; NULL for an invalid whatname or charset, or when charset has no name of type whatname (ENCA_CS_UNKNOWN is a valid charset argument).
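The NULL cases above can be modelled with a small lookup table. This is only a sketch of the documented contract, not libenca code: the ids, the naming styles, the table contents, the name returned for the unknown id, and the demo_ functions are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical ids and styles for this sketch; not libenca's definitions. */
enum { DEMO_CS_UNKNOWN = -1, DEMO_CS_LATIN2 = 0, DEMO_CS_TEX = 1, DEMO_NCHARSETS = 2 };
typedef enum { DEMO_STYLE_NATIVE = 0, DEMO_STYLE_MIME = 1, DEMO_NSTYLES = 2 } DemoNameStyle;

/* One row per charset, one column per naming style; NULL marks a
 * style in which the charset has no name. */
static const char *const demo_names[DEMO_NCHARSETS][DEMO_NSTYLES] = {
    { "ISO-8859-2", "ISO-8859-2" },
    { "TeX",        NULL         },
};

static const char *demo_charset_name(int charset, DemoNameStyle whatname)
{
    if ((int)whatname < 0 || (int)whatname >= DEMO_NSTYLES)
        return NULL;                       /* invalid naming style */
    if (charset == DEMO_CS_UNKNOWN)
        return "unknown";                  /* assumed: the unknown id is a valid argument */
    if (charset < 0 || charset >= DEMO_NCHARSETS)
        return NULL;                       /* invalid charset id */
    return demo_names[charset][whatname];  /* may be NULL for this style */
}

/* Callers should be ready for NULL; this helper substitutes a fallback. */
static const char *demo_name_or(int charset, DemoNameStyle whatname,
                                const char *fallback)
{
    const char *name = demo_charset_name(charset, whatname);
    return name != NULL ? name : fallback;
}
```

The point of the wrapper is that a NULL result is an ordinary outcome (a charset may simply lack a name in some style), so it is checked rather than passed straight to a printing function.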

enca_get_charset_aliases ()

const char**        enca_get_charset_aliases            (int charset,
                                                         size_t *n);

Returns the list of accepted aliases for charset charset.

The list of aliases has to be freed by caller; the strings themselves must be considered constant and must NOT be freed.

charset :

A charset id.

n :

The number of aliases will be stored here.

Returns :

The list of aliases, with their number stored in *n; NULL for an invalid charset (*n is then set to zero).
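The ownership rule (free the list, never the strings) is easy to get wrong, so here is a minimal sketch of it. The function demo_get_aliases is a hypothetical stand-in that only mimics the documented contract; its alias strings and its notion of a valid charset are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in following the documented ownership rule:
 * the array is heap-allocated, the strings are static constants. */
static const char **demo_get_aliases(int charset, size_t *n)
{
    static const char *const names[] = { "ISO-8859-2", "latin2" };
    const size_t count = sizeof names / sizeof names[0];
    const char **list;
    size_t i;

    assert(n != NULL);
    *n = 0;
    if (charset != 0)            /* only id 0 is "valid" in this demo */
        return NULL;

    list = malloc(count * sizeof *list);
    if (list == NULL)
        return NULL;
    for (i = 0; i < count; i++)
        list[i] = names[i];
    *n = count;
    return list;
}

/* Counts aliases longer than min_len, then releases only the list. */
static size_t demo_count_long_aliases(int charset, size_t min_len)
{
    size_t n, i, hits = 0;
    const char **aliases = demo_get_aliases(charset, &n);

    for (i = 0; i < n; i++)
        if (strlen(aliases[i]) > min_len)
            hits++;
    free(aliases);               /* free the array; never free aliases[i] */
    return hits;
}
```

Because *n is zero whenever NULL is returned, the loop needs no separate NULL check, and free(NULL) is a no-op.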

enca_get_surface_name ()

char*               enca_get_surface_name               (EncaSurface surface,
                                                         EncaNameStyle whatname);

Constructs a surface name from the surface flags surface.

surface :

A surface.

whatname :

The type of name you request.

Returns :

The requested surface name; NULL for an invalid whatname; an empty string for a naming style that does not support surfaces. In all cases, the returned string must be freed by the caller when no longer used.
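Unlike charset names, the surface name is built on demand and handed over to the caller. The following sketch shows that pattern only; the flag values, the "/crlf/qp" spelling, and demo_surface_name are assumptions for illustration and are not taken from libenca.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical surface flags for this sketch only. */
enum { DEMO_SURFACE_CRLF = 1 << 0, DEMO_SURFACE_QP = 1 << 1 };

/* Builds a heap-allocated name such as "/crlf/qp" from the flags.
 * Returns an empty (but still allocated) string when no flag is set,
 * and NULL only when allocation fails. The caller frees the result. */
static char *demo_surface_name(unsigned int surface)
{
    char buf[32] = "";
    char *out;

    if (surface & DEMO_SURFACE_CRLF)
        strcat(buf, "/crlf");
    if (surface & DEMO_SURFACE_QP)
        strcat(buf, "/qp");

    out = malloc(strlen(buf) + 1);
    if (out != NULL)
        strcpy(out, buf);
    return out;
}
```

Note that even the empty string is allocated, which matches the rule that the result must be freed in all cases.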

enca_parse_encoding_name ()

EncaEncoding        enca_parse_encoding_name            (const char *name);

Transforms the encoding specification charset/surface into a numeric EncaEncoding.

When the charset name is not recognized, surfaces are not parsed at all and ENCA_CS_UNKNOWN is returned as the charset. Unrecognized surfaces, however, are considered only a minor problem: they cause the ENCA_SURFACE_UNKNOWN flag to be set in the result, alongside the flags of any recognized surfaces.

name :

An encoding specification.

Returns :

The charset/surface pair.
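The two failure modes described above behave differently, which a toy parser makes concrete. This is a sketch under stated assumptions, not the library's parser: the recognized names, the flag values, the DemoEncoding type, and the rule that further surfaces are also separated by slashes are all hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical ids and flags for this sketch only; not libenca's values. */
enum { DEMO_CS_UNKNOWN = -1, DEMO_CS_UTF8 = 0, DEMO_CS_LATIN2 = 1 };
enum {
    DEMO_SURFACE_CRLF    = 1 << 0,
    DEMO_SURFACE_LF      = 1 << 1,
    DEMO_SURFACE_UNKNOWN = 1 << 14
};

typedef struct {
    int charset;
    unsigned int surface;
} DemoEncoding;

/* True when the first len bytes of s spell exactly word. */
static int demo_same(const char *s, size_t len, const char *word)
{
    return strlen(word) == len && strncmp(s, word, len) == 0;
}

static DemoEncoding demo_parse_encoding_name(const char *name)
{
    DemoEncoding enc = { DEMO_CS_UNKNOWN, 0 };
    size_t len;

    assert(name != NULL);
    len = strcspn(name, "/");              /* charset part ends at first '/' */
    if (demo_same(name, len, "utf8"))
        enc.charset = DEMO_CS_UTF8;
    else if (demo_same(name, len, "latin2"))
        enc.charset = DEMO_CS_LATIN2;
    else
        return enc;                        /* unknown charset: surfaces not parsed */

    name += len;
    while (*name == '/') {                 /* one iteration per surface */
        name++;
        len = strcspn(name, "/");
        if (demo_same(name, len, "crlf"))
            enc.surface |= DEMO_SURFACE_CRLF;
        else if (demo_same(name, len, "lf"))
            enc.surface |= DEMO_SURFACE_LF;
        else
            enc.surface |= DEMO_SURFACE_UNKNOWN;  /* minor problem: flag it, keep going */
        name += len;
    }
    return enc;
}
```

With this sketch, "bogus/crlf" yields an unknown charset with no surface flags at all, while "utf8/crlf/zzz" keeps the CRLF flag and adds the unknown-surface flag. A caller therefore has to check both the charset and the unknown-surface bit.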

enca_charset_natural_surface ()

EncaSurface         enca_charset_natural_surface        (int charset);

Returns natural surface of a charset.

charset :

A charset.

Returns :

The requested charset's natural surface (called "implied" in recode); zero for an invalid charset or for charsets with no natural surface. The natural surface is the surface one expects for a given charset, e.g. CRLF EOLs for IBM/Microsoft charsets, CR EOLs for Macintosh charsets and LF EOLs for ISO/Unix charsets.
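The mapping in the example above can be written as a small lookup. This is purely illustrative: the family classification, the flag values, and demo_natural_surface are hypothetical, and the real function takes a charset id rather than a family.

```c
/* Hypothetical EOL flags and charset families for this sketch only. */
enum {
    DEMO_EOL_NONE = 0,
    DEMO_EOL_CR   = 1 << 0,
    DEMO_EOL_LF   = 1 << 1,
    DEMO_EOL_CRLF = 1 << 2
};

typedef enum {
    DEMO_FAMILY_ISO_UNIX,
    DEMO_FAMILY_IBM_MS,
    DEMO_FAMILY_MAC,
    DEMO_FAMILY_OTHER
} DemoFamily;

/* Returns the end-of-line surface one would expect for a family,
 * or zero when the family has no natural surface. */
static unsigned int demo_natural_surface(DemoFamily family)
{
    switch (family) {
    case DEMO_FAMILY_IBM_MS:   return DEMO_EOL_CRLF;
    case DEMO_FAMILY_MAC:      return DEMO_EOL_CR;
    case DEMO_FAMILY_ISO_UNIX: return DEMO_EOL_LF;
    default:                   return DEMO_EOL_NONE;
    }
}
```

Since zero means both "invalid" and "no natural surface", a caller that needs to tell the two apart has to validate the charset separately.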

enca_charset_properties ()

EncaCharsetFlags    enca_charset_properties             (int charset);

Returns charset properties.

charset :

A charset.

Returns :

The requested charset properties; zero for invalid charset.

enca_charset_is_7bit()

#define             enca_charset_is_7bit(cs)

Expands to nonzero when characters are represented with 7bit units.

cs :

Charset id.

enca_charset_is_8bit()

#define             enca_charset_is_8bit(cs)

Expands to nonzero when characters are represented with bytes.

cs :

Charset id.

enca_charset_is_16bit()

#define             enca_charset_is_16bit(cs)

Expands to nonzero when characters are represented with 2byte words.

cs :

Charset id.

enca_charset_is_32bit()

#define             enca_charset_is_32bit(cs)

Expands to nonzero when characters are represented with 4byte words.

cs :

Charset id.

enca_charset_is_fixed()

#define             enca_charset_is_fixed(cs)

Expands to nonzero when each character consists of exactly one fundamental piece.

cs :

Charset id.

enca_charset_is_variable()

#define             enca_charset_is_variable(cs)

Expands to nonzero when one character consists of a variable number of fundamental pieces.

cs :

Charset id.

enca_charset_is_binary()

#define             enca_charset_is_binary(cs)

Expands to nonzero when the charset is binary from the ASCII viewpoint.

cs :

Charset id.

enca_charset_is_regular()

#define             enca_charset_is_regular(cs)

Expands to nonzero when the charset is a language-dependent (8bit) charset.

cs :

Charset id.

enca_charset_is_multibyte()

#define             enca_charset_is_multibyte(cs)

Expands to nonzero when charset is multibyte.

cs :

Charset id.

enca_charset_is_known()

#define             enca_charset_is_known(cs)

Expands to nonzero when the charset is known (i.e. it's not ENCA_CS_UNKNOWN).

cs :

Charset id.
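Predicate macros of this kind are typically thin tests on the flag word returned by a properties function. The sketch below shows that shape under the assumption that the macros above work this way; the flag values, the property table, and every demo_ name are hypothetical and do not reproduce the definitions in enca.h.

```c
#include <stddef.h>

/* Hypothetical flag bits; the real EncaCharsetFlags values may differ. */
enum {
    DEMO_CHARSET_7BIT      = 1 << 0,
    DEMO_CHARSET_8BIT      = 1 << 1,
    DEMO_CHARSET_FIXED     = 1 << 2,
    DEMO_CHARSET_VARIABLE  = 1 << 3,
    DEMO_CHARSET_MULTIBYTE = 1 << 4
};
enum { DEMO_CS_UNKNOWN = -1 };

/* Illustrative properties: id 0 is ASCII-like, id 1 is UTF-8-like. */
static const unsigned int demo_props[] = {
    DEMO_CHARSET_7BIT | DEMO_CHARSET_FIXED,
    DEMO_CHARSET_8BIT | DEMO_CHARSET_VARIABLE | DEMO_CHARSET_MULTIBYTE
};

static unsigned int demo_charset_properties(int cs)
{
    const size_t n = sizeof demo_props / sizeof demo_props[0];

    if (cs < 0 || (size_t)cs >= n)
        return 0;                      /* zero for an invalid charset */
    return demo_props[cs];
}

/* Each predicate masks one bit, so it expands to nonzero or zero. */
#define demo_charset_is_7bit(cs) \
    (demo_charset_properties(cs) & DEMO_CHARSET_7BIT)
#define demo_charset_is_fixed(cs) \
    (demo_charset_properties(cs) & DEMO_CHARSET_FIXED)
#define demo_charset_is_variable(cs) \
    (demo_charset_properties(cs) & DEMO_CHARSET_VARIABLE)
#define demo_charset_is_multibyte(cs) \
    (demo_charset_properties(cs) & DEMO_CHARSET_MULTIBYTE)
/* "Known" needs no flag lookup, only a comparison with the unknown id. */
#define demo_charset_is_known(cs) ((cs) != DEMO_CS_UNKNOWN)
```

Because the results are masked bits rather than 0 or 1, they should be used as truth values (if (demo_charset_is_fixed(cs)) ...) and not compared against 1. An invalid id makes every flag predicate zero, since its property word is zero.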