Typedefs and Constants

Typedefs and Constants — Enca library typedefs, enums and constants.

Synopsis

                    EncaEncoding;
#define             ENCA_CS_UNKNOWN
enum                EncaSurface;
enum                EncaCharsetFlags;
enum                EncaNameStyle;
enum                EncaErrno;
#define             ENCA_NOT_A_CHAR

Description

Details

EncaEncoding

typedef struct {
  int charset;
  EncaSurface surface;
} EncaEncoding;

Encoding, i.e. charset and surface.

This is what enca_analyse() and enca_analyse_const() return.

The charset field is an opaque numerical charset identifier that has no meaning outside the Enca library. You will probably want to use it only as an argument to enca_charset_name(). It is only guaranteed not to change meaning during program execution; a change in its interpretation (e.g. due to the addition of new charsets) is not considered an API change.

The surface field is a combination of EncaSurface flags. You may want to ignore it completely; in that case, use enca_set_interpreted_surfaces() to disable weird surfaces.

int charset;

Numeric charset identifier.

EncaSurface surface;

Surface flags.

ENCA_CS_UNKNOWN

#define ENCA_CS_UNKNOWN (-1)

Unknown character set id.

Use enca_charset_is_known() to check for an unknown charset instead of a direct comparison.
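
The check described above can be sketched as follows. This is a minimal illustration under stated assumptions: it compiles on its own, so ENCA_CS_UNKNOWN is mirrored from the definition above, DemoEncoding is a hypothetical stand-in for EncaEncoding (with the surface simplified to an int), and charset_is_known() is a hypothetical stand-in for enca_charset_is_known(). Real code would include the Enca header and use the library's own definitions.

```c
#include <stdbool.h>

/* Mirrored from the definition above so the sketch compiles on its own;
 * real code would take this from the Enca header. */
#define ENCA_CS_UNKNOWN (-1)

/* Hypothetical stand-in for EncaEncoding. */
typedef struct {
    int charset;  /* opaque charset identifier */
    int surface;  /* surface flags, simplified to int in this sketch */
} DemoEncoding;

/* Hypothetical stand-in for enca_charset_is_known(). */
bool charset_is_known(int charset)
{
    return charset != ENCA_CS_UNKNOWN;
}

/* Returns true when an analysis result carries a usable charset. */
bool encoding_is_usable(DemoEncoding enc)
{
    return charset_is_known(enc.charset);
}
```

Keeping the comparison behind one function means callers never depend on the numeric value of the unknown-charset identifier.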


enum EncaSurface

typedef enum { /*< flags >*/
  ENCA_SURFACE_EOL_CR    = 1 << 0,
  ENCA_SURFACE_EOL_LF    = 1 << 1,
  ENCA_SURFACE_EOL_CRLF  = 1 << 2,
  ENCA_SURFACE_EOL_MIX   = 1 << 3,
  ENCA_SURFACE_EOL_BIN   = 1 << 4,
  ENCA_SURFACE_MASK_EOL  = (ENCA_SURFACE_EOL_CR
                            | ENCA_SURFACE_EOL_LF
                            | ENCA_SURFACE_EOL_CRLF
                            | ENCA_SURFACE_EOL_MIX
                            | ENCA_SURFACE_EOL_BIN),
  ENCA_SURFACE_PERM_21    = 1 << 5,
  ENCA_SURFACE_PERM_4321  = 1 << 6,
  ENCA_SURFACE_PERM_MIX   = 1 << 7,
  ENCA_SURFACE_MASK_PERM  = (ENCA_SURFACE_PERM_21
                             | ENCA_SURFACE_PERM_4321
                             | ENCA_SURFACE_PERM_MIX),
  ENCA_SURFACE_QP        = 1 << 8,
  ENCA_SURFACE_REMOVE    = 1 << 13,
  ENCA_SURFACE_UNKNOWN   = 1 << 14,
  ENCA_SURFACE_MASK_ALL  = (ENCA_SURFACE_MASK_EOL
                            | ENCA_SURFACE_MASK_PERM
                            | ENCA_SURFACE_QP
                            | ENCA_SURFACE_REMOVE)
} EncaSurface;

Surface flags.

ENCA_SURFACE_EOL_CR

Ends of lines are represented with CRs.

ENCA_SURFACE_EOL_LF

Ends of lines are represented with LFs.

ENCA_SURFACE_EOL_CRLF

Ends of lines are represented with CRLFs.

ENCA_SURFACE_EOL_MIX

Several end-of-line types, mixed.

ENCA_SURFACE_EOL_BIN

End-of-line concept not applicable (binary data).

ENCA_SURFACE_MASK_EOL

Mask for end-of-line surfaces.

ENCA_SURFACE_PERM_21

Odd and even bytes swapped.

ENCA_SURFACE_PERM_4321

Reversed byte sequence in 4-byte words.

ENCA_SURFACE_PERM_MIX

Chunks of both endiannesses, concatenated.

ENCA_SURFACE_MASK_PERM

Mask for permutation surfaces.

ENCA_SURFACE_QP

Quoted-printable encoding.

ENCA_SURFACE_REMOVE

Recode `remove' surface.

ENCA_SURFACE_UNKNOWN

Unknown surface.

ENCA_SURFACE_MASK_ALL

Mask for all bits, without ENCA_SURFACE_UNKNOWN.
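
Because the surface value is a bit field, the masks can be used to pick out one group of flags at a time. The following is a minimal sketch, not library code: the constants are mirrored from the listing above so the block compiles on its own (real code would include the Enca header), and eol_label() and has_byte_permutation() are hypothetical helper names.

```c
/* Subset of the EncaSurface values, mirrored from the listing above. */
enum {
    ENCA_SURFACE_EOL_CR    = 1 << 0,
    ENCA_SURFACE_EOL_LF    = 1 << 1,
    ENCA_SURFACE_EOL_CRLF  = 1 << 2,
    ENCA_SURFACE_EOL_MIX   = 1 << 3,
    ENCA_SURFACE_EOL_BIN   = 1 << 4,
    ENCA_SURFACE_MASK_EOL  = (ENCA_SURFACE_EOL_CR | ENCA_SURFACE_EOL_LF
                              | ENCA_SURFACE_EOL_CRLF | ENCA_SURFACE_EOL_MIX
                              | ENCA_SURFACE_EOL_BIN),
    ENCA_SURFACE_PERM_21   = 1 << 5,
    ENCA_SURFACE_PERM_4321 = 1 << 6,
    ENCA_SURFACE_PERM_MIX  = 1 << 7,
    ENCA_SURFACE_MASK_PERM = (ENCA_SURFACE_PERM_21 | ENCA_SURFACE_PERM_4321
                              | ENCA_SURFACE_PERM_MIX),
    ENCA_SURFACE_QP        = 1 << 8
};

/* Hypothetical helper: label for the end-of-line part of a surface value. */
const char *eol_label(unsigned surface)
{
    switch (surface & ENCA_SURFACE_MASK_EOL) {
    case 0:                     return "none";
    case ENCA_SURFACE_EOL_CR:   return "CR";
    case ENCA_SURFACE_EOL_LF:   return "LF";
    case ENCA_SURFACE_EOL_CRLF: return "CRLF";
    case ENCA_SURFACE_EOL_MIX:  return "mixed";
    case ENCA_SURFACE_EOL_BIN:  return "binary";
    default:                    return "several EOL flags set";
    }
}

/* Hypothetical helper: nonzero when any byte-permutation flag is set. */
int has_byte_permutation(unsigned surface)
{
    return (surface & ENCA_SURFACE_MASK_PERM) != 0;
}
```

Masking before the switch keeps unrelated flags, such as ENCA_SURFACE_QP, from affecting the end-of-line label.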

enum EncaCharsetFlags

typedef enum { /*< flags >*/
  ENCA_CHARSET_7BIT      = 1 << 0,
  ENCA_CHARSET_8BIT      = 1 << 1,
  ENCA_CHARSET_16BIT     = 1 << 2,
  ENCA_CHARSET_32BIT     = 1 << 3,
  ENCA_CHARSET_FIXED     = 1 << 4,
  ENCA_CHARSET_VARIABLE  = 1 << 5,
  ENCA_CHARSET_BINARY    = 1 << 6,
  ENCA_CHARSET_REGULAR   = 1 << 7,
  ENCA_CHARSET_MULTIBYTE = 1 << 8
} EncaCharsetFlags;

Charset properties.

Flags ENCA_CHARSET_7BIT, ENCA_CHARSET_8BIT, ENCA_CHARSET_16BIT, ENCA_CHARSET_32BIT tell how many bits a `fundamental piece' consists of. This is different from bits per character; e.g. UTF-8 consists of 8-bit pieces (bytes), but a character can be composed of 1 to 6 of them.

ENCA_CHARSET_7BIT

Characters are represented with 7-bit units.

ENCA_CHARSET_8BIT

Characters are represented with bytes.

ENCA_CHARSET_16BIT

Characters are represented with 2-byte words.

ENCA_CHARSET_32BIT

Characters are represented with 4-byte words.

ENCA_CHARSET_FIXED

One character consists of one fundamental piece.

ENCA_CHARSET_VARIABLE

One character consists of a variable number of fundamental pieces.

ENCA_CHARSET_BINARY

Charset is binary from ASCII viewpoint.

ENCA_CHARSET_REGULAR

Language-dependent (8-bit) charset.

ENCA_CHARSET_MULTIBYTE

Multibyte charset.
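
The distinction between piece width and character width can be sketched as below. This is only an illustration: the flag values are mirrored from the listing above so the block compiles on its own, piece_bits() and is_variable_width() are hypothetical helpers, and the flag combination described as UTF-8-like is an assumption made for the example, not a value read from the library.

```c
/* EncaCharsetFlags values, mirrored from the listing above. */
enum {
    ENCA_CHARSET_7BIT      = 1 << 0,
    ENCA_CHARSET_8BIT      = 1 << 1,
    ENCA_CHARSET_16BIT     = 1 << 2,
    ENCA_CHARSET_32BIT     = 1 << 3,
    ENCA_CHARSET_FIXED     = 1 << 4,
    ENCA_CHARSET_VARIABLE  = 1 << 5,
    ENCA_CHARSET_BINARY    = 1 << 6,
    ENCA_CHARSET_REGULAR   = 1 << 7,
    ENCA_CHARSET_MULTIBYTE = 1 << 8
};

/* Hypothetical helper: width in bits of one fundamental piece,
 * or 0 when no width flag is set. */
int piece_bits(unsigned flags)
{
    if (flags & ENCA_CHARSET_7BIT)  return 7;
    if (flags & ENCA_CHARSET_8BIT)  return 8;
    if (flags & ENCA_CHARSET_16BIT) return 16;
    if (flags & ENCA_CHARSET_32BIT) return 32;
    return 0;
}

/* Hypothetical helper: nonzero when one character may span
 * several fundamental pieces. */
int is_variable_width(unsigned flags)
{
    return (flags & ENCA_CHARSET_VARIABLE) != 0;
}
```

For an assumed UTF-8-like combination such as ENCA_CHARSET_8BIT | ENCA_CHARSET_VARIABLE | ENCA_CHARSET_MULTIBYTE, piece_bits() reports 8 while is_variable_width() reports that a character may take more than one piece.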

enum EncaNameStyle

typedef enum {
  ENCA_NAME_STYLE_ENCA,
  ENCA_NAME_STYLE_RFC1345,
  ENCA_NAME_STYLE_CSTOCS,
  ENCA_NAME_STYLE_ICONV,
  ENCA_NAME_STYLE_HUMAN,
  ENCA_NAME_STYLE_MIME
} EncaNameStyle;

Charset naming styles and conventions.

ENCA_NAME_STYLE_ENCA

Default, implicit charset name in Enca.

ENCA_NAME_STYLE_RFC1345

RFC 1345 or otherwise canonical charset name.

ENCA_NAME_STYLE_CSTOCS

Cstocs charset name (may not exist).

ENCA_NAME_STYLE_ICONV

Iconv charset name (may not exist).

ENCA_NAME_STYLE_HUMAN

Human comprehensible description.

ENCA_NAME_STYLE_MIME

Preferred MIME name (may not exist).

enum EncaErrno

typedef enum {
  ENCA_EOK = 0,
  ENCA_EINVALUE,
  ENCA_EEMPTY,
  ENCA_EFILTERED,
  ENCA_ENOCS8,
  ENCA_ESIGNIF,
  ENCA_EWINNER,
  ENCA_EGARBAGE
} EncaErrno;

Error codes.

ENCA_EOK

OK.

ENCA_EINVALUE

Invalid value (usually of an option).

ENCA_EEMPTY

Sample is empty.

ENCA_EFILTERED

After filtering, (almost) nothing remained.

ENCA_ENOCS8

Multibyte tests failed and the language contains no 8-bit charsets.

ENCA_ESIGNIF

Too few significant characters.

ENCA_EWINNER

No clear winner.

ENCA_EGARBAGE

Sample is garbage.
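
As a quick reference, the codes and their descriptions above can be collected into one lookup. This is a minimal sketch under stated assumptions: the enum is mirrored from the listing so the block compiles on its own, and describe_errno() is a hypothetical helper, not part of the library; real code would normally rely on the library's own error reporting.

```c
/* EncaErrno values, mirrored from the listing above. */
enum {
    ENCA_EOK = 0,
    ENCA_EINVALUE,
    ENCA_EEMPTY,
    ENCA_EFILTERED,
    ENCA_ENOCS8,
    ENCA_ESIGNIF,
    ENCA_EWINNER,
    ENCA_EGARBAGE
};

/* Hypothetical helper: maps an error code to the description
 * given in this section. */
const char *describe_errno(int code)
{
    switch (code) {
    case ENCA_EOK:       return "OK";
    case ENCA_EINVALUE:  return "Invalid value";
    case ENCA_EEMPTY:    return "Sample is empty";
    case ENCA_EFILTERED: return "After filtering, (almost) nothing remained";
    case ENCA_ENOCS8:    return "Multibyte tests failed and language contains no 8-bit charsets";
    case ENCA_ESIGNIF:   return "Too few significant characters";
    case ENCA_EWINNER:   return "No clear winner";
    case ENCA_EGARBAGE:  return "Sample is garbage";
    default:             return "Unknown error code";
    }
}
```

Since only ENCA_EOK has an explicit value, the remaining codes number consecutively from 1 in the order listed.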

ENCA_NOT_A_CHAR

#define ENCA_NOT_A_CHAR 0xffff

Not-a-character in Unicode tables.
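
A sentinel like this is typically used to mark table slots that have no Unicode mapping. The sketch below illustrates the idea only: demo_table is a made-up four-entry table, not one of Enca's tables, count_mapped() is a hypothetical helper, and ENCA_NOT_A_CHAR is mirrored from the definition above so the block compiles on its own.

```c
#include <stddef.h>

/* Mirrored from the definition above; real code would take this
 * from the Enca header. */
#define ENCA_NOT_A_CHAR 0xffff

/* Made-up table for illustration: two mapped slots, two unmapped ones. */
const unsigned short demo_table[4] = {
    0x20ac, ENCA_NOT_A_CHAR, 0x201a, ENCA_NOT_A_CHAR
};

/* Hypothetical helper: counts the entries that map to a real character. */
size_t count_mapped(const unsigned short *table, size_t n)
{
    size_t mapped = 0;
    for (size_t i = 0; i < n; i++) {
        if (table[i] != ENCA_NOT_A_CHAR)
            mapped++;
    }
    return mapped;
}
```

Comparing against the named constant rather than a literal 0xffff keeps the intent visible at each call site.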