This file is part of Logtalk https://logtalk.org/
SPDX-FileCopyrightText: 1998-2026 Paulo Moura <pmoura@logtalk.org>
SPDX-License-Identifier: Apache-2.0

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

`character_sets`

This library provides a character_set_protocol protocol plus concrete objects for converting between lists of character codes and lists of bytes. It also provides metadata predicates preferred_mime_name/1, name/1, alias/1, and mibenum/1 based on the IANA character set registry:

https://www.iana.org/assignments/character-sets/character-sets.xhtml
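
As a minimal sketch of how these predicates might be called, the queries below assume that the library is already loaded, that codes_to_bytes/2 takes the list of character codes first and the list of bytes second (with bytes_to_codes/2 taking them in the opposite order), and that each metadata predicate returns its value in its single argument. These are assumptions drawn from the predicate names; the API documentation is the authority on the actual modes and result types.

```logtalk
% encode a list of character codes and decode the resulting bytes
% (argument order is an assumption based on the predicate names)
| ?- utf_8::codes_to_bytes([0'a, 0'b, 0'c], Bytes),
     utf_8::bytes_to_codes(Bytes, Codes).

% query registry metadata for a character set object
| ?- utf_8::preferred_mime_name(MIMEName).
| ?- utf_8::mibenum(MIBenum).

% alias/1 may have more than one solution; collect them all
| ?- findall(Alias, iso_8859_1::alias(Alias), Aliases).
```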

The currently provided objects are:

- us_ascii
- iso_8859_1
- iso_8859_2
- iso_8859_3
- iso_8859_4
- iso_8859_9
- iso_8859_10
- iso_8859_13
- iso_8859_14
- iso_8859_15
- iso_8859_16
- windows_1250
- windows_1251
- windows_1252
- windows_1253
- windows_1254
- windows_1257
- utf_8
- utf_16le
- utf_16be
- utf_32le
- utf_32be

Object names are derived from the preferred IANA MIME names by lowercasing them and replacing hyphens with underscores. When a registry entry has no distinct preferred MIME alias, the registered IANA name is used instead. A compatibility alias object named utf16be is also provided for utf_16be.
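
The naming rule above can be sketched as a small helper. The mime_name_sketch object and its object_name/2 predicate are hypothetical names introduced only for illustration; they are not part of the library, and the sketch only handles ASCII letters and hyphens.

```logtalk
% hypothetical helper, not part of the character_sets library
:- object(mime_name_sketch).

	:- public(object_name/2).

	% object_name(+MIMEName, -ObjectName)
	% lowercases ASCII letters and replaces hyphens with underscores
	object_name(MIMEName, ObjectName) :-
		atom_codes(MIMEName, Codes0),
		normalize(Codes0, Codes),
		atom_codes(ObjectName, Codes).

	normalize([], []).
	normalize([Code0| Codes0], [Code| Codes]) :-
		normalize_code(Code0, Code),
		normalize(Codes0, Codes).

	% hyphen becomes underscore
	normalize_code(0'-, 0'_) :- !.
	% uppercase ASCII letter becomes lowercase
	normalize_code(Code0, Code) :-
		0'A =< Code0, Code0 =< 0'Z,
		!,
		Code is Code0 + 32.
	% any other code is kept as is
	normalize_code(Code, Code).

:- end_object.
```

With this sketch loaded, the goal `mime_name_sketch::object_name('ISO-8859-1', Name)` binds Name to iso_8859_1, and the same goal with 'UTF-16LE' gives utf_16le.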

The Unicode character set objects work with Unicode scalar values and do not emit or consume a byte order mark (BOM).
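
To illustrate the absence of a BOM, the queries below encode the single scalar value U+0041 in both UTF-16 byte orders. The argument order of codes_to_bytes/2 is an assumption based on the predicate name, and the commented results follow from the UTF-16 encoding definitions, not from a recorded run of the library.

```logtalk
% little-endian: low byte first
| ?- utf_16le::codes_to_bytes([0'A], LE).
% per the UTF-16LE definition: LE = [65, 0], with no leading 0xFF, 0xFE bytes

% big-endian: high byte first
| ?- utf_16be::codes_to_bytes([0'A], BE).
% per the UTF-16BE definition: BE = [0, 65], with no leading 0xFE, 0xFF bytes
```

An application that needs a BOM in its output, or that receives input starting with one, would presumably have to add or strip those bytes itself.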

This library intentionally does not currently provide Shift_JIS or GB18030 objects because portable mapping tables for those multibyte encodings are not yet included.

No input validation is performed when converting between character codes and bytes. When necessary, use the types library validation and checking predicates before calling the codes_to_bytes/2 and bytes_to_codes/2 predicates.
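
One way to add such validation is a thin wrapper around a character set object. The checked_utf_8 object and its decode/2 and encode/2 predicates are hypothetical names used only for illustration. The sketch assumes that the types library is loaded, that its type object recognizes the byte, character_code, and list(Type) types, and that the conversion predicates take their arguments in the order their names suggest.

```logtalk
% hypothetical wrapper object; not part of the library
:- object(checked_utf_8).

	:- public(decode/2).

	% fail unless Bytes is a list of integers in the 0..255 range
	decode(Bytes, Codes) :-
		type::valid(list(byte), Bytes),
		utf_8::bytes_to_codes(Bytes, Codes).

	:- public(encode/2).

	% throw an error unless Codes is a list of character codes
	encode(Codes, Bytes) :-
		type::check(list(character_code), Codes),
		utf_8::codes_to_bytes(Codes, Bytes).

:- end_object.
```

Note that these checks only confirm that each element has the right type. They do not confirm that a byte list is a well-formed sequence in the target encoding.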

API documentation

Open the [../../apis/library_index.html#character_sets](../../apis/library_index.html#character_sets) link in a web browser.

Loading

To load all entities in this library, load the loader.lgt file:

| ?- logtalk_load(character_sets(loader)).

Testing

To test this library's predicates, load the tester.lgt file:

| ?- logtalk_load(character_sets(tester)).

Usage

The UTF, ISO 8859, and Windows character set objects are grouped in three main files:

- utf_character_sets.lgt
- iso_8859_character_sets.lgt
- windows_character_sets.lgt

This allows some customization of the character set objects loaded by your application. Note that the character_set_protocol.lgt and character_sets.lgt base files must always be loaded (they include the us_ascii character set, which is thus always loaded).
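
For example, an application that only needs the Unicode objects might load the two base files followed by the UTF file alone. This is a sketch that assumes the character_sets library alias can be used with individual file names (without the .lgt extension) in the same way it is used with the loader file.

```logtalk
% load the two base files first, then only the UTF objects
| ?- logtalk_load([
         character_sets(character_set_protocol),
         character_sets(character_sets),
         character_sets(utf_character_sets)
     ]).
```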