StringTools

A StringTools object provides several utility methods for encoding and decoding strings. For example, you can use a StringTools object to Base64-encode a chunk of data, or decode a UTF-8 encoded message header.

You can obtain a StringTools object using the DOpusFactory.StringTools method.
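As a minimal sketch of obtaining the object from a script: the helper name getStringTools is hypothetical, and the code assumes the script host exposes the factory as DOpus.Create, which is not stated on this page. The guard lets the same file load outside Opus (for example under Node.js), where it simply returns null.

```javascript
// Hypothetical helper: returns a StringTools object when running inside a
// Directory Opus script, or null when the DOpus global is not available.
function getStringTools() {
    if (typeof DOpus === "undefined") {
        return null; // not running inside Opus
    }
    // Assumption: DOpus.Create yields the DOpusFactory object described above.
    return DOpus.Create.StringTools();
}

var stringTools = getStringTools();
```

Inside Opus, the returned object would then be used to call the methods listed below.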

Method Name

Arguments

Return Type

Description

Decode

<Blob:source> or <string:source> <string:format>

string or Blob

Decodes an encoded string or data.

You can provide either a Blob object or a string as the source to decode. Depending on the value of the format argument, either a string or a Blob is returned. Valid formats are:

base64: The source will be Base64-decoded, and a Blob is returned.
quoted: The source will be Quoted-printable-decoded, and a Blob is returned.
utf-8: The source will be converted from UTF-8 to a native string.
utf-16, utf-16-le: The source will be converted from UTF-16 Little Endian to a native string.
utf-16-be: The source will be converted from UTF-16 Big Endian to a native string.
auto: Special handling is invoked to decode a MIME-encoded email subject (e.g. one beginning with =?), and a string is returned if one is identified. UTF-8 or UTF-16 encoded data is also detected if it has a BOM at the beginning.

If decoding UTF-8 or UTF-16 (via "auto" or "utf-8", etc.), any byte-order-mark (BOM) will be skipped if one exists at the beginning of the input data.

If format is not specified, the default is auto. Otherwise, format must be set to one of the above keywords, a valid code-page name (e.g. "gb2312", "utf-8"), or a Windows code-page ID (e.g. 936, 65001). When a code page is given, the source is decoded using that code page and a string is returned.
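The BOM handling described above can be illustrated with a small standalone sketch. It is not the Opus implementation: the function name detectBom and its return shape are hypothetical, and it only recognizes the three standard BOM byte sequences for UTF-8, UTF-16 Little Endian, and UTF-16 Big Endian.

```javascript
// Hypothetical helper: inspects the first bytes of an array of byte values
// and reports which BOM (if any) is present and how many bytes to skip.
function detectBom(bytes) {
    if (bytes.length >= 3 && bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF) {
        return { encoding: "utf-8", skip: 3 };      // EF BB BF
    }
    if (bytes.length >= 2 && bytes[0] === 0xFF && bytes[1] === 0xFE) {
        return { encoding: "utf-16-le", skip: 2 };  // FF FE
    }
    if (bytes.length >= 2 && bytes[0] === 0xFE && bytes[1] === 0xFF) {
        return { encoding: "utf-16-be", skip: 2 };  // FE FF
    }
    return { encoding: null, skip: 0 };             // no BOM found
}

// Example input: a UTF-8 BOM followed by the letter "A".
var bomInfo = detectBom([0xEF, 0xBB, 0xBF, 0x41]);
```

A decoder following the rule above would start reading at offset bomInfo.skip, so the BOM itself never appears in the resulting string.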

Encode

<Blob:source> or <string:source> <string:format>

string or Blob

Encodes a string or data.

You can provide either a Blob object or a string as the source to encode. Depending on the value of the format argument, either a string or a Blob is returned. Valid formats are:

base64: The source will be Base64-encoded, and a string is returned.
quoted: The source will be Quoted-printable-encoded, and a string is returned.
utf-8: The source will be converted to UTF-8 without a byte-order-mark (BOM).
utf-8 bom: The source will be converted to UTF-8 with a BOM at the start.
utf-16, utf-16-le: The source will be converted to UTF-16 Little Endian without a BOM.
utf-16 bom, utf-16-le bom: The source will be converted to UTF-16 Little Endian with a BOM.
utf-16-be: The source will be converted to UTF-16 Big Endian without a BOM.
utf-16-be bom: The source will be converted to UTF-16 Big Endian with a BOM.

Otherwise, format must be set to a valid code-page name (e.g. "gb2312", "utf-8" etc.), or a Windows code-page ID (e.g. 936, 65001). The source will be encoded using the specified code-page and a Blob is returned.
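To make the difference between the UTF-16 variants concrete, here is a standalone sketch of the byte layouts they produce. The function encodeUtf16 and its parameters are hypothetical and this is not how Opus implements Encode; it only shows the byte order and the optional BOM prefix (FF FE for Little Endian, FE FF for Big Endian).

```javascript
// Hypothetical helper: returns an array of byte values for the given text.
// JavaScript strings are sequences of UTF-16 code units, so each unit is
// written out as two bytes in the requested order.
function encodeUtf16(text, bigEndian, withBom) {
    var bytes = [];
    if (withBom) {
        if (bigEndian) {
            bytes.push(0xFE, 0xFF);
        } else {
            bytes.push(0xFF, 0xFE);
        }
    }
    for (var i = 0; i < text.length; i++) {
        var unit = text.charCodeAt(i);
        var high = (unit >> 8) & 0xFF;
        var low = unit & 0xFF;
        if (bigEndian) {
            bytes.push(high, low);
        } else {
            bytes.push(low, high);
        }
    }
    return bytes;
}

// Roughly what "utf-16-le bom" would produce for the text "A".
var leWithBom = encodeUtf16("A", false, true);
// Roughly what "utf-16-be" would produce for the text "A".
var beNoBom = encodeUtf16("A", true, false);
```

The BOM variants are useful when the consumer of the data needs to detect the byte order on its own; the variants without a BOM suit cases where the encoding is already agreed.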

IsASCII

<string:input>

bool

Tests the input string to see if it only contains characters that can be represented in ASCII.

If the result is false, the string is not safe to save into a text file unless you use a Unicode format such as UTF-8.

This check is not affected by locales or codepages. Instead, it tests whether the string consists of only 7-bit ASCII characters, such that no characters will be lost or modified if you save the string to a text file and then load it back on any other computer.
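The 7-bit test described above can be approximated in a few lines. This is a standalone sketch, not the Opus implementation; the function name isSevenBitAscii is hypothetical, and how IsASCII treats an empty string is not stated here (the sketch returns true for it).

```javascript
// Hypothetical helper: true if every UTF-16 code unit in the string is in
// the 7-bit ASCII range (0 to 127).
function isSevenBitAscii(input) {
    for (var i = 0; i < input.length; i++) {
        if (input.charCodeAt(i) > 127) {
            return false; // found a character outside 7-bit ASCII
        }
    }
    return true;
}

var plainIsAscii = isSevenBitAscii("readme.txt");
var accentedIsAscii = isSevenBitAscii("r\u00e9sum\u00e9.txt"); // contains "é"
```

A script could use such a check to decide whether a plain text file is sufficient or whether the text should be encoded as UTF-8 before saving.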

LanguageStr

<string:name> or <int:id>

string

Returns a translated string in the currently selected language. Mainly needed for internal use.

The currently defined strings are listed below (ID, followed by the English language string):

FavoritesBar: Favorites Bar
FindResults: Find Results
CopySelection: Copy Selection
CopyAll: Copy All

MakeLegal

<string:name> [<string:flags>]

string

Strips any illegal filename characters from the supplied string.

The optional flags are:

f: forward slashes (convert separators to / instead of \)
n: name instead of path (replace separators with _; implies s)
s: subdirectory mode (replace : with ; and remove \\ from UNC paths)

RemoveDiacritics

<string:input>

string

Returns a copy of the input string with any diacritics (accent symbols) removed. For example, "á" would be converted to "a".

This function uses the same rules that are used by the "ignore diacritics" options for pattern matching throughout Opus.
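For comparison, a common way to approximate diacritic removal in modern JavaScript is Unicode decomposition followed by stripping combining marks. This sketch is not the rule set Opus uses, and results may differ for some characters; the function name stripCombiningMarks is hypothetical, and String.prototype.normalize requires an ES2015-capable engine such as Node.js.

```javascript
// Hypothetical helper: decomposes characters (NFD) and then removes the
// combining marks in the range U+0300 to U+036F.
function stripCombiningMarks(input) {
    return input.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

var stripped = stripCombiningMarks("Cr\u00e8me br\u00fbl\u00e9e"); // "Crème brûlée"
```

Letters that are not composed of a base letter plus a combining mark (for example "ø") are left unchanged by this approach, which is one way it may differ from the Opus rules.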

Truncate

<string:input> or <Path:input> <int:length> [<int:type>]

string

Truncates the specified input string to the requested number of characters.

The optional type argument specifies the truncation type. Valid values are:

0: truncate on the right
1: truncate on the left
2: truncate in the middle

If not specified, the default is 2 if input is a Path object, otherwise the default is 0.

If the input value is a Path and middle truncation is selected, the function takes path separators into account correctly.
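The three truncation types can be pictured with a simplified standalone sketch on plain strings. It is not the Opus implementation: the function name truncateSketch is hypothetical, it inserts no ellipsis or other marker, it ignores path separators, and the way the middle mode splits the remaining characters between the start and the end is an assumption.

```javascript
// Hypothetical helper: type 0 keeps the start of the string, type 1 keeps
// the end, and type 2 keeps part of the start and part of the end.
function truncateSketch(input, length, type) {
    if (type === undefined) {
        type = 0; // the documented default for plain strings
    }
    if (input.length <= length) {
        return input; // nothing to remove
    }
    if (type === 1) {
        return input.slice(input.length - length);
    }
    if (type === 2) {
        var head = Math.ceil(length / 2); // assumption: extra character goes to the start
        var tail = length - head;
        return input.slice(0, head) + input.slice(input.length - tail);
    }
    return input.slice(0, length);
}

var keptStart = truncateSketch("abcdefgh", 4, 0);
var keptEnd = truncateSketch("abcdefgh", 4, 1);
var keptBothEnds = truncateSketch("abcdefgh", 4, 2);
```

Middle truncation tends to suit paths, because both the beginning of the path and the final component stay visible, which is consistent with it being the default for Path objects.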
