Chapter 11. Charset Encoding

Charset Encoding

There are a variety of encodings for textual data; ISO-8859-1 (Latin1) and UTF-8 are the most popular. Unless specified otherwise with the SMARTY_RESOURCE_CHAR_SET constant, Smarty uses UTF-8 as its internal charset if the Multibyte String extension is available, and ISO-8859-1 if it is not.
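The selection rule described above can be sketched as follows. This is only an illustration of the documented behaviour, not Smarty's actual source; the function name example_default_charset() and its parameters are hypothetical.

```php
<?php
// Hypothetical helper (not part of Smarty): mirrors the documented rule.
// $override   - value of SMARTY_RESOURCE_CHAR_SET, or null if not defined
// $hasMbstring - whether the Multibyte String extension is available
function example_default_charset($override, $hasMbstring)
{
    if ($override !== null) {
        // An explicitly defined constant always wins.
        return $override;
    }
    // Otherwise fall back to UTF-8 with mbstring, ISO-8859-1 without.
    return $hasMbstring ? 'UTF-8' : 'ISO-8859-1';
}

// Resolve the charset for the current environment.
$override = defined('SMARTY_RESOURCE_CHAR_SET') ? SMARTY_RESOURCE_CHAR_SET : null;
echo example_default_charset($override, extension_loaded('mbstring')), "\n";
```

The helper takes its inputs as parameters so the rule can be checked without changing the PHP environment.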

Note

ISO-8859-1 has been PHP's default internal charset since the beginning. Unicode has been evolving since 1991 and has since become the one charset to conquer them all, as it is capable of encoding most known characters, even across different writing systems (Latin, Cyrillic, Japanese, …). UTF-8 is Unicode's most widely used encoding, as it allows referencing these many thousands of characters with very little size overhead.

Since Unicode and UTF-8 are very widespread nowadays, their use is strongly encouraged.

Note

Smarty's internals and core plugins have been fully UTF-8 compatible since Smarty 3.1. To achieve Unicode compatibility, PHP's Multibyte String (mbstring) extension is required. Unless your PHP environment provides this extension, Smarty will not be able to offer full UTF-8 compatibility.
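As a rough illustration of why the Multibyte String extension matters, the following sketch compares PHP's byte-oriented strlen() with mb_strlen() on a UTF-8 string. It is a standalone example, not taken from Smarty; the variable names are chosen for illustration only.

```php
<?php
// The character "ü" encoded as UTF-8 occupies two bytes.
$text = "\xC3\xBC";

// strlen() counts bytes, not characters.
$byteLength = strlen($text); // 2

if (extension_loaded('mbstring')) {
    // mb_strlen() counts characters in the given encoding.
    $charLength = mb_strlen($text, 'UTF-8'); // 1
} else {
    // Without mbstring only the byte count is available.
    $charLength = null;
}

var_dump($byteLength, $charLength);
```

A check such as extension_loaded('mbstring') is one way to find out in advance whether an environment can offer full UTF-8 support.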

Example 11.1. Setting a different Charset Encoding


// use Japanese character encoding
if (function_exists('mb_internal_encoding')) {
  mb_internal_encoding('EUC-JP');
}
// define the constant before Smarty is loaded so that it takes effect
define('SMARTY_RESOURCE_CHAR_SET', 'EUC-JP');
require_once 'libs/Smarty.class.php';
$smarty = new Smarty();