UTF-8 ‘R’ US!

I use English for computing and programming, but I also often use French and Japanese.

Because of that, UTF-8 is essential to my work and to pretty much everything else I do on a computer. Here are the commands and configuration lines that enable UTF-8 in various environments:

Linux terminal, in ~/.bashrc:

for e in LANG LANGUAGE LC_ALL LC_CTYPE; do export $e=en_US.UTF-8; done
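
To see which encoding a program would actually pick up from these variables, here is a minimal Python sketch. It assumes the usual POSIX precedence (LC_ALL first, then LC_CTYPE, then LANG), and the function name effective_codeset is made up for illustration.

```python
import os

def effective_codeset(env):
    """Return the codeset of the locale that governs character handling.

    Assumes the usual POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
    Returns None if no variable is set or the value carries no codeset.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            # Locale names look like language_TERRITORY.codeset@modifier
            if "." not in value:
                return None
            return value.split(".", 1)[1].split("@", 1)[0]
    return None

if __name__ == "__main__":
    # Prints the codeset of the current shell, e.g. after sourcing ~/.bashrc
    print(effective_codeset(os.environ))
```

Since the loop above exports LC_ALL, that variable wins over the other three, so all of them end up agreeing on UTF-8.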

GNU Screen, in ~/.screenrc:

defutf8 on

Vim, in ~/.vimrc:

set enc=utf-8

Vim, to convert file encoding while editing:

:set fileenc=utf-8
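
Outside Vim, the same kind of conversion can be scripted. This is only a rough Python sketch: unlike Vim, it does not detect the original encoding, so the source encoding has to be known and passed in. The helper name convert_to_utf8 and the demo file are hypothetical.

```python
import os
import tempfile

def convert_to_utf8(path, source_encoding):
    """Rewrite the file at `path` as UTF-8, decoding it as `source_encoding`."""
    # newline="" keeps the original line endings untouched
    with open(path, "r", encoding=source_encoding, newline="") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

# Demo on a throwaway file holding "café" in Latin-1.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
with open(demo_path, "wb") as f:
    f.write("café\n".encode("latin-1"))

convert_to_utf8(demo_path, "latin-1")

with open(demo_path, "rb") as f:
    converted = f.read()
os.remove(demo_path)
```

After the call, the single Latin-1 byte for "é" has become the two-byte UTF-8 sequence.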

PHP 5, in php.ini, under the “[mbstring]” section:

mbstring.language = Neutral
mbstring.internal_encoding = UTF-8
mbstring.http_input = auto
mbstring.http_output = UTF-8
mbstring.encoding_translation = On
mbstring.detect_order = auto
mbstring.substitute_character = none

PHP 5, function usage to consider:

htmlentities($string, ENT_COMPAT, 'UTF-8');
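
The third argument matters because PHP 5 before 5.4 assumes ISO-8859-1 when it is omitted. The following Python sketch only illustrates the effect; it is not PHP's implementation, it emits numeric character references instead of named entities, and the helper name numeric_entities is made up.

```python
def numeric_entities(raw_bytes, charset):
    """Decode bytes with `charset`, then replace every non-ASCII character
    with a numeric character reference."""
    text = raw_bytes.decode(charset)
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

utf8_bytes = "é".encode("utf-8")  # two bytes: 0xC3 0xA9

# Correct charset: one character, one reference.
right = numeric_entities(utf8_bytes, "utf-8")

# Wrong charset: each byte is treated as its own Latin-1 character.
wrong = numeric_entities(utf8_bytes, "iso-8859-1")
```

With the wrong charset, a single "é" turns into two unrelated entities, which is the classic mojibake seen on misconfigured pages.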

MySQL, in my.cnf, under respective sections:

default-character-set = utf8

init_connect='SET collation_connection = utf8_general_ci; SET NAMES utf8'
character-set-server = utf8
collation-server = utf8_general_ci

default-character-set = utf8

HTTP header:

Content-Type: text/html; charset=utf-8
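
To check what charset a response actually declares, the header value can be parsed with Python's standard library. A minimal sketch; the function name charset_from_content_type is hypothetical.

```python
from email.message import Message

def charset_from_content_type(header_value):
    """Return the lowercased charset parameter of a Content-Type value,
    or None if the header declares no charset."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()

declared = charset_from_content_type("text/html; charset=utf-8")
```

A header without a charset parameter yields None, which is a quick way to spot responses that leave the browser guessing.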

HTML head, equivalent to HTTP header:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

HTML5 head:

<meta charset="utf-8" />

HTML form:

<form accept-charset="UTF-8">
<input type="hidden" name="utf8" value="&#9675;" /><!-- For IE -->
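
On the server side, the hidden field can double as a sanity check: if the white circle (U+25CB, decimal 9675) arrives intact, the form was submitted as UTF-8. Below is a rough Python sketch, assuming the form is sent as application/x-www-form-urlencoded; the function name submitted_as_utf8 is made up.

```python
from urllib.parse import parse_qs

EXPECTED_MARKER = "\u25cb"  # WHITE CIRCLE, decimal 9675

def submitted_as_utf8(raw_query):
    """Return True if the form data decodes as UTF-8 and the hidden
    `utf8` field carries the expected marker character."""
    try:
        fields = parse_qs(raw_query, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        # Some field held bytes that are not valid UTF-8.
        return False
    return fields.get("utf8") == [EXPECTED_MARKER]

ok = submitted_as_utf8("utf8=%E2%97%8B&name=abc")
```

A browser that submitted the form in a legacy encoding would send different bytes for the marker, and the check would fail.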

XML prolog:

<?xml version="1.0" encoding="UTF-8"?>
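
The prolog only helps if the bytes really are UTF-8. Here is a small Python sketch using the standard library parser; the sample document and the helper name parses_cleanly are made up for illustration.

```python
import xml.etree.ElementTree as ET

SOURCE = '<?xml version="1.0" encoding="UTF-8"?><greeting>こんにちは</greeting>'

def parses_cleanly(data):
    """Return True if the parser accepts the bytes under the declared encoding."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# Bytes that match the declaration parse and round-trip the text.
root = ET.fromstring(SOURCE.encode("utf-8"))

# The same text saved as Shift_JIS contradicts the UTF-8 declaration.
mismatch_ok = parses_cleanly(SOURCE.encode("shift_jis"))
```

The second case is rejected because the parser trusts the declaration and finds byte sequences that are not valid UTF-8.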
