the whole shebang

February 28, 2007

Language, Charset, and Internationalized Websites

Filed under: accessibility,Web Programming,Website Design/CSS — wholeshebang @ 11:18 pm

Internationalization Issues

“International Sites: Minimum Requirements” is a beginner-level but essential introduction to the issues you must address when designing a website that attracts international viewers.

“<FONT FACE> considered harmful” is the classic article about what can go wrong when you do not design websites with accessibility and usability in mind. This article is must reading for everyone, not just those who have international readers. Some basic technical explanation is included in the various scenarios. Frankly, I’m of the opinion that browsers shouldn’t be backwards-compatible with the <FONT> tag at all and should totally ignore it. As long as browsers interpret it, the ignorant and the just plain uncaring will continue to use it!

“What’s Wrong With FONT” doesn’t quite cover the totality of horrifying results from use of the <FONT> tag the way the classic “<FONT FACE> considered harmful” does…but it’s warning enough.

Even Microsoft has got on the bandwagon (everyone trying to siphon users searching for the original article?), with “What’s Wrong with FONT FACE”. Just an overview of why it takes browser-based control out of the viewer’s hands.

Technical Aspects of Multilingual (or any) Websites: Doctype Declarations, Meta Tags, XML Prologs, and Media Types

“Babel: Towards communicating on the Internet in any language” will explain the issues of encoding, character sets, and software. Absolutely required reading if you actually plan to have a multilingual website, as opposed to one that is just, shall we say, “compatible” with people of various nationalities.
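To make the encoding issue concrete, here is a minimal Python sketch (my own illustration, not taken from the linked article; the sample word is an arbitrary assumption). It shows that the same text becomes different bytes under different character encodings, and that decoding bytes with the wrong charset produces garbage rather than an error.

```python
# Hypothetical sample text; any string with a non-ASCII character behaves similarly.
text = "café"

# The same characters become different byte sequences under different encodings.
utf8_bytes = text.encode("utf-8")      # "é" is stored as two bytes
latin1_bytes = text.encode("latin-1")  # "é" is stored as one byte

# Decoding UTF-8 bytes as Latin-1 does not fail; it silently yields mojibake.
garbled = utf8_bytes.decode("latin-1")

print(len(utf8_bytes), len(latin1_bytes), garbled)
```

This is the effect a visitor sees when a page is saved in one encoding but declared (or guessed by the browser) as another.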

W3C article: “Serving XHTML 1.0” is must reading unless you already have a comprehensive understanding of doctype switching, quirks mode, XML prologs, and browser support for all 3.

W3C article: “Character encodings”

W3C article: “HTML Document Representation” (in HTML 4.01). This is the actual documentation of the W3C standard for HTML 4.01.

W3C article by editor: “Authoring Techniques for XHTML & HTML Internationalization: Characters and Encodings 1.0” Excellent analysis of what to do based on your situation.

W3C article: “HTTP and meta for language information” Explains the difference between declaring the charset and the encoding (suggest reading the last 2 paragraphs first), as well as how to decide what to use where.
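As a rough illustration of what travels in the HTTP headers, here is a small Python sketch that pulls the media type, charset, and language list out of Content-Type and Content-Language values. The function name describe_headers and the sample header values are my own assumptions for the example; the parsing relies only on the standard library's email.message.Message class.

```python
from email.message import Message


def describe_headers(content_type, content_language=None):
    """Summarize the charset and language information carried by HTTP headers.

    Hypothetical helper for illustration; the header values are supplied by the caller.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    if content_language:
        msg["Content-Language"] = content_language

    # Content-Language may list several comma-separated language tags.
    raw_languages = msg.get("Content-Language", "")
    languages = [tag.strip() for tag in raw_languages.split(",") if tag.strip()]

    return {
        "media_type": msg.get_content_type(),
        "charset": msg.get_content_charset(),  # None when no charset parameter is present
        "languages": languages,
    }


print(describe_headers("text/html; charset=UTF-8", "en, fr"))
print(describe_headers("application/xhtml+xml"))
```

The point of the sketch is that the charset is a parameter of Content-Type, while the language is carried separately, so the two declarations are set and checked independently.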

W3C article: “Setting charset information in .htaccess” Note: ‘.htaccess’ is a file for configuring individual directories and virtual websites running under the same Apache webserver. The same settings mentioned here can be used in your overall ‘httpd.conf’ file, too. (If using per-directory configuration on Windows, you may need a different filename than ‘.htaccess’, because Windows tools have often refused to create a file with that name; if so, set the new filename with the AccessFileName directive in ‘httpd.conf’ so Apache knows which file to look for.) You must also tell Apache whether per-directory overrides are allowed on your server at all, which is controlled by the AllowOverride directive. To learn how to configure Apache, see

W3C article: “Apache MultiViews language negotiation set up” — This is a feature that allows Apache to automatically choose how to serve websites in multiple languages.

If you plan to use language negotiation (more often called content negotiation), you need to read W3C article: “When to use language negotiation” first to understand its bugs and how it should be applied.
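To show the basic idea behind language negotiation, here is a simplified Python sketch of how a server might pick a language from the browser's Accept-Language header. This is my own illustration, not Apache's actual algorithm: the function names are hypothetical, wildcard (*) entries are ignored, and the fallback to the primary subtag (for example "fr" for "fr-CH") is an assumption about what you would want.

```python
def parse_accept_language(header):
    """Return the language tags in an Accept-Language header, most preferred first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # a tag with no q parameter has quality 1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            prefs.append((-quality, position, tag))
    # Sort by descending quality, then by original position in the header.
    return [tag for _, _, tag in sorted(prefs)]


def choose_language(header, available, default):
    """Pick the best available language, falling back to the site default."""
    lookup = {lang.lower(): lang for lang in available}
    for tag in parse_accept_language(header):
        if tag in lookup:
            return lookup[tag]
        primary = tag.split("-")[0]
        if primary in lookup:
            return lookup[primary]
    return default


print(choose_language("fr-CH, fr;q=0.9, en;q=0.8, de;q=0.7", ["en", "de"], "en"))
```

Note that the result depends entirely on how the visitor's browser is configured, which is one reason to offer a visible way to switch languages as well.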

W3C article: “Unexpected characters or blank lines” — Refers to the undesirable display of the byte-order mark in some user-agents (e.g., web browsers, text-editors, etc.) and how to get rid of it. Only applicable with files that are in UTF-8.
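If you want to check your own files for this, the following Python sketch detects and removes a UTF-8 byte-order mark. The helper name strip_utf8_bom and the sample bytes are my own assumptions; codecs.BOM_UTF8 and the "utf-8-sig" codec are part of the standard library.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 byte-order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical file content saved by an editor that writes a BOM.
raw = codecs.BOM_UTF8 + "<!DOCTYPE html>".encode("utf-8")

cleaned = strip_utf8_bom(raw)

# Alternatively, the "utf-8-sig" codec drops a leading BOM while decoding.
decoded = raw.decode("utf-8-sig")

print(cleaned, decoded)
```

Stripping the bytes is useful when rewriting files on disk; the "utf-8-sig" codec is handier when you only need to read the text.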

“Doctypes and their respective layout mode” is a reference table for browsers regarding doctype-switching and when a browser may be triggered into Quirks Mode. (If you have no clue what that is, I suggest a visit to “CSS – Quirks mode and strict mode”. In fact, I would make the entire site recommended reading!)

If You Intend to Write in Another Language

W3C article: “What you need to know about the bidi algorithm and inline markup” is necessary reading if you will type a document in mixed languages where one language is read right-to-left and the other is left-to-right.

W3C article: “CSS vs. markup for bidi support” is necessary reading if you will mix directional languages!
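For a feel of how directionality is decided, here is a much-simplified Python sketch that guesses the base direction of a piece of text from its first strongly directional character. This is only one small piece of the Unicode bidi algorithm (it ignores embeddings, isolates, and everything about reordering), and the function name base_direction is my own; unicodedata.bidirectional is the standard library call that reports each character's bidi class.

```python
import unicodedata


def base_direction(text, default="ltr"):
    """Guess the base direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (e.g. Latin letters)
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic letters)
            return "rtl"
    return default  # only digits, spaces, or punctuation were found


print(base_direction("Hello שלום"))
print(base_direction("שלום Hello"))
```

Because the guess flips depending on which language comes first, mixed-direction text usually needs an explicit direction in the markup rather than relying on detection.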

W3C article: “An Introduction to Multilingual Web Addresses” Very important now that we can have Internationalized Domain Names and can use native-language directory and file names. Covers browser support, too.
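Here is a short Python sketch of what happens to a multilingual address behind the scenes: the host name is converted to its ASCII (Punycode) form, and non-ASCII path characters are percent-encoded as UTF-8 bytes. The function name and the sample host and path are assumptions for illustration. Python's built-in "idna" codec follows the older IDNA 2003 rules, so treat the output as indicative only.

```python
from urllib.parse import quote


def to_ascii_address(host, path):
    """Convert a native-language host and path to their ASCII-only forms."""
    # Each label of the host is converted to Punycode with an "xn--" prefix.
    ascii_host = host.encode("idna").decode("ascii")
    # Non-ASCII path characters become percent-escaped UTF-8 bytes; "/" is kept.
    ascii_path = quote(path)
    return ascii_host, ascii_path


# Hypothetical example address.
print(to_ascii_address("bücher.example", "/café/"))
```

The browser shows the readable form, but DNS and HTTP see only the ASCII form.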

Software-Related Reading

I haven’t even had the chance to read through the info yet, but this mini-website, Unicode and Multilingual Support in HTML, Fonts, Web Browsers and Other Applications has much information about Unicode support and the computing environment, as well as links to other sources of information.

W3C article: “Setting encoding in web-authoring applications” shows you how to automate the inclusion of code in various editing software.

