What's with the Question Marks? Or: Where are my Curly Quotes?

By: Arman Danesh

Page 3 of 3

What if Unicode isn't for me?

The reality is that the Web is moving inexorably towards Unicode: UTF-8 is set to become the standard character encoding for all languages and Web sites. After all, with UTF-8 the Web retains backwards compatibility with older ASCII Web sites, every major language and script can be represented, extended characters of all sorts are supported and, most important, all these characters can be mixed freely. With Unicode, for instance, a page could contain English, Chinese and Arabic, all with a single character encoding for the document. This is very powerful.
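The two claims above (ASCII compatibility and free mixing of scripts) can be checked outside ColdFusion. The following is a minimal sketch in plain Python, used here only for illustration; the sample strings and variable names are assumptions and do not come from the article's support files.

```python
# Illustrative sketch in plain Python (not ColdFusion); sample strings are assumptions.

ascii_text = "Hello, Web"
# ASCII-only text yields identical bytes in ASCII and UTF-8,
# which is why older ASCII pages remain valid UTF-8.
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")

# English, Chinese and Arabic mixed freely in one string.
mixed = "Hello 你好 مرحبا"
round_trip = mixed.encode("utf-8").decode("utf-8")

# A single-script legacy encoding cannot hold the same text.
try:
    mixed.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

print(same_bytes, round_trip == mixed, fits_latin1)
```

The UTF-8 round trip preserves the mixed text, while the attempt to encode it as ISO-8859-1 fails because that encoding has no Chinese or Arabic characters.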

However, we're still in a time of transition on the Web. Many older operating systems and browsers don't properly support Unicode or lack the fonts needed to render Unicode pages. In some countries it is also still common to use national encodings instead of Unicode, and it will take time for the norm to shift as legacy software is updated to support Unicode.

To handle this, we need to adjust both our driver settings and our code so that ColdFusion processes pages with the same encoding the driver expects. To work through this, we will use ISO-8859-1 as an example encoding. First, we need to set the driver's encoding to ISO-8859-1:

Next, we need to add some code to the start of our ColdFusion script:

<cfprocessingdirective pageencoding="iso-8859-1">
<cfcontent type="text/html; charset=iso-8859-1">
<cfset SetEncoding("form","iso-8859-1")>

In order, these tags indicate:

  1. ColdFusion should treat the page as ISO-8859-1.
  2. The browser should treat the page as ISO-8859-1 (and therefore collect and submit data from the form in ISO-8859-1 encoding).
  3. ColdFusion should treat any data in the form scope as being ISO-8859-1 encoded.
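The reason all three settings must agree can be sketched outside ColdFusion. The plain Python below is a simplified illustration, not the ColdFusion mechanism itself: `read_form` is a hypothetical helper, the form value is made up, and a real browser would also URL-encode the submitted bytes.

```python
# Simplified illustration in plain Python; names and sample text are assumptions.

submitted = "café résumé"                     # hypothetical form value
wire_bytes = submitted.encode("iso-8859-1")   # bytes a browser would send for an ISO-8859-1 page


def read_form(raw: bytes, encoding: str) -> str:
    """Decode raw form bytes the way a server configured for `encoding` would."""
    return raw.decode(encoding, errors="replace")


matched = read_form(wire_bytes, "iso-8859-1")   # server agrees with the browser
mismatched = read_form(wire_bytes, "utf-8")     # server assumes a different encoding

print(matched == submitted)    # text survives when both sides agree
print(mismatched == submitted) # text is damaged when they disagree
```

When the server decodes with the encoding the browser actually used, the accented characters come through intact; with a mismatched encoding, each accented byte is replaced by a substitution character.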

The result, if we submit the same text, is the same as earlier, when we used UTF-8 in the MySQL driver's Connection String value:

Wrap Up

This article provided a quick answer to a problem many of us have had with ColdFusion since MX was launched: the corruption of extended characters when they are inserted into, and subsequently retrieved from, a database such as MySQL. The solution lies in ensuring that the ColdFusion MySQL driver uses the same character encoding as ColdFusion itself; usually this is Unicode in the form of UTF-8.
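As a closing illustration of where the question marks come from, here is a minimal sketch in plain Python. It assumes a storage layer that substitutes "?" for any character its encoding cannot represent; whether a particular driver behaves exactly this way is an assumption, and the sample text is invented.

```python
# Illustration only: plain Python standing in for a mismatched storage layer.

text = "\u201cquoted\u201d"   # text wrapped in curly quotes

# ISO-8859-1 has no curly quotes, so a lossy conversion substitutes "?".
lossy_bytes = text.encode("iso-8859-1", errors="replace")
lossy_text = lossy_bytes.decode("iso-8859-1")

# UTF-8 can represent the same characters, so nothing is lost.
safe_text = text.encode("utf-8").decode("utf-8")

print(lossy_text)
print(safe_text == text)
```

Once the substitution has happened the original characters cannot be recovered from the stored bytes, which is why the encodings have to match before the data is written.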

Keywords
unicode, mysql, extended characters, question marks