Why are all my apostrophes missing?
Posted by Mathew Patterson on April 21, 2008
Have you ever seen an HTML page or email where everything looks fine, except instead of apostrophes there are odd question marks, or square blocks? You might also see other characters replaced similarly.
Most commonly, this occurs when importing HTML that has been created by Microsoft Word. When generating HTML, Word uses a character set called "Windows Latin 1" (also known as Windows-1252), which includes special characters like 'smart quotes' and trademark symbols.
When you view the email on your own machine, those characters will show up, but then when imported into Campaign Monitor they might disappear or be converted into incorrect characters.
Character encoding makes the difference
The reason is that Campaign Monitor sends in UTF-8 encoding (which covers a wide range of languages), and the special characters are not stored as the same byte values in UTF-8 as they are in Windows Latin 1.
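To make the mismatch concrete, here is a minimal Python sketch. It assumes Python's `cp1252` codec as a stand-in for Windows Latin 1, and the variable names are purely illustrative. It encodes the curly apostrophe both ways, then shows what happens when the Windows bytes are read as UTF-8.

```python
# The curly apostrophe (right single quotation mark, U+2019) that Word inserts.
apostrophe = "\u2019"

# The same character is stored as different bytes in each encoding.
cp1252_bytes = apostrophe.encode("cp1252")  # a single byte: 0x92
utf8_bytes = apostrophe.encode("utf-8")     # three bytes: 0xE2 0x80 0x99

# Reading the Windows-1252 byte as if it were UTF-8 fails. With
# errors="replace", Python substitutes U+FFFD, the replacement character
# that often renders as a question mark or a square block.
misread = cp1252_bytes.decode("utf-8", errors="replace")

print(cp1252_bytes, utf8_bytes, repr(misread))
```

The lone byte `0x92` is not valid UTF-8 on its own, which is why the character cannot survive being treated as UTF-8 without a conversion step.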
So what to do about it? Well, the first (and most thorough) option is to just not use MS Word to generate HTML. Word not only tends to cause character problems, but also adds vast amounts of unnecessary HTML to even simple pages.
If you view the source you will see rampaging hordes of span tags and CSS with oddly named classes everywhere. It also tends to break tags that Campaign Monitor uses, like <unsubscribe></unsubscribe>, by inserting other tags inside them.
There are much better options for simple HTML creation out there, even at little or no cost. Look at tools like NVU, Coffee Cup (free and paid) and First Page.
Of course, you can go right up to tools like Dreamweaver if you have the need.
Another alternative is to do some 'find and replace' work in Notepad or similar to remove Word's smart characters and replace them with the correct Unicode character codes. Some common ones to look out for are:
- For “ Left double quotes: Use &#8220;
- For ” Right double quotes: Use &#8221;
- For ’ Apostrophe: Use &#8217;
That way you can have the typographically correct characters show up in your email. Character encoding can be a tricky area, and you have to keep an eye on it in your HTML, in your subscribe form pages and in the subscriber lists you import.
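If you would rather script the find and replace than do it by hand, something along these lines could work. This is only a sketch: the `SMART_CHARS` table and the `replace_smart_chars` function are illustrative names, not part of any tool, and the left single quote is included as an assumption about what else Word is likely to insert.

```python
# Map Word's smart characters to numeric character references.
SMART_CHARS = {
    "\u201c": "&#8220;",  # left double quote
    "\u201d": "&#8221;",  # right double quote
    "\u2018": "&#8216;",  # left single quote (an assumed extra, for symmetry)
    "\u2019": "&#8217;",  # right single quote / apostrophe
}


def replace_smart_chars(html: str) -> str:
    """Return html with each smart character swapped for its numeric reference."""
    for char, entity in SMART_CHARS.items():
        html = html.replace(char, entity)
    return html


print(replace_smart_chars("\u201cIt\u2019s here\u201d"))
```

Because the references are plain ASCII, they come through intact whichever encoding the file is later saved or imported in.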
Always keep in mind that Campaign Monitor will send in UTF-8 no matter what, so you want to import everything in UTF-8 to begin with, so no conversion occurs.
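One way to get a file into UTF-8 before importing it is to convert it yourself. The sketch below assumes the file was saved as Windows Latin 1 (Python's `cp1252` codec). The function name and the file names in the usage comment are hypothetical.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Read src in source_encoding and write the same text to dst as UTF-8."""
    # Work with raw bytes so line endings pass through unchanged.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


# Example usage with hypothetical file names:
# convert_to_utf8("newsletter-from-word.html", "newsletter-utf8.html")
```

If the source file is not actually in the assumed encoding, the decode step may raise an error or produce the wrong characters, so check the result before importing it.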
For more information on HTML and character encoding, read The Definitive Guide to Web Character Encoding at SitePoint.
3 comments so far
Dean
wrote on April 21, 2008 6:09 PM
Why did you leave out &#8216; ?
Matt Mickiewicz
wrote on April 22, 2008 9:11 AM
We've run into this problem quite a bit with our own email newsletters. We've created a tool (for internal use) that converts all special characters into email-friendly format:
http://www.sitepoint.com/dontgetsmart/
Hopefully it's useful to at least a few people.
David Levin
wrote on April 25, 2008 9:06 AM
It's too bad there isn't a way to change the encoding method that Word uses. You would figure that in Office 2007 they would have a way to do it.
I know a lot of web based WYSIWYG editors include a "clean up from word" copy/paste feature. Also, Dreamweaver has a handy cleanup feature too.