File conversion errors

1 February 2010

A key point with typesetting is that (as mentioned earlier) all of the text and image files are imported into a ‘container’ and then manipulated there. Little may be done to an imported illustration apart from resizing or cropping (all the other enhancement work has already been done in an image-editing program like Photoshop; indeed, this is usually where any further editing is done, the typesetting file then simply being updated with the revised image file).

In contrast, text is converted as it is imported, and in most cases any dynamic link to the original text file is lost.

Errors often occur with these file conversions. Use of special text (as discussed in my previous post) is often a cause of later grief, but it isn’t the sole cause (indeed, we are sometimes totally mystified about why this or that text corruption occurs). Here are a few examples:

  • Your fancy Arabic script, keyed in Word from right to left, may turn into left-to-right nonsense (though to anyone other than an Arabic specialist it still looks fine – it looks Arabic).
  • Macrons (like those above the ‘o’s here in Tōkyō) are relatively simple to key in Windows but can turn to junk on the Mac (see the sketch just after this list).
  • Italicized text becomes roman.
  • Superscript characters (e.g. note markers in the text) become normally aligned.
  • There may be a software bug in the typesetting program that arbitrarily changes certain character combinations to something else (a recent bug in InDesign – since corrected – changed certain characters to a full stop; this was tricky to pick up).
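
Why does a macron ‘turn to junk’? Usually because the file was saved in one character encoding but read back in another. Here is a minimal sketch in Python, purely for illustration (the encodings named are my assumption for the demonstration, not a claim about what any particular typesetting program actually does):

    text = "Tōkyō"
    raw = text.encode("utf-8")        # the bytes as saved on the author's machine

    print(raw.decode("utf-8"))        # correct guess by the receiving program: 'Tōkyō'
    print(raw.decode("mac_roman"))    # wrong guess: something like 'T≈çky≈ç'

Only the bytes carrying the macrons are misread; the plain letters around them come through untouched, which is exactly why such corruption is easy to miss when skimming.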

Obviously, conversion errors can also happen with image files but this is far less common.

The typesetter keeps an eye open for such conversion errors but ultimately it will be your responsibility at the proofing stage to pick up any such problems. I’ll return to this in a later post.

(Post #13 of the Design & Typesetting section of a lengthy series on the book production process, the first post of which is here.)


Font issues

31 January 2010

If your text is of the plain vanilla variety (using Times, Arial and other similar fonts), then there should be no font-related problems in the typesetting of your book. However (and here note that this is a Western publisher speaking), if you use any non-standard text like that listed below, then you will need to start talking seriously with your editor – indeed, you should have done this months ago.

  • Text with diacritics or special accents (Vietnamese, for instance, uses multiple accents over a single Latin character).
  • Other special fonts or character sets (ornaments, for example).
  • Non-Latin script (e.g. Cyrillic, Arabic and Chinese).
  • Mathematical and scientific symbols (many based on Greek letters).
  • Formulas (often a complex arrangement of super- and sub-scripted Greek letters and other symbols and markers that must be precisely placed but still run into the main text; a small example is sketched just after this list).
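
To give a feel for what is involved, here is a small LaTeX sketch (the formula itself is invented purely for illustration): Greek letters with superscripts and subscripts that must sit correctly on the line and yet run on as part of the sentence.

    % An invented inline formula running on within ordinary prose:
    % Greek letters, subscripts and superscripts, all kept on the text line.
    The estimated effect is $\hat{\beta}_{1} = \sum_{i=1}^{n} \alpha_{i} x_{i}^{2}$,
    and it must share a baseline with the words around it.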

There are dangers in the use of such special text, three that I can think of right off-hand:

  1. The big danger is that everything turns to custard in the conversion process (an issue I shall return to in a few days’ time). This can be a result of incompatibility between fonts and/or between computer operating systems, something I have discussed in an earlier post.
  2. Moreover, just because you got this Chinese font free with Word, it doesn’t mean that it can be used by your publisher without paying a hefty price; this issue, too, I have explored elsewhere.
  3. And finally there is the issue of readability (something I have also written about earlier and enraged a few people as a result); I would argue that every insertion of special text creates a ‘speed bump’ in the smooth reading of your text.

Please think very hard before using such special text and, if you must use it, then consult with your production editor at an early stage.

(Post #12 of the Design & Typesetting section of a lengthy series on the book production process, the first post of which is here.)


Font compatibility

11 August 2009

Readability is an issue with the use of fonts and diacritical marks (as discussed in my previous post) but it is not the only issue. Font compatibility is also important.

Playing safe

There are thousands of different fonts out there. By all means choose fonts that you like, but be aware that a document using uncommon fonts may be unreadable when opened by someone else, or may be converted to a common font with strange results. The safe move is to choose standard fonts like Times, Arial and Helvetica, or fonts that are Unicode compliant.

When playing safe isn’t an option

Such an approach is sensible if all that you need to write is ‘plain vanilla’ text. Many authors, however, need to go beyond vanilla and insert symbols and other special characters into their text, examples being:

  • Text with diacriticals or special accents (Vietnamese, for instance, uses multiple accents over a single Latin character).
  • Non-Latin script (e.g. Cyrillic, Arabic and Chinese).
  • Mathematical and scientific symbols (many based on Greek letters).
  • Formulas (often a complex arrangement of super- and sub-scripted Greek letters and other symbols and markers that must be precisely placed but still run into the main text).

In their case, standard fonts cannot be used.

Mac vs Windows

Part of the problem here is due to incompatibilities between computer operating systems. Much of the publishing world is Mac-based (because of the high-quality results and stability possible in this environment) while most authors work in the Windows world. This can have consequences, as happened with a prize-winning study that we published some years ago.

One of my many nightmares with fonts and diacritics

On paper, the text was reasonably clean and the author wrote beautifully. Yes, the text was full of Vietnamese diacritical marks but – notwithstanding my earlier post dealing with readability – it read like a dream. However, when we began typesetting the book, the whole project turned into a nightmare. Basically, the author had used two Windows-only fonts for the diacritics, one for capital letters, the other for lower case. When we converted the text over to the Mac, much of it turned to junk. And unfortunately a given garbage symbol (the delta sign, for instance) stood for one character if the original letter was upper case and an entirely different character if it was lower case. It took 3–4 weeks to sort out the mess and even then a handful of errors slipped through the two rounds of proofing. Luckily, the author was a dream to work with and – as noted – the book went on to win a major prize in its field.

This is not the only such hassle with fonts and diacritical marks I have experienced. No, over the years, I have gone through quite a few – far too many – nightmares with font conversions. All have involved diacritics.

A brighter future?

Hopefully, I face fewer potential nightmares in the future as, in general, diacritics and non-Latin script are less of a problem today than previously. This is due to the general acceptance of the Unicode standard for character encoding and the rise of OpenType fonts based on this standard. The Unicode standard gives each character form its own unique identifier, which allows easy swapping between fonts, so it is imperative that any font you use is Unicode compliant.
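
If you want to see what that ‘unique identifier’ means in practice, here is a small Python sketch (purely illustrative; nothing in it is specific to any typesetting program). Whatever font is applied, the macron-o of Tōkyō is always the same code point, and Unicode normalisation can even reconcile the two legitimate ways of spelling it:

    import unicodedata

    o_macron = "ō"
    print(hex(ord(o_macron)))            # 0x14d, i.e. the code point U+014D
    print(unicodedata.name(o_macron))    # LATIN SMALL LETTER O WITH MACRON

    # The same letter can also be stored as a plain 'o' plus a combining macron;
    # NFC normalisation folds both spellings into the single canonical form.
    decomposed = "o\u0304"
    print(o_macron == decomposed)                                # False
    print(unicodedata.normalize("NFC", decomposed) == o_macron)  # True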

Other issues

(That said, just because it is technically possible to splatter your text with, say, Arabic characters, this does not mean that you should do so. Consider the issues of readability and ‘speed bumps’ discussed in my previous post and ask yourself what is necessary, not what is possible.)

There are also wider issues with non-Latin script such as the input method and the direction of input, as will be discussed in a later post. In addition, unfortunately, not all fonts are compliant (to the best of my knowledge, for instance, Unicode-compliant fonts for Lao script are still rare).

Publisher resistance to the use of diacritics

Notwithstanding these advances, many publishers still refuse to accept works with diacritics and non-Latin script because of the added production cost and general hassle, while others refuse to have non-Latin script in the main text but allow it to appear in a separate glossary that can be typeset separately from the bulk of the book. If you need to include such special characters in your book, then a publisher’s ability and willingness to handle them must influence whom you approach with your manuscript, and you may be asked to find significant sums of money to cover the extra typesetting costs that your choice causes.

Personally, I am quite open to the use of diacritics but there are limits. Essentially, my bad experiences with non-standard fonts and psychotic diacritical marks mean that today I am only interested in working with fonts that are Unicode compliant – and preferably OpenType fonts at that.

Personal consequences

As far as I am concerned, then, it is not enough to say that “this font is standard in the [Microsoft] Office package, so what’s the problem?” If the font turns to garbage when the manuscript is converted on the typesetter’s machine, then to me and to most other publishers that is what the font is: garbage.

Should such a problem happen with your text and the text in question is a manuscript on offer to a publisher (rather than one already accepted), then you immediately have an added barrier to acceptance: your manuscript looks like it could be a hassle to produce – better, thinks the publisher, to flag this one away.

Time perhaps to rethink your use of fonts and/or diacritics?