Hi Joe,
We've done several UTF-8-based websites (Arabic and Sanskrit) and have
successfully tackled most of the problems that arise.
Here are a few bullet points that might help your investigations:
* We haven't pursued the Word or PDF options because we are working with
dynamic data: if you are taking the database approach, be
sure to use a Unicode-compatible backend.
*The primary issue with this approach is rendering the text correctly
in the browser. This tends to boil down to:
- browser platform / client OS
- font
- character encoding
Very briefly, our results are:
*Browsers: Firefox/Mozilla has by far the best Unicode support. Any
solution, of course, should be tested on all recent browsers.
*Font: Arial displays diacritics very nicely but falls over with many
language-specific characters. We've had the best all-around
success with Lucida Sans Unicode, but this is a Windows-only font. We've
not yet found a Unicode font that renders diacritics
perfectly in Linux and would be grateful for tips from the list. Browsers
usually render right-to-left text automatically, but mixed
text that includes (for example) an English notation within Arabic text will
commonly throw off this automatic rendering.
*Character encoding: use UTF-8 throughout. One problem we find when
transferring data from Word (and similar) documents to the
system is that often users will not use true Unicode characters, but
rather visually identical characters in a specific font (such
as those which incorporate combined diacritics into a single character).
These will usually not render well in UTF-8, so we
encourage people to convert their text to MS Unicode or Lucida first,
which will highlight non-UTF8 characters for replacement.
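To make the character-encoding check above a little more concrete, here is a
minimal sketch, assuming Python 3 and its standard unicodedata module. The
function names are hypothetical and this is only an illustration, not the
tooling described above. It flags Private Use Area characters (where
font-specific glyphs pasted from Word often end up) and replacement
characters left by an earlier failed decode, and it normalises combining
diacritics to composed form. It cannot catch legacy fonts that reuse ordinary
code points for their own glyphs; those still need a visual check in a
Unicode font.

```python
import unicodedata


def find_suspect_characters(text):
    """Return (index, character, reason) tuples for characters that are
    unlikely to be true Unicode text."""
    suspects = []
    for index, char in enumerate(text):
        category = unicodedata.category(char)
        if category == "Co":
            # Private Use Area: meaning depends entirely on the font in use.
            suspects.append((index, char, "private use"))
        elif category == "Cn":
            # Code point not assigned in this Python's Unicode database.
            suspects.append((index, char, "unassigned"))
        elif char == "\ufffd":
            # U+FFFD marks data that was already damaged when decoded.
            suspects.append((index, char, "replacement character"))
    return suspects


def normalise(text):
    """Compose base letters and combining diacritics where possible (NFC)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    sample = "k\u0323a \uf061 text"
    for index, char, reason in find_suspect_characters(sample):
        print(index, "U+%04X" % ord(char), reason)
    print(len(normalise("s\u0323")))
```

Running the check before text goes into the database gives editors a list of
positions to replace by hand, rather than finding the problem in the browser.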
I hope this provides some of what you're looking for; always happy to add
more off list.
best wishes
Nick Case
Managing Director
Oxford ArchDigital Ltd
27 Park End Street
Oxford OX1 1HU
Tel:0044-1865-793043
Fax:0044-1865-794891
http://www.oxarchdigital.com
On Wed, 22 Sep 2004 11:11:15 +0100, Joe Hall wrote:
> Hi,
>
> I was wondering if anyone had had any useful experience of dealing with
> foreign language texts, especially those using non-Western scripts
> and/or reading right-to-left (e.g. Japanese, Arabic). We are looking to
> publish these as html for international online visitors wanting to read
> more about displays at Tate Britain.
>
> We have looked into html specifications, and although any advice on this
> would be useful, I was wondering in particular about any good/easy
> practices for receiving text from translation agencies, editing/viewing
> these texts internally (e.g. using MS Word?), and any platform/browser
> software issues.
>
> Many thanks for any help,
>
> Joe Hall
>
> Web Editor
> Digital Programmes
> Tate
>