As I mentioned at yesterday's MELCOM meeting, there may be some interest in
the following message from Google - see also the comments below by a couple
of US-based librarians. I have performed a couple of tests, and Yamli is
definitely more sophisticated and accurate than Google Transliteration,
although Google offers more languages and scripts. For example, Google
could not transliterate the Arabic word shara'i` (plural of shari`ah) accurately,
while Yamli offered several Arabic script alternatives to the English word,
including the correct one.
Paul Auchterlonie.
Librarian for Middle East Studies,
University of Exeter.
Blog: Official Google Blog
Link: http://googleblog.blogspot.com/2009/12/transliteration-goes-global.html
Transliteration goes global
12/17/2009 10:34:00 AM
Most of us use a keyboard to enter text; it's one of the most basic activities
we perform on a computer. However, even this simple activity can be
cumbersome in many parts of the world. If you've ever tried to type in a
non-Roman script using a Roman keyboard, you know that it can be difficult to do.
Many of us at Google's Bangalore office experienced this problem firsthand.
Roman keyboards are the norm in India, making it difficult to type in Indian
languages. We decided to tackle this problem by making it very easy to type
phonetically using Roman characters and we launched this service as Google
Transliteration.
Using Google Transliteration you can convert Roman characters to their
phonetic equivalent in your language. Note that this is not the same as
translation — it's the sound of the words that are converted from one
alphabet to the other. For example, typing "hamesha" transliterates into Hindi
as हमेशा, typing "salaam" transliterates into Persian as سلام, and typing "spasibo"
transliterates into Russian as спасибо. Since our initial launch for a single Indian
language, we've been hard at work on improving quality, adding more
languages and new features.
Today we are pleased to introduce a new and improved version of Google
Transliteration, available in Google Labs or at
http://www.google.com/transliterate.
In this new version, you can select from one of seventeen supported
languages: Arabic, Bengali, Greek, Gujarati, Hindi, Kannada, Malayalam,
Marathi, Nepali, Persian, Punjabi, Russian, Sanskrit, Serbian, Tamil, Telugu and
Urdu. You can also compose richly formatted text and look up word definitions
with our dictionary integration. If the default transliteration is not the word
you wanted, you can highlight it to see a list of alternatives. For even
finer-grained control, we provide a Unicode character picker to allow
character-by-character composition.
Google Transliteration is integrated into several Google properties and we have
an API and bookmarklets to extend this capability to other websites. A
solution we initially built to solve a problem we saw here in India is now being
used in many other parts of the world as well - one small example of the scale
and leverage that technology can bring in today's increasingly globalized
environment. As with all labs products, we will continue to improve the
technology and try out new features. We would love to hear from you, so do
let us know what you think.
Posted by Nilesh Tathawadekar and Mohammed Aslam, Software Engineers
Mark Muhlhausler wrote:
A similar feature, paired with an Arabic search engine, has been around
for a while:
http://www.yamli.com/
... functions remarkably well.
Andras Riedlmayer wrote:
Just tried a quick test of Google Transliterate on some of the languages.
Arabic worked remarkably well, even based on non-standard romanization.
Russian and Serbian (Cyrillic script) also seem to work.
But the Persian version had problems with some words of Arabic origin.
When I tried typing in the name of the nineteenth-century Iranian ruler
Muzaffar al-Din Shah, it did not produce correct Arabic script no matter
what form of romanization I tried (muzaffaruddin, mozaffaroddin, mozaffar
al-din, etc.). The Arabic Google transliterator had no trouble reproducing
the word or the construct correctly. I hope this is something they'll
fix soon.
On the other hand, both the Arabic and the Persian transliterators were
good at producing the correct spellings of words with the various letters
that can be pronounced as "S" or "Z" in the standard or the vernacular
(they also pick the right spelling for words with initial GH vs Q in Persian).