[windev] Converting from greek....

Roberto Tirabassi rtirabassi at 3di.it
Thu Mar 26 08:36:08 GMT 2009


Hi Serge...
    Well, we develop a native XML database with full-text retrieval
built in. I know that I can use UTF-8 to encode Greek (we already use
UTF-8 in other scenarios), but it makes term extension harder. By term
extension I mean treating two terms as equal or similar when only a few
characters differ, or when they are written with or without modifiers
(accents and so on).
    An example is worth a thousand words. Take my own language: in
Italian the word for "university" is written "università", but it can
also be written "universita'", using the apostrophe as the accent. This
is just one example, but it shows the problem we face: the same word
can be written in uppercase, in lowercase, with or without accents and
so on, especially when the data comes from old databases.
    Moreover, the German letter 'ß' is frequently written as 'ss' (a
"modern" way to express the same character). Beyond that, Eastern
European languages (ISO-8859-2 and so on) and Northern European ones
have many characters that are Latin-1 letters modified by a symbol.
Users of other languages often don't know how to type those characters.
    That means that...
    ...when our users want to look at the full list of index terms
(which we call the vocabulary) we want to show each term in its exact
"face", but when I run searches, search extensions and so on, I have to
"normalize" all terms. We believe that Latin-1/7-bit ASCII should be
the most significant normalization level.
    That's the target.
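
    To make the target concrete, here is a minimal sketch of that kind
of normalization in Python, assuming the standard unicodedata module is
acceptable. The function name fold_term is hypothetical, and treating a
trailing apostrophe as an accent is an assumption taken from the
"universita'" example; it would also strip a genuine trailing
apostrophe. The vocabulary would keep the original term, and only the
search key would be folded.

```python
import unicodedata


def fold_term(term: str) -> str:
    """Return a search key for a term (hypothetical helper, a sketch only)."""
    # casefold() lowercases and also expands the German 'ß' to 'ss'.
    folded = term.casefold()
    # NFKD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFKD", folded)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Assumption: a trailing apostrophe or backtick stands in for an accent.
    return stripped.rstrip("'\u2019`")


if __name__ == "__main__":
    for word in ("Università", "universita'", "Straße", "STRASSE"):
        print(word, "->", fold_term(word))
```

    Note that some letters, such as the Polish 'ł', have no Unicode
decomposition, so stripping combining marks leaves them unchanged; they
would need an explicit mapping table. Greek terms come out without
accents but still in Greek letters, so reaching 7-bit ASCII needs a
separate transliteration step.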
    That's why we tried to go the Greeklish way, but we found out that
Greeklish isn't standardized and that there are at least 3 different
ways to transliterate Greek to Greeklish.
    We have heard about an ELOT standard for Greek transliteration, but I
still haven't found anything about it...
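
    The ambiguity can be shown with a small table-driven sketch in
Python. Both tables below are partial and purely illustrative: they are
NOT the ELOT scheme and not any agreed Greeklish convention, and the
names TABLE_PHONETIC, TABLE_VISUAL and transliterate are hypothetical.
The point is only that the same Greek term yields different ASCII keys
depending on which table is chosen.

```python
import unicodedata

# Partial, illustrative tables only. Neither one is the ELOT scheme.
TABLE_PHONETIC = {
    "α": "a", "β": "v", "γ": "g", "δ": "d", "ε": "e",
    "ι": "i", "λ": "l", "ο": "o", "σ": "s",
}
# Same letters, but beta is rendered by its look-alike Latin letter.
TABLE_VISUAL = dict(TABLE_PHONETIC, **{"β": "b"})


def transliterate(term: str, table: dict) -> str:
    """Fold case and accents, then map each letter through the table."""
    # casefold() lowercases and turns final sigma 'ς' into 'σ'.
    decomposed = unicodedata.normalize("NFD", term.casefold())
    # Remove the accent marks (tonos, dialytika) left as combining marks.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Letters missing from the table pass through unchanged.
    return "".join(table.get(ch, ch) for ch in base)


if __name__ == "__main__":
    word = "βιβλίο"
    print(transliterate(word, TABLE_PHONETIC))
    print(transliterate(word, TABLE_VISUAL))
```

    With more than one scheme in use, an index would have to either
pick one table and apply it to both documents and queries, or store a
key per table.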

    So that's my trouble...

                           Roberto Tirabassi.
-- 
"In times of great change, the apprentices inherit the Earth, while
the specialists find themselves perfectly prepared to face
a world that no longer exists". [Eric Hoffer]


Skype: roberto.tirabassi

3D Informatica, Via Speranza 35,
40068, S.Lazzaro di Savena - Bologna, Italy
Voice: +39051450844, Fax: +39051451942
WWW: http://www.3di.it
Documentation: http://www.3di.it/manuali/ - http://wiki.3di.it
FTP: ftp://ftp@ftp.3di.it, Download:/3di, Upload:/incoming
-- 