
10. Character tables (character sets)

To represent the symbols of a language, a computer uses 1 byte = 8 bits, which gives 2^8 = 256 different symbols. The ASCII code (American Standard Code for Information Interchange) strictly defines only the first 128 symbols (7 bits). The other half is used for the special symbols of other languages and for graphic symbols. Unlike other European languages, Greek lies entirely in the 8-bit (upper) half. The obvious reason is that the Greek language has many symbols that differ from those of the other languages.

Additional information about Greek on the Internet can be found in RFC 1947, "Greek Character Encoding for Electronic Mail Messages". See http://andrew2.andrew.cmu.edu/rfc/rfc1947.html

10.1 Greek encoding standards

Greek exists in many different encodings. The most common are 737 and 928. Both are for monotonic Greek. 737 is used by DOS, while 928 is used by all UNIX systems and by Windows (with small variations). Linux has 928 as its main code page. Having two or more standards for Greek is, of course, a big problem, which is worked around with special converters that change text from one set to the other.
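As an illustration of what such a converter does, here is a minimal Python sketch. It assumes that Python's built-in `cp737` and `iso8859_7` codecs are acceptable stand-ins for the 737 and 928 tables; the function name `convert_737_to_928` is hypothetical and is not one of the tools listed in this chapter.

```python
def convert_737_to_928(data: bytes) -> bytes:
    """Re-encode DOS code page 737 bytes as ISO 8859-7 (ELOT 928) bytes.

    Characters that exist in 737 but not in 928 (for example the DOS
    box-drawing symbols) are replaced with '?'.
    """
    text = data.decode("cp737")
    return text.encode("iso8859_7", errors="replace")


if __name__ == "__main__":
    sample_737 = "Καλημέρα".encode("cp737")
    sample_928 = convert_737_to_928(sample_737)
    # The same letters are stored under different byte values in each set.
    print(sample_737.hex(), "->", sample_928.hex())
    print(sample_928.decode("iso8859_7"))
```

Going through Unicode as an intermediate form keeps the sketch short; a dedicated converter would typically use a direct byte-to-byte table instead.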

In Oracle's documentation for Linux and in the server manuals, one can find the widespread Greek standards used in databases (and hence on the most important computer systems) and their standardized (yet again?) codes:

In addition, OS/2 uses code pages 869 and 851 for Greek.

10.2 737

737 is also known as 437G (= 437 Greek), because it arose as a modification of the American 437. It first appeared in the Greek EPROMs of the MDA and Hercules graphics cards of the first PCs, that is, it lived in the HARDWARE. It was used extensively under DOS, so all files that come from there can be expected to be 737. Because 737 is now considered a DOS leftover, it is better to convert 737 files to 928; see convertgreek. On Linux, code page 737 is fully supported only on the console (text mode), but there are also some fonts for X-Windows.

Kernel modification for 737 support

Cases have been reported where "δ" (lowercase delta) cannot be typed on some kernels. This happens because its 737 code coincides with 128+ESC (128+27=155=asc("δ")). Go to /usr/src/linux/drivers/char/console.c, where at some point it says:

              && (c != 127 || disp_ctrl)
              && (c != 128+27);
change it to
              && (c != 127 || disp_ctrl)
              /*      && (c != 128+27)*/;
and compile a new kernel.
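The arithmetic behind this collision can be checked with a few lines of Python. This is only a sketch: it assumes that Python's `cp737` codec matches the table used by the console, and the variable names are illustrative.

```python
ESC = 27  # ASCII escape

# Byte value of lowercase delta in code page 737.
delta_byte = "\u03b4".encode("cp737")[0]

print(delta_byte, hex(delta_byte))  # 155 0x9b
print(delta_byte == 128 + ESC)      # True: the same value as 128+ESC
```

The value 155 (0x9B) is what the kernel test `c != 128+27` filters out, which is why commenting that test out lets the character through.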

737 in X-Windows

737 is supported by some of the fixed fonts in the Grafis package: graphis.

[[email protected]]'s report for names (from xlsfonts):
-misc-grfixed-medium-r-normal--0-0-75-75-c-0-grpc-737
-misc-grfixed-medium-r-normal--0-0-85-85-m-0-grpc-737
-misc-grfixed-medium-r-normal--14-110-75-75-c-75-grpc-737
-misc-grfixed-medium-r-normal--16-120-75-75-c-75-grpc-737
-misc-grfixed-medium-r-normal--23-179-85-85-m-120-grpc-737
-misc-grfixed-medium-r-semicondensed--0-0-75-75-c-0-grpc-737
-misc-grfixed-medium-r-semicondensed--10-100-75-75-c-60-grpc-737
-misc-grfixed-medium-r-semicondensed--13-120-75-75-c-60-grpc-737
-misc-grvga-medium-r-normal--0-0-75-75-c-0-grpc-737
-misc-grvga-medium-r-normal--13-120-75-75-c-60-grpc-737
  (I think some of them have bugs, and I intend to fix them in the next release.)

10.3 928

Greek 928 is the most modern and widespread standard, and it was originally established by ELOT. It was later adopted by ISO as ISO 8859-7 (the Latin/Greek alphabet), and even the UNICODE support for Greek is based on it. 928 is used by all UNIX applications and on the Internet, and it is today's standard for Linux as well. The 928 standard is supported both on the console (text mode) and in the graphical environment (X-Windows).

Windows-1253

The main deviation of Windows Greek (Windows-1253) from the ELOT 928 standard concerns the 928 character "Ά" (capital alpha with tonos), whose code on Windows corresponds to the paragraph mark. Windows-1253 reportedly also lacks the ano teleia and the Greek quotation marks << and >>. Since we inevitably have to accept this limitation imposed on us by MS-Windows, and since quite a few users work on the Wintel platform, it is best to avoid the accented capital alpha when sending e-mails, postings, etc. Alternatively, you can write 'A (' = SHIFT+"). Similar problems exist with 'E and 'O. For your convenience, these are all the accented capitals in 928: Ά Έ Ή Ί Ό Ύ Ώ.
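The accented capital alpha mismatch can be reproduced with a short Python sketch. It assumes that Python's `iso8859_7` and `cp1253` codecs represent 928 and Windows-1253 respectively; the variable names are illustrative only.

```python
ALPHA_TONOS = "\u0386"  # GREEK CAPITAL LETTER ALPHA WITH TONOS

byte_928 = ALPHA_TONOS.encode("iso8859_7")  # code of the letter in 928
byte_1253 = ALPHA_TONOS.encode("cp1253")    # code of the letter in Windows-1253

# What a Windows-1253 reader shows when it receives the 928 byte.
misread = byte_928.decode("cp1253")

print(byte_928.hex(), byte_1253.hex())  # b6 a2
print(misread == "\u00b6")              # True: it shows up as a pilcrow
```

So a message written in 928 and read as Windows-1253 shows a paragraph mark wherever the accented capital alpha was typed.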

10.4 Unicode

UNICODE (ISO 10646) was originally defined as a 16-bit code (i.e. 65536 combinations) and includes many languages, among them modern Greek at offset U+0370 and ancient Greek at offset U+1F00. It covers everything from modern to ancient (polytonic) Greek, and even Linear B! Linux supports UNICODE internally, but its use is not yet widespread, because that also depends on its adoption by applications. For more, see: http://www.linuxdoc.org/HOWTO/Unicode-HOWTO.html
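The two offsets can be illustrated with Python's standard `unicodedata` module. This is a sketch only: the helper `greek_block` is hypothetical, and the ranges it uses are those of the Unicode "Greek and Coptic" (U+0370 to U+03FF) and "Greek Extended" (U+1F00 to U+1FFF) blocks.

```python
import unicodedata


def greek_block(ch: str) -> str:
    """Name the Unicode Greek block a character falls into, if any."""
    cp = ord(ch)
    if 0x0370 <= cp <= 0x03FF:
        return "Greek and Coptic"   # modern (monotonic) Greek
    if 0x1F00 <= cp <= 0x1FFF:
        return "Greek Extended"     # polytonic Greek
    return "not Greek"


if __name__ == "__main__":
    for ch in ("\u03b1", "\u03ce", "\u1f00", "a"):
        name = unicodedata.name(ch)
        print(f"U+{ord(ch):04X}  {greek_block(ch):17}  {name}")
```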

====================================================================
Vasilis Vasaitis <[email protected]>:
 Although I have not dealt with the subject extensively, I can contribute
some knowledge I have on the matter. So, here goes:

  At some point, a while ago, I downloaded a Unicode fixed font for X
Windows. Since I rarely delete what I download, I found it still sitting on
my disk. This font does not contain the full Unicode, since that consists
of more than 38000 characters, most of which are Chinese/Japanese/Korean
and would not fit in the 6x13 of fixed anyway. However, with about 2800
characters (at least in the version I have) it fully covers the Latin,
Greek, Cyrillic, Armenian, Georgian and Hebrew scripts, plus some technical
and mathematical symbols. This font can be used as a template by anyone
interested in designing fonts with many characters. For more practical
applications, see below. The page of the person who made it, if it is still
the same, is:

  http://www.cl.cam.ac.uk/~mgk25/

        Console support:

  The console has supported Unicode for ages, through UTF8 of course (for
those who don't know, UTF8 is a variable-length representation of Unicode
which, for the first 128 characters, has the same form as ASCII). The issue
is that VGA support for simultaneously displayed characters is in any case
very limited (256, or 512 without blinking).

        X support:

  The font I mention above works just fine, and the last time I tried it
was a long time ago. I also happen to have an X server with built-in
support for TrueType fonts (I don't load a font server), and I see that
TrueType works fine too. For those who don't know, XFree86 4.0 will come
with built-in TrueType support. Microsoft (I have no fonts from another
company) uses in its fonts the Windows Glyph List 4 (WGL4), which is a
subset of ISO 10646-1 (more or less what the font I described at the start
contains).

        Applications:

  Here everything falls apart. At the moment there are a couple of programs
that convert from/to UTF8, plus yudit and Netscape, which collect glyphs
from here and there to find enough Unicode symbols, and beyond that there
is chaos. In any case it would be good to start an effort on the fonts, and
I imagine the applications will try to follow.

---------------

Report from Panagiotis Vrionis:

I know that Giannis Gyftomitros <[email protected]> has already started
working on the possibility of creating Unicode fonts that also contain
Greek (Project Grafis, see GRArial etc.); he may have progressed even
further...

Since version 6.0, the XFS included in Red Hat carries a patch that lets
it display TrueType fonts. See the relevant "White Paper" in the "support"
section of http://www.redhat.com/ . If you install Unicode TTFonts (e.g.
those from M$), they work, in the sense that the fonts show up as available
with a thousand and two different encodings. I don't know, however, whether
they also work as a Unicode font, e.g. for viewing a text with Greek,
English and Chinese at the same time in Netscape.

=====================================================================

Unicode Links

There is a fixed font for X Windows; see: http://www.cl.cam.ac.uk/~mgk25/ucs-fonts.html

There is also a text editor for Unicode, called Yudit: ftp://metalab.unc.edu/pub/Linux/apps/editors/X/yudit-1.1.tar.gz

UTF-8 is now a standard on the Internet; see the relevant RFC: http://andrew2.andrew.cmu.edu/rfc/rfc2279.html

More about modern Greek in Unicode here: http://charts.unicode.org/Unicode.charts/normal/U0370.html
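The two properties of UTF-8 mentioned in this section, variable length and identity with ASCII for the first 128 characters, can be seen in a small Python sketch. The helper name `utf8_lengths` is hypothetical.

```python
def utf8_lengths(text: str) -> list:
    """Return (character, number of UTF-8 bytes) pairs for each character."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


if __name__ == "__main__":
    # Latin 'a', modern Greek alpha (U+03B1), polytonic alpha with psili (U+1F00).
    for ch, size in utf8_lengths("a\u03b1\u1f00"):
        print(f"U+{ord(ch):04X} takes {size} byte(s)")

    # Pure ASCII text is byte-for-byte identical in UTF-8.
    print("Linux".encode("utf-8") == "Linux".encode("ascii"))
```

Modern Greek letters therefore take two bytes each in UTF-8, against one byte in 737 or 928, while polytonic letters from the U+1F00 range take three.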

10.5 Greek converters

gr2gr

Aggelos Xaritsns has written this converter: ftp://ftp.hri.org/pub/greek/programs/gr2gr.prl It runs under Perl (5 or 4), so it works on any operating system where Perl is installed (unix, dos, win32, os2, mac, vms ...).

It supports many different Greek encodings, such as:

grfilter

The Computer Technology Institute offers grfilter: ftp://ftp.cti.gr/pub/src/grfilter.tar

greek2lat

In the directory ftp://corfu.forthnet.gr/pub/greek2lat there is a converter from 928 to greeklish, which is also suitable for WEB sites.
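To show the idea behind a 928-to-greeklish conversion, here is a minimal Python sketch. It is not greek2lat's algorithm: the letter table `GREEKLISH` is an assumption based on one common visual transliteration convention (η as n, ν as v, θ as 8, ξ as 3, ω as w, ψ as y), and the function name `to_greeklish` is hypothetical.

```python
import unicodedata

# Assumed visual transliteration table; other greeklish conventions exist.
GREEKLISH = {
    "α": "a", "β": "b", "γ": "g", "δ": "d", "ε": "e", "ζ": "z",
    "η": "n", "θ": "8", "ι": "i", "κ": "k", "λ": "l", "μ": "m",
    "ν": "v", "ξ": "3", "ο": "o", "π": "p", "ρ": "r", "σ": "s",
    "ς": "s", "τ": "t", "υ": "u", "φ": "f", "χ": "x", "ψ": "y",
    "ω": "w",
}


def to_greeklish(text: str) -> str:
    """Transliterate Greek letters to ASCII, dropping accents."""
    out = []
    # NFD splits an accented letter into base letter + combining mark.
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            continue  # drop tonos, dialytika and other combining marks
        latin = GREEKLISH.get(ch.lower())
        if latin is None:
            out.append(ch)  # not a Greek letter: keep unchanged
        else:
            out.append(latin.upper() if ch.isupper() else latin)
    return "".join(out)


if __name__ == "__main__":
    data_928 = "ελληνικά".encode("iso8859_7")  # stand-in for a 928 file
    print(to_greeklish(data_928.decode("iso8859_7")))
```

Because the accents are dropped and several Greek letters share a Latin one, the mapping is lossy and cannot be reversed reliably.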

trans120.tar.gz

Kwstas Kwstns has also written this converter, which supports many Greek encodings as well as other languages: http://www.kostis.net/freeware/trans120.tar.gz

gkconv

There is also a program by Giwrgos Spnliwtns that converts between 437, Win95 and X win. Its address is unknown.

recode

This is a small general-purpose program from the GNU project that supports conversions for many different languages (including Greek). Perhaps all the other programs should at some point be merged into it. See http://www.delorie.com/gnu/docs/recode/recode_toc.html

10.6 File types and their conversion

.txt, .doc

Depends on the case; see convertgreek

.dbf

Usually 737. Converting them needs care; leave it to a guru.

.diz

Usually 737; see convertgreek

.html

They must be 928, and then they display normally.

.mov, .avi

If it has Greek subtitles, it will be OK :-)

.exe, .com

Throw them away.

10.7 What else is there on the Internet about Greek?

Useful links:

Search for whatever you need with this search engine: http://www.google.org

