regex - JavaScript - normalize accented Greek characters


I'm trying to apply some kind of normalization to Greek text (lowercase, remove accents, and replace ς with σ). For example, I would like "ἀντίθεσις" (polytonic Greek) and "αντίθεσις" (modern Greek) to become "αντιθεσισ". I searched and wrote down which character substitutions I should do.

    Greek and Coptic (range: 0370-03FF)
    ΆΑά -> Α
    ΈΕέ -> Ε
    ΉΗή -> Η
    ΊΪΙίΐ -> Ι
    ΌΟό -> Ο
    ΎΫΥΰϋύ -> Υ
    ΏΩώ -> Ω

    Greek Extended (range: 1F00-1FFF)
    ἀἁἂἃἄἅἆἇὰάᾀᾁᾂᾃᾄᾅᾆᾇᾰᾱᾲᾳᾴᾶᾷἈἉἊἋἌἍἎἏᾈᾉᾊᾋᾌᾍᾎᾏᾸᾹᾺΆᾼ -> Α
    ἐἑἒἓἔἕὲέἘἙἚἛἜἝῈΈ -> Ε
    ἠἡἢἣἤἥἦἧὴήᾐᾑᾒᾓᾔᾕᾖᾗῂῃῄῆῇἨἩἪἫἬἭἮἯᾘᾙᾚᾛᾜᾝᾞᾟῊΉῌ -> Η
    ἰἱἲἳἴἵἶἷὶίῐῑῒΐῖῗἸἹἺἻἼἽἾἿῘῙῚΊ -> Ι
    ὀὁὂὃὄὅὸόὈὉὊὋὌὍῸΌ -> Ο
    ὐὑὒὓὔὕὖὗὺύῠῡῢΰῦῧὙὛὝὟῨῩῪΎ -> Υ
    ὠὡὢὣὤὥὦὧὼώᾠᾡᾢᾣᾤᾥᾦᾧῲῳῴῶῷὨὩὪὫὬὭὮὯᾨᾩᾪᾫᾬᾭᾮᾯῺΏῼ -> Ω
    ῤῥῬ -> Ρ
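A table like this can also be written down as data and applied in a single pass over the string. The following is only a minimal sketch: the names `rows`, `lookup`, and `applyTable` are hypothetical, only three rows of the Greek and Coptic table are included, and the targets are lowercase on the assumption that the goal is lowercase output.

```javascript
// Hypothetical sketch: each row is [characters to replace, replacement].
// Only three rows are shown; the targets are lowercase by assumption.
const rows = [
  ['ΆΑά', 'α'],
  ['ΈΕέ', 'ε'],
  ['ΉΗή', 'η'],
];

// Build one lookup table: character -> replacement.
const lookup = new Map();
for (const [chars, target] of rows) {
  for (const ch of chars) {
    lookup.set(ch, target);
  }
}

// Replace every mapped character in one pass; unmapped characters stay as they are.
function applyTable(text) {
  return Array.from(text, (ch) => (lookup.has(ch) ? lookup.get(ch) : ch)).join('');
}

console.log(applyTable('Έλα Άρη'));
```

Adding a row to `rows` is then enough to cover another letter, without touching the replacement logic.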

I am wondering what the best way to do these substitutions is, ideally without checking the string character by character.

1st try (replacement tables):

    var normal = 'Οι ρίζες του βρίσκονται σε ένα κείμενο Λατινικής λογοτεχνίας 45 π.Χ., φτάνοντας την ηλικία του πάνω από 2000 έτη.';
    var polytonic = 'Μήγαρις ἔχω ἄλλο στὸ νοῦ μου πάρεξ ἐλευθερία καὶ γλώσσα;';

    console.log(normalizeGreek(normal));
    console.log(normalizePolytonicGreek(polytonic));

    // Monotonic Greek: map accented and uppercase vowels to the plain lowercase vowel.
    function normalizeGreek(text) {
      text = text
        .replace(/[ΆΑά]/g, 'α')
        .replace(/[ΈΕέ]/g, 'ε')
        .replace(/[ΉΗή]/g, 'η')
        .replace(/[ΊΪΙίΐϊ]/g, 'ι')
        .replace(/[ΌΟό]/g, 'ο')
        .replace(/[ΎΫΥύΰϋ]/g, 'υ')
        .replace(/[ΏΩώ]/g, 'ω')
        .replace(/[Σς]/g, 'σ');
      return text;
    }

    // Polytonic Greek: the same mapping, extended with the Greek Extended block.
    function normalizePolytonicGreek(text) {
      text = text
        .replace(/[ΆΑάἀἁἂἃἄἅἆἇὰάᾀᾁᾂᾃᾄᾅᾆᾇᾰᾱᾲᾳᾴᾶᾷἈἉἊἋἌἍἎἏᾈᾉᾊᾋᾌᾍᾎᾏᾸᾹᾺΆᾼ]/g, 'α')
        .replace(/[ΈΕέἐἑἒἓἔἕὲέἘἙἚἛἜἝῈΈ]/g, 'ε')
        .replace(/[ΉΗήἠἡἢἣἤἥἦἧὴήᾐᾑᾒᾓᾔᾕᾖᾗῂῃῄῆῇἨἩἪἫἬἭἮἯᾘᾙᾚᾛᾜᾝᾞᾟῊΉῌ]/g, 'η')
        .replace(/[ΊΪΙίΐϊἰἱἲἳἴἵἶἷὶίῐῑῒΐῖῗἸἹἺἻἼἽἾἿῘῙῚΊ]/g, 'ι')
        .replace(/[ΌΟόὀὁὂὃὄὅὸόὈὉὊὋὌὍῸΌ]/g, 'ο')
        .replace(/[ΎΫΥΰϋύὐὑὒὓὔὕὖὗὺύῠῡῢΰῦῧὙὛὝὟῨῩῪΎ]/g, 'υ')
        .replace(/[ΏΩώὠὡὢὣὤὥὦὧὼώᾠᾡᾢᾣᾤᾥᾦᾧῲῳῴῶῷὨὩὪὫὬὭὮὯᾨᾩᾪᾫᾬᾭᾮᾯῺΏῼ]/g, 'ω')
        .replace(/[ῤῥῬ]/g, 'ρ')
        .replace(/[Σς]/g, 'σ');
      return text;
    }

I don't think you can do it any other way than checking each letter, but that doesn't hurt anybody:

    result = string.replace(/Ά|Α|ά/g, 'α').replace(/Έ|Ε|έ/g, 'ε').replace(/Ή|Η|ή/g, 'η'); // and so on ...

Or, if you loop instead, which you probably should once you have more than a few characters to check, and which also makes the code easier to maintain, store the characters in an array of objects. Example of the matches with objects:

    var cvtValues = [
      /* from = characters to replace; to = character to convert to */
      {from: ['Ά', 'Α', 'ά'], to: 'α'},
      {from: ['Έ', 'Ε', 'έ'], to: 'ε'},
      {from: ['Ή', 'Η', 'ή'], to: 'η'}
    ];

    /* loop over all from/to containers */
    for (var i = 0; i < cvtValues.length; i++) {
      /* loop over all characters in the 'from' array and replace them with the 'to' value */
      for (var x = 0; x < cvtValues[i].from.length; x++) {
        string = string.replace(new RegExp(cvtValues[i].from[x], 'g'), cvtValues[i].to);
        /* you can assign it to another variable, e.g. result, if you want to */
      }
    }
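A different technique from the replacement tables above, offered only as a sketch: Unicode normalization can split each letter from its diacritics, so the accents can be removed without listing every character. The function name `normalizeGreekNFD` is hypothetical. The sketch assumes an environment with `String.prototype.normalize` (ES2015 or later) and assumes that every mark to remove falls in the Combining Diacritical Marks block (U+0300 to U+036F), which covers the Greek tonos, breathings, perispomeni, dialytika, and iota subscript.

```javascript
// Sketch under the assumptions stated above; not the method used in the answer.
function normalizeGreekNFD(text) {
  return text
    .normalize('NFD')                 // decompose: letter + combining marks
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .toLowerCase()                    // lowercase every letter, not only the vowels
    .replace(/ς/g, 'σ');              // final sigma -> sigma
}

console.log(normalizeGreekNFD('ἀντίθεσις')); // αντιθεσισ
console.log(normalizeGreekNFD('αντίθεσις')); // αντιθεσισ
```

Unlike the table approach, this also lowercases consonants and handles characters that were not listed by hand, at the cost of removing every combining mark in that block, Greek or not.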
