2019-10 / Alexander Tkačenko

Venedica, transliteration of Cyrillic

Unified transliteration system of the Cyrillic script for all Slavic languages

Features

This transliteration system:

Less like English

Following the patterns of English results in inconsistencies. A notable example is the use of the Latin letter y to transliterate й, ы, -ий, and -ый within a single transliteration system (×Altay, ×Andrey, ×Vasily, ×Maly), which erases the difference between these letters and sounds, a difference that is obvious to a speaker of the Slavic languages.

Also, English orthography has no established way of marking soft consonants, so softness ends up being conveyed inconsistently in transliterations of the Slavic languages.

Furthermore, mimicking English orthography while relying on apostrophes and consonant clusters like zh, shch, kh still produces writing that looks deeply foreign to an English speaker, which defeats the purpose of making a transliteration system look familiar to speakers of English.

More like Czech and Polish

The Latin-scripted Slavic languages (such as Czech and Polish) have phonetic systems similar to those of the Cyrillic-scripted Slavic languages, and they already have well-established patterns for representing most phonetic features typical of the Slavic languages. They are therefore the most fitting source for a transliteration system for Slavic Cyrillic.

Also, writing all Slavic languages in a similar and nearly mutually intelligible way should contribute to better understanding between speakers of these languages.

Transliteration rules

As shown below, most Cyrillic letters have a simple one-to-one transliteration, with a few exceptions still following clear long-established patterns.

Czech sibilants

Cyr Lat IPA[a]
ц c[b] [t͡s]
с s [s]
з z [z]
ч č [t͡ʃ, t͡ɕ]
ш š [ʃ]
ж ž [ʒ]

č, š, ž are the common representations of the sounds corresponding to ч, ш, ж. These letters have been used in Czech, Slovak, and Slovene for centuries.

The diacritic mark, caron, in these characters indicates their affinity with the unmarked consonants, which can be seen for instance in Russian word forms like улица — уличный (c/č), писать — пишу (s/š), возить — вожу (z/ž).

This also means that, in many contexts, typing with a keyboard lacking the accented characters shouldn't render a text completely incomprehensible.
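The caron fallback described above can be illustrated with a short Python sketch. The function names are hypothetical and not part of the transliteration system; the mapping covers only the six lowercase letters from the table, and the fallback simply drops every combining mark from already transliterated Latin text.

```python
import unicodedata

# Sibilant pairs copied from the table above (lowercase only).
SIBILANTS = {"ц": "c", "с": "s", "з": "z", "ч": "č", "ш": "š", "ж": "ž"}


def transliterate_sibilants(text: str) -> str:
    """Replace the six sibilant letters; every other character passes through."""
    return "".join(SIBILANTS.get(ch, ch) for ch in text)


def strip_diacritics(latin: str) -> str:
    """Hypothetical fallback for keyboards without accented letters: č -> c."""
    decomposed = unicodedata.normalize("NFD", latin)
    # Combining marks have a non-zero canonical combining class.
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


print(transliterate_sibilants("чшж"))
print(strip_diacritics("uličný"))
```

With the marks stripped, uličný degrades to ulicny, which still shares its visible stem with ulica. The fallback is meant for Latin output only: applied to Cyrillic it would also turn й into и.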

Polish soft acute

Cyr Lat IPA[a]
ь ◌́ goes as an acute over the preceding consonant (◌́), except č, š, ŝ, ž and except Old Slavic
omitted after č, š, ŝ, ž, except Old Slavic
ě [*ĕ, *ĭ] in Old Slavic (see also jer and jeŕ)
ѓ d́[1] [dʲ]
di before vowels, except i
d before i
ќ ć [c, tɕ]
ci before vowels, except i
c before i
љ ĺ [ʎ, lʲ]
li before vowels, except i
l before i
њ ń [ɲ]
ni before vowels, except i
n before i
ћ ć [tɕ]
ci before vowels, except i
c before i
с́ ś [ɕ]
si before vowels, except i
s before i
з́ ź [ʑ]
zi before vowels, except i
z before i

[1] Although the Cyrillic letter ѓ is graphically based on г, which is represented by Latin g, it is transliterated according to its phonetic value, which makes it less cryptic and more recognizable to speakers of other Slavic languages. For a similar reason, Cyrillic ќ is transliterated as ć.

A consonant is marked as palatalized with the acute accent unless it is followed by a vowel. Before a vowel, palatalization is marked by inserting i after the consonant. If the following vowel is itself i, the palatalizing i is redundant and is dropped. This is similar to the pattern in Polish (ć turns to ci). The same pattern applies to any plain consonant in order to render its palatalized counterpart: n → ń, m → ḿ, r → ŕ, etc. (as in Polish Poznań).
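The three-way choice between the acute, an inserted i, and no mark at all can be sketched as a small Python function. The name soften and its signature are assumptions made for illustration. The sketch works on already transliterated Latin letters, treats only the plain vowels a, e, i, o, u, y as vowels, and ignores the Old Slavic exception from the table.

```python
import unicodedata

ACUTE = "\u0301"                                    # combining acute accent
VOWELS = set("aeiouy")                              # plain Latin vowels only, a simplification
NO_MARK = {"\u010d", "\u0161", "\u015d", "\u017e"}  # č, š, ŝ, ž: softness is left unmarked


def soften(consonant: str, following: str = "") -> str:
    """Spell a palatalized consonant; `following` is the next Latin letter, if any."""
    consonant = unicodedata.normalize("NFC", consonant)
    if consonant in NO_MARK:
        return consonant
    if following == "i":
        return consonant              # the i itself already signals softness
    if following in VOWELS:
        return consonant + "i"        # n + a gives nia, as in Polish
    # No vowel follows: put the acute over the consonant.
    return unicodedata.normalize("NFC", consonant + ACUTE)


print(soften("n"), soften("n", "a") + "a", soften("n", "i") + "i")
```

Consonants without a precomposed accented form (such as d or t) keep the combining acute as a separate code point after NFC normalization.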

Iotated vowels

Cyr Lat IPA[a]
е je [je] word-initially and after vowels
e [ʲe] after consonants
є je [je] word-initially and after vowels
ie [ʲe] after consonants
ѣ [je, ji, ije] word-initially and after vowels
[ʲe, ʲi, ʲije] after consonants
ё jo [jo] word-initially and after vowels
io [ʲo] after consonants, except č, š, ŝ, ž
o [o] after č, š, ŝ, ž
ю ju [ju] word-initially and after vowels
iu [ʲu] after consonants, except č, š, ŝ, ž
u [u] after č, š, ŝ, ž
я ja [ja] word-initially and after vowels
ia [ʲa] after consonants, except č, š, ŝ, ž
a [a] after č, š, ŝ, ž
ї ji [ji]

The sounds corresponding to the characters я, е, ё, ю are represented as j + vowel, as in the Latin-scripted Slavic languages: ja, je, jo, ju (as in the name Jan). This applies when the [j] sound is actually present: word-initially, after a vowel, after the soft sign ь or the hard sign ъ, and after the Cyrillic apostrophe ' (in Belarusian and Ukrainian). In all these cases, j is at the beginning of a syllable.

After palatalized (softened) consonants, these characters no longer contain the [j] sound and, for this reason, they are represented with ia, iu, io (as in Polish), where the leading i marks the preceding consonant as softened. (Examples in Russian: Jaroslavĺ, Riazań.)

However, Cyrillic е after consonants can stand for both [ʲe] (as in Rus. тесто [tʲestə]) and [e] (as in Rus. тест [test]), which is why ie and e are merged into a single representation, e, after consonants. At the beginning of a syllable, Cyrillic е is still represented as je.
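The context rules for е, ё, ю, я can be summarized in a few lines of Python. This is a minimal sketch: the function name iotated is hypothetical, only lowercase letters are handled, and є and ѣ, which follow slightly different rows in the table, are left out.

```python
CYR_VOWELS = set("аеёиіоуыэюя")          # vowel letters that may precede
SIGNS = set("ьъ'\u2019")                 # soft sign, hard sign, apostrophes
HUSHING = set("чшщж")                    # Cyrillic counterparts of č, š, ŝ, ž
BASE = {"е": "e", "ё": "o", "ю": "u", "я": "a"}


def iotated(letter: str, previous: str = "") -> str:
    """Transliterate е, ё, ю or я; `previous` is the preceding Cyrillic letter ('' at word start)."""
    base = BASE[letter]
    if previous == "" or previous in CYR_VOWELS or previous in SIGNS:
        return "j" + base                # [j] is actually pronounced here
    if letter == "е" or previous in HUSHING:
        return base                      # e after any consonant; o, u, a after ч, ш, щ, ж
    return "i" + base                    # the leading i marks the consonant as soft


print(iotated("я"), iotated("я", "р"), iotated("е", "т"))
```

The first two calls correspond to the я in Jaroslavĺ and in Riazań, the third to the е in тесто and тест.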

Cyrillic ѣ (jat́) (non-cursive form: ѣ) is transliterated as due to the probable historical pronunciation being close to [ie], and its cursive form resembling a ligature of іь, with the Old Slavic vowel ь represented as ě (see jer and jeŕ) (ѣ → іь → ).

i, j, y

Cyr Lat IPA[a]
і i [i]
í [i(ː), ij] long i, for [i] before or after a vowel and for ій or іј
и i [i] except Ukrainian
í [i(ː), ij] except Ukrainian: long i, for [i] before or after a vowel and for ий or иј
y [ɨ] in Ukrainian
ý [ɨ(ː), ɨj] in Ukrainian: long y, for ий
ы y [ɨ]
ý [ɨ(ː), ɨj] long y, for ый
й j [j]
i short i, for [j] after a vowel and not followed by a vowel
ј j [j]
i short i, for [j] after a vowel and not followed by a vowel
ї ji [ji]

The characters í and ý stand for the long vowels in the Czech and Slovak orthographies: dobrý den, Letní stadion, Průmyslový palác.

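In the transliteration, i and y are short. y represents [ɨ] (as in Polish and Old Czech). i is either the vowel [i] (Cyrillic і, и) or the short i that stands for [j] (Cyrillic й, ј) after a vowel when no vowel follows.

í and ý, by contrast, are long. As the table above shows, í stands for [i] next to another vowel and for ій, іј, ий, иј, while ý stands for ый (and for ий in Ukrainian).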

The use of the accented í character next to other vowels is also akin to the use of the diaeresis in some other languages to distinguish two standalone sounds from a diphthong represented by the same two characters, like in French naïf, Noël, Citroën, Moët.

Graphically, í and ý can be regarded as merged digraphs of ij and yj, where the trailing j was transformed to a handier superscript stroke.

As shown in the table above, some instances of the [j] sound (й) are represented by the letter j (jot), not y, as in all Latin-scripted Slavic languages.
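The choice between i, í, y, ý, and j depends on the neighbouring letters, which the following Python sketch makes explicit. It is a hedged illustration only: the function name and its return convention (Latin text plus the number of Cyrillic letters consumed) are assumptions, and it covers lowercase и, ы, й under the non-Ukrainian rows of the table.

```python
VOWELS = set("аеёиоуыэюя")  # Russian vowel letters, lowercase


def i_j_y(word: str, pos: int) -> tuple[str, int]:
    """Return (Latin text, Cyrillic letters consumed) for и, ы or й at word[pos]."""
    ch = word[pos]
    prev = word[pos - 1] if pos > 0 else ""
    nxt = word[pos + 1] if pos + 1 < len(word) else ""
    if ch == "ы":
        return ("ý", 2) if nxt == "й" else ("y", 1)   # ый merges into long ý
    if ch == "и":
        if nxt == "й":
            return ("í", 2)                           # ий merges into long í
        if prev in VOWELS or nxt in VOWELS:
            return ("í", 1)                           # standalone [i] next to a vowel
        return ("i", 1)
    if ch == "й":
        if prev in VOWELS and nxt not in VOWELS:
            return ("i", 1)                           # short i closing a syllable
        return ("j", 1)
    raise ValueError(f"expected и, ы or й, got {ch!r}")


print(i_j_y("алтай", 4), i_j_y("василий", 5), i_j_y("малый", 3))
```

Under these rules the four endings that the English-style y conflates come out as four distinct spellings: i for the й of Алтай and Андрей, í for the ий of Василий, and ý for the ый of Малый.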

Jer and jeŕ

Cyr Lat IPA[a]
ъ omitted, except Bulgarian and Old Slavic
ǒ [ɤ̞, ə] in Bulgarian
ǒ [*ə, *ŭ] in Old Slavic
ь ◌́ goes as an acute over the preceding consonant (◌́), except č, š, ŝ, ž and except Old Slavic
omitted after č, š, ŝ, ž, except Old Slavic
ě [*ĕ, *ĭ] in Old Slavic

The Old Slavic ъ and ь (jer and jeŕ) are transliterated as ǒ and ě respectively (not ŭ and ĭ), which correspond to the modern voiced vowels, о and е, descending from those older ones:

(The fact that in the written form of the Ancient Novgorod dialect ъ/о and ь/е were often used interchangeably also reinforces this approach to transliteration.)

Similarly, the Bulgarian ъ is transliterated as ǒ (not ǎ, as in ISO 9). Apart from offering a valid representation in Bulgarian, the transliteration of ъ as ǒ results in a more recognizable spelling when compared to the other cognate languages:

(See also the note on the letter ѣ.)

Balkan affricates

Cyr Lat IPA[a]
ђ d́ž [d͡ʑ]
џ [d͡ʒ, d͡ʐ]

Other letters

Cyr Lat IPA[a]
б b [b]
в v [v]
г g [g, ɣ, ɦ][1]
ґ ġ [g] used when opposed to г realized as [ɣ]/[ɦ]
д d [d]
ѕ dz [d͡z]
ј j [j]
к k [k]
л l [l]
м m [m]
н n [n]
п p [p]
р r [r]
т t [t]
ф f [f]
х h[b] [x]
щ ŝ[2] [ɕ(ː), ʃt͡ʃ, ʃt]
ѳ [f]
а a [a]
і i [i]
о o [o]
у u [u]
ў w [w]
э ê[3] [e]
e word-initially, the circumflex can be dropped

[1] Similarly to Dutch, [ɣ] is represented by g. Both [ɣ] and [ɦ] are regarded as allophones of [g] and therefore they are represented by the same letter (which should also contribute to the mutual intelligibility across cognate languages).

[2] ŝ resembles š (see Czech sibilants), just as щ is close to ш, both phonetically and graphically.

[3] The circumflex in ê helps distinguish words like Rus. mêtr (мэтр) and metr (метр), or Rus. sêr (сэр) and ser (сер). At the beginning of a word, Cyrillic е is transliterated as je (see also iotated vowels), whereas a transliterated э has no leading j, which makes the circumflex unnecessary there.
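The one-to-one rows of the table above, together with the rule for э from note [3], can be written as a lookup plus a single positional check. This is a sketch under stated assumptions: the function name is hypothetical, only lowercase input is handled, letters from the other tables pass through unchanged, ѳ is omitted, and the optional dropping of the circumflex word-initially is always applied.

```python
# One-to-one rows from the "Other letters" table (lowercase only).
PLAIN = {
    "б": "b", "в": "v", "г": "g", "ґ": "ġ", "д": "d", "ѕ": "dz", "ј": "j",
    "к": "k", "л": "l", "м": "m", "н": "n", "п": "p", "р": "r", "т": "t",
    "ф": "f", "х": "h", "щ": "ŝ", "а": "a", "і": "i", "о": "o", "у": "u",
    "ў": "w",
}


def simple_letters(word: str) -> str:
    """Transliterate the context-free letters and э; leave all other letters as they are."""
    out = []
    for pos, ch in enumerate(word):
        if ch == "э":
            # Word-initially there is no je to confuse it with, so plain e is enough.
            out.append("e" if pos == 0 else "ê")
        else:
            out.append(PLAIN.get(ch, ch))
    return "".join(out)


print(simple_letters("мэтр"), simple_letters("эра"), simple_letters("хор"))
```

A full transliterator would combine this lookup with the context-dependent rules from the earlier sections (sibilants, soft sign, iotated vowels, and i, j, y).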

Notes

[a] The IPA column shows an approximate phonetic value of the most common realization of the given character (close to standard, and stressed in the case of vowels).

[b] The letters c and h represent the [t͡s] (ц) and [x] (х) sounds respectively (not ts and kh). c stands for the [t͡s] sound in all Latin-scripted Slavic languages (as in the Polish words ulica, centralny).

The use of the ts and kh digraphs would only be reasonable in a writing system where the standalone letters c and h are already reserved for other sounds (as in English and French). Some transliteration systems that employ these digraphs leave the standalone c and h unused (which is odd in itself), losing the c/č affinity and looking conspicuously unlike the Latin-scripted Slavic languages.

Stress mark

The use of the acute accent ◌́ as a stress mark becomes ambiguous in writing systems that employ the acute for other purposes, as this transliteration system does. This ambiguity is easily resolved by using the underline as a stress mark instead, which in fact proves more suitable for the role.
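In plain text, one way to underline a stressed vowel is a combining character. The sketch below assumes U+0332 COMBINING LOW LINE as the underline; the text above does not say how the underline is encoded (in HTML it could just as well be markup), and the function name is hypothetical.

```python
UNDERLINE = "\u0332"  # COMBINING LOW LINE, an assumed plain-text stand-in for the underline


def mark_stress(word: str, vowel_index: int) -> str:
    """Underline the character at `vowel_index` (0-based, counted in code points)."""
    if not 0 <= vowel_index < len(word):
        raise IndexError("vowel_index is outside the word")
    # A combining mark attaches to the character that precedes it.
    return word[: vowel_index + 1] + UNDERLINE + word[vowel_index + 1 :]


print(mark_stress("Altai", 3))
print(mark_stress("Riaza\u0144", 4))
```

Because the underline sits below the letter, it does not collide with an acute that marks softness or length above it.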

The underline as a stress mark:

Web apps

See Russian translit app and Transliteration of Russian proper names.