LINGUISTICS
Peter Enders
HUYGENS’ PRINCIPLE FOR LINGUISTICS
(Dedicated to the 75th birthday of Werner Ebeling)
Physics has gained vast experience in applying mathematics and philosophy to science. This experience can, should, and already has been exploited by other branches, in particular by linguistics, as is shown here by means of a few examples. It is proposed to overcome the deficiencies of simple Markov chains for characterizing texts by means of correlated Markov chains. Texts of natural languages are considered to result from phonetic processes rather than to be merely static sequences of characters. This allows one to exploit the relationship between Markov chains and Huygens’ principle (both being much more widely applicable than originally conceived).
Introduction
In 2011, the new president of the Academy of German Language, Heinrich Detering (*1959), announced that the academy would be more concerned with social issues. Moreover, not only poets, scholars of literature, and critics, but also other experts, such as jurists, scientists, health professionals, and historians, should become members and care for the German language. In this spirit of the unity of science – including the Humboldtian ideal of universal education – a series of lectures on ‘Science – Language – Society’ was held at the Faculty of Philology of the Al-Farabi Kazakh National University, Almaty.
Why invite a physicist to give such lectures?
Due to its experience in methodology in a broader sense, physics is particularly able to support this approach. To illustrate this, let me mention the activities of a few scholars that are closely related to physics.
Galileo Galilei (1564-1642), the founder of modern physics, was among the first to publish not in Latin, but in his mother tongue. This made science accessible to the common reader. He stressed the role of notions: “Names and attributes must be accommodated to the essence of things, and not essences to names; for things come first, and names afterward.” (1613) “If the opinions of philosophers, and their words, have the power to call into existence the things they consider and name, why then I beg them the favor of their considering and naming ‘gold’ a lot of old hardware I have about the house.” (1623)
Isaac Newton (1642-1727), the founder of modern classical mechanics, aimed at a universal language [1] and did phonetic studies [2].
[1] R. W. V. Elliott, Isaac Newton’s ‘Of an Universal Language’, Mod. Lang. Rev. LII (1957) 1-18; jstor.org/stable/3719861
[2] R. W. V. Elliott, Isaac Newton as Phonetician, Mod. Lang. Rev. XLIX (1954) 5-12; jstor.org/stable/3718012
Михаил Васильевич Ломоносов (1711-1765), being perhaps most famous for his many milestone contributions to science and technology, was also a practitioner and theoretician of Russian literature.
Thomas Young (1773-1829), “the last man who knew everything” [3], is famous for his pioneering contributions to optics, elasticity, vision and color theory, liquids, medicine, music, and languages (the proposal of a universal phonetic alphabet, the coining of the term ‘Indo-European’ for that family of languages, basic results on the Egyptian hieroglyphs).
[3] A. Robinson, The Last Man Who Knew Everything: Thomas Young, the Anonymous Genius who Proved Newton Wrong and Deciphered the Rosetta Stone, among Other Surprising Feats, Penguin 2007
Андрей Андреевич Марков (1856-1922) applied his new formalism [4] – now known as Markov chains – to the statistics of letters in «Евгений Онегин» by Пушкин (1799-1837) [5] and to other texts [6]. This was highly welcomed by Роман Осипович Якобсон (1896-1982), Андрей Белый (Борис Николаевич Бугаев, 1880-1934) and others, who sought an objective characterization and evaluation of literature. Generally speaking, however, one should not forget that mathematics deals with mathematical structures, while their content comes from linguistics (physics, biology, ...).
[4] Распространение закона больших чисел на величины, зависящие друг от друга, Изв. Физ.-мат. общ. Каз. унив. [2] 15 (1906) 135-156; Engl.: Extension of the limit theorems of probability theory to a sum of variables connected in a chain, in: R. Howard, Dynamic Probabilistic Systems, Vol. 1: Markov Chains, Wiley 1971, Appendix B
[5] An Example of Statistical Investigation of the Text Eugene Onegin Concerning the Connection of Samples in Chains, Science in Context 19.4 (2006) 591-600; journals.cambridge.org/production/action/cjoGetFulltext?fulltextid=637500
[6] See also Berechenbare Künste. Mathematik, Poesie, Moderne, Berlin/Zürich: Diaphanes 2007; F. P. Ingold, Markows vergessener Beitrag zur quantitativen Textlinguistik, Recherche, 2009; recherche-online.net/andrej-markow.html; the web site Markov transition matrix for a verse (antonalexeev.hop.ru/markov/index.html) shows why a simple statistics of letters is not sufficient for describing a natural language.
Claude Elwood Shannon (1916-2001), the father of information theory, analyzed the English language by means of simple schemes and models, including Markov chains [7].
[7] A mathematical theory of communication, Bell System Techn. J. 27 (1948) 379-423, 623-656; notice the publication of this and other fundamental papers in an industrial journal!
Gell-Mann (*1929) & Ruhlen (*1944), both at the Santa Fe Institute (santafe.edu), have investigated the ordering of subject (S), verb (V), and object (O) in 2011 (!) languages [8]. They also note that three lines of evidence – from genetics, archaeology, and linguistics – all indicate that humans suddenly started using sophisticated tools and making objects of art around 50,000 years ago. They second the conjecture that this was connected with the appearance of fully modern human language.
[8] M. Gell-Mann & M. Ruhlen, The origin and evolution of word order; after Charles Day, blogs.physicstoday.org/thedayside/2011/10/a-physicist-tackles-the-evolution-of-word-order.html
Application of Markov Chains in Linguistics
Despite the simplicity of Markov chains (see below), they have influenced various developments [9], e.g.:
in the 1940s: Jan Mukařovský (1891-1975) and the Czech structuralists;
in the 1950s: Pierre Guiraud (1912-1983) and French mathematical linguistics;
in the 1960s and 1970s: Andrei Kolmogorov (1903-1987), Wilhelm Fucks (1902-1990), Roman Jakobson and the information-theoretically and semiotically founded theory of the science of poetry, the concrete (visual, auditive) poetry in Germany, the ČSSR and Brazil, as well as in L’Ouvroir de Littérature Potentielle (Oulipo; founding members, among others: the poet and philosopher of science Jean Lescure (1912-2005) and the mathematically experienced chess expert François Le Lionnais (1901-1984)).
More sophisticated statistical methods have been exploited in a still ongoing project, which is sketched next.
[9] F. P. Ingold, Mathematik und Poesie. Andrej A. Markows vergessener Beitrag zur quantitativen Textlinguistik, Recherche, 2009; recherche-online.net/andrej-markow.html
Deciphering and Representing Mesopotamian
Texts Using Statistical Methods
The texts on Mesopotamian clay tablets (ca. 3000 BC) contain calculations with apparent contradictions, the reasons for which were not known. Were there errors in the calculations?
In 1983, the mathematician Peter Damerow (1939-2011), the archaeologist Hans Jörg Nissen (*1935) and the orientalist Robert K. Englund (*1952) chose a novel approach. They did not analyze the texts on single tablets, but on 5,000 (!) tablets – using computers, of course. They obtained two crucial results:
1. The texts are not just the written form of a spoken language, but document administrative and accounting activities;
2. Different number systems were used for different tasks or objects (goods sold on the marketplace) to be counted.
Today, the internet platform ‘Cuneiform Digital Library Initiative’ (CDLI) depicts and describes tens of thousands of clay tablets with cuneiform writing. New 3D photography and processing techniques are employed. A complete archive is aimed at – which does not meet with the support of those who wish first to decipher their tablets on their own... [10]
[10] For a more detailed account, see K. Vaillant, E. Fesseler, Ideen, täglich: Wissenschaft in Berlin, Berlin: Nicolai, 2010, S. 138-151; damerow.mpiwg.de/pdf/ideen_138-151.pdf
Simple Markov Chains and Linguistics
A simple Markov chain – as considered by Markov – describes a (simple) Markov process, i.e., a random process where the future is determined only by the present and not by the past. It is given by a vector, z(0), describing the initial state, and by so-called transition matrices, P(k+1,k), connecting the states k and k+1: z(1) = P(1,0)·z(0), z(2) = P(2,1)·z(1), etc. Here, k is the process, or discrete time, variable. The values of P(k+1,k) are determined by the underlying process. In view of this mathematical simplicity, the wide applicability of Markov chains is rather astonishing.
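To make these formulas concrete, here is a minimal Python sketch (with hypothetical, hand-chosen numbers) that propagates an initial state vector through two column-stochastic transition matrices:

    import numpy as np

    # Hypothetical two-state example: column-stochastic transition matrices,
    # so that z(k+1) = P(k+1,k) @ z(k) preserves total probability.
    z0 = np.array([1.0, 0.0])            # initial state z(0)
    P10 = np.array([[0.9, 0.2],
                    [0.1, 0.8]])         # P(1,0)
    P21 = np.array([[0.7, 0.3],
                    [0.3, 0.7]])         # P(2,1)

    z1 = P10 @ z0                        # z(1) = P(1,0)·z(0)
    z2 = P21 @ z1                        # z(2) = P(2,1)·z(1)
    print(z1, z2)                        # both still sum to 1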
In linguistics, z is not a state vector, but the set of characters included, e.g.,
z = {NUL, ..., !, ..., 0, ..., A, ..., a, ..., ~, DEL}
for 7-bit ASCII [11]. P is considered not to be determined by an underlying process; the step number, k, has no meaning. For a given text, the matrix element Pij counts how often the character zi follows the character zj. For instance, within 7-bit ASCII, P97,101 equals the number of occurrences of the sequence ‘ea’, as in ‘real’ and ‘meat’ [11].
[11] See, e.g., asciitable.com/; 7-bit means 2^7 = 128 characters
This example suggests that the immediate sequence of characters might be too simple a means for characterizing a text.
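Both points can be made concrete with a small Python sketch (the sample string is hypothetical and serves only as an illustration): it accumulates the pair counts Pij for the string and then samples a pseudo-text from the resulting first-order model; the gibberish it produces shows what is lost when only immediate successions of characters are counted.

    import numpy as np

    def pair_counts(text: str) -> np.ndarray:
        """P[i, j] = how often ASCII character i immediately follows character j."""
        P = np.zeros((128, 128), dtype=int)
        for a, b in zip(text, text[1:]):
            P[ord(b), ord(a)] += 1          # b follows a
        return P

    sample = "the real meat of the meal"    # hypothetical sample text
    P = pair_counts(sample)
    print(P[ord('a'), ord('e')])            # P_97,101: number of 'ea' pairs (here 3)

    # Sampling a pseudo-text from this first-order model: each character depends
    # only on its immediate predecessor, which is why the output reads as gibberish.
    rng = np.random.default_rng(0)
    code = ord('t')
    out = [chr(code)]
    for _ in range(30):
        column = P[:, code].astype(float)
        if column.sum() == 0:               # no observed successor: stop
            break
        code = rng.choice(128, p=column / column.sum())
        out.append(chr(code))
    print("".join(out))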
Correlated Markov Chains and Linguistics
A correlated Markov chain contains some influence of the past. In other words, two subsequent steps – say, from k to k+1 and from k+1 to k+2 – are not independent, but correlated. Thus,
z(2) = P(2,1)·z(1) + Q(2,0)·z(0)
In linguistics, again, Q is considered not to be determined by a process, but by the text. The matrix element Qij counts how often the character zi is the second character after the character zj. Staying with 7-bit ASCII, Q97,101 equals the number of occurrences of the sequence ‘e?a’, where ‘?’ stands for any other letter (a wildcard), as in ‘legal’ and ‘metal’.
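The same kind of counting yields Q; the following sketch (again with a hypothetical sample string) counts characters an arbitrary number of positions apart, so that a gap of 1 reproduces P and a gap of 2 gives Q:

    import numpy as np

    def gap_counts(text: str, gap: int) -> np.ndarray:
        """M[i, j] = how often character i occurs exactly `gap` positions after j."""
        M = np.zeros((128, 128), dtype=int)
        for a, b in zip(text, text[gap:]):
            M[ord(b), ord(a)] += 1
        return M

    sample = "legal metal"                  # hypothetical sample string
    Q = gap_counts(sample, 2)               # gap 2 gives the matrix Q
    print(Q[ord('a'), ord('e')])            # Q_97,101: number of 'e?a' patterns (here 2)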
Actually, the frequencies of sequences of three or more letters are investigated. This transcends Markov processes, because the corresponding relation,
z(2) = R(2,1,0)·z(1):z(0),
which couples z(1) and z(0), is non-linear.
In what follows I will assume that there are texts that can be characterized by Markov chains.
Definition: A text is called an n-step Markov text if it can be characterized by an n-step Markov chain [12].
[12] Analogously to the famous question “Can one hear the shape of a drum?” by Mark Kac (1914-1984), one may ask: can one reconstruct a text from its (Markovian) transition matrices?
For the natural languages I pose the following
Hypothesis: There is some ‘mechanism’ behind
the natural languages, because they are conditioned
by phonetic processes.
Conclusion:
The natural languages can be
characterized by (correlated) Markov chains.
Of course, this has far-reaching consequences for the statistical characterization of texts. In particular, there are no single quantities that account for the words, e.g., R101,97,116 for ‘eat’ [13]. For it is not the static result of a text that is taken as the basis, but the building of words during speech.
[13] More exactly, R32,101,97,116,y, where ‘y’ stands for ‘space’ or a punctuation mark.
Some of these chains are distinguished as will be
outlined next.
Simple Markov Chains and Huygens’ Principle
There is a physical principle that – like Markov chains – looks rather special, but is much more widely applicable than first intended, viz., Huygens’ principle [14]. It was formulated by Christiaan Huygens (1629-1695) first for mechanics and later for optics (where it is best known). Briefly, every point which light reaches becomes the source of a secondary wave, and the superposition of all secondary wave(let)s yields the same light wave as the original source.
[14] Cf. K. Simonyi, A Cultural History of Physics, Peters/CRC Press 2012; for a recent review, see P. Enders, Huygens’ principle as universal model of propagation, Latin Am. J. Phys. Educ. 3 (2009) 19-32
The validity of Huygens’ principle for Markov chains has long been known. Richard Feynman (1918-1988) used this relationship in his famous path-integral formulation of quantum mechanics [15].
[15] R. P. Feynman, Space-Time Approach to Non-Relativistic Quantum Mechanics, Rev. Mod. Phys. 20 (1948) 367-387; reprint in: J. Schwinger (Ed.), Selected Papers on Quantum Electrodynamics, New York: Dover 1958, No. 27
Thus, for simple Markov chains we have
z(k+m) = P(k+m, k+m-1)·P(k+m-1, k+m-2)·...·P(k+1, k)·z(k)
Each matrix corresponds to the propagation from one wave front to the next; each matrix multiplication corresponds to the summation over the secondary wavelets. For this reason, the matrices P are Huygens propagators.
The k-step transition function, G(k+m, k), defined by z(k+m) = G(k+m, k)·z(k), is a Green’s function [16]. Since it is a product of single-step transition matrices, P(k+1, k), it obeys the equation
G(k+m, k) = G(k+m, k+l)·G(k+l, k),   0 ≤ l ≤ m
This is one form of the Chapman-Kolmogorov equation [17]. It represents the most general mathematical expression of Huygens’ principle [18]. For this reason, a Green’s function obeying the Chapman-Kolmogorov equation is also called a Huygens propagator; it propagates the system under consideration from state z(k) to state z(k+m).
[16] After George Green (1793-1841)
[17] After Андрей Николаевич Колмогоров (1903-1987) and Sydney Chapman (1888-1970)
[18] P. Enders, Huygens’ Principle and the Modelling of Propagation, Eur. J. Phys. 17 (1996) 226-235
Conjecture: The broad success of Markov chains is due to the fact that Huygens’ principle holds true for them.
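This composition rule is easy to check numerically; the following sketch does so for a chain of randomly generated, purely hypothetical column-stochastic single-step matrices:

    import numpy as np

    rng = np.random.default_rng(1)

    def random_stochastic(n: int) -> np.ndarray:
        """A random column-stochastic matrix: each column is a probability vector."""
        M = rng.random((n, n))
        return M / M.sum(axis=0)

    n, m = 4, 5
    steps = [random_stochastic(n) for _ in range(m)]   # P(k+1,k) for k = 0, ..., m-1

    def G(upper: int, lower: int) -> np.ndarray:
        """G(upper, lower) = P(upper, upper-1) ··· P(lower+1, lower)."""
        out = np.eye(n)
        for k in range(lower, upper):
            out = steps[k] @ out
        return out

    # Chapman-Kolmogorov: G(m,0) = G(m,l)·G(l,0) for every intermediate step l
    for l in range(m + 1):
        assert np.allclose(G(m, 0), G(m, l) @ G(l, 0))
    print("Chapman-Kolmogorov equation holds for all intermediate steps")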
Thus, in what follows, generalizations of Markov chains are proposed which use Huygens’ principle as an Ariadne’s thread.
Proper Huygens Propagators. Markov-Huygens
Chains
As mentioned above, correlated Markov chains obey a two-step ‘equation of motion’,
z(k+1) = P(k+1, k)·z(k) + Q(k+1, k-1)·z(k-1)
(or even higher-order ‘evolution laws’). In contrast to the original, single-step Markov chain, the state z(k+1) is directly connected not only with the state z(k), but also with the state z(k-1) (correlation). This not only doubles the set of independent dynamical variables from {z(0)} to {z(0), z(1)}, but also allows for a more complex dynamics due to the new transition matrices, Q.
Now, in general, the Green’s function of a two-step difference equation is not a Huygens propagator [19]. To obtain a Huygens propagator, one has to decompose that difference equation of second order into two coupled difference equations of first order.
[19] Cf. the well-known fact that the Green’s function of the wave equation is not a Huygens propagator, as it does not obey the Chapman-Kolmogorov equation.
For example, analogously to d’Alembert’s (1717-1783) solution to the one-dimensional wave equation, one may try to set
z(k) = r(k) + l(k)
where, in some situations, r and l represent right- and left-going quantities (pulses, waves), respectively. They are connected through one-step equations of motion like
r(k+1) = Rrr(k+1,k)·r(k) + Rrl(k+1,k)·l(k)
l(k+1) = Rlr(k+1,k)·r(k) + Rll(k+1,k)·l(k)
The Green’s function of this set of equations (in terms of r and l, it is a 2×2 matrix), Ĝ, is a Huygens propagator [20].
[20] In this context, it is always understood that the same boundary conditions are fulfilled.
Moreover, this set of one-step equations is equivalent to the two-step equation of motion for z(k) if
P(k+1,k) = Rrr(k+1,k) + Rrl(k+1,k)·Rll(k,k-1)·Rrl(k,k-1)^(-1)
         = Rll(k+1,k) + Rlr(k+1,k)·Rrr(k,k-1)·Rlr(k,k-1)^(-1)
and
Q(k+1,k-1) = Rrl(k+1,k)·Rlr(k,k-1) - Rrl(k+1,k)·Rll(k,k-1)·Rrl(k,k-1)^(-1)·Rrr(k,k-1)
           = Rlr(k+1,k)·Rrl(k,k-1) - Rlr(k+1,k)·Rrr(k,k-1)·Rlr(k,k-1)^(-1)·Rll(k,k-1)
In this case, that matrix Green’s function, Ĝ, also obeys the two-step equation of motion for z(k):
Ĝ² = P·Ĝ + Q
By virtue of the Cayley-Hamilton theorem [21], this is the eigenvalue equation for Ĝ. Its solution yields a set of eigenvectors and eigenvalues.
[21] After Arthur Cayley (1821-1895) and William Rowan Hamilton (1805-1865)
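The r/l splitting is one way to arrive at a first-order system; as a purely illustrative alternative (not the decomposition used above), the following sketch reduces a time-independent two-step law z(k+1) = P·z(k) + Q·z(k-1), with randomly generated, hypothetical P and Q, to companion (‘block’) form and checks numerically that the resulting block propagator composes like a Huygens propagator and satisfies Ĝ² = P·Ĝ + Q with P and Q acting blockwise:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 3
    # Hypothetical, time-independent transition matrices of a correlated chain
    # z(k+1) = P·z(k) + Q·z(k-1); nothing here is fitted to an actual text.
    P = rng.random((n, n))
    Q = rng.random((n, n))

    # Companion ("block") form: one-step law for the stacked state y(k) = [z(k); z(k-1)]
    I, O = np.eye(n), np.zeros((n, n))
    C = np.block([[P, Q],
                  [I, O]])               # y(k+1) = C·y(k)

    # 1) The block propagator reproduces the two-step recursion.
    z0, z1 = rng.random(n), rng.random(n)
    y = C @ np.concatenate([z1, z0])     # y(2) = [z(2); z(1)]
    assert np.allclose(y[:n], P @ z1 + Q @ z0)

    # 2) Its powers compose like a Huygens propagator: C^m = C^(m-l) · C^l.
    m, l = 6, 2
    assert np.allclose(np.linalg.matrix_power(C, m),
                       np.linalg.matrix_power(C, m - l) @ np.linalg.matrix_power(C, l))

    # 3) Blockwise, C itself obeys the two-step law: C² = P·C + Q
    Pblock = np.block([[P, O], [O, P]])
    Qblock = np.block([[Q, O], [O, Q]])
    assert np.allclose(C @ C, Pblock @ C + Qblock)
    print("companion propagator checks passed")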
Now, the set of eigenvalues, the spectrum, is a crucial characteristic of Ĝ. According to Leonhard Euler (1707-1783) [22], such a particular meaning deserves a particular notion.
Definition: A Huygens propagator is called proper, or irreducible, if it also obeys the single higher-order equation of motion.
Definition: A Markov chain whose Green’s function is a Huygens propagator is called a Markov-Huygens chain.
Obviously, all simple Markov chains are Markov-Huygens chains.
I thus arrive at the following
Conjecture:
For propagation(-like) processes,
the relevant Markov chains are the Markov-Huygens
chains.
Proper Huygens Propagators and Linguistics
The modeling of the phonetic aspects of a text by means of Markov-Huygens chains represents an extremely high degree of abstraction [23]. Nevertheless, let us assume that a text can be described by Markov-Huygens chains with various numbers of steps. Since a detailed investigation of concrete texts is beyond the scope of this series of lectures, let me propose the following
Conjecture: The complexity of a text is measured by the minimum number of steps of a Markov-Huygens chain that is necessary for uniquely characterizing it.
[22] L. Euler, Anleitung zur Naturlehre worin die Gründe zur Erklärung aller in der Natur sich ereignenden Begebenheiten und Veränderungen festgesetzet werden, ca. 1750; in: Opera Omnia, III, 1, pp. 17-178; Opera posthuma 2, 1862, pp. 449-560 (Eneström 842; http://www.math.dartmouth.edu/~euler/tour/tour_17.html)
[23] It is comparable, possibly, with the abstraction in Edward Norton Lorenz’s (1917-2008) 1963 set of just three ordinary differential equations of first order for modeling the weather; unintentionally, this approach initiated modern chaos research.
This suggests the following
Definition: A language is not redundant if all of its meaningful sentences exhibit a unique complexity.
Of course, redundancy is a necessary characteristic of natural languages.
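As a rough, purely illustrative way of operationalizing this conjecture (a sketch only, under the assumption that conditional entropies estimated from the text itself are an adequate proxy), one may look for the smallest memory length beyond which conditioning on additional preceding characters no longer improves the prediction of the next character:

    import math
    from collections import Counter, defaultdict

    def conditional_entropy(text: str, order: int) -> float:
        """Empirical entropy (bits/char) of the next character given `order` preceding ones."""
        context_counts = defaultdict(Counter)
        for i in range(order, len(text)):
            context_counts[text[i - order:i]][text[i]] += 1
        total = sum(sum(c.values()) for c in context_counts.values())
        h = 0.0
        for counter in context_counts.values():
            ctx_total = sum(counter.values())
            for count in counter.values():
                h -= (count / total) * math.log2(count / ctx_total)
        return h

    def estimated_complexity(text: str, max_order: int = 5, tol: float = 0.05) -> int:
        """Smallest order whose entropy gain over the next order falls below `tol` bits."""
        entropies = [conditional_entropy(text, k) for k in range(max_order + 1)]
        for k in range(max_order):
            if entropies[k] - entropies[k + 1] < tol:
                return k
        return max_order

    sample = "the quick brown fox jumps over the lazy dog " * 20   # hypothetical text
    print(estimated_complexity(sample))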
Conclusions
Physics is the most experienced science as concerns (i) the development of methodology, in particular the application of mathematics, and (ii) philosophical and social analysis. This can and should be exploited by other sciences. Physics, however, cannot judge the content of other sciences, such as the style of literary texts. This limits its support of linguistics in areas such as literary translation. In the field of the statistical analysis of texts, the development of new ideas and methods within physics may well benefit mathematical linguistics, too. Here, our results on Huygens’ principle are proposed to be checked for applicability.
Markov chains have also been applied to composing and modeling music and to identifying composers [24]. This suggests that linguistics should inspect musicology with respect to the application of mathematical and physical methods.
[24] Y.-W. Liu & E. Selfridge-Field, Modeling music as Markov chains - composer identification, Music 254 Final Report, 10 June 2002, Center for Computer Research in Music and Acoustics, Stanford University; https://ccrma.stanford.edu/~jacobliu/254report/; Ch. Dodge & Th. A. Jerse, Computer Music - Synthesis, Composition, and Performance, Schirmer, 2nd ed., 1997
Acknowledgement