(Universal and Generative grammar – a trend-setting idea or a mental straitjacket?)

It is Noam Chomsky’s merit to have significantly influenced (if not created) a prominent area of modern linguistics by asking the right questions. They were: 1) How does language come about? Is it innate or subject to evolutionary genesis? 2) How is it that children create sentences – theoretically in unlimited number – that they have never heard before? 3) What makes members of a language community, such as speakers of English, distinguish grammatically correct from grammatically incorrect sentences? 4) Once we have answered this question, why not apply the knowledge thus gained to all languages, thus establishing a Universal Grammar? 5) Do we get a complete picture of language as long as we restrict our analysis to its surface, or do we have to visualize a deeper structure in order to understand language correctly? 6) May a linguist talk authoritatively about language as such (i.e. about all possible natural languages) even if he hardly knows more than the idiom he happens to speak?

To have asked these and other questions with full rigor is the great achievement of Chomsky. At the same time, however, it is his personal tragedy that he did not answer a single one of them satisfactorily or even correctly, as I will show in this article. Of course, this shortcoming need not be a matter of real concern, as errors have often proved to be extremely fruitful, provided that they prompted others to think for themselves. Unfortunately, this turned out to be a vain hope, because, with his extraordinary prestige and worldwide fame, Chomsky managed to nip all independent thought in the bud – it was, so to speak, rejected as heresy. The answers Chomsky had given to the above-mentioned questions were received as inviolable dogmas, forcing thought into a corset from which linguistics is only now cautiously trying to free itself.

Supposing that this devastating appraisal of Chomsky’s influence on the science of linguistics is correct, we are, of course, immediately faced with the intriguing question of why Chomsky is hailed – and rightly so, in my opinion – as one of the greatest contemporary linguists.

Let us first listen to some cynical sociologist well trained in the history of science (you may think of some pupil of Thomas S. Kuhn). He might claim that in the humanities, where empirical facts are often very complex and difficult to determine, the “how” of arguments counts at least as much as the “what”, that is, sound logical and empirical corroboration. As far as the “how” is concerned, Chomsky undoubtedly assumed the role of a master: he plays the keyboard of scientific jargon with uncontested virtuosity. This is a definite advantage. You may be irremediably wrong as to the subject matter itself but still be accepted as serious and scientific because you succeed in presenting your subject in so brilliant a manner (while, conversely, others may be right as to what they say but never be accepted as equals by their peers if they do not conform to the prevalent jargon).

I think there is at least a grain of truth in this rather cynical interjection. After all, it may not be mere coincidence that Chomsky eventually turned his back on linguistics, devoting himself instead to political essays. Mind you, he no longer wrote these essays in an esoteric jargon (“abstruse formalism”, as Pinker called it), but meticulously adhered to empirical facts in a language of great and convincing clarity. He could not have distanced himself more clearly from his earlier style.

Such an interjection by a cynical sociologist does not, however, explain the undeniable fact that Chomsky’s achievements make him one of the greatest living linguists. We would have to acknowledge this fact as one of the most inexplicable mysteries ever recorded in the history of science if it were true – as I just asserted – that all his answers to the six above-mentioned basic questions are not only wrong, but even present themselves as serious stumbling blocks on the way to the further progress of linguistic science.

The apparent riddle is not really as difficult to solve as it might appear at first glance. It offers a solution often to be found in history, be it that of science or that of politics. Someone sets out to discover Cathay and India, but actually discovers America. Chomsky’s declared intention was to establish a Universal and Generative Grammar, but it turned out that he provided the theoretical tools for one of the greatest achievements of modern linguistics, namely automated translation. Chomsky’s training in distributionalism by Zellig Harris was, of course, the best prerequisite for this undertaking. Just like Harris before him, Chomsky too believed he could do without meaning in the description of language – and this is indeed perfectly correct as far as machines are concerned. In translation, say, from English into German, the digital program may receive the command to replace the English word ‘hut’ with German ‘Hütte’, without any notion whatsoever of meaning, which is, however, the hidden “tertium comparationis” that enables the programmer to carry out this replacement in the first place. The same is true, on a higher level, if instead of a single word a whole English sentence is to be translated. In this case, too, a corresponding algorithm ensures that the “formal realization” of English is replaced by the corresponding “formal realization” of German – both based on the same underlying structure of meaning known to the programmer but totally unknown to the translating machine.
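
The word-for-word replacement just described can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions, not a description of how any real translation system works: the table LEXICON_EN_DE and the function replace_words are hypothetical names, and the entries other than ‘hut’ are added purely for the example. The program matches strings only; the pairing of ‘hut’ with ‘Hütte’ records the programmer’s knowledge of meaning, not the machine’s.

```python
# Hypothetical English-to-German lookup table (an assumption for illustration).
# The programmer, who knows what the words mean, decides which form is
# paired with which; the program itself only compares strings.
LEXICON_EN_DE = {
    "the": "die",
    "hut": "Hütte",
    "burns": "brennt",
}


def replace_words(sentence: str) -> str:
    """Replace each English word form by its listed German counterpart.

    Words missing from the table are passed through unchanged, because
    the program has no access to meaning and cannot guess.
    """
    return " ".join(LEXICON_EN_DE.get(word, word) for word in sentence.split())


print(replace_words("the hut burns"))
print(replace_words("the hut smiles"))
```

The second call shows the limit of the procedure: ‘smiles’ is not in the table, so it is simply copied, and nothing in the program registers that huts do not smile.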

Automated translation is a great historical achievement, but it is so especially at present, when practical usefulness has attained overriding importance. One of the most visible consequences is the invasion of linguistics by computer specialists. This may be a great advantage for further advances in all kinds of automation, but it certainly does not entail an advance in our understanding of language. Even laymen are well aware that computers need not understand anything. A science of linguistics that discards meaning is reduced to a mere skeleton, as it eliminates the very basis of its subject matter. Nevertheless, it is a well-known fact that understanding – the proper domain of the humanities – is progressively pushed into insignificance even at universities, the general trend being orientated towards the immediately useful (including the financial benefits to be derived from it). “By all accounts, the humanities are in trouble. University programs are downsizing; the next generation of scholars is un- or underemployed; morale is sinking; students are staying away in droves” (Pinker).

Let me now deal with the six basic questions of Universal and Generative Grammar to all of which Chomsky has given answers that are logically or empirically incorrect.

1) Linguistic competence – ready-made module or socially stimulated and evolving?

Chomsky believed he had provided a satisfactory answer to this question when he postulated a fictitious “language module”, with which all humans, regardless of culture, are allegedly equipped, so that there can be no question of an evolution of language of the kind otherwise characteristic of all human organs and abilities. “For Chomsky, then, the basic justifications for saying that the capacity for language must be an innate module or organ, a computational mechanism, was the argument from the poverty of the input together with lack of correction, and ease of acquisition in childhood” (Pinker).

Unfortunately for the truth of his statement, Chomsky did not take the trouble to verify it by means of the abundantly available empirical material. In an article titled “So all languages aren’t equally complex after all” (2018), Christopher Hallpike quotes some of this evidence to prove that much of what can easily be formulated in developed languages is not expressed in the languages of many tribal peoples such as the Piraha: /They have/ “no relative pronouns; only single modifiers; only one possessor; no co-ordinates such as ‘John and Bill came today’; no disjunctions e.g. ‘either Bob or Bill will come’; only one verb and one adjective in a sentence; no comparatives or superlatives; no counting; no distinction between singular and plural; no quantifiers – some, all, every, none; nouns have no prefixes or suffixes; no color terms; no passive constructions; word order is not strict; no phatic communication (no greetings or farewells, ‘please’ or ‘thank you’ etc.).”

The basic question is, of course, whether the Piraha and other peoples formerly called “primitive” cannot express such contents or whether they simply have no need for a more complex way of expression. As the brain of Sapiens has been genetically the same across ethnic groups for at least fifty thousand years, the second alternative may be considered empirically proven if Piraha infants raised in an English family show, on average, the same language competence as any Englishman. If this hypothesis – empirically verifiable and presumably often tested in the past with regard to other “natives” – presents the right solution to our problem, then the lower linguistic competence of the Piraha is explained by the fact that in a group of a few people, who spend their entire lives in an identical environment, prelingual communication is so predominant that linguistic expression is largely dispensable. This fact is more easily understood when we compare it to similar situations in our own societies. Just think of those old married couples who often find themselves in exactly the same unchanging environment, knowing each other perfectly well and being completely enclosed in their small world. Their emotional bond may be very intense, yet their linguistic communication is often quite limited, though both partners may have reacted in linguistically very complex ways as long as they were challenged with constantly changing problems and environments in their respective professions. Obviously, we are confronted with a lack of need rather than with a lack of ability.

In the case of the Piraha, this point seems to be further corroborated by the observation that these people are quite capable of expressing the corresponding contents of meaning through paraphrases. Though they do not have a ready-made formula like English “Paita, bring back the nails that Dan bought”, they may well express the same message in the following manner: “Hey Paita, bring back some nails. Dan bought those very nails. They are the same”. The fact that they can perfectly well understand and express the semantic content in question, albeit in a very cumbersome way, merely proves a lack of need and not of ability. It is only when such a need arises on a regular basis, as in our cultures, that people look for a “formal realization” that is not cumbersome but as simple as possible. This also applies to another example mentioned by Hallpike, the disjunction “Either Bob or Bill will come”. I assume that the same semantic message may be realized as “maybe Bob will come, maybe Bill will come” – but, of course, this is a much more cumbersome kind of expression.

As to the presence or absence of relative pronouns, it is not indicative of linguistic expressiveness, because relative pronouns are completely absent in some highly developed languages such as Chinese or Japanese. English represents a case of what I call “formal abundance” because it is able to express identical meaning by more than one formal device. “Men, who eat rice on Friday, tend to eat rice on Wednesdays too” makes use of a relative pronoun while “Men eating rice on Friday tend to eat rice on Wednesday too” does not. Only the second alternative is available in Chinese and Japanese. We may say that English acquires additional suppleness because it may resort to two different formal alternatives. These examples, treated with particular emphasis in the “Principles”, are of primary importance for the correct understanding of both the semantic and formal aspects of language. But, in keeping with his usual procedure, Chomsky clings to the mere surface when stating that something “recurs” or becomes “embedded”. He does not ask himself what this something is, in other words, what it means and how it enlarges the range of information passed between speakers and listeners.

On the basis of empirical evidence, Hallpike’s conclusion seems incontestable: “language does indeed become more complex in relation to social and cultural complexity… and cannot therefore be an instinct, organ or module as Universal Grammar maintains.”

I would add that in its earliest stages greater cultural complexity originates in a rather elementary way through the numerical growth of small groups into societies whose members no longer all have a personal knowledge of each other. It is at this moment that the abstract human being emerges, who is no longer identical with definite individuals, and at the same time we probably witness the genesis of abstract numbers. Hallpike confirms this conjecture when quoting the following observations of a competent anthropologist: “The /Cree/ hunter /of Eastern Canada/ knew every river in his territory individually and therefore had no need to know how many there were. Indeed, he would know each stretch of each river as an individual thing and therefore had no need to know in numerical terms how long the rivers were.” And the anthropologist proceeds: “the story is that we count things when we are ignorant of their individual identity” and “distinctive individual identity is key to the lack of number and counting.” In other words, the absence of abstract thinking may prove the overwhelming presence of concrete knowledge!

In numerically large societies, abstract semantic contents like numbers, which did not play any role in the mutual intercourse of groups of a few members, will henceforth be expressed in the simplest possible way. And it is only at this complex level of society and its material self-preservation that writing is needed as a supplement to oral communication. On the one hand, writing relieves memory (and thereby creates a new memory, the meme, outside of all individual brains) while, on the other hand, it is able to reach all those members of a language community who are no longer personally known to each other. Again, Hallpike is right in his statement that complexity of expression arises “especially as a result of writing and literacy”.

We may conclude that empirical facts such as these clearly contradict the assertion by Chomsky and Pinker that human language is a ready-made module that reaches, and has always reached, the same full development in every society (in fact, we do not even know whether our society has reached such an endpoint of development or whether such an endpoint exists at all). Nevertheless, it may be assumed that all members of our species are equipped with broadly the same genetic prerequisites to unfold this faculty when raised under favorable conditions. Or, stated in an alternative manner: all people have the same ability to speak, even if they manifest it in very different ways due to divergent social needs in different cultures and at different times. Even at an individual level, expertise comes and wanes with constant use or the lack of it – as is indeed true for all organs and abilities.

2) The generative capacity

Chomsky’s question about the generative capacity of the child, which enables it to form new, never-heard sentences in theoretically unlimited numbers, is particularly significant insofar as it is based on the refutation of false assumptions made by Behaviorism. As Chomsky rightly pointed out, the latter is incapable of explaining this really amazing faculty. Unfortunately, this important insight does not prevent Chomsky from following, for his part, a path as wrong as that of the theory he attacks. The trees he constructs, borrowed from distributionalism, cannot be used to justify a child’s generative capacity.

In “Principles of Language” I tried to show that distributionalism, the method used by Chomsky, never arrives at any deduction at all: that is, it cannot get out of its deductive trees any more information than it had previously (or surreptitiously) put into them: it is indeed strictly tautologous. Let me illustrate my point using a Chomskyan tree of the most elementary type. In the first line at the top we may find an abstract expression like “noun phrase”. In the second line beneath we may find a more concrete expression like “noun verb”; the third line then provides some concrete example like “(the) girl smiles”. Now, no English-speaking person would accept “(the) diddle doddles” in the third line, while a Bantu without any knowledge of English might well accept it. In other words, without a perfect previous knowledge of English we wouldn’t be able to exclude this example. In quite the same way, an English speaker wouldn’t accept “(the) girl smile” as it violates an elementary rule of English grammar. Finally, an example like “(the) stone coughs” would even be rejected by a Bantu or an Apache because all human beings make use of an innate structure of meaning which tells them that stones are incapable of coughing (other than in a metaphorical sense). In all three cases a perfect previous knowledge of either the semantic deep structure or of valid formal realizations of English is a necessary prerequisite if we want to arrive at admissible examples. This is to say that the Chomskyan tree only creates examples which we must know beforehand – or, put in an alternative way, it generates nothing but merely distinguishes admissible sentences from those which are not – and that is exactly what any traditional grammar does. However, by giving the procedure of traditional grammar the shape of a tree, Chomsky creates the misleading impression that he is able to deduce the concrete from the general (like “the girl smiles” from “noun phrase”).
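
The point may be made concrete with a toy version of such a tree, written here in Python. It is a minimal sketch under assumptions of my own, not Chomsky’s formalism: the word lists NOUNS and VERBS, the singular/plural tags and the functions generate and admissible are all hypothetical. Whatever the tree “generates” is exactly what was entered into the word lists and the agreement rule beforehand.

```python
from itertools import product

# Hypothetical toy lexicon: forms that a speaker of English already accepts,
# each tagged with its grammatical number (an assumption for illustration).
NOUNS = {"girl": "sg", "girls": "pl"}
VERBS = {"smiles": "sg", "smile": "pl"}


def generate() -> list[str]:
    """Expand the tree top-down: noun slot, verb slot, then listed words."""
    return [
        f"the {noun} {verb}"
        for (noun, noun_number), (verb, verb_number)
        in product(NOUNS.items(), VERBS.items())
        if noun_number == verb_number  # agreement rule, also supplied in advance
    ]


def admissible(sentence: str) -> bool:
    """A sentence counts as admissible only if the tree can produce it."""
    return sentence in generate()


print(generate())
print(admissible("the diddle doddles"))
print(admissible("the girl smile"))
```

In this sketch “the diddle doddles” is excluded only because nobody listed those words, and “the girl smile” only because the agreement rule was written in by hand. A sentence like “the stone coughs” is not handled at all: the sketch contains no semantic knowledge, so it would be admitted as soon as ‘stone’ and ‘coughs’ were added to the lists.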

Nevertheless, it is perfectly true that every child – and every adult for that matter – may create sentences it has never heard before, like, for instance, “Tuggering smiles” (if the child has just seen some fictional being called Tuggering on TV). Or it may say, “Tuggering giggles”, though this combination of words too has never been uttered by its parents. Or a particularly gifted child conceives the idea of a half-smile and creates its own word for realizing it in form (like the Japanese who have ten different words for rain or the Inuit ten words for snow). No machine will list these expressions among admissible English sentences unless the newly created word with the meaning half-smile has previously been inserted into the class of English verbs and the name Tuggering into the class of nouns. In other words, a machine is utterly incapable of any generative act whatsoever. The reason should be no secret to anybody: we know that machines lack any knowledge of meaning. On the other hand, it is precisely on the level of meaning (or the “language of thought”) that unknown events become known to the mind and contribute to its constant enlargement in children as well as in adults. Stating this evident truth in the most general way, we may say that meaning is the raw material which is brought into the specific form of some particular language according to the latter’s rules of “formal realization”. Generativeness is, therefore, primarily located on this basis. But this essential truth was missed by Chomsky. Influenced as he was by distributionalism, he was unable to accept meaning as the true foundation and deep structure of language.
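
The dependence of the machine on prior insertion can again be shown in a small Python sketch. The word classes nouns and verbs and the function accepts are hypothetical, chosen only for this illustration; no real system is being described. The recognizer rejects “Tuggering smiles” until a human being, who knows what the name refers to, adds it to the class of nouns.

```python
# Hypothetical word classes of a toy recognizer (assumptions for illustration).
nouns = {"girl"}
verbs = {"smiles", "giggles"}


def accepts(sentence: str) -> bool:
    """Accept two-word sentences of the form noun + verb, by lookup only."""
    words = sentence.split()
    return len(words) == 2 and words[0] in nouns and words[1] in verbs


before = accepts("Tuggering smiles")  # the name has not been listed yet
nouns.add("Tuggering")                # a human inserts the new name
after = accepts("Tuggering smiles")   # now the lookup succeeds

print(before, after)
```

The change from rejection to acceptance is brought about entirely from outside the program, which is the sense in which the generative act belongs to the speaker and not to the machine.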

A residue of generativeness remains, however, even if meaning is excluded from language. It manifests itself not as a positive force of enlargement but merely as a source of mistakes like, for instance, the violation of prescribed rules of formal realization. Children are surely generative when saying ‘go-ed’ instead of ‘went’ or ‘fli-ed’ instead of ‘flew’ (they never heard their parents utter these mistakes). Likewise, mistakes against the prevailing formal realization, made when the French of the invading Normans met Old English, contributed to the development of modern English, and in the course of time mistakes produced what we now call “sound shifts” in German. Similarly, the German language is now being transformed by the mistakes of immigrants. Any change of language on the level of form is an act of creative (or more often non-creative) destruction of established patterns. It may make a language just different (which is true of “sound shifts”) or it may reduce expendable complexity as, for instance, when English abolished the gender distinction characteristic of Indo-European languages. Here too, Chomsky missed the essential point. He couldn’t see true generativeness welling up from the deep stratum of meaning, and he couldn’t see that mistakes are generative acts in their own particular way.

3) Grammatical correctness in Generative Grammar

Chomsky uses basically the same method as traditional grammar in order to distinguish grammatically correct sentences from grammatically incorrect ones. He does so, however, in quite a particular manner by depicting the rules specifying correct formal realization in the shape of a tree that begins at the top on the most abstract level and then widens downwards to concrete examples. As explained above, he thus creates the misleading impression that he is able to deduce the concrete from the general. Apart from this erroneous notion, Chomsky’s method is, however, just as legitimate as that of any traditional grammar. That is why Chomsky’s approach has proved so immensely fruitful in its application to automated translation.

But Chomsky wants to achieve a lot more. His method is meant to reveal the hidden rules that allow a child to form an unlimited number of sentences it has never heard before, and at the same time prevent it from uttering the equally unlimited number of sentences a given linguistic community would reject as unacceptable.

The most primitive case of grammatical correctness is, of course, compliance with the rules that apply at the formal level. A child saying “Mary is cute. I like him” would commit as grave an error in English as when it says “Mary are cute.” The formal rules of grammatical correctness are rather simple and they can be counted on a few hands in all languages. It is for this reason that they are understood and correctly applied by any child after a rather short time of learning.

But there are countless wrong sentences, which may be formally correct, but are still not considered admissible in any language community. These include statements such as „the stone coughs unbearably in D major“ or „the water pipe has just laid a bent and illiterate egg“. The theoretically infinite number of such sentences is rejected for a completely different reason: They do not occur in the social or natural reality of man (that man may consciously distort “real” reality in order to confront it – again quite consciously – with a fictitious one, is, of course, quite another question).

Again, Chomsky’s answer is inadequate because of its incompleteness. It only becomes correct once we simultaneously consider two completely different grammars. Firstly, the grammar that determines the rules of “formal realization”, and, secondly, the grammar of the “general structure of meaning”, which consists of a small number of “syntheses” equally found in all natural languages, since they represent a reversal of the previous mental “analysis” of reality. As Chomsky was opposed to giving meaning a prominent place in his theory, he was barred from recognizing the second type of grammar, the one at the very base of all languages. This point was well understood by Steven Pinker, who insists that “People do not think in English or Chinese or Apache; they think in a language of thought”. Chomsky missed this elementary truth. For this reason, he can only account for a tiny portion of grammatical correctness.

In my two German books of 1981 (Grammatica Nova) and 1991 (Prolegomena zur Generellen Grammatik) and, likewise, in the “Principles of Language” published in English in 1993 and, finally, in “Principles revised”, I described a “general structure of meaning”. Jerry Fodor used the term “Language of Thought” in 1975. Steven Pinker took up this term in “The Language Instinct” (1994).

4) Universal Grammar

The leap to Universal Grammar, which Chomsky wants to achieve when answering the fourth basic question, i.e. the transition from some specific language to all possible languages, was recently rejected as a failure by Prof. John Goldsmith, who justly criticizes Chomsky for having borrowed his basic categories of description from traditional Western grammar. Traditional terms like noun and verb are “unscientific” because their semantic content varies from one language to another. Let us take an elementary utterance like “John comes today”, which can be expressed in any developed and probably in most less developed languages as well. It consists of three formal slots: a noun, a verb and an adverb. If the formal slot of nouns were filled in all languages only with the same semantic category of (living or non-living) substances, and likewise all verbs only with actions and all adverbs only with temporal determinants as in the above English example, then these formal slots could be described as universal, as their semantic content would be identical in all natural languages. But anybody with a slight knowledge of languages other than English is well aware that this is not the case – not even in English itself, where sentences like “Withdrawal happened at once”, “Greatness came later”, or “John arrived unexpectedly” show that a noun may carry the semantic content of an action (to withdraw) or of a quality (great), and that an adverb may carry that of a psychic synthesis (I, you or we did not expect that John would arrive). In other languages, the same semantic content must be formally expressed in totally different ways.

It is, nevertheless, a truism that the semantic content of formal categories like noun, verb etc. must to a certain degree overlap in different languages – how else would it be possible to apply the same term of noun to totally different languages like English and Japanese in the first place? Thus, in English as well as in Japanese and Russian, nouns always contain substances, verbs always actions, adjectives always qualities, but depending on the language under review they may take up other semantic categories as well (the English noun may contain both actions (withdrawal) and qualities (greatness), and the English adverb may contain psychic determinants like “unexpectedly” in addition to spatial and temporal ones). Only because of this partial identity have grammarians been motivated to apply identical terms to otherwise widely different formal classes of different languages.

There is no harm in using the “sloppy and not scientific” concepts like noun and verb so justly criticized by Prof. Goldsmith as long as we use them without any claim to generalization. The procedure is quite legitimate in traditional grammar, as otherwise we would have to invent a new set of descriptive concepts for each language under consideration. If only we keep in mind that the semantic contents of formal slots (like Russian, Chinese or English verbs or nouns) merely overlap without ever being identical, our procedure is totally correct and unobjectionable. The same holds true for the use of such categories in translation machines, because these too only deal with particular languages. But the use of noun, verb etc. in Universal Grammar is strictly out of the question. Mr. Goldsmith is right. In this case their use is more than just “sloppy”: they blur the basic distinctions which we want to explain, and which Chomsky explains away instead.

5) Deep versus surface structure

As to Chomsky’s attempt to distinguish a linguistic surface structure from a deep structure, this was from the outset condemned to failure. To this very day, nobody – not even Chomsky himself – seems to know exactly how to define their difference. Nor should this be a surprise, since the failure is due to the fact that Chomsky never accepted the independent reality of a general structure of meaning (Language of Thought). Only after clearly distinguishing meaning and form are we capable of drawing an exact line between them. All languages are built on the foundation of a general structure of meaning (a Language of Thought), which in each of them is transformed into a sequence of acoustic signals – a formal structure – by means of specific rules of formal realization. This truth seems so evident that Chomsky’s resistance can only be explained by the fact that the use of his theory for the practical purpose of automated translation would not be possible had he based it on the general structure of meaning, because machines know nothing about the latter. In other words: Chomsky had to sacrifice the understanding of language if he wanted to secure its practical application.

6) A general theory of Language without knowing languages in the plural?

Let us now turn to Chomsky’s last question and how he answered it. He confidently asserts that we may well design a General Theory of Language without knowing any language other than the one we happen to speak. On this matter Chomsky expresses himself in no uncertain terms. “I have not hesitated to propose a general principle of linguistic structure on the basis of observation of a single language. The inference is legitimate, on the assumption that humans are not specifically adapted to learn one rather than another human language, say English rather than Japanese. Assuming that the genetically determined language faculty is a common human possession, we may conclude that a principle of language is universal if we are led to postulate it as a ‘precondition’ for the acquisition of a single language.”

I beg your pardon if, at this point, I am quite unable to prevent the cynical sociologist quoted above from raising his voice once again. Look, he says, how scientific jargon – Chomsky’s intellectual brilliance – produces so forceful a hypnotic effect that almost nobody noticed the utter nonsense this statement really conveys. What Chomsky actually says is tantamount to the assertion that botany may do without the study of plants or astronomy without the study of stars. To linguists he says that they all carry the language module within them, so they really needn’t worry at all about languages, that is, about empirically given reality. According to Chomsky, language, unlike all other organs and abilities, did not evolve gradually, but resembles the goddess Athene falling from heaven perfectly equipped with all her divine attributes. Chomsky gives a license to theorists to assert whatever they like about language regardless of whether or not this contradicts the evidence arising from the study of facts. No wonder that his pronouncement was fervently acclaimed by all those who in fact are quite innocent as to their knowledge of languages and of historical facts. Chomsky managed to give absolution and a pretty good conscience to learned ignorance (docta ignorantia), which as a rule is more dangerous than its naïve counterpart.

Conclusion:

Let us beware of cynics and their mostly one-sided and misleading judgments. It is simply not true that modern linguistics has abandoned empiricism. On the contrary, in a specific sector, albeit a restricted one, it has achieved a resounding success. Automated translation, whose incredible performance we rightly admire, requires an extreme degree of technical competence and precise knowledge of the relevant algorithms – in other words, it represents nothing less than a triumph of empirically orientated research. There can be absolutely no question of detached and outlandish theorizing.

A fair judgment must, therefore, strictly distinguish between the great achievement that Chomsky actually attained and the one that he himself wanted to accomplish. Chomsky reached a shore that he did not even want to head for: against his intentions, he has become the pioneer and father of automated translation. On the other hand, he never reached the coast that he declared to be his real goal, namely Universal and Generative Grammar. Here too, however, we have to admit that Chomsky must be credited with having grasped the eminent significance of such a goal more clearly than others. And he certainly asked the right questions, even though he never gave the right answers, as he remained fatally attached to “sloppy and unscientific” descriptive categories borrowed from traditional grammar. The overall effect of his work was to strengthen the prevailing trend towards the immediately useful (machine translation), and simultaneously to weaken the scientific understanding of complex reality. The latter presupposes an extended knowledge of history and linguistic diversity – a knowledge the significance of which he tried to discredit. As to the matter that was dearest to his heart, the study of Universal and Generative Grammar, he turned it from science into dogma and the linguists following him from scientists into “True Believers”. Universal and Generative Grammar, a trend-setting idea, was changed into a mental straitjacket.