Some thoughts about the principle of compositionality

This is another of those ideas that ive had independently, and that it turned out others had thought of before me, by thousands of years in this case. The idea is that longer expressions of language are made out of smaller parts of language, and that the meaning of the whole is determined by the parts and their structure. This is rather close to the formulation used on SEP. Heres the introduction on SEP:


Anything that deserves to be called a language must contain meaningful expressions built up from other meaningful expressions. How are their complexity and meaning related? The traditional view is that the relationship is fairly tight: the meaning of a complex expression is fully determined by its structure and the meanings of its constituents—once we fix what the parts mean and how they are put together we have no more leeway regarding the meaning of the whole. This is the principle of compositionality, a fundamental presupposition of most contemporary work in semantics.

Proponents of compositionality typically emphasize the productivity and systematicity of our linguistic understanding. We can understand a large—perhaps infinitely large—collection of complex expressions the first time we encounter them, and if we understand some complex expressions we tend to understand others that can be obtained by recombining their constituents. Compositionality is supposed to feature in the best explanation of these phenomena. Opponents of compositionality typically point to cases when meanings of larger expressions seem to depend on the intentions of the speaker, on the linguistic environment, or on the setting in which the utterance takes place without their parts displaying a similar dependence. They try to respond to the arguments from productivity and systematicity by insisting that the phenomena are limited, and by suggesting alternative explanations.


SEP goes on to discuss some more formal versions of the general idea:


(C) The meaning of a complex expression is determined by its structure and the meanings of its constituents.



(C′) For every complex expression e in L, the meaning of e in L is determined by the structure of e in L and the meanings of the constituents of e in L.


SEP goes on to distinguish between a lot of different versions of this. See the article for details.

The thing i wanted to discuss was the counterexamples offered. I found none of them particularly compelling. They are based mostly on intuition pumps as far as i can tell, and im rather wary of such (cf. Every Thing Must Go).


Heres SEP’s first example, using chess notation (many other game notations wud also work, e.g. Taifho):


Consider the Algebraic notation for chess.[15] Here are the basics. The rows of the chessboard are represented by the numerals 1, 2, … , 8; the columns are represented by the lower case letters a, b, … , h. The squares are identified by column and row; for example b5 is at the intersection of the second column and the fifth row. Upper case letters represent the pieces: K stands for king, Q for queen, R for rook, B for bishop, and N for knight. Moves are typically represented by a triplet consisting of an upper case letter standing for the piece that makes the move and a sign standing for the square where the piece moves. There are five exceptions to this: (i) moves made by pawns lack the upper case letter from the beginning, (ii) when more than one piece of the same type could reach the same square, the sign for the square of departure is placed immediately in front of the sign for the square of arrival, (iii) when a move results in a capture an x is placed immediately in front of the sign for the square of arrival, (iv) the symbol 0-0 represents castling on the king’s side, (v) the symbol 0-0-0 represents castling on the queen’s side. + stands for check, and ++ for mate. The rest of the notation serves to make commentaries about the moves and is inessential for understanding it.

Someone who understands the Algebraic notation must be able to follow descriptions of particular chess games in it and someone who can do that must be able to tell which move is represented by particular lines within such a description. Nonetheless, it is clear that when someone sees the line Bb5 in the middle of such a description, knowing what B, b, and 5 mean will not be enough to figure out what this move is supposed to be. It must be a move to b5 made by a bishop, but we don’t know which bishop (not even whether it is white or black) and we don’t know which square it is coming from. All this can be determined by following the description of the game from the beginning, assuming that one knows what the initial configurations of figures are on the chessboard, that white moves first, and that afterwards black and white move one after the other. But staring at Bb5 itself will not help.


It is exactly the bold lines i dont accept. Why must one be able to know that from the meaning alone? Knowing the meaning of expressions does not always make it easy to know what a given noun (or NP) refers to. In this case “B” is a noun referring to a bishop. Which one? Well, who knows. There are lots of examples of words referring to different things (people usually) when used in different contexts. For instance, the word “me” refers to the source of the expression, but when an expression is used by different speakers, then “me” refers to different people, cf. indexicals (SEP and Wiki).


Ofc, my thoughts about this are not particularly unique, and SEP mentions the defense that i also thought of:


The second moral is that—given certain assumptions about meaning in chess notation—we can have productive and systematic understanding of representations even if the system itself is not compositional. The assumptions in question are that (i) the description I gave in the first paragraph of this section fully determines what the simple expressions of chess notation mean and also how they can be combined to form complex expressions, and that (ii) the meaning of a line within a chess notation determines a move. One can reject (i) and argue, for example, that the meaning of B in Bb5 contains an indexical component and within the context of a description, it picks out a particular bishop moving from a particular square. One can also reject (ii) and argue, for example, that the meaning of Bb5 is nothing more than the meaning of ‘some bishop moves from somewhere to square b5’—utterances of Bb5 might carry extra information but that is of no concern for the semantics of the notation. Both moves would save compositionality at a price. The first complicates considerably what we have to say about lexical meanings; the second widens the gap between meanings of expressions and meanings of their utterances. Whether saving compositionality is worth either of these costs (or whether there is some other story to be told about our understanding of the Algebraic notation) is by no means clear. For all we know, Algebraic notation might be non-compositional.


I also dont agree that it widens the gap between meanings of expressions and meanings of utterances. It has to do with referring to stuff, not meaning in itself.
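The indexical reading of “B” can be made concrete with a small sketch. It is only an illustration under assumptions that are not in SEP's text: the board is a hypothetical dictionary from squares to (colour, piece) pairs, only plain bishop moves are handled, and captures, checks and pins are ignored. The point is that the meaning of Bb5 is built from the meaning of “B” and the meaning of “b5” as a function from contexts to moves, in the same way that “me” is a function from speakers to persons.

```python
FILES = "abcdefgh"

def to_xy(square):
    """'b5' -> (1, 4): zero-based column and row."""
    return FILES.index(square[0]), int(square[1]) - 1

def to_square(x, y):
    return FILES[x] + str(y + 1)

def bishop_reaches(board, src, dst):
    """True if src and dst share a diagonal and every square between them is empty."""
    (x0, y0), (x1, y1) = to_xy(src), to_xy(dst)
    dx, dy = x1 - x0, y1 - y0
    if dx == 0 or abs(dx) != abs(dy):
        return False
    sx, sy = dx // abs(dx), dy // abs(dy)
    return all(to_square(x0 + i * sx, y0 + i * sy) not in board
               for i in range(1, abs(dx)))

def meaning(expr):
    """Context-independent meaning of a bishop move such as 'Bb5':
    a function from a context (board, side to move) to the moves it picks out."""
    piece, target = expr[0], expr[1:]   # the two constituents
    def in_context(board, side):
        return [(src, target)
                for src, (colour, p) in sorted(board.items())
                if colour == side and p == piece
                and bishop_reaches(board, src, target)]
    return in_context

# A made-up position (hypothetical, only the relevant pieces are listed).
board = {
    "f1": ("white", "B"), "c1": ("white", "B"),
    "d2": ("white", "P"), "e4": ("white", "P"),
    "d7": ("black", "B"),
}

bb5 = meaning("Bb5")            # one meaning ...
print(bb5(board, "white"))      # ... picking out one move when white is to play
print(bb5(board, "black"))      # ... and another when black is to play
```

The same expression with the same meaning picks out the bishop on f1 when white is to move and the bishop on d7 when black is to move, which is the behaviour one would expect from an indexical.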

4.2.1 Conditionals

Consider the following minimal pair:

(1) Everyone will succeed if he works hard.
(2) No one will succeed if he goofs off.

A good translation of (1) into a first-order language is (1′). But the analogous translation of (2) would yield (2′), which is inadequate. A good translation for (2) would be (2″) but it is unclear why. We might convert ‘¬∃’ to the equivalent ‘∀¬’ but then we must also inexplicably push the negation into the consequent of the embedded conditional.

(1′) ∀x(x works hard → x will succeed)
(2′) ¬∃x(x goofs off → x will succeed)
(2″) ∀x(x goofs off → ¬(x will succeed))

This gives rise to a problem for the compositionality of English, since it seems rather plausible that the syntactic structure of (1) and (2) is the same and that ‘if’ contributes some sort of conditional connective—not necessarily a material conditional!—to the meaning of (1). But it seems that it cannot contribute just that to the meaning of (2). More precisely, the interpretation of an embedded conditional clause appears to be sensitive to the nature of the quantifier in the embedding sentence—a violation of compositionality.[16]

One response might be to claim that ‘if’ does not contribute a conditional connective to the meaning of either (1) or (2)—rather, it marks a restriction on the domain of the quantifier, as the paraphrases under (1″) and (2″) suggest:[17]

(1″) Everyone who works hard will succeed.
(2″) No one who goofs off will succeed.

But this simple proposal (however it may be implemented) runs into trouble when it comes to quantifiers like ‘most’. Unlike (3′), (3) says that those students (in the contextually given domain) who succeed if they work hard are most of the students (in the contextually relevant domain):

(3) Most students will succeed if they work hard.
(3′) Most students who work hard will succeed.

The debate whether a good semantic analysis of if-clauses under quantifiers can obey compositionality is lively and open.[18]


Doesnt seem particularly difficult to me. When i look at an “if-then” clause, the first thing i do before formalizing is turning it around so that “if” is first, and i also insert any missing “then”. With their example:


(1) Everyone will succeed if he works hard.
(2) No one will succeed if he goofs off.


this results in:


(1)* If he works hard, then everyone will succeed.
(2)* If he goofs off, then no one will succeed.


Both “everyone” and “no one” express a universal quantifier, ∀. The second one has a negation as well. We can translate “everyone” to something like “all”, and “no one” to “all” plus “not”. Then we might get:


(1)** If he works hard, then all will succeed.
(2)** If he goofs off, then all will not succeed.


Then, we move the quantifier to the beginning and insert a pronoun, “he”, to match. Then we get something like:


(1)*** For any person, if he works hard, then he will succeed.
(2)*** For any person, if he goofs off, then he will not succeed.


These are equivalent to SEP’s


(1″) Everyone who works hard will succeed.
(2″) No one who goofs off will succeed.
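The equivalences claimed here can be checked by brute force on a small domain. The sketch below is only a sanity check under assumptions of my own: the conditional is read as a material conditional, the domain has two people, and G and U are hypothetical labels for “goofs off” and “will succeed”. It tests whether the universal form (2″) agrees with “no one who goofs off will succeed”, and whether the rejected translation (2′) comes apart from it.

```python
from itertools import product

def implies(p, q):
    """Material conditional."""
    return (not p) or q

def f2_prime(model):
    """(2') not-exists x (Gx -> Ux)"""
    return not any(implies(g, u) for g, u in model)

def f2_double_prime(model):
    """(2'') forall x (Gx -> not Ux), the same form as (2)***"""
    return all(implies(g, not u) for g, u in model)

def no_goofer_succeeds(model):
    """'No one who goofs off will succeed': not-exists x (Gx and Ux)"""
    return not any(g and u for g, u in model)

# Each individual is a pair (goofs_off, succeeds); a model lists two individuals.
individuals = list(product([False, True], repeat=2))
models = list(product(individuals, repeat=2))

agree = all(f2_double_prime(m) == no_goofer_succeeds(m) for m in models)
differ = [m for m in models if f2_prime(m) != f2_double_prime(m)]

print("(2'') matches 'no one who goofs off will succeed':", agree)
print("models where (2') and (2'') differ:", len(differ))
```

A model where nobody goofs off and nobody succeeds already separates (2′) from (2″): the universal form is vacuously true there, while (2′) is false because the embedded conditional is satisfied by everyone.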


The difference between (3) and (3′) is interesting, not becus of relevance to my method above (i think), but since it deals with something beyond first-order logic. Quantification logic, i suppose? I did a brief Google and Wiki search, but didnt find anything like what i was looking for. I also tried Graham Priest’s Introduction to non-classical logic, also without luck.


So here goes some system i just invented to formalize the sentences:


(3) Most students will succeed if they work hard.
(3′) Most students who work hard will succeed.


Capital greek letters are set variables. # is a function that returns the cardinality of a set. S is “is a student”, W is “works hard”, and U is “will succeed”.


(3)* (∃Γ)(∃Δ)(∀x)(∀y)((Sx↔x∈Γ)∧Δ⊆Γ∧#Δ>(#Γ/2)∧(y∈Δ→(Wy→Uy)))


In english: There is a set, gamma, and there is another set, delta, and for any x, and for any y, x is a student iff x is in gamma, and delta is a subset of gamma, and the cardinality of delta is larger than half the cardinality of gamma, and if y is in delta, then (if y works hard, then y will succeed).


Quite complicated in writing, but the idea is not that complicated. It shud be possible to find some simplified writing convention for easier expression of this way of formalizing it.


(3′)* (∃Γ)(∃Δ)(∀x)(∀y)(((Sx∧Wx)↔x∈Γ)∧Δ⊆Γ∧#Δ>(#Γ/2)∧(y∈Δ→Uy))


In english: there is a set, gamma, and there is another set, delta, and for any x, and for any y, (x is a student and x works hard) iff x is in gamma, and delta is a subset of gamma, and the cardinality of delta is larger than half the cardinality of gamma, and if y is in delta, then y will succeed.


To my logician intuition, these are not equivalent, but proving this is left as an exercise to the reader if he can figure out a way to do so in this set theory+predicate logic system (i might try later).
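One way to attack the exercise is to look for a finite model where the two formalizations get different truth values. The sketch below follows (3)* and (3′)* literally under my own assumptions: the embedded conditional is material, Γ is fixed by the biconditional, and Δ is found by searching over the subsets of Γ. The three students and their properties are made up for illustration.

```python
from itertools import combinations

def subsets(s):
    """All subsets of s, smallest first."""
    items = sorted(s)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield set(combo)

def most_witness(gamma, condition):
    """True iff some delta within gamma has #delta > #gamma / 2
    and every member of delta satisfies the condition."""
    return any(len(delta) > len(gamma) / 2 and all(condition(y) for y in delta)
               for delta in subsets(gamma))

def reading_3(S, W, U):
    """(3)*: gamma is the set of students; members of delta satisfy Wy -> Uy."""
    return most_witness(set(S), lambda y: (y not in W) or (y in U))

def reading_3_prime(S, W, U):
    """(3')*: gamma is the set of students who work hard; members of delta satisfy Uy."""
    return most_witness(set(S) & set(W), lambda y: y in U)

# Hypothetical model: three students, one works hard, nobody succeeds.
S = {"ann", "bob", "cid"}
W = {"ann"}
U = set()

print("(3)*  :", reading_3(S, W, U))
print("(3')* :", reading_3_prime(S, W, U))
```

In this model (3)* comes out true, since the two students who do not work hard satisfy the conditional vacuously, while (3′)* comes out false, since the only hard worker fails. So on the material reading the two formulas are not equivalent. Whether the material conditional is the right reading of “if” in (3) is a separate question that the sketch does not settle.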


4.2.2 Cross-sentential anaphora

Consider the following minimal pair from Barbara Partee:


(4) I dropped ten marbles and found all but one of them. It is probably under the sofa.

(5) I dropped ten marbles and found nine of them. It is probably under the sofa.


There is a clear difference between (4) and (5)—the first one is unproblematic, the second markedly odd. This difference is plausibly a matter of meaning, and so (4) and (5) cannot be synonyms. Nonetheless, the first sentences are at least truth-conditionally equivalent. If we adopt a conception of meaning where truth-conditional equivalence is sufficient for synonymy, we have an apparent counterexample to compositionality.


I dont accept that premise either. I havent done so since i read Swartz and Bradley years ago. Sentences like


“Canada is north of Mexico”

“Mexico is south of Canada”


are logically equivalent, but are not synonymous. The concept of being north of, and the concept of being south of are not the same, even tho they stand in a kind of reverse relation. That is to say, xR1y↔yR2x. Not sure what to call such relations. It’s symmetry+substitution of relations.


Sentences like


“Everything that is round, has a shape.”

“Nothing is not identical to itself.”


are logically equivalent but dont mean the same. And so on, cf. Swartz and Bradley 1979, and SEP on theories of meaning.


Interesting though these cases might be, it is not at all clear that we are faced with a genuine challenge to compositionality, even if we want to stick with the idea that meanings are just truth-conditions. For it is not clear that (5) lacks the normal reading of (4)—on reflection it seems better to say that the reading is available even though it is considerably harder to get. (Contrast this with an example due to—I think—Irene Heim: ‘They got married. She is beautiful.’ This is like (5) because the first sentence lacks an explicit antecedent for the pronoun in the second. Nonetheless, it is clear that the bride is said to be beautiful.) If the difference between (4) and (5) is only this, it is no longer clear that we must accept the idea that they must differ in meaning.


I agree that (4) and (5) mean the same, even if (5) is a rather bad way to express the thing one normally wud express with something like (4).


In their bride example, one can also consider homosexual weddings, where “he” and “she” similarly fail to refer to a specific person out of the two newlyweds.

4.2.3 Adjectives

Suppose a Japanese maple leaf, turned brown, has been painted green. Consider someone pointing at this leaf uttering (6):


(6) This leaf is green.


The utterance could be true on one occasion (say, when the speaker is sorting leaves for decoration) and false on another (say, when the speaker is trying to identify the species of tree the leaf belongs to). The meanings of the words are the same on both occasions and so is their syntactic composition. But the meaning of (6) on these two occasions—what (6) says when uttered in these occasions—is different. As Charles Travis, the inventor of this example puts it: “…words may have all the stipulated features while saying something true, but also while saying something false.”[20]


At least three responses offer themselves. One is to deny the relevant intuition. Perhaps the leaf really is green if it is painted green and (6) is uttered truly in both situations. Nonetheless, we might be sometimes reluctant to make such a true utterance for fear of being misleading. We might be taken to falsely suggest that the leaf is green under the paint or that it is not painted at all.[21] The second option is to point out that the fact that a sentence can say one thing on one occasion and something else on another is not in conflict with its meaning remaining the same. Do we have then a challenge to compositionality of reference, or perhaps to compositionality of content? Not clear, for the reference or content of ‘green’ may also change between the two situations. This could happen, for example, if the lexical representation of this word contains an indexical element.[22] If this seems ad hoc, we can say instead that although (6) can be used to make both true and false assertions, the truth-value of the sentence itself is determined compositionally.[23]


Im going to bite the bullet again, and just say that the sentence means the same on both occasions. What is different is that in different contexts, one might interpret the same sentence to express different propositions. This is not something new, since it already came up in the chess example, altho this time it is without indexicals. The reason is that altho the sentence means the same, one is guessing at which proposition the utterer meant to express with his sentence. Context helps with that.

4.2.4 Propositional attitudes

Perhaps the most widely known objection to compositionality comes from the observation that even if e and e′ are synonyms, the truth-values of sentences where they occur embedded within the clausal complement of a mental attitude verb may well differ. So, despite the fact that ‘eye-doctor’ and ‘ophthalmologist’ are synonyms (7) may be true and (8) false if Carla is ignorant of this fact:


(7) Carla believes that eye doctors are rich.
(8) Carla believes that ophthalmologists are rich.


So, we have a case of apparent violation of compositionality; cf. Pelletier (1994).

There is a sizable literature on the semantics of propositional attitude reports. Some think that considerations like this show that there are no genuine synonyms in natural languages. If so, compositionality (at least the language-bound version) is of course vacuously true. Some deny the intuition that (7) and (8) may differ in truth-conditions and seek explanations for the contrary appearance in terms of implicature.[24] Some give up the letter of compositionality but still provide recursive semantic clauses.[25] And some preserve compositionality by postulating a hidden indexical associated with ‘believe’.[26]


Im not entirely sure what to do about these propositional attitude reports, but im inclined to bite the bullet. Perhaps i will change my mind after i have read the two SEP articles about the matter.


Idiomatic language

The SEP article really didnt have a proper discussion of idiomatic language use. Say, frases like “dont mention it” which can either mean what it literally (i.e., by composition) means, or its idiomatic meaning: This is used as a response to being thanked, suggesting that the help given was no trouble (same source).

Whether idioms are a problem for the principle depends on what one takes “complex expression” to mean. Recall the principle:


(C′) For every complex expression e in L, the meaning of e in L is determined by the structure of e in L and the meanings of the constituents of e in L.


What is a complex expression? Is any given complex expression made up of either complex expressions themselves or simple expressions? Idiomatic expressions really just are expressions whose meaning is not determined by their parts. One might thus actually take them to be simple expressions themselves. If one does, then the composition principle is pretty close to trivially true.


If one does not, and instead takes idiomatic expressions to be complex expressions, then the principle of composition is trivially false. I dont consider that a huge problem: it generally holds, and explains the things it is required to explain just fine even when it isnt universally true.


One can also note that idiomatic expressions can be used as parts of larger expressions. Depending on which way to think about idiomatic expressions, and of constituents, then larger expressions which have idiomatic expressions as parts of them might be trivially non-compositional. This is the case if one takes constituents to mean smallest parts. If one does, then since the idiomatic expressions’ meanings cannot be determined from syntax+smallest parts, then neither can the larger expression. If one on the other hand takes constituents to mean smallest decompositional parts, then idiomatic expressions do not trivially make the larger expressions they are part of non-compositional. Consider the sentence:


“He is pulling your leg”


the sentence is compositional since its meaning is determinable from “he”, “is”, “pulling your leg”, the syntax, and the meaning function.
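Treating idioms as simple expressions can be sketched as a lexicon lookup. This is a toy illustration under my own assumptions: the lexicon entries and the capitalized meaning labels are made up, the “structure” is just word order, and the segmentation is a greedy longest match, so an idiom always wins over the literal reading of its words.

```python
# Hypothetical lexicon: keys are word sequences, values are made-up meaning labels.
LEXICON = {
    ("he",): "HE",
    ("is",): "IS",
    ("pulling", "your", "leg"): "TEASING-YOU",  # idiom stored as one simple expression
    ("pulling",): "PULLING",
    ("your",): "YOUR",
    ("leg",): "LEG",
}

def constituents(sentence):
    """Split a sentence into the longest lexicon entries, left to right."""
    words = sentence.lower().split()
    parts, i = [], 0
    while i < len(words):
        for n in range(len(words) - i, 0, -1):   # try the longest chunk first
            chunk = tuple(words[i:i + n])
            if chunk in LEXICON:
                parts.append(chunk)
                i += n
                break
        else:
            raise KeyError(f"no lexicon entry starting at {words[i]!r}")
    return parts

def meaning(sentence):
    """Meaning of the whole as the ordered meanings of its constituents."""
    return [LEXICON[part] for part in constituents(sentence)]

print(constituents("He is pulling your leg"))
print(meaning("He is pulling your leg"))
print(meaning("He is pulling your"))
```

With the idiom in the lexicon, the sentence decomposes into three constituents and its meaning is determined by them. Without the final word the idiom entry does not match and the words are looked up one by one. A fuller treatment would keep both the literal and the idiomatic segmentation and let context choose between them.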


There is a reason i bring up this detail, and that is that there is another kind of idiomatic use of language that apparently hasnt been mentioned so much in the literature, judging from SEP not mentioning it. It is the use of prepositions. Surely, many prepositions are used in perfectly compositional ways with other words, like in


“the cat is on the mat”


where “on” has the usual meaning of being on top of (something), or being above and resting upon or somesuch (difficult to avoid circular definitions of prepositions).


However, consider the use of “on” in


“he spent all his time on the internet”


clearly “on” does not mean the same as above here, it doesnt seem to mean much, it is a kind of indefinite relationship. Apparently aware of this fact (and becus languages differ in which prepositions are used in such cases), the designer of esperanto added a preposition for any indefinite relation to the language (“je”). Some languages have lots of such idiomatic preposition+noun frases, and they have to be learned by heart exactly the same way as the idiomatic expressions mentioned earlier, exactly becus they are idiomatic expressions.


As an illustration, in danish if one is at an island, one is “på Fyn”, but if one is at the mainland, then one is “i Jylland”. I think such usage of prepositions shud be considered idiomatic.

