1.2.4 Distribution
Let us turn now to the observations made in (2) and (3). There we observed that there are certain positions in a sentence that some words can occupy and other words cannot. Clearly, this is determined by category. This is perhaps the most basic point of word categories as far as syntax is concerned. The grammar of a language determines how we construct the expressions of the language. The grammar, however, does not refer to the individual words of the lexicon, telling us, for example, that the word cat goes in position X in expression Y. Such a system would not be able to produce an indefinite number of sentences as there would have to be such a rule for every expression of the language. Instead, the grammar defines the set of possible positions for word categories, hence allowing the construction of numerous expressions from a small number of grammatical principles. The question of how these positions are defined is mostly what this book is about, but for now, for illustrative purposes only, let us pretend that English has a rule that says that a sentence can be formed by putting a noun in front of a verb. This rule then tells us that the expressions in (15) are grammatical and those in (16) are not:
(15) a. John smiled
     b. cats sleep
     c. dogs fly
     d. etc.
(16) a. *ran Arnold
     b. *emerged solutions
     c. *crash dogs
     d. *etc.
This is not meant as a demonstration of how English grammar works, but of how a rule which makes reference to word categories can produce a whole class of grammatical expressions.
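The pretend rule can be sketched in a few lines of code, purely as an illustration. The lexicon below and the labels `N` and `V` are assumptions made for the sketch: the words come from (15) and (16), but the names `LEXICON`, `SENTENCE_RULE`, `is_grammatical` and `generate` are hypothetical. The point is that one rule stated over categories covers every noun-verb pair in the lexicon, with no rule for any individual word.

```python
# Hypothetical toy lexicon: each word from (15) and (16) with one category label.
LEXICON = {
    "John": "N", "Arnold": "N", "cats": "N", "dogs": "N", "solutions": "N",
    "smiled": "V", "sleep": "V", "fly": "V",
    "ran": "V", "emerged": "V", "crash": "V",
}

# The pretend rule from the text: a sentence is a noun followed by a verb.
SENTENCE_RULE = ("N", "V")


def is_grammatical(sentence):
    """Return True if the sentence's category sequence matches the toy rule."""
    words = sentence.split()
    # A word missing from the lexicon has no category, so the match fails.
    categories = tuple(LEXICON.get(word) for word in words)
    return categories == SENTENCE_RULE


def generate():
    """List every two-word expression the toy rule licenses from the lexicon."""
    nouns = [word for word, cat in LEXICON.items() if cat == "N"]
    verbs = [word for word, cat in LEXICON.items() if cat == "V"]
    return [f"{noun} {verb}" for noun in nouns for verb in verbs]
```

Under these assumptions, `is_grammatical("John smiled")` holds and `is_grammatical("ran Arnold")` does not. Adding one more noun to the lexicon adds a whole row of new expressions to `generate()` without any change to the rule.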
We call the set of positions that the grammar determines to be possible for a given category the distribution of that category. If the grammar determines the distribution of categories, it follows that we can determine what categories the grammar works with by observing distributional patterns: words that distribute in the same way will belong to the same category, and words that distribute differently will belong to different categories.
The notion of distribution, however, needs refining before it can be made use of. To start with, as we will see, sentences are not organised as their standard written representations might suggest: one word placed after another in a line. We can see this by the following example:
(17) dogs chase cats
If distribution were simply a matter of linear order, we could define the first position as a position for nouns, the second position for verbs and the third position for nouns again based on (17). Sure enough, this would give us quite a few grammatical sentences:
(18) a. dogs chase birds
     b. birds hate cats
     c. hippopotami eat apples
     d. etc.
However, it would also predict the following sentences to be ungrammatical, since in these a noun occupies the second position and the verb the third:
(19) a. obviously dogs chase cats
     b. rarely dogs chase birds
     c. today birds hate cats
     d. daintily hippopotami eat apples
It is fairly obvious that the sentences in (19) are not only grammatical, but they are grammatical for exactly the same reason that the sentences in (17) and (18) are: the nouns and verbs are sitting in exactly the same positions regardless of whether the sentence starts with a word like obviously or not. It follows, then, that distributional positions are not defined in terms of linear order. Just how distributional positions are defined is something to which we will return when we have introduced the relevant concepts.
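The failure of a purely linear definition can be made concrete with a small sketch. It assumes a hypothetical lexicon built from the words in (17) to (19); the label `Adv` for the sentence-initial words is a placeholder chosen for the sketch, not a category the text has introduced, and the names `LINEAR_TEMPLATE` and `fits_linear_template` are likewise invented. The checker implements only the rejected hypothesis, namely that positions are counted from the left edge of the sentence.

```python
# Hypothetical lexicon for the words in (17)-(19). "Adv" is only a placeholder
# label for words like "obviously"; the text has not named their category.
LEXICON = {
    "dogs": "N", "cats": "N", "birds": "N", "hippopotami": "N", "apples": "N",
    "chase": "V", "hate": "V", "eat": "V",
    "obviously": "Adv", "rarely": "Adv", "today": "Adv", "daintily": "Adv",
}

# The linear hypothesis based on (17): position 1 = noun, 2 = verb, 3 = noun.
LINEAR_TEMPLATE = ["N", "V", "N"]


def fits_linear_template(sentence):
    """Return True if the words match the template position by position."""
    categories = [LEXICON.get(word) for word in sentence.split()]
    return categories == LINEAR_TEMPLATE
```

With these definitions, `fits_linear_template("dogs chase cats")` holds, but `fits_linear_template("obviously dogs chase cats")` does not, even though the sentence is grammatical. The checker is therefore too strict, which is the reason for abandoning linear order as the basis of distributional positions.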
A further complication is indicated by the following observation:
(20) a. Knut hates sea
     b. *Knut smiles sea
The morphological forms hates and smiles are both present tense, which indicates that the two words belong to the same category, namely verbs. However, as (20) demonstrates, these words appear to have different distributions, which would suggest that they belong to different categories. How can this apparent contradiction be reconciled? We will see that part of the solution to this problem follows from the way in which distributions are defined, which we have yet to discuss. However, another aspect of distribution can be discussed at this point. Note that a sentence which is grammatical with the verb smiles is ungrammatical with the verb hates:
(21) a. Knut smiles
     b. *Knut hates
Obviously there are words which cannot go in either of these positions:
(22) a. *Knut cats sea
     b. *Knut cats
What (22) indicates is that the positions we are considering here are both verb positions, and hence a noun cannot occupy them. Yet some verbs can occupy one of these positions and other verbs can occupy the other. This suggests that there are different types of verb, what we might call subcategories of the category verb. If this is right, we would expect the set of possible verbal positions to be divided up between the different verbal subcategories, so that the positions in which one subcategory can appear are those in which the others cannot. In other words, different subcategories will have complementary distributions. This indeed seems to be true, as (20) and (21) indicate.
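The reasoning from (20) to (22) can be laid out as a small sketch. The judgments are copied directly from those examples; the frame notation with `_` marking the tested position, and the names `FRAMES`, `JUDGMENTS`, `distribution` and `complementary`, are assumptions made for illustration only. A word's distribution is taken here to be the set of frames in which it yields a grammatical sentence.

```python
# The two positions under discussion, with "_" marking the slot being tested.
FRAMES = ("Knut _ sea", "Knut _")

# Judgments copied from (20)-(22): True = grammatical, False = starred.
JUDGMENTS = {
    ("hates", "Knut _ sea"): True,    # (20a)
    ("smiles", "Knut _ sea"): False,  # (20b)
    ("smiles", "Knut _"): True,       # (21a)
    ("hates", "Knut _"): False,       # (21b)
    ("cats", "Knut _ sea"): False,    # (22a)
    ("cats", "Knut _"): False,        # (22b)
}


def distribution(word):
    """Return the set of frames in which the word is judged grammatical."""
    return {frame for frame in FRAMES if JUDGMENTS.get((word, frame), False)}


def complementary(word_a, word_b):
    """True if both words occur somewhere and never share a frame."""
    dist_a, dist_b = distribution(word_a), distribution(word_b)
    return bool(dist_a) and bool(dist_b) and dist_a.isdisjoint(dist_b)
```

On this data, `complementary("hates", "smiles")` holds: each verb occupies exactly the position the other cannot. The noun cats has an empty distribution over these two frames, so it is excluded from both verb positions and is not in complementary distribution with either verb in the sense defined here.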