Eliminating Positions:
Syntax and semantics of sentence modification
Published by
LOT
Trans 10
3512 JK Utrecht
the Netherlands
Phone: +31 30 253 6006
Fax: +31 30 253 6000
E-mail: [email protected]
http://wwwlot.let.uu.nl/
Cover illustration: Image generated by Owen Ransen’s Gliftic graphics software
ISBN 90-76864-34-9
NUR 632
Copyright © 2003 Øystein Nilsen. All rights reserved.
Eliminating Positions:
Syntax and semantics of sentence modification
Posities Verwijderen:
Syntaxis en semantiek van zinsmodificatie
(with a summary in Dutch)
Dissertation
submitted for the degree of Doctor
at Utrecht University
on the authority of the Rector Magnificus,
Prof. Dr. W. H. Gispen,
pursuant to the decision of the Board for the Conferral of Doctoral Degrees,
to be defended in public
on Friday 24 January 2003
at 12.45 p.m.
by
Øystein Nilsen
born on 29 March 1971 in Ringerike
Supervisor (promotor): Prof. dr. E. J. Reuland
Contents
Acknowledgements
1 Preliminaries
1.1 Introduction and overview
1.2 Positions
1.3 Syntactic vs. ontological hierarchies
1.3.1 Cinque’s Universal Hierarchy
1.3.2 Ernst’s Fact-Event object calculus
1.4 Antisymmetry and processing
1.4.1 Introduction
1.4.2 Antisymmetry
1.4.3 A processing account
1.4.4 GU20 and Penultimate position
1.4.5 A&N’s arguments for right-adjunction
1.5 Summary
2 Domains for Adverbs
2.1 Introduction
2.2 Adverbs and Polarity
2.2.1 NPIs and speaker oriented adverbs
2.3 Approaches to polarity items
2.3.1 Monotonicity and veridicality
2.3.2 Why DE?
2.3.3 Widen up and strengthen
2.4 possibly: Shrink, but don’t weaken!
2.4.1 Modal bases
2.4.2 Entrenched beliefs
2.4.3 The status of possibly as a PPI
2.5 Prospects and consequences
2.5.1 Short Verb Movement
2.5.2 Semantic selection
2.5.3 Summary
3 V2 and Holmberg’s Generalization
3.1 Introduction
3.2 Some data
3.2.1 V2-violations with focus particles
3.2.2 Pronoun Shift
3.3 First approximation
3.3.1 Problems
3.4 More data: how initial is the initial position?
3.5 Second approximation
3.5.1 Root clauses
3.5.2 More problems
3.6 German
3.6.1 Hallman’s analysis
3.6.2 Müller’s V2 as vP first
3.7 Third approximation: V2 without positions
3.7.1 The analysis
3.7.2 ΣP fronting
3.8 Summary
4 Verb movement, Scope and Scrambling
4.1 Introduction
4.2 A Bobaljik Paradox for SVM
4.3 English and French
4.4 Italian verb scrambling and VP scrambling
4.5 Summary
A Degrees and SOA
A.1 The veldig ∼ vis Competition
A.1.1 A morphological analysis?
A.1.2 DegPs and sentence adverbs
Bibliography
Samenvatting in het Nederlands
Acknowledgements
Utrecht is a beautiful place to live and a great place to do linguistics. One of the
most impressive things about Utrecht is its high density of impressive people.
The Utrecht Institute of Linguistics OTS is no exception in this respect. Being
allowed to do my Ph.D. surrounded by such people has been a privilege.
I would like to express my warm thanks to my supervisor, Eric Reuland,
for being a seemingly inexhaustible source of stimulating discussions, support
and energy.
Tanya Reinhart is a great inspiration for many people at the institute,
including myself. Her courses and her highly constructive comments have been
instrumental for my work at several junctures.
My stay in Venice during my Master’s studies was a turning point in my
linguistic training. I am deeply grateful to Guglielmo Cinque for his time and
patience in discussions with me about linguistics and adverbial syntax during
that stay and on several occasions later.
Several people have been vital for my development as a Ph.D. student by
giving courses, having meetings with me or discussing linguistics with me in
corridors, pubs, trains, on email, etc. I thank Peter Ackema, Sjef Barbiers,
Raffaella Bernardi, Hagit Borer, Olga Borik, Patrick Brandt, Anne Breitbarth,
Balder ten Cate, Lisa Cheng, Crit Cremers, Paul Dekker, Alexis Dimitriadis,
Arnold Evers, Christophe Costa Florencio, Anastasia Giannakidou, Taka Hara,
Heleen Hoekstra, Anders Holmberg, Janne Bondi Johannessen, Marit Julien,
Richard Kayne, Hilda Koopman, Fred Landman, Marika Lekakou, Michael
Moortgat, Richard Moot, Iris Mulders, Marie Nilsenova, Rick Nouwen, Tim
Stowell, Peter Svenonius, Henriëtte de Swart, Johan Rooryck, Robert van Rooy,
Susan Rothstein, Maaike Schoorlemmer, Kriszta Szendrőy, Tarald Taraldsen,
Henk Verkuyl, Nadya Vinokurova, Willemijn Vermaat, Yoad Winter, Ton van
der Wouden and Henk Zeevat for contributing to my work in various important
ways. I am certain to have left out some people that deserve mention: I
apologize for my forgetfulness and thank them, too.
Special thanks are due to Esther Kraak for helping me take care of practical
things in the last phase of writing up the dissertation, and to Willemijn Vermaat
and Elise de Bree for translating the Dutch summary.
For excellent company, and, in some cases, even home-sharing, I thank my
friends Patrick, Richard, Raffa, Manuel, Francesco, Frederic, Sue, Sylvain,
Kathleen, Reza, Lorna, Karin, Derek, Alireza, Luciano, Federica, Kriszta,
Balazs, Rick, Alexis, Amy, Olga and Femke.
Finally, I thank Marika with love for being patient with me, for helping me
in all sorts of ways, and for many, many other things.
CHAPTER 1
Preliminaries
1.1 Introduction and overview
This dissertation presents a view of “clause structure” which essentially amounts
to denying its existence. It is argued that phenomena like adverb placement,
short verb movement and verb second, which form core cases in support of
contemporary notions of CP-fields and IP-fields in the clause, are better handled
in other ways which do not resort to the notion of arbitrary selectional sequences
of (functional) syntactic heads. The alternatives which are pursued here take
Chomsky’s notion of a “bare output condition” quite literally (Chomsky, 1995,
1999, 2001). Schematically, the approach can be described as follows: given an
expression α with a limited distribution δ, i.e. α cannot occur in non-δ
environments (call these δ̄), one searches for a set of properties of α, δ and δ̄
which, in conjunction with independently motivated assumptions about the
conceptual-intentional and sensorimotor interfaces, derives α’s limitation to δ
as a theorem.
Suppose, for example, that two expressions a and b can only occur in one order,
namely ab. Suppose furthermore that the relative ordering of expressions of
the relevant kind invariably affects their semantic scope, such that precedence
maps directly onto semantic scope. In currently available theories, we
essentially have two major ways of dealing with this. On the one hand, we could
set up a clausal hierarchy, such as the one in figure 1.1, and explain the fact
that a must precede and outscope b by stating that a must occupy spec-XP, while
b must occupy spec-YP (and stating that semantic scope is determined by
c-command in the usual way). On the other hand, we could try to identify some
semantic properties of a and b which would explain why b cannot outscope a,
state that scope = c-command, and thus derive the absence of the order ba from
that. If tenable, the latter approach seems better, because it involves fewer
stipulations than the former. In fact, the account in terms of assigning positions
leaves one with the impression that we are describing facts, not deriving them.
[XP a [XP X [YP b [YP Y ...]]]]
Figure 1.1: Hierarchy for ab
The dissertation is organized as follows. The rest of this chapter is devoted
to two different topics. First, I will discuss some of the reasons why I think
the standard notion of “clausal architecture” must be discarded. Next, I will
argue that this does not bear on whether or not one should adopt Kayne’s
antisymmetry theory. In fact, I will adopt antisymmetry, because it is the only
theory I am aware of which derives the universal absence of certain word orders.
I will argue that proposed alternatives, like Ackema and Neeleman (2002), at
least as they stand, cannot be taken as a serious competitor. This, in turn, will
be important for how I derive the correspondence between certain word orders
and certain semantic scopal orders.
Chapter 2 is devoted to arguing that adverb ordering can be given a se-
mantic account. The proposal is essentially that “high” adverbs are positive
polarity items, and that this derives the distribution of these adverbs to a sur-
prisingly large extent. Furthermore, adverb orderings which have been reported
in the literature to be ungrammatical (including Nilsen (2000)) are often quite
acceptable. I believe part of the problem has been that it is quite difficult
to conjure up pragmatically felicitous examples containing several adverbs.1
Hence it might not be surprising that, when one succeeds in doing this, the
examples one has created are such that the adverbs are only felicitous in one
particular order. In order to sidestep this problem, I have adopted the method
of searching for strings of adverbs on the internet. I argue that one can explain
why the adverb possibly behaves like a positive polarity item by adopting the
theory developed for NPIs like any by Kadmon and Landman (1993), Krifka
(1995) and Chierchia (2001).
1 Readers who are incredulous about this point are invited to come up with pragmatically
sensible sentences containing the three adverbs usually, soon and already in all of the six
possible orders.
In Chapter 3, I develop a novel approach to verb second (V2) after arguing
that the standard approach in terms of verb movement to C with subsequent
topicalization runs into problems with certain V2-violations involving
focus-sensitive operators, like bare ‘only’. Then it is argued that the stan-
dard approach groups facts together in an incorrect way. For example, it links
obligatory subject-verb inversion to verb movement to a high position, i.e. C.
However, we will see that there are cases of obligatory inversion where the verb
is arguably much lower than this. The approach is developed in three con-
secutive “approximations”, which can handle the problems discussed above,
in addition to leading to a very simple account of Holmberg’s Generalization
(Holmberg, 1986, 1999) concerning the interplay between argument shift and
verb movement in the Scandinavian languages. The idea is that shifted argu-
ments cannot cross the verb (or other phonetically realized material from the
VP) because it is the VP itself (or something bigger) which moves. In other
words, it is argued that weak pronoun shift, as well as head movement of the
finite verb to C, do not exist. The reason these elements tend to end up
in the “left periphery” is that a larger constituent, dubbed ΣP, must move to
the first position. Material occurring lower down in the clause must then have
been extracted from ΣP prior to ΣP fronting.
Chapter 4 addresses arguments for positions from verb placement in Romance
and English, concluding that, to a large extent, (short) verb movement
must be treated as a scope/information-structure related phenomenon, to be
treated on a par with scrambling. It is suggested that obligatory verb movement
around adverbs, as seen in French (Pollock, 1989), does not exist. In other
words, when a given verb form must precede adverbs, this is to be treated as
(remnant) movement of a larger constituent containing the verb to the left
periphery, much like the fronting of ΣP discussed in chapter 3.
The overarching theme is that phenomena that have been treated in terms
of selectional sequences of functional heads need not, and sometimes cannot, be
analyzed in this way.
1.2 Positions
The notion of a ‘syntactic position’ as used in contemporary generative syntax
is a residue of structuralist analysis of clause structure in terms of “fields”, e.g.
the analysis of Germanic verb-second languages into a Vorfeld, a Mittelfeld
and a Nachfeld. The latter trichotomy can be seen to correspond more or less
directly to the more contemporary notions of CP, IP and vP, respectively. Thus,
one could say that CP/IP/vP represents an attempt to flesh out the internal
structure of the structuralist fields. In the last fifteen years or so, syntactic
research has shown, conclusively, I think, that if we want to describe clause
structure in these terms, we need to split CPs, IPs and vPs into several distinct
functional projections (Pollock, 1989; Rizzi, 1997; Cinque, 1999), generating a
plentitude of positions, thus leading again to notions of “CP-fields”, “IP-fields”
and “VP-fields”. This line of thought has been pursued intensively over the last
few years under the heading of “clausal cartography”. The task, under such a
view, for syntactic theory is to create a detailed “map” of all the positions in the
clause, and to assign expressions to these positions. However, one might wonder
whether it would not be possible to go further. Given a clausal cartography,
one might want to ask why the clause should contain exactly the positions it
does, and why they should relate to one another in the way that they do. The
present thesis attempts to address the latter of these two questions. In fact, the
answer I will propose to this question suggests an answer to the former: the
clause does not contain any positions.
In order to see what is meant by this, let us put the notion of a “position”
under some scrutiny. In the prevalent theoretical frameworks of the sixties,
clausal positions naturally arose as a byproduct of the use of phrase structure
(PS) rules. Thus, the PS-rules in (1) give rise to the tree in figure 1.2, with a
COMP position, an NP (subject) position, an AUX position, and a VP position.
(1) a. S′ ⇒ COMP, S
    b. S ⇒ NP, AUX, VP
[S′ COMP [S NP AUX VP]]
Figure 1.2: Positions generated by (1)
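How rules like (1) give rise to positions can be sketched in a few lines of Python. This is only an illustration and not part of the original analysis: the dictionary RULES and the helper names expand and positions are hypothetical, and the root of rule (1a) is written S' on the assumption that it is the barred category that dominates S in figure 1.2.

```python
# Hypothetical encoding of the PS-rules in (1): mother -> list of daughters.
# "S'" stands for the barred category assumed to dominate S.
RULES = {
    "S'": ["COMP", "S"],
    "S": ["NP", "AUX", "VP"],
}


def expand(symbol, rules):
    """Return a labelled bracketing obtained by applying the rules top-down."""
    if symbol not in rules:
        # Symbols without a rule are treated as terminal positions here.
        return symbol
    daughters = " ".join(expand(d, rules) for d in rules[symbol])
    return f"[{symbol} {daughters}]"


def positions(symbol, rules):
    """List the terminal positions generated under `symbol`, left to right."""
    if symbol not in rules:
        return [symbol]
    return [leaf for d in rules[symbol] for leaf in positions(d, rules)]


print(expand("S'", RULES))     # [S' COMP [S NP AUX VP]]
print(positions("S'", RULES))  # ['COMP', 'NP', 'AUX', 'VP']
```

The point of the sketch is simply that the positions are a byproduct of the rules: nothing beyond the rule set has to be stated in order to obtain them.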
With the advent of X-bar theory, the burden of generating designated positions
was shifted to the lexicon. Thus, one made use of subcategorization frames,
θ-grids and (s/c-) selectional properties, all of which are qualities of lexical
items. According to this view, trees are projected from lexical items in accordance
with the ‘Projection Principle’, governing how lexical information is reflected
in syntactic structure, and category-neutral PS-rules, like the following, where
YP is an “adjunct”, ZP is a “specifier” and UP is a “complement”.
(2) a. XP ⇒ YP; XP
    b. XP ⇒ ZP; X′
    c. X′ ⇒ X⁰; UP
Now the COMP, NP, AUX, etc. positions of figure 1.2 must be projected
from lexical items. This led to the postulation of the functional heads C⁰ and
I⁰, etc. The idea is that these project positions in accordance with (2) and that
C⁰ syntactically selects for IP.
X-bar theory is essentially a set of stipulated PS-rules. In the nineties, theories
were proposed that derive the stipulations from more fundamental properties
of syntactic composition (Kayne, 1994; Chomsky, 1994; Brody, 2000), but the
core of the approach remains the same, i.e. these approaches all generate
positions by means of selectional sequences in interaction with general axioms for
composition.2 More often than not, a particular functional head X⁰ serves the
sole purpose of generating a position by altering the label of the tree, i.e. the
label of the complement of the head is different from the label projected from
the head. As pointed out by Starke (2001), the empty heads can be omitted
if we allow phrases (i.e. “specifiers”) to alter the labels in this fashion
themselves. The idea is that since “specifiers” of X invariably share a feature with
X, the feature of the specifier could project the label itself. Thus instead of the
leftmost tree in figure 1.3, we would have the rightmost one.3
[XP YP [XP X⁰ ZP]]        [XP YP ZP]
Figure 1.3: X-bar structure and its corresponding Starke structure
In such a system, the positions cannot be ordered by selection in the standard
sense. But the selectional properties of an abstract head X⁰ are essentially
stipulated, and crucial only inasmuch as we use the head to generate a
particular position. So Starke suggests that we might as well stipulate the
sequence in which labels are allowed to change separately, without any loss of
insight or elegance: his “fseq”. An fseq ⟨f1, …, fn⟩ is essentially an ordering
of features. If features are thought of as corresponding to properties (i.e. sets)
of expressions, we see that what fseq does is essentially to stipulate the
distribution of expressions. It would seem overly pessimistic, then, to suppose
that fseq cannot be derived from more fundamental properties of the classes
of expressions that it orders.4 The work on this thesis started out as an effort
to derive certain parts of fseq from independent (semantic) factors. However,
after a while it became clear that there are serious problems with the very
notion of an fseq, problems that seem to indicate that the notion is not only
conceptually problematic (i.e. stipulative), but, in fact, empirically wrong.
Fseq is standardly assumed to be linear. As we shall see, there are
distributional phenomena which cannot be accommodated into a linear fseq, in fact,
not even into a partial order. Hence some other theory of distribution must
supplement fseq. The alternative to stipulating a linear fseq is, of course, to
derive the distribution of different classes of expressions from independently
motivated properties of these expressions.5 In other words, we could seek to
derive fseq from a proper characterization of the features themselves. Then,
there would be no need to stipulate an fseq or empty heads which are only
there to project positions. Given such an approach, there would be no residue
of structuralist “positions”, except, perhaps, in a purely descriptive sense.
Apart from being essentially descriptive, positional analyses generate some
deep puzzles. As has been noted by Jonathan Bobaljik, the following two
observations combine to yield a paradoxical situation if both adverb order and
argument order are to be analyzed in terms of fseqs. i) Scrambling/Argument
Shift: In Germanic V2 languages (except Danish6), the subject of the clause
can occupy any position in a string of adverbs, as long as the relative ordering
of the adverbs remains unaltered. The same can be seen to hold of direct
and indirect objects. ii) In the same languages, given the string ⟨s, io, do⟩,
an adverb can occur anywhere in the string, as long as the ordering of the
arguments remains unaffected. It should be obvious that no linear fseq can
accommodate at the same time arguments and adverbs in such a way that both
observations hold. Bobaljik, following Åfarli, concludes that adverbs must be
Z-axis elements in 3-dimensional graphs. I think one could keep the number
of dimensions of syntactic representations low if we abandon the view that we
should analyze such distributional phenomena in terms of linear fseqs in the
first place.7
Of course, eliminating positions in the manner outlined can only be done
by reanalyzing phenomena which have been dealt with in this way. This is
obviously more work than one can undertake in one dissertation. The more
modest goal of the current dissertation is to propose alternatives to positional
analyses in three domains which I take to be crucial cases, namely the analysis
of verb second, the analysis of adverb placement and of (short) verb movement.
2 Of course, the question of selectional sequences is orthogonal to the question of how to
derive X-bar theory. For example, one can perfectly well adopt Kayne’s antisymmetry framework
without assuming selectional sequences to be playing any role.
3 See Starke (2001) for technical details of how this is implemented.
4 Please note that I am not attributing to Starke (or anybody else) the view that fseq
cannot be derived. I am merely pointing out that, if it were to turn out that it must be taken
as primitive, this would be a very disappointing state of affairs for linguistic theory. In fact,
I take this to be obvious.
5 Some alternative frameworks generate non-linear “fseqs”. One example is Type Logical
Grammar (Moortgat, 1996), where the unary connectives give rise to a cube of “positions”
which has been put to uses which are similar to the generative notion of fseq. See in
particular the derivation of polarity phenomena in Bernardi (2002). In this sense, TLG has
an “fcube” rather than an fseq. An important difference is that, while the generative fseq
is stipulated, the fcube of TLG is a theorem of the algebraic laws governing the unary
(and binary) grammatical connectives. Given that the base logic of TLG is Curry-Howard
isomorphic to the typed lambda calculus, it seems that one could actually have a semantic
motivation for which corner of the cube an expression is assigned to. It is possible that
such a theory can be developed in ways compatible with the ideas of e.g. adverb ordering to
be proposed in the present work. See Nilsen (2001) for some steps in that direction.
6 In this language, subjects and weak pronouns precede all adverbs in the Mittelfeld,
whereas other kinds of arguments follow all (pre-VP) adverbs. In other words, Danish does
not have ‘object shift’/scrambling of full DPs. This is widely and falsely believed to be
a property of all Mainland Scandinavian languages. See Nilsen (1997) and chapter 3 for
discussion.
7 The number of dimensions of a graph is obtained by counting the number of primitive
relations the graph represents. In the case of syntactic trees, they are 2D inasmuch as they
encode both precedence and dominance and neither relation is reducible to other theoretical
notions. If one is derivative of the other, it is 1D. Adding a third primitive relation, call it
“beyond”, would require heavy empirical ammunition. See Nilsen (1997) for discussion of
the proposal in Åfarli (1996)
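The paradox attributed to Bobaljik in this section can be made concrete with a small enumeration. The sketch below is an illustration under simplifying assumptions, not a piece of the original argument: the labels adv1 and adv2 are hypothetical placeholders for two rigidly ordered adverbs, and the two observations are idealized as saying that every interleaving of the adverb string with ⟨s, io, do⟩ is attested as long as each string keeps its internal order.

```python
from itertools import combinations

# Hypothetical placeholders: two rigidly ordered adverbs and the argument
# string <s, io, do> from observation (ii).
ADVERBS = ["adv1", "adv2"]
ARGUMENTS = ["s", "io", "do"]


def interleavings(xs, ys):
    """All merges of xs and ys that keep the internal order of each list."""
    n = len(xs) + len(ys)
    result = []
    for chosen in combinations(range(n), len(xs)):
        slots = set(chosen)
        xi, yi = iter(xs), iter(ys)
        result.append(tuple(next(xi) if i in slots else next(yi)
                            for i in range(n)))
    return result


def attested_precedences(strings):
    """Pairs (x, y) such that x precedes y in at least one string."""
    return {(w[i], w[j])
            for w in strings
            for i in range(len(w))
            for j in range(i + 1, len(w))}


strings = interleavings(ADVERBS, ARGUMENTS)
prec = attested_precedences(strings)

# Adverb/argument pairs that are attested in both orders.
unordered = [(a, b) for a in ADVERBS for b in ARGUMENTS
             if (a, b) in prec and (b, a) in prec]

print(len(strings))            # 10
print(len(unordered))          # 6
print(("adv2", "adv1") in prec)  # False
```

Under these assumptions every adverb/argument pair occurs in both orders, while the adverbs and the arguments each stay rigidly ordered among themselves. A single linear sequence containing all five elements would fix one order for each mixed pair, which is the sense in which both observations cannot hold at once.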
1.3 Syntactic vs. ontological hierarchies
As an alternative to assuming label change or selectional sequences to be
responsible for the distribution of expressions, it has been proposed (Ernst, 2001;
Svenonius, 2001) that one could vary the semantic (ontological) type of the
denotation of the clausal projection. I find the Ernst/Svenonius approach very
interesting, because it accounts for some important properties of verb/adverb
placements, including “Bobaljik paradoxes” of the sort discussed above, and
transitivity failures to be discussed below. Furthermore, the approach they
pursue makes it entirely natural that the ordering of verbs with respect to
adverbs is optional in many cases. My main objection to their theory is that they
employ two orthogonal selection orderings, one semantic and one syntactic.
This leads to some problems which seem to suggest that, if adverb distribution
is to be handled in terms of selection at all, such selection cannot be orthogonal
to fseq.
1.3.1 Cinque’s Universal Hierarchy
Cinque (1999) develops an approach to adverb ordering which is not so much
about explaining it, but rather about capturing crosslinguistic and crosscatego-
rial generalizations about the distribution of functional material in the clause,
only part of which pertains to adverbs. Cinque draws the important conclusion
that adverbs, verbal affixes, free functional morphemes and so-called ‘restruc-
turing’ verbs (often called ‘verb raisers’ in the literature on Germanic) share
important distributional properties, and hence, at some level must be given a
unified theoretical treatment. Given the wealth of empirical evidence Cinque
brings to bear on this question, it is hard to see how one could reasonably
doubt that conclusion. His implementation of it in terms of a very large uni-
versal hierarchy of functional projections (or, equivalently, a very long fseq)
is controversial, however.8 Cinque’s hierarchy is given below.9
(3) [mood_speech-act frankly [mood_evaluative fortunately [mood_evidential
allegedly [mod_epistemic probably [T_past once [T_future then [mod_irrealis
perhaps [mod_necessity necessarily [mod_possibility possibly [asp_habitual usually
[asp_repetitive again [asp_freq(I) often [mod_volitional intentionally
[asp_celerative(I) quickly [T_anterior already [asp_terminative no longer
[asp_continuative still [asp_perfect(?) always [asp_retrospective just
[asp_proximative soon [asp_durative briefly [asp_generic/progressive
characteristically(?) [asp_prospective almost [asp_sg.completive(I) completely
[asp_pl.completive tutto [voice well [asp_celerative(II) fast/early
[asp_repetitive(II) again [asp_freq(II) often [asp_sg.completive(II)
completely ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
8 In Cinque (to app.), he addresses many of the objections that have been raised against
his proposal.
9 This is the hierarchy he proposes in his (1999) book. Later, he has proposed exten-
sions and refinements of it based on the distributional properties of ‘restructuring’ verbs and
adverbial PPs.
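What a hierarchy like (3) predicts about adverb strings can be sketched as a simple rank comparison. The following is an illustration only: the list HIERARCHY copies the English adverbs from (3) in order, leaving out the second occurrences of again, often and completely and the entry fast/early for simplicity, and the names RANK and respects_hierarchy are hypothetical. The function says what the hierarchy predicts, not what speakers actually accept.

```python
# Adverbs of (3), highest first. Second occurrences (the "II" positions)
# and "fast/early" are omitted in this simplified sketch.
HIERARCHY = [
    "frankly", "fortunately", "allegedly", "probably", "once", "then",
    "perhaps", "necessarily", "possibly", "usually", "again", "often",
    "intentionally", "quickly", "already", "no longer", "still", "always",
    "just", "soon", "briefly", "characteristically", "almost", "completely",
    "tutto", "well",
]

# Position of each adverb in the hierarchy (0 = highest).
RANK = {adverb: i for i, adverb in enumerate(HIERARCHY)}


def respects_hierarchy(adverbs):
    """True if each adverb is strictly higher in (3) than the one after it."""
    ranks = [RANK[a] for a in adverbs]
    return all(r1 < r2 for r1, r2 in zip(ranks, ranks[1:]))


print(respects_hierarchy(["probably", "usually", "already"]))  # True
print(respects_hierarchy(["already", "probably"]))             # False
```

Because each adverb receives exactly one rank, the predicted ordering is automatically linear. That is the property examined in the remainder of this section.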
Cinque shows that adverbs obey substantially the same ordering restrictions
in languages as diverse as Italian, Serbo-Croatian, Mandarin Chinese, Hebrew
and Norwegian (and lots of other languages). He then shows that, in languages
which express “adverbial notions” as verbal affixes, the ordering of affixes is
entirely consistent with that found for adverbs, given standard assumptions
about morphology. Next, he shows that languages where the notions in question
manifest themselves as free functional morphemes order these in the same way.
All of this goes to show that there is something fundamentally universal about
the relative ordering in which functional material occurs in the clause. He
then shows that, given a rigidly ordered10 sequence of adverbs, a1, …, an, one
finds detailed variation within Romance languages with respect to where one
can place verbs. In other words, whereas one language L might allow past
participles to occupy any position in the sequence, another language L′ would
allow anything except for the very last position. Yet another language L″
might exclude the two last positions in the adverbial sequence, while allowing
all other positions and so on. This gives rise to the picture in the following
table, where ai are adverbs and “√” and “∗” represent possible and impossible
positions for the verb, respectively.
L1   √  a1  √  a2  √  a3  √
L2   √  a1  √  a2  √  a3  ∗
L3   √  a1  √  a2  ∗  a3  ∗
L4   √  a1  ∗  a2  ∗  a3  ∗
Table 1.1: Verb–adverb distribution
The standard treatment of differential verb placement is head movement to
distinct functional heads. Thus, Cinque argues that the pattern in table 1.1
indicates that there must be head positions between each of the adverbs; other-
wise there would be no means to describe the difference between the languages
in question, while capturing the relevant generalizations. In particular, Cinque
argues that there is no language where the verb can precede, say, a1 but none
of the other adverbs. Thus, we arrive at a structure like the one in figure
1.4, where Li now represents the lowest possible position for the verb in the
corresponding language.
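The pattern in table 1.1 can be stated compactly in code. The sketch below is an illustration under my own encoding assumptions: each language is a list of four truth values for the verb positions before a1, between a1 and a2, between a2 and a3, and after a3, and the names TABLE, is_initial_segment and lowest_position are hypothetical.

```python
# Table 1.1, encoded as possible (True) / impossible (False) verb positions.
# Index 0 = before a1, 1 = between a1 and a2, 2 = between a2 and a3,
# 3 = after a3.
TABLE = {
    "L1": [True, True, True, True],
    "L2": [True, True, True, False],
    "L3": [True, True, False, False],
    "L4": [True, False, False, False],
}


def is_initial_segment(slots):
    """True if every possible position has only possible positions above it."""
    return all(earlier or not later
               for earlier, later in zip(slots, slots[1:]))


def lowest_position(slots):
    """Index of the lowest (rightmost) position available to the verb."""
    return max(i for i, possible in enumerate(slots) if possible)


for language, slots in TABLE.items():
    print(language, is_initial_segment(slots), lowest_position(slots))
# L1 True 3
# L2 True 2
# L3 True 1
# L4 True 0
```

On this encoding each language is fully characterized by one number, the lowest position its verb can reach, which is what the labels L1 to L4 pick out in the head-movement structure.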
[xp [x L4 x] [yp a1 [yp [y L3 y] [zp a2 [zp [z L2 z] [up a3 [up [u L1 u] [wp ...]]]]]]]]
Figure 1.4: Syntactic hierarchy for the pattern in table 1.1
10 By “rigidly ordered” I mean that the adverbs in question can occur in this order, and,
furthermore, that this is the only order in which they can cooccur, and that this ordering is
constant across languages.
Suppose that we denote the relevant (descriptive) ordering relation of
functional material as “≺”, where x ≺ y iff x can precede y. Cinque’s approach leads
one to expect ≺ to be a linear ordering, i.e. a transitive (∀x, y, z((x ≺ y ∧ y ≺
z) → x ≺ z)), asymmetric (∀x, y((x ≺ y ∧ y ≺ x) → x = y)) and connected
(∀x, y(x ≺ y ∨ y ≺ x)) ordering of classes of functional material. We need to
talk about ordering of classes, rather than individual expressions, because it
could happen that two functional expressions represent opposite values for the
same functional notion, e.g. the pair always and never. These would then be
predicted not to be able to cooccur. Linearity comes about because each
functional projection is a syntactic complement of another one (except the highest
one, of course), and, because of binary branching, each functional head can
have at most one complement.
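The three conditions on ≺ can be checked mechanically for any finite set of classes. The sketch below is a minimal illustration with hypothetical helper names. One assumption is made explicit: all quantifiers are restricted to pairs and triples of distinct classes, so that the second condition amounts to saying that two different classes never precede each other. The three class labels are taken from the hierarchy in (3).

```python
from itertools import permutations


def is_transitive(items, prec):
    """x ≺ y and y ≺ z imply x ≺ z (distinct x, y, z only)."""
    return all((x, z) in prec
               for x, y, z in permutations(items, 3)
               if (x, y) in prec and (y, z) in prec)


def is_asymmetric(items, prec):
    """No two distinct classes precede each other."""
    return not any((x, y) in prec and (y, x) in prec
                   for x, y in permutations(items, 2))


def is_connected(items, prec):
    """Any two distinct classes are ordered one way or the other."""
    return all((x, y) in prec or (y, x) in prec
               for x, y in permutations(items, 2))


def is_linear(items, prec):
    return (is_transitive(items, prec)
            and is_asymmetric(items, prec)
            and is_connected(items, prec))


def order_from_sequence(seq):
    """The 'can precede' relation induced by a single sequence of classes."""
    return {(seq[i], seq[j])
            for i in range(len(seq))
            for j in range(i + 1, len(seq))}


classes = ["mod_epistemic", "T_past", "asp_habitual"]
prec = order_from_sequence(classes)
print(is_linear(classes, prec))  # True

# Removing one pair leaves two classes unordered, so connectedness fails.
gappy = prec - {("mod_epistemic", "asp_habitual")}
print(is_connected(classes, gappy))  # False
```

Any relation read off a single sequence passes all three checks, which is why a hierarchy of complements leads one to expect linearity in the first place.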
In the face of apparent counterexamples to asymmetry, Cinque would have
to argue (and indeed, he does for some cases) that one of the expressions
involved can occupy two distinct positions, and therefore must be treated as
belonging to two distinct (though possibly semantically related) classes. If
two expressions cannot cooccur (counterexample to connectedness), they must
either belong to the same class, or one has to show that there are independent,
possibly semantic reasons for this. Cinque shows that transitivity holds for
some cases. Of course, demonstrating this for every possible combination would
be a nearly impossible task. If counterexamples to transitivity exist, they
would be particularly hard to handle for the approach under discussion. Such
a counterexample would always be of the form that, for some triplet of adverbs
a1, a2, a3, it holds that (a1 ≺ a2 ∧ a2 ≺ a3 ∧ a1 ⊀ a3). In this case, one would
have to develop a complementary theory T of adverb ordering which would
explain why a1 ⊀ a3. More specifically, no amount of assigning positions to
expressions in a linear sequence could accommodate such a triplet. Given that
we do not, in general, want to explain the same phenomenon twice, this would
inevitably lead to a tension: one could be tempted to try to develop T into
a full-blown alternative. Depending on its nature, such an alternative could
then allow us to return to a syntactically null theory of adverb placement and
the distribution of other functional material. This is, in a nutshell, what the
present thesis is all about. The real question here is about the use of syntactic
selection and checking. According to Cinque, adverbs are ordered because there
exists a universal inventory of functional heads, because these are universally
ordered by syntactic selection, and because the adverbs enter into checking
relations with distinct functional heads. Suppose that T is a semantic account
of adverb ordering. Then one could abandon the selection + checking analysis
of adverb ordering, if one could also account for patterns like the one in table
1.1, with a possibly different theory of verb placement.
A counterexample to transitivity
The Norwegian triplet of adverbs muligens (‘possibly’), ikke (‘not’) and alltid
(‘always’) is not linearly ordered and, hence, cannot be accommodated into
a sequence of functional heads. If we consider pairs of adverbs, we find that
muligens must precede ikke; ikke must precede alltid; but muligens does not
have to precede alltid. Thus, we have a counterexample to transitivity. This
behavior is illustrated with the Norwegian examples below.11
(4) a. Ståle har muligens ikke spist hvetekakene sine.
S has possibly not eaten the-wheaties his
“Stanley possibly hasn’t eaten his wheaties.”
b. * Ståle har ikke muligens spist hvetekakene sine.
S has not possibly eaten the-wheaties his
(5) a. Ståle hadde ikke alltid spist hvetekakene sine.
S had not always eaten the-wheaties his
“Stanley hadn’t always eaten his wheaties.”
b. * Ståle hadde alltid ikke spist hvetekakene sine.
S had always not eaten the-wheaties his
Natural examples where alltid ‘always’ precedes muligens ‘possibly’ are not
frequent, but they do exist. Their low frequency probably has pragmatic reasons:
one does not usually talk about the frequency with which something is epis-
temically possible. The English example (6a) was found on the internet, as part
of an advertisement for an internet game. It contrasts minimally with (6b).
11 In English the sequence didn’t possibly may be slightly better than the sharply ungram-
matical Norwegian example above, although most English speakers find it degraded. Cer-
tainly, English allows examples like (i), with two modals, whereas the corresponding Norwe-
gian example (ii) is still sharply out.
(i) Stanley couldn’t possibly eat his wheaties.
(ii) *Ståle kunne ikke muligens spise hvetekakene. (‘S could not possibly eat the-wheaties’)
Similarly, in English the sequence always not might be marginally acceptable while the Nor-
wegian counterpart (alltid ikke) is sharply ungrammatical.
The Norwegian translation of (6a), i.e. (7a) is also grammatical, and in this
case, too, there is a contrast between alltid ‘always’ and aldri ‘never’ (7b).12
(6) a. This is a fun, free game where you’re always possibly a click
away from winning $1000!
b. ?? This is a fun, free game where you’re never possibly further
than a click away from winning $1000!
(7) a. Dette er et morsomt, gratis spill hvor spillerne alltid
this is a fun free game where the-players always
muligens er et klikk fra å vinne $1000!
possibly are one click from to win $1000
b. ?? Dette er et morsomt, gratis spill hvor spillerne aldri
this is a fun free game where the-players never
muligens er lenger enn et klikk fra å vinne $1000!
possibly are further than one click from to win $1000
This fits our description of counterexamples to transitivity: alltid ≺ muligens
(7a); muligens ≺ ikke (4a), but alltid ⊀ ikke (5b).
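The failure of linearity can be checked mechanically. The following is a minimal sketch in Python, not part of the original analysis. It encodes the "can precede" judgments from (4), (5) and (7) as a set of pairs, searches for triplets that violate transitivity, and asks whether any single linear sequence reproduces the judgments. The names CAN_PRECEDE, transitivity_failures and fits_single_sequence are illustrative assumptions, and the second function assumes that each adverb occupies exactly one position.

```python
from itertools import permutations

ADVERBS = ("muligens", "ikke", "alltid")

# (first, second) means "first can precede second", as judged in (4), (5), (7).
CAN_PRECEDE = {
    ("muligens", "ikke"),    # (4a)
    ("ikke", "alltid"),      # (5a)
    ("alltid", "muligens"),  # (7a)
    ("muligens", "alltid"),  # muligens may also precede alltid
}


def transitivity_failures(relation, items):
    """Return triplets (a, b, c) with a < b and b < c but not a < c."""
    return [
        (a, b, c)
        for a, b, c in permutations(items, 3)
        if (a, b) in relation and (b, c) in relation and (a, c) not in relation
    ]


def fits_single_sequence(relation, items):
    """True if some linear sequence of the items induces exactly `relation`.

    Assumption: each item occupies exactly one position in the sequence.
    """
    for sequence in permutations(items):
        induced = {
            (sequence[i], sequence[j])
            for i in range(len(sequence))
            for j in range(i + 1, len(sequence))
        }
        if induced == relation:
            return True
    return False


if __name__ == "__main__":
    print(transitivity_failures(CAN_PRECEDE, ADVERBS))
    print(fits_single_sequence(CAN_PRECEDE, ADVERBS))
```

Under these assumptions the search reports the triplet (alltid, muligens, ikke) discussed in the text among the failures, and finds no single sequence that fits the data.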
Anticipating somewhat, my explanation why alltid ‘always’ can’t outscope
the negation is that universal quantifiers never can (Beghelli and Stowell, 1997).
Consider for example (8). This example is very odd, unless read with heavy
stress on everybody, in which case the sentence is an emphatic denial of the
sentence everybody showed up, i.e. the negation outscopes the universal, not
the other way around.
(8) Everybody didn’t show up.
I will propose that the reason why muligens ‘possibly’ can’t occur under nega-
tion is that it is a positive polarity item. This proposal is developed in detail
in chapter 2.
Bobaljik’s paradox
In a squib in Glot, Bobaljik (1999) (see also Svenonius (2001)) argues that
there are empirical considerations which, when taken together with the idea
of a single, linear fseq accommodating the distribution of different kinds of
expressions, lead to a paradox. His observation is that one finds phenomena
where expressions of type X are rigidly ordered, and expressions of type Y
are rigidly ordered, but where any x ∈ X can precede or follow any y ∈ Y as
long as the ordering requirements internal to X and Y are both satisfied. It
seems to follow that one cannot accommodate both X-type expressions and Y-type
expressions in a unique linear fseq. This behavior is illustrated with the
12 One could argue that one or both of the adverbs are constituent modifiers, rather than
sentence modifiers and hence irrelevant for our purposes. This point is taken up in chapter
2 (page 45) for the same examples.
relative ordering of arguments and the relative ordering of adverbs in Norwegian
in the examples below.
(9) a. Derfor ga Jens Kari kyllingen tydeligvis ikke
therefore gave J K the-chicken evidently not
lenger kald.
any.longer cold
b. Derfor ga Jens Kari tydeligvis kyllingen ikke lenger kald.
c. Derfor ga Jens tydeligvis Kari kyllingen ikke lenger kald.
d. Derfor ga Jens tydeligvis Kari ikke kyllingen lenger kald.
e. Derfor ga Jens tydeligvis Kari ikke lenger kyllingen kald.
f. Derfor ga Jens tydeligvis ikke lenger Kari kyllingen kald.
g. Derfor ga tydeligvis Jens ikke lenger Kari kyllingen kald.
h. Derfor ga tydeligvis ikke Jens lenger Kari kyllingen kald.
i. Derfor ga tydeligvis ikke lenger Jens Kari kyllingen kald.
j. * Derfor ga Jens ikke tydeligvis Kari lenger kyllingen kald.
k. * Derfor ga Jens tydeligvis ikke kyllingen lenger Kari kald.
The problem that patterns like (9) raise is the following. We see that arguments
must cooccur in one specific order (modulo topicalization and wh-movement),
but that adverbs can intervene between the arguments. Moreover, we see that
the adverbs must occur in a specific order, but that arguments can intervene
between them. Thus, it seems hopeless to try to account for (9) in terms of
specific positions for both adverbs and arguments. In other words, we cannot
accommodate patterns like (9) in a linear fseq. Bobaljik (1999) shows that
similar problems arise with respect to the relative ordering of auxiliaries on
the one hand, and the relative ordering of adverbs on the other. I discuss this
‘verb-placement paradox’ in detail in chapter 4. One could rescue the unique
fseq approach by assuming for (9) that the adverbs occupy unique specifiers,
and that the arguments can scramble freely among the adverbs, subject to
the requirement that scrambling must preserve the order of the arguments. A
similar approach could be proposed for the verb-placement paradox, but that
would seem to create problems for Cinque’s argument for functional heads on
the basis of differential verb placement in Romance. It is tantamount to giving
up the idea of a unique linear fseq. Bobaljik (1999) concludes that we should
expand the theory to allow for multi-dimensional phrase markers (Åfarli, 1996).
I think we can keep the number of dimensions down if we rethink the idea that
the distributional phenomena under discussion should be handled in terms of
fseqs. I return to a treatment of (9) in chapter 3, and to the verb placement
paradox in chapter 4.
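The descriptive generalization behind (9) can be stated as two independent order checks. The sketch below, in Python, is only an illustration and not an analysis: it tests whether a string keeps the arguments in the order Jens, Kari, kyllingen and the adverbs in the order tydeligvis, ikke, lenger, regardless of how the two classes are interleaved. The function names are assumptions introduced here, and only a subset of the examples in (9) is included.

```python
ARGUMENTS = ["Jens", "Kari", "kyllingen"]
ADVERBS = ["tydeligvis", "ikke", "lenger"]


def keeps_order(words, ordered_class):
    """True if the members of ordered_class appear in `words` in that order."""
    found = [w for w in words if w in ordered_class]
    return found == ordered_class


def respects_both(sentence):
    """Check argument order and adverb order separately, ignoring interleaving."""
    words = sentence.split()
    return keeps_order(words, ARGUMENTS) and keeps_order(words, ADVERBS)


# A subset of the examples in (9), without punctuation.
EXAMPLES = {
    "9a": "Derfor ga Jens Kari kyllingen tydeligvis ikke lenger kald",
    "9e": "Derfor ga Jens tydeligvis Kari ikke lenger kyllingen kald",
    "9i": "Derfor ga tydeligvis ikke lenger Jens Kari kyllingen kald",
    "9j": "Derfor ga Jens ikke tydeligvis Kari lenger kyllingen kald",
    "9k": "Derfor ga Jens tydeligvis ikke kyllingen lenger Kari kald",
}

if __name__ == "__main__":
    for label, sentence in EXAMPLES.items():
        print(label, respects_both(sentence))
```

The two checks never refer to each other, which is the point of the paradox: each class is ordered internally, but neither fixes a position relative to the other.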
1.3.2 Ernst’s Fact-Event object calculus
Ernst (2001) gives a semantic account of adverb ordering by analyzing different
classes of adverbs as modifiers of different kinds of semantic objects. By way
of example, according to Ernst, an adverb like completely modifies events
(henceforth i), whereas e.g. not modifies propositions (p), and paradoxically
modifies facts (f ). He then imposes the following system of type conversion
on his ontological primitives, his FEO-calculus:
i ⟹ p ⟹ f
Ernst does not use type-theoretical notation. He states, for example that an
adverb like completely is an event modifier, and uses the notation in (10a) for
that, indicating that completely wants a constituent denoting an event as a
syntactic complement, and projects a syntactic node denoting an event. I find
this mix of syntactic and semantic notions slightly confusing, so I will assume
that what Ernst intends to say could equally well be given the type-theoretical
notation in (10b), i.e. completely is a function from event descriptions to event
descriptions. The types for not and paradoxically are given in (10c-10d).
(10) a. [EVENT complete [EVENT ]]
b. completely: ⟨i, i⟩
c. not: ⟨p, p⟩
d. paradoxically: ⟨f, f⟩
Given the FEO-calculus, this derives the fact that these three adverbs must
occur in the following order when they cooccur: paradoxically > not >
completely. An adverb of type ⟨x, x⟩ can apply to an expression of type y
iff y ⟹ x by the FEO-calculus. Thus the tree in figure 1.5 represents a gram-
matical structure, because all the type transitions respect the FEO-calculus.
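The ordering prediction can be made concrete with a small type checker. This is a sketch in Python under stated assumptions, not Ernst's own formalization: the three FEO types are ranked i < p < f, each adverb has the identity type given in (10), an adverb may apply to an expression whose type equals or converts upward to its own, and the lowest constituent is assumed to denote an event (type i). The names FEO_RANK, ADVERB_TYPE and derivable are illustrative.

```python
from itertools import permutations

# Conversion is assumed to go upward only: i => p => f.
FEO_RANK = {"i": 0, "p": 1, "f": 2}

# Identity types <x, x> from (10): each adverb maps type x to type x.
ADVERB_TYPE = {"completely": "i", "not": "p", "paradoxically": "f"}


def derivable(surface_order, base="i"):
    """surface_order lists adverbs from highest (leftmost) to lowest.

    The lowest adverb applies first. An adverb of type <x, x> applies to an
    expression of type y only if y is x or converts upward to x.
    """
    current = base
    for adverb in reversed(surface_order):
        wanted = ADVERB_TYPE[adverb]
        if FEO_RANK[current] > FEO_RANK[wanted]:
            return False  # there is no downward conversion
        current = wanted
    return True


if __name__ == "__main__":
    for order in permutations(ADVERB_TYPE):
        print(" > ".join(order), derivable(order))
```

With these assumptions, only the order paradoxically > not > completely passes the check.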
The FEO-calculus looks suspiciously similar to Starke’s fseq. Given that
the latter is just an equivalent reformulation of the selectional sequences that
the FEO-calculus is intended to replace, this is a state of affairs that requires
some attention. Especially so, since Ernst does assume that there are functional
projections like CP, TP and PredP etc. in the clausal structure, and that these
are ordered by fseq. His idea is that adverbs can attach freely to any functional
projection as long as the semantic requirements of the adverbs (FEO-calculus)
are respected. Hence, Ernst has it that, while adverbs are ordered by his FEO-
calculus, the categories that the adverbs attach to are ordered by an orthogonal
relation, fseq. Thus, if we suppose that in some language the verb moves to T,
as shown in figure 1.6, we expect adverbs to be able to freely precede or follow
the verb. A congenial analysis is proposed by Svenonius (2001). Suppose that
the language in question allows all of the three adverbs in figure 1.5 to precede
or follow the verb. This would then be analyzed as attaching the adverbs to VP
or TP in accordance with the calculus. So it is predicted that when the adverbs
[f [⟨f, f⟩ paradoxically]
   [f [p [⟨p, p⟩ not]
         [p [i [⟨i, i⟩ completely]
               [i ...]]]]]]
Figure 1.5: Adverbs and FEO-transitions
[TP adv* [TP [T V T] [VP adv* VP]]]
Figure 1.6: verb/adverb placement
cooccur, they must respect the ordering paradoxically>not>completely, but the
verb can appear anywhere in the sequence. This allows Ernst and Svenonius
to eliminate some functional heads. Interestingly, it also seems to solve the
paradoxes discussed by Bobaljik (1999) (see above),13 and, as we shall see, the
machinery is technically capable of handling transitivity failures.
However, there are also some problems. To my mind, the most difficult one
is that this setup seems to force us to abandon the view that T has semantic
import. For suppose that it does, i.e. that T denotes tense, as standardly
assumed. Then T should also have selectional properties, not only with respect
to fseq, but with respect to the FEO-calculus as well. Given that, in the setup
we are considering, paradoxically (type ⟨f, f⟩) can follow T, and thus must be
attached to VP, and that, in that case, VP must denote a fact, it follows that T
must be able to apply to a fact. Similarly, given that completely (type ⟨i, i⟩) can
attach above T, it appears to follow that the result of applying T to VP must
be an FEO type that completely can apply to, i.e. i. Hence, T must be of type
⟨f, i⟩; it takes a fact and returns an event. But this move destroys the account
of adverb ordering, because if T has this type, we could attach paradoxically to
VP, apply T to the result of that, yielding an event, and then apply completely
13 See Svenonius (2001) for discussion.
to TP. This results in the ungrammatical sequence completely V paradoxically
VP. In fact, given this type for T, all of the six logically possible orderings of
the three adverbs should be fine, as long as one adverb applies below, and one
above T. So the FEO-calculus would cease to have an effect. If we constrain
the type for T somewhat, we can rule out some adverb orderings, but only
at the cost of losing the account of verb placement. In other words, if we do
that, we must have more positions for the verb to move to in order to account
for the verb placement facts in the hypothetical language we are considering,
thus potentially destroying the resolution of Bobaljik’s paradoxes. The only
way out seems to be to assume that T mindlessly passes on the FEO-type of
its complement. But that creates some problems for the view that T denotes
tense, since tense presumably does care about the semantic type of its semantic
argument. Hence it is not clear that fseq and the FEO-calculus can be treated as
orthogonal after all. In that case, the account for the different verb positions
in terms of figure 1.6 would seem to be much less straightforward than Ernst
and Svenonius assume.
Although Ernst’s calculus is linear, it does not rule out non-linearity of
adverb ordering. For instance, we have seen that we could assign the type
⟨f, i⟩ to some adverbs, and these would then be able to precede or follow any
other adverb. Furthermore, since Ernst’s calculus is stipulated, he could in
principle impose any other relation on his ontological primitives, and avoid
linearity of the calculus in this way. Needless to say, the latter point may also
be construed as a weakness of this kind of approach: The FEO-calculus does
for Ernst what syntactic selection does for Cinque. Hence Ernst’s approach
does not explain adverb ordering unless the calculus can be derived from more
fundamental properties of the particular ontological primitives he uses.
Let us see how our counterexample to transitivity can be accommodated
in an Ernst-type calculus. We refer to the first element in the type ⟨x, y⟩ as
its argument and write A(adv) for the argument of an adverb. Similarly we
write R(adv) for the second element in the type, i.e. the result of the function.
As before, we write Adv1 ≺ Adv2 for “Adv1 can precede Adv2”.14 Finally,
we operate with a calculus a1 ≤ ... ≤ an where the ai are types and ≤ is a linear
ordering of the types. In other words, we abstract away from Ernst’s particular
interpretation of his types as “events”, “propositions”, etc. This is in order to
be able to focus on the structural properties of his calculus. In general, the
following biconditional can be seen to hold:
Adv1 ≺ Adv2 ⇔ R(Adv2) ≤ A(Adv1)
From this it follows (by contraposition of [⇐] and [⇒]) that
Adv1 ⊀ Adv2 ⇔ R(Adv2) ≰ A(Adv1)
14 Strictly speaking, ≺ should be read can apply after, but since, in the relevant cases,
precedence=scope, we ignore this.
and, by linearity of ≤ (i.e. connectedness), that
Adv1 ⊀ Adv2 ⇔ A(Adv1) < R(Adv2).
We can now use the ordering facts above to reason with the types of our triplet
of adverbs. The ordering facts are given on the left and the conclusions to be
drawn concerning their types are given on the right.
ikke ≺ alltid         R(alltid) ≤ A(ikke)
alltid ⊀ ikke         A(alltid) < R(ikke)
muligens ≺ ikke       R(ikke) ≤ A(muligens)
ikke ⊀ muligens       A(ikke) < R(muligens)
muligens ≺ alltid     R(alltid) ≤ A(muligens)
alltid ≺ muligens     R(muligens) ≤ A(alltid)
By compiling this into one sequence, we get the following:
R(alltid) ≤ A(ikke) < R(muligens) ≤ A(alltid) < R(ikke) ≤ A(muligens)
If, in order to minimize the number of types, we always read ≤ as =, we need
three basic types a, b, c such that a < b < c and the following types for the
adverbs, which we simply read off the ordering above:
ikke        ⟨a, c⟩
muligens    ⟨c, b⟩
alltid      ⟨b, a⟩
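The same assignment can be recovered by exhaustive search. The following Python sketch is an illustration only: it represents the three types a < b < c as the integers 0, 1, 2, encodes the six ordering facts as judgments, and applies the biconditional Adv1 ≺ Adv2 ⇔ R(Adv2) ≤ A(Adv1) to every possible pair of argument and result types for each adverb. The names JUDGMENTS, consistent and solutions are assumptions introduced here.

```python
from itertools import product

TYPES = (0, 1, 2)  # stand-ins for a < b < c
ADVERBS = ("ikke", "muligens", "alltid")

# True means "first can precede second"; False means it cannot.
JUDGMENTS = {
    ("ikke", "alltid"): True,
    ("alltid", "ikke"): False,
    ("muligens", "ikke"): True,
    ("ikke", "muligens"): False,
    ("muligens", "alltid"): True,
    ("alltid", "muligens"): True,
}


def consistent(types):
    """types maps each adverb to a pair (argument type, result type)."""
    for (first, second), can_precede in JUDGMENTS.items():
        argument_of_first = types[first][0]
        result_of_second = types[second][1]
        # Adv1 can precede Adv2 iff R(Adv2) <= A(Adv1).
        if (result_of_second <= argument_of_first) != can_precede:
            return False
    return True


def solutions():
    """Enumerate every assignment of <argument, result> types that fits."""
    pairs = list(product(TYPES, repeat=2))
    found = []
    for combo in product(pairs, repeat=len(ADVERBS)):
        types = dict(zip(ADVERBS, combo))
        if consistent(types):
            found.append(types)
    return found


if __name__ == "__main__":
    for solution in solutions():
        print(solution)
```

With three types, the search returns a single assignment: ikke (0, 2), muligens (2, 1) and alltid (1, 0), i.e. ⟨a, c⟩, ⟨c, b⟩ and ⟨b, a⟩.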
As can be seen from our discussion above, it also follows that, if we have just
three types this is the only type assignment which would work, given a linear
type conversion calculus. If we have a category S of type a, we can show that
the following facts obtain.
derivable                        not derivable
ikke(alltid(S))                  *alltid(ikke(S))
muligens(ikke(S))                *ikke(muligens(S))
muligens(alltid(S))
alltid(muligens(S))
ikke(alltid(muligens(S)))
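The entries in this table can be verified with a few lines of Python. This is a hedged sketch, not part of the original text: it assumes the types ⟨a, c⟩, ⟨c, b⟩ and ⟨b, a⟩ for ikke, muligens and alltid, a base category S of type a, and upward conversion only along a < b < c. The function name result_type is an assumption.

```python
RANK = {"a": 0, "b": 1, "c": 2}

# adverb -> (argument type, result type)
TYPE = {
    "ikke": ("a", "c"),
    "muligens": ("c", "b"),
    "alltid": ("b", "a"),
}


def result_type(adverbs_outer_to_inner, base="a"):
    """Return the type of adv1(adv2(...(S))) or None if it is not derivable.

    The innermost adverb applies first. An adverb applies only if the current
    type equals its argument type or converts upward to it.
    """
    current = base
    for adverb in reversed(adverbs_outer_to_inner):
        argument, result = TYPE[adverb]
        if RANK[current] > RANK[argument]:
            return None
        current = result
    return current


if __name__ == "__main__":
    for order in (
        ["ikke", "alltid"],
        ["alltid", "ikke"],
        ["muligens", "ikke"],
        ["ikke", "muligens"],
        ["muligens", "alltid"],
        ["alltid", "muligens"],
        ["ikke", "alltid", "muligens"],
    ):
        print(order, result_type(order))
```

The two starred combinations return None; the other five return a type.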
I give the derivation of the three examples (ikke(alltid(S))), (alltid(muligens(S)))
and (muligens(ikke(S))) in figure 1.7 below. Premises are written above the
lines, and conclusions below. If the conclusion does not follow trivially, the
relevant application of the calculus is written to the right of the line. The
notation α : x indicates that the expression α is of type x.
The calculus a < b < c is obviously isomorphic to Ernst’s FEO calculus,
i.e. a = i, b = p and c = f. Thus, what we have derived is that Ernst could
handle our problematic triplet by assigning the type ⟨i, f⟩ to ikke, ⟨f, p⟩ to
muligens and ⟨p, i⟩ to alltid. In practice, Ernst usually assigns identity types to
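                    S : a
                    ------- a<b
alltid : ⟨b, a⟩     S : b
--------------------------
ikke : ⟨a, c⟩       (alltid(S)) : a
-----------------------------------
(ikke(alltid(S))) : c

                    S : a
                    ------- a<c
muligens : ⟨c, b⟩   S : c
--------------------------
alltid : ⟨b, a⟩     (muligens(S)) : b
-------------------------------------
(alltid(muligens(S))) : a

ikke : ⟨a, c⟩       S : a
--------------------------
muligens : ⟨c, b⟩   (ikke(S)) : c
---------------------------------
(muligens(ikke(S))) : b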
Figure 1.7: Derivations for the non-transitive triplet
adverbs,15 i.e. they always return the same type as the type of their argument.
Nothing prevents him from assigning types like ⟨i, f⟩, and given our discussion,
it seems that he has to assign just this type to Norwegian ikke. This move
seems to weaken the plausibility of his account, however. One might find it
“plausible” or “intuitive” that e.g. completely should be a function from events
to events. But how “intuitive” is it that the Norwegian negation ikke is a
function from events to facts, or that muligens ‘possibly’ is a function from
facts to propositions rather than from facts to events, for example? Are all
possible type-transitions attested?
Another problem with Ernst’s system is that he does not explain what sort
of things facts are. At one point he states that they are propositions that the
speaker is committed to, and at another, he seems to say that they are factive
propositions in the sense of being presupposed. For example, he states that the
reason why paradoxically cannot occur under negation is that the negation is a
propositional modifier, while paradoxically takes facts. Then he says that this
expresses the intuition that one cannot negate a presupposed proposition, i.e.
one cannot presuppose something to be true and assert its negation at the
15 He argues that evidentials like obviously map facts to events. This is to account for
examples like (i), where obviously occurs under negation (Ernst, 2001, p107).
(i) Sally was not obviously affected by her winning the award.
It seems to me that this example might be an instance of obviously used as a manner adverb.
Note for example that other cases of evidentials under negation appear to be less good.
(ii) John hasn’t evidently gone home.
(ii) appears to be good with a so-called meta-linguistic or contrastive negation, but these
are known to be special in any event (see chapter 2). If evidentials create events from facts,
Ernst must make extra assumptions to rule out cases where other event modifiers end up
outscoping an epistemic adverb when there is an intervening evidential.
same time. But if we define a presupposed proposition as “a proposition which
cannot occur under negation”, Ernst’s explanation is circular.16 If we define
it as “a proposition which is required to be part of the common ground”, it
seems that paradoxically does not trigger this kind of presupposition. It does
not appear to give rise to infelicity when the modified proposition is known
to be false, for example, which is the hallmark of such presupposition triggers.
Thus, (11a) is false, while (11b) is odd, because the definite article presupposes
the existence of a president of the UK. Similarly, (11c) is odd because of the
strong factivity of it is sad that.17 Is (11d) false or odd? The informants I have
consulted find it false, but not odd in the same way as (11b). Interestingly,
the adjectival version (11e) does seem to give rise to infelicity, so it seems
that paradoxical, in fact, is a presupposition trigger, more so, apparently, than
paradoxically. But the adjective occurs happily under negation (11f), so, even
if we would grant that the presupposition of the adjective is somehow inherited
by the adverb, this would not, in and of itself, explain why the latter cannot
occur under negation.
(11) a. There is a current president of the UK playing ping pong in
the corridor.
b. The current president of the UK is playing ping pong in the
corridor.
c. It is sad that there is a current president of the UK playing
ping pong in the corridor.
d. Paradoxically, there is a current president of the UK playing
ping pong in the corridor.
e. It is paradoxical that there is a current president of the UK
playing ping pong in the corridor.
f. It is not paradoxical that someone is playing ping pong in the
corridor.
Thus, although I am very sympathetic to the idea that the
facts discussed by Cinque (1999) should ultimately be derived from more fun-
damental considerations, it does not seem to me that the primary tool to do
so can be enrichment and manipulation of ontological categories. Hence, I try
to develop an alternative semantic analysis of adverb ordering in chapter 2,
which crucially does not rely on ontology. The analysis pursued there is essen-
tially that several sentence adverbs are (positive) polarity items, and that this
suffices to derive surprisingly many adverb ordering effects. This allows us to
16 The standard definition of a presupposition is that it is an implication which is preserved
under negation. Although this is clearly compatible with Ernst’s explanation, it does not
offer support: the definition does not apply in this case for the very reason that paradoxically
cannot occur under negation.
17 It is “strongly” factive in the sense that it resists accommodation, thus contrasting with
the verb know which is “weakly” factive in this sense.
maintain the positive results achieved by Ernst (2000) and Svenonius (2001),
while sidestepping the difficulties their approach encounters.
1.4 Antisymmetry and processing
1.4.1 Introduction
From Kayne’s Linear Correspondence Axiom (LCA) (Kayne, 1994), it follows
that phrase markers universally conform to specifier-head-complement order,
and there can be no “right-adjunction”. The issue remains controversial, in
particular with respect to sentence-final adjuncts. Ackema and Neeleman
(2002) argue that right-adjunction should be maintained as a theoretical
option, and that there are alternative ways of deriving the empirical results
that antisymmetric analyses can boast of. In particular, Ackema and Neele-
man claim that universal word-order asymmetries can be derived as effects
of limitations on on-line parsing. I find this proposal a very interesting one
that deserves careful consideration. To anticipate, the conclusion will be that,
although there may well be other ways of formulating it which would work
better, Ackema and Neeleman’s account fails to supply a viable alternative to
antisymmetry.
1.4.2 Antisymmetry
The LCA states that for a phrase marker P with the set of terminals T and
the set N of nonterminals,
(12) Linear Correspondence Axiom (LCA)
d(A) is a linear ordering of T,
where
A = {⟨n, m⟩ ∈ N × N | n asymmetrically c-commands m};
for any n ∈ N,
d(n) = {t ∈ T | n dominates t};
and for any n, m ∈ N,
d⟨n, m⟩ = {⟨t1, t2⟩ ∈ T × T | t1 ∈ d(n) and t2 ∈ d(m)};
d(A) = ⋃ {d⟨n, m⟩ | ⟨n, m⟩ ∈ A}.
Kayne defines (symmetric) c-command as follows:
(13) c-command
n c-commands m iff n, m are categories, n excludes m, and every cate-
gory that dominates n, dominates m.
A nonterminal n x-projected from terminal t is a category iff it reflexively
dominates all other x-projections of t, its segments. “x-projection” can vary
between head-projection and phrasal projection. A node n is dominated by a
category c iff all segments of c dominate n. c excludes n iff no segment of c
dominates n, and c includes n iff some segment of c dominates n. Kayne shows
that a number of stipulative elements of phrase structure theory (X-bar theory)
fall out as theorems of this set up, e.g. binary branching, one head per phrasal
projection, etc. The notions he uses, i.e. c-command, segments and categories
etc., are, of course, independently motivated. However, the system’s reliance
on the distinction between terminals and nonterminals has been criticized in
Chomsky (1994). Kayne himself has pointed out that the system would be
more elegant if it had not relied on this very specific definition of c-command
(Kayne, 2000). In Kayne (1994), he also notes that nothing apparently prevents
c-command from mapping onto postcedence rather than precedence. His solution to
this problem involves some very specific assumptions about time (i.e. that it
is linear) which are not uncontroversial in physical and philosophical circles.
Thus one might want to look for alternative formulations which rid the system
of these features.
I illustrate how the system rules out right adjunction to XP, since this is of
crucial importance here. Consider the tree in figure 1.8. The set of pairs ⟨n, m⟩
such that n asymmetrically c-commands m in this tree, i.e. A, is given in (14).
[XP [XP [X x] [YP [Y y]]] [ZP [Z z]]]
Figure 1.8: Right adjunction
(14) {⟨ZP, X⟩, ⟨ZP, YP⟩, ⟨ZP, Y⟩, ⟨X, Y⟩}
The lower segment of XP does not c-command Z, because it is not a category.
YP does c-command X, but not asymmetrically. Finally, ZP asymmetrically
c-commands X and YP, because it (vacuously) holds that every category dom-
inating ZP dominates X, YP. Hence, d(A) comes out as (15), which has it that
z precedes x, y, rather than following them.
(15) {⟨z, x⟩, ⟨z, y⟩, ⟨x, y⟩}
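The computation of d(A) for the right-adjunction tree can be replayed in code. The Python sketch below is one possible encoding of the definitions in (12) and (13), offered as an illustration rather than as Kayne's own formalization. The segment names XP_hi and XP_lo, the dictionary layout, and the choice to treat "a segment dominates a category" as "it dominates all of that category's segments" are assumptions made here.

```python
# Figure 1.8 as a map from each segment to its daughters.
# "XP_hi" and "XP_lo" are assumed names for the two segments of category XP.
DAUGHTERS = {
    "XP_hi": ["XP_lo", "ZP"],
    "XP_lo": ["X", "YP"],
    "X": ["x"],
    "YP": ["Y"],
    "Y": ["y"],
    "ZP": ["Z"],
    "Z": ["z"],
}
# category -> its segments
SEGMENTS = {
    "XP": ["XP_hi", "XP_lo"],
    "X": ["X"], "YP": ["YP"], "Y": ["Y"], "ZP": ["ZP"], "Z": ["Z"],
}
TERMINALS = ["x", "y", "z"]


def below(segment):
    """All nodes properly dominated by a segment."""
    out = set()
    for daughter in DAUGHTERS.get(segment, []):
        out.add(daughter)
        out |= below(daughter)
    return out


def segments_of(node):
    """Terminals count as one-segment nodes."""
    return SEGMENTS.get(node, [node])


def dominates(category, node):
    """A category dominates a node iff every one of its segments does."""
    target = set(segments_of(node))
    return all(target <= below(s) for s in SEGMENTS[category])


def excludes(category, node):
    """A category excludes a node iff none of its segments dominates it."""
    target = set(segments_of(node))
    return not any(target <= below(s) for s in SEGMENTS[category])


def c_commands(n, m):
    """(13): n excludes m, and every category dominating n dominates m."""
    if n == m or not excludes(n, m):
        return False
    return all(dominates(k, m) for k in SEGMENTS if dominates(k, n))


def asymmetric_pairs():
    return {
        (n, m)
        for n in SEGMENTS
        for m in SEGMENTS
        if c_commands(n, m) and not c_commands(m, n)
    }


def d(category):
    return {t for t in TERMINALS if dominates(category, t)}


def d_of_A():
    return {(t1, t2) for n, m in asymmetric_pairs() for t1 in d(n) for t2 in d(m)}


if __name__ == "__main__":
    print(sorted(asymmetric_pairs()))
    print(sorted(d_of_A()))
```

Under this encoding, d(A) comes out as the three pairs in (15). Applied literally, the definitions as encoded here also return the pair ⟨ZP, XP⟩ in addition to those in (14); since the category XP dominates only x and y, that pair adds nothing new to d(A).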
The most compelling evidence for the LCA comes from the fact that it shows
promise to derive word-order universals of the following kind: Although there
are many languages that exhibit second position phenomena (e.g. Germanic
verb-second, Slavic clitic-second, etc.), there are no known penultimate position
phenomena. For instance, there are no languages where the finite verb ends up
in the second to last position of the clause. Another asymmetry of this kind is
the one noted by Greenberg (1966) (his universal 20), which states that
when any or all of the items (demonstrative, numeral, and descrip-
tive adjective) precede the noun, they are always found in that
order. If they follow, the order is either the same or its exact op-
posite.
In other words, there is a (universal) gap in the ordering possibilities: we find
the orders (16a)-(16c), but not the one in (16d).
(16) a. ⟨dem, num, adj, N⟩
b. ⟨N, dem, num, adj⟩
c. ⟨N, adj, num, dem⟩
d. * ⟨adj, num, dem, N⟩
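Greenberg's statement, as quoted above, can be turned into a simple predicate over orders. The Python sketch below implements the quoted wording literally and nothing more: prenominal modifiers must appear in the order dem, num, adj, and postnominal modifiers must appear in that order or in its exact opposite. The function name conforms_to_u20 is an assumption, and the treatment of orders with modifiers on both sides of the noun is one reading of the quotation, not a claim about the data.

```python
CANONICAL = ["dem", "num", "adj"]


def conforms_to_u20(order):
    """Check an order such as ["N", "adj", "num", "dem"] against the statement."""
    noun_position = order.index("N")
    before = [w for w in order[:noun_position] if w in CANONICAL]
    after = [w for w in order[noun_position + 1:] if w in CANONICAL]

    def same_as_canonical(items):
        return items == [c for c in CANONICAL if c in items]

    def exact_opposite(items):
        return items == [c for c in reversed(CANONICAL) if c in items]

    # Prenominal: canonical order only. Postnominal: canonical or reversed.
    return same_as_canonical(before) and (
        same_as_canonical(after) or exact_opposite(after)
    )


if __name__ == "__main__":
    for order in (
        ["dem", "num", "adj", "N"],
        ["N", "dem", "num", "adj"],
        ["N", "adj", "num", "dem"],
        ["adj", "num", "dem", "N"],
    ):
        print(order, conforms_to_u20(order))
```

The predicate accepts (16a)-(16c) and rejects (16d).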
The phenomenon at hand appears to be considerably more general than what
Greenberg’s original formulation suggests. More specifically, it generalizes to
crosslinguistic ordering patterns involving verbal clusters (Koopman and
Szabolcsi, 2000; Cinque, 2000b), adverbial PPs (Cinque, 2002), as well as
sentence adverbs (Nilsen and Vinokurova, 2000), and verbal affixes (Cinque, 1999).
Hence, it seems that one could restate it as a generalization about stacking of
modifiers, adequately defined.
Given the restriction that a moved category must c-command its trace,
it follows from the absence of right-adjunction (i.e. the LCA) that all
movement must be to the left. Hence there could be no derivation of the order
⟨adj, num, dem, N⟩, and Greenberg’s U20 would follow as a theorem of the
system. Ackema and Neeleman (2002) give one illustration of a system which can
derive the GU20, taken from Cinque’s work. We give another illustration of
a toy system that derives a generalized U20. Suppose that we distinguish a
category of “x-raisers”, (xr) the class of elements that participate in U20-like
word-order patterns. When an xr has a complex complement, i.e. a comple-
ment which itself has a specifier, it can attract the specifier of its complement,
the head of its complement or the entire complement to its own specifier. By
iterating attraction of specifier, we end up moving the most deeply embed-
ded specifier to the highest position (climbing). If the complement of an xr is
simplex, e.g. a simple NP, this NP must raise. This is illustrated in (17).
(17) [xr1P xr1 [NP N]]
I move NP
[xr1P [NP N] xr1 tNP ]
I merge xr2 and extract spec
[xr2P [NP N] xr2 [xr1P tNP xr1 tNP ]]
I merge xr3 and extract spec
[xr3P [NP N] xr3 [xr2P tNP xr2 [xr1P tNP xr1 tNP ]]]
Suppose, instead, that we iterate the extract complement option. This leads
to “roll-up” structures.
(18) [xr1P xr1 [NP N]]
I move NP
[xr1P [NP N] xr1 tNP ]
I merge xr2 and extract complement:
[xr2P [xr1P [NP N] xr1 tNP ] xr2 txr1P ]
I merge xr3 and extract complement
[xr3P [xr2P [xr1P [NP N] xr1 tNP ] xr2 txr1P ] xr3 txr2P ]
Finally, if we consistently extract the head of the complement, we restore the
order of merger, except for the two highest elements.
(19) [xr1P xr1 [NP N]]
I move N
[xr1P N xr1 [NP tN ]]
I merge xr2 and extract head
[xr2P xr1 xr2 [xr1P N txr1 [NP tN ]]]
I merge xr3 and extract head
[xr3P xr2 xr3 [xr2P xr1 txr2 [xr1P N txr1 [NP tN ]]]]
A system like this one can derive the three observed orders in (16) and many
intermediate ones, but not the ungrammatical (16d), once we rename adj,
num, dem as xr1, xr2 and xr3, respectively. To see this, assume that it can. Then there
must be a derivation conforming to the LCA which yields the order xr1 xr2 xr3
N, i.e. where all material is left-adjoined. Since xr1 and N are separated, xr1
must have moved as a head, i.e. leaving N behind, thus yielding the intermediate
structure (20).
(20) [xr3P xr3 [xr2P xr1 xr2 [xr1P N txr1 ]]]
But in this structure, there is no constituent containing xr1, xr2, but
excluding N, which can be moved to spec-xr3, hence the order xr1 xr2 xr3 N
cannot be derived. In other words, the unwanted order could only be derived
if there is an element α intervening between xr3 and its complement which
has the property that it (non-locally) attracts N. Then xr3 must (non-locally)
attract xr2P, as shown in (21).
(21) [αP α [xr2P xr1 xr2 [xr1P N txr1 ]]]
I move N
[αP N α [xr2P xr1 xr2 [xr1P tNP txr1 ]]]
I merge xr3 and move xr2P
[xr3P [xr2P xr1 xr2 [xr1P tNP txr1 ]] xr3 [αP N α txr2P ]]
In view of ordinary (long distance) displacement facts, we have to acknowledge
the existence of non local attractors like α. Furthermore, if α is allowed to be
phonetically empty, (21) will give the impression of being a well formed order
of the relevant type. If we reserve the term xr for expressions that are not
capable of long distance attraction, we have derived that, under the current
assumptions, these should conform to the Generalized U20. This system could
not derive the U20-behavior of adverbial PPs (Cinque, 2002), given that xrs are
treated as heads. I refer the reader to Cinque’s work for a derivation of PPs.
The point here is that the LCA leaves room for several alternative systems which
derive the GU20, i.e. this is a robust feature of LCA-compatible systems.
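The robustness claim can be illustrated with a small enumeration. The following Python sketch does not implement the xr-system above. It is a simplified stand-in under assumptions of my own: modifiers are merged on the left in the fixed order adj, num, dem; after each merge, any constituent containing N may optionally move leftward to the new specifier; and traces are silent. The three "observed" orders checked at the end are likewise assumed to be those of Greenberg's Universal 20, and all function names are hypothetical. The sketch may well overgenerate intermediate orders; the only point is that adj num dem N never comes out.

```python
from itertools import chain


def contains_n(tree):
    """True if the (sub)tree contains the overt noun."""
    return tree == "N" or (isinstance(tree, tuple) and any(contains_n(c) for c in tree))


def constituents_with_n(tree):
    """All subtrees, including the tree itself, that contain the noun."""
    if not contains_n(tree):
        return
    yield tree
    if isinstance(tree, tuple):
        for child in tree:
            yield from constituents_with_n(child)


def remove(tree, target):
    """Copy of tree in which target is replaced by a silent trace, written ()."""
    if tree == target:
        return ()
    if isinstance(tree, tuple):
        return tuple(remove(child, target) for child in tree)
    return tree


def merge(x, tree):
    """Merge x on the left; optionally move an N-containing constituent to its left."""
    yield (x, tree)
    for moved in constituents_with_n(tree):
        yield (moved, (x, remove(tree, moved)))


def flatten(tree):
    """Left-to-right string of overt terminals."""
    if isinstance(tree, tuple):
        return tuple(chain.from_iterable(flatten(child) for child in tree))
    return (tree,)


def derivable_orders(modifiers=("adj", "num", "dem")):
    """Every surface order reachable with leftward movement of N-containing material."""
    structures = ["N"]
    for x in modifiers:
        structures = [new for old in structures for new in merge(x, old)]
    return {flatten(s) for s in structures}
```

Because every moved constituent contains N and every movement is leftward, N can only ever gain ground on a modifier, which is why no derivation places a reversed modifier sequence in front of it.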
One point is worth emphasizing. It is often thought that the LCA ren-
ders left-branching structures impossible. This, however, is not correct. Left-
branching structures can arise when constituents are successively embedded in
specifiers, as shown in figure (1.9). This can arise, either by base generation or
by (“roll-up”) movement.
[XP [YP [ZP UP Z ] Y ] X ]
Figure 1.9: Left branching structure
Hence it does not follow from the LCA that if a constituent X follows another
constituent Y, X must be “lower in the tree” than Y. What is excluded is
right-adjunction and, hence, rightward movement.
1.4.3 A processing account
Ackema and Neeleman (2002) propose that one could exclude head movement
to the right (in certain cases) but allow right adjunction, and that this would
suffice to derive GU20 and the absence of penultimate position phenomena.18
They suggest that the absence of rightward head movement could be thought
of as a processing phenomenon, because it would force the parser to keep more
things in store than leftward movement. Their account rests on the following
assumptions about the parser (Ackema and Neeleman, 2002, their (21)):
18 Note that, given the discussion of figure (1.9) in the previous section, this approach may
be much closer to the antisymmetric approach than what these authors seem to assume.
(22) a. It scans the input string from the left to the right.
b. It constructs a tree, that is, a set of dominance and precedence
relations.
c. It has no look-ahead.
d. It can only postulate a trace after having encountered an an-
tecedent.
e. It cannot alter information (dominance and precedence relations)
stored in short-term memory for a given parse.
To these assumptions, I would like to add the following one, which they tacitly
assume (or something like it).
(23) Immediate attachment: Incoming material must be integrated immedi-
ately into a structure during sentence processing.
Their assumptions (22a)-(22c) could hardly be taken as controversial ones.
Unfortunately, this does not hold for their assumptions (22d), (22e) and (23).
Let us defer discussion of (22d) for a moment. (22e) seems to say that whenever
the parser must reanalyze some already analyzed material there is a processing
cost, e.g. a garden path effect. But this is blatantly false, at least if it is to
square with assumption (23). Mulders (2002) gives numerous examples where
any existing parsing theory would seem to be forced to assume reanalysis, but
they do not give rise to detectable garden path effects. Consider the following
contrast from Japanese. (24a) gives rise to a garden path, whereas (24b) does
not (Mulders, 2002, pp.131-133).
(24) a. ¿ Hurugashi-ga Yumiko-o ∅ ∅ yobidasita kissaten-ni
Hurugashi-nom Yumiko-acc pro pro summoned tea-room-loc
nagai koto mata-seta.
long time wait-made
“Hurugashi made Yumiko wait for a long time at the tea room
to which he summoned her.”
b. Yumiko-o Hurugashi-ga ∅ ∅ yobidasita kissaten-ni
Yumiko-acc Hurugashi-nom pro pro summoned tea-room-loc
nagai koto mata-seta.
long time wait-made
“Hurugashi made Yumiko wait for a long time at the tea room
to which he summoned her.”
Consider first the parse of (24a). At the stage where the parser has encountered
Hurugashi-ga Yumiko-o yobidasita (‘Hurugashi Yumiko summoned’), it
analyzes this as a main clause. But the continuation is not compatible with
this. The continuation forces both Hurugashi and Yumiko to be reanalyzed
as arguments of a superordinate clause, and the predicate yobidasita to be
reanalyzed as a relative clause on the following locative NP kissaten-ni ‘tea-
room-loc’. This is compatible with A&N’s view that “information cannot be
altered”, since the reanalysis does, in fact, lead to a garden path effect. How-
ever, (24b), which minimally differs from (24a) in that the object Yumiko is
scrambled around the subject Hurugashi, mysteriously does not lead to a gar-
den path effect. But the amount of reanalysis involved is the same. Hence it
seems that reanalysis is possible sometimes, after all.
Mulders (2002) gives numerous other cases where reanalysis seems to be
costless, not only in Japanese, of course, and she comes up with a more so-
phisticated restriction on reanalysis (her (T)ROLLC), ultimately relating it to
standard syntactic locality constraints on movement.19 The ease with which
(24b) is parsed indicates that “short term memory” is not as severely limited
as Ackema and Neeleman (with many others) suppose. I refer the reader to
Mulders’ work to see how contrasts such as that in (24) can be handled. What
is crucial here is that Mulders’ approach explicitly rejects assumption (22e).
As for Ackema and Neeleman’s assumption (23), Mulders (2002) has many
arguments against that, too. In fact, she devotes an entire chapter (chapter
3) to arguing that all the arguments that have been put forth in the literature
in favor of this assumption, and thus against a “θ-driven” parser (“Maximize
satisfaction of the θ-criterion at every stage of the parse” (Pritchett, 1992)), are
either inconclusive or wrong. She furthermore shows rather convincingly that a
θ-driven parser can handle cases that pose quite puzzling problems for
an “immediate attachment” approach, including the contrast in (24) above.
Mulders discusses parsing of Japanese sentences with relative clauses in great
detail, showing that whereas “immediate attachment” (of course wrongly) leads
us to expect that Japanese speakers should enter into a garden path virtually
all the time,20 the θ-driven approach can to a large extent correctly
predict when Japanese speakers do feel a garden path effect, and when
they don’t. I refer the reader to Mulders’ work for discussion.
Finally, the assumption (22d), that traces can only be inserted after the
antecedent has been encountered, seems to depend in crucial ways on the
two assumptions we have just seen to be questionable. In particular, θ-driven
parsing algorithms, like the one Mulders is promoting, assume that empty
material is freely generated (of course subject to the restrictions of the grammar
of the language in question) and does not lead to processing problems. In any
case, it is clear that assumption (22d) is rather essential for any algorithm that
wants to derive the absence of rightward movement from processing difficulties:
With rightward movement, the trace precedes its antecedent by definition, and
by assumption (22d), it cannot be inserted by the parser before the antecedent
is encountered, leading to potential processing problems because of the
assumption (23) that everything must be inserted into a tree at once. On a θ-driven
19 Mulders’ theory is based on work by Pritchett (1992).
20 Of course, we do not want to say that Japanese sentence processing follows different
principles than, say, English sentence processing.
approach, the crucial assumption is that there is no problem with keeping things
in storage: In fact, this is the typical case in strongly “head final” languages
like Japanese, where the θ assigner is typically encountered as the very last
element.
The point here is that the assumptions that Ackema and Neeleman (2002)
make about the parser are not uncontroversial, and, in some cases, likely to be
false. Be that as it may, let us look at how their system derives the (G)U20
and the absence of penultimate position phenomena.
1.4.4 GU20 and Penultimate position
In this subsection, I will try to explain how Ackema and Neeleman attempt
to derive universal word-order asymmetries from their assumptions about pro-
cessing. It does not seem to me that their assumptions actually have the results
that they claim them to have. In fact, I will argue that they fail to derive both
the GU20 and the absence of penultimate position phenomena.
Suppose that the parser has identified an XP. Then, according to their
assumptions, it will postulate the existence of some head H that follows XP (it
has not yet encountered such a head) and that XP is immediately dominated
by some projection Hi of H. They notate this as follows:
(25) P(XP, H); ID(Hi , XP); Proj(H, Hi ).
Here “P” is short for “precedes”, “ID” for “immediately dominates”, and
“Proj” for “is a projection of”. “H” is an “abstract head” which will be given
content when an actual head is encountered. If the parser now encounters
another phrase YP, we get the following (“D” means “dominates”).
(26) a. XP, YP
b. P(XP, H); ID(Hi , XP); Proj(H, Hi )
P(YP, H); ID(Hj , YP); D(Hi , Hj ); Proj(Hj , H)
So far, our parse corresponds to the tree in figure (1.10).
[Hi XP [Hj YP H ]]
Figure 1.10: Tree for (26)
When the parser encounters a head (ignoring for the moment the possibility
that it has moved), we get the following addition to our parse, which is a
monotone increase in the represented information from (26), just as (26) was
from (25).
(27) a. XP, YP, V
b. P(XP, H); ID(Hi , XP); Proj(H, Hi )
P(YP, H); ID(Hj , YP); D(Hi , Hj ); Proj(Hj , H)
H=V
If we now encounter more phrases, these could either be complements of V, or
right-adjoined material. Consider the latter option.
(28) a. XP, YP, V, ZP, WP
b. P(XP, H); ID(Hi , XP); Proj(H, Hi )
P(YP, H); ID(Hj , YP); D(Hi , Hj ); Proj(Hj , H)
H=V
P(H, ZP); ID(Hk , ZP); Proj(H, Hk )
P(H, WP); ID(Hl , WP); D(Hl , Hk ); Proj(H, Hl )
This is still in accordance with our assumptions, and the parse unambiguously
has it that XP c-commands YP, and that WP c-commands ZP. It does not
determine the scope between the preverbal and the postverbal phrases, and
Ackema and Neeleman simply assume that this is determined either by the
grammar or by the discourse context. Hence we see that the system can handle
both left- and right-branching structures. A&N argue that it cannot handle
certain kinds of rightward movement, however. Consider the tree in figure
(1.11) and how it would be parsed incrementally.
[V′ [tP XP [t′ t YP ]] V ]
Figure 1.11: Rightward head movement
At the point where the two phrases have been encountered, we would have the
following parse:
(29) a. XP, YP
b. P(XP, H); ID(Hi , XP); Proj(H, Hi )
P(YP, H); ID(Hj , YP); D(Hi , Hj ); Proj(Hj , H)
Suppose that the parser, upon encountering the verb, tries to insert a trace
between the two phrases.21 Given that (29) says that Hi dominates XP and Hj ,
and that Hj immediately dominates YP, the trace must be an H, immediately
dominated by Hj . But this contradicts the information in (29) that YP precedes
H. Thus insertion of such a trace contradicts their assumption (22e), that the
21 It cannot hypothesize a trace before, because of assumption (22d).
parser cannot alter information it has already postulated. A&N show that
rightward head-movement is, in fact, possible in this approach just in case it
does not cross any “dependents” of the head. This is stated in their “Rightward
Head Movement Theorem”:
(30) RHMT (Ackema and Neeleman, 2002, their (38))
Rightward head movement is possible as long as no dependent of the
moving head is crossed.
This formulation arguably requires some attention to the notion “dependent”.
In footnote 12, they explain that “dependent” refers to such elements as “spec-
ifiers, complements or adjuncts of the heads under discussion.” Thus as long as
an element is not thought to be such a dependent, rightward head movement
should be able to cross it. They then go through some cases which they admit
can be given an antisymmetric account, but can also be analyzed as rightward
head movement in accordance with RHMT. This now derives GU20, if it can
be shown i) that, e.g. in noun phrases, the noun can only move around dem,
num and adj as a head and ii) that these are “dependents” of the noun, i.e.
these two auxiliary assumptions would rule out a structure like that in figure
(1.12), where N has head-moved to the right.
[? [? [? [? [? t ] adj ] num ] dem ] N ]
Figure 1.12: Impossible NP
However, A&N do not argue for these two extra assumptions. They explicitly
argue that phrasal movement is not subject to the rightward movement
restriction, and propose to handle extraposition phenomena in this way, so if N
could move as a phrase, the tree in figure (1.12) should be OK, depending on
how one analyzes dem, num, adj.
The crucial difference between head movement and phrasal movement on their
account is that, given “immediate attachment”, the parser is forced to postu-
late “abstract heads” before the “actual head” is encountered. Postulation of
“abstract heads”, in turn, leads to the potential of conflicting information (re-
analysis) if a trace is to be inserted inside an already parsed string. Phrases are
not postulated before they are encountered, so there’s no potential for conflict.
Thus, it appears that they do derive that head movement, in a technical sense,
cannot cross “dependents”, but that phrasal movement can.22 The question is
whether, granting, for the sake of the argument, A&N’s assumptions (22-23),
this suffices to derive GU20. Apparently, it doesn’t. We will need the two
auxiliary assumptions.
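The conflict that arises for A&N's parser with rightward head movement can be made concrete with a toy fact store. The Python sketch below is only an illustration of the reasoning around (29), not A&N's algorithm: the class and function names are hypothetical, relations are stored as plain triples, and the only consistency check is the (assumed) asymmetry of precedence. Assumption (22e) is modeled by allowing facts to be added but never retracted.

```python
class MonotoneParse:
    """A store of parser facts that may grow but is never revised (cf. (22e))."""

    def __init__(self):
        self.facts = set()

    def add(self, relation, a, b):
        # Precedence is asymmetric: P(a, b) and P(b, a) cannot both be stored.
        if relation == "P" and ("P", b, a) in self.facts:
            raise ValueError(f"P({a}, {b}) contradicts stored P({b}, {a})")
        self.facts.add((relation, a, b))


def parse_after_two_phrases():
    """The facts in (29b), as recorded once XP and YP have been read."""
    parse = MonotoneParse()
    for fact in [
        ("P", "XP", "H"), ("ID", "Hi", "XP"), ("Proj", "H", "Hi"),
        ("P", "YP", "H"), ("ID", "Hj", "YP"), ("D", "Hi", "Hj"), ("Proj", "Hj", "H"),
    ]:
        parse.add(*fact)
    return parse


def can_place_head_before(parse, phrase):
    """Try to record that (a trace of) the abstract head H precedes `phrase`."""
    try:
        parse.add("P", "H", phrase)
        return True
    except ValueError:
        return False
```

On this toy model, a trace of H to the left of YP is rejected because P(YP, H) is already stored, whereas a postverbal phrase such as ZP in (28) can be added without conflict. Nothing comparable arises for phrases, since no abstract phrase is entered into the store before it is encountered.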
To see that this objection should be taken seriously, consider the follow-
ing phenomenon from Norwegian. Certain “low” adverbs occur in the same
order/scope when they follow the VP as when they precede it (Cinque, 1999;
Nilsen, 2000). This happens when the last adverb is stressed.23 Sentential
negation (ikke) and “high” sentential adverbs do not participate in such or-
ders (31e). If the order/scope of the adverbs is reversed (31f), the result is
ungrammatical.
(31) a. Jens hadde ikke lenger alltid helt forstått
J had not any.longer always completely understood
problemet,
the-problem
b. Jens hadde ikke lenger alltid forstått problemet
J had not any.longer always understood the-problem
helt.
completely
c. Jens hadde ikke lenger forstått problemet alltid
J had not any.longer understood the-problem always
helt.
completely
d. Jens hadde ikke forstått problemet lenger alltid
J had not understood the-problem any.longer always
helt.
completely
e. * Jens hadde forstått problemet ikke lenger alltid
J had understood the-problem not any.longer always
helt.
completely
f. * Jens hadde ikke forstått problemet helt alltid
J had not understood the-problem completely always
lenger.
any.longer
It seems hard to avoid the conclusion that, in Norwegian, the VP can move
around (left-adjoined) low adverbs. Suppose, for concreteness, that the adverbs
22 It is currently an open question whether head movement in the standard sense is possible
in any direction. See, among others, Chomsky (2001); Koopman and Szabolcsi (2000); Müller (2002);
Starke (2001); Nilsen (to app.b).
23 If the adverbs are destressed, in a “comma reading”, both orders are possible.
are left-adjoined to vP, and that VP is allowed to move leftward and adjoin
to vP. The relevant part of the structure for the examples in (31) would be
figure (1.13a).
a. [vP adv* [vP VPi [vP adv* [vP v ti ]]]]
b. [vP [vP [vP [vP ti v ] adv* ] VPi ] adv* ]
Figure 1.13: VP-scrambling
But then what in A&N’s system would prevent a symmetric
language, where the adverbs are adjoined to the right, and the VP can move to
the right as in (1.13b)? This would be a language where a sequence of adverbs
with “inverse” (right to left) scope could precede the VP. Of course, no such
language is attested.
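The pattern in (31) and its unattested mirror image can be spelled out mechanically. The sketch below is a bare illustration under assumptions of my own: the adverbs form a fixed left-adjoined sequence, the VP may adjoin to the left of any adverb from a stipulated highest landing site downwards (index 1, i.e. below ikke, to reflect (31e)), and the auxiliary is ignored. The function names are hypothetical.

```python
def vp_fronting_orders(adverbs, vp, highest_landing_site=1):
    """Orders obtained when VP adjoins immediately before the adverb at index i.

    i == len(adverbs) corresponds to the VP staying below all adverbs.
    """
    return [
        adverbs[:i] + [vp] + adverbs[i:]
        for i in range(highest_landing_site, len(adverbs) + 1)
    ]


def mirror(order):
    """The corresponding order in a hypothetical right-adjoining language."""
    return list(reversed(order))


ADVERBS = ["ikke", "lenger", "alltid", "helt"]
VP = "forstått problemet"
ORDERS = vp_fronting_orders(ADVERBS, VP)
```

`ORDERS` contains the adverb/VP orders of (31a-d), with the postverbal adverbs in the same relative order as the preverbal ones, and excludes (31e) and (31f). Applying `mirror` to any of them gives the kind of string that a symmetric system would also have to allow: a VP preceded by an adverb sequence in inverse order.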
In fact there is ample reason to think that the status of the expressions
as heads is irrelevant to the proper formulation of GU20. We have just seen
that adverb-VP ordering appears to conform to it. Cinque (2002) argues that
PP-VP ordering conforms to it, too. Verbal cluster formation also conforms to
it, and this has been argued not to involve head-movement, but rather phrasal
movement (Koopman and Szabolcsi, 2000).
A&N show that their parser allows string vacuous head movement to the
right. In fact, they argue that the finite verb in Japanese moves to C, just as
in Dutch, but that in this language, C is final. But then what prevents the
existence of a cousin of Japanese where the specifier of C is also final? This
would be a verb-second-to-last language. A&N are aware of this problem, of
course, and they try to explain this in terms of a Right-Roof effect. The right-
ward movement of the XP in spec-C would cross XP-barriers, and hence be
ruled out by the Right-Roof constraint. They argue that this constraint can be
made to follow from an assumption that phrases become “atomic” when the
parser has finished parsing them, and one cannot postulate traces inside such
atoms. For example, rightward movement of an object to spec-C would neces-
sarily cross VP, so the parser would have to postulate a trace in VP after it has
finished analyzing it, i.e. after it has become an atom. It is somewhat strange
that the parser, which is otherwise fully aware of the grammar of the relevant
language, would “close off” a VP which violates the θ-criterion. Furthermore,
it seems rather mysterious that, on A&N’s account, rightwards head movement
is easier than rightward phrasal movement when it comes to movement out of
IP, but harder than phrasal movement when it comes to “extraposition.” Sup-
pose that this can be made to work, however. There could still be a cousin
of Japanese where spec-CP is initial when, say, an object is moved there, but
final otherwise, e.g. when an adverbial is base-generated there. This would
result in a language which is verb-last when something is moved to spec-C,
and verb-second-to-last when something is base-generated there. Yet another
cousin of Japanese could reserve its final spec-C exclusively for base-generated
material, but require it to be filled, sometimes by expletives. This would also
be a strict verb-second-to-last language. Hence it is difficult to see that A&N
derive the universal absence of penultimate position phenomena. Under the LCA,
by contrast, this absence follows, given that the LCA bans rightward movement
in principle.
1.4.5 A&N’s arguments for right-adjunction
Ackema and Neeleman point out that there are arguments that circumstantial
PPs are right-adjoined when they follow the VP. Their first argument is
essentially that a cluster of sentence-final PPs occurs in the mirror image of
the sentence-internal order (Koster, 1974; Barbiers, 1995). This would follow
straightforwardly if the PPs can be adjoined to the right or to the left of the
VP. The second argument is that the relative scope of the PPs conforms to
the right-/left-adjunction structures. However, these facts are also compatible
with an antisymmetric approach. What they seem to show is that the structure
of a VP with three final PPs, e.g. (32a) must conform to the general layout
of (32b) (at some point of the derivation). More in particular, the facts do
not pose problems for an antisymmetric account unless it can be demonstrated
beyond doubt that the labels rendered here as ‘?’ must be VP. Barbiers (1995)
suggests that (32b) is obtained derivationally by successively moving VPs into
spec-PP as illustrated in figure (1.14).24
(32) a. shot him with a gun in the park on Friday
b. [? [? [? [VP shot him] [PP with a gun]] [PP in the park]] [PP on
Friday]]
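The way successive movement of VP into spec-PP yields both the bracketing in (32b) and the mirror order can be sketched in a few lines of Python. This is only a schematic rendering of the derivation attributed to Barbiers above: traces and labels are left out, the PPs are listed from highest to lowest in the base structure, and the function names are hypothetical.

```python
def intrapose(vp, pps_high_to_low):
    """Roll the VP up through the PPs, lowest PP first.

    Each step places everything built so far in the specifier of the next PP,
    so the result is a left-branching bracketing like (32b).
    """
    structure = vp
    for pp in reversed(pps_high_to_low):
        structure = [structure, pp]
    return structure


def words(structure):
    """Left-to-right terminal string of a nested bracketing."""
    if isinstance(structure, str):
        return structure.split()
    return [w for part in structure for w in words(part)]


PPS = ["on Friday", "in the park", "with a gun"]  # highest to lowest
RESULT = intrapose("shot him", PPS)
```

Here `RESULT` is the nested list `[[["shot him", "with a gun"], "in the park"], "on Friday"]`, so the PPs surface in the reverse of their base order without any right-adjunction.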
Part of the evidence Barbiers (1995) has for this kind of derivation comes
from the distribution of focus particles like pas ‘just’. He shows that Dutch
focus particles must immediately precede the constituent they associate with,
with the sole exception of “extraposed” PPs, which cannot be immediately
preceded by pas. Consider the contrasts in (33):
(33) a. Jan heeft pas in EEN stad gewerkt.
J has just in one city worked
24 Barbiers shows that this is not a case of “sideways movement”, i.e. the moved VPs do end
up c-commanding their traces with Kayne’s (or his own) definition of c-command. Kayne’s
definition is the following: X c-commands Y iff X, Y are categories, X does not dominate Y,
and Y does not dominate X, and every category that dominates X, dominates Y. Note that,
with this definition, there would have to be empty heads between each specifier of VP.
[VP1 PP1 [VP2 PP2 [VP3 PP3 [VP4 V NP ]]]]
⟹
[VP1 [PP1 [VP2 [PP2 [VP3 [PP3 [VP4 V NP ] PP3 ] tVP4 ] PP2 ] tVP3 ] PP1 ] tVP2 ]
Figure 1.14: VP-intraposition
b. * Jan heeft gewerkt pas in EEN stad.
J has worked just in one city
c. Jan heeft pas gewerkt in EEN stad.
J has just worked in one city
d. Jan heeft in EEN stad gewerkt pas
J has in one city worked just
e. Pas in EEN stad heeft Jan gewerkt.
just in one city has J worked
A free right/left-adjunction analysis would have to stipulate that (33b) is un-
grammatical. Barbiers shows that it follows from his account of PPs and focus
particles. Barbiers motivates the movements semantically: The VP moves to
spec PP in order to establish appropriate semantic relations with it. I refer the
reader to Barbiers’ work to see how this is done in detail. In Nilsen (2000), it
is argued that the left branching structure can be base-generated directly, still
conforming to the LCA, if one treats the PPs as reduced relative clauses on
the VP (event variable). According to this view, the structure comes out as
something like figure (1.15), without any movements.
[TP1 [TP2 (already) [TP2 T2 [AspP1 [AspP2 Asp2 [VP V NP ]] [AspP1 Asp1 [PP in the park ]]]]] [TP1 T1 [PP on Friday ]]]
Figure 1.15: Relative clause structure for PPs
The idea, according to this view, is that a temporal PP is a reduced relative
clause on TP. It projects a TP, and takes a TP in its specifier. Similarly, a
locative projects an AspP and takes an AspP in its specifier. The reason for
differentiating the labels is that temporals and locatives are ordered, i.e.
temporals invariably occur higher in the structure than locatives. One argument
favoring this kind of structure over the right-adjunction structure is that it
interacts in interesting ways with the possibility of having sentence-final
sentential adverbs. Consider the following minimal pairs from Norwegian (capitals
indicate prosodic stress):
(34) a. . . . at han har møtt Jens i parken ALLEREDE
. . . that he has met J in the-park already
b. * . . . at han har møtt Jens på fredag ALLEREDE
. . . that he has met J on Friday already
c. . . . at han allerede har møtt Jens på fredag
. . . that he already has met J on Friday
The adverb allerede ‘already’ can follow the VP if it is modified by a locative
PP, but not if it is modified by a temporal one. If we analyze allerede as a
specifier of TP2 in figure (1.15), this follows without further stipulation: There
is no constituent XP containing the VP and the temporal PP which can shift
around the adverb allerede. There is, however, a constituent (i.e. AspP1) which
can shift around allerede. In Nilsen (2000), several more adverb/PP pairs are
shown to (mis-) behave in the same way. If the adverbs and PPs were freely
right/left-adjoined to VP, it is hard to imagine a non-stipulative account for
contrasts like that in (34).
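The constituency reasoning behind (34) can be checked mechanically against the structure in figure (1.15). The Python sketch below encodes that structure as nested tuples and asks which full phrases contain the VP and a given PP while excluding the adverb. The encoding, the leaf names (`PP-loc`, `PP-temp`) and the function names are illustrative assumptions, and only the topmost segment of each label is counted as a movable phrase.

```python
# Figure (1.15) as (label, children) pairs; leaves are plain strings.
TREE = ("TP1", [
    ("TP2", ["already", ("TP2", ["T2", ("AspP1", [
        ("AspP2", ["Asp2", "VP"]),
        ("AspP1", ["Asp1", "PP-loc"]),
    ])])]),
    ("TP1", ["T1", "PP-temp"]),
])


def leaves(node):
    """Set of terminals dominated by a node."""
    if isinstance(node, str):
        return {node}
    _, children = node
    return set().union(*(leaves(child) for child in children))


def maximal_phrases(node, parent_label=None):
    """Nodes whose parent does not carry the same label (lower segments are skipped)."""
    if isinstance(node, str):
        return
    label, children = node
    if label != parent_label:
        yield node
    for child in children:
        yield from maximal_phrases(child, label)


def movable_around(adverb, *required):
    """Labels of full phrases containing all `required` leaves but not `adverb`."""
    return [
        node[0]
        for node in maximal_phrases(TREE)
        if set(required) <= leaves(node) and adverb not in leaves(node)
    ]
```

With this encoding, `movable_around("already", "VP", "PP-loc")` returns `["AspP1"]`, while `movable_around("already", "VP", "PP-temp")` returns an empty list, matching the contrast between (34a) and (34b).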
In order to derive the mirror ordering effects (Koster, 1974) within this
framework, one would assume that the predicate of the relative clause can be
fronted instead of the relative head. This is in accordance with the derivation of
head final relatives in e.g. Japanese proposed by Kayne (1994); Bianchi (1995).
Perhaps the worst problem for the free right/left-adjunction account for
circumstantials is what has come to be known as “bracketing paradoxes”: the
evidence sometimes favors a left-branching structure, and sometimes a right-
branching one, with conflicting results, sometimes for one and the same sen-
tence. These were first discussed by Pesetsky (1995), and recently given an
antisymmetric derivation in Cinque (2002). As arguments for a left-branching
structure, Cinque mentions i) absence of principle C effects when an argument
precedes a coreferential R-expression in an adjunct (35a); ii) constituency di-
agnostics, e.g. the possibility of stranding the locative in (35b) and the impos-
sibility of fronting the string John in the park (35c); iii) the relative scope of
VP-final PPs, i.e. VP-final PPs take scope “towards the left”, and not “towards
the right” as we would expect on a right-branching structure. For example, in
(35d), the because-clause takes scope over the entire VP, including the locative
PP.
(35) a. They killed himi on the very same day Johni was being released
from prison.
b. [Kill John]i they did [ ti in the park].
c. * [John in the park]i they [killed ti ].
d. John smoked in the car because of the rain.
As arguments favoring a right-branching structure for the PPs, Cinque (2002)
mentions i) anaphor binding into an adjunct (36a-36b); ii) variable binding into
an adjunct (36c-36d); and iii) negative polarity item licensing into an adjunct
(36e-36f).
(36) a. John spoke to Mary [about [these people]i ] [in [[each others]i
houses]].
b. * John spoke to Mary [about [each other]i ] [in [[these peoples]i
houses]].
c. Gideon Kremer performed [in everyi Baltic republic] [on itsi in-
dependence day].
d. * He spent many hours [in itsi memorial] [on everyi independence
day].
e. John spoke to Mary [about no linguist] [in any conference room].
f. * John spoke to Mary [about any linguist] [in no conference room].
This led e.g. Pesetsky (1995) to suggest that multiple trees can be associ-
ated with the same string (simultaneously). Cinque (2002) argues that the
evidence is compatible with an antisymmetric account which also incorporates
the word-order facts discussed by Barbiers (1995); Nilsen (2000) in addition
to deriving the “Koster effects” (Koster, 1974). This approach takes Pesetsky’s
different trees (or something like them) to be different derivational stages of
a sequential derivation. All this goes to show that the arguments adduced
by Ackema and Neeleman for left-branching structures are not arguments for
right-adjunction, and that, once the full complexity of the facts is taken into
account, the antisymmetric accounts seem to fare better with the data. The
right-adjunction approach would need to be significantly enriched in order to
handle the facts, and it is not clear that it would end up being any “simpler” or
postulate fewer theoretical entities and operations in need of motivation than
the antisymmetric accounts.
A&N argue that the extra heads needed in an antisymmetric account must
be motivated independently. This is true as long as one takes them to be func-
tional heads in the standard sense. Aside from answering this challenge directly,
by actually trying to independently motivate lots of heads, which is Cinque’s
approach, one could think of at least two ways of circumventing this problem:
1) One could abandon the extra heads by minimally adjusting Kayne’s definition
of c-command as follows: X c-commands Y iff X, Y are categories and every
segment dominating X dominates Y. This leads to an asymmetric system with
all the properties of Kayne’s, except that it allows multiple specifiers/adjuncts.
2) One could treat the heads as “dummies” one can insert for free, much as in
Larson’s VP-shell analysis. So this is not a deep problem with antisymmetry.
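The effect of the adjusted definition in option 1) can be checked on a small configuration with two adjuncts to the same XP. The sketch below is a simplified model with hypothetical names: nodes are segments, segments of one category share a `category` name, a category dominates a node only if all its segments do, and the elements tested are single-segment categories. The guard against an element c-commanding itself in the segment-based version is my own addition; the definition is otherwise implemented as literally stated.

```python
class Node:
    """A tree segment; segments of the same category share `category`."""

    def __init__(self, label, category=None, children=()):
        self.label = label
        self.category = category or label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Segments properly dominating node, from the closest upwards."""
    out = []
    while node.parent is not None:
        node = node.parent
        out.append(node)
    return out


def all_nodes(root):
    yield root
    for child in root.children:
        yield from all_nodes(child)


def category_dominates(category, node, root):
    """A category dominates node only if every one of its segments does."""
    segments = [n for n in all_nodes(root) if n.category == category]
    above = ancestors(node)
    return all(seg in above for seg in segments)


def kayne_c_commands(x, y, root):
    """Category-based definition, as cited in footnote 24."""
    if x is y or x in ancestors(y) or y in ancestors(x):
        return False
    dominating = {a.category for a in ancestors(x)
                  if category_dominates(a.category, x, root)}
    return all(category_dominates(c, y, root) for c in dominating)


def segment_c_commands(x, y):
    """Adjusted definition: every segment dominating x dominates y."""
    above_y = ancestors(y)
    return x is not y and all(seg in above_y for seg in ancestors(x))


def two_adjuncts():
    """[Root [XP A [XP B [XP X ]]]], with A and B both adjoined to XP."""
    a, b, head = Node("A"), Node("B"), Node("X")
    xp3 = Node("XP", "XP", [head])
    xp2 = Node("XP", "XP", [b, xp3])
    xp1 = Node("XP", "XP", [a, xp2])
    root = Node("Root", "Root", [xp1])
    return root, a, b
```

For the structure returned by `two_adjuncts()`, the category-based definition makes A and B c-command each other, so no linear order can be read off, whereas the segment-based definition has A c-command B but not the other way around, which is what allows multiple specifiers/adjuncts.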
Secondly, they argue that all the extra movements needed in an antisym-
metric account must be triggered by (independently motivated) features. But
the trigger-based view of movement is controversial, and not the only way of
“triggering” movement, and other ways of doing this that have been or will be
proposed could be extended to antisymmetric accounts as well. In short, this
is not a problem with antisymmetry per se, but rather one with certain aspects
of current syntactic theory in general. In fact, I think most syntacticians would
agree that there is currently no general account of “triggers” for displacement
phenomena. Furthermore, “directionality parameters” and other restrictions
on “symmetry” need motivation as well. What are the general properties
(+N, +V, +θ, phonological phrasing etc) according to which directionality is
parameterized? Why are there no well documented cases of final specifiers? It
does not seem to follow from processing, for example.
1.5 Summary
I have argued that the main challenges for a Cinque-style approach to the
distribution of functional material in the clause are transitivity failures and
Bobaljik paradoxes. I have argued that an approach along the lines of Ernst
(2001); Svenonius (2001) can handle these problems, but only at the cost of
abandoning any hope of treating functional heads such as T as semantically
contentful expressions. The root of the problem was shown to lie in the use
they make of the two orthogonal orderings of theoretical entities, viz. the FEO-
calculus (s-selection) and fseq (c-selection). Hence, the problem we are faced
with is this: If we want to make use of a single (linear) selectional sequence to
account for the distribution of functional material, we face transitivity failures
and Bobaljik paradoxes. If we make use of orthogonal sequences, we cannot
account for interactions between the kinds of expressions ordered by the or-
thogonal sequences. I conclude that we should look for alternative accounts
of the distribution of functional material in the clause, accounts that do not
depend on the notion of a selectional sequence.
We have seen that, although there are problems with antisymmetry, it re-
mains a viable and robust account for word-order asymmetries. At this stage,
it seems to me that the account in terms of processing comes with too many
problems for it to be taken as a real alternative. As noted in the beginning,
there are some problems with the exact formulation of the LCA (complicated
definition of c-command, reliance on a linear notion of “time”, etc.), and one
would like to look for even more elegant formulations of it. It may well be that
such a reformulation will ultimately rely on processing algorithms. So far it
seems to me that we have to settle for the original formulation.
CHAPTER 2
Domains for Adverbs
2.1 Introduction
Why are sentential adverbs1,2 ordered? This question has become relatively
central to linguistic theory since the work of Cinque (1999) and Alexiadou
(1997). The present chapter represents an attempt to give it a partial answer.
I will mainly concern myself with high (speaker oriented) adverbs, but some
discussion of lower (temporal) adverbs and manner adverbs is included.
It is well known that some adverbs are polarity items. For instance, adverbs
like yet and any longer are negative polarity items (NPI). Perhaps less widely
acknowledged is the fact that some adverbs are positive polarity items (PPI). In
van der Wouden (1997) it is argued that the Dutch adverbs al ‘already’ and niet
‘not’ are PPIs. In this chapter I will argue that surprisingly many sentential
adverbs are PPIs. In particular, I will argue that ‘speaker oriented’ adverbs,
like allegedly, fortunately, possibly, evidently should be treated as PPIs. I will
use this fact to derive the relative ordering in which adverbs can cooccur. The
observation is that some adverbs, in addition to being PPIs, themselves induce
1 This chapter will appear in virtually identical form as Nilsen (to app.a).
2 Throughout, I use the notions ‘adverb’ and ‘adverbial’ interchangeably and in a purely
descriptive sense. Alexiadou (2001) argues that the word class ‘adverb’ is questionable,
Dowty (2000) questions the argument–adjunct distinction, and Julien (2000) argues that the
notion of grammatical word is irrelevant for grammatical theory. Thus it is not obvious that
any class of adverbs or adverbials can be adequately defined and distinguished from e.g.
auxiliaries and verbal affixes. On this, see also Nilsen and Vinokurova (2000).
the environments that PPIs are excluded from. This results in a classification
of adverbs according to which environments they can/cannot/must appear in,
and which expressions can/cannot occur within their scope. It relates adverb
ordering to Bellert’s observation (Bellert, 1977) that certain adverbs are
degraded in e.g. questions, imperatives and antecedents of conditionals.
Classifying adverbs in this fashion already gives us predictions for
possible adverb sequences which differ from existing theories (Cinque, 1999; Ernst,
2001). However, I have not answered the initial question, i.e. why sentential
adverbs are ordered, unless I can explain why a given adverb should be sensitive
to a given property of its environment. In order to approach this question, I
will adopt the framework developed in Chierchia (2001) for (negative) polarity
items3 and attempt to extend it to positive polarity phenomena. I will focus
primarily on the adverb possibly here, but I think the approach can be extended
to other speaker oriented adverbs as well.
Quantification is quite generally subject to contextual restriction
(Westerståhl, 1988). For example, if I say that Stanley invited everybody, I do not
generally mean that he invited everybody in the whole world. Chierchia’s idea is
essentially that negative polarity any is lexicalized with a (universally closed)
variable ranging over domain expanding functions. The application of such domain
expansion is governed by a strengthening condition: the result of widening
the quantificational domain for a quantifier in a proposition must be stronger
than (entail) the corresponding proposition without widening.
This derives the major distributional patterns for NPIs in a rather plausible
way. I suggest that this approach can be straightforwardly extended to positive
polarity items if we assume that PPIs are associated with domain shrinkage
rather than domain expansion. This approach is then argued to explain the
distribution of some (though not all) sentential adverbs.
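The strengthening condition, and the mirror image of it that I propose for PPIs, can be made concrete on a toy model. The following Python sketch is only an illustration under assumptions of my own: a universe of three individuals, a single one-place predicate whose extension varies across models, and helper names (exists_in, negate, strictly_stronger) that are hypothetical and not part of Chierchia's formalism. It checks in which contexts widening, respectively shrinking, the domain of an existential quantifier yields a strictly stronger proposition.

```python
from itertools import combinations

# Toy universe (an assumption for illustration only).
UNIVERSE = ("a", "b", "c")
NARROW = frozenset({"a", "b"})   # contextually restricted domain D
WIDE = frozenset(UNIVERSE)       # widened domain D', a superset of D

# A "model" is one possible extension of a one-place predicate P.
MODELS = [frozenset(c)
          for r in range(len(UNIVERSE) + 1)
          for c in combinations(UNIVERSE, r)]


def exists_in(domain):
    """Proposition 'some x in domain is P', as a function of P's extension."""
    return lambda p_ext: bool(domain & p_ext)


def negate(prop):
    """Classical negation of a proposition."""
    return lambda p_ext: not prop(p_ext)


def entails(a, b):
    """a entails b iff b holds in every model in which a holds."""
    return all(b(m) for m in MODELS if a(m))


def strictly_stronger(a, b):
    """a entails b, but b does not entail a."""
    return entails(a, b) and not entails(b, a)


# Widening (NPI-style): compare the widened proposition with the unwidened one.
widening_strengthens_positive = strictly_stronger(
    exists_in(WIDE), exists_in(NARROW))
widening_strengthens_negated = strictly_stronger(
    negate(exists_in(WIDE)), negate(exists_in(NARROW)))

# Shrinking (the operation proposed here for PPIs): the mirror image.
shrinking_strengthens_positive = strictly_stronger(
    exists_in(NARROW), exists_in(WIDE))
shrinking_strengthens_negated = strictly_stronger(
    negate(exists_in(NARROW)), negate(exists_in(WIDE)))

print(widening_strengthens_positive, widening_strengthens_negated)
print(shrinking_strengthens_positive, shrinking_strengthens_negated)
```

On this toy model, widening strengthens only under negation, and shrinking strengthens only in the positive context, which is the asymmetry that the NPI/PPI contrast requires.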
Before proceeding, I would like to point out one complication, also noted in
Cinque (1999, to app.). It is often stated that Cinque’s hierarchy (Cinque, 1999)
should be given a semantic explanation.4 The complication is this: there are
adverb–adjective pairs which are apparently synonymous, but, nevertheless,
differ in distribution. For instance, the adverb probably cannot occur under
never, whereas the (proposition embedding) adjective probable can. Compare
(37a) to (37b):
(37) a. * Stanley never probably ate his wheaties.
b. It was never probable that Stanley ate his wheaties.
In order to make plausible a semantic explanation for the oddness of (37a),
it seems one has to be able to establish that probably is not synonymous with
3 Chierchia bases his theory on that of Kadmon and Landman (1993), and it is closely
related to proposals by Krifka (1995) and Lahiri (1997). These proposals are discussed in
section 3.2.
4 Such a semantic derivation would not automatically remove the motivation for the
hierarchy. Its motivation is that certain syntactic phenomena, such as verb placement in Romance,
interact with adverb ordering in intriguing ways. I return briefly to this point in section 5.
probable in certain respects, otherwise (37b) should be odd as well. What
is more, one has to show that whatever semantic difference there is between
the two, that difference must be the culprit of the contrast in (37).5 I try to
establish such a difference between the pair possibly, possible below.
The chapter is organized as follows. In section 2, I present the main data,
namely that speaker oriented adverbs are excluded from the same types of environments
that license NPIs. In section 3, I present the theories of polarity items that I will
employ to derive this behavior and discuss some of their predictions with respect
to adverb ordering. Section 4 develops a semantics for the adverb possibly,
deriving its semantic differences with the adjective possible and, at the same
time, deriving its distribution. In section 5, the final section, I discuss how the
present account bears on current theories of adverb ordering phenomena.
2.2 Adverbs and Polarity
2.2.1 NPIs and speaker oriented adverbs
In Bellert (1977) it was observed that certain sentential adverbs have a narrower
distribution than one might expect. For example, she observes that speaker
oriented adverbs (henceforth SOA), such as evaluatives (fortunately), evidentials
(evidently) and some modals (possibly), are degraded in questions (39a). As it
turns out, these types of adverbs are also degraded in antecedents of
conditionals (39b), imperatives (39c), under negation (39d), under clause-embedding
predicates like hope (39e), as well as within the scope of monotone decreasing
subject quantifiers like no N (39f). I use Norwegian examples to demonstrate
this. As far as I have been able to determine, English works the same way, i.e.
the English translations are also degraded. In (39), ‘ADV’ represents any of
the adverbs in (38):6
5 I take this to be a problem for Ernst’s (2002) semantic approach to adverb distribu-
tion. This author takes adverb–adjective pairs like the ones under discussion to be, in fact,
systematically synonymous.
6 Kanskje ‘maybe’ is better in questions than the other ones. However, when maybe occurs
in questions, as in (i), the question is not whether or not it is possible that S ate his wheaties;
rather, it seems to be equivalent to the same question with maybe removed.
(i) Did Stanley (maybe) eat his wheaties (, maybe)?
I have no suggestions as to why this should hold here (but see van Rooy (2002) on polarity
items and questions). One can also find occurrences of probably in antecedents of conditionals
which are not that bad.
(ii) If Le Pen will probably win, Jospin must be disappointed.
I take the slipperiness of some of these intuitions to be comparable to that found with relative
adverb ordering. Consequently, I will try to stick to phenomena for which the intuitions are
sharper.
(38) heldigvis, tydeligvis, paradoksalt nok, ærlig talt, muligens,
fortunately evidently paradoxically honestly possibly
kanskje, sannsynligvis, angivelig, neppe
maybe probably allegedly hardly
(39) [Norwegian]
a. Spiste Ståle (*ADV) hvetekakene?
ate S (*ADV) the-wheaties
“Did Stanley (*ADV) eat the wheaties?”
b. Hvis Ståle (*ADV) spiste hvetekakene,. . .
if S (*ADV) ate the-wheaties
c. (*ADV) Spis (*ADV) hvetekakene!
(*ADV) eat (*ADV) the-wheaties
d. Ståle spiste (ADV) ikke (*ADV) hvetekakene.
S ate (ADV) not (*ADV) the-wheaties
“Stanley (ADV) didn’t (*ADV) eat the wheaties.”
e. Jeg håper Ståle (*ADV) spiste hvetekakene.
I hope S (*ADV) ate the-wheaties
f. Ingen studenter (*ADV) spiste hvetekakene.
no students (*ADV) ate the-wheaties
The same adverbs can appear in degree clauses (40a) and under clause
embedders like think (40b) (compare (39e) with the verb hope), as well as, of
course, ordinary declaratives (40c).
(40) a. Ståle var så sulten at han (ADV) spiste hvetekaker.
S was so hungry that he (ADV) ate wheaties
b. Jeg tror Ståle (ADV) spiste hvetekakene.
I think S (ADV) ate the-wheaties
c. Ståle spiste (ADV) hvetekakene.
S ate (ADV) the-wheaties
It seems that SOA are excluded from the environments that license negative
polarity items (NPI) like English any, Greek tipota or Dutch ook maar iets.
Thus, SOA appear to be positive polarity items (PPI). I give Dutch examples
in (41) and Greek ones in (42).7
(41) [Dutch]
a. Heeft Jan ook maar iets gegeten?
has J anything eaten
7 Some Dutch speakers find (41c) and (41e) degraded. What is crucial here is that some
speakers accept them, so these environments are relevant for NPI licensing.
b. Als je ook maar iets hebt gegeten,. . .
if you anything have eaten
c. Eet ook maar iets!
eat anything
“Eat (something)!”
d. Er heeft geen enkele student ook maar iets gegeten.
there has no single student anything eaten
e. Ik hoop dat Jan ook maar iets heeft gegeten.
I hope that J anything has eaten
f. * Ik denk dat Jan ook maar iets heeft gegeten.
I think that J anything has eaten
(42) [Greek]
a. Efage o Yiannis tipota?
ate J anything
b. An o Yiannis efage tipota,. . .
if J ate anything
c. Fae tipota!
eat anything
d. Elpizo na efage o Yiannis tipota.
hope-1sg that ate J anything
e. * Nomizo pos o Yiannis efage tipota.
think-1sg that J ate anything
The emerging generalization is the following: whenever an NPI like ook
maar iets or tipota is licensed, speaker oriented adverbs are degraded. In other
words, SOA are PPIs.
2.3 Approaches to polarity items
2.3.1 Monotonicity and veridicality
NPIs like ook maar iets and tipota are usually classified as ‘weak’ NPIs. These
are licensed in all downwards entailing (DE) contexts, as well as certain other
contexts, including epistemic might. NPIs that are only licensed in a subset
of DE environments (viz. antiadditive environments) are called ‘strong’ NPIs.
A function f is antiadditive (AA) iff
f (a ∨ b) = f (a) ∧ f (b).
It is DE iff it is order reversing, i.e. whenever a ⊑ b, it holds that f (b) ⊑ f (a),
where “⊑” denotes semantic strength. It is antimorphic (AMo) iff it is both
AA and antimultiplicative. A function f is antimultiplicative iff
f (a ∧ b) = f (a) ∨ f (b).
Examples of AA operators include nobody, not, never. Classical negation is
antimorphic, i.e. it is both AA and antimultiplicative.8 DE operators that are
not AA include less than half of the N and rarely.
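These definitions can be checked mechanically on a small finite model. The Python sketch below is an illustration only, under assumptions of my own: a toy universe with four students and one non-student, predicates modelled as sets of individuals, propositions as sets of worlds, semantic strength as set inclusion (or implication between truth values), and negation evaluated at one fixed world. All function names are hypothetical.

```python
from itertools import combinations

# Toy universe (an assumption): four students and one non-student.
STUDENTS = frozenset({"s1", "s2", "s3", "s4"})
INDIVIDUALS = STUDENTS | {"t1"}
WORLDS = frozenset({0, 1, 2})


def powerset(universe):
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_de(f, domain):
    """Order reversing: whenever a is included in b, f(b) implies f(a)."""
    return all(f(a) or not f(b) for a in domain for b in domain if a <= b)


def is_antiadditive(f, domain):
    """f(a or b) = f(a) and f(b)."""
    return all(f(a | b) == (f(a) and f(b)) for a in domain for b in domain)


def is_antimultiplicative(f, domain):
    """f(a and b) = f(a) or f(b)."""
    return all(f(a & b) == (f(a) or f(b)) for a in domain for b in domain)


def no_student(pred):
    return len(STUDENTS & pred) == 0


def less_than_half_of_the_students(pred):
    return 2 * len(STUDENTS & pred) < len(STUDENTS)


def not_at_world_0(prop):
    """Classical negation of a proposition, evaluated at world 0."""
    return 0 not in prop


PREDICATES = powerset(INDIVIDUALS)
PROPOSITIONS = powerset(WORLDS)

OPERATORS = {
    "no student": (no_student, PREDICATES),
    "less than half of the students": (less_than_half_of_the_students, PREDICATES),
    "not": (not_at_world_0, PROPOSITIONS),
}

# For each operator: (DE, antiadditive, antimultiplicative).
results = {
    name: (is_de(f, dom), is_antiadditive(f, dom), is_antimultiplicative(f, dom))
    for name, (f, dom) in OPERATORS.items()
}
for name, (de, aa, am) in results.items():
    print(f"{name}: DE={de} AA={aa} antimultiplicative={am}")
```

In this model, no student comes out as AA but not antimultiplicative, less than half of the students as merely DE, and negation as antimorphic, in line with the classification just given.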
Giannakidou (1997) and Zwarts (1995) argue that there is a well defined class of
semantic environments, non-veridical (NV), in which weak NPIs are licensed.
Informally, a (propositional) operator Op is veridical if Op(ϕ) entails ϕ for
any ϕ; non-veridical if Op(ϕ) does not entail ϕ, and anti-veridical if Op(ϕ)
entails ¬ϕ. I give a more formal definition below. In general, we have that
all antiadditive environments are DE, and that all DE environments are NV,
although the converse is not generally true (cf. van der Wouden (1997); Zwarts
(1998); Bernardi (2002) for discussion and formal proofs):
AMo ⊆ AA ⊆ DE ⊆ NV
This is important, because it shows us that, whenever some expression
is (anti-)licensed in NV environments, it is (anti-)licensed in DE, AA and AMo
environments as well. Thus, an analysis in terms of NV-ness can incorporate
results from the analyses in terms of DE-ness.
Formally, veridicality can be defined as follows (Bernardi, 2002):9
Definition 2.1 Let f be a boolean function with a boolean argument.
1. $f \in D_t^{D_t}$. f is said to be
(a) veridical iff $[\![f(x)]\!] = 1 \models [\![x]\!] = 1$
(b) non-veridical iff $[\![f(x)]\!] = 1 \not\models [\![x]\!] = 1$
(c) anti-veridical iff $[\![f(x)]\!] = 1 \models [\![x]\!] = 0$
2. $f \in D_t^{D_{\langle a,t \rangle}}$ (i.e. f is a generalized quantifier). f is said to be
(a) veridical iff $[\![f(x)]\!] = 1 \models \exists y \in D_a ([\![x(y)]\!] = 1)$
(b) non-veridical iff $[\![f(x)]\!] = 1 \not\models \exists y \in D_a ([\![x(y)]\!] = 1)$
(c) anti-veridical iff $[\![f(x)]\!] = 1 \models \neg\exists y \in D_a ([\![x(y)]\!] = 1)$
Some examples are in order. Possibly is non-veridical because possibly(ϕ) does
not entail (“⊭”) ϕ. Obviously is veridical, because obviously(ϕ) ⊨ ϕ. I hope
that is non-veridical, because I hope that ϕ ⊭ ϕ. The same appears to hold
8 Antiadditivity and antimultiplicativity combine to yield the de Morgan Laws for classical
negation. See van der Wouden (1997) for discussion of these properties and their relevance
for different classes of polarity items.
9 Bernardi (2002) also generalizes the definition to n-ary boolean functions. I omit this
here, as it will not be relevant to our discussion.
for I think that, but Giannakidou argues that whenever it is true of John
that he thinks that ϕ, then ϕ must be true in his epistemic model. This does
not hold for hope. In this sense, then, think can be said to be weakly veridical.
Weak veridicality is thus veridicality relativized to epistemic states.
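Clause 1 of Definition 2.1 can be illustrated on a toy possible-worlds model. The Python sketch below rests on assumptions of my own: three worlds, a hypothetical reflexive accessibility relation, possibly as existential quantification over accessible worlds, and a universal operator over the same worlds as a stand-in for a veridical adverb like obviously. It is not meant as an analysis of these adverbs, only as a check of how the three notions come apart.

```python
from itertools import combinations

WORLDS = (0, 1, 2)
# Hypothetical reflexive accessibility relation (an assumption of this sketch).
ACCESS = {0: {0, 1}, 1: {1, 2}, 2: {2}}

# Propositions are sets of worlds.
PROPOSITIONS = [frozenset(c)
                for r in range(len(WORLDS) + 1)
                for c in combinations(WORLDS, r)]


def possibly(prop, w):
    """True at w iff prop holds in some world accessible from w."""
    return any(v in prop for v in ACCESS[w])


def obviously(prop, w):
    """Stand-in for a veridical operator: prop holds in all accessible worlds."""
    return all(v in prop for v in ACCESS[w])


def negation(prop, w):
    """True at w iff prop is false at w."""
    return w not in prop


def is_veridical(op):
    """Whenever op(p) is true at w, p is true at w."""
    return all(w in p for p in PROPOSITIONS for w in WORLDS if op(p, w))


def is_anti_veridical(op):
    """Whenever op(p) is true at w, p is false at w."""
    return all(w not in p for p in PROPOSITIONS for w in WORLDS if op(p, w))


def is_non_veridical(op):
    """op(p) does not entail p."""
    return not is_veridical(op)


for op in (possibly, obviously, negation):
    print(op.__name__, is_veridical(op), is_non_veridical(op), is_anti_veridical(op))
```

Because the accessibility relation is reflexive, the universal operator is veridical here; possibly is non-veridical without being anti-veridical; and negation is anti-veridical, hence also non-veridical.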
Giannakidou’s motivation for using the notion is, of course, that some NPIs
are apparently licensed outside of DE contexts. One example is adversative
predicates (43). These are argued by Linebarger (1987) not to be DE.
(43) a. I’m surprised he has any potatoes.
b. I’m glad he has any potatoes.
However, Kadmon and Landman (1993) argue rather convincingly that these
environments are, in fact, DE, once a certain contextual perspective is fixed
and we limit ourselves to cases where their factive presupposition is satisfied.
These authors argue along similar lines for antecedents of conditionals, which
have also been argued not to be DE (Linebarger (1987)).
Another case for NVness is questions, which cannot straightforwardly be
said to be DE. In van Rooy (2002) it is shown that questions are, in fact, DE
in their subject position under a Groenendijk and Stokhof (1984) style
semantics. He gives an analysis of polarity items in questions extending and refining
the results of Kadmon and Landman (1993) and Krifka (1995), without appealing
directly to NVness or DEness, however. It appears that the remaining cases
for NVness are existential modals (e.g. might, cf. (44a)), imperatives (44b)
and generics (44c). Generics were actually given a treatment in Kadmon and
Landman (1993), so one might suspect that the other two can be dealt with as
well.
(44) a. John might have bought anything.
b. Bring anything!
c. John buys anything he finds.
There are also some problems for the NV-based approach. One obvious one is
that languages differ with respect to which NV operators they allow to license
weak NPIs. For instance, in Greek, isos ‘possibly’ licenses tipota, but the
corresponding adverbs do not license Dutch ook maar iets or English any.
(45) a. O Yiannis isos efage tipota. [Greek]
J possibly ate anything
b. * John possibly ate anything.
c. * Perhaps John ate anything.
Similarly, we have seen that the Dutch and Greek verbs for hope license ook
maar iets and tipota. According to our informants, English hope does not
license any, as seen in (46).
(46) * I hope John has any potatoes.
Another problem is that English any and Dutch ook maar iets, although they
are licensed in other NV environments, are actually reported to be degraded
under a purely DE operator, i.e. one which is not AA. This result is slightly
alarming, because the NPIs in question are licensed in other (upwards entailing)
NV environments, as well as in AA environments, and, as we have seen,
AA ⊆ DE ⊆ NV.
(47) a. ? Less than three students have eaten anything.
b. ? Minder dan drie studenten hebben ook maar iets gegeten.
less than three students have anything eaten
This does not show that the class of licensers for these items cannot be
characterized as (a subset of) NV, but it does seem to show that something more
needs to be said.
Speaker orientation, NV and DE
Before moving on I would like to point out that the NV-based approach gives
us some predictions for adverb ordering as it stands. The idea is this. Suppose
that we rephrase the generalization made in the previous section in the following
way:
(48) SOA are excluded from X environments.
where X ranges over AMo, AA, DE and NV.
Let us now look at Cinque’s universal hierarchy with adverbs in their respective
specifiers (Cinque, 1999):
(49) [mood_speech-act frankly [mood_evaluative fortunately [mood_evidential allegedly
[mod_epistemic probably [T_past once [T_future then [mod_irrealis perhaps
[mod_necessity necessarily [mod_possibility possibly [asp_habitual usually
[asp_repetitive again [asp_freq(I) often [mod_volitional intentionally
[asp_celerative(I) quickly [T_anterior already [asp_terminative no longer
[asp_continuative still [asp_perfect(?) always [asp_retrospective just
[asp_proximative soon [asp_durative briefly
[asp_generic/progressive characteristically(?) [asp_prospective almost
[asp_sg.completive(I) completely [asp_pl.completive tutto [voice well
[asp_celerative(II) fast/early [asp_repetitive(II) again [asp_freq(II) often
[asp_sg.completive(II) completely ]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
Out of the adverbs in (49), the ones in (50a) are NV. Hence, if this is the
relevant property that SOA are “allergic to”, we expect them not to be able to
precede and outscope SOA. To the NV class, I would like to add the ones in
(50b).
(50) a. allegedly, probably, perhaps, possibly, usually, no longer
b. hardly, never, rarely, not
Let us begin with some relatively clear contrasts. We expect, for example,
that there should be a contrast between often and rarely with respect to their
ability to outscope SOA. (51a) is an example found on the internet. It seems to
contrast minimally with (51b) which should be a near semantic equivalent.10
We expect to find a similar contrast between always and never. (52a), also
from the internet, is taken from an advertisement for an internet game. Again
there is a fairly sharp contrast with (52b), which is what we expect.11
(51) a. His retaliations killed or endangered innocents and often
possibly had little effect in locating terrorists.
b. ?? His retaliations killed or endangered innocents and rarely
possibly had an effect in locating terrorists.
(52) a. This is a fun, free game where you’re always possibly a click
away from winning $1000!
b. ?? This is a fun, free game where you’re never possibly further
than a click away from winning $1000!
In (52a), one could in principle argue either that possibly directly modifies the
noun phrase one click away..., or that always directly modifies possibly or both.
This is rendered implausible by the following facts. The translation of (52a)
into Norwegian is equally grammatical (53a), and in (53a), muligens ‘possibly’
is separated from the noun phrase by the copula. Furthermore, the subject of
the clause can intervene between the two adverbs (53b). For (53b), in turn,
one could object that alltid ‘always’ could directly modify the subject, but this
is refuted by the ungrammaticality of (53c), where this supposed constituent
is moved to the V2-initial position. One could be tempted to say that the
position of the subject in (53b) is due to some PF-reordering mechanism. But
this cannot be the case (at least on available ideas of what PF-reordering could
do), since the subject takes surface scope with respect to the adverbs: the
sentence means that it is always the case that, for at least one player, it is
possible that that player is one click away from the prize. Finally, there is
a contrast, also in Norwegian between (53a) and (53d), where aldri ‘never’
is substituted for alltid ‘always’, along with some other changes to make the
example more pragmatically plausible.
(53) a. Dette er et morsomt, gratis spill hvor spillerne alltid
this is a fun free game where the-players always
muligens er et klikk fra å vinne $1000!
possibly are one click from to win $1000
10 I did find some examples of the string rarely possibly but these seem to involve
misspellings. Here is one example:
(i) The receiver requires a line of sight to the satellites that is relatively unobstructed by
foliage and buildings, and this is rarely possibly on such a compact campus.
11 I have deliberately tried to make the (b)-examples roughly synonymous with the (a)
examples. This is to prevent a contrastive interpretation of the (b) examples: Contrastivity
is known to suppress (positive) polarity effects (on this, see a.o. Szabolcsi (2002)).
b. Dette er et morsomt, gratis spill hvor alltid [en av
this is a fun free game where always one of
spillerne] muligens er et klikk fra å vinne $1000!
the-players possibly is one click from to win $1000
c. * Alltid en av spillerne er muligens et klikk fra å
always one of the-players is possibly one click from to
vinne $1000!
win $1000
d. ?? Dette er et morsomt, gratis spill hvor spillerne aldri
this is a fun free game where the-players never
muligens er lenger enn et klikk fra å vinne $1000!
possibly are further than one click from to win $1000
Another contrast of the same kind holds between the pair already and not yet.12
Consider (54a), which contrasts with (55a).
(54) a. In Wall Street, Enron was already allegedly going bankrupt.
b. In Wall Street, Enron was allegedly already going bankrupt.
(55) a. ?? (On Wall Street,) Enron was not yet allegedly going bankrupt.
b. On Wall Street, Enron was allegedly not yet going bankrupt.
(56a), from the internet, does not contrast sharply with (56b); my
informants find the latter example quite acceptable. This requires an explanation.
(56) a. Although Beckham’s absence will be felt England are still
probably the best national team in Europe.
b. (Beckham’s absence will be felt and) England are no longer
probably the best national team in Europe.
Consider the semantic contribution of an adverb like no longer. In
general, no-longer(p) is truth-conditionally equivalent to ¬p, with the added
presupposition that p was true at some previous time.13 This presupposition
makes it virtually impossible to exclude a contrastive reading of the modifiee.
Contrastivity, as we have seen, quite generally interferes with polarity effects
(Szabolcsi, 2002). The English word somewhat is a PPI, so (57a) is odd unless it
is read with heavy stress on doesn’t, thus contrasting it with the corresponding
positive assertion. Finally, somewhat can also appear under no longer, again
12 See Löbner (1999) for arguments that not yet is the negation of already, and that still
is the dual of already, whereas no longer is the negation of still. Still and already are both
PPIs, but they are allowed in a superset of the environments allowing for SOA. I return to
this point shortly.
13 no longer, still, already, not yet also come with aspectual requirements pertaining to the
modified predicate (Löbner, 1999).
probably due to the contrastivity inherent to the presuppositional content of
the adverb. In this way, the fact that (56b) is acceptable is actually expected on
the view that probably is a PPI. (55a) also improves considerably if interpreted
as an emphatic denial of (54a).
(57) a. ?? Stanley doesn’t like it somewhat.
b. Stanley no longer likes it somewhat.
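The truth conditions assumed above for no longer (the assertion ¬p plus the presupposition that p held at some earlier time) can be sketched as a three-valued function. The Python encoding below is an assumption of mine for illustration: time is a finite sequence of points, p is given as a list of truth values, and presupposition failure is represented as None. It ignores the aspectual requirements mentioned in footnote 13.

```python
from typing import Optional, Sequence


def no_longer(history: Sequence[bool], t: int) -> Optional[bool]:
    """Value of no-longer(p) at time t, given p's truth value at each time.

    Returns None when the presupposition (p held at some earlier time) fails.
    """
    if not any(history[:t]):
        return None           # presupposition failure
    return not history[t]     # asserted content: not-p at t


# Hypothetical histories for 'Stanley likes it', evaluated at the last time point.
used_to_like = no_longer([True, True, False], 2)
never_liked = no_longer([False, False, False], 2)
still_likes = no_longer([True, True, True], 2)
print(used_to_like, never_liked, still_likes)
```

The presupposed earlier state in which p held is what supplies the contrast with the positive assertion: no-longer(p) is only defined against a background in which p was once true.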
If examples like (51a, 52a, 54a, 56a) are not very frequent, there are
probably pragmatic reasons for this. In any case, we are not trying to account for
the statistical distribution of expressions, and I think it is fair to say that these
examples are grammatical.
Two SOA in the same sentence rarely give good results. Consider, for
instance (58).
(58) a. ?? Maybe Stanley probably ate his wheaties.
b. ?? Probably, Stanley possibly ate his wheaties.
We might take this to indicate that these adverbs are excluded from NV
environments, rather than merely DE environments. However, I do not think the
oddness of these sentences is due to the status of the adverbs as PPIs. Rather,
I think it is due to the fact that they are all epistemic adverbs of the same sort,
and one cannot epistemically modify the same sentence twice, if the modifiers
are of the same epistemic kind. One reason to think that this is true is the
following. Necessarily is also epistemic, but it certainly isn’t a PPI as it occurs
felicitously under negation (59a). Nevertheless, it is bad under the scope of
possibly (59b).
(59) a. Stanley didn’t necessarily eat his wheaties.
b. ?? Possibly, Stanley necessarily ate his wheaties.
Another reason not to say that SOA are excluded from NV environments is
that allegedly, which is NV, seems to be able to outscope probably and possibly
(60).
(60) Allegedly, Enron was probably/possibly going bankrupt.
Thus it does not seem to be the case that probably/possibly is excluded from NV
environments. Speech-act adverbs like briefly might seem to be excluded from
all NV environments, since these are degraded under NV epistemic adverbs.
For example, (61b) does not have a reading according to which “it is possible
that I am brief in saying that S ate his wheaties.”
(61) a. Briefly, Stanley possibly ate the wheaties.
b. ?? Possibly, Stanley briefly ate the wheaties.
This might also be treated as a syntactic binding effect. Thus, briefly is
essentially a manner adverb containing a variable which can be syntactically bound.
If it occurs “too high” in the clause for the subject to bind it, and thus to get a
“subject” oriented or standard “manner” reading, it becomes speaker oriented
by default.14 This is supported by the fact that even frankly becomes subject-
oriented in certain cases. (62a) can only mean that Stanley is frank, not the
speaker. (62b) is apparently ambiguous between the two readings, while (62c)
seems to prefer the speaker oriented reading.
(62) a. Maybe Stanley frankly doesn’t like fish cakes.
b. Stanley frankly doesn’t like fish cakes.
c. Frankly, Stanley doesn’t like fish cakes.
Cinque (to app.) points out that (63a) is degraded, while if one of the
adverbs is realized as an adjective (63b), the example becomes good. Cinque
does not pose this as a problem for the present account, but, since surely is
not NV (or DE), it might be thought to be one. However, the two adverbs
surely and probably do not seem to be able to cooccur in any order, as seen by
comparing (63a) to (63c), so the relative order of the adverbs cannot be the
source of the oddness of these examples.
(63) a. ?? Stanley surely probably ate his wheaties.
b. It is surely probable that Stanley ate his wheaties.
c. ?? Stanley probably surely ate his wheaties.
This behavior might be related to the observation made in Cinque (1999) that
cooccurrence of two adverbs ending in -ly is quite regularly degraded.15 I do
not have an account for this phenomenon, but I would like to point out that it
is equally problematic for all existing accounts of adverb ordering I am aware
of. Examples like (63a) seem to improve if the two adverbs are not adjacent. In
this case the relative ordering does seem to matter, since (64b) is significantly
worse than (64a). But in this case, the adjectival version is also degraded, i.e.
(64c-64d) are also odd, thus contrasting with (63b).
(64) a. ? Surely, Stanley probably ate his wheaties.
b. ?? Probably, Stanley surely ate his wheaties.
c. ?? It is probably sure that Stanley ate his wheaties.
d. ?? I am probably sure that Stanley ate his wheaties.
e. John is probably sure that Stanley ate his wheaties.
What seems to go wrong with the examples where probably outscopes surely/sure
is that it makes little or no sense for the speaker to assert that s/he finds it likely
14 For a very similar point of view concerning the manner/subject-oriented distinction, see
Ernst (2000, 2001).
15 Similar remarks hold of German adverbs ending in -weise, Mainland Scandinavian ones
ending in -vis, or Italian ones ending in -mente.
that s/he is sure that Stanley ate his wheaties. In other words, (64e) is good
because the adverb probably refers to the speaker’s epistemic state, whereas sure
refers to John’s epistemic state.
In sum, it seems that SOA are excluded from DE environments, but
generally allowed in NV environments.
A note on ‘phase quantifiers’ and frequency adverbs
We have noted that yet (in its temporal use) and any longer are NPIs. This
obviously limits their distribution. It has also been argued that already and
still are PPIs. But they do not appear to have the same distribution as adverbs
like possibly, so I will outline my answer to why this is so here. The answer is
that the (positive) phase quantifiers16 are not excluded from DE environments,
but merely from antiadditive (AA) environments. The latter, as we have seen,
form a subset of the former. Few students and no students are DE, but only
the latter quantifier is AA. The definition of AA is repeated here.
f (a ∨ b) = f (a) ∧ f (b).
Applying this to the quantifiers in question, we see that (65a) is equivalent to
(65b), while (66a) is not equivalent to (66b).17
(65) a. No students jumped or danced. ⇐⇒
b. No students jumped and no students danced.
This equivalence goes through, because, if the set of students who danced is
empty, and the set of students who jumped is empty, then the set of students
who did one or the other must also be empty and vice versa.
(66) a. Few students jumped or danced. ⇎
b. Few students jumped and few students danced.
This equivalence does not go through, because the fact that the set of students
who jumped has low cardinality, and the set of students who danced has low
cardinality does not entail that the union (disjunction) of these two sets has low
cardinality. To see this, suppose that “few” is contextually resolved to mean
“less than n”. Let S denote the set of students, D the set of dancers, and J
the set of jumpers. For (66b) to entail (66a), we would have to have that
|S ∩ J| < n ∧ |S ∩ D| < n
entails that
|S ∩ (J ∪ D)| < n
16 This term was introduced (as far as I know) by Löbner.
17 In the a-examples, only the readings where the quantifier outscopes the disjunction are
relevant. For some languages this is not possible, because in these languages, the disjunction
is itself a PPI (Szabolcsi, 2001). As Szabolcsi points out, this does not appear to hold for
English or, however.
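which it clearly doesn’t: if n = 2 and exactly one student jumped while a different
student danced, both conjuncts hold, yet |S ∩ (J ∪ D)| = 2. Now consider how already/still behave with respect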
to these quantifiers.
(67) a. Few students are still/already here.
b. ?? No students are still/already here.
c. ?? The students aren’t still/already here.
The adverbs are apparently degraded under AA (hence under AMo) operators,
but good under merely DE ones. This parallels the fact that still/already are
substantially better within the scope of rarely (which is DE) than they are
within the scope of never (which is AA).
(68) a. At 9 AM/PM, Stanley is rarely still/already tired.
b. At 9 AM/PM, Stanley is never still/already tired.
Thus, phase quantifiers like already/still appear to be excluded from AA
environments. This allows them more freedom than SOA, because, as we have
seen AA⊆DE, and SOA are excluded from all DE environments. In particular,
(53-54) demonstrates that already/still enjoy considerable freedom.
We have seen that upwards entailing (UE) frequency adverbs, like always18
and often have a very free distribution. Always is degraded when it outscopes
negation. As discussed in Beghelli and Stowell (1997), this is a quite general
property of universal quantifiers. Thus, (69a) is an odd sentence unless it is read
with heavy stress on everybody, in which case the universal takes scope below
the negation, not the other way around. No such oddness arises with quantifiers
like many people (69b). An entirely parallel contrast obtains between (69c)
and (69d).
(69) a. ?? Everybody didn’t snore.
b. Many people didn’t snore.
c. ?? John always didn’t snore.
d. John often didn’t snore.
Apart from this, it seems that frequency adverbs have a very free distribution.
If we have 40 adverbs, there are 40 × 40 = 1600 ordered pairs that we would
have to consider in order to exhaust their ordering possibilities, that is, if we
limit ourselves to pairs of adverbs. If we do not limit ourselves to pairs, but
exclude repetitions of the same adverb, the number is 40! (= 1 × 2 × ⋯ × 40 ≈
10^48), a truly astronomical number. Hence, I am not going to test all possible
orderings here. I hope to have made plausible the idea that limitations on
adverb distribution can be treated, to a large extent, as a polarity phenomenon.
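The two counts just mentioned are easy to verify. The short Python check below assumes nothing beyond the figure of 40 adverbs used in the text; the variable names are mine.

```python
from math import factorial

N_ADVERBS = 40  # the figure used in the text

ordered_pairs = N_ADVERBS ** 2            # ordered pairs, repetitions included
full_orderings = factorial(N_ADVERBS)     # sequences using each adverb exactly once

print(ordered_pairs)
print(f"{full_orderings:.2e}")
```

The second figure comes out at roughly 8 × 10^47, i.e. on the order of 10^48.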
18 Always, being a universal quantifier, is DE in its restriction and UE in its scope. See
Beaver and Clark (2002) for arguments that the restriction of always is given by the context,
and not by the background, or unfocussed part of the clause. In other words, always is not
focus-sensitive. Thus a clause (constituent) modified by always will be in a UE environment.
2.3.2 Why DE?
Summing up the findings in this section, we can refine our generalization about
SOA as in (70). Why should this generalization hold? This question becomes
rather acute when we combine it with the observation made in the introduction,
that the adjectival counterparts of these adverbs are not excluded from DE
environments. Compare (71a) to (71b):
(70) SOA are excluded from DE environments.
(71) a. * Jospin didn’t possibly win.
b. It is not possible that Jospin won.
If we want to explain adverb distribution from the lexical semantics of the dif-
ferent adverbs, we now have to look for some independently motivated semantic
difference between possible and possibly which we can take to be the source of
the contrast in (71). We will now see that there is a difference between these
two expressions. In the rest of the chapter, I will try to establish that this
difference is indeed the culprit. Consider (72).
(72) a. It’s possible that Le Pen will win. . .
b. # Le Pen will possibly win. . .
c. # Perhaps Le Pen will win. . .
d. . . . even though he certainly won’t.
(72a) followed by (72d) (uttered by the same speaker) appears to make up a
consistent statement. (72b)-(72c), on the other hand, cannot consistently be
continued with (72d). Restricting our attention to possible and possibly, we see
that there must be a truth-conditional difference between the two. Impression-
istically, the difference is this: (72a) simply states that there is some possibility,
however remote and implausible, that Le Pen will win. (72b) does not permit
such remoteness: It has it that there is a realistic chance for Le Pen to win. In
other words, (72b) constitutes a stronger statement than (72a).
2.3.3 Widen up and strengthen
According to Kadmon and Landman (1993); Krifka (1995); Lahiri (1997);
Chierchia (2001), the ungrammaticality of (73a) is a pragmasemantic phe-
nomenon. I follow the implementation in Chierchia (2001) here. The idea
is essentially that any is synonymous with the indefinite article a in the sense
that they are both existential quantifiers. Thus the meaning of both (73a,73b)
could be represented as (73c), where D subscripted to ∃ is a contextual re-
striction, a quantificational domain (Westerståhl, 1988) and ∃ represents an
existential generalized quantifier, i.e. λXλY. X ∩ Y ≠ ∅.
(73) a. * John has any potato.
b. John has a potato.
c. ∃_D(potato)(λx.has(j, x))
The difference between the two is argued to be that (73a) involves expansion of
D to a different quantificational domain D′ ⊇ D. In other words, any invites
us to consider more potatoes; it signals reduced acceptance of exceptions. Thus
a more accurate representation of the ungrammatical sentence (73a) would be
(74) where g is a function from sets to sets, such that ∀X(X ⊆ g(X)).
(74) ∃_{g(D)}(potato)(λx.has(j, x))
So far, this does not explain why (73a) is ungrammatical. The reason for this
is that domain expansion is subject to a strengthening condition: The result of
domain expansion must entail the same proposition without domain expansion.
The fact that there is a potato in some large set such that John has that potato
does not entail that he has a potato in a subset of that large set. In order to
implement strengthening compositionally, Chierchia defines an operator Op
that universally closes the function variable g as follows:
Definition 2.2 Let ∆ be a contextually determined set of domain expansions,
let ϕ be a sentential constituent containing a free occurrence of g, and let ϕ′ be ϕ
with all free occurrences of g removed. Then
Op(ϕ) = ∀g ∈ ∆[ϕ], if ∀g ∈ ∆[ϕ] entails ϕ′,
else = undefined.
Op does the following: It takes a formula ϕ containing a free occurrence
of a domain expansion variable g and universally quantifies g just in case the
result of this whole operation entails ϕ without domain expansion. Otherwise
it is undefined. Consider the following examples.
(75) John doesn’t have any potatoes.
Derivation 2.1
¬∃_{g(D)}(potato)(λx.has(j, x))
apply Op ▷
Op(¬∃_{g(D)}(potato)(λx.has(j, x)))
check strengthening ▷
∀g ∈ ∆[¬∃_{g(D)}(potato)(λx.has(j, x))] ⊨ ¬∃_D(potato)(λx.has(j, x))
▷
∀g ∈ ∆[¬∃_{g(D)}(potato)(λx.has(j, x))]
Derivation 2.2
∃_{g(D)}(potato)(λx.has(j, x))
apply Op ▷
Op(∃_{g(D)}(potato)(λx.has(j, x)))
check strengthening ▷
∀g ∈ ∆[∃_{g(D)}(potato)(λx.has(j, x))] ⊭ ∃_D(potato)(λx.has(j, x))
▷
undefined.
(76) * John has any potatoes.
In words, the bottom line of Derivation (2.1) says that, for any domain expan-
sion that we are willing to entertain, there is no potato in the expanded domain
that John has. This clearly entails that there is no potato in the (unexpanded)
quantificational domain that John has, so strengthening is satisfied. In Deriva-
tion (2.2), strengthening is not satisfied, hence application of Op is undefined.
In Derivation (2.2), we actually have that the “unexpanded” alternative
entails the result of application of Op, i.e. we have that
∃_D(potato)(λx.has(j, x)) ⊨ ∀g ∈ ∆[∃_{g(D)}(potato)(λx.has(j, x))]
This will always happen if domain expansion applies in an upwards entailing
(UE) environment. This is because, if a (small) set A has
a non-empty intersection with another set B, then every superset (expansion)
A′ of A will also have a non-empty intersection with B. Under DE operators,
this entailment relation by definition reverses, i.e. the unifying property of DE
operators is precisely reversal of entailment relations. Hence, domain expansion
is allowed exactly when it occurs in a DE environment. In van Rooy (2002) it is
argued that one can derive the stipulative part of this proposal (i.e. that domain
expansion requires strengthening) from very general and plausible assumptions
about the pragmatics of statements, ultimately related to the Gricean notion
of relevance. The latter notion, he derives from a decision theoretic notion of
utility. For reasons of space, I cannot go into the details of this here.
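The contrast between Derivations (2.1) and (2.2) can be reproduced by brute force in a small finite model. The sketch below is an illustration under stated assumptions, not Chierchia's implementation: the three potatoes, the domain D and the two expanded domains in DELTA are hypothetical, each expansion g is represented only by its value g(D), and entailment is checked over every possible set of potatoes John might have.

```python
from itertools import combinations

UNIVERSE = ["p1", "p2", "p3"]                            # hypothetical potatoes
D = frozenset({"p1"})                                    # contextual domain
DELTA = [frozenset({"p1", "p2"}), frozenset(UNIVERSE)]   # assumed values g(D) ⊇ D

def models():
    """Every possible set of potatoes that John has."""
    for r in range(len(UNIVERSE) + 1):
        for combo in combinations(UNIVERSE, r):
            yield frozenset(combo)

def exists_in(domain):
    """∃_domain(potato)(λx.has(j, x)), as a function from models to truth values."""
    return lambda has: bool(domain & has)

def neg(prop):
    return lambda has: not prop(has)

def entails(p, q):
    """p entails q iff q holds in every model where p holds."""
    return all(q(m) for m in models() if p(m))

def op(phi_of_domain):
    """Universally close g; return None (undefined) if strengthening fails."""
    closed = lambda has: all(phi_of_domain(g_d)(has) for g_d in DELTA)
    plain = phi_of_domain(D)
    return closed if entails(closed, plain) else None

negative = op(lambda dom: neg(exists_in(dom)))   # pattern of Derivation 2.1
positive = op(lambda dom: exists_in(dom))        # pattern of Derivation 2.2

print(negative is not None)   # True: Op is defined under negation
print(positive is not None)   # False: Op is undefined in the UE context
```

In the positive case the entailment runs the other way, from the unexpanded statement to the universally closed one, which is the UE pattern described in the text.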
This account explains why any only occurs in monotone decreasing (entail-
ment reversing) contexts from its lexical semantics. See e.g. Krifka (1995) for
a congenial analysis of several other polarity items. In the next section, I will
show how the same kind of analysis can be extended to account for the behavior
of maybe and possibly. The idea is essentially that these adverbs are associated
with domain shrinkage. This explains their contrast with the adjective possible
noted in the previous section. It also derives their status as weak PPIs.
2.4 possibly: Shrink, but don’t weaken!
2.4.1 Modal bases
We have seen that there is an intuitive sense in which statements of the form
possibly(ϕ) are stronger than statements of the form it is possible that(ϕ). By
this we know that, if Γ is a DE operator, Γ(possibly(ϕ)) should be a weaker
statement than Γ(it is possible that(ϕ)). If the use of possibly involves some
operation that is subject to a strengthening condition, we can explain the
distribution of this adverb in a manner entirely parallel to the explanation of
the distribution of any. In order to achieve this, we need to look at what
possibly means.
As a start, I take an information state (belief state) to be a set of possible
worlds K, the set of worlds compatible with what we take to be true. If K = W ,
the set of all possible worlds, we are in a state of total ignorance: everything is
possible. Upon learning a new proposition p, the new information state K′ is
given by K ∩ p. Suppose p is a contradiction. Then K ∩ p = ∅, i.e. the absurd
information state. If |K| = 1, i.e. it only contains one world, we are in a state
of total information. All this is entirely standard. What is the result, given K,
of learning that something is possible, i.e. possibly(ϕ)? According to a standard
view (Groenendijk et al., 1996), the result should be K if K ∩ ϕ ≠ ∅, otherwise
it should be ∅. Groenendijk et al. (1996) themselves point out that this leads
to a situation where one can never really learn anything new from possibility
statements. Like Groenendijk et al. (1996), I choose to live with this problem
for the purposes of this chapter. I add that epistemic possibilities work on
modal bases (Kratzer, 1977, 1991). In the default case, I take the modal base
to be W. Thus, the meaning of possible is as shown in (77):
(77) [[possible]]^K = λp[p ∩ W ∩ K ≠ ∅]
Given that K is always a subset of W, this is equivalent to λp[p ∩ K ≠ ∅].
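The update operations just described are simple set operations, and a minimal Python sketch may make them concrete. The three worlds and the function names are hypothetical; propositions and information states are both modelled as sets of worlds.

```python
W = frozenset({"w1", "w2", "w3"})   # hypothetical set of all possible worlds

def update(K, p):
    """Learning p: the new state is K ∩ p."""
    return K & p

def possibility_test(K, phi):
    """Possibility statement as a test: K survives iff K ∩ phi is non-empty."""
    return K if K & phi else frozenset()

def possible(K, p):
    """(77): p ∩ W ∩ K ≠ ∅."""
    return bool(p & W & K)

ignorance = W                                   # K = W: everything is possible
p = frozenset({"w1", "w2"})
K1 = update(ignorance, p)                       # after learning p
print(sorted(K1))                               # ['w1', 'w2']
print(sorted(possibility_test(K1, frozenset({"w2"}))))   # ['w1', 'w2'], nothing new is learned
print(sorted(possibility_test(K1, frozenset({"w3"}))))   # [], the absurd state
```

Because K is a subset of W, `possible(K, p)` always agrees with the simpler test `bool(p & K)`.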
In order to derive the strengthening effect observed with possibly, I take this
expression to come with domain shrinkage of the modal base. In (78), g is a
variable over domain shrinks, i.e. for all X, g(X) ⊆ X.
(78) [[possibly]]^K = λpλg[p ∩ g(W) ∩ K ≠ ∅]
Just as with the domain expansion associated with any, I assume that our func-
tion variable must be universally closed. To this effect, we can use Chierchia’s
Op, but now with respect to domain shrinking functions, and with respect
to domains consisting of possible worlds. Thus, possibly applied to Le Pen
will win returns Derivation (2.3). ∆ now contains various ways in which we
can constrain our modal base in various contexts. For instance, if we are discussing
the French election, we can shrink W by intersecting it with plausible
assumptions about the behavior of the French electorate. The bottom line of
Derivation (2.3) says that the intersection of “win” with such a modal base
has a non-empty intersection with K, our information state. Clearly, this can
Derivation 2.3
λg[win ∩ g(W) ∩ K ≠ ∅]
apply Op ▷
Op(λg[win ∩ g(W) ∩ K ≠ ∅])
check strengthening ▷
∀g ∈ ∆[win ∩ g(W) ∩ K ≠ ∅] ⊨ [win ∩ W ∩ K ≠ ∅]
▷
∀g ∈ ∆[win ∩ g(W) ∩ K ≠ ∅]
only happen if “win” has a non-empty intersection with K to begin with, so
strengthening is satisfied.
This already derives the fact that possibly cannot occur in DE contexts.
Before I move on to demonstrate this, however, I need to make sure that this
setup actually derives the observed difference between possible and possibly, i.e.
possible(win) is consistent with certainly(¬win) while possibly(win) is not. The
standard semantics for certainly would be as in (79).
(79) [[certainly]]^K = λp[K ⊆ p]
By this, it is possible that Le Pen will win, even though he certainly won’t comes
out as in (80), which is unfortunately inconsistent.
(80) [win ∩ K ≠ ∅] ∧ [K ⊆ ¬win]
A solution that suggests itself is to make use of modal bases with this adverb as
well. Thus, we could try to assume that certainly also comes equipped with a
W -shrinking variable. There are two options, according to whether the modal
base intersects with K or p. I consider them in turn.
(81) [[certainly]]^K = λpλg[g(W) ∩ K ⊆ p]
Applying this to ¬win, we get Derivation (2.4).
The problem is that if we have that A ⊆ B, it does not follow from A ⊆ C
that B ⊆ C. In other words, our function variable g is already in a DE
environment in Derivation (2.4). In fact, given our meaning for certainly, it
should behave as a negative polarity item. But this is plainly wrong, given that
(82) is quite impeccable.
(82) Chirac will certainly win.
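The difference between the strengthening checks for possibly in (78) and for certainly in (81) can be verified mechanically in a toy model. The following is a sketch under assumptions: the three worlds, the proposition WIN and the two shrunk modal bases are hypothetical, each shrink g is represented by its value g(W), and entailment is checked over all information states K ⊆ W.

```python
from itertools import combinations

W = frozenset({"w1", "w2", "w3"})       # hypothetical worlds
WIN = frozenset({"w1", "w3"})           # worlds where the candidate wins
NOT_WIN = W - WIN
SHRUNK_BASES = [frozenset({"w1", "w2"}), frozenset({"w1"})]   # assumed values g(W) ⊆ W

def belief_states():
    """All candidate information states K ⊆ W."""
    worlds = sorted(W)
    for r in range(len(worlds) + 1):
        for combo in combinations(worlds, r):
            yield frozenset(combo)

def entails(premise, conclusion):
    return all(conclusion(K) for K in belief_states() if premise(K))

# (78) closed over the shrunk bases, versus the same statement on W itself
possibly_closed = lambda K: all(bool(WIN & g_w & K) for g_w in SHRUNK_BASES)
possibly_plain = lambda K: bool(WIN & W & K)

# (81) applied to ¬win and closed over the shrunk bases, versus (79)
certainly_closed = lambda K: all((g_w & K) <= NOT_WIN for g_w in SHRUNK_BASES)
certainly_plain = lambda K: K <= NOT_WIN

print(entails(possibly_closed, possibly_plain))     # True: strengthening holds
print(entails(certainly_closed, certainly_plain))   # False: strengthening fails
```

The failing case is witnessed by K = {w3}: the shrunk bases exclude w3, so the closed statement holds vacuously although K is not a subset of ¬win.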
The other option would be to intersect our modal base with p rather than with
K. This yields (83).
(83) [[certainly]]^K = λpλg[K ⊆ p ∩ g(W)]
Derivation 2.4
λg[g(W) ∩ K ⊆ ¬win]
apply Op ▷
Op(λg[g(W) ∩ K ⊆ ¬win])
check strengthening ▷
∀g ∈ ∆[g(W) ∩ K ⊆ ¬win] ⊭ [K ⊆ ¬win]
▷
undefined.
This would make (84a) a stronger statement than (84b). This does not cor-
respond to our intuitions, I think. If anything, asserting (84b) is a stronger
statement than the assertion of (84a). (84a) seems to indicate that there is
some (however small) room for doubt.
(84) a. Chirac has certainly won.
b. Chirac has won.
Be that as it may, the meaning for certainly in (83) does not help us. (85),
which is the meaning our example with it is possible that... would get now, is
still inconsistent.
(85) [win ∩ K ≠ ∅] ∧ ∀g ∈ ∆[K ⊆ ¬win ∩ g(W)]
2.4.2 Entrenched beliefs
The discussion in the previous subsection seems to indicate that we need some-
thing different. I follow van Rooy (2001) in assuming that an information
state should be thought of as an ordering relation on the set of propositions
ϕ ⊆ W . Intuitively, the ordering relation tells us how plausible a proposition
is, given what we know. To do this, we need a way to compare the plausibility
of worlds, given what we know. The definitions below are taken from van Rooy
(2001), and ultimately from Harper (1976). They implement what is known as
“Harper’s Principle” for determining a similarity relation among worlds:
(86) Harper’s Principle (HP)
Only propositions decided by K should count in determining com-
parative similarity relative to K.
Definition 2.3 Let S = ℘(W), let x, y ∈ W and let K be a belief state. Then
S^x_y(K) = {p ∈ S | (K ⊆ p or K ⊆ ¬p) and
((x ∈ p and y ∉ p) or (x ∉ p and y ∈ p))}
S^x_y(K) gives us the set of propositions which are decided by K and on whose
truth value the two worlds x and y disagree. We can now say that a world x is
closer to what we take to be true, or more plausible, than another world y if x
disagrees less with what we know than y.
Definition 2.4 Let x, y be worlds. x ⪯_K y iff for all w ∈ K:
|S^w_x(K)| ≤ |S^w_y(K)|
The relation ⪯_K gives us a system of spheres of worlds of the kind proposed
by Lewis and Stalnaker in the early seventies to account for conditionals and
counterfactuals. We now need to know what set of worlds to associate with
propositions. This is done by the following selection function. It tells us to
only consider the most plausible worlds in p.
Definition 2.5 C_K(p) = {v ∈ p | ∀u ∈ p : v ⪯_K u}
Given that we can count the number of propositions in S^x_y(K), we can devise
a quantitative measure of plausibility.
Definition 2.6 Let k(w/p) be the implausibility of a world w, after updating
K with p. Let k(p/q) be the implausibility of a proposition p after updating K
with q.
worlds: k(w/p) = |S^w_u(C_K(p))|, for any world u in C_K(p).
propositions: k(q/p) = min{k(w/p) | w ∈ q}
A proposition p is accepted in K iff k(¬p/⊤) > 0. p is more implausible
than q w.r.t. K if k(p/⊤) > k(q/⊤). Let f(p) = k(¬p/⊤). Then f measures
the level of plausibility, or epistemic entrenchment of p, given K. We follow
van Rooy (2001) in defining an information state (belief state) K as just such
an entrenchment relation on a set of possible worlds.
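Definitions 2.3 to 2.6 can be tried out on a very small model. The Python sketch below is one possible reading of those definitions, not van Rooy's own formulation: the three worlds and the belief state K are hypothetical, ⊤ is taken to be W, the "any world" clause of Definition 2.6 is resolved by picking the smallest selected world, and an empty proposition is assigned infinite implausibility. Since the disagreement set is symmetric in its two worlds, the order of the two world indices does not affect the result.

```python
from itertools import combinations

W = frozenset({0, 1, 2})          # hypothetical worlds
K = frozenset({0, 1})             # hypothetical belief state

def powerset(worlds):
    items = sorted(worlds)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

S = powerset(W)                   # Definition 2.3: S is the power set of W

def disagreements(x, y, state):
    """Propositions decided by `state` on which worlds x and y differ."""
    return [p for p in S
            if (state <= p or state <= W - p) and ((x in p) != (y in p))]

def at_least_as_plausible(x, y, state):
    """Definition 2.4: x disagrees with each state-world no more than y does."""
    return all(len(disagreements(x, w, state)) <= len(disagreements(y, w, state))
               for w in state)

def select(state, p):
    """Definition 2.5: the most plausible worlds in p."""
    return frozenset(v for v in p
                     if all(at_least_as_plausible(v, u, state) for u in p))

def k_world(w, p, state):
    """Definition 2.6 (worlds): implausibility of w after updating with p."""
    chosen = select(state, p)
    u = min(chosen)               # assumes the selected set is non-empty
    return len(disagreements(u, w, chosen))

def k_prop(q, p, state):
    """Definition 2.6 (propositions): the least implausible world in q."""
    return min((k_world(w, p, state) for w in q), default=float("inf"))

def entrenchment(p, state):
    """f(p) = k(¬p/⊤), with ⊤ taken to be W."""
    return k_prop(W - p, W, state)

print([k_world(w, W, K) for w in sorted(W)])   # [0, 0, 2]
print(entrenchment(frozenset({0, 1}), K))      # 2
print(entrenchment(frozenset({0}), K))         # 0
```

In this model a proposition has positive entrenchment exactly when it contains every world in K, which matches the acceptance condition stated above.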
I will now use the entrenchment relation f to give a semantics for epistemically
modal statements. Van Rooy (2001) proposes that what he calls ‘evidential’
attitude reports, like be certain that, be sure that, be convinced that, can be
treated within the following schema, where f^a_w is the entrenchment function
associated with a in w:
(87) [[a α that p]]^w = 1 only if f^a_w(p) = high
“high” is a contextually determined number. I take it to be determined in
the following way: One picks some strongly believed proposition p. high =
f^a_w(p) and low = f^a_w(¬p). Given that f^a_w(p) = k^a_w(¬p/⊤) by definition, this
setup ensures that high and low are inversely proportional. In order to implement
our treatment of positive polarity, I make use of the following epistemic
accessibility relation (E):
(88) E(a, w) = {w′ ∈ W : k^a_w(w′) ≤ n}
where n is either high or low.
I write E↑(a, w) when n is to be understood as high and E↓(a, w) when it is
to be understood as low. Furthermore, I normally skip specification of a and
w. For speaker oriented adverbs, a is always the speaker¹⁹ and w is always the
world of evaluation. (88) either gives us the set of worlds whose implausibility
is at most low (i.e. E↓) or the set of worlds whose implausibility is at most
high (i.e. E↑). Our meaning for certainly can now be represented as follows:
(89) [[certainly]]^K = λp[p ∩ E↓ ≠ ∅]
Given that C_K(p) = {v ∈ p | ∀u ∈ p : v ⪯_K u}, i.e. the most plausible
worlds in p, (89) actually ensures that if certainly(p) is true, then f(p) ≥ high.
This seems to me the right result. It derives the fact that certainly(p) leaves
some little room for doubt; it says that p is very plausible, given K. The
adjective possible can now be given the following representation:
(90) [[possible]]^K = λp[p ∩ E↑ ≠ ∅]
Again, this ensures that the plausibility of p is at least low. Our problematic
example, it is possible that Le Pen will win, even though he certainly won’t,
now comes out as follows:
(91) [win ∩ E↑ ≠ ∅] ∧ [¬win ∩ E↓ ≠ ∅]
This is consistent, given the way we determine high and low. More specifically,
(91) is true just in case f(win) = low and f(¬win) = high. As the reader
may have anticipated, I will now derive the meaning of possibly from that of
possible by applying domain shrinkage to E↑. Such domain shrinkage is still
thought to be contextually restricted, but we define a constraint on possible
domain shrinks:
Definition 2.7 Let ∇ be the set of domain shrinks and let n be any natural
number such that n < high. Then, in context c,
∇ ⊆ {g_n | g_n(E↑) = {w′ | k^a_w(w′) ≤ n}}.
Applying a domain shrink takes us to a domain of worlds whose plausibilities
are strictly greater than low. The meaning of possibly is the following, where,
as before, g has to be universally closed by application of Op.
(92) [[possibly]]^K = λpλg[p ∩ g(E↑) ≠ ∅]
Op is defined as before, but now operating on g ∈ ∇. The derivation of (93)
is given in Derivation (2.5).
(93) Le Pen will possibly win.
19 Except, potentially when they occur embedded under an attitude verb like believe. Sim-
ilarly for w. I ignore this complication here.
Derivation 2.5
λg[win ∩ g(E↑) ≠ ∅]
apply Op ▷
Op(λg[win ∩ g(E↑) ≠ ∅])
check strengthening ▷
∀g ∈ ∇[win ∩ g(E↑) ≠ ∅] ⊨ [win ∩ E↑ ≠ ∅]
▷
∀g ∈ ∇[win ∩ g(E↑) ≠ ∅]
Strengthening is satisfied. The bottom line of Derivation (2.5) states (indi-
rectly) that the plausibility of “win” must be strictly greater than low. This
clearly entails that it must be greater than or equal to low.
We can now test whether (94a) is inconsistent, which is what we want.
(94b) is the meaning assigned to this proposition.
(94) a. # Le Pen will possibly win, even though he certainly won’t.
b. ∀g ∈ ∇[win ∩ g(E↑) ≠ ∅] ∧ [¬win ∩ E↓ ≠ ∅]
This is clearly inconsistent. The first conjunct states that the plausibility of
“win” is strictly greater than low whereas the second conjunct has it that the
plausibility of “¬win” is greater than or equal to high. But for any p we have
that if f (p) > low, then f (¬p) must be < high. Hence (94b) is inconsistent.
2.4.3 The status of possibly as a PPI
I will now show that our independently motivated semantics for possibly derives
its distribution. We noted above that whenever strengthening is satisfied in
some environment, application of a DE operator Γ to that environment reverses
its entailment relations, so strengthening is no longer satisfied after application
of Γ. I will go through some examples to assure you that this really works.
Consider (95), where possibly occurs under negation with its derivation in (2.6).
(95) ?? Stanley didn’t possibly eat his wheaties.
Recall that g(E↑) ⊆ E↑. The fact that “eat” has no intersection with the
former therefore does not entail that it also has no intersection with the latter.
Consider now the alternative Derivation 2.7 for (95), where the negation enters
the derivation after application of Op. Strengthening fails here, too. The fact
that not every way of constraining our domain gives us a domain that has an
intersection with “eat” does not entail that the unconstrained domain has no
intersection with “eat”. The reader might wonder why strengthening applies
after the negation has been merged and not before. I clearly need this for the
Derivation 2.6
λg¬[eat ∩ g(E↑) ≠ ∅]
apply Op ▷
Op(λg¬[eat ∩ g(E↑) ≠ ∅])
check strengthening ▷
∀g ∈ ∇¬[eat ∩ g(E↑) ≠ ∅] ⊭ ¬[eat ∩ E↑ ≠ ∅]
▷
undefined.
Derivation 2.7
λg[eat ∩ g(E↑) ≠ ∅]
apply Op ▷
Op(λg[eat ∩ g(E↑) ≠ ∅])
merge negation ▷
¬Op(λg[eat ∩ g(E↑) ≠ ∅])
check strengthening ▷
¬∀g ∈ ∇[eat ∩ g(E↑) ≠ ∅] ⊭ ¬[eat ∩ E↑ ≠ ∅]
▷
undefined.
system to work. I follow Chierchia (2001) in assuming that strengthening is
checked at the phase level of the syntactic derivation (Chomsky, 1999, 2001).²⁰
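The failure of strengthening under negation, in both Derivation (2.6) and Derivation (2.7), can be confirmed by enumeration. The sketch below rests on assumptions that are not in the text: a hypothetical three-world accessibility set E↑, two hypothetical shrinks represented by their values g(E↑), and entailment checked over every candidate denotation of “eat” within E↑.

```python
from itertools import combinations

E_UP = frozenset({"w1", "w2", "w3"})                       # hypothetical E↑
SHRINKS = [frozenset({"w1", "w2"}), frozenset({"w1"})]     # assumed values g(E↑) ⊆ E↑

def propositions():
    """All candidate denotations for the modified proposition within E↑."""
    worlds = sorted(E_UP)
    for r in range(len(worlds) + 1):
        for combo in combinations(worlds, r):
            yield frozenset(combo)

def entails(premise, conclusion):
    return all(conclusion(p) for p in propositions() if premise(p))

plain = lambda p: bool(p & E_UP)                          # p ∩ E↑ ≠ ∅
closed = lambda p: all(bool(p & g) for g in SHRINKS)      # ∀g[p ∩ g(E↑) ≠ ∅]
neg_plain = lambda p: not plain(p)
neg_inside = lambda p: all(not (p & g) for g in SHRINKS)  # pattern of Derivation 2.6
neg_outside = lambda p: not closed(p)                     # pattern of Derivation 2.7

print(entails(closed, plain))            # True: strengthening holds without negation
print(entails(neg_inside, neg_plain))    # False
print(entails(neg_outside, neg_plain))   # False
print(entails(neg_plain, neg_inside))    # True: under negation the entailment reverses
```

The proposition {w3} is the witness in both failing cases: it misses every shrunk domain but still overlaps E↑.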
Similar results obtain when possibly occurs within the scope of DE subjects
like few students or DE adverbs like rarely. We give Derivation (2.8) for
few students here. We are only interested in the wide scope construal for few
students, i.e. the reading where the subject outscopes possibly.
(96) ?? Few students possibly ate their wheaties.
If we take perhaps and maybe to be synonymous with possibly, these results
extend to these adverbs as well.
Intervention
In a recent paper, Szabolcsi (2002) discusses the behavior of the English PPI
some. She analyzes some intriguing properties of this expression there. I will
20 We might also say that application of Op is restricted to the phase level. In that case,
Derivation (2.7) would not arise at all.
Derivation 2.8
Op(λg[few(student)(λx[eat(x) ∩ g(E↑) ≠ ∅])])
check strengthening ▷
∀g ∈ ∇[few(student)(λx[eat(x) ∩ g(E↑) ≠ ∅])] ⊭ [few(student)(λx[eat(x) ∩ E↑ ≠ ∅])]
▷
undefined.
discuss two of them. Linebarger (1987) notes that (anti-)licensing of
polarity items can be obstructed by certain kinds of interveners. Chierchia
(2001) argues that this cannot be reduced to a syntactic Relativized Minimality
effect, essentially, because one cannot identify the class of interveners in a
natural way. Below is an example with the PPI somewhat.²¹
(97) a. * Stanley didn’t like this somewhat.
b. Stanley didn’t always like this somewhat.
When always intervenes between the negation and somewhat, the sentence be-
comes good. Recall our internet game examples. We saw that possibly cannot
follow never in such cases. Of course, it also cannot follow sentential negation
(98a). The interesting thing is that, also in this case, an intervening always has
a meliorating effect (98b), i.e. (98b) seems to be much better than (98a).
(98) a. * This is a fun, free game where you’re not possibly further than
a click away from winning $1000!
b. ...where you’re not always possibly a click away from winning
$1000!
In order to see how our setup can derive this, I will give an informal presentation
of how Chierchia (2001) derives similar intervention effects for the NPI any.
First, note that always seems to block the licensing of this element (Linebarger,
1987).
(99) a. Stanley didn’t like anything.
b. ?? Stanley didn’t always like anything.
Chierchia notes that the class of interveners seems to share the following characteristic:
While they do not (necessarily) introduce a scalar implicature in a
UE environment, they do introduce one in a DE environment. Always partic-
ipates in a (lexicalized) scale with sometimes. Thus, (100a) carries the scalar
implicature that Stanley didn’t always eat his wheaties. Since always is the
21 This example is good on a so-called ‘metalinguistic’ or emphatic denial reading. See
Szabolcsi (2002) for discussion.
strongest element in the relevant scale, (100b) does not introduce such a scalar
implicature. However, under negation, the scale is reversed: (100c) does intro-
duce the implicature that Stanley sometimes did eat his wheaties. I refer the
reader to Krifka (1995); Chierchia (2001) for discussion of an algorithmic way
to derive this.
(100) a. Stanley sometimes ate his wheaties.
▷ Stanley didn’t always eat his wheaties.
b. Stanley always ate his wheaties.
c. Stanley didn’t always eat his wheaties.
▷ Stanley sometimes ate his wheaties.
Recall that Chierchia assumes that any comes with a domain expansion variable
g which is universally closed by his Op, and that application of Op is subject
to a strengthening condition. He now argues that if strengthening takes scalar
implicatures into account,²² we can derive the intervention effects as failure of
strengthening. In other words, application of Op must yield a proposition
which is stronger than the strongest meaning of the proposition without do-
main expansion. Suppose that we have two sets X, Y of potatoes, such that
X ⊆ Y . Suppose that John did not always have a potato in Y , though he some-
times did. This does not entail that he did not always (though sometimes did)
have a potato in X. Chierchia goes through a number of examples, including
ones with numeral interveners (e.g. “*John didn’t give two people anything”).
For our setup, we would have to show that, while the negation reverses the
strengthening relation, as we have shown, interveners would have the effect of
blocking such reversal. This is not, in general, possible. In fact, most of the
interveners are UE in their relevant argument. Suppose that Γ is DE and that
Θ is UE. Then, p is in a DE environment in both of the following examples:
(101) a. Γ(p)
b. Γ(Θ(p))
Hence, addition of an implicature triggered by a (UE) intervener should not
have any rescuing effect on strengthening, as long as the implicature is added
on both sides. If we only add it on one side, it still could not rescue a failure
of strengthening, which is what we would need. Suppose that, instead of being
subject to strengthening, PPIs are subject to an anti-weakening constraint.
In other words, application of a domain shrink must never lead to a weaker
statement than the same statement without shrinkage. In order to implement
this compositionally, we could define a universal closure operator Ω as follows:
22 Chierchia argues that there are strong reasons to assume that scalar implicatures
are calculated on subconstituents of the clause, and not on the end result of the
derivation as commonly assumed. I refer the reader to Chierchia’s work for discussion.
Definition 2.8 Let ∇ be the set of domain shrinks as defined above. ϕ′ is
obtained from ϕ by removing all free occurrences of g and adding scalar implicatures.
Ω(ϕ) = ∀g ∈ ∇[ϕ], if ϕ′ ⊭ ∀g ∈ ∇[ϕ],
else = undefined.
Though this will derive the intervention effects we have seen for PPIs while
retaining the result that they cannot be (directly) outscoped by a DE operator,
it does introduce an asymmetry between NPIs (subject to strengthening) and
PPIs (subject to anti-weakening) which should be further motivated. One
piece of support comes from the fact that PPIs can be outscoped by non-monotone
quantifiers like exactly three N. These couldn’t satisfy strengthening
for the simple reason that they are non-monotone. In other words, (102a) has a
reading according to which there are exactly three people such that John gave
them something. Similarly, (102b) allows the subject to outscope possibly, i.e.
it has a reading according to which there are exactly three students such that
they possibly ate their wheaties. This already shows that there is a problem
with assuming strengthening for PPIs. Furthermore, the same quantifier gives
rise to an intervention effect with any (102c).
(102) a. John gave exactly three people something.
b. Exactly three students possibly ate their wheaties.
c. ?? John didn’t give exactly three people anything.
We could not generalize the anti-weakening approach to NPIs, because, then,
these should be licensed by non-monotone quantifiers, contrary to fact:
(103) * Exactly three students ate anything.
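That exactly three N is neither upward nor downward entailing in its scope can be checked by enumeration. The sketch below uses a hypothetical set of four individuals and, for comparison, the clear cases some (UE) and no (DE) rather than few, whose truth conditions are vaguer; none of the function names come from the text.

```python
from itertools import combinations

INDIVIDUALS = ["a", "b", "c", "d"]        # hypothetical students
STUDENTS = frozenset(INDIVIDUALS)

def subsets(items):
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def exactly_three(restrictor, scope):
    return len(restrictor & scope) == 3

def some(restrictor, scope):
    return len(restrictor & scope) > 0

def no(restrictor, scope):
    return len(restrictor & scope) == 0

def upward_entailing(quantifier):
    """Q(R)(X) and X ⊆ Y always give Q(R)(Y)."""
    return all(quantifier(STUDENTS, y)
               for x in subsets(INDIVIDUALS) if quantifier(STUDENTS, x)
               for y in subsets(INDIVIDUALS) if x <= y)

def downward_entailing(quantifier):
    """Q(R)(Y) and X ⊆ Y always give Q(R)(X)."""
    return all(quantifier(STUDENTS, x)
               for y in subsets(INDIVIDUALS) if quantifier(STUDENTS, y)
               for x in subsets(INDIVIDUALS) if x <= y)

for name, q in [("some", some), ("no", no), ("exactly three", exactly_three)]:
    print(name, upward_entailing(q), downward_entailing(q))
# some True False
# no False True
# exactly three False False
```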
In a sense, the asymmetry between strengthening and anti-weakening reflects
the fact that, where NPIs are licensed, PPIs are anti-licensed (Giannakidou,
1997). I go through the derivation of our example (98b), but before doing
so, I must devise a meaning for always. I follow, among others, Beaver and
Clark (2002) in assuming that always is a universal quantifier over events. In
other words, it denotes the subset relation over sets of these, and, crucially,
the restriction is always contextually given. In our example, the contextual set
could be all events of playing the relevant game. Then, (104a) comes out as
(104b) and (104c) comes out as (104d).
(104) a. You’re always a click away from winning.
b. {e | play(e)} ⊆ {e | 1click(e)}
c. You’re always possibly a click away from winning.
d. ∀g ∈ ∇[{e | play(e)} ⊆ {e | 1click(e) ∩ g(E↑) ≠ ∅}]
{e | 1click(e) ∩ g(E↑) ≠ ∅}, where g is an arbitrary domain shrink, is a
subset of {e | 1click(e) ∩ E↑ ≠ ∅}. For ease of exposition, I will refer to the
former set of events as A, and the latter as B. Furthermore, I will refer to the
restrictor of always (in our case {e | play(e)}), as C. The proposition “You’re
not always possibly a click away from winning” thus comes out as λg[C ⊈ A]
before application of Ω. Adding the scalar implicature to this, we get
λg[C ⊈ A ∧ C ∩ A ≠ ∅]. Applying Ω to this, we need to check that anti-weakening
is satisfied. This is done by checking that the following non-entailment holds,
recalling that A ⊆ B:
[C ⊈ B ∧ C ∩ B ≠ ∅] ⊭ ∀g[C ⊈ A ∧ C ∩ A ≠ ∅]
Indeed it does. One can also see that it is the implicature introduced by the
negated always which is responsible for the failure of the entailment. In this
sense, then, always intervenes precisely because it introduces an implicature
which blocks the weakening effect which the negation and other DE operators
have on domain shrinkage. If always does not intervene, anti-weakening is vio-
lated. This is because, as we have seen, the following is true for any proposition
p.
∀g[p ∩ g(E↑) ≠ ∅] ⊨ [p ∩ E↑ ≠ ∅]
Since the negation is DE (even AMo), we therefore have the following, which
is a violation of anti-weakening.
¬[p ∩ E↑ ≠ ∅] ⊨ ∀g¬[p ∩ g(E↑) ≠ ∅]
In other words, we have derived that certain interveners can rescue a PPI,
just in case they are of the sort that introduce scalar implicatures under DE
operators. Without such intervention, the PPI is illicit under DE operators.
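The role of the implicature in the anti-weakening check can be made concrete with a brute-force test over small sets of events. This is a simplified sketch under assumptions: the three events are hypothetical, a single arbitrary shrink stands in for the universal quantification over g, and A, B and C play the roles given to them in the text, with A ⊆ B.

```python
from itertools import combinations

EVENTS = ["e1", "e2", "e3"]     # hypothetical events of playing the game

def subsets(items):
    items = sorted(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def models():
    """All triples (C, A, B) with A ⊆ B, mirroring g(E↑) ⊆ E↑."""
    for C in subsets(EVENTS):
        for B in subsets(EVENTS):
            for A in subsets(B):
                yield C, A, B

def entails(premise, conclusion):
    return all(conclusion(C, A, B) for C, A, B in models() if premise(C, A, B))

# "not always" together with the scalar implicature "sometimes"
unshrunk_with_impl = lambda C, A, B: (not C <= B) and bool(C & B)
shrunk_with_impl = lambda C, A, B: (not C <= A) and bool(C & A)

# "not always" without the implicature
unshrunk_bare = lambda C, A, B: not C <= B
shrunk_bare = lambda C, A, B: not C <= A

print(entails(unshrunk_with_impl, shrunk_with_impl))   # False: anti-weakening is satisfied
print(entails(unshrunk_bare, shrunk_bare))             # True: anti-weakening would be violated
```

Dropping the implicature restores the entailment from the unshrunk to the shrunk statement, which is the sense in which the implicature is responsible for the rescuing effect.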
Double licensing
Szabolcsi (2002) discusses another intriguing phenomenon about PPIs, namely
what she calls ‘double licensing’. She attributes the observation to Jespersen.
Consider (105).
(105) a. * I doubt that Stanley liked it somewhat.
b. * I think Stanley didn’t like it somewhat.
c. I doubt that Stanley didn’t like it somewhat.
In this case, the adverbs behave differently from somewhat. (106c) does not
seem to improve very much, compared to (106a, 106b). In (107), however,
there is a detectable improvement.
(106) a. ?? I doubt that Stanley possibly liked it.
b. ?? I think Stanley didn’t possibly like it.
c. ?? I doubt that Stanley didn’t possibly like it.
(107) I don’t doubt that Stanley possibly liked it.
This bears on the assumption that strengthening/anti-weakening is checked at
the phase level. For example, for (106a) to come out bad, it is essential that
strengthening cannot be checked in the embedded clause. On the other hand,
for (108a) to come out good, it seems to be crucial that it can be checked in
the embedded clause. (108b), in turn seems to require the opposite. It might
be that this could be made to follow from properties of the verbs in question.
For instance, think is known to be a “neg-raising” verb, in the sense that the
matrix negation in (108b) seems to be equivalent to negating the embedded
clause.
(108) a. Nobody thinks that Le Pen will possibly win.
b. ?? John doesn’t think that Le Pen will possibly win.
One phenomenon (pointed out to me by Anastasia Giannakidou, p.c.)
which could be taken as a double licensing phenomenon is the following. As
we have seen, possibly is generally bad under negation. However, (109a,b),
involving another modal, seem to be much better, in fact, quite perfect. Note,
however, that possibly seems to be the only adverb for which a second modal
has a “double licensing” effect. Thus, there is no meliorating effect of the pres-
ence of the modal verb in (109c). Note also that the good examples (109a,b)
always get an emphatic/contrastive reading, which, as we know, can rescue a
PPI alone. Finally, English is the only language I am aware of that exhibits
this phenomenon. This is illustrated by the sharp ungrammaticality of the
Norwegian sentence (109d).
(109) a. Stanley couldn’t possibly have eaten his wheaties.
b. Stanley can’t possibly have eaten his wheaties.
c. * Stanley can’t [probably/ maybe/ perhaps/ evidently/ fortu-
nately/ paradoxically/ . . .] have eaten his wheaties.
d. * Ståle [kan/ kunne] ikke [muligens/ kanskje] ha spist
S [can/ could] not [possibly/ maybe] have eaten
hvetekakene sine.
the-wheaties his
This casts doubt on the hypothesis that this phenomenon should be analyzed
as double licensing in Szabolcsi’s sense. The double licensing phenomena
she discusses are crosslinguistically robust, and generalize to more than one
item. Furthermore, we have seen that the adverbs are not generally rescued by
adding another licenser on top of negation (106c). Thus, I will put the cases in
(109a,b) aside for now as a quirky property of one expression in one language.
In sum, it seems doubtful that Szabolcsi’s double licensing phenomenon applies
to adverbial PPIs, although more research is needed to capture the difference
between e.g. (106c) and (107).
Summary
In this section, we have seen that the behavior of possibly can be derived from
its semantic difference with the proposition-embedding adjective possible. More
in particular, we have seen that possibly yields stronger statements than possible
and that this can be implemented in van Rooy’s (2001) framework, by assuming
that the adverb comes with a variable over domain shrinking functions,
mapping the epistemic accessibility relation E ↑ to one of its subsets. Intuitively,
what such domain shrinkage does is increase the level of plausibility
ascribed to the modified proposition. We have furthermore seen that if such
domain shrinkage is subject to an anti-weakening constraint, and if such
anti-weakening takes scalar implicatures into account, the limited distribution
of the adverb follows. Hopefully, the anti-weakening constraint on domain
shrinkage can be derived from general pragmatic constraints in the spirit of
van Rooy (2002) and the way this author derives the strengthening constraint
for NPIs. If so, we will have a general explanation of the distribution of SOA
and PPIs more generally.
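The logic behind "shrink, but don't weaken" can be illustrated with a toy model. The following is only a sketch under simplifying assumptions, not van Rooy's framework: the four worlds, the sets `E` and `SHRUNK`, and the helper names are all hypothetical. It treats an epistemic possibility claim as existential quantification over a domain of worlds and checks that shrinking the domain strengthens the positive claim but weakens the negated one.

```python
from itertools import combinations

def possible(prop, domain):
    """Existential modal: prop is true in at least one world of the domain."""
    return any(w in prop for w in domain)

def entails(a, b, cases):
    """a entails b if b holds in every case where a holds."""
    return all(b(c) for c in cases if a(c))

# Hypothetical toy model: four worlds, three of them epistemically accessible.
WORLDS = (0, 1, 2, 3)
E = {0, 1, 2}        # full accessibility domain, as for 'possible'
SHRUNK = {0, 1}      # a subset of E, as picked by a shrinking function

# Every proposition (set of worlds) serves as one test case.
PROPS = [set(c) for r in range(len(WORLDS) + 1)
         for c in combinations(WORLDS, r)]

def pos_full(p): return possible(p, E)
def pos_shrunk(p): return possible(p, SHRUNK)
def neg_full(p): return not possible(p, E)
def neg_shrunk(p): return not possible(p, SHRUNK)

# Positive context: the shrunk claim is strictly stronger.
print(entails(pos_shrunk, pos_full, PROPS), entails(pos_full, pos_shrunk, PROPS))
# Under negation: the shrunk claim is strictly weaker.
print(entails(neg_full, neg_shrunk, PROPS), entails(neg_shrunk, neg_full, PROPS))
```

On this simplified picture, an anti-weakening requirement on the shrunk domain is met in the positive context and violated under negation.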
I would like to stress that the machinery employed here for polarity phe-
nomena and for epistemic modality was developed independently, and for other
purposes. In other words, the claim is that the distribution of such expressions
as possibly can be made to follow from already existing theories of modality
and polarity.
2.5 Prospects and consequences
I have concentrated on deriving the behavior of possibly here. The reason for
this is mainly that a similar derivation for all the speaker oriented adverbs we
discussed in section 2 would be too ambitious a task for the present thesis.
Nevertheless, I think the data discussed in sections 2 and 3 allow us to be optimistic
about the prospects of explaining the distribution of SOA, phase quantifiers
and frequency adverbs in terms of their (independently motivated) scopal re-
quirements. For manner adverbs, and aspectual adverbs like completely which
generally cannot outscope any other adverbs, such an analysis might seem less
adequate. They are not polarity items, for example. However, we noted that
some manner adverbs can occur high, but then, they receive a subject-oriented
or speaker-oriented reading. This can be treated as a scopal effect (Ernst,
2000), if we assume that they come with an implicit variable that must be
c-commanded by its binder. For adverbs like completely, which do not behave
like this, we would need to motivate an analysis according to which such ad-
verbs do not come with an implicit variable, or that a subject/speaker-oriented
reading would always lead to semantic or pragmatic degradedness. This would
explain why they don’t receive these readings, but in order to explain why
they can’t occur in high positions, we would need something more. It has been
noted quite frequently that these adverbs interact with the argument structure
and temporal constitution of the main VP (Chomsky, 1965; Alexiadou, 1997;
Cinque, 1999). Thus, I speculate that, given a proper explanation of that, it
might be made to follow that these adverbs cannot apply to a predicate which
has already been modified by frequency adverbs, phase quantifiers, or SOA.
There are indications that the reasoning pursued in the present chapter
extends to some auxiliaries. Consider, for example, the pair might, can. The
former, but not the latter, appears to be a PPI. Furthermore, while the former
is restricted to an epistemic interpretation, the latter has a wider range of
(root) modal uses. Even when can is interpreted epistemically it isn’t a PPI.23
This fact seems to show that one can’t account for the PPI status of might
by assuming that there is an epistemic modality head M which c-commands
a functional head Neg hosting the negation. Under standard assumptions,
epistemic interpretation of can should force it to move to M (at LF), if there is
such a head, so we would be led to expect that epistemic can should also be
a PPI, contrary to fact. Otherwise, one would be forced to say that only the
epistemic modals which are, in fact, PPIs occupy M at LF, thus rendering the
account vacuous.
(110) a. The university might be in that direction.
b. The university might not be in that direction.
might > not. cf. *mightn’t
c. ?? Nothing interesting might be in that direction.
d. ? Few interesting sites might be in that direction.
e. ?? I doubt that the university might be in that direction.
(111) a. The university can be in that direction.
b. The university can’t be in that direction.
not > can
c. Nothing interesting can be in that direction.
d. Few interesting sites can be in that direction.
e. I doubt that the university can be in that direction.
2.5.1 Short Verb Movement
What are the consequences for a syntactic approach like that of Cinque (1999),
if a similar account can be developed for all adverbs? I would like to separate
this into two questions. The first one is whether or not adverbs are to be
analyzed as specifiers of unique (functional) heads. I do not think the present
chapter bears on that question at all. Ultimately, it depends on how one should
analyze purely syntactic phenomena like verb placement among adverbs in the
23 Giannakidou (1997) also argues that Greek subjunctive morphology is an NPI. Certain
kinds of Germanic (embedded) verb-second clauses appear to be PPIs, i.e. the (“bridge”)
predicates which allow embedded V2 are never DE.
Romance languages (Pollock, 1989; Cinque, 1999). I return to this question in
chapter 4.
The second question is this. Assuming that adverbs are specifiers of unique
functional heads, how should we derive the ordering of these heads? The stan-
dard answer is that functional heads are ordered by syntactic selection. It
seems to me that the present chapter does bear on this question. In particu-
lar, if the present account is on the right track, it seems preferable to assume
that functional heads are not ordered by selection. To make this point clear, I
would like to point out some Norwegian facts that plainly cannot be accounted
for by assigning positions to adverbs in a linear sequence of functional heads
(Nilsen, 2001).24 There are triplets of adverbs that enter into non-linear order-
ing patterns. We have seen that the Norwegian adverb muligens ‘possibly’ has
to precede sentential negation. alltid ‘always’, on the other hand, has to follow
it.
(112) a. Jens hadde ikke alltid pusset tennene sine.
J had not always brushed the-teeth his
b. * Jens hadde alltid ikke pusset tennene sine.
J had always not brushed the-teeth his
The ungrammaticality of (112b) probably relates to the observation (Beghelli
and Stowell, 1997) that universals generally don’t like to immediately outscope
negation. By linearity, we now expect that muligens ‘possibly’ has to precede
alltid ‘always’. But we have already seen that this is not true. We repeat the
relevant example here.
(113) Dette er et morsomt, gratis spill hvor spillerne alltid
this is a fun free game where the-players always
muligens er et klikk fra å vinne $1000!
possibly are one click from to win $1000
Thus the ordering of muligens, ikke, alltid isn’t linear. That is to say, we cannot
account for their relative ordering by assigning positions in a linear sequence
to them. Now assume that they are to be analyzed as specifiers of functional
heads. If these heads are ordered by selection, we would be forced to assume
multiple positions for the adverbs. To see this, suppose that we allow Norwe-
gian alltid ‘always’ to occupy a “high” position, and let this be responsible for
the examples where this adverb precedes muligens ‘possibly’. Given that, as
we have seen, muligens can follow the negation (ikke) just in case alltid inter-
venes, we might consider adding an extra, low position for muligens as well.
This results in the tree in figure 2.1. This accommodates all the grammatical
examples, but also many ungrammatical ones: in particular, this tree leads us
to expect sentences like (112b) and ones where muligens immediately follows
the negation to be possible, contrary to fact. Hence, it follows that some extra
24 Given binary branching, functional heads ordered by selection necessarily give rise to a
linear sequence of specifiers.
[XP alltid_high [YP muligens_high [ZP ikke [UP alltid_low [WP muligens_low ... ]]]]]
Figure 2.1: transitive hierarchy for non-transitive triplet
theory, such as the one advocated in the present chapter, is needed to rule these
out. But the extra theory seems to do quite well on its own, i.e. without extra
positions and selectional sequences. In other words, ordering phenomena such
as these don’t provide arguments for selectional sequences.
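The two claims made about the triplet muligens, ikke, alltid can be checked mechanically. The sketch below is an illustration only: the function `licenses` and the encoding of a functional sequence as a tuple of slots are assumptions made here, and adjacency is ignored. It asks whether any single linear sequence licenses the three attested orders, and whether the five-slot sequence of figure 2.1 also licenses the orders reported as ungrammatical.

```python
from itertools import permutations

ADVERBS = ("muligens", "ikke", "alltid")
# Attested orders from the text: (x, y) means "x can precede y".
ATTESTED = [("muligens", "ikke"), ("ikke", "alltid"), ("alltid", "muligens")]
# Orders reported as ungrammatical when the two adverbs are adjacent.
UNATTESTED = [("alltid", "ikke"), ("ikke", "muligens")]

def licenses(positions, x, y):
    """A sequence licenses 'x y' if some slot for x is above some slot for y."""
    return any(i < j
               for i, a in enumerate(positions) if a == x
               for j, b in enumerate(positions) if b == y)

# One slot per adverb: no permutation accommodates all attested orders.
linear = [h for h in permutations(ADVERBS)
          if all(licenses(h, x, y) for x, y in ATTESTED)]

# The sequence of figure 2.1, with a high and a low slot for two adverbs.
FIGURE_2_1 = ("alltid", "muligens", "ikke", "alltid", "muligens")
covers_attested = all(licenses(FIGURE_2_1, x, y) for x, y in ATTESTED)
overgenerated = [(x, y) for x, y in UNATTESTED if licenses(FIGURE_2_1, x, y)]

print(linear)            # no single-slot hierarchy survives
print(covers_attested)   # the extended hierarchy covers the good orders
print(overgenerated)     # but it also admits the bad ones
```

The attested orders form a cycle, so the one-slot search comes back empty, while the multiple-position sequence covers the cycle only at the cost of admitting the unattested pairs.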
An alternative, if one wants to say that adverbs are specifiers of unique func-
tional heads, would be to say that these heads are freely ordered. This would
amount to assuming some kind of IP-shell analysis. A particularly intrigu-
ing phenomenon to handle for such an approach is the following, discussed by
Cinque (2000a, to app.). In the Romance languages, there is considerable vari-
ation with respect to which adverbs a given verb form can precede or follow.
For instance, Cinque shows that a French past participle can follow manner
adverbs like bien ‘well’, whereas Italian past participles can’t follow bene ‘well’.
(114) a. Il en a bien compris à peine la moitié.
he of-it has well understood hardly the half
b. Gianni ha (*bene) capito (bene).
G has (*well) understood (well)
Furthermore, Cinque notes that the following generalization appears to be true:
If in a language L, the past participle (or another verb form) can follow (pre-
cede) a given adverb, then it can follow all other adverbs that can precede
(follow) that adverb. In other words, verb-adverb ordering is transitive.25 If
true, I take this to show that whatever is the right account for adverb-adverb
ordering, it should somehow carry over to verb-adverb ordering. In other words,
25 If our result that adverbs like still, already, always, often, etc. can precede “high” adverbs
(SOA) carries over to Italian and other Romance varieties, this is problematic for Cinque’s
transitivity claim, since he shows that the past participle generally cannot precede SOA in
Romance, but that it generally can precede the Romance translations of still, etc.
if we account for adverb distribution in terms of the semantic scopal require-
ments of the individual adverbs, we should somehow be able to account for
“short verb movement” phenomena as scopal in nature as well.
First, if short verb movement is to be treated as a scopal phenomenon, it
had better show some scopal effects. In fact, Cinque (1999, p. 49, n. 8) points out
that, whereas (115a) entails that Gianni still has long hair, (115b) is compatible
with him now having short hair.26
(115) a. Gianni ha sempre avuto i capelli lunghi.
G has always had the long hairs
b. Gianni ha avuto sempre i capelli lunghi
G has had always the long hairs
Suppose that past participial morphology is treated as a “modifier” on a par
with adverbs like sempre, but that it attracts the verb.27 Suppose, further-
more, that it conveys anterior tense of the constituent it modifies.28 Then the
position of the verb in (115) would be a function of where the past participial
morphology is merged. If it is merged above sempre, as would be the case in
(115b) anterior tense applies to the constituent [always HAVE long hair], thus
allowing for the possibility that the state of affairs denoted by this constituent
no longer holds. If it applies below sempre, the adverb would result in what
has come to be known as the “universal” perfect (Dowty, 1979; Vlach, 1993;
Iatridou et al., 2002), which does entail that the state of affairs still holds.
In such a theory, one would have to derive the fact that perfect morphology
cannot outscope certain adverbs by means of the scopal requirements of the
perfect and the adverbs.
In fact, the claim that Romance participles can’t precede “high” adverbs
may not even be true. An internet search (google) for exact strings like
“dato fortunatamente” (‘given fortunately’), “avuto probabilmente” (‘have-
PTC probably’), etc. returned several hundred hits among which were the
following sentences.
(116) a. Due incendi che non hanno avuto fortunatamente
Two fires that not have-3pl had fortunately
conseguenze rilevanti si sono sviluppati
consequences relevant SI are developed
26 My French informants report that there is no such semantic contrast in this language.
27 Such attraction may be made to follow on phonological grounds. The affix and the verbal
stem are both “phonologically incomplete” in the sense that they do not make up phonological
words on their own.
28 Iatridou et al. (2002) argue that one should not, strictly speaking, analyze the perfect as
“anterior”, i.e. that the Reichenbachian R E interval is irrelevant for the perfect. See Borik
(2002) for a sophisticated implementation of Reichenbachian ideas, which appears to take
care of the objections raised in Iatridou et al. (2002) (although Borik herself does not relate
her analysis to that of Iatridou et al.). The precise semantics of the perfect need not concern
us in this informal discussion, although ultimately it is, of course, important.
b. le analisi hanno dato fortunatamente esito
the analyses have-3pl given fortunately output
negativo
negative
c. è stato probabilmente stampato a Roma
is-3sg been probably printed in Rome
There are two scenarios, depending on which status we assign to examples like
(116). I discuss them in turn.
Suppose that these are, in fact, straightforwardly grammatical in Italian.
Then past participial movement is allowed around very high adverbs. Then,
given that the finite verb can also precede or follow high adverbs, we face
the following problem which was pointed out by Bobaljik (1999) for argument
ordering versus adverb ordering. Auxiliaries (and arguments) come in a spe-
cific order, so, in general, we have that, given a sequence of three auxiliaries
aux1 , aux2 , aux3 and one adverb a of the relevant type, a can occupy any
position in the sequence of auxiliaries, as long as the relative ordering of auxiliaries
remains constant. Conversely, we have that given a sequence of adverbs
a1 , a2 , a3 , and one auxiliary aux, aux can occupy any position in the sequence
of adverbs, as long as the relative ordering of the adverbs remains the same.
It follows that there can be no single selectional sequence accommodating the
relative ordering of both adverbs and auxiliaries.
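The combinatorics of this argument can be made concrete with a small sketch. The function `interleavings` is a helper introduced here for illustration, and the labels are the placeholders used above. It lists every way of merging two independently ordered sequences while keeping each sequence's internal order.

```python
def interleavings(xs, ys):
    """All merges of xs and ys that keep each list's internal order."""
    if not xs:
        return [list(ys)]
    if not ys:
        return [list(xs)]
    return ([[xs[0]] + rest for rest in interleavings(xs[1:], ys)]
            + [[ys[0]] + rest for rest in interleavings(xs, ys[1:])])

# One adverb among three rigidly ordered auxiliaries ...
adverb_among_aux = interleavings(["aux1", "aux2", "aux3"], ["a"])
# ... and one auxiliary among three rigidly ordered adverbs.
aux_among_adverbs = interleavings(["a1", "a2", "a3"], ["aux"])

for order in adverb_among_aux + aux_among_adverbs:
    print(" ".join(order))
```

Each call yields four distinct orders, whereas a single selectional sequence over the same four elements would fix exactly one of them.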
Suppose, on the other hand, that examples like (116) can be disregarded,
for example because the adverb in question is in a so-called comma-intonation.
Then, we can maintain Cinque’s view that the participle can’t move around
these adverbs.29 We have seen above that phase quantifiers like still can precede
“high” adverbs in English. Cinque shows that Italian participles can precede
such adverbs. If our result that still can precede probably carries over to Italian
ancora and probabilmente, we have another failure of transitivity: PTC can
precede ancora, ancora can precede probabilmente, but (in the current scenario)
PTC can’t precede probabilmente. The following examples were found on the
internet, suggesting that our result does carry over to Italian.
(117) a. La risposta è ancora probabilmente no
the answer is still probably no
b. Gli americani sono ancora probabilmente in maggioranza,
the Americans are still probably in majority
ma non per molto.
but not for long
A possible defense of functional sequences might be that one can accommodate
this by assuming an extra, high position to be available for ancora. But adding
extra positions for different elements significantly expands the expressive power
of the system and leads to overgeneration in some cases. Secondly, although
29 A relevant question is, of course, what the precise status of the “comma-intonation” is.
it may be made to work for the current problem, we have already seen that it
will not do for the transitivity failure we noted for Norwegian.30
The suggestion that verb placement can be treated as a scopal phenomenon in
the manner outlined here raises the following question:
Why doesn’t every language have short verb movement? In fact, I think there
is evidence that even languages like Norwegian do have short verb movement.
Consider (118a) which exhibits crossing scope dependencies between adverbs
and verbs. In the sharply ungrammatical (118b) the adverbs immediately pre-
cede the constituent they modify.31 Crossing scope dependencies are entirely
unexpected if absence of SVM is analyzed by letting the verbs remain in-situ.
(118) Norwegian
a. . . .at det ikke lenger alltid helt kunne ha blitt
. . .that it not any.longer always completely could have been
ordnet.
fixed
b. * . . .at det ikke kunne lenger ha alltid blitt helt
. . .that it not could any.longer have always been completely
ordnet.
fixed
Therefore, (118a) may be taken as evidence for a rather elaborate system of
short verb movements in Norwegian, inspired by Koopman and Szabolcsi (2000)
on verbal complexes in West Germanic and Hungarian.32 In particular, this
pattern would arise if scope reflects the order of merger, and each adverb at-
tracts the projection of the closest verb, and each verb attracts the projection
of the closest adverb, as illustrated in derivation 2.9.
Such an approach could be extended to phenomena like “affix hopping” in
English and other languages. In chapter 4, such a “generalized verb raising”
approach to SVM is developed in more detail.
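The alternating attraction pattern of derivation 2.9 can be simulated in a few lines. This is a flat sketch under simplifying assumptions: constituents are modelled as word lists rather than trees, the function `derive` and the `attracts` flag are introduced here for illustration, and the final merge of not is assumed to trigger no further movement, as in the derivation.

```python
def derive(items):
    """Return the surface order after alternating attraction.

    items: (word, kind, attracts) triples, lowest-merged first;
    kind is 'adv' or 'verb'.
    """
    clusters = {"adv": [], "verb": []}   # each cluster, highest element first
    order = []                           # current linear order
    for word, kind, attracts in items:
        other = "verb" if kind == "adv" else "adv"
        clusters[kind].insert(0, word)   # merge the new element on top
        if attracts and clusters[other]:
            # the closest projection of the other kind moves across it
            order = clusters[other] + clusters[kind]
        else:
            order = [word] + order
    return order

MERGE_ORDER = [
    ("fixed", "verb", False),
    ("completely", "adv", True),
    ("been", "verb", True),
    ("always", "adv", True),
    ("have", "verb", True),
    ("any.longer", "adv", True),
    ("could", "verb", True),
    ("not", "adv", False),   # assumed not to attract, as in derivation 2.9
]

print(" ".join(derive(MERGE_ORDER)))
```

The resulting string corresponds to the gloss of (118a): all adverbs precede all verbs, although each adverb was merged directly above the verb it modifies.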
2.5.2 Semantic selection
In Bartsch (1976); Ernst (2001), analyses of adverb distribution have been
promoted which share with the present chapter the assumption that it should
30 The difference between the two cases is that, in the Norwegian case, we made use of the
relation “x must precede y”, while the current problem concerns failure of transitivity of the
relation “x can precede y”. This is why extra positions can rescue the latter, but not the
former.
31 The scope of alltid and lenger is not easy to determine, so one could also let the former
adverb outscope blitt ‘been’ or the latter be outscoped by this verb.
32 Bentzen (2002) shows that SVM is visible in some varieties of Norwegian, including my
own. For example, examples like (i) are perfectly grammatical.
(i) Jeg har ikke spist ofte tran.
I have not eaten often cod.liver.oil
However, in my variety, this is limited to a few adverbs, thus I could not substitute alltid
‘always’ for ofte ‘often’ in (i).
Derivation 2.9
[completely [fixed]]
⇒ move VP
[fixed [completely]]
⇒ merge been
[been [fixed [completely]]]
⇒ move AdvP
[completely [been [fixed]]]
⇒ merge always
[always [completely [been [fixed]]]]
⇒ move VP
[[been [fixed]] [always [completely]]]
⇒ merge have
[have [[been [fixed]] [always [completely]]]]
⇒ move AdvP
[[always [completely]] [have [been [fixed]]]]
⇒ merge any.longer
[any.longer [[always [completely]] [have [been [fixed]]]]]
⇒ move VP
[[have [been [fixed]]] [any.longer [always [completely]]]]
⇒ merge could
[could [[have [been [fixed]]] [any.longer [always [completely]]]]]
⇒ move AdvP
[[any.longer [always [completely]]] [could [have [been [fixed]]]]]
⇒ merge not
[not [[any.longer [always [completely]]] [could [have [been [fixed]]]]]]
be given a semantic treatment. The execution of the analyses is quite differ-
ent, however. While these authors make use of rich semantic ontologies and
selectional restrictions, no use has been made here, of notions like “event”,
“proposition”, “fact” etc. to account for the distribution of different adverbs.
For example, Ernst (2001) assumes that there are different ontological entities
like “events”, “propositions” and “facts” that adverbs can apply to, and that
these ontological categories enter into the following system of type-conversion
(his “FEO-calculus”)
event⇒spec-event⇒proposition⇒fact⇒speech act
In this setup, some adverbs are event modifiers (e.g. completely), some proposi-
tional modifiers (e.g. not) etc. The fact that not must precede (outscope) com-
pletely now follows from their (semantic) selectional requirements and the FEO-
calculus. The structure of the account is thus remarkably similar to Cinque’s.
The FEO-calculus corresponds to Cinque’s hierarchy of functional projections,
and the ontological categories to Cinque’s heads. As far as I can see, ontological
categories and his FEO-calculus are in no less need of explanation and motiva-
tion than Cinque’s sequence of heads. In practice, Ernst always assigns identity
types to his adverbs, that is, they always map a constituent of semantic type X
to another constituent of type X. If this is taken to be a general restriction on
his system, he also predicts adverb ordering to be linear, which we have seen
that it is not. If Ernst allows adverbs to have non-identity types, i.e. maps from
type X to type Y, he no longer predicts linear ordering of adverbs. However,
this move jeopardizes the intuitive “plausibility” his approach might have. For
example, the Norwegian non-transitive triplet of adverbs muligens (‘possibly’),
ikke (‘not’), alltid (‘always’) can be accommodated in the FEO-calculus if (and
only if; see Nilsen (2001)) muligens takes a fact and returns a proposition, ikke
takes an event and returns a fact, and alltid takes a proposition and returns an
event. See Chapter 1 and Nilsen (2001) for discussion of this point.
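The non-identity type assignment just described can be tested against the Norwegian orders with a small sketch. It rests on one assumption that is a simplification made here rather than a statement of Ernst's system: a constituent may be converted freely upward along the chain but never lowered. The names `CHAIN`, `ADVERB_TYPES` and `derivable` are introduced for illustration.

```python
CHAIN = ["event", "spec-event", "proposition", "fact", "speech act"]
RANK = {t: i for i, t in enumerate(CHAIN)}

# (input type, output type): the non-identity assignment from the text.
ADVERB_TYPES = {
    "muligens": ("fact", "proposition"),
    "ikke": ("event", "fact"),
    "alltid": ("proposition", "event"),
}

def derivable(adverbs, base="event"):
    """Check an adverb sequence, listed highest (leftmost) first.

    Adverbs apply lowest first. The current constituent may be raised
    along CHAIN to the type an adverb wants, but never lowered
    (an assumption of this sketch).
    """
    current = base
    for adv in reversed(adverbs):
        wanted, result = ADVERB_TYPES[adv]
        if RANK[current] > RANK[wanted]:
            return False          # would require lowering
        current = result
    return True

for seq in (["muligens", "ikke", "alltid"], ["alltid", "muligens"],
            ["ikke", "alltid", "muligens"], ["alltid", "ikke"],
            ["ikke", "muligens"]):
    print(" ".join(seq), derivable(seq))
```

Under these assumptions the assignment admits the attested orders, including muligens after ikke when alltid intervenes, and excludes alltid ikke and adjacent ikke muligens; the price, as noted, is that the types no longer track any intuitive ontology.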
I have assumed that adverbs like always are temporal and that, say, possibly
applies to propositions, but the fact that the two can occur in either order,
with different scopal effects, suggests that this should not be brought to bear
on the unavailability of certain adverb orderings. In fact, that would be a sur-
prising claim, which would require an explanation. For example, the fact that
everybody quantifies over individuals does not prevent it from outscoping (or
being outscoped by) modal adverbs or verbs, negation and other non-individual
operators. It would be surprising, then, if it were to turn out that the fact that
some adverbs operate on times or events should prevent them from behaving
similarly. Ernst could maybe maintain his point of view by alluding to the
empirical fact that nominal quantifiers tend to give rise to inverse scope
effects, whereas adverbs do not seem to do so. However, it is not clear that
these phenomena are the same. In other words, it is not clear that the ability to
generate inverse (non-surface) scope is what enables nominal quantifiers to relate to
their variables at a distance. However, if we assume that VPs denote events,
and that some adverbs force conversion to “higher” ontological types in Ernst’s
sense, it just seems to be empirically wrong to say that event modifiers must
be sisters of event denoting expressions. For example, if always is an event
modifier, the fact that it can precede and outscope SOA, which we have seen
it can, seems to be a counterexample.
2.5.3 Summary
I have argued that SOA should be treated as PPIs, in the sense of being ex-
cluded from DE environments and that this suffices to derive their syntactic
distribution. Positive phase quantifiers are also PPIs, but only excluded from
AA environments. We have seen that the status of the adverb possibly as a PPI
follows from its lexical semantics if we derive its meaning from that of
possible by means of an operation Ω on its epistemic accessibility relation
E, and make Ω subject to an anti-weakening constraint, sensitive to the
scalar implicatures of the modified proposition. We also saw that intervention
effects follow in a manner entirely parallel to Chierchia’s (2001) derivation of
such effects with NPIs. The analysis appears to maintain the advantages of
the analyses of adverb ordering proposed by Ernst (2001) and Svenonius (2001),
while sidestepping the difficulties associated with their ontological approach.
In appendix A, some further evidence for the analysis of SOA presented here
is adduced from the fact that SOA in many languages are incompatible with
degree modification.
CHAPTER 3
V2 and Holmberg’s Generalization
3.1 Introduction
The standard1 (symmetric) analysis of verb second (V2) (Holmberg and Platzack,
1995; den Besten, 1989) is that the finite verb (Vf) head-moves to the highest
functional projection of the clause. Then some other constituent, for instance
the subject or an adverbial, has to move to the specifier of that same projec-
tion. These two movement steps, in addition to a general ban on adjunction
to CP, ensure that Vf will always end up in the second position of the clause.
It is the purpose of the present chapter to argue against a head movement
analysis of V2. The main argument for an XP-movement analysis will come from
the fact that certain (apparent) V2-violations in Mainland Scandinavian seem
to pose severe problems for a head movement analysis. The problematic data
involve focus particles that can intervene between the finite verb and the first
constituent. It will be argued that these cannot be treated as a clitic on the
verb and that the V2 violations are real. The interaction between V2 violations
with focus particles and argument shift of weak pronouns will be used to show
that the verb does not move to second position as a head. From this conclu-
sion it follows that weak pronouns, in fact, do not shift. When they appear
to have moved, it is a larger constituent containing the VP that has moved.
1 Sections 3.2–3.5 will appear as Nilsen (to app.b). The remaining sections, including the
“third approximation” were developed after submission of that paper.
[CP XP_i [CP [C0 Vf_j C0] [IP ... t_j ... t_i ... ]]]
Figure 3.1: standard analysis of V2
Thus, pronoun shift and V2 are treated as surface reflexes of one and the same
operation. This gives a very simple explanation for Holmberg’s Generalization
to the effect that object shift cannot cross phonetically realized material from
the VP: it cannot do so because it is the VP itself, or rather an XP containing
it, that moves.
The traditional view of V2 is also challenged by the fact that there are
topicalization-like processes and wh-movement effects that seem to require a
sentence internal landing site. Furthermore, some facts concerning subject-
verb inversion are problematic for the standard treatment of V2. Inversion is
usually analyzed as V-to-C movement, with the subject in spec-IP or equivalent
position. This leads to the expectation that it can only occur when the verb
really is in C. However, there are cases in which the verb is arguably much
lower than this, and the subject still has to follow it.
The proposed account builds on recent work by Kayne (1998, 1999); Cinque
(1999); Rizzi (1997); Koopman and Szabolcsi (2000) and others. In particular,
no covert movement is used, all movements are to the left and the analysis relies
heavily on the use of ’remnant’ movement, i.e. movement of a constituent con-
taining a trace. More specifically, the proposal is that Rizzi’s (ibid.) functional
projections FocP, TopP and FinP are merged below sentential adverbs. V2
consists in successive raising of TopP around sentential adverbs, carrying the
verb-initial FinP along. One of the key features of the analysis is thus that it
renders V2 sensitive to the properties of individual classes of adverbs. Finally,
it is argued that it may be possible and advantageous to treat V2 phenomena
without reference to functional projections, such as FocP, TopP, FinP, etc.
The chapter is organized as follows. In section 2, the basic data concerning
the V2 violating focus particles and their interaction with pronoun shift is
presented. Section 3 presents a ’first approximation’ to an analysis that can
handle the facts. Section 4 suggests that the problem encountered with the
first analysis is that fronting operations and S-V inversion should be able to
access quite low positions. In section 5 the main proposal is developed. Section
6 sums up and concludes.
3.2 Some data
3.2.1 V2-violations with focus particles
As has been discussed by Egerland (1998), there exist certain apparent excep-
tions to the V2-generalization in Mainland Scandinavian involving so-called
’focus-sensitive adverbs’ or ’focus particles’ (henceforth fpt). The phenomenon
is illustrated in (119) below with data from Norwegian.
(119) a. Jens bare gikk.
J just left
b. Jens nesten gråt.
J almost cried
It is also possible to have the fpt after Vf, as in (120). Neither of the two orders
appears to be marked or degraded in any way.
(120) a. Jens gikk bare.
J left just
b. Jens gråt nesten.
J cried almost
Other expressions that exhibit the same behavior include til og med ’even’ (lit.
”to and with”), minst ’at least’, utelukkende ’exclusively’, ikke mer enn såvidt
’not more than barely’, and simpelthen ’simply’. Thus we are not dealing with a
quirk of a couple of words. Below are examples.
(121) a. Han til og med leste den.
he even read it
b. han utelukkende sover hele dagen
he exclusively sleeps whole the-day
c. Han ikke mer enn såvidt berørte den.
he not more than barely touched it
d. Han simpelthen tok den.
he simply took it
Egerland (ibid.) notes that with nesten there is a truth-conditional difference
corresponding to its different positions. Consider the following examples.
(122) a. Jens nesten brølte hurra.
J almost roared hooray
b. Jens brølte nesten hurra.
J roared almost hooray
(122a) can only mean that Jens pronounced the word ”hurra” in a manner
that almost qualifies as roaring it. Let us refer to this as the ’manner’ reading.
The most salient reading of (122b) is that he didn’t cry ”hurra”, although he
was about to, i.e. a ’modal’ reading. It can also get the manner meaning if
pronounced with heavy stress on the verb. See Rapp and Von Stechow (1996)
for discussion of these and other readings of the German adverb fast (’almost’).
This pattern can also be taken to indicate that the fpt is not adjoined to C’,
since the manner reading presumably results from attaching the adverb lower,
not higher, than the site responsible for the modal reading.
Egerland (ibid.) analyzes this phenomenon in terms of the Universal Hier-
archy of functional projections proposed in Cinque (1999). The cases involving
nesten and those involving bare are given different analyses. Egerland main-
tains a standard analysis of V2 in terms of head movement to the highest FP
in the clause (ForceP in the ’split CP’ framework of Rizzi (1997)). He analyses
the adverb nesten (’almost’) as a specifier of a modal projection in the IP-layer.
For (122a), he claims that Vf can remain in or below that modal head when
nesten is in its specifier. The adverb bare ’only,’ ’just’, he treats as a syntactic
clitic on Vf. The analysis of bare as a clitic is supported by two facts. The
first is that the adverb can be phonetically reduced into the monosyllabic ba’
in Swedish (not possible in Norwegian).
(123) Per ba’ gick. [Swe]
P just left
The second argument is that, according to Egerland, bare cannot appear in
front of auxiliaries.
(124) * Per bara/ba’ har gått. [Swe]
P just has left
The same applies to Norwegian as long as the example is read with neutral
intonation, but if the auxiliary is stressed (125), the result is much better; also
if a less semantically impoverished auxiliary is used (126).
(125) Jens bare HAR gått.
J just has left
”It just IS the case that Jens has left.”
(126) Jens bare måtte gå.
J just must-past leave
”Jens just HAD TO leave.”
There are other problems with assuming that bare is a clitic. First, it can be
modified. If one does not take simpelthen in (127) to directly modify bare, the
example would present the same kind of problem as (128) below.
(127) Jens simpelthen bare gikk.
J simply just left
3.2 Some data 81
After Kayne (1975), one of the defining characteristics of syntactic clitics has
been taken to be that they cannot be modified. Secondly, other adverbs that
cannot plausibly be taken to modify bare can also precede Vf when bare does;
in fact, only when bare does.
(128) a. Jens vanligvis bare svarer ikke.
J usually just answers not
b. * Jens vanligvis svarer ikke.
J usually answers not
This points to the conclusion that bare should be treated on a par with
nesten ‘almost’, so that, when these adverbs are present, Vf can remain in a
low position in the IP-field. If the position of bare is lower than that of vanligvis,
we can also explain why the latter adverb actually has to precede Vf when bare
does. Compare (129) to (128a):
(129) * Jens bare svarer vanligvis ikke.
J just answers usually not
Since the negation must follow Vf in the relevant construction, the construction
cannot be regarded as V-in-situ, either. This is illustrated in (130).
(130) a. Jens bare liker ikke fiskekaker.
J just likes not fishcakes
b. * Jens bare ikke liker fiskekaker.
J just not likes fishcakes
So far, we can conclude that V2-violating bare is not a clitic on Vf; that
bare occupies a position lower than vanligvis ‘usually’; and that Vf can remain
below that position when bare is present, although it cannot remain in situ.
3.2.2 Pronoun Shift
In the next few paragraphs, we will see that there are reasons to think that
V2 does not involve head movement of Vf, but rather movement of a phrasal
category. A corollary of our observations will be that weak pronouns, in fact, do
not shift. In the discussion of weak pronouns, we use the phonetically reduced
forms ’n ‘he/him/it’ and ’a ‘she/her/%it’² since these are unambiguously weak:
they must shift to the left. Consider the pattern below.
2 Some dialects use ’a to replace all feminine nouns, including inanimates. These dialects
typically use the “personal” pronouns for inanimates, also in their unreduced forms, as in (i),
although, if the pronoun in (i) is prosodically stressed, it must refer to a person (Cardinaletti
and Starke 1995).
(i) Hu ligger på bordet.
She lies on the-table
“It [the book] is lying on the table”
Other dialects (including my own) use ‘n for inanimates, regardless of gender.
(131) a. Derfor vanligvis bare svarte ’n ’a ikke.
therefore usually just answered he her not
b. Derfor svarte ’n ’a vanligvis bare ikke.
therefore answered he her usually just not
c. * Derfor ’n ’a vanligvis bare svarte ikke.
therefore he her usually just answered not
d. * Derfor svarte vanligvis bare ’n ’a ikke.
therefore answered usually just he her not
In (131) we see that when the subject and the object are both realized as weak
pronouns, they must remain immediately right-adjacent to the verb. In (131a),
the verb+arguments complex remains below the position of bare, whereas in
(131b), the entire complex is moved around bare and vanligvis to the second
position. (131c)-(131d) are added to show that the complex cannot be split
up. This seems to indicate that it is moving as a constituent. The alternative
would be to say that the verb and the arguments move separately, but that
the verb somehow blocks further movement of the pronouns (cf. the Shortest
Move/Minimal Link Condition of Chomsky (1995)). One would need an extra
landing site for the pronouns, higher than the negation, but lower than the other
adverbs. The pronouns would move as high as they could without crossing the
verb in the overt syntax, and then proceed to the higher position(s) covertly.
The features attracting the pronouns to the higher positions would have to be
’optionally strong.’ It is obviously simpler to say that the Vf and the pronouns
are moving as a constituent. That analysis obviates the need for extra landing
sites and optionally strong features.
Taking V2 to be derived by XP-movement, we arrive at the following result:
Generalization 1 Weak pronouns do not shift.
When a weak pronoun appears to have moved across adverbs, it is something
else containing the pronouns that has moved. Consider now the following
pattern where the subject and the object appear to be moving as a constituent
without Vf.
(132) a. Derfor svarte Jens ’a vanligvis ikke.
therefore answered J her usually not
b. Derfor svarte vanligvis ikke Jens ’a.
therefore answered usually not J her
c. * Derfor svarte Jens vanligvis ikke ’a
therefore answered J usually not her
d. * Derfor svarte ’a vanligvis ikke Jens.
therefore answered her usually not J
The subject and the weak, pronominal object can follow the adverbs as long
as they remain adjacent. In Swedish, (132c),(132d) are also possible. The two
arguments need not remain adjacent in that language. In Danish, only (132a)
is grammatical. For Norwegian, then, the subject and the object seem to make
up a constituent in these examples.
Assuming that pronoun shift actually does not exist squares well with one
of the fundamental properties that pronoun shift appears to have: no effect. It
has been noted in the literature that pronoun shift does not create new binding
possibilities, it does not license parasitic gaps, it does not block wh-movement
(relativized minimality), it does not interfere with passivization (relativized
minimality); in short, it has no effect at all. This obviously supports
Generalization 1.³
3.3 First approximation
A simple way to derive these constituents, which will be shown to be inadequate
shortly, would be to assume the following. There is an XP dominating VP into
which Vf always moves. This XP, in turn, moves to spec-Fin prior to fronting of
some constituent to spec-Top. The derivations would go as in derivation 3.1 and
3.2, ignoring the base position of the adverbial derfor (’therefore’). Phonetic
material is in boldface and cyclicity is ignored in this derivation.
Derivation 3.1
[TopP Top [FinP Fin [IP not [XP X [VP he answered her ]]]]]
Vf moves to X ↓
[TopP Top [FinP Fin [IP not [XP answered_i+X [VP he t_i her ]]]]]
XP moves to spec-Fin and some constituent topicalizes ↓
[TopP therefore Top [FinP [XP answered_i+X [VP he t_i her ]] Fin [IP not t_XP ]]]
This gives us the verb+arguments constituent we demonstrated in (131) above.
If the subject bears focus, the remnant VP is extracted into the IP-field prior
to movement of XP to spec-Fin.
Thus the constituent made up by the subject and the object in (132) is the
remnant VP. If bare is merged into the IP, it attracts XP (cf. Kayne 1998). In
this case, either XP or IP moves further up to spec-Fin.
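The two derivations of this first approximation can be restated as a small Python sketch. It is only an illustration under simplifying assumptions: phrases are modelled as (label, children) pairs, traces and silent heads are left out, and the names words, first_approximation and scramble_vp are labels introduced here, not part of the analysis.

```python
def words(node):
    """Return the pronounced words of a phrase, left to right."""
    if isinstance(node, str):
        return [node]
    _label, children = node
    result = []
    for child in children:
        result.extend(words(child))
    return result


def first_approximation(subject, verb, obj, scramble_vp=False):
    """Build the final structure of Derivation 3.1 (or 3.2 if scramble_vp)."""
    # The finite verb has always moved to X, so VP holds only the arguments.
    vp = ("VP", [subject, obj])
    if scramble_vp:
        # Derivation 3.2: the remnant VP is extracted into the IP-field first.
        xp = ("XP", [verb])
        ip = ("IP", ["not", vp])
    else:
        # Derivation 3.1: VP stays inside XP and is carried along with it.
        xp = ("XP", [verb, vp])
        ip = ("IP", ["not"])
    finp = ("FinP", [xp, ip])             # XP moves to spec-Fin
    return ("TopP", ["therefore", finp])  # some constituent topicalizes


print(" ".join(words(first_approximation("he", "answered", "her"))))
# therefore answered he her not
print(" ".join(words(first_approximation("Jens", "answered", "her", True))))
# therefore answered not Jens her
```

The second output has the word order of (132b), minus the adverb vanligvis.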
3.3.1 Problems
One of the attractive features of an analysis along these lines is that, prima
facie, it seems to explain Holmberg’s Generalization.
3A similar point will be made below for movement of Vf to “second position”.
Derivation 3.2
[TopP Top [FinP Fin [IP not [XP answered_i+X [VP Jens t_i her ]]]]]
VP scrambles into IP ↓
[TopP Top [FinP Fin [IP not [VP Jens t_i her] [XP answered_i+X t_VP ]]]]
XP moves to spec-Fin and some constituent topicalizes ↓
[TopP therefore Top [FinP [XP answered_i+X t_VP ] Fin [IP not [VP Jens t_i her] t_XP ]]]
Generalization 2 (Holmberg (1986, 1999) (HG)) Argument shift cannot
cross any phonetically realized material from the VP, i.e. the verb, a verbal
particle, a dative preposition, or other arguments of the verb, although it can
cross traces of these, as well as sentential adverbs.
Employing a non-movement analysis of argument shift, HG seems to follow
trivially. Weak pronouns cannot cross any phonetically realized material from
the VP, simply because it is the VP itself (or something bigger) that moves.
Unfortunately, the explanation of HG offered by this account breaks down
when one looks at the system more closely. This is because of what would
happen in non-V2 contexts. Vf would presumably move to X in these contexts
as well, and then nothing prevents the remnant VP from scrambling into IP,
yielding ungrammatical orders such as the following.
(133) a. * . . . at Jens ’a ikke svarer
. . . that J her not answers
b. * . . . at Jens ikke fiskekaker liker.
. . . that J not fishcakes likes
In order to prevent such orders, we would have to make extraction of VPs or
objects from XP somehow contingent upon subsequent raising of XP to
spec-Fin. That is, we would have to reintroduce a notion of HG, which is what
we set out to derive.
Similarly, one would need to account for the unavailability of subject-verb
inversion in non-V2 contexts with weak pronominal subjects. Suppose that the
finite subordinator at ‘that’ is generated in Fin and somehow blocks XP-to-
spec-Fin movement as well as topicalization. As it stands, our account would
lead to the incorrect expectation that the following order should be grammatical
with the structure in (134b).
(134) a. * . . . at ikke svarte ’n ’a
. . . that not answered he her
b. [TopP Top [FinP that Fin [IP not [XP answered_i+X [VP he t_i her ]]]]]
Another problem is that it is not clear that the account explains root-V2.
The extent to which it succeeds in doing so depends on whether TopP is the
only projection dominating FinP. In e.g. Rizzi (1997), where the names TopP
and FinP originate, two other FPs are postulated, both of which
dominate FinP, i.e. FocP and ForceP. Thus, we would have to show either that
these projections do not exist, or that, for independent reasons, they cannot
be filled in the relevant cases. These considerations are taken to show that a
more radical departure from standard assumptions is needed.
3.4 More data: how initial is the initial position?
Before we turn to our second approximation to the proper analysis of V2,
some data will be reviewed that purport to show that the initial position, i.e.
spec-CP in traditional analyses, need not be construed as a base-generated
initial position. The evidence we will review suggests that operations like wh-
movement and topicalization are done in (at least) two separate movement
steps, one targeting an IP-internal position, and a second one whose nature
we try to elucidate in the remainder of the chapter. Consider the following
contrasts:
(135) a. Al very probably won.
b. * How probably did Al win?
c. How probable is it that Al won?
(136) a. Al quite possibly won.
b. * How possibly did Al win?
c. How possible is it that Al won?
(137) * How [probably/ possibly/ fortunately/ necessarily/ evidently/ maybe/
frankly/ usually] did Al win?
It seems that wh-movement of higher adverbs (cf. Cinque 1999) is systemati-
cally impossible. This contrasts with the behavior of lower adverbs, which do
allow this kind of movement:
(138) How [quickly/ effortlessly/ often/ soon/ frequently] did Al win?
Given that degree modification of the higher adverbs is possible (cf. (135a),
(136a)), and that the adjectival counterparts of, e.g., (137) (i.e. ”how
fortunate/usual/... is it that...”) are grammatical, it seems that we are dealing
with a syntactic phenomenon, rather than a (purely) semantic one. This
suggests that wh-movement is a composite operation, consisting of two parts: one
movement step targeting a relatively low, ”IP-internal” position, call it P1, and
another step targeting the first position of the clause (P2). If P1 is generated
lower than adverbs like probably, the ungrammaticality of examples like (137)
would follow from the Extension Condition (Chomsky (1995)), since moving such
an adverb from its base position to P1 would amount to lowering. In other words,
the only way for these adverbs to obey the Extension Condition would be to merge
directly with P1, the wh-position, and then raise to their base position.
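The logic of this argument can be made concrete with a toy check. The sketch below is an illustration resting on assumptions, not part of the analysis: the numeric heights are arbitrary and only encode that P1 sits below the higher adverbs of (137) and above the lower adverbs of (138), and the names HIGHER, LOWER, P1_HEIGHT, base_height and can_wh_move are hypothetical.

```python
# Adverb classes taken from (137) and (138).
HIGHER = {"probably", "possibly", "fortunately", "necessarily",
          "evidently", "maybe", "frankly", "usually"}
LOWER = {"quickly", "effortlessly", "often", "soon", "frequently"}

# Arbitrary heights (assumption): only their relative order matters.
HIGHER_HEIGHT, P1_HEIGHT, LOWER_HEIGHT = 2, 1, 0


def base_height(adverb):
    """Return the assumed height of the adverb's base position."""
    if adverb in HIGHER:
        return HIGHER_HEIGHT
    if adverb in LOWER:
        return LOWER_HEIGHT
    raise ValueError(f"unclassified adverb: {adverb}")


def can_wh_move(adverb):
    """The first movement step targets P1; it is licit only if it goes upward."""
    return base_height(adverb) < P1_HEIGHT


for adverb in ("probably", "quickly"):
    mark = "" if can_wh_move(adverb) else "* "
    print(f"{mark}How {adverb} did Al win?")
```

On these assumptions, every adverb in (137) is starred and every adverb in (138) is not, which is the contrast the two-step view of wh-movement is meant to capture.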
There are some indications that the same reasoning applies to topicalization.
That is, there are topicalization-like processes that seem to target an IP-internal
position in embedded clauses only. One case in point is so-called ’stylistic
fronting’ in Icelandic (cf. a.o. Holmberg (2000)).
Another embedded, ”IP-internal” topicalization process is illustrated by
the following Norwegian examples. These may seem slightly contrived, but
the contrast between (139b) and the other two is quite sharp. In (139a), we
see an adverbial modifier (bak låven), modifying the most deeply embedded
predicate, but appearing displaced from it, in the mittelfeld of the next clause
up. In (139b) we see that this kind of displacement is unavailable if the target
is the root clause. In this case, the displaced adverbial will appear in the very
first position, as illustrated in (139c).
(139) a. Det jeg sa var at jeg bak låven aldri har
what I said was that I behind the-barn never have
skjønt hvorfor han plantet tulipaner.
understood why he planted tulips
b. * Jeg har bak låven aldri skjønt hvorfor han
I have behind the-barn never understood why he
plantet tulipaner.
planted tulips
c. Bak låven har jeg aldri skjønt hvorfor han
behind the-barn have I never understood why he
plantet tulipaner.
planted tulips
It is tempting to say that bak låven is, in some sense, in the ”same position”
in (139a) and (139c). Yet another indication that something is wrong with the
standard view of the left periphery is what appears to be obligatory subject-
verb inversion in a quite low position. Consider the following Norwegian ex-
amples where the object has been topicalized around a V2-violating fpt.
(140) a. Meg vanligvis bare svarte ikke Jens.
me usually just answered not J
b. Meg vanligvis bare svarte Jens ikke.
me usually just answered J not
c. * Meg vanligvis bare Jens svarte ikke.
me usually just J answered not
d. * Meg vanligvis Jens bare svarte ikke.
me usually J just answered not
e. * Meg Jens vanligvis bare svarte ikke.
me J usually just answered not
Given the discussion in section 3.2, it appears that Vf cannot be in C. One
could try to say that, in (140a) the subject is inside VP. This would not work
for (140b), since here, the subject precedes the negation as well. Thus, the
standard ’V-to-C’ analysis of subject-verb inversion cannot handle this phe-
nomenon. The same point could be made with ’distributive’ conjunctions like
(141) (cf. Zamparelli 2000).
(141) a. Meg vanligvis både slo de og sparket.
me usually both beat they and kicked
”They usually both beat and kicked me.”
b. * Meg vanligvis både de slo og sparket.
me usually both they beat and kicked
c. * Meg vanligvis de både slo og sparket.
me usually they both beat and kicked
d. * Meg de vanligvis både slo og sparket.
me they usually both beat and kicked
For reasons of space, we will not enter into a discussion of distributive con-
junction here. We refer the reader to Zamparelli’s work for an analysis that is
congenial to the analysis to be presented here.
3.5 Second approximation
Taking our conclusions so far quite literally, we arrive at the following picture
of ’clausal architecture’:
(142) [advP adv* [WP [FocP [TopP [FinP VP]]]]]
Here, the heads Foc, Top, Fin are the ones argued for by Rizzi (1997). W is
the head introduced by Kayne (1998) to deal with scope phenomena involving
the fpt only. We do not invoke Rizzi’s (ibid.) head ’Force’, mainly because
it is not necessary for our purposes. Adverbs attract TopP to their specifiers.
We assume that, in case nothing is focus, or the entire sentence is, TopP is
attracted to spec-Foc, and subsequently to spec-W. Thus in these cases, the
sequence W, Foc will simply be omitted. Let us see how the system works by
going through some derivations.
3.5.1 Root clauses
We begin by deriving simple root clauses like the following.
(143) Jens svarte meg vanligvis.
J answered me usually
Derivation 3.3
[VP John answered me]
merge Fin and move V ↓
[FinP answered [VP John me]]
merge Top and move John ↓
[TopP John [FinP answered [VP me]]]
merge usually and move TopP ↓
[AdvP [TopP John [FinP answered [VP me]]] usually]
In this derivation, the entire sequence John answered me climbs around the
adverb. The only movement step which is always necessary is movement of
Vf to Fin. The object could also have moved to spec-Top. In that case, the
subject would either remain in-situ, or move to spec-Foc.
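The steps of derivation 3.3 can be replayed mechanically. The following Python sketch is a simplified model under stated assumptions: phrases are (label, children) pairs, moved items leave no visible trace, and the helper names words, remove_word, merge_and_move and merge_adverb are hypothetical.

```python
def words(node):
    """Return the pronounced words of a phrase, left to right."""
    if isinstance(node, str):
        return [node]
    _label, children = node
    result = []
    for child in children:
        result.extend(words(child))
    return result


def remove_word(node, word):
    """Return (copy of node without the first leaf `word`, whether it was found)."""
    label, children = node
    kept, found = [], False
    for child in children:
        if found:
            kept.append(child)
        elif child == word:
            found = True  # the moved word is dropped; its trace is silent
        elif isinstance(child, tuple):
            new_child, found = remove_word(child, word)
            kept.append(new_child)
        else:
            kept.append(child)
    return (label, kept), found


def merge_and_move(label, complement, word):
    """Merge a head named `label` and move `word` to the left edge of its phrase."""
    remnant, found = remove_word(complement, word)
    if not found:
        raise ValueError(f"{word!r} not found")
    return (label, [word, remnant])


def merge_adverb(adverb, topp):
    """Merge an adverb; it attracts TopP to its specifier."""
    return ("AdvP", [topp, adverb])


vp = ("VP", ["John", "answered", "me"])
finp = merge_and_move("FinP", vp, "answered")  # [FinP answered [VP John me]]
topp = merge_and_move("TopP", finp, "John")    # [TopP John [FinP answered [VP me]]]
advp = merge_adverb("usually", topp)
print(" ".join(words(advp)))  # John answered me usually
```

The printed order corresponds to the gloss of (143).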
Consider now a sentence with an indefinite object noun phrase. In these,
the indefinite will move to spec-Foc, prior to movement of TopP to spec-W.
Derivation 3.4
[VP John read a book]
merge Fin and move V ↓
[FinP read [VP John a book]]
merge Top and move John ↓
[TopP John [FinP read [VP a book]]]
merge Foc and move a book ↓
[FocP a book [TopP John [FinP read [VP ]]]]
merge W and move TopP ↓
[WP [TopP John [FinP read ]] [FocP a book]]
merge usually and move TopP ↓
[AdvP [TopP John [FinP read ]] usually [WP [FocP a book]]]
(144) Jens leste vanligvis en bok.
J read usually a book
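Derivation 3.4 can be compressed into a few lines of Python. This is a sketch under stated assumptions: structures are flattened to word lists, the helper name derive_v2 is hypothetical, and the second adverb in the usage example (probably) is added purely for illustration and makes no claim about the corresponding Norwegian sentence.

```python
def derive_v2(subject, verb, focused_object, adverbs):
    """Flat model of derivation 3.4; `adverbs` is listed from highest to lowest."""
    # After V-to-Fin and subject-to-spec-Top, TopP is pronounced "subject verb".
    topp = [subject, verb]
    # The indefinite object sits in spec-Foc and is stranded once TopP
    # moves to spec-W.
    stranded = [focused_object]
    # Each adverb is merged on top and attracts TopP to its specifier, so the
    # adverb ends up immediately to the right of TopP. The lowest adverb
    # merges first.
    for adverb in reversed(adverbs):
        stranded = [adverb] + stranded
    return topp + stranded


print(" ".join(derive_v2("John", "read", "a book", ["usually"])))
# John read usually a book
print(" ".join(derive_v2("John", "read", "a book", ["probably", "usually"])))
# John read probably usually a book
```

However many adverbs are merged, the finite verb stays in second position, because each new adverb only repeats the last step.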
It is easy to see that addition of more adverbs would only lead to iteration
of this last step, so V2 is derived for these cases. If the subject bears focus, we
now expect the ungrammatical (145a), as shown in derivation 3.5.
(145) a. * Derfor gjenkjente meg vanligvis en student.
therefore recognized me usually a student
b. Derfor gjenkjente vanligvis en student meg.
therefore recognized usually a student me
Derivation 3.5
[VP a student recognized me]
merge Fin and move V ↓
[FinP recognized [VP a student me]]
merge Top and therefore ↓
[TopP therefore [FinP recognized [VP a student me]]]
merge Foc and move a student ↓
[FocP a student [TopP therefore [FinP recognized [VP me]]]]
merge W and move TopP ↓
[WP [TopP therefore [FinP recognized [VP me]]] [FocP a student]]
merge usually and move TopP ↓
[AdvP [TopP therefore [FinP recognized [VP me]]] usually [WP [FocP a student]]]
Recall from the discussion in section 3.2 that such orders are actually attested
in Swedish. Thus we suggest that this is how they are derived. In order to
rule them out in Norwegian we will resort to the strategy suggested in section
3.3, i.e. that the subject pied-pipes the VP to spec-Foc. This makes the subject
and the object behave as a constituent and prevents the subject from inverting
with the object. This is illustrated in derivation 3.6.
Derivation 3.6
[VP a student recognized me]
merge Fin and move V ↓
[FinP recognized [VP a student me]]
merge Top and therefore ↓
[TopP therefore [FinP recognized [VP a student me]]]
merge Foc and move VP ↓
[FocP [VP a student me] [TopP therefore [FinP recognized]]]
merge W and move TopP ↓
[WP [TopP therefore [FinP recognized]] [FocP [VP a student me]]]
merge usually and move TopP ↓
[AdvP [TopP therefore [FinP recognized]] usually [WP [FocP [VP a student me]]]]
In Danish, subjects must precede all adverbs. Objects follow them unless
they are weak pronouns, in which case they must also precede them. I would
like to suggest that in this language, spec-Foc has been ’grammaticalized’ as a
case position for objects. Weak pronouns, being marked for case, do not have
to move there. This means that there are but two positions the subject can
choose between: it can move to spec-Top, or it can remain in-situ. In the first
case, it will end up in the V2-initial position, as happens to therefore in
derivation 3.6. In the latter, it will remain immediately right-adjacent to the
verb, much as the weak object pronoun in derivation 3.3. Thus we have three
levels of freedom with regard to subjects and spec-Foc. In Swedish, the subject
can move there alone; in Norwegian, it must pied-pipe the VP, and in Danish,
it cannot move there at all.
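The three settings can be summarized in a short Python sketch. It is a deliberately coarse illustration: structures are flattened to word lists, only the option named in the text is modelled for each language, the English glosses of (145) stand in for the actual sentences, and the function name focused_subject_order is hypothetical.

```python
def focused_subject_order(language):
    """Return the word order with a focused subject, or None if underivable."""
    topic, verb, subject, obj, adverb = (
        "therefore", "recognized", "a student", "me", "usually")
    if language == "Swedish":
        # The subject moves to spec-Foc alone; the object stays inside TopP.
        topp, focp = [topic, verb, obj], [subject]
    elif language == "Norwegian":
        # The subject pied-pipes the VP, so the object comes along.
        topp, focp = [topic, verb], [subject, obj]
    elif language == "Danish":
        # The subject cannot move to spec-Foc at all.
        return None
    else:
        raise ValueError(f"no setting assumed for {language}")
    # TopP moves to spec-W and then around the adverb, stranding FocP.
    return topp + [adverb] + focp


for language in ("Swedish", "Norwegian", "Danish"):
    order = focused_subject_order(language)
    print(language, "->", " ".join(order) if order else "no derivation via spec-Foc")
```

The Swedish setting reproduces the order of (145a), as in derivation 3.5, and the Norwegian setting the order of (145b), as in derivation 3.6.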
bare
Let us now turn to the problematic cases with V2-violating focus particles. In
general, we expect these to appear in the Adv* area of (142). I suggest that
they differ from other adverbs in that, instead of attracting TopP, they attract
their focus associate, and come with a W-projection into which bare itself and
TopP move, i.e. we essentially adopt the treatment in Kayne (1998). Suppose
that we add bare to the end result of derivation 3.6. Then, if en student is the
associate, we get derivation 3.7, corresponding to the grammatical sentence in
(146).
(146) Derfor gjenkjente bare en student meg.
therefore recognized just one student me
Adding usually, we get the continuation in derivation 3.8, corresponding, again,
to the grammatical sentence (147).
(147) Derfor gjenkjente vanligvis bare en student meg.
therefore recognized usually only one student me
Derivation 3.7
[VP one student recognized me]
merge Fin and move V ↓
[FinP recognized [VP one student me]]
merge Top and therefore ↓
[TopP therefore [FinP recognized [VP one student me]]]
merge Foc and move VP ↓
[FocP [VP one student me] [TopP therefore [FinP recognized]]]
merge W and move TopP ↓
[WP [TopP therefore [FinP recognized]] [FocP [VP one student me]]]
merge just and move FocP ↓
[bareP [FocP [VP one student me]] just [WP [TopP therefore [FinP recognized]]]]
merge W and move just and TopP ↓
[WP [TopP therefore [FinP recognized]] just [bareP [FocP [VP one student me]] [WP ]]]
Derivation 3.8
merge usually and move TopP ↓
[AdvP [TopP therefore [FinP recognized]] usually [WP just [bareP [FocP [VP one student me]]]]]
Consider now derivation 3.9 of (128a) above, repeated here, in which the verb
is the associate. In this case, FinP moves to spec-bare.
(148) Jens vanligvis bare svarer ikke.
J usually just answers not
The essential difference between these examples and ’ordinary’ V2 sentences is
thus that the verb has been pulled out of TopP. Therefore, when TopP moves
around higher adverbs, the verb is left behind. Of course, the last step of
derivation 3.9 can be skipped, yielding the equally grammatical sentence Jens
bare svarer ikke (’J just answers not’).
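The effect of pulling the verb out of TopP can be traced step by step. The Python sketch below replays derivation 3.9 under simplifying assumptions: phrases are (label, children) pairs, traces are omitted, the raising of just to W is folded into the construction of WP, and the helper names words and extract are hypothetical.

```python
def words(node):
    """Return the pronounced words of a phrase, left to right."""
    if isinstance(node, str):
        return [node]
    _label, children = node
    result = []
    for child in children:
        result.extend(words(child))
    return result


def extract(node, label):
    """Return (copy of node without its first sub-phrase `label`, that sub-phrase)."""
    head, children = node
    kept, found = [], None
    for child in children:
        if found is None and isinstance(child, tuple):
            if child[0] == label:
                found = child  # this phrase moves away; nothing is left behind
                continue
            child, found = extract(child, label)
        kept.append(child)
    return (head, kept), found


# [AdvP [TopP John [FinP answers]] not]
advp = ("AdvP", [("TopP", ["John", ("FinP", ["answers"])]), "not"])

# merge just and move FinP: the verb is the associate of the focus particle.
rest, finp = extract(advp, "FinP")
barep = ("bareP", [finp, rest])

# merge W and move just and TopP.
rest, topp = extract(barep, "TopP")
wp = ("WP", [topp, "just", rest])
print(" ".join(words(wp)))        # John just answers not

# merge usually and move TopP: the verb is no longer inside TopP.
rest, topp = extract(wp, "TopP")
top = ("AdvP", [topp, "usually", rest])
print(" ".join(words(top)))       # John usually just answers not
```

The first printed order corresponds to the variant in which the last step is skipped, the second to (148).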
Since weak pronouns do not move, they will be left inside the VP. This is
what causes them to stay adjacent to the verb. The partial derivation 3.10 for
(149) (=131a), illustrates this.
Derivation 3.9
[VP John answers]
merge Fin and move V ↓
[FinP answers [VP John]]
merge Top and move John ↓
[TopP John [FinP answers [VP ]]]
merge not and move TopP ↓
[AdvP [TopP John [FinP answers]] not ]
merge just and move FinP ↓
[bareP [FinP answers] just [AdvP [TopP John] not ]]
merge W and move just and TopP ↓
[WP [TopP John] just [bareP [FinP answers] [AdvP not ]]]
merge usually and move TopP ↓
[AdvP [TopP John] usually [WP just [bareP [FinP answers] [AdvP not ]]]]
Derivation 3.10
[AdvP [TopP therefore [FinP answered [VP he her]]] not ]
merge just and move FinP ↓
[bareP [FinP answered [VP he her]] just [AdvP [TopP therefore] not ]]
merge W and move just and TopP ↓
[WP [TopP therefore] just [bareP [FinP answered [VP he her]] [AdvP not ]]]
merge usually and move TopP ↓
[AdvP [TopP therefore] usually [WP just [bareP [FinP answered [VP he her]] [AdvP not ]]]]
(149) Derfor vanligvis bare svarte ’n ’a ikke.
therefore usually just answered he her not
Nothing special needs to be said about the low subject-verb inversions noted
in section 3.4. There are two cases to consider. If the subject is a weak pronoun,
it remains in the VP, and moves with the verb, as in the previous derivation.
If it is a full noun phrase, as in (150) (=140a), it can move to spec-Foc, giving
rise to derivation 3.11.
(150) Meg vanligvis bare svarte ikke Jens.
me usually just answered not J
Derivation 3.11
[VP John answered me]
merge Fin and move V ↓
[FinP answered [VP John me]]
merge Top and move me ↓
[TopP me [FinP answered [VP John]]]
merge Foc and move VP ↓
[FocP [VP John] [TopP me [FinP answered]]]
merge W and move TopP ↓
[WP [TopP me [FinP answered]] [FocP John]]
merge not and move TopP ↓
[AdvP [TopP me [FinP answered]] not [WP [FocP John]]]
merge just and move FinP ↓
[bareP [FinP answered] just [AdvP [TopP me] not [WP [FocP John]]]]
merge W and move just and TopP ↓
[WP [TopP me] just [bareP [FinP answered] [AdvP not [WP [FocP John]]]]]
merge usually and move TopP ↓
[AdvP [TopP me] usually [WP just [bareP [FinP answered] [AdvP not [WP [FocP John]]]]]]
As the reader may have noticed, nothing has been said, so far, about exam-
ples like (131b), repeated in (151), where Vf does appear in the second position
even though there is an fpt modifying it.
(151) Derfor svarte ’n ’a vanligvis bare ikke.
therefore answered he her usually just not
In order to handle these cases, we propose that FinP, when it moves to spec-
bare, can pied-pipe TopP. This has the effect of making bare behave as other
adverbs with respect to V2. This is illustrated in the partial derivation 3.12.
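The two options for what lands in spec-bare can be contrasted in a flat Python sketch. It is an illustration only: phrases are reduced to word lists, the English glosses stand in for the Norwegian words, and the function name derive_with_bare and its pied_pipe flag are hypothetical.

```python
def derive_with_bare(pied_pipe):
    """Flat model of the choice between derivations 3.10 and 3.12."""
    topic = ["therefore"]
    finp = ["answered", "he", "her"]  # weak pronouns stay inside VP, hence inside FinP
    if pied_pipe:
        # FinP pied-pipes TopP to spec-bare, so the verb keeps travelling
        # with TopP when TopP later moves to spec-W and around higher adverbs.
        topp, left_in_spec_bare = topic + finp, []
    else:
        # FinP moves to spec-bare alone; TopP goes on without the verb.
        topp, left_in_spec_bare = topic, finp
    return topp + ["usually", "just"] + left_in_spec_bare + ["not"]


print(" ".join(derive_with_bare(pied_pipe=False)))
# therefore usually just answered he her not
print(" ".join(derive_with_bare(pied_pipe=True)))
# therefore answered he her usually just not
```

Without pied-piping, the output matches the gloss of (149); with it, the gloss of (151).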
Derivation 3.12
[AdvP [TopP therefore [FinP answered [VP he her]]] not ]
merge just and move TopP ↓
[bareP [TopP therefore [FinP answered [VP he her]]] just [AdvP not ]]
merge W and move just and TopP ↓
[WP [TopP therefore [FinP answered [VP he her]]] just [bareP [AdvP not ]]]
merge usually and move TopP ↓
[AdvP [TopP therefore [FinP answered [VP he her]]] usually [WP just [bareP [AdvP not ]]]]
Auxiliaries and the derivation of HG
There is a large literature on the syntactic treatment of auxiliaries (cf. Cinque
2000, Koopman and Szabolcsi 2000, Julien 2000 for recent discussion). One
controversy is whether or not to treat e.g. participial constructions as biclausal
(Kayne 1993). We remain agnostic about this question here, simply analyzing
auxiliaries by stacking them onto the VP. We do treat them as ’raising verbs’
in the sense that they attract the subject from the inner VP. This is in order
to exclude sentences like (152) where the subject follows an auxiliary.
(152) * Derfor kan ikke ha ’n sett Jens.
therefore can not have he seen J
We assume that participial VPs move to spec-Foc unless the participial verb is
a topic. Apart from this, auxiliaries do not pose any special problems for our
account. We need to discuss them, however, in order to show that the account
really derives HG which we repeat here for convenience.
(153) Holmberg’s Generalization (HG)
Argument shift cannot cross any phonetically realized material from
the VP, i.e. the verb, a verbal particle, a dative preposition, or other
arguments of the verb, although it can cross traces of these, as well
as sentential adverbs.
We have said that weak pronouns never move. What is crucial is that they
never shift. Otherwise, they are free to move, as it were. In particular, they
can move to spec-Top, which ultimately places them in the initial position of
the clause. They cannot move to spec-Foc for the simple reason that they are
weak, and, in Danish, because they bear morphological case. Consider now the
contrast between (154) and (155), a typical example of HG.
(154) a. Jeg så ’n ikke.
I saw him not
b. * Jeg så ikke ’n.
I saw not him
(155) a. * Jeg har ’n ikke sett.
I have him not seen
b. Jeg har ikke sett ’n.
I have not seen him
We have already seen how it comes about that the weak pronoun must precede
the negation in (154a). We need to demonstrate that it cannot do so if there
is an auxiliary. The first steps in the derivation of (155b) are as follows.
Derivation 3.13
[VP I have [PtcP seen him]]
merge Fin and move V ↓
[FinP have [VP I [PtcP seen him]]]
At this point, one of two things can happen: either the subject moves to
spec-Top, or the object does. As we shall see below, the participial verb can
also do this. If the subject moves, we get the following continuation, deriving
(155b).
Derivation 3.14
merge Top and move I ↓
[TopP I [FinP have [VP [PtcP seen him]]]]
merge Foc and move PtcP ↓
[FocP [PtcP seen him] [TopP I [FinP have [VP ]]]]
merge W and move TopP ↓
[WP [TopP I [FinP have ]] [FocP [PtcP seen him]]]
merge not and move TopP ↓
[AdvP [TopP I [FinP have ]] not [WP [FocP [PtcP seen him]]]]
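The way the contrast between (154) and (155) falls out can be sketched in Python. This is a minimal model under stated assumptions: structures are flattened to word lists, only subject-initial clauses with negation are covered, and the function name clause and its parameters are hypothetical.

```python
def clause(subject, finite_verb, weak_object, participle=None, negation="ikke"):
    """Flat model of a subject-initial clause in which the weak pronoun never shifts."""
    if participle is None:
        # No auxiliary: the pronoun stays in VP, which is inside FinP and TopP,
        # so it is carried around the negation together with the verb.
        topp, focp = [subject, finite_verb, weak_object], []
    else:
        # With an auxiliary, the participial phrase containing the pronoun
        # moves to spec-Foc and is stranded when TopP moves on.
        topp, focp = [subject, finite_verb], [participle, weak_object]
    # TopP moves to the specifier of the negation.
    return topp + [negation] + focp


print(" ".join(clause("Jeg", "så", "'n")))           # Jeg så 'n ikke
print(" ".join(clause("Jeg", "har", "'n", "sett")))  # Jeg har ikke sett 'n
```

The two outputs correspond to (154a) and (155b). The starred orders (154b) and (155a) are not produced, since nothing in the model moves the pronoun on its own.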
In the alternative case, the object moves to spec-Top, yielding the following
continuation of derivation 3.13, corresponding to the grammatical sentence in
(156).
(156) Han har jeg ikke sett.
him have I not seen
Derivation 3.15
merge Top and move him ↓
[TopP him [FinP have [VP I [PtcP seen]]]]
merge Foc and move PtcP ↓
[FocP [PtcP seen] [TopP him [FinP have [VP I]]]]
merge W and move TopP ↓
[WP [TopP him [FinP have [VP I]]] [FocP [PtcP seen]]]
merge not and move TopP ↓
[AdvP [TopP him [FinP have [VP I]]] not [WP [FocP [PtcP seen ]]]]
If the participial verb is a topic, it moves to spec-Top on its own. This
implies that heads can move to specifier positions, contra standard assumptions.
It is possible that this problem would disappear if we adopt a biclausal structure
for participial constructions (Kayne 1993). Another alternative would be to
say that the participle adjoins to Top, and that this serves the same purpose
as moving it to spec-Top. We choose to live with this problem for the purposes
of this section. Consider the contrast below.
(157) a. Sett har jeg ’n ikke.
seen have I him not
b. * Sett har jeg ikke ’n.
seen have I not him
When the participle is fronted, the pronoun must precede the negation. This
fact is problematic for the traditional account which treats weak pronoun dis-
tribution as pronoun shift and V2 as V-to-C with subsequent topicalization.
In order to rule out (155a), such an account must block pronoun shift over the
verb, but allow it over its trace. In order to account for the grammaticality of
(157a), the same account must allow movement over the verb. In fact, in view
of (157b) such movement must be forced just in case the verb subsequently
topicalizes or undergoes V-to-C movement.
Suppose that the proper formulation of HG is in entirely phonological terms.
The fact that pronoun shift is allowed around traces might follow rather ele-
gantly from such a formulation, since traces do not have phonological content.
But the fact that they can cross adverbs seems rather mysterious; the more
so because they must cross adverbs if they can. It seems unlikely that there
should be any phonological property distinguishing adverbs from all other ma-
terial. To be on the safe side, we will show that there isn’t. The Norwegian
word fortsatt ambiguously represents the adverb ’still’ and the participial verb
’continued’. In the adverb interpretation it forces pronoun shift and in the
participial interpretation it blocks it.
(158) a. Han har fortsatt det.
he has continued it
b. Han har det fortsatt.
he has it still
Rejecting a phonological formulation, there are two scenarios to consider ac-
cording to whether or not bare heads are allowed to topicalize. If they are,
one would need to assume that there is some domain extension mechanism (cf.
Chomsky’s 1995 notion of ’equidistance’) which would be triggered by such
topicalization. If head movement to spec-CP is not endorsed, one would need
to assume that pronoun shift cannot cross the participle for syntactic reasons,
but that this can be violated if the (remnant) VP subsequently topicalizes. On
our account, nothing special needs to be said. We assume that heads can move
to spec-Top with the caveat noted above. The derivation of (157a) runs as
follows.
Derivation 3.16
[FinP have [VP I [PtcP seen him]]]
merge Top and move Ptc
[TopP seen [FinP have [VP I [PtcP him]]]]
merge not and move TopP
[AdvP [TopP seen [FinP have [VP I [PtcP him]]]] not]
This concludes our discussion of Mainland Scandinavian root clauses.
3.5.2 More problems
One obvious problem with the second approximation is that it leads us to
expect sentences like (159) to be underivable, contrary to fact.
(159) Sannsynligvis har Jens gått hjem.
probably has J gone home
The adverb sannsynligvis ‘probably’, being generated in the “adv*” area of
(142), would have to be lowered into spec-TopP in order to end up in the first
position. Assuming that lowering is impossible for principled reasons, this must
be wrong.
A second problem, pointed out to me by Richard Kayne (p.c.), is that
the account of topicalized bare participles in terms of head-movement to spec-
TopP fails to generalize to cases like the following, where the fronted material
is syntactically complex but nevertheless strands the pronoun.
(160) a. [lagt på bordet] har jeg dem ikke.
put on the-table have I them not
b. Jeg har (*dem) ikke (*dem) lagt *(dem) på bordet (*dem).
I have (*them) not (*them) put *(them) on the-table (*them)
Given that the subject-initial counterpart of (160a) (i.e. (160b)) has the pro-
noun sandwiched between the participle and the PP, it seems that we have to as-
sume that the weak pronoun is moved out of the VP prior to VP-topicalization.
Thus, (160) suggests a derivation like derivation 3.17, which contradicts our as-
sumption that weak pronouns don’t shift.
Derivation 3.17
[not [put them on the table]]
move them
[them_i [not [put t_i on the table]]]
merge I and have
[I have [them [not [put t_i on the table]]]]
move have and VP
[[put t_i on the table]_j have_k [I t_k [them_i [not t_j]]]]
However, there are reasons to think that the position of the pronoun in (160) is
due to an operation different from “ordinary” pronoun shift (henceforth PS; I
will refer to the pronoun shift of examples like (160) as “stranded PS”, in short
SPS). It is well-known that PS does not license parasitic gaps, thus (161a) is
ungrammatical.
(161) a. * Jeg kysset henne aldri uten å danse med pg først.
I kissed her never without to dance with pg first
b. Jeg kysset henne aldri uten å danse med henne først.
I kissed her never without to dance with her first.
At first sight, SPS behaves the same, i.e. (162) is also ungrammatical.
(162) * Kysset har jeg henne aldri uten å danse med pg først.
kissed have I her never without to dance with pg first
However, when the adverbial PP is moved along with the topicalized participle,
the parasitic gap becomes possible, in fact even obligatory: (163b) is sharply
ungrammatical.
(163) a. [kysset uten å danse med pg først] har jeg henne aldri.
[kissed without to dance with pg first] have I her never
b. * [Kysset uten å danse med henne først] har jeg henne aldri.
[kissed without to dance with her first] have I her never
For the standard theory of PS, this would have to be taken to indicate that
the dependency between the pronoun and the VP-internal trace in SPS (163a)
must be of a different kind than the dependency between the pronoun and the
trace in “ordinary” cases of PS, such as (161b). SPS exhibits A′-properties,
while PS doesn’t. In our setup, the difference would lie in the presence (163a)
versus absence (161b) of a dependency. That is, I am suggesting that there is
movement in the former case but not in the latter.4
A third problem is that we have stipulated that e.g. the subject must pied-
pipe the VP (or vP) when it moves to spec-TopP. This is in order to rule
out examples where the object is carried along with Vf to the initial position,
crossing the subject, i.e. orders like (145), grammatical in Swedish. Suppose,
putting aside Swedish for the moment, that only VPs can move to spec-FocP.5
Then these stipulations would follow from that. There is evidence that it is at
least possible to move VPs to spec-Foc, with subsequent raising of the remnant
EVP, thus yielding “VP-extraposition” structures.
(164) a. . . . at han hver dag møtte en ny pike
. . . that he every day met a new girl
In Nilsen (2000), these are interpreted as fronting of the adverbial hver dag to
a high, left peripheral position. The grammaticality of (165), where a sentence
adverb intervenes between the matrix subject and hver dag might be taken to
call that analysis into question, however. Instead, I will interpret it as leftward
movement of the VP [met a new girl] prior to leftward movement of the remnant
EVP [every day tVP ], as illustrated in the derivation 3.18 below.
(165) . . . at han tydeligvis hver dag møtte en ny pike
. . . that he evidently every day met a new girl
Derivation 3.18
[EVP [VP met a new girl] every day]
merge Foc0 and move VP
[FocP [VP met a new girl] Foc0 [EVP t_VP every day]]
merge W0 and move EVP
[WP [EVP t_VP every day] W0 [FocP [VP met a new girl] Foc0 t_EVP]]
4 This does not explain the properties of sentences like (163). In particular, it does not
explain the obligatoriness of the parasitic gap. Although I find this question intriguing, I will
leave it aside for now. What is important here is just that SPS behaves differently from PS
in important respects, so we should not expect both to be reducible to the same operation.
5 In case one needs further projections over VP, one could generalize this statement to
“extended V-projections”.
3.6 German
German does not have V2-violations with fpt (166a). Thus, it might appear
that our argument in section 2, that Mainland Scandinavian V2 is not derived
by head movement, does not carry over to German. Of course, we would
not want to say that German V2 is derived in a fundamentally different way
from Mainland Scandinavian V2. After all, this is one property that the two
languages obviously have in common. We would like to know, however, what
underlies the contrast between the German sentence (166a) and the Norwegian
one (167), and how the analysis of V2 defended here would fare with German
facts.
(166) a. * Gerhard nur weiß es nicht.
G only knows it not
b. Gerhard weiß es nur nicht.
G knows it only not.
“Gerhard just doesn’t know it.”
(167) Jens bare vet det ikke.
J just knows it not
However, as discussed in Meinunger (to app.); Fanselow (2002),6 other ex-
pressions, such as the operator mehr als ‘more than’, must precede the finite
verb (when this is the modifiee). Consider the examples in (168).
(168) a. daß Hans seinen Profit letztes Jahr mehr als verdreifachte
that H his profit last year more than tripled
b. * Hans verdreifachte seinen Profit letztes Jahr mehr als.
H tripled his profit last year more than
c. % Hans mehr als verdreifachte seinen Profit letztes Jahr.
H more than tripled his profit last year.
d. % Seinen Profit mehr als verdreifachte Hans letztes Jahr.
his profit more than tripled H last year.
6 Meinunger (to app., 2001) gives examples like (168c) as ungrammatical. Fanselow (2002)
reports that in a survey, 6 out of 20 speakers actually accept them. For Dutch, I found that
4 out of 8 speakers accept them. The differences were very sharp: the ones who accept them
find them perfectly grammatical, while those who don’t accept them find them sharply
ungrammatical.
(169) a. % Hij meer dan verdubbelde zijn scoretotaal vorig jaar.
he more than doubled his score-total last year
b. * Hij verdubbelde zijn scoretotaal meer dan.
he doubled his score-total more than
c. % De winst meer dan verdubbelde.
The gain more than doubled
d. * De winst verdubbelde meer dan.
The gain doubled more than
In this case, the V2-violation is actually obligatory for the people who accept
it. The speakers who don’t accept them have to use a periphrastic construction
to express the same (or a similar) meaning. I take this to show that some
varieties of Dutch and German do allow for the relevant pattern. More research
is required to determine the precise generalizations concerning the limitations
on this phenomenon.
3.6.1 Hallman’s analysis
In a recent paper, Peter Hallman (2001) proposes to tie together the verb-
finality of German embedded clauses and the root-embedded asymmetry in the
same language (i.e. root clauses are V2, embedded clauses are typically verb-
final) in the following way. He follows, more or less, the standard approach to
root V2, i.e. the verb moves to some high F0, T0 in his case, and then some
XP moves to spec-F0.7
[TP XP_i [TP [T0 Vf_j T0] [AgrP subj [AgrP Agr0 [VP . . . t_i . . . t_j . . .]]]]]
Figure 3.2: Hallman’s root V2
Hallman’s innovation is to treat embedded (V-final) clauses as, in a sense,
V2 as well. His derivation of a V-final structure is given in Figure 3.3, where
Vf moves to T0 as in Figure 3.2, and then the entire AgrP is moved to spec-T.
In this sense, Vf is actually in second position, even in V-final structures.
[CP C0 [TP [AgrP_i subj [AgrP Agr0 [VP . . . t_j . . .]]] [TP [T0 Vf_j T0] t_i]]]
Figure 3.3: Hallman’s V-final
7 Hallman has AgrP, hosting the subject of the clause below TP.
3.6.2 Müller’s V2 as vP first
Müller (2002) proposes an analysis of German V2 which, as we shall see, shares
crucial assumptions with the analysis developed in section 3.5. However, the
execution is quite different, and some of the facts discussed in the present work,
notably HG, do not fall under Müller’s analysis. I will use his analysis as a
springboard to extend my analysis to German.
Müller casts the analysis in terms of phase theory (Chomsky, 1999, 2001).
He adopts the following definitions to that end.8
(170) Strict Cycle Condition (SCC)
Within the current XP α, a syntactic operation may not target a
position that is included within another XP β that is dominated by
α.
(171) Phase Impenetrability Condition (PIC)
Material that is dominated by a phase XP is not accessible to oper-
ations at ZP (the next phase) unless it is part of the edge domain of
X.
(172) Edge Domain
A category is in the edge domain of a head X if it is at an edge of
the minimal residue of X.
8 Phases are utilized in derivations of island constraints. See Starke (2001) for a recent,
comprehensive account of island constraints without resorting to phases.
(173) a. Minimal Residue
The minimal residue of X includes X, and the head minimally
c-commanded by X (the head residue, HR), and the specifiers
of X (the spec-residue, SR).
b. Edge
A category is at an edge of the minimal residue of X iff it is
the highest phonologically overt item in HR or SR.
Müller then assumes that German clause structure conforms to (174);9 he treats
“scrambling” as movement of some XP to an outer specifier of v; he follows
Diesing (1992) and others in assuming optional subject raising to spec-T; he
assumes that weak pronouns are obligatorily fronted within TP; that there is
wh-movement to the specifier of “filled C”; and, finally, that main verbs always
remain in situ, since head movement is ruled out in principle.
(174) [CP C [TP T [vP NP [vP [VP . . . V ] v ]]]]
V2 results from attraction of v by C. Since head movement is not possible,
v must move as a phrase, thus (potentially) pied piping other material. Müller
proposes the following constraint on vP movement to C:
(175) Edge Domain Pied Piping Constraint (EPC)
A moved vP contains only the edge domain of its head.
This forces massive evacuation of vP prior to movement of vP to CP. He notes
that such evacuation cannot be triggered by features, and therefore suggests
that Last Resort (LR)10 should be weakened to a ‘soft’ (i.e. violable) constraint.
Thus, his EPC must be ranked higher than Last Resort, implying an Optimality
Theoretic (OT) evaluation procedure.11 Given all this, Müller’s derivation for
(176) is given in derivation 3.19.
(176) Die Maria hat den Fritz geküßt
The-nom M has the-acc F kissed
Derivation 3.19
[vP [die Maria] [vP [VP [den Fritz] geküßt] hat]]
merge T and move VP
[TP [VP [den Fritz] geküßt]_i [TP T [vP [die Maria] [vP t_i hat]]]]
merge C and move vP
[CP [vP [die Maria] [vP t_i hat]]_j C[+vP] [TP [VP [den Fritz] geküßt]_i [TP T t_j]]]
9 Note that this structure seems to imply that Müller has head directionality.
10 LR states that every movement must result in checking of a feature.
11 He also suggests that one could reformulate LR to the effect that movement must either
be driven by feature checking or by the need to satisfy a constraint like EPC.
Müller assumes that adverbs can be merged as specifiers of vP. If an adverb is
merged before the subject, the subject will be in the edge domain of v, so the
EPC forces the adverb to move out before vP fronting takes place. This would lead
to a subject-initial sentence again. If the adverb is higher than the subject,
the latter is forced out of vP, again because of the EPC, leading to an adverb-
initial sentence. One must then assume that vP evacuation is order preserving:
if not, one would be able to derive sentences where the VP precedes the subject.
Müller suggests that this is a general property of movement operations that are
not feature driven.12 I give his derivation for (177) in derivation 3.20.
(177) Gestern hat die Maria den Fritz geküßt.
Yesterday has the-nom M the-acc F kissed
Derivation 3.20
[vP gestern [vP [die Maria] [vP [VP [den Fritz] geküßt] hat]]]
merge T and move VP and die Maria
[TP [die Maria]_i [VP [den Fritz] geküßt]_j T [vP gestern [vP t_i [vP t_j hat]]]]
merge C and move vP
[CP [vP gestern [vP t_i [vP t_j hat]]]_k C [TP [die Maria]_i [VP [den Fritz] geküßt]_j T t_k]]
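The ranking of the EPC above Last Resort that these derivations rely on can be sketched as a lexicographic comparison of violation profiles. The following is only an illustration under stated assumptions: the two candidate descriptions and their violation counts are hypothetical stand-ins rather than Müller's own tableaux, and the function name `optimal` is invented here.

```python
from typing import Dict, List

# Hypothetical violation profiles for two ways of fronting vP in (176).
# The counts are illustrative assumptions, not taken from Müller (2002).
CANDIDATES: Dict[str, Dict[str, int]] = {
    # VP stays inside the fronted vP, so vP contains more than its edge domain.
    "front vP without evacuation": {"EPC": 1, "LR": 0},
    # VP leaves vP first: one movement step that checks no feature.
    "evacuate VP, then front vP": {"EPC": 0, "LR": 1},
}


def optimal(candidates: Dict[str, Dict[str, int]], ranking: List[str]) -> str:
    """Return the candidate whose violations are lexicographically smallest
    when the constraints are read off in ranking order (highest first)."""
    def profile(name: str) -> tuple:
        return tuple(candidates[name].get(constraint, 0) for constraint in ranking)
    return min(candidates, key=profile)


if __name__ == "__main__":
    # EPC outranks Last Resort: evacuation wins despite violating LR.
    print(optimal(CANDIDATES, ["EPC", "LR"]))
    # With the opposite ranking the unevacuated vP would win instead.
    print(optimal(CANDIDATES, ["LR", "EPC"]))
```

On these assumed profiles, evacuation is selected only when the EPC dominates Last Resort, which is the sense in which Last Resort has to be a violable constraint for the derivations to go through.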
Object-initial sentences are analyzed by scrambling the object to spec-vP af-
ter merging the subject. He analyzes such scrambling as an instance of feature-
driven movement, and he denotes the feature (or bundle of features) responsible
for scrambling as [Σ]. Thus, (178) is derived in derivation 3.21.
(178) Den Fritz hat die Maria geküßt.
the-acc F has the-nom M kissed
Derivation 3.21
[vP [die Maria] [vP [VP [den Fritz][Σ] geküßt] hat[+Σ]]]
move den Fritz
[vP [den Fritz][Σ] [vP [die Maria] [vP [VP t_Σ geküßt] hat[+Σ]]]]
merge T, and move VP and die Maria
[TP [die Maria]_i [VP t_Σ geküßt]_j T [vP [den Fritz][Σ] [vP t_i [vP t_j hat[+Σ]]]]]
merge C and move vP
[CP [vP [den Fritz][Σ] [vP t_i [vP t_j hat[+Σ]]]]_k C [TP [die Maria]_i [VP t_Σ geküßt]_j T t_k]]
This kind of derivation generalizes to other material which can be moved to
spec-v. In other words, VP-topicalizations and long-distance topicalizations
will be derived in the same way, by first moving the constituent in question to
the highest specifier of v, then evacuating vP in order to satisfy the EPC and
finally moving vP to spec-CP.
12 One would still like to know why. Note also that, given that the subject and the VP
move separately, this would have to be stated as a constraint, not on a single application of
move, but on a set of applications, all of which are not triggered by a feature. The fact that
the two movements must cooperate in this way seems to me to suggest that we are rather
dealing with one movement. I return to this point below.
Müller’s account is similar to the one developed in section 3.5 (Nilsen, to
app.b) in the sense that, whereas Müller has vP first, I have TopP first. In fact,
movement to spec TopP and movement to an outer specifier of vP (triggered by
Σ) are also similar, modulo labelling, and Müller’s EPC-driven vP-evacuation
is analogous to our movement to spec-Foc. But there are also differences. An
obvious one is that it is unclear how, on Müller’s account, we could account for
Holmberg’s Generalization (HG) in the way that we did in section 3.5. Since
the pronouns follow the finite verb, they would have to reside in VP on Müller’s
account. But VP could only participate in Müllerian vP-fronting in the relevant
cases at the cost of incurring an EPC-violation. Perhaps one could treat the
EPC as a soft constraint as well, which is what Müller suggests for Last Resort.
But assuming an OT evaluation mechanism on top of the sort of syntax that
we are considering seems to me slightly off the parsimonious track. Another
problem, pointed out by Müller, is that the analysis relates topicalization to
scrambling. In other words, in this setup, an object noun phrase can only end
up preceding the finite verb by scrambling past the subject to the left edge of
vP. This creates problems for all the other Germanic V2 languages which do
not allow such scrambling in the first place, but do allow objects to occupy
the first position.13 Müller’s account also has some advantages over mine. For
example, it does not have a problem with generating V2-initial adverbs, which,
as we saw, pose problems for my account. Secondly, and related to the previous
point, Müller’s account correctly leads one to expect there to be an asymmetry
between subjects and adverbs in first position on the one hand, and objects
and other arguments on the other. The latter kind of expression can only
occupy the first position in certain discourse functions (see below), whereas
subjects and adverbs do not require special discourse function to occupy the
first position.
13 Dutch, Norwegian, Swedish and Icelandic allow “argument-shift” of weak pronouns and
full noun phrases, as long as the relative ordering of the arguments is unaffected. Danish
only allows shifting of weak pronouns.
3.7 Third approximation: V2 without positions
In order to incorporate the advantages of Müller’s account, I will assume that
there is a head Σ which can merge above or below the subject. If it is merged
below the subject, as in the leftmost tree in figure (3.4), the subject will count
as a “specifier” of Σ. If it is merged above, any adverb merged to the result of
that will count as the “specifier” of Σ. This is illustrated in the rightmost tree
in figure (3.4). I will furthermore assume that Σ has an EPP-feature: it must
have a phonetically visible specifier.
Left tree: [VP Adv [VP Subj [VP Σ [VP . . .]]]]
Right tree: [VP Adv [VP Σ [VP Subj [VP . . .]]]]
Figure 3.4: Unmarked Σ
Finally, Σ attracts the verb. If it has a marked feature, it attracts some phrase
functioning as a contrastive topic. Before we go on, we must address how the
verb moves to Σ.
Head Movement: an aside
Müller’s analysis is motivated by the desire to rule out head movement in
principle. If this is to be possible, phenomena like V2 must be analyzed without
recourse to HM. Needless to say, I fully agree with Müller that V2 can and
should be so analyzed. However, I would like to point out some aspects of
Müller’s approach that I take to be problematic. First, the machinery that he
employs to rid the system of HM is rather elaborate (Phases, massive remnant
movement, optimality theory, etc.). Thus it is not clear that eliminating HM in
this way represents a simplification of the system. This becomes even less clear
if we look at how Müller derives the unavailability of HM. Suppose that we treat
the unavailability of HM as an empirical generalization in need of explanation,
and then look at how it is explained in the setup under consideration.
Generalization 3 (Müller (2002)) Heads stay.
Müller’s explanation is that adjunction of a head to another one would violate
the Extension Condition (Chomsky, 1995), stated roughly in (179).
(179) Extension Condition
Merger extends the tree at the root.
This would rule out head movement only if it follows from independent consid-
erations that moved heads cannot be merged to the root. In other words, the
tree in figure 3.5 must be ruled out if α is a moved head (not if it isn’t moved,
of course, because then heads could never be merged to anything).
[? α [? . . . t_α . . .]]
Figure 3.5: Illicit merger
We now need to explain why it matters that α is moved and why it matters that it is a
head. To this end, we adopt some contextual definition of the notion “head”,
crucially involving that all heads project. In other words any syntactic object
that does not project further is a maximal projection, hence not a head. We
then assume some version of the Chain Uniformity Condition, such as (180):
(180) Chain Uniformity Condition (CUH)
Chains are uniform with respect to the feature [±max] (and, perhaps
other features)
If the label of the root node is not allowed to project from the moved head,
this setup rules out the configuration in question, because the trace of α is a
head (does project further), while α is a maximal projection (does not project).
Hence the configuration violates the CUH. As noted in Fanselow (2002), the
CUH would be satisfied if α projects, since, in that case, both the trace and
its antecedent are heads. In order to rule out this option, Müller could make
reference to his Unambiguous Domination constraint (UD):
(181) Unambiguous Domination (UD)
An α-trace cannot be α-dominated.
Müller motivates this constraint by independent considerations.14 This rules
out the version of figure 3.5 where the root label is projected from α, because
the root node dominates the trace of α.
In sum, HM is ruled out, because the Extension Condition rules out adjunc-
tion of a head to a head; the CUH rules out attachment of a non-projecting,
moved head to the root node, given the [±max] distinction; and the UD rules
out attachment of a projecting, moved head to the root node.
It seems to me that this is just a roundabout way of stating generalization
3. It seems to say that heads can’t move because they have a feature [-max]
that prevents them from moving. Furthermore, this feature doesn’t seem to do
anything else than just that.
14 It is needed to explain e.g. why remnant VPs can be topicalized but not scrambled in
German.
Suppose that we don’t allow ourselves all this machinery. In other words,
we do away with the distinction between heads and phrases. Then the CUH
becomes vacuous, so it can be dismissed, too. Thus, we are left with the exten-
sion condition. Then we would be closer to the system discussed in Koeneman
(2000); Fanselow (2002). However, whereas, e.g. Fanselow (2002) uses the CUH
to allow head movement only if the head projects, we do not have this option,
since we do not have the CUH (or any distinction to state it on). Therefore,
we would be led to allow heads to move to specifiers.
Holmberg (2000) argues on the basis of empirical facts that head movement
to specifiers should be allowed, just in case it is movement of the phonological
features of the head alone. In other words, if it is driven by purely phonological
considerations. Bobaljik and Brown (1997) point out that, even with the elab-
orate machinery employed in the scenario above to rule out head movement,
one could think of ways to allow it. Their idea is that, in a theory of move as
“copy+merge”, one could merge two heads before they are merged to the main
tree. In other words, if the derivation has reached a stage such as (182a), where
x is a head, and the next thing to be merged y is also a head, one can copy
x and merge it to y, yielding the intermediate stage (182b) with two distinct
syntactic objects. Then these are merged in (182c).
(182) a. [xp . . . x . . .]
b. [y x y ] [xp . . . x . . .]
c. [yp [y x y ] [xp . . . x . . .]]
One might object that this would allow “sideways” movement, i.e. movement
to a non-c-commanding position, but Bobaljik and Brown (1997) argue that
there are obvious ways around this.
I will not dwell on the issue here. I will assume that movement of V to Σ
is head-movement in the traditional sense that the verb adjoins to Σ. I take it
to be driven by the phonetic emptiness of Σ.
3.7.1 The analysis
In a sense, Müller’s massive vP evacuation is forced by the fact that he assumes
that v and the left-edge XP may be separated by other material. This leads to
the situation that lower specifiers of v and VP do not make up a constituent.
Hence, when they must leave vP, they must do so separately. But the mere fact
that they must land in the same order as they started may lead one to suspect
that they move as one constituent, rather than massive parallel movement. The
elegance of that would be that it would derive the order preservation property,
rather than stipulating it as an extra constraint on the output. Suppose, there-
fore, that our head Σ corresponds to the standard v. As before, it attracts the
highest verb. It must have a specifier, i.e. it has an “EPP” feature. If it also has
a marked Σ feature, it furthermore requires its specifier to be a (contrastive)
topic. Suppose, furthermore, that Σ merges above or below the subject, and
that an adverb merged immediately after Σ will satisfy its EPP. This would be
the same for all the V2 languages. They differ with respect to what happens
next.
Main clauses
For main clauses, I assume very much the same analysis as before. Any VP-
node which contains focused material must move out of the scope of Σ, to
the left of its specifier. Next, Σ pied pipes its “specifier” to the left of that
again. For ease of reference, I will refer to the smallest node containing both Σ
and its specifier (and potentially more material following Σ) as ΣP. Thus, we
end up with the following kind of structure.
[[ΣP XP [ΣP Vf+Σ . . . t_VP]] [[VP . . . focus . . .] t_ΣP]]
Figure 3.6: V2 structure
Note now that whatever material
follows Σ inside ΣP will end up preceding the extracted VP, as it did before
movement. Furthermore, no reordering of material will take place within VP.
Hence, Müller’s order preservation constraint follows.
Before I demonstrate this, some words on what can occupy spec-Σ when
this head has an unmarked feature are in order. Subjects definitely can, and
some, though not all, adverbial-type elements. Thus, the examples in (183) do
not require any marked intonation pattern, while those in (184) do.
(183) a. Han så ikke Jens.
he saw not J
b. Kanskje han ikke så Jens.
maybe he not saw J
c. Derfor så ’n ikke Jens.
Therefore saw he not J
(184) a. Alltid så han Jens.
always saw he J
b. Jens så han ikke.
J saw he not
c. Så Jens gjorde han ikke.
saw J did he not
d. Muligens så han Jens.
possibly saw he J
(183a) is the clearest case of an unmarked XP. In (183b) we see the well-
known fact that the adverb kanskje ‘maybe’ can serve as both the first con-
stituent and the finite verb simultaneously, as it were (Platzack, 1986).15 I
take this to be a special case, attributable to the fact that the adverb is “ver-
bal” and “phrasal” or “adverbial” at the same time, so it can check both the
EPP feature and the V feature of Σ. Finally, discourse relation markers, like
derfor ‘therefore’, likevel ‘nevertheless’ etc. can (but need not) occupy the first
position. These may clearly relate to a marked, rather than an unmarked Σ, if
marked Σ is to be related to topichood. I will leave these aside. Hence it really
seems that the subject is the core unmarked XP. Everything else requires a
marked intonation pattern. Consider now potentially problematic cases where
an unstressed object pronoun occupies XP. In particular, consider (185b) and
(185c) as replies to the question in (185a).
(185) a. Så du Jens?
Saw you J
b. Nei, han så jeg ikke.
no him saw I not
c. Nei, jeg så ’n ikke.
no I saw him not
(185b), construed as an answer to (185a), suggests that you did see somebody
else, although you didn’t see Jens. (185c) does not give rise to such a sug-
gestion. In other words, even the unstressed object pronouns seem to receive
a contrastive interpretation when they occur in first position. Thus, I take
this to suggest that the only way for a non-subject (discourse markers aside)
to occupy spec-Σ is by attraction of a marked value. Now, note also that the
discourse value of the first (marked) constituent is not that of “new information
focus”. This can be seen by the fact that an object in first position is very
clumsy as an answer to an object wh-question. Suppose that, upon meeting
John outside the cinema, I ask him (186a). (186b) does not seem to be an
adequate answer to the question, or at the very least it requires that we have
been talking about the movie “Mulholland Drive” before. In other words, it
must be in our common ground. (186c) would be the most straightforward
answer.
(186) a. Hvilken film har du sett?
which movie have you seen?
b. # Mulholland Drive har jeg sett.
M D have I seen
c. Jeg har sett Mulholland Drive.
I have seen M D
15 In Swedish, the corresponding adverb works slightly differently: It behaves as a finite
verb, i.e. may (but need not) occupy the second position in traditional terms. I refer the
reader to Platzack’s work for discussion.
It seems to me that non-canonical topicalization often (if not always) leads to
switching of topic to another already accessible one. Thus, if the discourse has
been about topics x, y, z, and z is the current one, topicalization of a constituent
denoting y makes y the current topic again. Non-canonical topicalization of an
x which is the current topic, as in (185b), seems to suggest that x is no longer
the topic. In other words, marked Σ denotes “switch topic”.
The analysis for simple main clauses generalizes straightforwardly to Ger-
man and Dutch. I return to the slightly more complicated question of pe-
riphrastic constructions and verb finality shortly.
Order preservation and scrambling
Let us now see how our setup derives order preservation. I give the derivation
for (187) in derivation 3.22. The crucial fact is that the two objects must occur
in the same order as they were merged.
(187) Jeg ga alltid Jens en kylling.
I gave always J a chicken
Derivation 3.22
[I gave Jens a chicken]
merge Σ and move gave and I
[I_i [gave_j+Σ [t_i t_j Jens a chicken]]]
move VP
[[t_i t_j Jens a chicken]_k [I_i [gave_j+Σ t_k]]]
merge always and move ΣP
[[I_i [gave_j+Σ t_k]]_l [always [[t_i t_j Jens a chicken]_k t_l]]]
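The steps of derivation 3.22 can be replayed mechanically. The sketch below is a toy model under my own simplifying assumptions, not part of the analysis proper: trees are binary or flat lists of nodes, a trace is an empty node, Σ is silent and therefore omitted, and every movement re-merges the moved item at the root. All function names are hypothetical. The second function moves the two objects one at a time, indirect object first, purely for contrast.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(eq=False)
class Node:
    """A leaf (word set, no children), a phrase (children set), or a trace (neither)."""
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def merge(left: Node, right: Node) -> Node:
    """Binary merge; the result is the new root."""
    return Node(children=[left, right])


def extract(root: Node, target: Node) -> bool:
    """Replace `target` inside `root` by a silent trace; report success."""
    for i, child in enumerate(root.children):
        if child is target:
            root.children[i] = Node()  # trace: no word, no children
            return True
        if extract(child, target):
            return True
    return False


def move(root: Node, target: Node) -> Node:
    """Move `target` out of `root` and re-merge it at the root."""
    if not extract(root, target):
        raise ValueError("target is not contained in root")
    return merge(target, root)


def spell_out(node: Node) -> List[str]:
    """Left-to-right terminal string; traces contribute nothing."""
    if node.word is not None:
        return [node.word]
    return [w for child in node.children for w in spell_out(child)]


def build_sigma_p() -> Tuple[Node, Node, Node, Node]:
    """First step of derivation 3.22: merge Σ, move 'gave' and 'I'."""
    subj, verb = Node("I"), Node("gave")
    io, do = Node("Jens"), Node("a chicken")
    vp = Node(children=[subj, verb, io, do])
    extract(vp, verb)
    extract(vp, subj)
    sigma_p = merge(subj, merge(verb, vp))  # [I [gave+Σ [t t Jens a chicken]]]
    return sigma_p, vp, io, do


def derive_as_constituent() -> str:
    """Remaining steps of derivation 3.22: the objects leave ΣP inside one VP."""
    sigma_p, vp, _, _ = build_sigma_p()
    tree = move(sigma_p, vp)            # move VP
    tree = merge(Node("always"), tree)  # merge always
    tree = move(tree, sigma_p)          # move ΣP
    return " ".join(spell_out(tree))


def derive_separately() -> str:
    """Contrast case: the objects leave ΣP one by one, indirect object first."""
    sigma_p, _, io, do = build_sigma_p()
    tree = move(sigma_p, io)
    tree = move(tree, do)               # the direct object crosses the indirect object
    tree = merge(Node("always"), tree)
    tree = move(tree, sigma_p)
    return " ".join(spell_out(tree))


if __name__ == "__main__":
    print(derive_as_constituent())
    print(derive_separately())
```

In this toy model the first run yields the gloss order of (187), with the two objects in their merge order, while the second run reverses them. Nothing in the code enforces which units may move; that choice is made by hand in the two functions.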
Suppose that we move the two objects separately instead of as a constituent.
Clearly, this could reverse their order, if we can move the indirect object first,
and then move the direct object across it. Of course, this is perfectly possible
in languages like German, which have argument–argument scrambling. However,
Mainland Scandinavian, Icelandic and Dutch do not allow argument reordering.
I suggest that the difference lies in whether the language in question allows the
arguments themselves, or only VPs to “scramble” out of ΣP for focus reasons. If
only VPs are allowed, the Extension Condition will actually rule out reordering.
Consider the following partial derivation 3.23 for (188).16 I assume a VP-shell
16 As argued extensively in Nilsen (1997). Norwegian and Swedish do allow for “object
shift” of full DPs, indirect, as well as direct objects, even simultaneously. However, the
112 V2 and Holmberg’s Generalization
type analysis for double object constructions.
(188) Jeg ga Jens alltid en kylling.
I gave J always a chicken
Derivation 3.23
[I gave[ Jens [ a chicken]]]
merge Σ and move gave and I I
[ Ii [gavej +Σ [ti tj [ Jens [ a chicken]]]]]
move a chicken I
[[a chicken]k [ Ii [gavej +Σ [ti tj [ Jens tk ]]]]]
merge always and move Jens I
[ [Jens tk ]l [always [[a chicken]k [Ii [gavej +Σ [ ti tj tl ]]]]]]
move ΣP I
[[Ii [gavej +Σ tl ]]m [[Jens tk ]l [always [[a chicken]k tm ]]]]
Given that we can only move VPs, we could not move a chicken around Jens
without violating the extension condition. In derivation 3.23, we do have move-
ment of a chicken around the indirect object. However, this will always be
repaired automatically. Either the indirect object itself moves out of ΣP, as
in derivation 3.23, or if it doesn’t, it will be carried along with ΣP fronting,
around the direct object again. This is illustrated in the alternative deriva-
tion 3.24 for (188), starting from the step where a chicken has already left
ΣP. The same obviously applies when the subject is not in the first position.
Derivation 3.24
[[a chicken]k [ Ii [gavej +Σ [ti tj [ Jens tk ]]]]]
merge always I
[always [[a chicken]k [ Ii [gavej +Σ [ti tj [ Jens tk ]]]]]]
move ΣP I
[[Ii [gavej +Σ [ti tj [Jens tk ]]]]l [always [[a chicken]k tl ]]]
Hence, it seems to me that this actually reduces the order preserving nature
of scrambling/argument shift in the relevant languages to one rather natural
difference: the “free word order” languages allow movements of smaller units,
analysis presented there does not derive the difference between pronoun shift and object
shift of full DPs.
3.7 Third approximation: V2 without positions 113
hence more orderings ensue. Whether DP scrambling, rather than merely VP
scrambling is allowed, could be related to the presence of morphological case,
but addressing this question is beyond the scope of the present work. It also
seems to me that the present analysis would extend to other “second position”
phenomena, either by analyzing the second position elements by attraction to
Σ, or by treating them on a par with weak pronouns, i.e. elements that cannot
trigger extraction from ΣP on their own, and hence, will tend to be tagged
along when ΣP moves to the first position. This explains Norwegian patterns
like the following, discussed in chapter 1 as “Bobaljik’s Paradox” (Bobaljik,
1999; Nilsen, 1997), where three arguments appear to be scrambling on their
own, among several adverbs, where the ordering of the adverbs and the ordering
of the arguments must remain unaltered.
(189) a. Derfor ga Jens Kari kyllingen tydeligvis ikke
therefore gave J K the-chicken evidently not
lenger kald.
any.longer cold
b. Derfor ga Jens Kari tydeligvis kyllingen ikke lenger kald.
c. Derfor ga Jens tydeligvis Kari kyllingen ikke lenger kald.
d. Derfor ga Jens tydeligvis Kari ikke kyllingen lenger kald.
e. Derfor ga Jens tydeligvis Kari ikke lenger kyllingen kald.
f. Derfor ga Jens tydeligvis ikke lenger Kari kyllingen kald.
g. Derfor ga tydeligvis Jens ikke lenger Kari kyllingen kald.
h. Derfor ga tydeligvis ikke Jens lenger Kari kyllingen kald.
i. Derfor ga tydeligvis ikke lenger Jens Kari kyllingen kald.
j. * Derfor ga Jens ikke tydeligvis Kari lenger kyllingen kald.
k. * Derfor ga Jens tydeligvis ikke kyllingen lenger Kari kald.
Such patterns are obviously problematic if one wants to assume fixed landing
sites for scrambled arguments, and at the same time assume fixed positions
for the adverbs. In the present account, then, we don’t have to assume fixed
positions for either. The ordering of the adverbs follows from the scope require-
ments of the different adverbs,17 while the ordering of the arguments follows
from the order of merger in the VP (linking) in addition to the restriction of
scrambling to VP nodes.18
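The order-preservation pattern in (189) can be made concrete with a small enumeration. The following Python sketch is only an illustration and not part of the analysis: it lists every way of interleaving the argument sequence with the adverb sequence while keeping the internal order of each sequence fixed. The names `interleavings` and `middle_field` are hypothetical, and the sketch makes no claim that every enumerated order is acceptable; it only shows that an attested order like (189a) falls inside this set, while (189j) and (189k) fall outside it.

```python
from itertools import combinations

def interleavings(xs, ys):
    """Return all merges of xs and ys that preserve both internal orders."""
    n = len(xs) + len(ys)
    results = []
    # Choose which of the n linear slots are filled by members of xs.
    for slots in combinations(range(n), len(xs)):
        slot_set = set(slots)
        xi, yi = iter(xs), iter(ys)
        results.append(tuple(next(xi) if i in slot_set else next(yi)
                             for i in range(n)))
    return results

arguments = ("Jens", "Kari", "kyllingen")
adverbs = ("tydeligvis", "ikke", "lenger")
orders = set(interleavings(arguments, adverbs))

def middle_field(sentence):
    """Strip the constant frame 'Derfor ga ... kald' from an example."""
    return tuple(sentence.split()[2:-1])

print(len(orders))  # 20, i.e. 6 choose 3
print(middle_field("Derfor ga Jens Kari kyllingen tydeligvis ikke lenger kald") in orders)
print(middle_field("Derfor ga Jens ikke tydeligvis Kari lenger kyllingen kald") in orders)
```

The design choice here is deliberately theory-neutral: the code states the descriptive generalization only, and says nothing about whether it is derived by VP-node scrambling or by some other mechanism.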
Again, Dutch behaves essentially the same. For German, which allows
argument reordering, we would simply loosen up the requirement that only VP
17 According to the analysis developed in chapter 2, lenger ‘no.longer’ must follow the
negation because it is a negative polarity item, while tydeligvis ‘evidently’ must precede it
because it is a positive polarity item.
18 It should be pointed out here that my solution to Bobaljik’s paradox is independent
of the question whether adverbs occur in fixed positions. Thus, one could claim that the
adverbs occur in fixed positions and that remnant VP-nodes can scramble in the fashion
outlined among the FPs hosting the adverbs.
nodes can scramble. As we noted above, if arguments can scramble on their
own, they may not end up in the same order as they started. I give a derivation
for the German sentence below, where the direct object has scrambled around
an adverb and the subject.
(190) Gestern küßte den Fritz wahrscheinlich die Maria.
Yesterday kissed the-acc F probably the-nom M
Derivation 3.25
[ΣP gestern [küßtei +Σ [[die Maria] ti [den Fritz]]]]
move die Maria I
[[die Maria]j [ΣP gestern [küßtei +Σ [tj ti [den Fritz]]]]]
merge wahrscheinlich and move den Fritz I
[[den Fritz]k [wahrscheinlich [[die Maria]j [ΣP gestern [küßtei +Σ [tj ti tk ]]]]]]
move ΣP I
[[ΣP gestern [küßtei +Σ [tj ti tk ]]]l [[den Fritz]k [wahrscheinlich [[die Maria]j
tl ]]]]
It can be seen that it is the possibility of extracting the argument [die Maria]
without pied piping its dominating VP node that results in the potential for
reordering. A remaining problem is Danish, where all subjects and weak pro-
nouns must precede all adverbs, and all other arguments line up following the
adverbs. In other words, Danish does not have argument shift or scrambling.
This would follow if Danish must always move the VP sister of the subject out
of ΣP immediately after XP movement to spec-Σ. I leave open the interesting
question why this should be so.
Periphrastic constructions In chapter 2 (page 72), I noted that sequences
of adverbs and auxiliaries in Norwegian enter into crossing scope dependencies.
Hence, I concluded that there must be a rather elaborate set of movements in
order to generate this. The sentence we considered was the following, where the
four adverbs and the four auxiliaries are linearly separated, but semantically
interspersed, as it were. In other words, the linear order in (191b) reflects the
semantic scope of the adverbs and auxiliaries in (191a). However, while (191a)
is perfectly grammatical, (191b) is sharply ungrammatical.
(191) Norwegian
a. . . .at det ikke lenger alltid helt kunne ha blitt
. . .that it not any.longer always completely could have been
ordnet.
fixed
b. * . . .at det ikke kunne lenger ha alltid blitt helt
. . .that it not could any.longer have always been completely
ordnet.
fixed
The same observation can be made on the basis of Dutch data. Thus, (192) is
a similar example in this language.
(192) . . .dat het niet meer helemaal kon worden gemaakt
. . .that it not any.longer completely could be fixed
In chapter 2, I suggested that this can be derived by letting adverbs attract
projections of verbs, and verbs attract projections of adverbs. I repeat the
derivation I gave there for the Norwegian sentence.
What could be driving all these movements? Of course, we could stipulate
that all these expressions are lexicalized with uninterpretable V-features and
Adv-features, but that does not lead to any further understanding of why
the movements should apply. Another possibility is that they are driven by an
adjacency requirement on the auxiliaries. This also seems rather unsatisfactory;
we should probably rather derive the adjacency requirement from something
more fundamental. The auxiliaries enter into selectional relations with each
other, and it might be that the reason for all the movements is that each
auxiliary must be adjacent to the auxiliary (verb) it selects. For example,
auxiliary ha ‘have’ selects for a participial complement, while kunne ‘could’
selects for an infinitival complement. If an intervening adverb would block the
selection relation, this could suffice to drive the movements. This would leave
some room for variation with respect to how exactly a language chooses to
satisfy the requirement. Thus, for lack of any better account, I assume that
this is how the movements are triggered. Note that, if the adverb projections
had not moved in derivation 3.26, the verbs would not end up being adjacent.
Suppose that, in a configuration like figure 3.7, the four nodes xp1 , xp2 , yp, x
are ‘equidistant’ to α. If we raise yp to spec-α, and subsequent attractors will
iterate this option, we will derive climbing of yp to the first position. If we,
instead, move the entire complement of α, xp1 , and later iterate this option,
we derive roll-up structures. Both options are needed to derive the ordering
patterns of verbal clusters. If we extract the head of xp, and later on iterate
this, we end up with the order of merger. This is illustrated in derivation 3.27.
This derives the English pattern, where adverbs and verbs are interspersed.
The verbs are adjacent at the point of merger. The fourth option is to move
xp2 . This option we have already illustrated in derivation 3.26 for Norwegian
facts. Given what we have seen so far, these derivational options seem to be
parameters that are fixed for entire categories.19
19 Some questions arise with respect to interaction of two kinds of x-raising at the same
time. For example, one could investigate whether some otherwise possible orderings of Dutch
verbal clusters are ruled out in the presence of a preceding crossing-scope adverb cluster.
Derivation 3.26
[completely [fixed]]
move VP I
[fixed [completely]]
merge been I
[been [fixed [completely]]]
move AdvP I
[completely [been [fixed]]]
merge always I
[always [completely [been [fixed]]]]
move VP I
[[been [fixed]] [always [completely]]]
merge have I
[have [[been [fixed]] [always [completely]]]]
move AdvP I
[[always [completely]] [have [been [fixed]]]]
merge any.longer I
[any.longer [[always [completely]] [have [been [fixed]]]]]
move VP I
[[have [been [fixed]]] [any.longer [always [completely]]]]
merge could I
[could [[have [been [fixed]]] [any.longer [always [completely]]]]]
move AdvP I
[[any.longer [always [completely]]] [could [have [been [fixed]]]]]
merge not I
[not [[any.longer [always [completely]]] [could [have [been [fixed]]]]]]
An approach along these lines could be extended to roll-up structures like
the ones found in German. I will not do that here, but see Koopman and
Szabolcsi (2000) for an analysis of verbal clusters along similar lines for several
languages.
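The alternation of merge and movement steps in derivation 3.26 can be traced mechanically. The following Python sketch is a simplification under stated assumptions: it tracks only two word lists, an adverb cluster and a verb cluster, plus which of them is currently leftmost, and it ignores all internal constituent structure and traces. The function names `derive`, `linearize` and `scope_order` are hypothetical. Merging a verb is followed by fronting of the adverb cluster, merging an adverb is followed by fronting of the verb cluster, and the last element (here, not) is merged without further movement.

```python
def linearize(adverbs, verbs, front):
    """Surface order of the two clusters; `front` names the leftmost one."""
    return adverbs + verbs if front == "adv" else verbs + adverbs

def derive(base_adverb, base_verb, merged, last):
    """Replay the merge/move alternation; return the surface order per step."""
    adverbs, verbs = [base_adverb], [base_verb]
    front = "verb"  # first step: the VP moves across the lowest adverb
    steps = [linearize(adverbs, verbs, front)]
    for kind, word in merged:
        if kind == "verb":
            verbs = [word] + verbs      # merge the verb, then move AdvP
            front = "adv"
        else:
            adverbs = [word] + adverbs  # merge the adverb, then move VP
            front = "verb"
        steps.append(linearize(adverbs, verbs, front))
    steps.append([last] + steps[-1])    # final merge, no movement
    return steps

def scope_order(base_adverb, base_verb, merged, last):
    """Order of merger read from the top down, i.e. the scope order."""
    return [last] + [w for _, w in reversed(merged)] + [base_adverb, base_verb]

MERGED = [("verb", "been"), ("adverb", "always"), ("verb", "have"),
          ("adverb", "any.longer"), ("verb", "could")]

for step in derive("completely", "fixed", MERGED, "not"):
    print(" ".join(step))
print(" ".join(scope_order("completely", "fixed", MERGED, "not")))
```

The last surface line printed corresponds to the gloss of (191a), while the scope order corresponds to the gloss of the ungrammatical (191b).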
[αP αEPP [xp1 yp [xp2 x . . . ]]]
Figure 3.7: x-raising configuration
Derivation 3.27
[vp2 completely [vp1 fixed]]
move fixed I
[vp2 fixedi [vp2 completely [vp1 ti ]]]
merge been and move completely I
[vp3 completelyj [vp3 been [vp2 fixedi [vp2 tj [vp1 ti ]]]]]
merge always and move been I
[vp4 beenk [vp4 always [vp3 completelyj [vp3 tk [vp2 fixedi [vp2 tj [vp1 ti
]]]]]]]]
merge have and move always I
[vp5 alwaysl [vp5 have [vp4 beenk [vp4 tl [vp3 completelyj [vp3 tk [vp2 fixedi
[vp2 tj [vp1 ti ]]]]]]]]]]
merge any.longer and move have I
[vp6 havem [vp6 any.longer [vp5 alwaysl [vp5 tm [vp4 beenk [vp4 tl [vp3
completelyj [vp3 tk [vp2 fixedi [vp2 tj [vp1 ti ]]]]]]]]]]]]
merge could and move any.longer I
[vp7 any.longern [vp7 could [vp6 havem [vp6 tn [vp5 alwaysl [vp5 tm [vp4
beenk [vp4 tl [vp3 completelyj [vp3 tk [vp2 fixedi [vp2 tj [vp1 ti ]]]]]]]]]]]]]]
merge not and move could I
[vp8 couldo [vp8 not [vp7 any.longern [vp7 to [vp6 havem [vp6 tn [vp5 alwaysl
[vp5 tm [vp4 beenk [vp4 tl [vp3 completelyj [vp3 tk [vp2 fixedi [vp2 tj [vp1 ti
]]]]]]]]]]]]]]]]
Dutch and Mainland Scandinavian would follow derivation 3.26. In Dutch,
it seems that non-verbal material from the most deeply embedded VP ends up
in the adverb cluster, rather than in the verb cluster. Hence, we get “climbing”
of verbal particles etc. leading to the characteristic Dutch pattern in (193)
(Koopman and Szabolcsi, 2000).
(193) . . .dat Jan Marie op zal willen bellen
. . .that J M up shall want call
‘. . . that Jan will want to call Mary up’
Once we have formed the adverb cluster and the verb cluster, and merged
the finite verb, we merge Σ and execute the analysis as before.
Embedded clauses
Finite embedded clauses are normally not V2. In particular, only so-called
“bridge” predicates allow for embedded V2. I return to embedded V2 shortly.
Embedded (non-V2) clauses must have a different Σ than root clauses. I assume
that the complementizer is such a Σ. It has no EPP-feature, so no topicaliza-
tion is possible. Furthermore, it does not attract the verb. Other than that,
everything goes as before. However, since the verb is not a weak element like
the pronouns, it will always trigger VP extraction from ΣP. I give the derivation
for (194) in derivation 3.28.
(194) . . . at Jens ofte spiser tran
. . . that J often eats cod.liver.oil
Derivation 3.28
[that [VP1 Jens [VP2 eats cod liver oil]]]
merge often and move VP2 I
[ [VP2 eats cod liver oil]i [often [that [VP1 Jens ti ]]]]
move ΣP I
[[that [Jens ti ]]j [often [[eats cod liver oil]i tj ]]]
The question now arises whether the direct object could have extracted alone,
so that the verb could become separated from it by the adverb, much as in the
cases discussed for arguments in the previous subsection. The short answer is
that it can. Bentzen (2002) notes that sentences like the following are, in fact,
grammatical in Norwegian.
(195) . . .at han spiser ofte tran.
. . .that he eats often cod.liver.oil
However, this option is quite restricted. The adverb ofte is one of the few which
can occupy this position, and substituting e.g. alltid ‘always’ or other adverbs
for ofte leads to degradedness. It remains to be understood what governs the
availability of orders like (195).
For OV languages I will assume the following, inspired by the analysis
proposed by Hallman (2001). The verb moves to Σ in these cases as well,
but there is no EPP-feature, hence no topicalization. In this case, what is
attracted to C is the VP dominating the complementizer. Thus, essentially,
the verb will be left behind in the final position, as wanted. Before attraction
to C, scrambling works as before, i.e. with VPs in Dutch, hence leading to
order preservation, and with the arguments themselves in German, leading
to potential reordering of the arguments. I give the derivation for the Dutch
embedded sentence (196) in derivation 3.29.
(196) . . .dat Jan Marie kuste
. . .that J M kissed
Derivation 3.29
[ΣP that [VP Jan kissed Mary]]
move V I
[kissedi [ΣP that [VP Jan ti Mary]]]
merge C and move ΣP I
[CP [ΣP that [VP Jan ti Mary]]j [kissedi tj ]]
The bracket to the immediate left of kissed in derivation 3.29 is not la-
belled. Given that the verb is attracted there by Σ, it should be a ΣP. Hence,
one might wonder why we do not get Vf first, and complementizer second. We
need the Σ to pied-pipe its specifier just in case it is not realized by dat, and
similarly for German. This problem is similar to the question raised by the
traditional analysis of V2, namely why CPs do not allow multiple adjunction
(in V2 languages). I speculate that the answer should ultimately be given in
phonological terms. In other words, dat wants to be leftmost in an intonation
phrase as do topic switchers. Such prosody-semantics/pragmatics correspondences
have been explored for focus and stress by Reinhart (1995) and Szendrői (2001).
Working out such an account in detail for topic switch and intonation
phrasing is beyond the scope of the present dissertation.
Note that we can use the same account for pronoun-shift and scrambling
in Dutch as the one we explored for Norwegian above. In particular, weak
pronouns are expected to immediately follow the complementizer, unless they
have been extracted along with other material from ΣP prior to ΣP fronting. In
either case they will end up preceding the verb. Our account also explains why
one part of HG holds for Dutch, namely that objects cannot scramble around
subjects. This follows if Dutch, like Norwegian, does not allow scrambling of
arguments on their own, but only of VP nodes. Finally, we have an explanation
why the other part of HG does not appear to hold for Dutch: Scrambling is not
contingent on verb movement in this language. This now follows from the fact
that the verb (in embedded clauses) is attracted and then stranded by Σdat . In
Norwegian, Σat does not attract the verb, so, since only VP-nodes can move,
objects must end up following the verb.20
3.7.2 ΣP fronting
Why does ΣP have to move to the beginning of the clause? Are there any
languages where it does not? Suppose that there are. Such a language would
have sentences with the word-order of penultimate steps of the derivations
above for Norwegian. In other words, it would have a designated XP position
left adjacent to the finite verb, it would generally move arguments to the left
of this XP position, and, finally, destressed material, like weak pronouns, would
occur to the right of the verb. If wh-movement is thought of as movement to
spec-Σ, this language would also have wh-phrases to the immediate left of
the finite verb, though clause internally. In fact, Jayaseelan (2001) shows that
the Dravidian language Malayalam has precisely such properties and proposes
an analysis which is congenial to the one I am proposing in several respects.
In other words, Malayalam is generally OV, and wh-phrases must immediately
precede Vf.21 Compare (197) to (198) (Jayaseelan, 2001, p40).
(197) a. ninn-e aarε aTiccu?
you-acc who beat-past
b. iwiTe aarε uNTε?
here who is
c. awan ewiTe pooyi?
he where went
d. nii aa pustakam aar-kkε kiDuttu?
you that book who-dat gave
(198) a. * aarε ninn-e aTiccu?
who you-acc beat-past
b. * aarε iwiTe uNTε?
who here is
c. * ewiTe awan pooyi?
where he went
d. * nii aar-kkε aa pustakam kiDuttu?
you who-dat that book gave
More or less focused material occurs to the left of the verb, as seen in (199),
and, finally, destressed material can occur to the right of Vf (200b). The latter
option is unavailable for indefinite noun phrases (200c).
20 An alternative formulation would be to say that also the Norwegian complementizer
attracts the verb, but that the verb pied-pipes the VP in this language. The choice between
the two alternatives hinges on the proper explanation of examples like (195).
21 Jayaseelan (2001) points out that Vf left-adjacent wh-phrases of this sort are also found
in Hungarian, Basque, and several African languages, like Aghem, Chadic and Kirundi. For
this, I refer the reader to Jayaseelan’s work and references cited there.
(199) ñaan innale Mary-k’k’ε oru kattε ayaccu
I yesterday Mary-dat a letter sent
(200) a. aarum kaND-illa, aana-ye
nobody saw-neg elephant
b. aarε ayaccu, ninn-e?
who sent you-acc
c. ?* ñaan awan-ε ayaccu, oru kattε
I he-dat sent a letter
Jayaseelan (2001) analyzes the postverbal destressed material as occupying
a high TopP, with IP moving around it, but does not give arguments for this.
Malayalam is a pro-drop language, so one cannot test for weak pronouns. The
postverbal topic position could in principle also be analyzed in the same way as I
have analyzed Norwegian weak pronouns. Finally, the XP position is reserved
for focused material in Malayalam. It is typically occupied by an indefinite
noun phrase, and if there are both definites and indefinites in the clause, the
indefinites must occupy XP. I take it that this gives an important clue to why
ΣP does not move to the left in Malayalam: Germanic Σ is associated with
topichood or switch topic, hence it must move out of the “focus area of the
clause.” If we assume with Cinque (1993) that the main stress of the clause
falls on the most deeply embedded constituent on the recursive side of the tree,
we can make sense of this. I follow Reinhart (1995) in assuming that the focus
of a sentence must contain the main stress of the clause. This explains why
focused material must extract from ΣP prior to ΣP fronting. The reason why
weak pronouns are allowed to stay in ΣP must then be related to the fact
that weak pronouns are not good switch topics. Weak pronouns function like
discourse variables; they can never cause topic switch, and, as is well known,
can never be focused or contrasted (Kayne, 1975). Other material does not
have this deficiency: it can be contrasted and focused, and it can also
cause topic switch. The fact that all potential topic switchers must evacuate
ΣP suggests the following generalization:
Generalization 4 (Unambiguous Topic Switch) ΣP may contain at most
one potential topic switcher.
Whether this is the right formulation remains to be seen. If it is, it would have
to be derivable from more fundamental considerations about topics and topic
switch. I will leave that for future investigation.
3.8 Summary
In this chapter, I have argued that there are strong reasons to reject the standard
analysis of V2 in terms of head movement to C with subsequent topicalization.
I argued that Mainland Scandinavian V2-violations with focus particles,
when seen in conjunction with the behavior of weak pronouns, in particular
Holmberg’s Generalization, strongly suggest that the verb ends up in the
left periphery of the clause by means of an XP-movement, rather than a head
movement operation.
On the proposed account, the part of HG concerning weak pronouns is handled
by assuming that weak pronouns do not move on their own. They do
not trigger VP-movement out of ΣP, hence they must stay inside the fronted
ΣP when they can, i.e. when nothing else triggers movement out of ΣP of a
VP-node containing them. HG concerning argument shift of full noun phrases
follows if verbs that do not move to Σ trigger extraction of their dominating
VP-node out of ΣP prior to ΣP-fronting. Such extraction will necessarily take
arguments following the verb along. Hence, full DPs can “scramble” among
sentential adverbs, just when they have not been taken along in ΣP-extraction
triggered by a verb. When arguments can scramble, they must always end
up in the same order as before scrambling. This follows from the assumption
that scrambling of arguments in Scandinavian and Dutch is really movement of
VP-nodes. Hence, because of the extension condition, one cannot reverse the
relative ordering of scrambled arguments in these languages. The possibility in
German of such order reversal is due to the fact that arguments in this language
can scramble on their own, i.e. DPs rather than VPs scramble in German.
I have argued that no reference needs to be made to specific Topic Phrases
and Focus Phrases, as was done in section 3.5. Instead a “dynamic” interpreta-
tion of notions like topic and focus, along with default stress assignment rules
can drive the movement operations required to derive the observed word-order
patterns. This suggests that Last Resort should not be stated in terms of fea-
ture checking, which only indirectly affects the interfaces. Rather, it should be
stated as interface requirements directly. This has the potential of solving (or
at least reducing) a frequently noted problem with ‘remnant movement’ anal-
yses, namely that it is hard to see how all the movements could be triggered.
If the current proposal is on the right track, this suggests that one should not
look for formal (uninterpretable) features that trigger each movement oper-
ation. Rather, it seems that several operations can be triggered in order to
satisfy one interface requirement, such as generalization 4.
CHAPTER 4
Verb movement, Scope and
Scrambling
4.1 Introduction
In this chapter, I will argue that the phenomena discussed in the literature
under the heading of ‘(Short) Verb Movement’ (SVM) must be seen in part
as a scopal phenomenon and in part as a phenomenon similar to scrambling
as seen with arguments in the Germanic languages. In this way, we can, in
principle, extend the approach we pursued in the previous chapters for adverb
placement to verb placement. The leading idea is that verbal morphology at-
taches freely to VPs much in the same way as adverbs and auxiliaries. The
relative order with which such elements are stacked onto the VP is then gov-
erned by the scopal requirements of the individual expressions. Just what the
scopal requirements of a given verbal form would be is often a difficult question,
often compounded by the fact that the surface position of an element may be
the result of remnant movements of the kind we demonstrated for Norwegian
auxiliary sequences which entered into crossing scope interactions with adverbs.
Because of this, the analysis presented in this chapter at times leaves much to
be said. The idea is not as much to present a fleshed out analysis of SVM,
as to point to a novel interpretation of it, and discuss some of the questions
that would have to be solved under this view. The idea is illustrated by the
tree in figure 4.1. Affixes are different from adverbs in that they are phonolog-
ically incomplete, i.e. they are affixes; similarly for bare verb stems. Therefore,
[VP Adv* [VP -en [VP Adv* [VP . . . beat- . . . ]]]]
Figure 4.1: SVM without positions
affixes will attract a verb stem for purely phonological reasons (cf. the “stray
affix filter”). In this way, Chomsky’s (1995) suggestion that head movement is
a PF-phenomenon is partly correct, albeit not in precisely the way Chomsky
had in mind. The approach also leads to an understanding of why it should
be that SVM is optional within a range of adverbs: Suppose that some range
of adverbs a1 , . . . , an allows a verb form v+affix to occupy any position in the
sequence. In our approach this means that the affix can be merged above or
below each of the adverbs, and then attracts the verb stem for phonological
reasons. In other words, we do not have to postulate “optionally strong” fea-
tures or other mechanisms to accommodate the apparent optionality of SVM,
the point being that the movement operation is obligatory, but the attractor
can be merged in several “positions”.
The approach also has advantages over the approach to SVM proposed by
Svenonius (2001) and Ernst (2001), according to which the verb always occupies
the functional head T (or some other head) and the adverbs can be merged
to spec-TP or spec-VP, subject to s-selectional requirements of the adverbs.
This approach also derives the apparent optionality of SVM, but it leads to
[TP Adv* [TP [T V T] [vP Adv* [vP . . . ]]]]
Figure 4.2: Svenonius/Ernst-type SVM
other problems. As discussed in chapter 1, it seems to force us to view T as
a semantically vacuous head. Consider how adverb ordering is dealt with in
this setup: Some adverbs (e.g. completely) semantically select for constituents
denoting events, and the resulting structure [completely XP] also denotes an
event. Other adverbs select for propositions and return propositions (e.g.
not), while yet others select for facts and return facts (e.g. paradoxically),
etc. The crucial idea is that vPs and TPs can denote any of these ontological
entities, and that events can be turned into propositions, which, in turn can be
turned into facts, etc. but not vice versa. Hence, if paradoxically is merged to
vP, this constituent must first be lifted so as to denote a fact, then the adverb
can apply to this constituent, resulting in a vP denoting a fact. It follows that
completely or not cannot apply after this, because they select for events and
propositions, respectively and facts cannot be turned into either of these. Now
consider what happens when T is merged to our fact-denoting vP. Suppose
that T is not semantically vacuous, which seems natural, i.e. it conveys tense.
For the Svenonius/Ernst setup to work, T must be remarkably open minded
with respect to the semantic denotation of its complement vP. Furthermore,
whatever vP denotes, the result of applying T to this must denote the same.
Thus, if vP denotes a fact, [T vP] must also denote a fact, for, if it were allowed
to denote a proposition or an event, then not and completely would wrongly
be predicted to be able to precede and outscope paradoxically. Finally, if not
and completely are ever to be allowed to merge to TP, this constituent must be
allowed to denote propositions or events in some cases, and a fortiori, vP must
also denote propositions or events in these cases. Hence, it seems to follow that
T cannot have a denotation.
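The s-selection setup just described can be sketched as a small checker. This Python sketch is an illustration under simplifying assumptions, not a reconstruction of Svenonius's or Ernst's actual systems: the three ontological types are ranked event < proposition < fact, a bare vP is assumed to denote an event, lifting to a higher type is free, lowering is impossible, and each adverb returns the type it selects. The names `RANK`, `SELECTS` and `licensed` are hypothetical.

```python
# Ranking of ontological types: lifting goes upward only.
RANK = {"event": 0, "proposition": 1, "fact": 2}

# Type selected (and returned) by each adverb, following the text.
SELECTS = {
    "completely": "event",
    "not": "proposition",
    "paradoxically": "fact",
}

def licensed(adverbs_high_to_low):
    """Check an adverb sequence, given from widest to narrowest scope."""
    current = RANK["event"]  # assumption: the bare vP denotes an event
    # Adverbs apply bottom-up, so start with the lowest one.
    for adverb in reversed(adverbs_high_to_low):
        needed = RANK[SELECTS[adverb]]
        if current > needed:
            return False     # a fact cannot be turned back into a proposition
        current = needed     # lift if necessary; the adverb returns its type
    return True

print(licensed(["paradoxically", "not", "completely"]))
print(licensed(["not", "paradoxically"]))
print(licensed(["completely", "not"]))
```

In these terms, the problem for T is that it would have to leave `current` untouched whatever its value is, since any T that reset the type to a proposition or an event would let not or completely apply above paradoxically.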
Again, in our system, where semantic selection plays no role, the problem
doesn’t arise. T is a modifier just like the adverbs, and they can be merged in
any order, as long as scopal requirements are satisfied, so T is allowed to have
semantic content, as wanted. In other words, we sidestep Svenonius/Ernst’s
problem by denying the existence of TP as syntactically distinct from vP or
VP.
Suppose that, instead of their s-selection approach with type-conversion
systems, Svenonius/Ernst would adopt something along the lines of the analysis
developed in chapter 2 to handle adverb ordering. Then, they could maintain
the label TP as distinct from vP etc. and assume that T has semantic content.
If certain adverbs must precede T when it is specified for, say, past, this can now
be treated as a scopal phenomenon, much as in the present chapter. However,
in such a theory, the question arises what is the motivation, or even what could
be the motivation for changing the label from vP to TP, except for purely
theory internal considerations. In other words, what could be the motivation
for claiming that T (or any other expression) occurs in a fixed position when
everything else is floating around it: there’s nothing to be fixed with respect
to.1 One could try to say that tense is special, because every sentence has
tense. But this is certainly not a universal property of natural language, as
1 Incidentally, abandoning TP does not create problems for implementing the insight that
nominative case and tense are intimately connected. The connection has never been explained
anyway, so we could stipulate the EPP with respect to a “floating” T head just as much as
with a fixed one.
some languages, like Mandarin Chinese, notoriously do not inflect verbs for
tense. It is not even clear that it is a property of any language. It is not clear,
for example, that generic sentences or ‘eternal’ sentences like (201) are “tensed”
in any semantic sense of the word.
(201) a. A quadratic equation always has more than one solution.
b. Time is branching.
c. Giraffes have long necks.
4.2 A Bobaljik Paradox for SVM
In chapter 1, we discussed the ordering paradox pointed out by Bobaljik
(1999) and Svenonius (2001) for adverb ordering versus argument ordering in
languages like Dutch and Norwegian. The problem can be stated as follows.
Given a sequence of arguments A1 , A2 , A3 , and an adverb a, the adverb can
occupy any position in the sequence of arguments as long as the relative ordering
of the arguments remains unaltered. Thus, we have the following, where
“√a” indicates possible adverb positions among the arguments Ai .
√a A1 √a A2 √a A3 √a
Conversely, given a sequence of adverbs a1 , a2 , a3 and one argument A, the
argument can occupy any position among the adverbs, as long as the ordering
of the adverbs remains unaltered. Thus, we have the following, where “√A”
represents possible argument positions among the adverbs.
√A a1 √A a2 √A a3 √A
It follows that there cannot be a single linear sequence of functional heads
(fseq) which accommodates the relative ordering of both kinds of elements.
Bobaljik also argues that such a paradox can be found in the case of verb
placement. I return to Bobaljik’s argument after presenting Cinque’s evidence
for verb movement.
I give Cinque’s hierarchy below, and his examples illustrating the possible
positions for active past participles in Italian, French and Logurdese Sardinian.
(202) [moodspeech-act frankly [moodevaluative fortunately [moodevidential allegedly
[modepistemic probably [Tpast once [Tfuture then [modirrealis
perhaps [modnecessity necessarily [modpossibility possibly [asphabitual usually
[asprepetitive again [aspfreq(I) often [modvolitional intentionally [aspcelerative(I)
quickly [Tanterior already [aspterminative no longer [aspcontinuative still
[aspperfect(?) always [aspretrospective just [aspproximative soon [aspdurative
briefly [aspgeneric/progressive characteristically(?) [aspprospective almost
[aspsg.completive(I) completely [asppl.completive tutto [voice well [aspcelerative(II)
fast/early [asprepetitive(II) again [aspfreq(II) often [aspsg.completive(II) completely
]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]
Cinque argues that an active past participle can either precede or follow
each of the adverbs above tutto and below possibly in the hierarchy. He
illustrates this with the following examples.
(203) a. Da allora, non hanno rimesso di solito mica più
since then not they-have put usually not any.longer
sempre completamente tutto bene in ordine.
always completely everything well in order
b. Da allora, non hanno di solito rimesso mica più sempre completamente
tutto bene in ordine.
c. Da allora, non hanno di solito mica rimesso più sempre completamente
tutto bene in ordine.
d. Da allora, non hanno di solito mica più rimesso sempre completamente
tutto bene in ordine.
e. Da allora, non hanno di solito mica più sempre rimesso completamente
tutto bene in ordine.
f. Da allora, non hanno di solito mica più sempre completamente
rimesso tutto bene in ordine.
g. * Da allora, non hanno di solito mica più sempre completamente
tutto rimesso bene in ordine.
h. * Da allora, non hanno di solito mica più sempre completamente
tutto bene rimesso in ordine.
Thus, he shows that the PTC cannot follow the elements tutto ‘everything’
and bene ‘well’. He shows that in Logudorese Sardinian PTC can follow tottu
‘everything’, but not bene ‘well’, and that in French PTC can follow both tout
and bien. Hence, he argues that the difference between these languages can
be captured by stating that PTC must move to Asp_pl.completive in Standard
Italian, whereas in Logudorese Sardinian it only has to move to the Voice head,
and in French it may stay even lower. Movement to the higher heads is then
optional. Cinque also shows that finite verbs can precede or follow high adverbs.
His examples are given below.
(204) a. Mi ero francamente purtroppo evidentemente formato
me was frankly unfortunately evidently formed
una pessima opinione di voi.
a bad opinion of you
b. Francamente mi ero purtroppo evidentemente formato una
pessima opinione di voi.
c. Francamente purtroppo mi ero evidentemente formato una
pessima opinione di voi.
d. Francamente purtroppo evidentemente mi ero formato una
pessima opinione di voi.
(205) a. Evidentemente mi ero probabilmente allora formato una
evidently me was probably then formed a
pessima opinione di voi.
bad opinion of you
b. Evidentemente probabilmente mi ero allora formato una
pessima opinione di voi.
c. Evidentemente probabilmente allora mi ero formato una
pessima opinione di voi.
(206) a. Allora aveva forse saggiamente deciso di non presentarsi
Then he-had maybe wisely decided to not go
b. Allora forse aveva saggiamente deciso di non presentarsi
c. Allora forse saggiamente aveva deciso di non presentarsi
The question now arises whether there is an overlap in the range of finite
verb movement and participle movement. If there is, we have a problem.
Bobaljik (1999) argues that Cinque, in fact, has data which demonstrate such
an overlap. In particular, (203a-b) show that PTC can precede both mica
‘not_emphatic’ and di solito ‘usually’. Cinque also shows (pp. 50-51) that Vf
can follow mica. He argues that this is not due to a higher position of the
adverb, but rather to movement of Vf around mica.2
(207) Gianni (*non) mica (*non) gli telefonerà.
G (*not) not (*not) to-him will-telephone
Below is an example I found on the internet, where the Vf follows di solito
‘usually’ as well.
(208) Aumenti di dosatura di solito sono molto rapidamente effettivi,
increases in dosage usually are very quickly effective
spesso in un giorno.
often in a day
Thus, di solito must precede mica, and Vf must precede PTC. Otherwise, the
two pairs can be combined in all and only the orderings in (209).
(209) a. Vf PTC di solito mica
b. Vf di solito PTC mica
c. Vf di solito mica PTC
d. di solito Vf PTC mica
e. di solito Vf mica PTC
f. di solito mica Vf PTC
2 Mica is an emphatic negation which, when it follows Vf, must cooccur with the sentential
negation non. As (207) shows, it cannot cooccur with sentential negation when it precedes
Vf. Hence it seems that negative concord, perhaps more generally, is restricted to “negative”
sentences where Vf would not be c-commanded by a negative element in the absence of non.
Hence, there is no linear sequence of positions that can accommodate all the
orderings in (209) without also allowing ungrammatical orders, i.e. orderings
where Vf follows PTC, or mica precedes di solito. As briefly discussed by
Bobaljik (1999), Cinque could solve the problem by stipulating that multiple
verb movements must preserve the ordering of the verbs. This is similar to the
order preservation principle adopted by Müller (2002) for the massive evac-
uation of vP he assumes (see chapter 3.6.2 for discussion). But such order
preservation would be a curious stipulation in need of explanation. In other
words, we would need an explanation of why Vf needs to precede PTC anyway.
One might suspect, then, that this explanation, in conjunction with an
approach to adverb ordering along the lines of chapter 2, would suffice to
explain the limitation to the orderings in (209).
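The claim that no linear sequence of positions yields exactly the orderings
in (209) can be checked by brute force under a simple model. The sketch below
is my own simplification, not Cinque's or Bobaljik's formulation: a template
consists of numbered head slots before, between and after the two adverbs,
each verb form is licensed in some set of slots, and each verb picks its slot
independently of the other. That independence is exactly what an order
preservation stipulation would deny. All names in the code are hypothetical.

```python
from itertools import combinations, product

ADVERBS = ("di solito", "mica")

# The six orderings in (209).
ATTESTED = {
    ("Vf", "PTC", "di solito", "mica"),
    ("Vf", "di solito", "PTC", "mica"),
    ("Vf", "di solito", "mica", "PTC"),
    ("di solito", "Vf", "PTC", "mica"),
    ("di solito", "Vf", "mica", "PTC"),
    ("di solito", "mica", "Vf", "PTC"),
}


def build_template(zone_sizes):
    """Lay out numbered head slots before, between and after the two adverbs."""
    template, slot = [], 0
    for zone, size in enumerate(zone_sizes):
        for _ in range(size):
            template.append(slot)
            slot += 1
        if zone < len(ADVERBS):
            template.append(ADVERBS[zone])
    return template


def generated_orders(zone_sizes, vf_slots, ptc_slots):
    """All surface orders obtained when each verb picks one licensed slot."""
    template = build_template(zone_sizes)
    orders = set()
    for vf, ptc in product(vf_slots, ptc_slots):
        if vf == ptc:
            continue  # two verbs cannot share one head position
        labels = {vf: "Vf", ptc: "PTC"}
        orders.add(tuple(labels.get(item, item) for item in template
                         if isinstance(item, str) or item in labels))
    return orders


def subsets(items):
    """Yield every subset of items as a tuple."""
    items = list(items)
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def find_exact_template(max_per_zone=2):
    """Search for a licensing that generates exactly the orders in (209)."""
    for zone_sizes in product(range(max_per_zone + 1), repeat=3):
        options = list(subsets(range(sum(zone_sizes))))
        for vf_slots, ptc_slots in product(options, repeat=2):
            if generated_orders(zone_sizes, vf_slots, ptc_slots) == ATTESTED:
                return zone_sizes, vf_slots, ptc_slots
    return None


# Licensing both verbs everywhere covers (209) but also produces PTC > Vf orders.
everything_licensed = generated_orders((2, 2, 2), range(6), range(6))
overgenerated = everything_licensed - ATTESTED
exact = find_exact_template()
```

The search is limited to two slots per zone, which is enough here: covering
(209a) needs a PTC slot above di solito and covering (209f) needs a Vf slot
below mica, and that pair of slots already yields an order with PTC before Vf.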
Just how high can the PTC raise? Cinque states that PTCs do not raise
across “higher” adverbs, but he does not give examples of ungrammatical
sentences. I searched the internet (Google) for exact strings like “avuto
fortunatamente” ‘have-PTC fortunately’ and “stato probabilmente” ‘been
probably’. Below are some of the examples I found.
(210) a. Due incendi che non hanno avuto fortunatamente
Two fires that not have-3pl had fortunately
conseguenze rilevanti si sono sviluppati
consequences relevant SI are developed
b. le analisi hanno dato fortunatamente esito
the analyses have-3pl given fortunately output
negativo
negative
c. è stato probabilmente stampato a Roma
is-3sg been probably printed in Rome
The native speakers I have consulted report that these sentences are
grammatical, and do not require a ‘comma intonation’ on the adverbs.3 It
seems, then, that the Italian PTC can precede very “high” adverbs. It could
be that our examples in (210) are derived by moving Vf and PTC around the
adverb as a remnant XP. In other words, it could turn out that the correct
constituency for these examples is the one given in figure 4.3. Such an
analysis is actually supported by the fact that the relevant examples are
reported to be ungrammatical if the two verbs are separated by a “high”
adverb. Thus, (211) is reported to be bad.
(211) * È purtroppo stampato probabilmente a Roma.
is-3sg unfortunately printed probably in Rome
3 If they did, the examples could be disregarded. Comma intonation is known to license
all sorts of adverb orders (Jackendoff, 1972).
[XP [ZP ...Vf...PTC... ] [XP X [YP Adv [YP ...t_ZP... ]]]]
Figure 4.3: Remnant movement analysis for (210)
Given this, I tentatively conclude that movement of the participle around high
adverbs is indeed impossible. I return below to a suggestion as to why this
should be so.
4.3 English and French
Pollock (1989) notes that English differs from French in disallowing movement
of finite lexical verbs around adverbs, while allowing it for be (and, for
some varieties, have). He gives examples like the following (his (17-19)).
(212c) is possible in British English, but generally unavailable in American
English.4 In French, such verb movement is even obligatory (213).
(212) a. John is not happy.
b. * John seems not happy.
c. John hasn’t a car.
d. * John owns not a car.
e. John has often kissed Mary.
f. * John kisses often Mary.
g. John often kisses Mary.
(213) a. Jean (n’) aime pas Marie.
J (ne) loves not M
b. Il est rarement satisfait.
He is rarely satisfied
c. * Jean ne pas aime Marie.
J ne not loves M
d. * Jean souvent embrasse Marie.
J often kisses M
e. Jean embrasse souvent Marie.
J kisses often M
4 Note, incidentally, that the ungrammaticality of examples like (212b) shows that the
negation cannot modify the adjective happy directly; similarly for often and sarcastic in
(214b).
He also notes that English exhibits a similar restriction on SVM of infinitives,
as illustrated in (214) (Pollock, 1989, p. 382), while French again allows this
across the board.
(214) a. (?) I believe John to be often sarcastic
b. * To look often sad during one’s honeymoon is rare.
Pollock points out that one obvious difference between lexical verbs and
auxiliary have/be is that, whereas the former assign θ-roles, the latter do
not. Hence, he suggests that verb movement, in English, though not in French,
has the property of blocking θ-assignment. He implements this idea by assuming
that his functional head Agr is opaque for theta-assignment in English (though
not in French). Thus, if the verb moves to Agr, as shown in figure 4.4, whether
or not it moves further to T, it cannot assign a theta role to its arguments
inside the VP.
[TP T [AgrP [Agr V_i Agr[±opaque] ] [VP ...t_i... ]]]
Figure 4.4: Pollock’s explanation for the restriction to have/be
While I agree with Pollock that the restriction to these verbs should be
stated in terms of the relation between a θ-assigner and its arguments, I
obviously cannot adopt his execution of the idea. An alternative to his opaque
Agr analysis would be to blame the intervening adverb for the blocking effect.
In other words, rather than stipulating that English Agr is opaque, we could
stipulate that English theta assigners must surface adjacent to (the trace of)
their arguments (Stowell, 1981).5 As Pollock points out, it is crucial that this
adjacency requirement be stated on all arguments of the verb, so as to block
verb movement ([V adv] orders) for all verbs, including unergatives. Just as
with Pollock’s opacity of certain kinds of Agr, we would hope that this
adjacency requirement can ultimately be reduced to more primitive notions. I
will treat adjacency as a requirement on PF. I will treat English sentences like
(212f) in the following way. We base generate the sequence [often kiss- Mary] in
the standard way. Now, we merge the affix -s, which is phonologically incomplete,
i.e. it needs to be adjacent to a host (the verb) which does not end in a word
boundary. Hence, we could generate the configuration in figure 4.5 if it had
not been for the fact that the verb must be adjacent to the object. Thus, this
is what happens in French. In English, we must find a way to satisfy both the
stray affix filter and the adjacency requirement. It seems that the way to do
this is to split up the lowest VP constituent by moving Mary out of it prior to
movement of the entire complement of the affix around it. The derivation is
given in derivation 4.1.
5 A problem for this formulation, pointed out to me by Richard Kayne (p.c.), is that
English verbs can be separated from their arguments by verbal particles, as in let out the
dog. The particle does seem to project its own phonological word. One way out would be
to adopt a small-clause analysis for particle constructions and claim that the theta relation
holds between the verb and the [Prt DP] constituent.
[VP kiss- [VP -s [VP often [VP t Mary ]]]]
Figure 4.5: verb/affix/adverb placement
Derivation 4.1
[often kiss- Mary]
move Mary ▶
[Mary_i [often kiss- t_i]]
merge -s ▶
[-s [Mary_i [often kiss- t_i]]]
move often kiss ▶
[[often kiss t_i]_j [-s [Mary_i t_j]]]
What is the sense in which the verb and Mary are adjacent in the bottom
line of derivation 4.1? In other words, how does this count as “adjacent”, while
the configuration in figure 4.5 does not? The question can be rephrased as
follows: why is V adjacent to O in the following configuration just in case X is
an affix?
(215) V X O
It seems obvious that this calls for a phonological explanation. An affix by
definition phonologically incorporates into the preceding stem, resulting in a
single phonological word. In other words, adjacency should be stated as follows:
Generalization 5 (Adjacency) No phonological word may intervene between
X and Y.
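Generalization 5 can be checked mechanically on the two configurations at
issue. The following is a minimal sketch under my own encoding assumptions:
terminals are plain strings, an affix is any string beginning with a hyphen,
traces are written "t" and are silent, and an affix always incorporates into
the word on its left. The particle problem mentioned in footnote 5 is not
modelled.

```python
TRACE = "t"  # hypothetical encoding of a silent trace


def phonological_words(terminals):
    """Group terminals into phonological words; an affix joins the word before it."""
    words = []
    for item in terminals:
        if item == TRACE:
            continue  # traces have no phonological content
        if item.startswith("-") and words:
            words[-1].append(item)
        else:
            words.append([item])
    return words


def adjacent(terminals, x, y):
    """True if no phonological word intervenes between x and y."""
    words = phonological_words(terminals)
    ix = next(i for i, word in enumerate(words) if x in word)
    iy = next(i for i, word in enumerate(words) if y in word)
    return abs(ix - iy) <= 1


# Bottom line of derivation 4.1: [[often kiss t] [-s [Mary t]]]
bottom_line_4_1 = ["often", "kiss", TRACE, "-s", "Mary", TRACE]

# Figure 4.5: [kiss- [-s [often [t Mary]]]]
figure_4_5 = ["kiss", "-s", "often", TRACE, "Mary"]

english_ok = adjacent(bottom_line_4_1, "kiss", "Mary")
adverb_intervenes = not adjacent(figure_4_5, "kiss", "Mary")
```

In the first string the affix is absorbed into the verb's phonological word,
so nothing separates the verb from Mary; in the second the adverb is a
phonological word of its own and therefore counts as an intervener.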
If the adverb in derivation 4.1 could be merged above tense, we would have
the simpler derivation in derivation 4.2. I assume that this is done for other
adverbs, but, as we shall see below, there are reasons to assume that frequency
adverbs like often cannot be merged above tense. Later on, I will try to derive
this restriction from the semantics of the adverbs.
Derivation 4.2
[-s [kiss- Mary]]
move kiss and merge often ▶
[often [kiss_i [-s [t_i Mary]]]]
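Derivations 4.1 and 4.2 can be replayed step by step with two operations on
bracketed structures. This is only an illustrative sketch: trees are nested
Python lists, "t" stands for any trace (indices are not tracked), moved
material is adjoined at the left edge, and spell-out simply drops traces and
attaches affixes to the preceding word without orthographic adjustment. None
of these encoding choices come from the text.

```python
TRACE = "t"  # hypothetical encoding; trace indices are not tracked


def replace_first(tree, target):
    """Return (new_tree, found), replacing the first subtree equal to target by a trace."""
    if tree == target:
        return TRACE, True
    if isinstance(tree, str):
        return tree, False
    out, found = [], False
    for child in tree:
        if not found:
            child, found = replace_first(child, target)
        out.append(child)
    return out, found


def merge(item, tree):
    """Merge a new element on top of the structure built so far."""
    return [item, tree]


def move(tree, target):
    """Extract target, leave a trace, and adjoin target at the left edge."""
    remnant, found = replace_first(tree, target)
    if not found:
        raise ValueError("constituent to be moved was not found")
    return [target, remnant]


def terminals(tree):
    """Flatten a tree into its left-to-right terminal string."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in terminals(child)]


def spell_out(tree):
    """Drop traces and attach each affix to the word on its left."""
    words = []
    for item in terminals(tree):
        if item == TRACE:
            continue
        if item.startswith("-") and words:
            words[-1] += item
        else:
            words.append(item)
    return " ".join(words)


# Derivation 4.1: move Mary, merge -s, then move the remnant [often kiss t].
step = ["often", ["kiss", "Mary"]]
step = move(step, "Mary")
step = merge("-s", step)
derivation_4_1 = move(step, ["often", ["kiss", TRACE]])

# Derivation 4.2: merge -s, move kiss, merge often above tense.
step = merge("-s", ["kiss", "Mary"])
step = move(step, "kiss")
derivation_4_2 = merge("often", step)
```

Both derivations spell out as the same string, which is why word order alone
does not decide between them; the choice rests on the independent claim that
frequency adverbs cannot be merged above tense.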
I treat auxiliaries like have as modifiers of the VP, just like the adverbs and the
affixes. Movement of such auxiliaries is not subject to adjacency, because the
auxiliaries do not assign θ-roles (Pollock, 1989). Thus, the derivation of (216)
is given in derivation 4.3.
(216) John has often kissed Mary.
Derivation 4.3
[kiss Mary]
merge -ed and move kiss ▶
[kiss_i [-ed [t_i Mary]]]
merge have ▶
[have [kiss_i [-ed [t_i Mary]]]]
merge often ▶
[often [have [kiss_i [-ed [t_i Mary]]]]]
merge T and move have ▶
[have_j [-s [often [t_j [kiss_i [-ed [t_i Mary]]]]]]]
Orders where the auxiliary follows the frequency adverb are also possible,
especially if the auxiliary is stressed, leading to a “verum focus”
interpretation (217).
(217) John often HAS kissed Mary.
I take the stress requirement to indicate that this is a marked option. Hence, I
take such orders to be derived by extracting the lowest VP [kissed Mary] in the
fashion of derivation 4.1. This is illustrated in derivation 4.4. Crucially, tense
is still merged after the frequency adverb.
Derivation 4.4
[VP2 often [have [VP1 kiss_i [-ed [t_i Mary]]]]]
move VP1 ▶
[[VP1 kiss_i [-ed [t_i Mary]]]_j [VP2 often [have t_j]]]
merge T and move VP2 ▶
[[VP2 often [have t_j]]_k [-s [[VP1 kiss_i [-ed [t_i Mary]]]_j [t_k]]]]
The range of possible generation sites for finite T is limited. This is illustrated
by contrasts like the following, where, as before I take the “low” position of the
main verb in (218) (cf. Jackendoff (1972); Pollock (1989)) to reflect remnant
movement of [completely V t] around the affix.
(218) a. John completely lost his mind
b. * John completely will lose his mind
c. John will completely lose his mind.
d. * John completely is losing his mind.
e. John is completely losing his mind.
f. * John completely has lost his mind.
g. John has completely lost his mind.
Similarly, while speakers allow both (219a,219b), they find (219b) marked,
(219a) being the neutral option.
(219) a. John has always done his homework.
b. ? John always has done his homework.
We thus have to account for the complete ungrammaticality of adverbs like
completely in the pre-aux position, as opposed to the relative markedness of
frequency adverbs in this position. Moreover, we must account for the
markedness of always aux_f as opposed to the unmarkedness of aux_f always.
The impossibility of (218f) would follow if not only T, but also aux must
merge outside of the adverb. This could be made to follow from the fact that
completely is an aspectual adverb, applying to ‘inner’ (or predicate) aspect
in the sense of Verkuyl (1993); Borik (2002). Completely can only apply to
gradable predicates for which there is a well-defined maximal degree (Doetjes
et al., 1998). Thus, (220a) is odd, because there is no well defined maximal
degree of snoring. There are degrees of it, though, as seen with John snored
too much/very much/more than Bill. There is, however, a maximal degree of
recovering; hence (220b) is good.
(220) a. John (*completely) snored (*completely).
b. John (completely) recovered (completely).
I assume that such “maximal” degrees can be implemented in terms of a
Verkuyl (1993) style path structure. In other words, completely can apply
to a predicate only if it denotes a path with a termination point, beyond which
there can be no more change of the relevant kind.6 Predicates denoting ter-
minated path structures are “telic” in the sense of being non-homogeneous for
the reason that no proper subpart of a terminated path is itself a terminated
path (see Verkuyl (1993); Borik (2002)). It now suffices to note that [aux VP]
is always homogeneous, regardless of the aspectual properties of the VP. In
other words, if the predicate [have recovered] holds of John at interval i, then,
presumably, it holds of John at every subinterval i′ of i. Hence this larger
constituent is not telic, i.e. does not denote a terminated path, and completely
cannot apply to it. In other words, the only possible order of merger would
be [have[completely VP]], and this, in turn, requires the modified VP to be
telic. Along with the adjacency requirement on lexical verbs, this derives the
fact that lexical verbs can follow completely, while auxiliaries can’t. I show the
derivation of (220b) in derivation 4.5.
Derivation 4.5
[recover]
merge completely ▶
[completely [recover]]
merge T and move recover ▶
[recover_i [-ed [completely t_i]]]
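The homogeneity argument can be illustrated with a toy interval model. This
is a sketch under assumptions of my own, not an implementation of Verkuyl's
path structures: intervals are pairs of integers, a single hypothetical
recovery event runs from time 2 to time 5, a telic predicate holds only of
the interval of the whole terminated path, and [have VP] holds of any
interval that starts once the event is over.

```python
EVENT = (2, 5)  # hypothetical recovery event, terminating at time 5


def subintervals(interval):
    """All integer-bounded subintervals of an interval, including itself."""
    start, end = interval
    return [(s, e) for s in range(start, end) for e in range(s + 1, end + 1)]


def recover(interval):
    """Telic: true only of the interval covering the whole terminated path."""
    return interval == EVENT


def snore(interval):
    """Atelic activity: true of any stretch inside the (toy) snoring period."""
    return EVENT[0] <= interval[0] and interval[1] <= EVENT[1]


def have_recovered(interval):
    """[aux VP]: true of any interval that begins after the event has ended."""
    return interval[0] >= EVENT[1]


def is_homogeneous(predicate, interval):
    """A predicate is homogeneous on an interval if it holds of every subinterval."""
    return all(predicate(sub) for sub in subintervals(interval))


def completely_can_apply(predicate, interval):
    """completely needs a telic predicate: one that holds but is not homogeneous."""
    return predicate(interval) and not is_homogeneous(predicate, interval)


telic_vp = completely_can_apply(recover, EVENT)
activity_vp = completely_can_apply(snore, EVENT)
aux_vp = completely_can_apply(have_recovered, (5, 9))
```

On this toy model only the bare telic VP licenses completely, which mirrors
the required order of merger [have [completely VP]].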
For sentences like (218g), the correct derivation depends on whether
completely can apply outside of participial morphology or not. For concreteness,
I assume that it can’t, although I do not know of any evidence from word
order in either direction. Given that, and the adjacency requirement, we get
the derivation in derivation 4.6 for (218g).7 Of course, if completely can merge
outside of PTC, we get a simpler derivation, where the main verb moves to the
left of PTC without prior extraction of the object. In either case, the auxiliary
will necessarily end up preceding the adverb.
6 This only partially characterizes the behavior of completely: in English, (i) is not a
good sentence, even though [go to the store] clearly does denote a terminating path in the
relevant sense. Interestingly, the Norwegian translation of completely, i.e. helt, can modify
the corresponding predicate, as seen in (ii).
(i) *John went completely to the store.
(ii) Jens gikk helt til butikken (Norwegian, same)
Similarly, completely (and Norwegian helt) seems to require the presence of a verbal particle
in some cases, e.g.
(iii) John has eaten it ?(up) completely.
7 The fact that the progressive in (218e) seems to obligatorily outscope the adverb suggests
that, at least in this example, the adverb must attach before the affix.
Derivation 4.6
[lose his mind]
merge completely and move his mind ▶
[[his mind]_i [completely [lose t_i]]]
merge PTC and move completely lose ▶
[[completely lose t_i]_j [-ed [[his mind]_i t_j]]]
merge have ▶
[have [[completely lose t_i]_j [-ed [[his mind]_i t_j]]]]
merge PRES and move have ▶
[have_k [-s [t_k [[completely lose t_i]_j [-ed [[his mind]_i t_j]]]]]]
Why is movement of Vf around frequency adverbs obligatory in French, and
not just marked? It is standardly assumed that Vf is in the highest position
within “IP” in French. Hence it seems that French is, in some sense, V2. This
is further supported by the “stylistic” inversion cases discussed recently by
Kayne and Pollock (1999), whereby one gets obligatory subject–verb inversion
triggered by fronting of a non-subject.
(221) A qui a téléphoné ton ami?
to whom has telephoned your friend
Kayne and Pollock (1999) argue that this does not involve a low position of the
subject, but rather extraction of the subject out of IP, with subsequent fronting
of IP around it. Let us assume that French follows the derivational pattern in
derivation 3.27 in the previous chapter. In other words, it allows adverbs to
occur interspersed between the auxiliaries. Suppose, furthermore, that French
also has a Σ head, which normally attracts the subject to its specifier. Then, it
could be that stylistic inversion involves movement of some non-canonical XP to
spec-Σ, and that this triggers extraction of the subject from ΣP and subsequent
raising of ΣP around the subject, much as what happens in the Germanic V2
languages. In other words, I adopt the analysis of Kayne and Pollock (1999),
just stated in the terminology of my chapter 3. But now we can understand
why the finite verb must precede adverbs in French. It is because Σ attracts
it and because, as we have argued, ΣP must precede adverbs. Thus the difference
between French and English lies in the attraction of Vf to Σ in French, versus
absence of such attraction in English. Whether extraction of non-clitic material
from Σ is triggered in French in the same way as Germanic now depends on
whether Σ is categorized with the same features as Germanic Σ. In other words,
extraction from ΣP was argued to be triggered by the generalization that ΣP
could only contain one potential topic switch. But if French Σ does not encode
topic switch, no extraction is predicted. These considerations suggest that
verb movement in French is really ΣP fronting. Hence, (213e) is derived as
in derivation 4.7.
Derivation 4.7
[Σ [Jean embrasse Marie]]
move Vf and Jean ▶
[Jean_i [embrasse_j+Σ [t_i t_j Marie]]]
move Marie ▶
[Marie_k [Jean_i [embrasse_j+Σ [t_i t_j t_k]]]]
merge souvent and move ΣP ▶
[[Jean_i [embrasse_j+Σ [t_i t_j t_k]]]_l [souvent [Marie_k ]]]
Now, the obligatoriness of Vf>adv orders is expected in French. It does not
indicate the presence of a high functional head attracting the verb. In fact,
it is not the verb itself that moves around the adverb, but rather a bigger
constituent containing it.
For English, it would seem that auxiliary movement around adverbs is sim-
ilar to scrambling of arguments in other languages. This is actually supported
by the fact that it has information structural effects. Consider the pattern in
(222).
(222) a. John’s always done his homework.
b. * John always’s done his homework.
c. John has always done his homework.
d. ? John always has done his homework.
e. John always HAS done his homework.
A clitic auxiliary is completely ungrammatical following an adverb. With
non-clitic, unstressed auxiliaries, the Adv>Aux order is possible, but marked,
whereas with a stressed auxiliary, such orders are perfect. I take this to re-
flect the fact that the auxiliary generally avoids stress, perhaps because it is
semantically quite bleached. The point here is that this is highly reminiscent
of the patterns holding of scrambling phenomena. A second observation sup-
porting the scrambling interpretation of English auxiliary movement is that it
may have scopal effects. Thus, (223a) was found on the internet. The adverb
seems to outscope the epistemic auxiliary, and similarly for (223b), also found
on the internet.
(223) a. It is safe to conclude that mariners’ thorough knowledge of
the river always must have been essential for a safe passage.
b. Damages caused by these items can sometimes be extensive
and costly to repair. It often might result in the total refin-
ishing of the top surface.
This suggests generalization 6 and, perhaps more surprisingly,
generalization 7.
Generalization 6 (Optional Verb Movement) Optional verb movement
around adverbs is a scrambling phenomenon.
Generalization 7 (Obligatory Verb Movement) Obligatory verb movement
around adverbs does not exist.
In cases where the verb seems to move obligatorily around adverbs, such as
in Germanic and French, it is something else, containing the verb, which has
moved. In other words, obligatory verb movement and obligatory pronoun shift
can both be reduced to fronting of Σ.
4.4 Italian verb scrambling and VP scrambling
Cinque (1999, p. 31) notes that the Italian Vf cannot precede high adverbs
unless some other material follows them, or they occur in a so-called
comma-reading. This is illustrated by the contrast between (224a) and (224b,c)
(Cinque, op. cit.).
(224) a. * Gianni lo merita [francamente/ fortunatamente/
G it deserves [frankly/ fortunately/
evidentemente/ probabilmente/ forse].
evidently/ probably/ perhaps]
b. Gianni lo merita, [francamente/ fortunatamente/
G it deserves [frankly/ fortunately/
evidentemente/ probabilmente/ forse].
evidently/ probably/ perhaps]
c. Gianni lo merita [francamente/ fortunatamente/
G it deserves [frankly/ fortunately/
evidentemente/ probabilmente/ forse] per più di una
evidently/ probably/ perhaps] for more than one
ragione.
reason
I take this to reflect the following situation. Finite tense can be merged
above or below high adverbs. If it is merged above, the Vf will occur to the left
of the adverbs, because it is attracted by the affix. But since the high adverbs
are stress avoiders, this leads to the situation where they must be read with
a comma-intonation in case there is nothing further down in the clause which
can receive the default stress (cf. Cinque (1993)). If there is lower material,
nothing special needs to be done with the adverb.
“Lower” adverbs are not stress avoiders, and Cinque shows that these ac-
tually do participate in stress-driven movements (pp. 13-16). For example,
he notes that ancora and other low adverbs can follow the entire VP, some-
times requiring a slight pause. This is illustrated with di già ‘already’ in (225)
(Cinque, 1999, p14).
(225) Gianni ha ricevuto la notizia DI GIÀ.
G has received the news already
This is interpreted as movement of the VP (or a somewhat larger constituent)
around the adverb, rendering the adverb the lowest element on the recursive
side of the tree, thus the recipient of prosodic stress. It is striking that the
domain for such VP scrambling out of the focus domain has exactly the same
range as movement of the participial verb. In other words, it seems that the
Italian and French PTC can scramble around all and only adverbs which are
not stress avoiders.
4.5 Summary
This concludes my tentative discussion of (short) verb movement phenomena.
I have suggested that apparent cases of obligatory verb movement should be
treated as pied piping by a larger constituent, and hence that obligatory verb
movement as such does not exist. Optional verb movement, as observed with
English auxiliaries and participles in French and Italian, as well as ordinary
finite verbs in Italian, is treated in part as base generation of the verbal
morphology above or below the adverbs in question, and in part as a scrambling
(i.e. stress/focus driven) operation. For languages where all verb forms follow
all adverbs, as is apparently the case in Spanish, but also Germanic, modulo V2,
an analysis of the sort proposed in the previous chapter (derivation 3.26) is
assumed. Hence such languages are predicted to exhibit crossing scope
dependencies between adverbs and verbs.
The present chapter leaves the following hard questions to be solved. Why
must finite tense apparently be merged above certain adverbs? Why can’t
participial morphology be merged above stress avoiding adverbs? I do not
have an answer to these questions at present, but I think one should try to
address them before rejecting an account along the present lines.
If correct, the analysis presented here suggests that very few distributional
phenomena, if any at all, should be handled in terms of selectional sequences
of functional heads. This does not mean that there cannot be any “functional”
heads. What I have argued against in this thesis is ordering them by (syntactic)
selection.
APPENDIX A
Degrees and SOA
In this appendix, I will argue that there is independent evidence for the analysis
of speaker-oriented adverbs (SOA) proposed in Chapter 2.
A.1 The veldig ∼ vis Competition
Unlike English, Norwegian does not allow degree modified sentence adverbs,
like (226).
(226) a. * Jens har veldig sannsynligvis gått hjem.
J has very probably gone home.
b. * Jens har ganske muligens gått hjem.
J has quite possibly gone home
c. * Jens er helt tydeligvis ikke forbryteren.
J is completely evidently not the-perpetrator
In some cases, it is possible to circumvent this problem by omitting the deriva-
tional suffix -vis.
(227) a. Jens har veldig sannsynlig gått hjem.
J has very probable gone home
b. ? Jens har veldig mulig gått hjem.
J has very possible gone home
c. Jens er helt tydelig ikke forbryteren.
J is completely evident not the-perpetrator
This gives the impression that “bare” adjective phrases can be used as sentence
adverbials in Norwegian. However, if the adjective is not degree modified, the
result is ungrammatical, i.e. in this case, -vis is obligatory.1
(228) a. Jens har sannsynlig*(-vis) gått hjem
J has probable gone home
b. Jens har mulig*(-ens) gått hjem.
J has possible gone home
c. Jens er tydelig*(-vis) ikke forbryteren.
J is evident not the-perpetrator
In other words, -vis and degree modification are in complementary distribution.
A.1.1 A morphological analysis?
One way one may want to think about this is to say that the derivational
suffix -vis turns the adjectives into adverbs presyntactically, and that the
degree modifiers veldig ‘very’, ganske ‘pretty’ and helt ‘completely’
syntactically select for adjectives.2 I will refer to this as the “morphological
analysis” (MA). It is compatible with the fact that adj+-vis cannot occur in
other positions where adjectives typically occur, e.g. predicative position and
adnominal position, i.e. they are not adjectives.
(229) a. * Det er tydeligvis at Jens er forbryteren.
it is evidently that J is the-perpetrator
b. * Den mest sannsynligvise løsningen er at Jens er
the most probably-def the-solution is that J is
forbryteren.
the-perpetrator
However, it leaves open some other questions, like the following ones: 1)
Why can “bare” APs occur in sentence-adverbial position just in case they are
degree modified? 2) Why are English sentence adverbs ending in -ly compatible
with degree modification (230), while Norwegian -vis adverbs are not? The
fact that English does allow for degree modified -ly adverbs actually suggests
that the phenomenon cannot be reduced to the status of a modifier as an
adjective/adverb. English -ly adverbs are also ungrammatical in predicative
and adnominal position (231), so they are not adjectives.
(230) a. John has very probably gone home.
b. John has quite possibly gone home.
c. John has very evidently gone home.
1 The ending -ens on mulig ‘possible’ is irregular: I take it to be an allomorph of -vis.
2 On the selectional properties of very etc., see Corver (1997); Doetjes et al. (1998); Kayne
(2002).
(231) a. * It is evidently that John has gone home.
b. * A probably solution is that John is the perpetrator.
The worst problem for MA is that some of the degree modifiers incompatible
with -vis have a relatively free distribution, syntactically. Corver (1997) and
Doetjes et al. (1998) argue that one should distinguish two classes of degree
modifiers, namely those that select for adjectives, and those that do not.
English very is in the first class (class 1), whereas more is in class 2. Thus,
more, but not very, can modify nouns (232a), verb phrases (232b)-(232c) and
prepositional phrases (232d). Hence, it seems that more is capable of modifying
any syntactic category.
(232) a. Stanley has [more/*very] money (than Bill).
b. Stanley [[likes her] [more/*very]]
c. * Stanley very [likes her].
d. Stanley is [more/*very] into syntax.
The examples with very would be rescued by insertion of the dummy much
(Corver, 1997), i.e. very much. In (233), we see that the distribution of veldig
‘very’ is something in between English very and more, i.e. veldig can to some
extent modify verb phrases and PPs, but not nouns. Norwegian mer ‘more’
behaves like its English cognate.
(233) a. Ståle har [mer/ *veldig] penger (enn Willy).
S has [more/ *very] money (than W)
b. Ståle liker henne [mer/ veldig ?(mye)]
S likes her [more/ very ?(much)]
c. * . . . at Ståle veldig liker henne
. . . that S very likes her
d. Ståle er [mer/ veldig (mye)] mot hvalfangst.
S is [more/ very (much)] against whale-hunting
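The class 1 / class 2 distinction drawn from the English data in (232) can be made concrete with a small sketch. This is only an illustration under simplifying assumptions: the category labels, the DEGREE_CLASS table and the function can_modify are hypothetical names introduced here, and only English very and more are encoded.

```python
# Hypothetical encoding of the class 1 / class 2 split for English (232).
# All names below are illustrative and not part of the cited analyses.
ADJECTIVE, NOUN, VERB_PHRASE, PREP_PHRASE = "A", "N", "VP", "PP"

# Class 1 modifiers select for adjectives; class 2 modifiers do not.
DEGREE_CLASS = {"very": 1, "more": 2}


def can_modify(modifier: str, category: str) -> bool:
    """Return True if the modifier may combine directly with the category."""
    if DEGREE_CLASS[modifier] == 1:
        return category == ADJECTIVE
    return True  # class 2: no categorial restriction


for category in (ADJECTIVE, NOUN, VERB_PHRASE, PREP_PHRASE):
    print(category, "more:", can_modify("more", category),
          "very:", can_modify("very", category))
```

Norwegian veldig, which patterns between the two classes in (233), is deliberately left out of the sketch, since a binary class label would not capture its behavior.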
If MA is to have any plausibility, there should be some clear cases of class
2 degree modifiers that can modify adverbs ending in -vis, since these modifiers
are indifferent to the morpho-syntactic class of the category they modify. There
are none.3
(234) a. * Ståle har [veldig/ altfor/ litt/ mer/ nok/ tilstrekkelig/
S has [very/ all-too/ a.little/ more/ enough/ sufficiently/
utrolig/ kjempe-/ . . .] sannsynligvis gått hjem.
incredibly/ giant-/ . . .] probably gone home
3 In colloquial Norwegian, the noun kjempe ‘giant’ productively acts as a prefixal degree
modifier to adjectives, as in kjempe-stor ‘giant-big’ (“very big”), kjempe-smart ‘giant-smart’
(“very smart”), etc.
b. Ståle har [veldig/ altfor/ ?litt/ mer/ ?nok/
S has [very/ all-too/ ?a.little/ more/ enough/
tilstrekkelig/ utrolig/ ?kjempe-/ . . .] sannsynlig gått
sufficiently/ incredibly/ ?giant-/ . . .] probable gone
hjem.
home
Even mer ‘more’, which is a clear class 2 degree adverb, and certainly can
modify the adjective sannsynlig ‘probable’ (i.e. there shouldn’t be any semantic
incompatibility), is strongly incompatible with -vis. In short, it does not seem
to be the case that the complementarity of veldig and -vis is reducible to the
s-selectional properties of the former and the morphosyntactic category of
the latter.
A.1.2 DegPs and sentence adverbs
Having dismissed MA, let us have a second look at the Norwegian facts.
The contrast between (227) and (228) indicates that degree modification (of
sannsynlig and tydelig; we return to mulig) suffices to turn these adjectives into
sentence adverbs. The fact that the suffix -vis is capable of turning an adjective
into an adverb, i.e. without extra degree modification, in addition to the com-
plementary distribution between -vis and other degree modifiers, suggests that
-vis itself is a degree modifier.4 In this way, the facts in (226)-(228) support
the following generalization:
Generalization 8 (Norwegian) Sentence adverbs are degree modified adjec-
tives.
Let us now turn to some facts that support this generalization. A very
productive way of turning adjectives into “evaluative” adverbs is to add the
post-adjectival degree modifier nok ‘enough’.5 Below is a non-exhaustive list
of adverbs formed in this way, found in a search of the Oslo Corpus of
Tagged Norwegian Text (OCTN).
4 -vis appears on some adverbs, like delvis ‘partly’, which are not sentence adverbs on
their most common use. It seems that when -vis attaches to an adjective ending in -(l)ig,
the result is invariably a sentence modifier.
5 See Barbiers (2001) on this phenomenon in Dutch with genoeg ‘enough’.
(235) beklagelig nok      regrettable enough     ‘regrettably’
      besnærende nok      captivating enough     ‘captivatingly’
      ironisk nok         ironical enough        ‘ironically’
      merkelig nok        strange enough         ‘strangely’
      mirakuløst nok      miraculous enough      ‘miraculously’
      naturlig nok        natural enough         ‘naturally’
      overraskende nok    surprising enough      ‘surprisingly’
      paradoksalt nok     paradoxical enough     ‘paradoxically’
      pussig nok          funny enough           ‘funnily enough’
      rettferdig nok      fair enough            ‘fairly enough’
      rett nok            right enough           ‘true enough’
      rimelig nok         reasonable enough      ‘reasonably’
      sant nok            true enough            ‘true enough’
      typisk nok          typical enough         ‘typically enough’
      utrolig nok         unbelievable enough    ‘unbelievably’
In fact, it seems that this is the productive way of making evaluatives.
Note for example that most of the examples in (235) are incompatible with
the ending -vis, even the ones that end in -lig, i.e. *merkeligvis ‘strangely’,
*rimeligvis ‘reasonably’, *utroligvis ‘unbelievably’, etc. On the other hand,
evaluatives which have forms ending in -vis, like heldigvis ‘fortunately’, can
occur with nok, as in heldig nok ‘fortunately’ (lit. “fortunate enough”).
I illustrate the use of these “adverbs” in (236). Note that they are always
“evaluative”, i.e. they impose some evaluation of the asserted proposition on
the part of the speaker.6
(236) a. på skolen gikk det naturlig nok til helvete.
on school went it natural enough to hell
b. Utrolig nok føk pucken inn bak Steve Allman i
incredible enough darted the-puck in behind S A in
gjestenes mål
the-guests’ goal
c. Ironisk nok har hun det best når mamma og pappa
ironic enough has she it best when mum and dad
forsvinner til Syden på ferie.
vanish to the-south on vacation
6 The examples are also from the OCTN.
I take this to support the analysis of SOA proposed in chapter 2. There
I claimed that adverbs like possibly differ from the corresponding adjective
possible in that the former give rise to stronger statements than the latter.
This, in turn, was implemented by applying a domain-shrinking function to
the epistemic accessibility relation inherent to the meaning of the adjective.
From the perspective of the present appendix, it seems that the domain shrink
is, in fact, the ending -vis, and that domain shrinkage should be related to
degree modification, perhaps more generally.
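Why shrinking the domain of an existential modal yields a stronger statement can be seen in a toy model. The sketch below is a simplification under explicit assumptions: the accessibility relation is reduced to a plain set of accessible worlds, propositions are sets of worlds, and the names possible, shrink and WORLDS are hypothetical; it is not the formal system of chapter 2.

```python
from itertools import chain, combinations


def possible(proposition, domain):
    """Existential modal: the proposition holds in some accessible world."""
    return any(world in proposition for world in domain)


def shrink(domain, excluded):
    """Hypothetical domain-shrinking function: drop some accessible worlds."""
    return domain - excluded


def all_propositions(worlds):
    """Every proposition over the worlds, modelled as a set of worlds."""
    ordered = sorted(worlds)
    subsets = chain.from_iterable(
        combinations(ordered, size) for size in range(len(ordered) + 1)
    )
    return [frozenset(subset) for subset in subsets]


WORLDS = frozenset({"w1", "w2", "w3"})        # assumed full modal domain
SMALL = shrink(WORLDS, frozenset({"w3"}))     # assumed shrunken domain

# Truth on the shrunken domain guarantees truth on the full domain ...
entails = all(
    possible(p, WORLDS) for p in all_propositions(WORLDS) if possible(p, SMALL)
)
# ... but not the other way around, so the shrunken modal is strictly stronger.
strictly_stronger = any(
    possible(p, WORLDS) and not possible(p, SMALL)
    for p in all_propositions(WORLDS)
)
print(entails, strictly_stronger)
```

On this toy picture, the modal evaluated on the smaller domain asymmetrically entails the one evaluated on the larger domain, which is the sense in which the adverb gives rise to a stronger statement than the adjective.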
Bibliography
Ackema, Peter, and Ad Neeleman. 2002. Effects of short-term storage in pro-
cessing rightward movement. In Storage and computation in the language
faculty, ed. S. Nooteboom, F. Weerman, and F. Wijnen, 219–256. Dordrecht:
Kluwer. 2, 19, 21, 23, 26, 28
Åfarli, Tor A. 1996. Dimensions of phrase structure: the representation of
sentence adverbials. Ms. University of Trondheim. 6, 12
Alexiadou, Artemis. 1997. Adverb placement: A case study in antisymmetric
syntax. Amsterdam: John Benjamins. 37, 67
Alexiadou, Artemis. 2001. On the status of adverb in a grammar without a
lexicon. Ms. University of Stuttgart. 37
Barbiers, Sjef. 1995. The syntax of interpretation. Doctoral Dissertation, Lei-
den University. 31, 35
Barbiers, Sjef. 2001. Is vreemd genoeg genoeg? In Kerven in een rots, ed. B.
Dongelmans et al., 15–28. Stichting Neerlandistiek Leiden Reeks 7. 144
Bartsch, Renate. 1976. The grammar of adverbials. North-Holland: Elsevier.
72
Beaver, David, and Brady Clark. 2002. Always and only – why not all focus
sensitive operators are alike. Ms. Stanford. 50, 63
Beghelli, Filippo, and Timothy Stowell. 1997. Distributivity and negation. In
Ways of scope taking, ed. Anna Szabolcsi, 71–109. Dordrecht: Kluwer. 11,
50, 68
Bellert, Irena. 1977. On semantic and distributional properties of sentential
adverbs. Linguistic Inquiry 8:337–51. 38, 39
Bentzen, Kristine. 2002. Independent V-to-I movement without morphological
clues. Handout from paper presented at Grammatik i Fokus, Lund, February 2002.
72, 118
Bernardi, Raffaella. 2002. Reasoning with polarity in categorial type logic.
Doctoral Dissertation, Utrecht University. 6, 42
den Besten, Hans. 1989. Studies in West Germanic syntax. Amsterdam:
Rodopi. 77
Bianchi, Valentina. 1995. Consequences of antisymmetry for the syntax of
headed relative clauses. Doctoral Dissertation, Scuola Normale Superiore,
Pisa. 34
Bobaljik, Jonathan. 1999. Adverbs: the hierarchy paradox. Glot International
4:27–28. 11, 12, 14, 71, 113, 126, 128, 129
Bobaljik, Jonathan, and Samuel Brown. 1997. Interarboreal operations: Head
movement and the extension requirement. Linguistic Inquiry 28:345–356.
108
Borik, Olga. 2002. Aspect and reference time. Doctoral Dissertation, Utrecht
institute of Linguistics OTS. 70, 134, 135
Brody, Michael. 2000. Mirror theory. Linguistic Inquiry 31:29–56. 4
Cardinaletti, Anna, and Michal Starke. 1995. The typology of structural
deficiency: On the three grammatical classes. ZAS Papers in Linguistics 1:1–55.
81
Chierchia, Gennaro. 2001. Scalar implicatures, polarity phenomena, and the
syntax/pragmatics interface. Ms. University of Milan – Bicocca. 2, 38, 51,
60, 61, 62, 157
Chomsky, Noam. 1965. Aspects of the theory of syntax . Cambridge, Mas-
sachusetts: MIT Press. 67
Chomsky, Noam. 1994. Bare phrase structure. Ms. MIT. 4, 20
Chomsky, Noam. 1995. The minimalist program. Cambridge, Massachusetts:
MIT Press. 1, 82, 106
Chomsky, Noam. 1999. Derivation by phase. Ms. MIT. 1, 60, 102
Chomsky, Noam. 2001. Beyond explanatory adequacy. Ms. MIT. 1, 29, 60,
102
Cinque, Guglielmo. 1993. A null theory of phrase and compound stress.
Linguistic Inquiry 24:239–297. 121, 139
Cinque, Guglielmo. 1999. Adverbs and functional heads – a crosslinguistic
perspective. Oxford: Oxford University Press. 3, 7, 18, 21, 29, 37, 38, 44,
48, 67, 68, 70, 78, 80, 138, 139, 155
Cinque, Guglielmo. 2000a. On Greenberg’s U20 and the Semitic DP. Ms.
University of Venice. 69
Cinque, Guglielmo. 2000b. “Restructuring” and functional structure. Ms.
University of Venice. 21
Cinque, Guglielmo. 2002. Complement and adverbial PPs: Implications for
clause structure. Paper presented at GLOW 2002, Amsterdam/Utrecht. 21,
23, 30, 34
Cinque, Guglielmo. to app. Issues in adverbial syntax. Lingua Special edition
on adverbs, ed. by Artemis Alexiadou. 7, 38, 48, 69
Corver, Norbert. 1997. The internal syntax of the Dutch extended adjectival
projection. Natural Language and Linguistic Theory 15:289–368. 142, 143
Diesing, Molly. 1992. Indefinites. Cambridge, Massachusetts: MIT Press. 103
Doetjes, Jenny, Ad Neeleman, and Hans van de Koot. 1998. Degree expressions.
UCL Working Papers in Linguistics 10:323–367. 134, 142, 143
Dowty, David. 1979. Word meaning and Montague grammar: The semantics
of verbs and times in generative semantics. Dordrecht: Reidel. 70
Dowty, David. 2000. The dual analysis of adjuncts/complements in categorial
grammar. ZAS Papers in Linguistics 17. 37
Egerland, Verner. 1998. On verb-second violations in Swedish and the
hierarchical ordering of adverbs. Working Papers in Scandinavian Syntax 61.
79
Emonds, Joseph. 1976. A transformational approach to English syntax: Root,
structure-preserving and local transformations. New York: Academic Press.
155
Ernst, Thomas. 2000. Manners and events. In Events as grammatical objects,
ed. Carol Tenny and James Pustejovsky, 335–358. Stanford: CSLI. 19, 48,
66
Ernst, Thomas. 2001. The syntax of adjuncts. Cambridge: Cambridge Univer-
sity Press. 7, 17, 35, 38, 48, 72, 73, 75, 124
Fanselow, Gisbert. 2002. Münchhausen-style head movement and the analysis
of verb second. Ms. University of Potsdam. 100, 107, 108
Giannakidou, Anastasia. 1997. The landscape of polarity items. Doctoral
Dissertation, University of Groningen. 42, 63, 67
Greenberg, J. 1966. Some universals of grammar with particular reference to the
order of meaningful elements. In Universals of language, ed. J. Greenberg,
73–113. Cambridge, Massachusetts: MIT Press. 21
Groenendijk, Jeroen, and Martin Stokhof. 1984. Studies on the semantics of
questions and the pragmatics of answers. Doctoral Dissertation, University
of Amsterdam. 43
Groenendijk, Jeroen, Martin Stokhof, and Frank Veltman. 1996. Coreference
and modality. In Handbook of contemporary semantic theory, ed. S. Lappin,
179–216. Oxford: Blackwell. 54
Hallman, Peter. 2001. On the derivation of verb-final and its relation to
verb-second. Ms. U. of Michigan. 101, 118
Harper, W. L. 1976. Ramsey test conditionals and iterated belief change (a
response to Stalnaker). In Foundations of probability theory, statistical infer-
ence and statistical theories of science, ed. W. L. Harper and C. A. Hooker.
Dordrecht: Reidel. 56
Holmberg, Anders. 1986. Word order and syntactic features in the Scandinavian
languages and English. Doctoral Dissertation, University of Stockholm. 3,
84
Holmberg, Anders. 1999. Remarks on Holmberg’s generalization. Studia Lin-
guistica 53:1–39. 3, 84
Holmberg, Anders. 2000. Scandinavian stylistic fronting: How any category
can become an expletive. Linguistic Inquiry 31:445–483. 86, 108
Holmberg, Anders, and Christer Platzack. 1995. The role of inflection in Scan-
dinavian syntax . Oxford: Oxford University Press. 77
Iatridou, Sabine, Elena Anagnostopoulou, and Roumyana Izvorski. 2002.
Observations about the form and meaning of the perfect. In Ken Hale: A life
in language, ed. Michael Kenstowicz, 189–238. Cambridge, Massachusetts:
MIT Press. 70
Jackendoff, Ray. 1972. Semantic interpretation in generative grammar . Cam-
bridge, Massachusetts: MIT Press. 129, 134
Jayaseelan, K. A. 2001. IP-internal topic and focus phrases. Studia Linguistica
55:39–75. 120, 121
Julien, Marit. 2000. Syntactic heads and word formation: A study of verbal
inflection. Doctoral Dissertation, University of Tromsø. 37
Kadmon, Nirit, and Fred Landman. 1993. Any. Linguistics and Philosophy
16:353–422. 2, 38, 43, 51
Kayne, Richard S. 1975. French syntax: The transformational cycle. Cam-
bridge, Massachusetts: MIT Press. 81, 121
Kayne, Richard S. 1994. The antisymmetry of syntax . Cambridge, Mas-
sachusetts: MIT Press. 4, 19, 20, 34
Kayne, Richard S. 1998. Overt vs. covert movement. Syntax 1:128–191. 78,
90
Kayne, Richard S. 1999. Prepositional complementizers as attractors. Probus
11:39–73. 78
Kayne, Richard S. 2000. Recent thoughts on antisymmetry. Talk presented at
the conference on antisymmetry, Cortona, Italy. 20
Kayne, Richard S. 2002. On the syntax of quantity in English. Ms. NYU. 142
Kayne, Richard S., and Jean-Yves Pollock. 1999. New thoughts on stylistic
inversion. Ms. NYU and CNRS-Lyon. 136
Koeneman, Olaf. 2000. The flexible nature of verb movement. Doctoral Dis-
sertation, Utrecht University. 108
Koopman, Hilda, and Anna Szabolcsi. 2000. Verbal complexes. Cambridge,
Massachusetts: MIT Press. 21, 29, 30, 72, 78, 116, 117
Koster, Jan. 1974. Het werkwoord als spiegelcentrum. Spektator 3:603–618.
31, 34, 35
Kratzer, Angelika. 1977. What ‘must’ and ‘can’ must and can mean. Linguistics
and Philosophy 1:337–355. 54
Kratzer, Angelika. 1991. Modality. In Semantics: An international handbook
of contemporary research, ed. A. von Stechow and D. Wunderlich, 639–650.
Berlin: de Gruyter. 54
Krifka, Manfred. 1995. The semantics and pragmatics of polarity items. Lin-
guistic Analysis 25:209–257. 2, 38, 43, 51, 53, 62
Lahiri, Utpal. 1997. Focus and negative polarity in Hindi. Natural Language
Semantics 6:57–123. 38, 51
Linebarger, Marcia-C. 1987. Negative polarity and grammatical representation.
Linguistics and Philosophy 10:325–387. 43, 61
Löbner, Sebastian. 1999. Why German schon and noch are still duals: A reply
to van der Auwera. Linguistics and Philosophy 22:45–107. 46
Meinunger, André. 2001. Adjacency requirement blocks verb raising. Paper
presented at GLOW 2001, Portugal. 100
Meinunger, André. to app. Restrictions on verb raising. Linguistic Inquiry .
100
Moortgat, Michael. 1996. Categorial type logics. In Handbook of logic and
language, ed. J. van Benthem and A. ter Meulen. Cambridge, Massachusetts:
MIT Press. 6
Mulders, Iris. 2002. Transparent parsing – head-driven processing of verb-final
structures. Doctoral Dissertation, Utrecht University. 24, 25
Müller, Gereon. 2002. Verb-second as vP-first. Ms. IDS Mannheim. 29, 102,
106, 129
Nilsen, Øystein. 1997. Adverbs and A-shift. Working Papers in Scandinavian
Syntax 59:1–32. 6, 111, 113
Nilsen, Øystein. 2000. The syntax of circumstantial adverbials. Oslo: Novus
Press. 2, 29, 32, 33, 35, 99
Nilsen, Øystein. 2001. Adverb order in type logical grammar. In Proceedings of
the Amsterdam Colloquium 2001 , ed. R. van Rooy and M. Stokhof, 156–161.
Amsterdam. 6, 68, 74
Nilsen, Øystein. to app.a. Domains for adverbs. Lingua Special edition on
adverbs, ed. by Artemis Alexiadou. 37
Nilsen, Øystein. to app.b. Verb second and Holmberg’s generalization. In J.W.
Zwart and W. Abraham (eds.) Studies in Comparative Germanic Syntax.
29, 77, 105
Nilsen, Øystein, and Nadezhda Vinokurova. 2000. Generalized verb raisers. In
Proceedings of the 2000 International Workshop on Generative Grammar , ed.
Young Jun Jang and Jeong-Seok Kim, 167–176. Hansung university, Seoul.
21, 37
Pesetsky, David. 1995. Zero syntax: Experiencers and cascades. Cambridge,
Massachusetts: MIT Press. 34
Platzack, Christer. 1986. The position of the finite verb in Swedish. In Verb
second phenomena in Germanic languages, ed. Hubert Haider and Martin
Prinzhorn, 27–47. Dordrecht: Foris. 110
Pollock, Jean-Yves. 1989. Verb movement, universal grammar, and the struc-
ture of IP. Linguistic Inquiry 20:365–424. 3, 68, 130, 131, 133, 134, 155
Pritchett, B. 1992. Grammatical competence and parsing performance.
Chicago: University of Chicago Press. 25
Reinhart, Tanya. 1995. Interface strategies. OTS Working Papers. 119, 121
Rizzi, Luigi. 1997. The fine structure of the left periphery. In Elements of
grammar , ed. L. Haegeman, 281–337. Dordrecht: Kluwer. 3, 78, 80
van Rooy, Robert. 2001. Attitudes and context change. Ms. of book ILLC,
Amsterdam. 56, 57
van Rooy, Robert. 2002. Negative polarity items in questions. Ms. ILLC,
Amsterdam. 39, 43, 53, 66
Starke, Michal. 2001. Move dissolves into merge: a theory of locality. NYU.
5, 29, 102
Stowell, Tim. 1981. Origins of phrase structure. Doctoral Dissertation, MIT.
131
Svenonius, Peter. 2001. Subject positions and the placement of adverbials.
In Subjects, expletives, and the EPP , ed. Peter Svenonius, 199–240. Ox-
ford/New York: Oxford University Press. 7, 11, 13, 14, 19, 35, 75, 124,
126
Szabolcsi, Anna. 2001. Hungarian disjunctions and positive polarity. Ms. NYU.
49
Szabolcsi, Anna. 2002. Positive polarity–negative polarity. Ms. NYU. 45, 46,
60, 61, 64
Szendrői, Kriszta. 2001. Focus and the syntax-phonology interface. Doctoral
Dissertation, University College London. 119
Verkuyl, Henk. 1993. A theory of aspectuality: The interaction between temporal
and atemporal structure. Cambridge: Cambridge University Press. 134, 135
Vlach, Frank. 1993. Temporal adverbials, tenses and the perfect. Linguistics
and Philosophy 16:231–283. 70
Westerståhl, Dag. 1988. Quantifiers in formal and natural languages. In Hand-
book of philosophical logic, ed. D. Gabbay and F. Günthner, 1–131. Dor-
drecht: Kluwer. 38, 51
van der Wouden, Anton. 1997. Negative contexts: Collocation, polarity and
multiple negation. London: Routledge. 37, 42
Zwarts, Frans. 1995. Nonveridical contexts. Linguistic Analysis 25:286–312.
42
Zwarts, Frans. 1998. Three types of polarity. In Plurality and quantification,
ed. F. Hamm and E. Hinrichs, 177–238. Dordrecht: Kluwer. 42
Summary (Samenvatting in het Nederlands)
The notion of clausal architecture. What does it mean to claim that
clause structure consists of a sequence fseq of n ≥ 1 functional heads
F⁰? On what grounds could this claim be rejected? One of the main theses of
this dissertation is that the claim is in fact false. It is argued that the
essential empirical motivation for fseq, namely adverb placement, verb
movement and the second position of the verb (verb-second, V2), is better
explained by alternative approaches that make no use of fseq.
The basic argument for the existence of functional heads, F⁰, comes from
cross-linguistic variation in the relative order of a syntactic head and other
material, typically adverbial expressions. In brief, Emonds (1976) and Pollock
(1989) build on the fact that in French, finite verbs (Vf) must precede adverbs
such as souvent ‘often’ (237), whereas in English, finite verbs follow the
corresponding adverb, often (238).
(237) a. Jean embrasse souvent Marie. [Fr]
Jean kisses often Marie
b. * Jean souvent embrasse Marie.
Jean often kisses Marie
(238) a. John often kisses Mary.
b. * John kisses often Mary.
By a similar line of argument, Cinque (1999) concludes that a clause must
contain several head positions Pi to which finite verbs and other verb forms
can move. First, he observes that adverbs can only occur in certain orders,
and that the restrictions on adverb order are universal. Second, he observes
that there is more fine-grained variation among (Romance) languages as to
where verbs can be placed within a sequence of adverbs. Finally, he argues
that if, in a language L, a verb form V can precede some adverb a, then V must
also precede all adverbs ai that follow a in the sequence. In other words, the
ordering of verbs and adverbs obeys the mathematical notion of transitivity.
This pattern would follow if the different adverbs are specifiers of
dedicated functional heads Fj and the verbs can attach to the various heads
Fj. This is illustrated in Figure A.1.
[XP Adv1 [X′ X [YP Adv2 [Y′ Y [ZP Adv3 [Z′ Z ... ]]]]]]
Figure A.1: Architecture for clauses with a transitive ordering pattern
Given such a hierarchical architecture, each language would require or allow
verb movement to a different head position. If, however, one assumes that the
position of the verb is constant, and that adverbs can be attached above or
below that position, transitivity would not hold. In that case, the ordering
relations between adverbs and verbs cannot be stated independently of one
another.
This illustrates a crucial aspect of the claim under discussion:
cross-linguistic variation in the distributional phenomena considered here can
be accommodated in a linear sequence. If we can find empirical evidence for
non-linear word order patterns, we have found a counterexample. This study
shows that the distribution of the Norwegian adverbs muligens ‘possibly’,
ikke ‘not’ and alltid ‘always’ is not transitively ordered. This shows that a
relative ordering like the one in Figure A.1 is not possible.
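The logic of such a counterexample can be sketched computationally: a set of pairwise precedence requirements fits a single fseq only if some linear sequence respects all of them at once. The sketch below is a minimal illustration under stated assumptions. The function linearizable and the labels A, B, C are hypothetical, and the two patterns are placeholders chosen to show a transitive and a non-transitive case; they are not the judgments reported for the Norwegian adverbs.

```python
from itertools import permutations


def linearizable(items, must_precede):
    """Return True if one linear sequence respects every (earlier, later) pair."""
    for sequence in permutations(items):
        rank = {item: position for position, item in enumerate(sequence)}
        if all(rank[earlier] < rank[later] for earlier, later in must_precede):
            return True
    return False


ADVERBS = ["A", "B", "C"]  # placeholder labels, not actual adverbs

# Placeholder pattern that a single sequence A > B > C can generate.
transitive_pattern = {("A", "B"), ("B", "C"), ("A", "C")}

# Placeholder pattern in which the third pair runs the other way.
non_transitive_pattern = {("A", "B"), ("B", "C"), ("C", "A")}

print(linearizable(ADVERBS, transitive_pattern))
print(linearizable(ADVERBS, non_transitive_pattern))
```

The brute-force search over permutations is only practical for a handful of items, which is enough to show why a single attested non-transitive triple is incompatible with one linear sequence.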
A related problem is that two different kinds of expressions sometimes obey
orthogonal word order requirements. This holds for the relative order of
arguments and adverbs in Scandinavian and Dutch. It also arises for the
relative order of verbs in Italian, as compared with the order of adverbs.
Since the ordering patterns are orthogonal, they cannot be accommodated in a
single fseq.
Eliminating fseq. It is shown that the relative order of adverbs in the
clause follows to a large extent from the fact that very many adverbs are
(positive) polarity items (PPIs). In addition, various adverbs create
environments to which other adverbs are sensitive. For example, it is shown
that speaker-oriented adverbs such as evidently, paradoxically, fortunately,
possibly, etc. are positive polarity items, in the sense that they are
excluded from downward-entailing environments. Thus, while sentence (239a)
(from an internet source) is judged grammatical, sentence (239b) is judged
ungrammatical by native speakers of English.
(239) a. His retaliations killed or endangered innocents
and often possibly had little effect in
locating terrorists.
b. ?? His retaliations killed or endangered innocents
and rarely possibly had an effect in
locating terrorists.
The PPI behavior of adverbs like possibly is derived by adapting the analysis
of negative polarity items given by Chierchia (2001) to positive polarity
items. Possibly is derived from possible by applying a function that shrinks
the modal base associated with the adjective. Such shrinking is governed by a
pragmatic anti-weakening constraint. The output of domain shrinking is not
logically entailed by the input. It is argued that this suffices to derive
most of the observed ordering patterns among adverbials.
A similar analysis is proposed for the ordering of both verbal morphology and
auxiliaries. It follows, moreover, that such ordering phenomena provide no
support for an fseq.
It is argued that the standard analysis of verb-second (V2) in terms of head
movement to C runs into numerous problems with respect to verb-second
violations caused by focus-sensitive particles in Scandinavian. The proposed
analysis treats verb-second as the result of moving a single XP to first
position, where this XP contains the constituent preceding the verb, the verb
itself, and the unstressed pronouns. This analysis solves the problems
concerning focus particles and explains Holmberg’s generalization regarding
the interaction between argument shift and verb movement in Scandinavian.
Argument shift cannot cross a verb, or any other phonetically visible material
that is part of the VP. In some cases, however, arguments can and must cross
adverbs. This generalization follows if the argument and the verb do not move
separately, but an XP containing all these elements moves instead. In other
words, argument shift cannot cross anything but adverbs because there is no
argument shift: there is only movement of VPs across adverbs.
It is argued that the phenomena described as short verb movement, as found in
the Romance languages, are in essence scrambling phenomena. This is supported
by the fact that such movements have semantic (scopal) effects. These
movements are sensitive to information-structural properties of the
expressions involved. Moreover, they are optional across a limited range of
adverbs. Hence, short verb movement does not support the existence of distinct
functional heads, contrary to what is generally assumed.
Taking all of this together, it can be concluded that the distribution of
verbs and adverbs in the clause can be analyzed without making use of
‘positions’ in a sequence of functional heads. Because verbs and adverbs are
subject only to scope and other interface requirements, but can otherwise be
merged freely, the problem of transitivity and orthogonal ordering patterns is
resolved.
Curriculum vitae
Øystein Nilsen was born in Norway on March 29, 1971. He studied linguistics,
philosophy and psychology at the University of Tromsø between 1992 and 1996,
and completed his M.Phil. in linguistics at the same university in 1998. After
working as a lecturer at the Tromsø Institute of Linguistics, he enrolled as a
Ph.D. student at the Utrecht institute of Linguistics OTS in August 1999. The
present dissertation is the result of work he carried out there.