2019-12-01

Constituency and Coordination

For Syntax FANBOYS

As any seasoned linguist will tell you, grammar based on folk theory is, well, folksy. And, since most folks are superficial in their linguistic analysis, so, too, are their contrived grammars. At the opposite pole, we have crushing formality. And, since most formalities are constrained, so, too, are their contrived grammars. What lies in between is how I perceive linguistic syntax. The top-down system derives from bottom-up observations. Message parsing has focused on giving a bottom-up method of transformations that preserves semantic value (for the most part). What follows here are formal insights (expressed in constituency grammar) that message parsing invites. And, following that, I aim to tackle this question: How the hell does coordination jibe with constituency?

For those who don't read my work (that is, pretty much everybody), here's an example of a message parse, with notes on each rule following it:
  1. When did he and why did he do that? : START
  2. When did he do that and why did he do that? : (1) ADD
  3. When did he do that? Why did he do that? : (2) SPLIT
  4. Did he do that at some time? Did he do that for some reason? : (3) MOVE * 2
  5. He did do that at some time. He did do that for some reason. : (4) MOVE * 2
  6. He did do that. He did do that. : (5) CUT * 2
The ADD rule replaces elliptical elements.

The SPLIT rule divides sentences connected by coordinators.

The MOVE rule shifts and replaces constituents to do any of the following:
  • Revert a sentence to a truth-apt form, or 
  • Move a sentence's constituent to a canonically ordered position.
The CUT rule eliminates modifiers.
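
For the programmatically inclined, here is a rough Python sketch of how a parse like the one above might be recorded as a list of rule applications. The Step class and its field names are my own invention for illustration; nothing about the method itself depends on them.

  # A minimal record of a message parse: each step stores its sentence(s),
  # the rule applied, the step it derives from, and a repetition count.
  from dataclasses import dataclass
  from typing import Optional

  @dataclass
  class Step:
      sentences: list            # the sentence(s) at this stage
      rule: str                  # START, ADD, SPLIT, MOVE, or CUT
      derived_from: Optional[int] = None  # index of the parent step
      times: int = 1             # how many times the rule applies

  parse = [
      Step(["When did he and why did he do that?"], "START"),
      Step(["When did he do that and why did he do that?"], "ADD", 1),
      Step(["When did he do that?", "Why did he do that?"], "SPLIT", 2),
      Step(["Did he do that at some time?", "Did he do that for some reason?"], "MOVE", 3, times=2),
      Step(["He did do that at some time.", "He did do that for some reason."], "MOVE", 4, times=2),
      Step(["He did do that.", "He did do that."], "CUT", 5, times=2),
  ]

  # Reprints the numbered derivation from the post.
  for i, step in enumerate(parse, start=1):
      source = f"({step.derived_from}) " if step.derived_from else ""
      reps = f" * {step.times}" if step.times > 1 else ""
      print(f"{i}. {' '.join(step.sentences)} : {source}{step.rule}{reps}")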

Now, as I wrote in a tangential blog, I first devised this method to dissolve philosophical questions. Message parsing mirrors processes seen in formal proofs. I only later discovered that it also helps language learners understand syntax without metalanguage. But, could a metalanguage seamlessly capture this process? Would it reveal undiscovered constituents? I say yes with guarded optimism. Let's see if the resultant hypotheses and trees are sufficiently compelling.

To start, I'll cover some essentials of such a constituency grammar: 
  • Every constituent shall have at most two sub-constituents.
  • The grammar will contain all of the following syntactic categories (each with an accompanying phrasal node) and stand-alone phrasal nodes:
    • At the argument level:
      • Argument Phrases (GP), 
      • Determiners (D), 
      • Nouns (N).
    • At the first-order level:
      • Predicate Phrases (KP), 
      • Adjectives (J), 
      • Verbs (V), 
      • Prepositions (P).
    • At the higher-order level:
      • Inflectional Phrases (IP), 
      • Adverbs (R), 
      • Prepositions (P), 
      • Intensifiers (S), 
      • Auxiliaries (X).
    • At the operator level:
      • Complementizers (C), 
      • Relativizers (L), 
      • Coordinators (O).
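For concreteness, here is how I might encode those two essentials in Python: the category inventory as a plain dictionary and the binary-branching constraint baked into a small Node class. The class, its fields, and the little GP example at the end are my own illustrations, not commitments of the grammar.

  from dataclasses import dataclass
  from typing import Optional

  # The labels from the list above, grouped by level.
  CATEGORIES = {
      "argument":     ["GP", "D", "N"],
      "first-order":  ["KP", "J", "V", "P"],
      "higher-order": ["IP", "R", "P", "S", "X"],
      "operator":     ["C", "L", "O"],
  }

  @dataclass
  class Node:
      label: str                      # e.g. "IP", "GP", "V"
      left: Optional["Node"] = None   # at most two sub-constituents,
      right: Optional["Node"] = None  # so binary branching comes for free
      word: Optional[str] = None      # lexical content for terminal nodes

  # A guess at a simple argument phrase: a GP over a determiner and a noun.
  some_reason = Node("GP",
                     left=Node("D", word="some"),
                     right=Node("N", word="reason"))
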
At this point, I'm ready to posit some new-ish constituents. The first deals with transformations. Normally, syntax trees indicate movement into formerly empty slots. But, these empty slots are usually constituents of a sentence's syntax that happen to be empty, save for the supposed transformation. Instead, I'm going to grant a broader transformational phrase (FP). The point of the FP is to trace constituents back from their canonical positions. Every FP parents a head Fn and a tail matching the constituent that is occupied by the FP. An intrusion test can determine where FP's can arise in a given language, and I'll demonstrate as much in future posts. For now, what's more interesting is to show how FP's correspond to the MOVE rule.

The second is the coordinator phrase (OP) with very similar features. Again, it parents a head O and a tail matching the constituent that is occupied by the OP. Since coordinators form a syntactic class, the SPLIT rule is one test that can help to identify them.
(1) When did he and why did he do that?
Now, this is more than just a pretty tree. One main advantage to it is that each Fn traces from a named constituent without conflating syntactic categories. For instance, interrogative pro-forms are not complementizers. It just turns out that the F's are adjacent to them in English.

Second, every IP, CP, and OP is a sentence, as is any FP that occupies one of those three, and each of those sentences appears in the message parse.

Finally, OP sits at the top of the tree, which carries another major advantage. Normally, syntacticians keep CP's at the top of their constituency trees. In light of OP, that assessment is incomplete. From a logical perspective, this is not so surprising. Logical operators translate to coordinators more often than they translate to complementizers (with the clear exception of "→" to "if"). Also, whether the coordinator is fronted or embedded, a CP that parents an OP comes out ungrammatical:
  • And if I refuse? -- The OP parents the CP.
  • *If and I refuse? -- The CP parents the OP.
  • It's hard to say whether it's his heart or his lungs. -- The OP parents the CP's.
  • It's hard to say whether it's his heart or whether it's his lungs. -- The OP parents the CP's.
  • *It's hard to say whether it's his heart whether or it's his lungs. -- A CP parents the OP.
It turns out that any OP's head, just like any CP's head, can be empty.
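
In terms of the Node sketch from earlier, both of the new phrase types come out as ordinary binary nodes, with the head on one side and the tail (the occupied constituent) on the other. The helper functions below are mine, purely for illustration:

  from typing import Optional

  # Node is the class from the earlier sketch.

  def make_fp(index: int, tail: "Node") -> "Node":
      # An FP parents a head Fn (F1, F2, ...) and a tail matching the
      # constituent that the FP occupies; the Fn traces that constituent
      # back from its canonical position.
      return Node("FP", left=Node(f"F{index}"), right=tail)

  def make_op(coordinator: Optional[str], tail: "Node") -> "Node":
      # An OP parents a head O and a tail matching the constituent that
      # the OP occupies. Like a CP's head, the O head may be empty.
      return Node("OP", left=Node("O", word=coordinator), right=tail)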

What about conjunctions of non-constituents?

I've thought of how to investigate that issue. We can first see where the current tree fails to capture constituents by working from the un-elided sentence and then eliding words and phrases until we no longer can.
  • When did he do that and why did he do that? -- No ellipses exist.
  • When did he do and why did he do that? -- The GP parented by the first VP is elided.
  • When did he and why did he do that? -- The first VP is elided.
  • When did and why did he do that? -- No single constituent is elided.
  • When and why did he do that? -- The FP that parents F2 is elided.
So, there's no real change to the tree, and we'd need one to force constituent conjunction. To do that, certain structures, like IP ::= GP KP, would need a major overhaul, and I may tackle such a solution in the future.

The easier solution is to observe that constituents mark the two heads of a syntactic bridge. When ellipses occur, the missing material is filled in by scanning for the appropriate constituents in the built bridge (the un-elided sentence). One bridge's complete structure (heads included) allows us to rebuild the other. This works even if the incomplete bridge's constituents are filled by other words or phrases.

In this tree, three of the constituent rebuilds come from the elided constituents themselves ([GP], [VP], [FP]). The fourth is the only one of further interest: "When did and why did he do that?" We just find the first common ancestor bridged by the first elided constituent and the last elided constituent. That node is IP, so [IP] is the bridge.
When did and why did he do that?
IP31 helps rebuild IP30.
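
To make the bridge procedure a little more concrete, here is a sketch, again reusing the Node class from earlier, of finding the first common ancestor of the first and last elided constituents in the built (un-elided) bridge. The function names are mine; the procedure is just the one described above.

  from typing import Optional

  # Node is the class from the earlier sketch.

  def path_to(node: "Node", target: "Node", path=None) -> Optional[list]:
      # Return the chain of nodes from this node down to the target, if any.
      path = (path or []) + [node]
      if node is target:
          return path
      for child in (node.left, node.right):
          if child is not None:
              found = path_to(child, target, path)
              if found:
                  return found
      return None

  def find_bridge(root: "Node", first_elided: "Node", last_elided: "Node") -> "Node":
      # The bridge is the first common ancestor of the first and last
      # elided constituents; its complete structure (heads included) is
      # what lets us rebuild the elided half of the sentence.
      a = path_to(root, first_elided)
      b = path_to(root, last_elided)
      if a is None or b is None:
          raise ValueError("both constituents must appear in the tree")
      bridge = root
      for x, y in zip(a, b):
          if x is y:
              bridge = x
          else:
              break
      return bridge

This is just the usual lowest-common-ancestor computation, so nothing about it is specific to English or to this particular grammar.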
