2018-09-12

Language Without Metalanguage 2:
Breadth Over Depth

Approaching Languages More Logically

Before I got into languages professionally, I trained to become a logician. This is why I titled this blog series as I did. The distinction between object languages (OL) and metalanguages (ML) exists in both fields. However, when it comes to using maths and logics (i.e., formal languages) or natural languages well, the distinction is irrelevant, despite what many language instructors say.

This false belief and bad practice arise for many reasons. I've discussed economic and social reasons previously. There's yet another reason, a cognitive one: language instructors confuse linguistic ML's value with logical ML's value. You see, logical ML is all about questions of depth. It checks whether a logic is consistent, complete, compact, decidable, sound, and the like. These are important things to have in a logic. They show how far and how reliably you can use a logic's OL. However, once someone has proved an OL's worth through an ML, the rest of us can just use the OL. It's the division of labor. The nerds do the deep stuff so that we can do the broad stuff.

Linguistic ML, however, focuses on deriving all and only the well-formed formulas (WFF's) of natural languages. Logics have WFF rules, and by rules, I mean top-down definitions. These definitions say what counts as a "sentence" in a logic's OL. In fact, in logic, WFF's must be defined, or else there is no OL. This clearly isn't the case for natural languages, though. French doesn't cease to exist if we produce no WFF rules for it. Natural languages are bottom-up. Their WFF rules are in the brains of native speakers. A linguistic ML just tries to say explicitly what they are.
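
To make "top-down" concrete, here is a minimal sketch of such a definition for a bare propositional logic, written as a recursive check over nested tuples. The encoding and the connective names are illustrative choices of mine, not anything a learner needs:

ATOMS = {"p", "q", "r"}  # the propositional letters: the base clause of the definition

def is_wff(formula):
    """Return True exactly when `formula` counts as a sentence of this toy OL."""
    if formula in ATOMS:                                      # clause 1: atoms are WFF's
        return True
    if isinstance(formula, tuple):
        if len(formula) == 2 and formula[0] == "not":         # clause 2: negation of a WFF
            return is_wff(formula[1])
        if len(formula) == 3 and formula[0] in {"and", "or", "implies"}:
            return is_wff(formula[1]) and is_wff(formula[2])  # clause 3: binary connectives
    return False                                              # clause 4: nothing else is a WFF

print(is_wff(("implies", ("and", "p", "q"), "p")))  # True: a generated "sentence"
print(is_wff(("runs", "John")))                     # False: falls outside the definition

Clauses like these generate every "sentence" the logic has, and only those. Nobody can write the analogous clauses exhaustively for French.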

Now, linguistic ML might be valuable to language learners if we definitively knew the rules (we don't) and if we could program humans like machines (we can't). We can't reverse-engineer this part of our human endowment, for various reasons. For one, it's a plain empirical fact that no one is taught their first languages. Data further suggests that we can't be taught second languages. We can learn them, but not with ML-directed instruction. Why? Because, like in logic, language learners must have an OL before any ML can make sense to them:
"This hypothesis [that we learn languages by imitation] would not account for the many instances when adults do not coach their children in language skills. Positive reinforcement doesn't seem to speed up the language acquisition process. Children do not respond to or produce metalanguage until 3 or 4, after the main portion of the grammar has been mastered."
The facts, face them!
This explains the deep stupidity among most language instructors. They pretend that a linguistic ML is as definitive as logical WFF rules. Then, they force learners to retain a linguistic ML before they have a functioning object language. Why? Because that's how teachers teach logic or math. They teach the WFF rules, show some examples, and drill students with proofs or computations. Most of us can add or do deductions thanks to such instruction. With natural language instruction, though, this approach falls flat on its ass. Natural languages aren't "harder". Your language instructors are just fucking deluded. It's why 99% of people who are taught arithmetic can do arithmetic fluidly. It's why 99% of babies easily acquire their first languages (i.e., they just ignore linguistic ML). And finally, it's why only about 5% of people who are taught foreign languages actually become fluent in them.

That should convince anyone to abandon any ML-directed approach. However, it leaves us in a bind. We need to understand an OL, and we can't gain it by being taught an ML. It also seems that acquisition via mere exposure (i.e., immersive learning) wanes as we age. So, if you're prepubescent, you may have a decent chance.

But, what about the rest of us?

To answer that question, we have to know a little-known trick. Formal languages borrow universal features from natural languages. In logic, though, this borrowing remains incomplete. No logic on Earth can create WFF's for every sentence of a natural language. Logical WFF's only formalize some subset of natural OL sentences. As logicians have progressed, they've created OL's that "increase logics' expressiveness." Gottlob Frege invented a more expressive first-order predicate logic after millennia without a solution to the problem of multiple generality. C.I. Lewis did something similar with the first modal logic. Even so, logical WFF's are still far from matching what we can deduce in natural languages. For instance, there is still no generally accepted logic which can handle second-order inferences well (I sketch one way to formalize the asymmetry right after the examples):
  • "John runs fast," implies, "John runs."
  • But, "John does not run fast," does not imply, "John does not run." He may just run slowly.
While I and others have invented logical OL's that handle the above issue, they wouldn't help a language learner. They still wouldn't cover every natural language. And it wouldn't matter if they did. Replacing a linguistic ML with a logical ML would be more correct and more universal, but it would require far more ML-directed instruction. If you think verb conjugation and agreement rule drills suck, imagine writing strings to capture the features of any natural language in a pristine formalism! Or, just imagine shooting yourself in the face. It's about the same.


Not all hope is lost, though, because one bit of logical ML informs an OL-driven approach. Among logical WFF's, some are atomic formulas, or "atomic sentences". There are only two things for a language learner to know about them:
  1. An OL's atomic sentences are its smallest possible sentences.
  2. All other sentences in a language are connected, transformed, or expanded atomic sentences.
Now, in my estimation, there are eight (or, with quantification, 28) atomic sentences for all natural languages. Luckily, there's no need for learners to memorize them. We only need to learn steps to pull them out of other sentences. This is the essence of message parsing. (For a taste of what that means, look at the French example further down: "Quelque chose généralise la douleur" is atomic, "La douleur est généralisée par quelque chose" is that atom transformed into the passive, and "Une douleur et de la fatigue caractérisent l’infection" is two atoms connected.)

Finally, you get to the good stuff! How do I do those "message parses"?

I can't fit everything about them here, but their essentials are very simple. There are just four steps, and the third step contains an optional sub-step:
  1. ADD (ellipses).
  2. MOVE (transformations into a canonical word order, if one exists).
  3. SPLIT (sentences with connectives).
    1. SWAP (pro-forms with their referents, referents with pro-forms, or connectives with second-order modifiers).
  4. CUT (modifiers).
And, with these steps, there are two rules (a rough sketch of how they interact comes after them):
  1. Do the step below only if you can't do any steps above.
  2. Do the steps above only if they help you do a step below.
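
Here is that sketch: a rough rendering of the two rules as control flow, assuming each step is a function that returns the smaller pieces it produced, or None when it doesn't apply. The step functions are toy stand-ins keyed to one English sentence, not a real parser; only the ordering logic matters.

def add(sentence):
    return None  # nothing elided in the toy sentence

def move(sentence):
    return None  # already in a canonical word order

def split(sentence):
    # SPLIT on a sentence-level connective
    if " and " in sentence:
        return [piece.strip() for piece in sentence.split(" and ")]
    return None

def cut(sentence):
    # CUT a modifier (toy lexicon of two adverbs)
    for adverb in (" fast", " slowly"):
        if adverb in sentence:
            return [sentence.replace(adverb, "")]
    return None

STEPS = [("ADD", add), ("MOVE", move), ("SPLIT", split), ("CUT", cut)]

def parse(sentence):
    """Rule 1: only do a lower step when nothing above it applies; recurse on the pieces.
    (Rule 2, only doing ADD/MOVE/SWAP when they set up a SPLIT or CUT, is left out
    to keep the sketch small.)"""
    for name, step in STEPS:
        pieces = step(sentence)
        if pieces:
            print(name + ":", sentence, "->", pieces)
            results = []
            for piece in pieces:
                results.extend(parse(piece))
            return results
    return [sentence]  # no step applies: the sentence is (close to) atomic

print(parse("John runs fast and Mary walks slowly."))
# SPLIT fires first, then CUT trims each piece, leaving the near-atomic sentences.
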
Through these steps, an unparsed sentence's meaning becomes clearer. To demonstrate, I took a French sentence out of my PollyGot database. I don't know French. However, with some work, I can figure out its message parse in a few minutes:
Note: The atomic sentences are the smallest ones in the parse below, the ones no step can break down any further.
I can then translate every parsed sentence, working bottom-up until I've translated the original, unparsed sentence:

Quelque chose généralise la douleur. Something generalizes the pain.
De la fatigue caractérise l’infection. Fatigue characterizes the infection.
Une douleur caractérise l’infection. A pain characterizes the infection.
La douleur est généralisée par quelque chose. The pain is generalized by something.
Une douleur et de la fatigue caractérisent l’infection. Pain and fatigue characterize the infection.
De la fièvre caractérise l’infection. Fever characterizes the infection.
La douleur est généralisée. The pain is generalized.
De la fièvre, une douleur et de la fatigue caractérisent l’infection. Fever, pain, and fatigue characterize the infection.
De la fièvre, une douleur qui est généralisée et de la fatigue caractérisent l’infection. Fever, pain that is generalized, and fatigue characterize the infection.
L’infection est caractérisée par de la fièvre, une douleur qui est généralisée et de la fatigue. The infection is characterized by fever, pain that is generalized, and fatigue.
La grippe est une infection. Influenza is an infection.
La grippe est une infection qui est caractérisée par de la fièvre, une douleur qui est généralisée et de la fatigue. Influenza is an infection that is characterized by fever, pain that is generalized, and fatigue.
La grippe est une infection caractérisée par de la fièvre, une douleur généralisée et de la fatigue. Influenza is an infection characterized by fever, generalized pain, and fatigue.

Normally, though, I don't parse sentences completely. I only parse them far enough to understand them. That rarely requires going down to the atomic level.

Wait, how can you know the OL's syntax well enough to do such a parse?

That will also be the topic of a later post. For now, all that matters is this: I didn't consult a grammar book. I looked for words that would help me do the steps. I used my knowledge of my natural languages, a bit of logical sense, an online dictionary, some machine translations, and searches for substrings. They guided the parse and helped me check my parsed sentences' grammar.
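
For the substring searches, the idea is simple: if a string of words I'm unsure about shows up verbatim in text I trust, it's probably well-formed. A crude sketch, with one sentence standing in for what were really quoted searches and dictionary examples:

# The "corpus" here is a stand-in; in practice it was quoted search queries
# and dictionary example sentences, not one string.
corpus = ("La grippe est une infection caractérisée par de la fièvre, "
          "une douleur généralisée et de la fatigue.")

def attested(phrase):
    """Has this exact string of words shown up in text I trust?"""
    return phrase.lower() in corpus.lower()

print(attested("caractérisée par"))  # True: the pattern I guessed at is attested
print(attested("caractérisé par"))   # False here: the agreement looks off, so double-check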

Best of all, it gradually became automatic. I grew a sense of when and where to apply these rules and checks. It got me through my third language. It's getting me through my fourth, fifth, and sixth languages. I'm not doing it obsessively, either. I just use this method when a sentence confuses me. That, too, becomes less and less frequent. That's why I impart this moral: When it comes to language learning, a breadth of experience is worth much more than a depth of analysis. So, if you are starting with your second language, you can still message-parse sentences from your first language. That practice can show you where and how to add, move, split, swap, and cut sentences in other languages.

On the other hand, you could be this joke. That's always fun.
