The Handbook of NLP with Gensim - By Chris Kuo

NLU + NLG = NLP

NLP is an umbrella term that covers natural language understanding (NLU) and natural language generation (NLG). We’ll go through both in the next sections.

NLU

Many languages, such as English, German, and Chinese, have been developing for hundreds of years and continue to evolve. Humans use language artfully in various social contexts. Now we are asking a computer to understand human language, and what seems rudimentary to us may not be apparent to a computer. Linguists have contributed much to computers’ understanding of language in terms of syntax, semantics, phonology, morphology, and pragmatics.

NLU focuses on understanding the meaning of human language. It takes text or speech as input and then analyzes its syntax, semantics, phonology, morphology, and pragmatics. Let’s briefly go over each one:

Syntax: This is the study of how words are arranged to form phrases, clauses, and sentences, including word order and the use of punctuation.

Semantics: This is about the possible meanings of a sentence based on the interactions between the words in it. It is concerned with the interpretation of language, rather than its form or structure. For example, the word “table” as a noun can refer to “a piece of furniture having a smooth flat top that is usually supported by one or more vertical legs” or to a data frame in a computer language. NLU can capture the different meanings of a word like this through a technique called word embedding.

Phonology: This is the study of the sound system of a language, including the sounds of speech (phonemes), how they are combined to form words, and how they are organized into larger units such as syllables and stress patterns. For example, the sounds represented by the letters “p” and “b” in English are distinct phonemes. A phoneme is the smallest unit of sound in a language that can change the meaning of a word.
Consider the words “pat” and “bat.” The only difference between these two words is the initial sound, but their meanings are different.

Morphology: This is the study of the structure of words, including the way in which they are formed from smaller units of meaning called morphemes. The term comes from “morph,” meaning shape or form, and “-ology,” meaning the study of something. Morphology is important because it helps us understand how words are formed, how they relate to other words in a language, and how they change over time. For example, the word “unkindness” consists of three separate morphemes: the prefix “un-,” the root “kind,” and the suffix “-ness.”

Pragmatics: This is the study of how language is used in a social context. Pragmatics is important because it helps us understand how language works in real-world situations, and how it can be used to convey meaning and achieve specific purposes. For example, if you offer to buy your friend a McDonald’s burger, a large fries, and a large drink, your friend may reply “no” because he is concerned about gaining weight. He may simply mean that the meal is high in calories, but in a social context the reply can also imply that he considers himself overweight.

Now, let’s understand NLG.

NLG

While NLU is about a computer reading and comprehending language, NLG is about a computer writing it. The term generation in NLG refers to an NLP model generating meaningful words or even articles. Today, when you compose an email or type a sentence in an app, it suggests possible words to complete your sentence or performs automatic correction. These are applications of NLG.

This content is from the book The Handbook of NLP with Gensim - By Chris Kuo (Oct 2023).
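
The sentence completion mentioned in the NLG section can be illustrated with a toy example. The following is a minimal Python sketch and not the book's method: it assumes a simple bigram count model (suggest the words most often seen after the current word), and the corpus and the function names train_bigrams and suggest_next are hypothetical, made up for illustration. Real autocomplete systems rely on far larger models and far more data.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count, for each word, which words follow it in the training sentences."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for current_word, next_word in zip(tokens, tokens[1:]):
            followers[current_word][next_word] += 1
    return followers


def suggest_next(followers, word, k=3):
    """Return up to k words most often seen after `word`, most frequent first."""
    counts = followers.get(word.lower(), Counter())
    return [candidate for candidate, _ in counts.most_common(k)]


# Hypothetical toy corpus, made up for illustration only.
corpus = [
    "thank you for your email",
    "thank you for your help",
    "thank you for the update",
    "see you soon",
]

model = train_bigrams(corpus)
print(suggest_next(model, "for"))    # ['your', 'the']
print(suggest_next(model, "thank"))  # ['you']
```

In this sketch, a word that never appeared in the corpus simply yields an empty list of suggestions, which is one reason practical systems need much more training text.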