AI - Natural Language Processing

Natural Language Processing (NLP) Explained


What is Natural Language Processing?

Natural Language Processing (NLP) is an AI method that allows intelligent systems to communicate in a natural language such as English. Natural language must be processed when you want a system, such as a robot, to follow your instructions, or when you seek advice from a dialogue-based clinical expert system.


NLP focuses on allowing computers to carry out useful tasks using the natural languages that humans speak. An NLP system can take input and produce output in the form of speech or written text.


Key Terms in NLP

These are some important concepts and terms related to natural language processing:


• Phonology - This is the study of how sounds are arranged systematically.

• Morphology - This involves studying how words are formed from basic meaningful units.

• Morpheme - This is the smallest unit of meaning in a language.

• Syntax - This refers to the arrangement of words to create sentences and includes understanding the structural roles of words in sentences and phrases.

• Semantics - This is about the meaning of words and how to combine them into meaningful phrases and sentences.

• Pragmatics - This deals with how sentences are used and understood in various contexts and how their interpretation can change.

• Discourse - This examines how the previous sentence can influence the understanding of the next one.

• World Knowledge - This encompasses general knowledge about the world.


Techniques in NLP

NLP techniques are methods and algorithms used to process, analyze, and understand human language data. Some common techniques in NLP include:

• Tokenization - This technique involves breaking a sentence or phrase into smaller units called tokens.

• Part-of-Speech Tagging - This technique labels each word in a sentence with its part of speech (for example, noun, verb, or adjective).

• Named Entity Recognition (NER) - This NLP technique identifies named entities in text, including people, organizations, locations, dates, and more.

• Sentiment Analysis - This NLP technique assesses the sentiment conveyed in a text.
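
As a rough illustration of the first and last techniques above, here is a minimal sketch in plain Python. The word lists POSITIVE and NEGATIVE and the function names are assumptions made for this example, not part of any library; production systems usually rely on trained models rather than hand-written word lists.

```python
import re

# Hypothetical toy word lists; real systems learn polarity from data.
POSITIVE = {"good", "great", "helpful", "love"}
NEGATIVE = {"bad", "slow", "confusing", "hate"}


def tokenize(text):
    """Split text into lowercase word tokens and single punctuation tokens."""
    return re.findall(r"[a-z0-9']+|[^\sa-z0-9']", text.lower())


def sentiment_score(tokens):
    """Return the number of positive tokens minus the number of negative tokens."""
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


tokens = tokenize("The robot is helpful, but the manual is confusing and slow.")
print(tokens)
print(sentiment_score(tokens))  # -1: one positive word, two negative words
```

A score below zero suggests negative sentiment and a score above zero suggests positive sentiment. This counting approach ignores negation ("not good") and context, which is one reason practical sentiment analysis uses statistical models.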


Steps in NLP

To understand and analyze written and spoken language, an NLP system typically works through the following five steps:


• Lexical Analysis - This step involves identifying and analyzing the structure of words. The lexicon of a language is its collection of words and phrases. Lexical analysis breaks the text down into paragraphs, sentences, and words.


• Syntactic Analysis (Parsing) - This step analyzes the words in a sentence for grammatical correctness and arranges them to show their relationships. For example, the sentence "The school goes to boy" would be rejected by an English syntactic analyzer.


• Semantic Analysis - This step extracts the exact or dictionary meaning from the text. It checks the text for meaningfulness by mapping syntactic structures to objects in the relevant domain. The semantic analyzer rejects nonsensical phrases like "hot ice-cream."


• Discourse Integration - The meaning of a sentence often depends on the sentence that precedes it, and it in turn influences the meaning of the sentence that follows.


• Pragmatic Analysis - This step reinterprets what was said to understand its actual meaning. It involves deriving the aspects of language that require real-world knowledge.
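
The first two steps can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the tiny LEXICON, the function names, and the toy grammar (S -> NP VP, NP -> Det N, VP -> V or V PP, PP -> P NP). Under this toy grammar, "The school goes to boy" is rejected because "boy" has no determiner; a real parser uses a far larger grammar or a trained model.

```python
import re

# Hypothetical toy lexicon: word -> part of speech.
LEXICON = {
    "the": "Det", "a": "Det",
    "boy": "N", "school": "N",
    "goes": "V", "walks": "V",
    "to": "P",
}


def lexical_analysis(text):
    """Split text into sentences, then split each sentence into lowercase words."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    return [re.findall(r"[a-z]+", s.lower()) for s in sentences]


def is_grammatical(words):
    """Check words against S -> NP VP, NP -> Det N, VP -> V | V PP, PP -> P NP."""
    tags = [LEXICON.get(w) for w in words]

    def noun_phrase(i):
        # Return the index just past a noun phrase starting at i, or None.
        return i + 2 if tags[i:i + 2] == ["Det", "N"] else None

    i = noun_phrase(0)
    if i is None or tags[i:i + 1] != ["V"]:
        return False
    i += 1
    if i == len(tags):      # VP -> V
        return True
    if tags[i] != "P":      # VP -> V PP must continue with a preposition
        return False
    return noun_phrase(i + 1) == len(tags)


for words in lexical_analysis("The boy goes to the school. The school goes to boy."):
    print(words, is_grammatical(words))
```

The first sentence is accepted and the second is rejected. Words missing from the lexicon get no tag, so any sentence containing them is also rejected by this sketch.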


Components of NLP

There are two main components of NLP:


• Natural Language Understanding (NLU)

NLU allows machines to understand and analyze human language by extracting metadata from content, which includes concepts, entities, keywords, emotions, relationships, and semantic roles. Understanding involves the following tasks:

  • Mapping the input in natural language into useful representations.
  • Analyzing various aspects of the language.

• Natural Language Generation (NLG)

NLG is the process of producing meaningful phrases and sentences in natural language from an internal representation.

It involves −

• Text planning − This includes collecting relevant content from a knowledge base.

• Sentence planning − This involves selecting necessary words, creating meaningful phrases, and establishing the tone of the sentence.

• Text Realization − This is the process of converting the sentence plan into a sentence structure.
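
To make the two components concrete, here is a minimal Python sketch. It assumes a hypothetical robot-command domain in which every command has the shape "<action> the <object> to the <destination>"; the function names, the dictionary keys, and the fixed wording "is now on" are all invented for this example. Real NLU and NLG systems handle far more varied language.

```python
import re


def understand(command):
    """NLU: map a command such as 'Move the box to the shelf.' to a dict."""
    pattern = r"(\w+) the (\w+) to the (\w+)\.?"
    match = re.fullmatch(pattern, command.strip().lower())
    if match is None:
        return None
    action, obj, destination = match.groups()
    return {"action": action, "object": obj, "destination": destination}


def plan_text(knowledge_base, topic):
    """Text planning: collect the facts that concern the topic."""
    return [fact for fact in knowledge_base if fact["object"] == topic]


def plan_sentence(fact):
    """Sentence planning: choose the words that fill each slot."""
    return {
        "subject": f"the {fact['object']}",
        "verb": "is now on",
        "place": f"the {fact['destination']}",
    }


def realize(plan):
    """Text realization: turn the sentence plan into a finished sentence."""
    sentence = f"{plan['subject']} {plan['verb']} {plan['place']}."
    return sentence[0].upper() + sentence[1:]


representation = understand("Move the box to the shelf.")
facts = plan_text([representation], "box")
print(realize(plan_sentence(facts[0])))  # The box is now on the shelf.
```

The dictionary returned by understand plays the role of the internal representation: NLU produces it from text, and the three NLG stages turn it back into text.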

Challenges in NLP

Natural Language Processing faces many challenges because human language is complex and varied. The most frequent challenge is ambiguity, where an expression can be understood in more than one way. It arises at several levels −

• Lexical Ambiguity − This occurs at the level of individual words. For instance, is the word "board" being used as a noun or a verb?

• Syntax-Level Ambiguity − A sentence can be parsed in more than one way. For example, "He lifted the beetle with a red cap." − Did he use the cap to lift the beetle, or did he lift a beetle that had a red cap?

• Referential Ambiguity − This arises when it is unclear what a pronoun refers to. For example, "Rima went to Gauri. She said, 'I am tired.'" − Who exactly is tired?
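
Lexical ambiguity is the easiest of these to detect mechanically. The sketch below assumes a hypothetical toy LEXICON that lists every part of speech a word can take, and reports the words in a sentence that have more than one. Choosing the right reading in context is the harder problem and is not attempted here.

```python
# Hypothetical toy lexicon: each word maps to every part of speech it can take.
LEXICON = {
    "board": {"noun", "verb"},
    "plane": {"noun", "verb"},
    "passengers": {"noun"},
    "the": {"determiner"},
}


def ambiguous_words(sentence):
    """Return each word that has more than one possible part of speech."""
    words = sentence.lower().rstrip(".").split()
    return {w: sorted(LEXICON[w]) for w in words if len(LEXICON.get(w, ())) > 1}


print(ambiguous_words("Passengers board the plane."))
# {'board': ['noun', 'verb'], 'plane': ['noun', 'verb']}
```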