Built with cutting-edge Natural Language Processing techniques, virtual assistants such as Alexa, Cortana and Siri have become quite popular. We can hold conversations with them because they understand our instructions and then execute them or respond appropriately.
For example:
“Siri where is the nearest Pizza place?”
From this, Siri is able to understand that this is a ‘where’ question, that the dimension is ‘nearest’ and that the keyword is ‘pizza’. From that, it is able to return the expected result, such as:
“The nearest Pizza Place is 100 meters away on Greenfield Road.”
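As a rough illustration of that breakdown, here is a minimal Python sketch. It is not how Siri actually works: the word lists and the `parse_query` function are hypothetical, and a real assistant would rely on trained models rather than hand-written lookups.

```python
import re

# Hypothetical lookup tables for this sketch only.
QUESTION_WORDS = {"where", "what", "when", "who", "how"}
DIMENSIONS = {"nearest", "cheapest", "best"}
STOP_WORDS = {"siri", "is", "the", "a", "an", "place"}
IGNORED = QUESTION_WORDS | DIMENSIONS | STOP_WORDS


def parse_query(text):
    """Split a query into question type, dimension and keywords."""
    words = re.findall(r"[a-z]+", text.lower())
    question = next((w for w in words if w in QUESTION_WORDS), None)
    dimension = next((w for w in words if w in DIMENSIONS), None)
    keywords = [w for w in words if w not in IGNORED]
    return {"question": question, "dimension": dimension, "keywords": keywords}


print(parse_query("Siri where is the nearest Pizza place?"))
# {'question': 'where', 'dimension': 'nearest', 'keywords': ['pizza']}
```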
Alexa, Cortana and Siri are built on Natural Language Processing systems which enable them to capture the data, break it down and act on the instructions.
In this article we will dive into:
- What is Natural Language Processing?
- Applications of NLP
- Steps of NLP
- Resources for Learning NLP
What is Natural Language Processing?
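Natural language refers to the way we humans communicate with each other, for example through text and speech. Natural Language Processing (NLP) is the field concerned with enabling computers to understand and act on that language.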
Applications of NLP
- Auto-predict, e.g. in Gmail
- Auto-correct, e.g. Google Docs, MS Word
- Speech recognition, e.g. Alexa
- Machine translation, e.g. Google Translate
- Sentiment analysis
- Chatbots
- Spell checking
- Keyword searching
- Spam filtering, e.g. Gmail
- Advertising matching
- Information retrieval
Steps to Natural Language Processing
- Tokenization
- Word Normalization
- POS tags
- Named Entity Recognition
- Chunking
Tokenization
This is the process of breaking data into small units called ‘tokens’.
For example:
Imagine that you are instructing your Google Assistant to open the Play Store. You would give it the instruction:
Open PlayStore
Google would then break down the command into the tokens:
Open, PlayStore
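Tokenization can be sketched in a few lines of Python. This is a simple regular-expression approach, not what Google Assistant actually uses; the `tokenize` function name is made up for this example.

```python
import re


def tokenize(text):
    """Split text into word tokens and standalone punctuation tokens."""
    # \w+ matches a run of letters/digits; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text)


print(tokenize("Open PlayStore"))
# ['Open', 'PlayStore']
print(tokenize("Where is the nearest smoothie shop?"))
# ['Where', 'is', 'the', 'nearest', 'smoothie', 'shop', '?']
```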
Word Normalization
In this step the machine prepares text, words, and documents for further processing.
It can be defined as the process of stripping additions and variations from a word to reach a root form that the machine can recognize.
It does this using two methods:
- Stemming
- Lemmatization
Stemming
This is the process of normalizing words into their root form.
It involves removing affixes from a root word.
For example:
Security, unsecure, securing, secured, securer
are all derived from the word:
Secure
Lemmatization
Different inflected forms of a word are grouped together under a single base (dictionary) form called the lemma.
For example:
Gone, going, went
are grouped into:
Go
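Unlike stemming, lemmatization usually relies on a vocabulary lookup, because forms such as "went" cannot be reached by trimming affixes. A minimal sketch, assuming a tiny hand-written dictionary (`LEMMAS` and `lemmatize` are hypothetical names, and real lemmatizers use full lexicons):

```python
# Hypothetical mini-dictionary mapping inflected forms to their lemma.
LEMMAS = {"gone": "go", "going": "go", "went": "go", "goes": "go"}


def lemmatize(word):
    """Return the lemma if the word is known, otherwise the lowercased word."""
    word = word.lower()
    return LEMMAS.get(word, word)


print([lemmatize(w) for w in ["Gone", "Going", "Went"]])
# ['go', 'go', 'go']
```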
POS Tags
In this step each word is assigned a part-of-speech (POS) tag that indicates how the word functions, both in meaning and grammatically, within the sentence.
For example:
Where is the nearest smoothie shop?
Where (adverb), is (verb), the (determiner), nearest (adjective), smoothie (noun), shop (noun)
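The tagging step can be sketched as a dictionary lookup. This is a deliberately naive illustration: the `LEXICON` table, the simplified tag names and the rule of treating unknown words as nouns are all assumptions made for this example, while real taggers are trained on annotated text and use the surrounding context.

```python
import re

# Hypothetical mini-lexicon with simplified tag names.
LEXICON = {
    "where": "ADV", "is": "VERB", "the": "DET",
    "nearest": "ADJ", "smoothie": "NOUN", "shop": "NOUN",
}


def pos_tag(sentence):
    """Tag each token by lexicon lookup; unknown words default to NOUN."""
    tagged = []
    for token in re.findall(r"\w+|[^\w\s]", sentence):
        if re.match(r"\w", token):
            tag = LEXICON.get(token.lower(), "NOUN")
        else:
            tag = "PUNCT"
        tagged.append((token, tag))
    return tagged


print(pos_tag("Where is the nearest smoothie shop?"))
# [('Where', 'ADV'), ('is', 'VERB'), ('the', 'DET'), ('nearest', 'ADJ'),
#  ('smoothie', 'NOUN'), ('shop', 'NOUN'), ('?', 'PUNCT')]
```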
Named Entity Recognition
In this step the computer reads through the information and detects the named entities.
Named Entities are particular terms that represent specific entities that are informative and have a unique context.
For example:
Names of people, companies, quantities and occasions.
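A very rough way to detect such entities is to combine a list of known names (a gazetteer) with a pattern for quantities. The sketch below assumes a hypothetical `GAZETTEER` and a small set of units; production systems typically use statistical or neural models instead.

```python
import re

# Hypothetical gazetteer (list of known names) for this sketch.
GAZETTEER = {"Greenfield Road": "LOCATION", "Siri": "PRODUCT"}
# A number followed by one of a few assumed units.
QUANTITY = re.compile(r"\b\d+(?:\.\d+)?\s*(?:meters|kilometers|miles)\b")


def find_entities(text):
    """Return (entity, label) pairs in the order they appear in the text."""
    found = []
    for name, label in GAZETTEER.items():
        for match in re.finditer(re.escape(name), text):
            found.append((match.start(), match.group(), label))
    for match in QUANTITY.finditer(text):
        found.append((match.start(), match.group(), "QUANTITY"))
    return [(entity, label) for _, entity, label in sorted(found)]


print(find_entities("The nearest Pizza Place is 100 meters away on Greenfield Road."))
# [('100 meters', 'QUANTITY'), ('Greenfield Road', 'LOCATION')]
```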
Chunking
Chunking is the process of extracting phrases from unstructured text. This means analyzing a sentence to identify its constituents (noun groups, verbs, verb groups, etc.).
For example:
The quick, brown fox jumped over the lazy dog.
Noun phrase: “The quick, brown fox”
Verb phrase: “jumped over the lazy dog”
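Chunking usually runs on top of POS tags. The sketch below pulls out noun phrases from an already tagged sentence, assuming a simplified tag set and a single hand-written pattern (an optional determiner, any number of adjectives, then one or more nouns). The names `TAG_CODES` and `noun_phrases` are hypothetical, punctuation is left out of the input for brevity, and the second phrase it finds, "the lazy dog", sits inside the sentence's verb phrase.

```python
import re

# Map each tag of interest to one character so a regex can scan the sequence.
TAG_CODES = {"DET": "D", "ADJ": "A", "NOUN": "N"}
# Optional determiner, any number of adjectives, one or more nouns.
NOUN_PHRASE = re.compile(r"D?A*N+")


def noun_phrases(tagged_tokens):
    """Return noun-phrase chunks from a list of (word, tag) pairs."""
    # Tags outside TAG_CODES become "O" (other) and break a chunk.
    codes = "".join(TAG_CODES.get(tag, "O") for _, tag in tagged_tokens)
    chunks = []
    for match in NOUN_PHRASE.finditer(codes):
        words = [word for word, _ in tagged_tokens[match.start():match.end()]]
        chunks.append(" ".join(words))
    return chunks


tagged = [
    ("The", "DET"), ("quick", "ADJ"), ("brown", "ADJ"), ("fox", "NOUN"),
    ("jumped", "VERB"), ("over", "ADP"),
    ("the", "DET"), ("lazy", "ADJ"), ("dog", "NOUN"),
]
print(noun_phrases(tagged))
# ['The quick brown fox', 'the lazy dog']
```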
Resources
Planning on working on any NLP projects?
- NLTK (Natural Language Toolkit) is a popular open-source package for Python. Instead of making you build every tool from scratch, it provides ready-made tools for most common NLP tasks.
- Neural Network Methods in Natural Language Processing (Synthesis Lectures on Human Language Technologies) by Yoav Goldberg is a great book that goes into the application of neural network models to natural language data.
- Edureka is a great YouTube channel for programming tutorials, and I recommend their Natural Language Processing (NLP) & Text Mining Tutorial Using NLTK.