Discuss spaCy in detail?

spaCy is one of the most widely used text analysis libraries for Python. It is designed for large-scale information extraction and is among the faster NLP libraries available. It is also a common choice for preparing text for deep learning. For tagging, spaCy is generally faster than the NLTK tagger and TextBlob, and often more accurate.

How to Install?

pip install spacy
python -m spacy download en_core_web_sm

Top Features of spaCy:

  1. Non-destructive tokenization

  2. Named entity recognition

  3. Support for 49+ languages

  4. 16 statistical models for 9 languages

  5. Pre-trained word vectors

  6. Part-of-speech tagging

  7. Labeled dependency parsing

  8. Syntax-driven sentence segmentation
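Non-destructive tokenization, the first feature listed above, means that the original string can be rebuilt exactly from the tokens, whitespace included. The snippet below is a minimal sketch of that property. It assumes spaCy is installed and uses spacy.blank("en"), a tokenizer-only English pipeline, so no model download is needed; the sample sentence is made up for illustration.

```python
import spacy

# Tokenizer-only English pipeline; no trained model is required for this check
nlp_blank = spacy.blank("en")

# Illustrative sample text (note the double space, which must survive)
text = "Let's meet at 10 a.m.  in New York!"
doc = nlp_blank(text)

# text_with_ws is the token text plus any whitespace that followed it
rebuilt = "".join(token.text_with_ws for token in doc)

# idx is the character offset of each token in the original string
for token in doc:
    print(token.idx, repr(token.text))

print("Round trip OK:", rebuilt == text)
```

Because every token keeps its character offset and trailing whitespace, results such as entities or tags can always be mapped back to positions in the original text.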

Import and Load Library:

import spacy
  
# Requires the model package: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
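If the model package has not been downloaded, spacy.load raises an OSError. The sketch below shows one way to handle that case; load_pipeline is a hypothetical helper name, not part of spaCy, and falling back to a blank English pipeline is an assumption made here so the script keeps running.

```python
import spacy


def load_pipeline(name="en_core_web_sm"):
    """Load a packaged pipeline, or fall back to a blank English one.

    Hypothetical helper for illustration; not part of spaCy itself.
    """
    try:
        return spacy.load(name)
    except OSError:
        # spacy.load raises OSError when the package is not installed
        print(f"'{name}' not found; run: python -m spacy download {name}")
        # The blank pipeline only tokenizes: no tagger, parser or NER
        return spacy.blank("en")


nlp = load_pipeline()

# List the components that will run on each document
print("Components:", nlp.pipe_names)
```

With the model installed, the printed list should include components such as the tagger, parser and NER; with the fallback it is empty, and attributes like token.pos_ stay unset.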

POS-Tagging for Reviews:

Part-of-speech (POS) tagging labels each word in a text as a noun, verb, adjective, adverb, and so on.

import spacy

# Load the small English pipeline: tokenizer, tagger,
# parser and NER (the "sm" package ships without word vectors)
nlp = spacy.load("en_core_web_sm")

# Process the whole document
text = ("""My name is Shaurya Uppal.
I enjoy writing articles on GeeksforGeeks checkout
my other article by going to my profile section.""")

doc = nlp(text)

# Print each token with its coarse-grained POS tag
for token in doc:
    print(token.text, token.pos_)

# Collect the tokens tagged as verbs
print("Verbs:", [token.text for token in doc if token.pos_ == "VERB"])
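The values returned by token.pos_ are short coarse-grained labels such as VERB or PROPN. When a label is unfamiliar, spacy.explain can look up a plain-English description. The sketch below assumes only that spaCy is installed; the list of tags is a small hand-picked sample, not the full tag set.

```python
import spacy

# A hand-picked sample of coarse-grained tags that token.pos_ can return
sample_tags = ["NOUN", "VERB", "ADJ", "ADV", "PROPN"]

# spacy.explain looks each label up in spaCy's built-in glossary
descriptions = {tag: spacy.explain(tag) for tag in sample_tags}

for tag, description in descriptions.items():
    print(f"{tag:<6} {description}")
```

The same call can be applied to token.pos_ inside the loop over a processed document to print a readable description next to each token.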