NLP Pipeline
Revision as of 23:29, 27 April 2017 by Alexey

NLP Applications

  • Tokenization
  • Stop Words Removal
  • Text Normalization (e.g. U.S.A. -> USA)
    • Spelling Correction
    • Lemmatization (or sometimes Stemming)
    • Finding equivalence classes of semantically related terms using thesauri (e.g. WordNet)
  • POS Tagging
  • Named Entity Recognition
  • Building Statistical Language Models
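
The first steps of this pipeline can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the stop word list, the regular expressions and all function names are assumptions made for the example. Lemmatization, POS tagging and Named Entity Recognition are left out because they normally rely on a trained model or an external library. The last two functions show a maximum-likelihood bigram model as one simple kind of statistical language model.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop word list; real lists are much longer.
STOP_WORDS = {"a", "an", "the", "is", "in", "of", "and", "to"}


def normalize(text):
    """Collapse dotted acronyms (U.S.A. -> USA), then lowercase."""
    text = re.sub(
        r"\b(?:[A-Za-z]\.){2,}",
        lambda m: m.group(0).replace(".", ""),
        text,
    )
    return text.lower()


def tokenize(text):
    """Split already-normalized text into runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text)


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess(text):
    """Normalization, tokenization and stop word removal, in that order."""
    return remove_stop_words(tokenize(normalize(text)))


def bigram_counts(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        counts[prev][cur] += 1
    return counts


def bigram_prob(counts, prev, cur):
    """Maximum-likelihood estimate of P(cur | prev), without smoothing."""
    total = sum(counts[prev].values())
    return counts[prev][cur] / total if total else 0.0


tokens = preprocess("The U.S.A. is in North America.")
print(tokens)
```

Normalization runs before tokenization here so that the acronym rule still sees the dots. A different tokenizer might need the steps in another order.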


Information Retrieval

  • Tokenization
  • Stop Words Removal
  • Text Normalization
    • Stemming or Lemmatization
    • Spelling Correction
    • Phonetic Normalization (e.g. with Soundex)
    • Finding equivalence classes of semantically related terms using thesauri (e.g. WordNet)
  • Named Entity Recognition
  • Building the Inverted Index and the Vector Space Model
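
The final step can be sketched as follows, assuming the documents have already been tokenized and normalized. The function names, the toy documents and the particular TF-IDF weighting (raw term frequency times log(N / df)) are assumptions chosen for the illustration; other weighting schemes are common.

```python
import math
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, tokens in docs.items():
        for term in tokens:
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def tfidf_vectors(docs):
    """Represent each document as a sparse {term: weight} vector."""
    n = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    vectors = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        vectors[doc_id] = {t: tf[t] * math.log(n / df[t]) for t in tf}
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy collection of preprocessed documents (hypothetical data).
docs = {
    0: ["usa", "north", "america"],
    1: ["canada", "north", "america"],
    2: ["brazil", "south", "america"],
}
index = build_inverted_index(docs)
vectors = tfidf_vectors(docs)
print(index["north"])
print(cosine(vectors[0], vectors[1]), cosine(vectors[0], vectors[2]))
```

A term that occurs in every document, such as "america" in this toy collection, gets weight zero under this scheme, so it does not contribute to the similarity between documents.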



Sources

  • Information Retrieval (UFRT)
Categories:
  • Information Retrieval
  • NLP

This page was last modified on 27 April 2017, at 23:29.
2012 – 2022 by Alexey Grigorev