Typical text-preprocessing pipeline:
- Character encoding
- Language identification
- Tokenization: split a string into a sequence of tokens/words, e.g. "Hello World!" -> [Hello | World | !] (in Python: spaCy)
- Stopword removal
- Word normalization:
  - Stemming: map different forms of the same word to the same normalized form by stripping affixes ("walking" -> "walk"); in English typically not done anymore
  - Better: lemmatization: map tokens to lexicon entries (requires a complex lexicon and mapping rules!)
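A minimal sketch of the tokenization, stopword-removal, and lemmatization steps using spaCy (assumes the en_core_web_sm model is installed; the example sentence is illustrative):

```python
import spacy

# Load the small English pipeline
# (assumes: pip install spacy && python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")

doc = nlp("Hello World! The cats were walking home.")

# Tokenization: the Doc is already a sequence of Token objects
print([t.text for t in doc])
# ['Hello', 'World', '!', 'The', 'cats', 'were', 'walking', 'home', '.']

# Stopword removal: keep only non-stopword, non-punctuation tokens
content = [t for t in doc if not t.is_stop and not t.is_punct]

# Lemmatization: map each remaining token to its lexicon entry (lemma)
print([(t.text, t.lemma_) for t in content])
# e.g. ('cats', 'cat'), ('walking', 'walk')
```

Note that spaCy lemmatizes rather than stems: "walking" maps to the lexicon entry "walk" via the model's lexicon and rules, instead of by blindly stripping the "-ing" affix.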