The Classical Language Toolkit (CLTK)
Contents
- About
- Citation
- Installation
- Importing Corpora
- Corpus Readers
- Akkadian
- Arabic
- Aramaic
- Bengali
- Chinese
- Coptic
- Ancient Egyptian
- Old English
- Middle English
- French
- Middle High German
- Middle Low German
- Gothic
- Greek
  - Alphabet
  - Accentuation and diacritics
  - Converting Beta Code to Unicode
  - Converting TLG texts with TLGU
  - Corpus Readers
  - Information Retrieval
  - Lemmatization
  - Lemmatization, backoff method
  - Named Entity Recognition
  - Normalization
  - POS tagging
  - Prosody Scanning
  - Sentence Tokenization
  - Stopword Filtering
  - Swadesh
  - TEI XML
  - Text Cleanup
  - TLG Indices
  - Transliteration
  - Word Tokenization
  - Word2Vec
- Gujarati
- Hebrew
- Hindi
- Javanese
- Kannada
- Latin
  - Corpus Readers
  - Clausulae Analysis
  - Converting J to I, V to U
  - Converting PHI texts with TLGU
  - Information Retrieval
  - Declining
  - Lemmatization
  - Lemmatization, backoff method
  - Line Tokenization
  - Macronizer
  - Making POS training sets
  - Named Entity Recognition
  - PHI Indices
  - POS tagging
  - Prosody Scanning
  - Scansion of Poetry
  - Semantics
  - Sentence Tokenization
  - Semantics
  - Stemming
  - Stoplist Construction
  - Stopword Filtering
  - Swadesh
  - Syllabifier
  - Text Cleanup
  - Transliteration
  - Word Tokenization
  - Word2Vec
- Malayalam
- Marathi
- Multilingual
- Old Norse
- Odia
- Ottoman
- Pali
- Persian
- Phonology
- Old Portuguese
- Prakrit
- Punjabi
- Sanskrit
- Old Swedish
- Tamil
- Telugu
- Tibetan
- Tocharian B
- Urdu