Part of Speech Tagging and Chunking with Conditional Random Fields.
Information
Part-of-speech tagging
Task of assigning grammatical classes to words in a natural language sentence.
Subsequent processing stages (such as parsing) become easier if the word class for a word is available.
Chunking
The sentence "He reckons the current account deficit will narrow to only # 1.8 billion in September ." can be divided as follows: [NP He ] [VP reckons ] [NP the current account deficit ] [VP will narrow ] [PP to ] [NP only # 1.8 billion ] [PP in ] [NP September ] .
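A bracketed chunking like this is usually turned into one tag per word before training a sequence model. The sketch below assumes a B-/I-/O boundary prefix joined to the chunk label (for example B-NP, I-NP); the function name `brackets_to_tags` and that exact encoding are illustrative assumptions, not taken from the source.

```python
def brackets_to_tags(text):
    """Convert '[NP He ] [VP reckons ] .' into (word, boundary-label) pairs."""
    pairs = []
    label = None    # label of the chunk that is currently open, if any
    first = False   # True until the first word of the open chunk is seen
    for tok in text.split():
        if tok.startswith("["):
            # Opening bracket carries the chunk label, e.g. "[NP".
            label, first = tok[1:], True
        elif tok == "]":
            label = None
        elif label is None:
            # Word outside any chunk (assumed "O" tag).
            pairs.append((tok, "O"))
        else:
            pairs.append((tok, ("B-" if first else "I-") + label))
            first = False
    return pairs


example = ("[NP He ] [VP reckons ] [NP the current account deficit ] "
           "[VP will narrow ] [PP to ] [NP only # 1.8 billion ] "
           "[PP in ] [NP September ] .")
tagged = brackets_to_tags(example)
```

Here `tagged` starts with `("He", "B-NP")` and `("reckons", "B-VP")`, and the sentence-final period falls outside every chunk.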
Conditional Random Fields
The Model
Approach
To train the POS tagger, we use the Hindi morph analyzer to get the root word and the possible POS tags for every word in the corpus.
Along with the root word and the suggested POS tags, other information such as suffixes, a word-length indicator and the presence of special characters is added to the training data.
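As a rough illustration of how such a training row might be assembled, the sketch below builds one tab-separated line per word, with the gold tag in the last column (the column layout CRF++ reads). The function name `training_row`, the suffix lengths, the length threshold and the use of ASCII punctuation as the "special character" test are all assumptions for illustration; the root word and candidate tags are assumed to come from the morph analyzer, which is not modeled here.

```python
import string


def training_row(word, root, candidate_tags, gold_tag,
                 max_suffix=3, long_word=5):
    """Build one tab-separated training line for a single word."""
    # Suffixes of length 1..max_suffix; "NULL" when the word is too short.
    suffixes = [word[-n:] if len(word) > n else "NULL"
                for n in range(1, max_suffix + 1)]
    # Word-length indicator (the threshold is a hypothetical choice).
    length_flag = "LONG" if len(word) >= long_word else "SHORT"
    # Special-character indicator (ASCII punctuation only, as an assumption).
    special_flag = ("SPECIAL"
                    if any(ch in string.punctuation for ch in word)
                    else "PLAIN")
    columns = [word, root, "|".join(candidate_tags) or "NULL",
               *suffixes, length_flag, special_flag, gold_tag]
    return "\t".join(columns)


row = training_row("running", "run", ["VB", "NN"], "VB")
```

Sentences would then be written one word per line, with a blank line between sentences.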
The model is then trained on this data using CRF++ ("Yet Another CRF toolkit") with a detailed set of features and their combinations.
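The source does not list the features themselves. As a hedged sketch, the helper below generates a CRF++ feature template with unigram features over a window of neighbouring rows for chosen columns, plus a bigram feature on adjacent output tags. The helper name `make_template`, the window and the column choice are assumptions, not the authors' template.

```python
def make_template(columns, window=(-2, -1, 0, 1, 2)):
    """Return CRF++ template text: one unigram macro per (offset, column)."""
    lines = []
    index = 0
    for col in columns:
        for offset in window:
            # %x[row_offset,column] refers to a cell relative to the current token.
            lines.append(f"U{index:02d}:%x[{offset},{col}]")
            index += 1
    # A bare "B" line adds a feature over the previous and current output tags.
    lines.append("B")
    return "\n".join(lines) + "\n"


template_text = make_template([0], window=(-1, 0, 1))
```

The resulting text can be saved to a file and passed to `crf_learn` together with the training file and an output model path.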
Training
Training for the chunker is done in two phases.
We first extract the chunk boundary and chunk label markers for each word in the corpus.
In the first phase, chunk tags (chunk boundary-chunk label, B-L) are assigned to each word in the training data, and the model is trained to predict the corresponding B-L tag. We use only the local context of the words and their POS categories for training.
Next, the chunk label markers (L) are extracted from the B-L chunk tags and added to the training data alongside the words and the POS categories. In the second phase, we train the system on this extended feature set to predict the chunk boundary markers (B).
Finally, the chunk label markers (L) from the first phase and the chunk boundary markers (B) from the second phase are combined to obtain the chunk tag.
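The data flow of the two phases can be sketched as follows, leaving out the CRF training itself. The sketch assumes B-L tags are written as boundary, hyphen, label (for example B-NP); the helper names `split_tag`, `phase_two_rows` and `combine` are hypothetical.

```python
def split_tag(tag):
    """Split 'B-NP' into ('B', 'NP'); a tag without a label gives ('O', '')."""
    boundary, _, label = tag.partition("-")
    return boundary, label


def phase_two_rows(words, pos_tags, phase_one_tags, gold_boundaries):
    """Add the phase-one label (L) as a feature column; the target is B."""
    rows = []
    for word, pos, tag, boundary in zip(words, pos_tags,
                                        phase_one_tags, gold_boundaries):
        _, label = split_tag(tag)
        rows.append("\t".join([word, pos, label or "NULL", boundary]))
    return rows


def combine(phase_one_tags, phase_two_boundaries):
    """Join L from phase one with B from phase two into the final chunk tag."""
    combined = []
    for tag, boundary in zip(phase_one_tags, phase_two_boundaries):
        _, label = split_tag(tag)
        combined.append(f"{boundary}-{label}" if label else boundary)
    return combined


final_tags = combine(["B-NP", "B-NP", "I-VP"], ["B", "I", "B"])
```

In this example the labels NP, NP, VP are kept from the first phase, while the boundaries are taken from the second phase, so the second word is re-tagged as inside the first NP chunk.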
Goals
Objective
Building a complete system for POS tagging and chunking for Hindi
Meta
Author
Himanshu Agrawal
Anirudh Mani
Year
2005
Result