BOBI
1. Extracting Phrases
- Stop-word removal and text cleaning
- head_sentences(xml_file_name)
- context_and_id(file_name)
- context_sentences_list(context_and_id_var)
- build_stop_word_regex(stop_word_file_path)
- generate_candidate_keywords(sentence_list, stop_word_pattern)
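A minimal sketch of the phrase-extraction step, assuming it works the way stop-word-based extractors usually do: stop words act as delimiters, and the text between them becomes a candidate phrase. The helper build_stop_word_regex_from_list is hypothetical (the outline's version reads a file path), and the sample stop words are made up for illustration.

```python
import re

def build_stop_word_regex_from_list(stop_words):
    # Hypothetical variant: takes an in-memory list instead of a file path.
    alternatives = [r"\b" + re.escape(word) + r"\b" for word in stop_words]
    return re.compile("|".join(alternatives), re.IGNORECASE)

def generate_candidate_keywords(sentence_list, stop_word_pattern):
    phrases = []
    for sentence in sentence_list:
        # Replace every stop word with a delimiter, then split on it.
        chunks = stop_word_pattern.sub("|", sentence.strip()).split("|")
        for chunk in chunks:
            phrase = chunk.strip().lower()
            if phrase:
                phrases.append(phrase)
    return phrases

# Illustrative stop words and sentence (not from the original project).
pattern = build_stop_word_regex_from_list(["is", "a", "of", "the"])
candidates = generate_candidate_keywords(
    ["Cosine similarity is a measure of the angle"], pattern
)
print(candidates)
```

Punctuation is not handled here; a fuller version would also split sentences and phrases on punctuation marks.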
3. Best Context ID
- cosine_sim(vector1, vector2)
- wiki_keyword_summary_cleansed(phrase)
- best_cosine_id_yes(phrase) -> returns "yes" if found
- best_cosine_id_dis(phrase1, phrase2) -> chooses without disambiguation
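A rough sketch of how cosine_sim could compare a phrase summary against candidate contexts, assuming both are represented as bag-of-words count vectors. The helpers text_to_vector and best_context_id are hypothetical names introduced here for illustration, and the sample texts are made up.

```python
import math
from collections import Counter

def text_to_vector(text):
    # Hypothetical helper: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine_sim(vector1, vector2):
    # Dot product over shared terms, divided by the product of the norms.
    shared = set(vector1) & set(vector2)
    dot = sum(vector1[term] * vector2[term] for term in shared)
    norm1 = math.sqrt(sum(count * count for count in vector1.values()))
    norm2 = math.sqrt(sum(count * count for count in vector2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0
    return dot / (norm1 * norm2)

def best_context_id(summary_text, contexts):
    # Hypothetical helper: pick the context ID most similar to the summary.
    summary_vector = text_to_vector(summary_text)
    return max(
        contexts,
        key=lambda cid: cosine_sim(summary_vector, text_to_vector(contexts[cid])),
    )

contexts = {
    "c1": "the river bank was flooded",
    "c2": "the bank approved the loan",
}
print(best_context_id("bank loan interest", contexts))
```

Raw counts keep the sketch short; TF-IDF weights would be a natural substitute if common words dominate the vectors.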
2. Filtering Phrases
- pos_filtering_clean(generate_candidate_keywords_var)
  ("NN", "NNPS", "NNP", "NNS", "JJ")
- separate_words(text, min_word_size)
- calculate_word_scores(pos_filtering_var)
- generate_candidate_keyphrase_scores(pos_filtering_var, calculate_word_scores_var)
- Sorted keywords and length < 3
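A sketch of the scoring step, under the assumption that it follows the RAKE-style degree/frequency word score that the function names suggest; the outline itself does not state the formula. The signatures are simplified and take a plain list of phrases, and the sample phrases are made up.

```python
import re
from collections import defaultdict

def separate_words(text, min_word_size):
    # Keep lowercase word tokens of at least min_word_size characters.
    return [w for w in re.findall(r"\w+", text.lower()) if len(w) >= min_word_size]

def calculate_word_scores(phrase_list):
    frequency = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrase_list:
        words = separate_words(phrase, 1)
        for word in words:
            frequency[word] += 1
            degree[word] += len(words) - 1  # co-occurring words in this phrase
    # Assumed score: (degree + frequency) / frequency.
    return {w: (degree[w] + frequency[w]) / frequency[w] for w in frequency}

def generate_candidate_keyphrase_scores(phrase_list, word_scores):
    # A phrase's score is the sum of its word scores.
    return {
        phrase: sum(word_scores.get(w, 0.0) for w in separate_words(phrase, 1))
        for phrase in phrase_list
    }

phrases = ["cosine similarity", "similarity", "context id"]
word_scores = calculate_word_scores(phrases)
phrase_scores = generate_candidate_keyphrase_scores(phrases, word_scores)
ranked = sorted(phrase_scores.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Under this scoring, words that appear mostly inside longer phrases score higher than words that often stand alone.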
4. See also
- important_term -> important term in phrase