Identifying Food-related Word Association and
Topic Model Processing using LDA


Yu-Chin Li, Tsung-Chih Hu, Kuo-En Chang

Data Science (DS) and Natural Language Processing (NLP)

2. Literature Review

Humans manifest certain generalizations which transcend culture, gender, age, and language.

Prior studies focus on testing accuracy, efficiency, and effectiveness.

Ramachandran and Hubbard (2001) proved that a special connection exists between languages and objects or concepts.


These mutual inter-connections, or "word association" (Meara, 2009), and the aggregative effect of "categorization" (Squire & Kandel, 2000) are both extremely important cognitive activities.


2.2 Context and word association network

2.3 Topic model processing: Latent Dirichlet Allocation (LDA)


2. Organization in the form of a network.

This paper applies the LDA model (Blei, Ng, & Jordan, 2003) for its calculations. LDA is a statistical analysis model for sets of random variables that takes into consideration the contextual relevance of data points. To find the common latent semantic properties of the entire data set, it is assumed that the words in the data set obey a Dirichlet distribution and Bayes' theorem (Vapnik, 1998); statistical processing of the words in the data set is carried out, and the results are presented as probabilities.


2.1 Human memory encoding and retrieval


Lastly, from the perspective of neuroscience, Ellis (2002) noted that the effective encoding and repeated retrieval of memory is the key factor in the transformation of short-term memory into long-term memory.

Language is the projection and extension of the human mind. In terms of the organization of knowledge, Langacker (1987) proposed the theories of encyclopedic knowledge and cognitive domains, suggesting that knowledge (including grammatical and lexical knowledge) is more like an interconnected network.

Human memory and cognition display clear organization and hierarchy. When humans encounter external information, cognitive mechanisms transform the experience into representations and construct "schemas" or "frameworks." According to Anderson's (1977) schema theory, schemas are knowledge representation structures used by humans for the generalization and abstraction of objects, events, and fields.

1. Context

Context is the most important factor for vocabulary acquisition, and vocabulary should be learnt in context (Gu, 2003; Nation, 2013; Schmitt, 1997).
Baddeley (1982, 2014) found an interesting relationship between human memory and change of context, which he called "encoding specificity."

Drum and Konopak (1987) asserted that vocabulary is learned as a nodal network, which is supported by a structural representation of domain knowledge. Weldon, Stadler, and Riegler (1992) examined the strength index of cue retrieval compared to merely aural input.
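The nodal-network view above can be illustrated with a simple data structure: a weighted adjacency map in which each word points to its associates. The words and association strengths below are invented for demonstration and are not data from the paper:

```python
# Toy word-association network: each word maps to associated words with
# an illustrative association strength (values are invented examples).
from collections import defaultdict

associations = defaultdict(dict)

def associate(w1, w2, strength):
    """Record a symmetric association between two words."""
    associations[w1][w2] = strength
    associations[w2][w1] = strength

associate("noodle", "soup", 0.8)
associate("soup", "broth", 0.9)
associate("noodle", "chopsticks", 0.5)

# Retrieval cues for "noodle", strongest association first
cues = sorted(associations["noodle"].items(), key=lambda kv: -kv[1])
```

A symmetric map like this mirrors the idea that activating one node (a cue word) makes its neighbors retrievable, with edge weights standing in for retrieval strength.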

This paper aims to find an automatic topic modeling process that can approximate the associative tendencies of humans. Chen, Wang, and Ko (2009) used the Balanced Corpus of Modern Chinese, published by Academia Sinica in 2006, and applied Latent Semantic Analysis (LSA) (Deerwester, Dumais, Furnas, Landauer, & Harshman, 1990) to build a space representing semantic links between Chinese words.
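As a brief sketch of the LSA idea cited above, the following example builds a low-rank semantic space with truncated SVD over TF-IDF vectors. The toy English documents are an assumption for illustration; they stand in for, and are not, the Chinese corpus used by Chen, Wang, and Ko:

```python
# Toy LSA example: TF-IDF vectors are projected into a low-dimensional
# latent semantic space via truncated SVD, and document similarity is
# measured there (the Deerwester et al., 1990 approach in miniature).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "noodles soup broth",
    "soup broth dumplings",
    "grammar lexicon syntax",
]

X = TfidfVectorizer().fit_transform(docs)  # sparse TF-IDF matrix
svd = TruncatedSVD(n_components=2, random_state=0)
Z = svd.fit_transform(X)  # documents embedded in a 2-d latent space

# Pairwise semantic similarity between documents in the latent space
sims = cosine_similarity(Z)
```

In this reduced space, documents sharing vocabulary (the two food documents) land closer together than unrelated ones, which is the "semantic link" structure LSA is used to expose.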