Document Summarization
supervised methods: treat summarization as a classification problem and classify each sentence individually, without leveraging the relationships among sentences
unsupervised methods: use heuristic rules to select the most informative sentences into a summary directly, by exploiting different features and relationships of the sentences
rhetorical structures [Marcu, 1997]
lexical chains [Barzilay and Elhadad, 1997]
hidden topics in the documents [Gong and Liu, 2001]: hidden topics in a document, and the projection of each sentence onto each topic, can be discovered through Latent Semantic Analysis [Deerwester et al., 1990]. They selected the sentences that have large projections on the salient topics to form the summary.
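A minimal sketch of this topic-based selection, assuming a raw term-by-sentence count matrix and toy sentences (both are illustrative choices, not details from the paper):

```python
# LSA-based sentence selection in the spirit of [Gong and Liu, 2001]:
# SVD of a term-by-sentence matrix, then pick, for each salient topic,
# the sentence with the largest projection on that topic.
import numpy as np
from collections import Counter

sentences = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock prices fell sharply today",
    "investors sold stocks amid falling prices",
]

# Term-by-sentence count matrix A (toy bag-of-words representation).
vocab = sorted({w for s in sentences for w in s.split()})
A = np.array([[Counter(s.split())[w] for s in sentences] for w in vocab], dtype=float)

# Latent Semantic Analysis via SVD: rows of Vt give each hidden topic,
# and Vt[topic, j] is the projection of sentence j onto that topic.
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# For each of the k most salient topics, select the sentence with the
# largest (absolute) projection.
k = 2
summary_ids = []
for topic in range(k):
    best = int(np.argmax(np.abs(Vt[topic])))
    if best not in summary_ids:
        summary_ids.append(best)

print([sentences[i] for i in sorted(summary_ids)])
```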
graphs based on the similarity of sentences [Mihalcea, 2005]: constructed a graph in which each node is a sentence and the weight of the edge linking two nodes is the similarity between the corresponding sentences; the direction of an edge can be decided by the appearance order of the sentences. After constructing the graph, they employed graph-based ranking algorithms such as HITS [Kleinberg, 1999] and PageRank [Brin and Page, 1998] to decide the importance of a vertex (sentence), which takes into account global information recursively computed from the entire graph. A sketch of the ranking step follows below.
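A minimal sketch of the ranking step, assuming bag-of-words cosine similarity, an undirected graph, and a standard damping factor of 0.85 (all illustrative choices):

```python
# Graph-based sentence ranking in the spirit of [Mihalcea, 2005]:
# nodes are sentences, edge weights are cosine similarities, and a
# PageRank-style power iteration scores each node.
import numpy as np
from collections import Counter

sentences = [
    "the cat sat on the mat",
    "a cat was sitting on a mat",
    "stock prices fell sharply today",
    "investors sold stocks amid falling prices",
]

def bow(s):
    return Counter(s.lower().split())

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (sum(v * v for v in a.values()) * sum(v * v for v in b.values())) ** 0.5
    return num / den if den else 0.0

n = len(sentences)
vecs = [bow(s) for s in sentences]
W = np.array([[cosine(vecs[i], vecs[j]) if i != j else 0.0
               for j in range(n)] for i in range(n)])

# Row-normalize the weights and run weighted PageRank (damping d = 0.85).
row_sums = W.sum(axis=1, keepdims=True)
row_sums[row_sums == 0] = 1.0
P = W / row_sums
d = 0.85
scores = np.full(n, 1.0 / n)
for _ in range(50):
    scores = (1 - d) / n + d * P.T @ scores

top = np.argsort(-scores)[:2]
print([sentences[i] for i in sorted(top)])
```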
Maximal Marginal Relevance (MMR) [Carbonell et al., 1997]: according to MMR, a sentence is chosen for inclusion in the summary such that it is maximally similar to the document and dissimilar to the already-selected sentences (a sketch follows below). This approach works in an ad hoc manner and tends to select long sentences. However, in this paper, the redundancy is controlled by a probabilistic model which can be learned automatically.
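A minimal sketch of greedy MMR selection, assuming bag-of-words cosine similarity and a trade-off weight lambda = 0.7 (both illustrative choices):

```python
# Greedy MMR selection [Carbonell et al., 1997]: at each step pick the
# sentence that is most similar to the document (relevance) and least
# similar to the sentences already selected (redundancy).
from collections import Counter

def bow(s):
    return Counter(s.lower().split())

def cos(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (sum(v * v for v in a.values()) * sum(v * v for v in b.values())) ** 0.5
    return num / den if den else 0.0

def mmr_select(sentences, k=2, lam=0.7):
    doc = bow(" ".join(sentences))
    vecs = [bow(s) for s in sentences]
    selected, candidates = [], list(range(len(sentences)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cos(vecs[i], doc)
            redundancy = max((cos(vecs[i], vecs[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]

print(mmr_select([
    "the cat sat on the mat",
    "a cat was sitting on a mat",
    "stock prices fell sharply today",
]))
```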
generative models: assign a joint probability to paired observation and label sequences
- the parameters are typically trained to maximize the joint likelihood of training examples
- to define a joint probability over observation and label sequences, a generative model needs to enumerate all possible observation sequences, typically requiring a representation in which observations are task-appropriate atomic entities (see the sketch after this list)
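A minimal sketch of a generative sequence model (a tiny HMM) assigning a joint probability to a paired label sequence and observation sequence; the states, symbols, and probability tables are illustrative assumptions:

```python
# Joint probability p(labels, observations) under a toy HMM:
# start(label_0) * emit(obs_0 | label_0) * prod_t trans(label_t | label_{t-1}) * emit(obs_t | label_t)
states = ["A", "B"]
start = {"A": 0.6, "B": 0.4}
trans = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
emit  = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}}

def joint_probability(labels, observations):
    # first position: initial state probability times emission
    p = start[labels[0]] * emit[labels[0]][observations[0]]
    # remaining positions: transition times emission
    for prev, cur, obs in zip(labels, labels[1:], observations[1:]):
        p *= trans[prev][cur] * emit[cur][obs]
    return p

print(joint_probability(["A", "A", "B"], ["x", "x", "y"]))
```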
Maximum Entropy Markov Models (MEMMs) are conditional probabilistic sequence models that attain all of the above advantages. In MEMMs, each source state has an exponential model that takes:
- input: the observation features,
- output: a distribution over possible next states.
These exponential models are trained by an appropriate iterative scaling method in the maximum entropy framework (see the sketch after this paragraph).
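A minimal sketch of the per-state exponential (softmax) model: given a source state and observation features, it produces a distribution over next states. The states, feature function, and weights below are illustrative assumptions; a real MEMM would learn the weights with an iterative scaling method such as GIS:

```python
# Per-source-state exponential model of an MEMM:
# P(next_state | source_state, observation) = exp(w[source][next] . f(obs)) / Z
import numpy as np

states = ["in_summary", "not_in_summary"]

def features(observation):
    # toy binary/real-valued features of a sentence observation
    return np.array([
        1.0,                                            # bias
        1.0 if observation["position"] == 0 else 0.0,   # first sentence?
        min(observation["length"], 30) / 30.0,          # capped length
    ])

# weights[source_state][next_state] is a feature weight vector
rng = np.random.default_rng(0)
weights = {s: {t: rng.normal(size=3) for t in states} for s in states}

def next_state_distribution(source_state, observation):
    f = features(observation)
    scores = np.array([weights[source_state][t] @ f for t in states])
    scores -= scores.max()                # numerical stability
    probs = np.exp(scores)
    return dict(zip(states, probs / probs.sum()))

print(next_state_distribution("not_in_summary", {"position": 0, "length": 12}))
```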
Generative vs. Discriminative models
- A generative algorithm models how the data was generated in order to categorize a signal. It asks the question: based on my generation assumptions, which category is most likely to generate this signal?
- A discriminative algorithm does not care about how the data was generated; it simply categorizes a given signal (both views are contrasted in the sketch below).
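A minimal sketch contrasting the two views on the same toy data, using Gaussian Naive Bayes as the generative classifier (it models p(x, y)) and logistic regression as the discriminative one (it models p(y | x) directly); the dataset and model choices are illustrative and unrelated to the papers above:

```python
# Generative vs. discriminative classifiers trained on the same data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

generative = GaussianNB().fit(X, y)               # learns p(x | y) and p(y)
discriminative = LogisticRegression().fit(X, y)   # learns p(y | x) directly

print(generative.predict(X[:3]), discriminative.predict(X[:3]))
```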