Identifying Forensic Interesting Files in Digital Forensic Corpora by Applying Topic Modelling (2021)
context
latent semantic analysis
latent semantic indexing
spaCy and OpenNMT for stop-word removal and tokenization
category
analysis methodology prototype
correctness
methodology
removal of uninteresting files (system files, auxiliary files, software files)
tokenization and removal of stop words and whitespace
use spaCy to create a graph of entities and objects
used singular value decomposition (SVD) for dimensionality reduction
all files matching blacklisted keywords from the NSA are considered interesting
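A minimal sketch of the filtering and keyword-flagging steps above, for my own reference. It is not the paper's implementation: a plain regex tokenizer stands in for spaCy, and the extension list, stop-word list, and blacklist are hypothetical placeholders (the real blacklist is not reproduced in these notes).

```python
import re
from pathlib import PurePosixPath

# Hypothetical values for illustration only; none are taken from the paper.
UNINTERESTING_EXTENSIONS = {".dll", ".sys", ".exe", ".tmp", ".log"}
STOP_WORDS = {"the", "a", "an", "is", "of", "to", "and", "in", "on", "was"}
BLACKLIST = {"wannacry", "ransomware", "exploit"}


def is_uninteresting(path: str) -> bool:
    """Treat files with system/software/auxiliary extensions as uninteresting."""
    return PurePosixPath(path).suffix.lower() in UNINTERESTING_EXTENSIONS


def tokenize(text: str) -> list[str]:
    """Lowercase, split on non-alphanumeric characters, drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def interesting_files(corpus: dict[str, str]) -> list[str]:
    """Return paths whose tokens include at least one blacklisted keyword."""
    flagged = []
    for path, text in corpus.items():
        if is_uninteresting(path):
            continue  # filtered out before any text analysis
        if BLACKLIST & set(tokenize(text)):
            flagged.append(path)
    return flagged


corpus = {
    "docs/notes.txt": "The WannaCry sample was copied to the USB drive.",
    "docs/recipe.txt": "Add a pinch of salt to the soup.",
    "windows/kernel.sys": "exploit",  # dropped by the file filter despite the keyword
}
print(interesting_files(corpus))
```

Filtering by extension first keeps the text analysis off files that would never be reported anyway, which is the point of the data-reduction step.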
statistics
LSA on 29.8 million files took 4 hours on an i5
execution time, hit ratio, precision, recall
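A small sketch of how precision, recall, and execution time could be computed for a run. The file names and labels are hypothetical, and hit ratio is left out because these notes do not record how the paper defines it. The last lines work out the throughput implied by the 29.8 million files / 4 hours figure.

```python
import time


def precision_recall(flagged: set[str], relevant: set[str]) -> tuple[float, float]:
    """Precision = TP / flagged, recall = TP / relevant (0.0 when undefined)."""
    true_positives = len(flagged & relevant)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical labels: what the tool flagged vs. what an examiner marked relevant.
flagged = {"a.txt", "b.txt", "c.txt", "d.txt"}
relevant = {"a.txt", "b.txt", "c.txt", "e.txt", "f.txt"}

start = time.perf_counter()
precision, recall = precision_recall(flagged, relevant)
elapsed = time.perf_counter() - start

print(f"precision={precision:.2f} recall={recall:.2f} elapsed={elapsed:.6f}s")

# Throughput implied by the figures in the notes: 29.8 million files in 4 hours.
files_per_second = 29_800_000 / (4 * 3600)
print(f"about {files_per_second:.0f} files per second")
```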
conclusion
how I can apply this to my work
there is no concept of timeline
contribution
searching for "malware" will yield results such as files containing "WannaCry", etc.
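To see why a search for "malware" can surface a file that only mentions "wannacry", here is a toy latent semantic analysis sketch using NumPy's SVD. The four documents, the raw-count weighting, and the choice of k = 2 are all assumptions made for illustration; the paper's actual weighting and rank are not recorded in these notes.

```python
import numpy as np

# Hypothetical toy corpus; the second document never mentions "malware".
docs = [
    "malware ransomware encrypts files",
    "wannacry ransomware encrypts files",
    "holiday beach photos",
    "beach holiday recipe",
]

# Build a term-document count matrix (rows = terms, columns = documents).
vocab = sorted({word for doc in docs for word in doc.split()})
index = {word: i for i, word in enumerate(vocab)}
A = np.zeros((len(vocab), len(docs)))
for j, doc in enumerate(docs):
    for word in doc.split():
        A[index[word], j] += 1.0

# Truncated SVD: keep the k largest singular values (k = 2 is an assumption).
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]

# Project the documents and the query into the k-dimensional latent space.
doc_vectors = U_k.T @ A          # shape (k, number of documents)
query = np.zeros(len(vocab))
query[index["malware"]] = 1.0
query_vector = U_k.T @ query     # shape (k,)

# Cosine similarity between the query and every document.
similarities = (query_vector @ doc_vectors) / (
    np.linalg.norm(query_vector) * np.linalg.norm(doc_vectors, axis=0)
)

for doc, score in zip(docs, similarities):
    print(f"{score:.2f}  {doc}")
```

In this toy corpus the two ransomware documents score close to 1 and the two holiday documents close to 0. The "wannacry" document is retrieved because it shares a latent topic with the "malware" document through co-occurring terms, which a literal keyword match would miss.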
reference
Forensic corpus data reduction techniques for faster analysis by eliminating tedious files (2019)
credibility
who wrote it
D. Paul Joseph
where is he from?
researcher in India (PhD)
was it referenced? where was it published?
Review of NLP-based Systems in Digital Forensics and Cybersecurity (2021)
published in Advances in Distributed Computing and Machine Learning