Spectra Clustering Notes
Science Question: Is there any correlation between spectral features and functional groups?
Spectral Comparison between NIST molecules
Feature informativeness vs rarity
2d plot
C-H: Very abundant, not informative
Attributing a detected C-H feature to CH4 in habitable exoplanets is not necessarily correct
=O: Not abundant, very informative
New functional group CC#C: feature at 16 micron.
Clear distinction between CC#CH and CC#CC
Informativeness is defined as the probability of detecting a functional group given the observation of a spectral feature
We want to find small groups of molecules that all share the same spectral features, where molecules outside those groups do not have those features
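A minimal sketch of the informativeness definition above, estimating P(functional group | feature in a band) by simple counting. The `informativeness` and `abundance` helpers, the record layout and the toy molecules are hypothetical illustrations, not NIST values, and every molecule is weighted equally.

```python
def informativeness(molecules, group, band):
    """Estimate P(group | feature in band) by counting molecules.

    Assumption: a molecule "has the feature" if any of its listed
    feature positions (in micron) falls inside the band.
    """
    lo, hi = band
    with_feature = [
        m for m in molecules
        if any(lo <= w <= hi for w in m["features_um"])
    ]
    if not with_feature:
        return None  # feature never observed, so the probability is undefined
    hits = sum(1 for m in with_feature if group in m["groups"])
    return hits / len(with_feature)


def abundance(molecules, group):
    """Fraction of all molecules that contain the functional group."""
    return sum(1 for m in molecules if group in m["groups"]) / len(molecules)


# Hypothetical toy records, NOT real NIST data.
molecules = [
    {"name": "A", "groups": {"C-H"}, "features_um": [3.3, 7.0]},
    {"name": "B", "groups": {"C-H", "=O"}, "features_um": [3.4, 5.8]},
    {"name": "C", "groups": {"C-H", "=O"}, "features_um": [3.3, 5.7]},
    {"name": "D", "groups": {"C-H"}, "features_um": [3.4]},
    {"name": "E", "groups": {"O-H"}, "features_um": [3.0]},
]

p_carbonyl = informativeness(molecules, "=O", (5.0, 6.0))  # 2 of 2 molecules
p_ch = informativeness(molecules, "C-H", (3.0, 4.0))       # 4 of 5 molecules
```

Pairing `informativeness` with `abundance` for each group gives the two axes of the informativeness vs rarity plot.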
t-SNE analysis of NIST Spectra
Most data points are not well separated, and neither are the spectral features
Only 2 unique groups are identified: CC#C and =O. CC#C is a new group. t-SNE is able to separate =O into two groups of molecules, though the non-linear nature of t-SNE means the distance between groups cannot be meaningfully measured
So far the study has been performed on the entire wavelength range as a whole, which might not be a good idea
Initial analysis was done on all data, which yielded poor results
Current analysis is done by cutting 3-5 micron from the data, which yields some improvement and led to the initial results
Molecular Spectral Complexity and Similarity
Similarities
Similarity Matrix Between Molecules
535x535 matrix of the similarity score between molecules
The spectrum of each molecule is an array of numbers, so we can use NLTK to analyze the similarity between two molecules' spectra.
Might it also be possible to extract a similar corpus for each sentence?
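A rough sketch of how a pairwise similarity matrix could be built once every spectrum is sampled on the same wavelength grid. Cosine similarity is swapped in here as a simple stand-in score; it is not the NLTK-based approach mentioned above. The function names and the three toy spectra are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally sampled spectra."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a flat, all-zero spectrum is treated as dissimilar
    return dot / (norm_a * norm_b)


def similarity_matrix(spectra):
    """Symmetric n x n matrix of pairwise similarity scores."""
    n = len(spectra)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            score = cosine_similarity(spectra[i], spectra[j])
            sim[i][j] = score
            sim[j][i] = score
    return sim


# Hypothetical toy spectra on a shared 4-point grid, NOT real NIST data.
spectra = [
    [0.0, 1.0, 0.0, 0.5],
    [0.0, 2.0, 0.0, 1.0],  # same shape as the first, twice the strength
    [1.0, 0.0, 0.0, 0.0],  # no overlap with the first two
]
sim = similarity_matrix(spectra)
```

With 535 molecules the same call would produce the 535x535 matrix. Cosine similarity ignores overall band strength, which may or may not be wanted here.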
What is the probability of detecting a functional group given the observation of a spectral feature?
Say a feature space between 5-6 micron.
What molecules have spectral features in this region?
What is the probability of detecting absorption in this region?
What is the probability that the absorption in this region is attributable to a given molecule?
Creepy Plot
Simplistic First Jab
Assumes uniform spectral absorbance and equal mixing ratio
Sum Plot
Another way of showing that C-H is dominant.
Detection of this feature, and overtones of this feature in broad band spectroscopy (HST, JWST) would be a major milestone.
But for habitable exoplanets, just claiming that a detection is CH4 could be a huge missed opportunity
The vis-NIR feature of CH4 is an overtone of the 3.3 micron feature
Features in 5-25 micron are important for evaluating whether additional hydrocarbons are detected
Could be used for probabilistic calculations down the line
Spectra/sum plot
Normalize spectral features by dividing by the sum plot.
Could evolve into a probability plot
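The sum plot and the spectra/sum normalization can be sketched as below, assuming all spectra share one wavelength grid and, as in the simplistic first jab, uniform absorbance and equal mixing ratios. The function names and toy numbers are hypothetical.

```python
def sum_plot(spectra):
    """Total absorbance in each wavelength bin, summed over all molecules."""
    return [sum(column) for column in zip(*spectra)]


def normalize_by_sum(spectrum, total):
    """Fraction of the total absorbance in each bin due to one molecule.

    Bins where nothing absorbs (total == 0) are set to 0.0 to avoid
    dividing by zero.
    """
    return [s / t if t > 0 else 0.0 for s, t in zip(spectrum, total)]


# Hypothetical toy spectra on a shared 3-bin grid, NOT real NIST data.
spectra = [
    [1.0, 0.0, 2.0],
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 2.0],
]
total = sum_plot(spectra)
fractions = [normalize_by_sum(s, total) for s in spectra]
```

In every bin with non-zero total absorbance the fractions across molecules add up to 1, which is what would let this evolve into a probability plot under the stated assumptions.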
Definitions
Functional group: a collection of bonds between two or more atoms
t-SNE: t-distributed Stochastic Neighbor Embedding
Science Q: Is there a "Forbidden Zone"?
4-5 micron is quite "empty"
Cross Section Grid: How "coarse" is enough? Is the linear interpolation method accurate enough?
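One way to probe the grid coarseness question is to interpolate a known test profile from a coarse grid and compare it with the exact values on a fine grid. The sketch below does that with a hypothetical Gaussian band (centre 5.5 micron, width 0.1 micron) standing in for a real cross section; the helper names, grid sizes and the profile itself are assumptions.

```python
import bisect
import math


def linear_interp(x, xs, ys):
    """Piecewise-linear interpolation of (xs, ys) at x; xs must be sorted."""
    i = bisect.bisect_right(xs, x) - 1
    i = max(0, min(i, len(xs) - 2))  # clamp to a valid interval
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def grid(lo, hi, n):
    """n evenly spaced points from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + k * step for k in range(n)]


def max_interp_error(f, lo, hi, n_coarse, n_fine=2001):
    """Largest absolute error of coarse-grid interpolation on a fine grid."""
    xs = grid(lo, hi, n_coarse)
    ys = [f(x) for x in xs]
    return max(abs(linear_interp(x, xs, ys) - f(x)) for x in grid(lo, hi, n_fine))


def band(x):
    """Hypothetical Gaussian absorption band, NOT a real cross section."""
    return math.exp(-((x - 5.5) / 0.1) ** 2)


# Maximum interpolation error for progressively finer coarse grids.
errors = {n: max_interp_error(band, 5.0, 6.0, n) for n in (6, 11, 21, 41)}
```

Plotting `errors` against grid size for real cross sections would show where linear interpolation becomes accurate enough for the intended use.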
It's not only the availability of functional groups that matters, but also the position of each functional group relative to the other functional groups present in the molecule.