Comparative genomics via rRNA operons and their neighbor genes
Working process
1. Get database table
DATABASE TABLE
complete_genome
_Biosample_GCF_FTP.txt
.Biosample.txt
.Nucinfo.txt
_Biosample_NucID_Genome.txt
_Biosample_SRS.txt
SRSID.txt
Genome database
Extracted Biosample ID
Nucleotide ID
Assembly ID
SRA run
sequencer
SRA run
type of genomic
2. Filter genome
3. Create artificial reads
4. Assemble with short reads
5. Find gap patterns and flanking regions in the same bacterium
6. Identify flanking regions of conserved repeats
Can it be used to identify bacteria?
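Step 3 (creating artificial reads from a complete genome) could be sketched as below. This is a minimal illustration, not the pipeline actually used: the function name `simulate_reads`, the 150 bp read length, the 30x coverage, and the circular-genome handling are all assumptions, and the reads are error-free and single-end. A dedicated read simulator would give more realistic data.

```python
import random


def simulate_reads(genome, read_len=150, coverage=30, seed=0, circular=True):
    """Sample error-free, fixed-length reads uniformly from a genome string.

    All parameter defaults are hypothetical values chosen for illustration.
    """
    n = len(genome)
    if n == 0:
        return []
    if read_len <= 0 or read_len > n:
        raise ValueError("read_len must be between 1 and the genome length")

    rng = random.Random(seed)
    # Number of reads needed to reach the requested average coverage.
    n_reads = max(1, (coverage * n) // read_len)

    if circular:
        # Append the start of the genome so reads can span the origin.
        template = genome + genome[:read_len - 1]
        max_start = n - 1
    else:
        template = genome
        max_start = n - read_len

    reads = []
    for _ in range(n_reads):
        start = rng.randint(0, max_start)
        reads.append(template[start:start + read_len])
    return reads
```

The resulting reads can then be written to FASTQ or FASTA and passed to the short-read assembler in step 4.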
neighbor
conserved?
position
base
number
same pattern in four different bacteria?
Gap
rRNA operon
Cas9
transposable elements
large phage-mediated repeats
segmental duplications or large tandem arrays
SPAdes?
catches small plasmids?
label gap assembly
complete genome
gene annotation
functional genomics studies
isolate individual strains
generate reads
sequenced by
Illumina
PacBio
Nanopore?
correct resolution of all large plasmid sequences
Background and objective
Previous study
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies
--> identify and verify gaps from long and short reads
finishing using
super assembly
supporting Illumina data
to obtain high-quality genome assembly
long read
post-assembly polishing steps
gap closure strategies
identify gaps from complete PacBio genomes
GC content
all gaps are similar
gap length
longer than Illumina read length
read coverage
lower than recommended coverage (>100x)
ability to form strong secondary structures
randomly distributed gap -> rejected
corresponding annotations
active transposon
interferes with the circularization process
multi rRNA operon
phage integration
megaplasmid
Transposon-related proteins
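The gap properties listed above (GC content, gap length relative to the Illumina read length, and read coverage relative to the recommended >100x) could be tabulated per gap with a sketch like the following. The function names and the 150 bp default read length are assumptions for illustration; only the 100x threshold comes from the notes above.

```python
def gc_content(seq):
    """Return the fraction of G and C bases in a sequence (0.0 if empty)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def describe_gap(gap_seq, mean_coverage, read_len=150, min_coverage=100):
    """Summarize one assembly gap.

    read_len is a hypothetical Illumina read length; min_coverage reflects
    the recommended >100x coverage mentioned in the notes.
    """
    return {
        "length": len(gap_seq),
        "gc": gc_content(gap_seq),
        "longer_than_read": len(gap_seq) > read_len,
        "low_coverage": mean_coverage < min_coverage,
    }
```

Comparing these summaries across gaps is one way to check whether the gaps share similar GC content or are simply longer than a read.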
1 kb flanking -> self-BLAST
BLAST hits to several regions -> > 95% similarity -> repetitive DNA sequences contribute to assembly gaps
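The self-BLAST check on 1 kb flanks could be post-processed roughly as follows. This sketch assumes the flanks were aligned against their own genome and the results saved in BLAST tabular format (outfmt 6: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The function name `repetitive_flanks` and the `min_hits` parameter are hypothetical, and alignment length is ignored for simplicity.

```python
from collections import defaultdict


def repetitive_flanks(blast_tab, min_identity=95.0, min_hits=2):
    """Return flanks that hit several genome regions above min_identity.

    blast_tab is the text of a BLAST outfmt 6 report. A flank always hits
    its own location once, so min_hits=2 means at least one extra copy.
    """
    hits = defaultdict(set)
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        qseqid, sseqid, pident = fields[0], fields[1], float(fields[2])
        sstart, send = int(fields[8]), int(fields[9])
        if pident > min_identity:
            # Normalize coordinates so minus-strand hits compare consistently.
            hits[qseqid].add((sseqid, min(sstart, send), max(sstart, send)))
    return {q: sorted(locs) for q, locs in hits.items() if len(locs) >= min_hits}
```

Flanks returned by this filter are candidates for the repetitive sequences that contribute to assembly gaps.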
RiboFR-Seq: a novel approach to linking 16S rRNA amplicon profiles to metagenomes
-> combination of gap and neighbor region to provide consensus classification
RiboFR-Seq (Ribosomal RNA gene flanking region sequencing)
Advantages
can correct errors in traditional 16S rRNA based taxonomic classification
requires much less memory
classification by clustering the non-ribosomal reads of BRPs
provides more accurate 16S copy numbers than the rrnDB database
classifies 16S amplicons and metagenomic contigs without bias
Limitations
short 16S rRNA (shorter than the recognition site) might be miscut -> fail
long runtime
short bridge -> multiple alignments