Ms. Thulile Nhlapo (ARC)

Report
Viral metagenomic analysis of sweet potato using
high-throughput deep sequencing
Student: Thulile Faith Nhlapo
Supervisors: Dr. J. Rees, Prof. M.E.C Rey, Dr. D.A. Odeny
Collaborators: Ms J. Mulabisana, Dr. M. Cloete
Sweet potato viruses and their effects on production
• Sweet potato is highly nutritious and is used as a poverty alleviation crop (food security)
– Good source of carbohydrates, proteins, fiber, iron, vitamins C and B, and vitamin A (beta carotene)
• Viral diseases can reduce crop quality and yield by up to 100%
• A collection of viruses may infect sweet potato (disease complex)
• In SA 12 viruses have been identified, occurring either singly or in combination (viral synergy), decreasing yield by 50-100%
[Image labels: Healthy; "Potential infection"; Viral infection/diseased]
Sweet potato virus families
• Compared to viruses of other agriculturally important crops, sweet potato viruses have been poorly studied, but recently more viruses infecting sweet potato are being described
• Over 30 sweet potato viruses have been identified and assigned to 9 families
• 7 RNA virus families have been identified: Bromoviridae, Bunyaviridae, Closteroviridae, Comoviridae, Flexiviridae, Luteoviridae, and Potyviridae
• 2 DNA virus families have been identified: Caulimoviridae, Geminiviridae
Symptoms associated with viral infection
Symptoms observed on sweet potato plants in the field: (A, B) chlorotic spots with purple rings; (C) upward curling of leaves; (D) insect damage.
Symptoms observed in the glasshouse: (A) chlorotic spots with purple rings; (B) chlorotic spots with purple rings and purple-edged vein feathering; (C) upward curling of young leaves; (D) chlorotic spots and vein clearing.
Metagenomics and viral metagenomics?
• Metagenomics: "or community genomics, is an approach aimed at analyzing the genomic content of microbial communities within a particular niche"
• Viral metagenomics: the study of viral communities. It can be used to analyse viral sequences in any sample type (soil, plant, water, human gut, etc.)
• It is a powerful tool for virus discovery and can be applied to determining the etiology of diseases
• A metagenomic analysis is not biased towards culturable organisms, so the total genetic diversity of microorganisms can be studied
Using a next-generation sequencing approach for metagenomics
1. Cloning-dependent sequencing        2. Deep sequencing
Expensive                              Cheaper
Time consuming                         Faster, accurate
Requires large amounts of DNA          Small amounts of DNA (detects low virus titers)
Inserts sometimes unstable             No cloning
Produces large contiguous sequences    Short reads
Viral metagenomics: viruses have small genomes, so assembly is not a problem
Bioinformatics: software and algorithms have been developed for analysis of short reads
Aims
1. To carry out a metagenomic study of sweet potato viruses in the Western and Eastern Cape provinces of South Africa
2. To undertake genetic characterisation of sweet potato viruses under South African conditions in order to generate a basis for their classification
3. To explore diagnostic strategies using next generation sequencing (NGS)
Overview of metagenomics strategy
Input: sampling of symptomatic and asymptomatic leaves
DNA workflow: DNA isolation, RCA, Nextera sample preparation
RNA workflow: RNA isolation, Ribo-Zero and TruSeq sample preparation
Sequencing: MiSeq
Output: data analysis (CLC Bio)
Overview of bioinformatics strategy
Sequence reads (raw data), trimmed for adaptors
Reference-guided analysis:
1. Download reference sequences (NCBI)
2. Map reads to reference viral genomes (0.8-0.99 stringency)
3. Extract new consensus sequence; BLASTn
4. Retrieve full genomes of most closely related species
5. Multiple sequence alignment (MEGA 5.05; CLC Bio 6.0.1 with plug-ins for additional alignments, MUSCLE and ClustalW)
6. Pairwise comparison (sequence identity)
7. Phylogenetic tree
8. Full genomes
9. Design primers
10. Confirm by PCR
Unmapped reads:
De novo assembly (25-64 k-mer), BLASTn, identify contigs
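The "25-64 k-mer" setting in the de novo assembly step refers to the word size into which reads are split before assembly. As a minimal sketch of that idea only (not the CLC Bio implementation), the Python below counts k-mers across a set of reads; the function name `count_kmers` and the toy reads are hypothetical and chosen for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across a collection of reads.

    Illustrative only: real assemblers build a graph from these words;
    here we just tabulate them. K-mers containing the ambiguity code N
    are skipped.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    counts = Counter()
    for read in reads:
        read = read.upper()
        # A read shorter than k yields an empty range and contributes nothing.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


# Toy example with two overlapping reads and a small k.
toy_counts = count_kmers(["ACGTAC", "CGTACG"], k=3)
```

A larger k gives more specific words but needs longer, more accurate reads, which is one reason a range of word sizes is usually tried.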
Sweet potato sampling sites
Columns: Date, Location, Type of farming (symptomatic and asymptomatic samples)
Eastern Cape (sample size = 20)
November 2012    P.E. (Eastern Cape)
November 2012    P.E. (Eastern Cape)
November 2012    P.E. (Eastern Cape)
November 2012    Alice (Eastern Cape)
Type of farming: Subsistence (three sites), Subsistence/Commercial (one site)
Western Cape
January 2013     Klawer (Western Cape)         Commercial
January 2013     Lutzville (Western Cape)      Commercial
January 2013     Paarl (Western Cape)          Commercial
January 2013     Franschhoek (Western Cape)    Commercial
RESULTS
Rolling circle amplification (RCA) provides DNA sequencing template
• Genomic DNA (gDNA) isolation: Qiagen DNeasy Plant Mini Kit
• Rolling circle amplification (RCA): Illustra™ TempliPhi 100 Amplification Kit
• Nextera DNA sample preparation
• Sequencing on the Illumina MiSeq Benchtop Sequencer
[Gel image: DNA isolation of symptomatic and asymptomatic plants collected from the Eastern Cape]
[Gel image: RCA products for Eastern Cape samples]
Lanes M and 1-20 on both gels; size markers at 10 kb, 3 kb and 1 kb
Results: DNA data, symptomatic samples
Western Cape sample (KT10): Sequence identity and percentage genome coverage of circular DNA viruses and mitochondrial DNA
Reference genome                                             Percentage identity   Average coverage   Percentage of genome covered   Consensus length
Sweet potato geminivirus strain SPLCSPV (JQ621844)           94.38%                3 359 X            99.3%                          2 769 bp
Sweet potato geminivirus strain SPMaV (JQ621843)             98.10%                2 940 X            99.92%                         2 781 bp
Ipomoea batatas mitochondrial plasmid-like DNA (FN421476)    100%                  3 713 X            100%                           1 027 bp
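The "Average coverage" and "Percentage of genome covered" columns can be read as two summaries of per-base read depth. The sketch below shows one common way to compute them from mapped read intervals. It is an illustration under stated assumptions, not the calculation performed by CLC Bio: `coverage_stats` is a hypothetical helper, intervals are 0-based and half-open, and the reference is treated as linear (reads spanning the origin of a circular genome are not handled).

```python
def coverage_stats(ref_length, alignments):
    """Summarise read depth over a linear reference.

    ref_length: reference length in bases.
    alignments: iterable of (start, end) pairs, 0-based, end exclusive.
    Returns average depth per base and the percentage of bases with depth > 0.
    """
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")
    depth = [0] * ref_length
    for start, end in alignments:
        # Clip each interval to the reference before counting.
        for pos in range(max(start, 0), min(end, ref_length)):
            depth[pos] += 1
    covered = sum(1 for d in depth if d > 0)
    return {
        "average_coverage": sum(depth) / ref_length,
        "percent_covered": 100.0 * covered / ref_length,
    }


# Toy example: two overlapping reads on a 10-base reference.
toy_stats = coverage_stats(10, [(0, 5), (3, 8)])
```

A high average can hide uncovered stretches, which is why the two figures are reported side by side.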
Example of mapping and coverage: KT10
Reads mapped to SPLCSPV-ZA (94% similarity)
[Figure: read mapping with reference and new consensus tracks]
Neighbour-joining tree of geminiviruses
Ribo-zeroed total RNA provides sequencing template for RNA sequencing
• Total RNA isolation: Qiagen RNeasy Mini Kit
• DNase treatment of samples prior to sequencing
• rRNA depletion: Ribo-Zero™ Magnetic Kit (Plant Leaf)
• TruSeq Stranded Total RNA Sample Preparation
• Sequencing on the Illumina MiSeq Benchtop Sequencer
[Gel image: RNA isolation of symptomatic and asymptomatic plants collected from the Eastern and Western Cape; lanes M and 1-16; size markers at 10 kb, 3 kb and 1 kb]
Results: RNA data, symptomatic samples
• 1% of data mapped to viral genomes
• Majority of reads mapped to sweet potato chloroplast genome
• Assembled near-complete genomes of:
– Sweet potato chlorotic stunt virus (SPCSV), RNA 2 segment
  • Still need to assemble RNA 1 segment (total genome size = 17 630 nt)
  • 3 593 reads out of 5 432 520, consensus length 4 811 bp, 61 X average coverage
– Sweet potato feathery mottle virus (SPFMV)
  • Ordinary strain
  • Common strain (Sweet potato virus C) (SPVC)
– Sweet potato virus G (SPVG)
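The SPCSV RNA 2 figures quoted above can be put in proportion with a short calculation. The sketch below uses only the numbers on the slide. The implied mean read length is an estimate that assumes average coverage is approximately (mapped reads × mean read length) / consensus length, and the variable names are illustrative.

```python
# Figures quoted on the slide for SPCSV RNA 2.
mapped_reads = 3_593
total_reads = 5_432_520
consensus_length_bp = 4_811
average_coverage = 61  # reported as 61 X

# Share of all reads that mapped to this one segment.
percent_mapped = 100 * mapped_reads / total_reads

# Assumption: coverage ~= mapped_reads * mean_read_length / consensus_length,
# rearranged to estimate the mean length of the mapped reads.
implied_mean_read_length = average_coverage * consensus_length_bp / mapped_reads

print(f"{percent_mapped:.4f}% of reads mapped to SPCSV RNA 2")
print(f"implied mean mapped read length: {implied_mean_read_length:.1f} bases")
```

On these numbers, well under 0.1% of the reads account for the assembled segment.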
Summary of results for RNA viruses
Reference genome                                       Percentage identity   Average coverage   Percentage of genome covered   Consensus length
Sweet potato virus C, Peru (GU207957)                  94.07%                446 X              99.92%                         10 812 bp
Sweet potato feathery mottle virus (AB439206)          93.96%                255 X              98.83%                         10 694 bp
Sweet potato virus G (JQ824374)                        97.92%                51 X               99.92%                         10 743 bp
Sweet potato chlorotic stunt virus RNA 2 (KC146843)    96.99%                750 X              99.85%                         8 205 bp
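The "Percentage identity" column compares each new consensus with its reference. A minimal sketch of one way to compute such a figure from two already-aligned sequences follows. It assumes gap columns are excluded from the comparison; alignment tools differ on that convention, so the values in the table may have been computed differently. `percent_identity` is a hypothetical helper and the sequences are toy data.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of ungapped alignment columns with identical bases.

    Both inputs must come from the same pairwise alignment (equal length,
    '-' marking gaps). Columns with a gap in either sequence are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap column: not counted under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy example: five ungapped columns, four of them identical.
toy_identity = percent_identity("ACGT-A", "ACCTTA")
```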
Mapping sequence reads to SPFMV reference genome
Consensus length = 10 694 bp; average coverage = 255 X
New consensus shares 94% similarity with reference (variation)
[Figure: read mapping with reference and new consensus tracks, including a zoomed-in view]
Neighbour-joining tree of criniviruses (SPCSV)
Sweet potato chlorotic stunt virus isolates: WA, West African strain; EA, East African strain
Neighbour-joining tree of potyviruses (SPFMV, SPVC, SPVG)
Labels on the tree: EA & O, S, SPFMV lineage, C
Sweet potato feathery mottle virus isolates: EA, East African strain; S, S strain; C, Common strain; G, Sweet potato virus G; 2, Sweet potato virus 2
Sequence data suggests multiple infection
Columns: Sample ID, DNA viruses, RNA viruses, Symptoms observed
Samples: KT10, KF1, K17, F11
DNA viruses listed: SPLCSPV (JQ621844)* (two samples), SPMaV (JQ621843) (two samples)
RNA viruses listed: SPFMV 10-O strain (AB439206) (three samples), SPCSV RNA 2 (KC146843), SPVG (JQ824374), SPVC (NC_014742)
Symptoms listed: leaf curling, chlorosis, purpling leaves, purple ringspots, chlorotic spots, leaf vein feathering (with pigmentation), vein clearing
Observed symptoms on sweet potato plants. (A1) Purple ringspots and chlorotic
spots on sample KT10; these symptoms are associated with Sweet potato
feathery mottle virus (SPFMV). (A2) Upward curling of leaves, associated with
Sweet potato leaf curl virus (SPLCV). (B) Upward curling of leaves and chlorotic
spots on sample KF1; symptoms associated with SPLCV and SPFMV. (C)
Purple ringspots, leaf vein feathering with purple pigmentation and chlorotic spots
on sample F11; these symptoms are associated with SPFMV and Sweet potato
virus G (SPVG). (D) Chlorotic spots and vein clearing on sample K17;
symptoms associated with Sweet potato virus C (SPVC), the C strain of the
potyvirus SPFMV.
Sweet potato virus distribution
• SPFMV (O-strain)
• SPCSV
• SPLCSPV-ZA
• SPMaV-ZA
Western Cape
SPVG
Eastern Cape
• SPVC (SPFMV C-strain)
Advantages of this sequencing approach?
• Detect viruses by direct sequencing
• Generate complete/near-complete viral genomes
• High average sequence depths
• Deep sequencing is an efficient diagnostic tool
– Detected viral pathogens
– Detected mixed infections
– Detected diverse viral strains
Acknowledgements
• Supervisors
– Dr. J. Rees
– Prof. C. Rey
– Dr. D. Odeny
• Collaborators
– Julia Mulabisana
– Dr. M. Cloete
• ARC-VOPI senior researchers, technicians, and staff
– Sidwell Tjale
– Thakhani Ramathavhatha
– Dr. Laurie
• ARC-BTP senior students, researchers and bioinformaticians
• Farmers in Western and Eastern Cape
• This work is based on research supported in part by the National Research Foundation of South Africa (Grant reference number UID 79983)
• Other funding sources: ARC-PDP and DAFF
THANK YOU
