Powered by OpenAIRE graph
Genome Research
Article
Data sources: UnpayWall
Genome Research
Article . 2005 . Peer-reviewed
Data sources: Crossref

Whole genome shotgun sequencing of Brassica oleracea and its application to gene discovery and annotation in Arabidopsis

Authors: Mulu Ayele; Brian J. Haas; Nikhil Kumar; Hank Wu; Yongli Xiao; Susan Van Aken; Teresa R. Utterback; +3 Authors

Abstract

Through comparative studies of the model organism Arabidopsis thaliana and its close relative Brassica oleracea, we have identified conserved regions that represent potentially functional sequences overlooked by previous Arabidopsis genome annotation methods. A total of 454,274 whole genome shotgun sequences covering 283 Mb (0.44×) of the estimated 650 Mb Brassica genome were searched against the Arabidopsis genome, and conserved Arabidopsis genome sequences (CAGSs) were identified. Of these 229,735 conserved regions, 167,357 fell within or intersected existing gene models, while 60,378 were located in previously unannotated regions. After removal of sequences matching known proteins, CAGSs that were close to one another were chained together as potentially comprising portions of the same functional unit. This resulted in 27,347 chains of which 15,686 were sufficiently distant from existing gene annotations to be considered a novel conserved unit. Of 192 conserved regions examined, 58 were found to be expressed in our cDNA populations. Rapid amplification of cDNA ends (RACE) was used to obtain potentially full-length transcripts from these 58 regions. The resulting sequences led to the creation of 21 gene models at 17 new Arabidopsis loci and the addition of splice variants or updates to another 19 gene structures. In addition, CAGSs overlapping already annotated genes in Arabidopsis can provide guidance for manual improvement of existing gene models. Published genome-wide expression data based on whole genome tiling arrays and massively parallel signature sequencing were overlaid on the Brassica–Arabidopsis conserved sequences, and 1399 regions of intersection were identified. Collectively our results and these data sets suggest that several thousand new Arabidopsis genes remain to be identified and annotated.
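
The chaining and filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract gives no thresholds, so `max_gap` and `min_distance` are hypothetical parameters, the function names are invented for illustration, and the sketch assumes all intervals lie on a single chromosome and ignores strand.

```python
from typing import List, Tuple

# (start, end) coordinates on one chromosome; an assumption of this sketch.
Interval = Tuple[int, int]


def chain_regions(regions: List[Interval], max_gap: int) -> List[Interval]:
    """Chain conserved regions that lie within max_gap bases of each other."""
    chains: List[Interval] = []
    for start, end in sorted(regions):
        if chains and start - chains[-1][1] <= max_gap:
            # Close enough to the previous chain: extend it.
            prev_start, prev_end = chains[-1]
            chains[-1] = (prev_start, max(prev_end, end))
        else:
            # Too far away: start a new chain.
            chains.append((start, end))
    return chains


def distance(a: Interval, b: Interval) -> int:
    """Gap between two intervals; 0 when they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def novel_chains(
    chains: List[Interval], genes: List[Interval], min_distance: int
) -> List[Interval]:
    """Keep chains at least min_distance away from every annotated gene."""
    return [
        c for c in chains
        if all(distance(c, g) >= min_distance for g in genes)
    ]


# Toy data, not taken from the paper.
regions = [(100, 150), (180, 220), (1000, 1050), (5000, 5100)]
genes = [(900, 1200)]

chains = chain_regions(regions, max_gap=100)
candidates = novel_chains(chains, genes, min_distance=500)
print(chains)
print(candidates)
```

With the toy data, the first two regions are chained into one unit, and the chain that overlaps the annotated gene is discarded, leaving only chains far from existing annotation as candidate novel units.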

Keywords

DNA, Complementary; DNA, Plant; Models, Genetic; Gene Expression Profiling; Arabidopsis; Chromosome Mapping; Brassica; Genomics; Genes, Plant; Chromosomes, Plant; Databases, Genetic; Conserved Sequence; Genome, Plant

  • Impact by BIP!
    citations: 65
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Top 10%
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Top 10%
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Top 10%
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
Access route: bronze