The Genomic Variation of 3,000 Diverse Rice (Oryza sativa L.) Accessions: Discoveries and Applications in Rice Improvement

Zhikang Li  Fan Zhang  Wensheng Wang  Ramil Mauleon  Zhiqiang Hu  Dmytro Chebotarov  Shuaishuai Tai  Zhichao  Min Li  Tianqing Zheng  Roven Rommel Fuentes  Locedie Mansueto  Dario Copetti  Millicent Sanciangco  Kevin Christian Palis  Jianlong Xu  Chen Sun  Hongliang Zhang  Binying Fu  Yongming Gao  Xiuqin Zhao  Fei Shen  Xiao Cui  Hong Yu  Zichao Li  Miaolin Chen  Jeffery Detras  Yongli Zhou  Xinyuan Zhang  Yue Zhao  Dave Kudrna  Chunchao Wang  Rui Li  Ben Jia  Jinyuan Lu  Xianchang He  Zhaotong Dong  Jiabao Xu  Yanhong Li  Miao Wang  Jianxin Shi  Jing Li  Dabing Zhang  Seunhee Lee  Wushu Hu  Alexander Poliakov  Inna Dubchak  Victor Jun Ulat  Frances Nikki Borja  John Robert Mendoza  Jauhar Ali  Ming Yang  Yongchao Niu  Zhen Yue  Ma. Elizabeth B. Naredo  Jayson Talag  Xueqiang Wang  Jinjie Li  Xiaodong Fang  Ye Yin  Jean-Christophe Glaszmann  Jianwei Zhang  Jiayang Li  Ruaraidh Sackville Hamilton  Rod A. Wing  Chaochun Wei  Jue Ruan  Gengyun Zhang  Kenneth L. McNally  Nickolai Alexandrov  Hei Leung
Abstract: Asian cultivated rice (Oryza sativa L.) is the staple food for half of the world's population and has rich within-species diversity. Comprehensive analyses of the genome resequencing data of a core collection of 3,010 rice accessions revealed four important aspects of the genomic diversity within O. sativa. First, over 42 million SNPs and 3 million small InDels were discovered against five reference genomes. Second, we discovered more than 90,000 structural (100 bp) variations (SVs) when compared with the Geng reference genome, consisting primarily of small deletions and translocations, plus many duplications and inversions. These SVs accounted for a significant portion of the rice genome and involved most (~80%) rice genes. Third, we discovered 12,000 full-length novel protein-coding genes, plus more than 9,000 novel genes with partial sequences, that are absent from the Nipponbare reference genome. The rice pan-genome (PG) consists of ~62% core genes/gene families that are present in all rice lines and ~38% distributed genes that are present in only some rice accessions. Fourth, using large-effect SNPs and InDels of all loci in the PG, a rice functional haplotype database was constructed. The SNP, PG and SV variation clearly resolves the 3,010 rice accessions into two major subspecies, Xian (indica) and Geng (japonica), plus the Aus and Aro groups, and multiple subpopulations within Xian and Geng. Significantly reduced diversity was found in genomic regions where over 1,000 agronomically important and domestication-related genes are located. Tremendous efforts have been made to identify and mine genes/QTLs/alleles associated with a wide range of important traits in rice using the 3K materials and SNP data. To demonstrate how the results of the 3K RGP could be used in rice improvement, the genomic constitutions of 476 BC1F5 introgression lines (ILs) from our breeding program were constructed using high-quality reference genomes of their 9 parents and the 3K RGP databases. When compared with their recurrent parent, Huang-hua-zhan (HHZ), all these ILs had improved tolerance to more than 2 abiotic stresses (drought, salinity, submergence, low input) in addition to high yield potential. More than 16 of these ILs were released as new varieties and grown on nearly 1 million ha in Asia and Africa. Many more of these ILs are in the pipeline to be released as new varieties in different rice ecologies in Asia. Deep analyses of the high-quality genomic sequencing data of the 9 parents revealed that allelic (haplotype) differences at 9,334 loci, plus presence/absence differences at ~10,000 genes, differentiate HHZ from the 8 donors. Examination of genes responsive to 7 different selection schemes and GWAS analyses of the phenotypic data of the 476 HHZ ILs across diverse environments led us to discover large numbers of loci and alleles associated with abiotic stress tolerance and yield traits. The genetic and genomic information generated for the HHZ ILs allowed us to propose novel breeding strategies that use this information to develop superior rice varieties with significant improvement in multiple complex traits through genome-based breeding by design.

