1. Talk title: Oxford Nanopore Technology - the data and the applications
Speaker: Dr. Zemin Ning
The Wellcome Trust Sanger Institute
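Host: Yongbiao Xue (薛勇彪)
Time: Friday, July 3, 2015, 14:00-15:00
Venue: 1st-floor conference room, Beijing Institute of Genomics, Chinese Academy of Sciences
Abstract: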
Single-molecule sequencing based on nanopore technology opens up new frontiers for genomics, as well as for applications where short DNA sequencing reads fall short. With its long read lengths and fast data production on a portable device, the MinION sequencer offers promising prospects for diagnostics, genome assembly, phasing and many other applications.
In this talk, I will briefly introduce the technology, discuss the characteristics of the data and highlight the data processing workflows. This will be followed by two applications: detection of structural variations, and genome assembly with or without data from other sequencing technologies. For CNV detection, we developed an empirical model to quantify the likelihood of a CNV event for each mapped read. Using this model, we are able to reduce the noise caused by repeats and base errors, and therefore to accurately identify deletions, insertions and inversions from the sequencing data. On genome assembly, I will describe a novel algorithm based on an overlap graph, which can assemble the E. coli genome into a single contig from nanopore reads alone.
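As a rough illustration of the overlap-graph idea mentioned in the abstract, the sketch below builds a graph of suffix-prefix overlaps between reads and greedily merges the pair with the longest overlap. It is a toy example and not the speaker's algorithm: it assumes error-free reads and exact string matching, whereas real nanopore reads would need error-tolerant alignment. The function names and the `min_overlap` parameter are hypothetical.

```python
from itertools import permutations


def overlap_length(a, b, min_overlap=3):
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_overlap` bases exists.
    """
    for k in range(min(len(a), len(b)), min_overlap - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def build_overlap_graph(reads, min_overlap=3):
    """Map each ordered read pair (a, b) to its overlap length (edges only)."""
    graph = {}
    for a, b in permutations(reads, 2):
        k = overlap_length(a, b, min_overlap)
        if k:
            graph[(a, b)] = k
    return graph


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of reads with the longest exact overlap."""
    reads = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while len(reads) > 1:
        graph = build_overlap_graph(reads, min_overlap)
        if not graph:
            break  # no remaining overlaps: leave the rest as separate contigs
        (a, b), k = max(graph.items(), key=lambda item: item[1])
        reads.remove(a)
        reads.remove(b)
        reads.append(a + b[k:])  # join a and b, writing the overlap once
    return reads


if __name__ == "__main__":
    toy_reads = ["ATGGCGT", "GCGTACC", "TACCTTA"]
    print(greedy_assemble(toy_reads))
```

Greedy merging is only one simple way to traverse an overlap graph; it can misassemble repeats, which is one reason practical assemblers use more careful graph simplification.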
About the speaker: Dr. Zemin Ning
Dr. Zemin Ning is a Senior Scientific Manager and heads the “Sequence Assembly and Data Analysis” group at the Wellcome Trust Sanger Institute, UK. After completing postdoctoral training at the Cavendish Laboratory, University of Cambridge, he joined the Sanger Institute in 1999 to pursue bioinformatics research. He has been active in genome informatics, specializing in sequence alignment and genome assembly. Over the years, he and his colleagues in the group have developed a number of bioinformatics tools that are widely appreciated by the genomics community. The group has also produced over 30 de novo assemblies of large animal and plant genomes, including Gorilla, Zebrafish, Tasmanian Devil, Bamboo and Miscanthus.
2. Talk title: Systematic Discovery of Complex Indels in Human Cancers
Speaker: Dr. Kai Ye
McDonnell Genome Institute
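Host: Yongbiao Xue (薛勇彪)
Time: Friday, July 3, 2015, 15:00-16:00
Venue: 1st-floor conference room, Beijing Institute of Genomics, Chinese Academy of Sciences
Abstract: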
Complex indels are formed by replacing one DNA fragment with another. They have occasionally been reported in earlier studies using traditional sequencing instruments. However, their detection and interpretation are extremely challenging with sequencing data from modern high-throughput instruments. As a result, complex indels have been consistently under-represented in recent genomics studies. Here, we present a systematic analysis of somatic complex indels in the coding sequences of over 8,000 cancer cases using the newly developed computational tool Pindel-C. We discovered 285 complex indels in cancer genes such as PIK3R1, TP53, ARID1A, GATA3, and KMT2D in approximately 3.5% of the 8,060 cancer cases analyzed; nearly all of these complex indels were overlooked (81.1%) or mis-annotated (17.6%) in the 2,199 previously reported samples. Our study shows that in-frame complex indels are highly enriched in PIK3R1 and EGFR, while frameshifts are found in VHL, GATA3, TP53, ARID1A, PTEN, and ATRX, consistent with their functional roles. Tumor clonal structure analysis reveals that complex indels can function both as initiators and as progressors in different cancer cases, mostly by collaborating with point mutations and simple indels. Further, many complex indels display strong tissue specificity (e.g., VHL in kidney cancer and GATA3 in breast cancer). Finally, three-dimensional structural analyses further support the findings of previously missed, but potentially druggable, mutations in the EGFR, MET, and KIT oncogenes. This study represents the first comprehensive discovery of complex indels in human cancers and indicates the critical importance of improving complex indel discovery and interpretation in both research and clinical settings.
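To make the definition of a complex indel more concrete, the sketch below classifies a variant from its reference and alternate alleles and checks whether it preserves the reading frame. It is only an illustration and does not reflect how Pindel-C works. The function names are hypothetical, and the rule used here (after trimming shared flanking bases, both alleles are non-empty and differ in length) is an assumed working definition.

```python
def trim_shared_flanks(ref, alt):
    """Remove the bases shared at the start and end of both alleles."""
    p = 0
    while p < min(len(ref), len(alt)) and ref[p] == alt[p]:
        p += 1
    ref, alt = ref[p:], alt[p:]
    s = 0
    while s < min(len(ref), len(alt)) and ref[-1 - s] == alt[-1 - s]:
        s += 1
    if s:
        ref, alt = ref[:-s], alt[:-s]
    return ref, alt


def classify_variant(ref, alt):
    """Classify a variant by what remains after trimming shared flanks.

    Assumed working definition: a complex indel deletes one fragment and
    inserts a different one of another length at the same position.
    """
    ref, alt = trim_shared_flanks(ref, alt)
    if not ref and not alt:
        return "no change"
    if not ref:
        return "insertion"
    if not alt:
        return "deletion"
    if len(ref) == len(alt):
        return "substitution"
    return "complex indel"


def is_in_frame(ref, alt):
    """True if the net length change is a multiple of three (coding frame kept)."""
    return (len(alt) - len(ref)) % 3 == 0


if __name__ == "__main__":
    for ref, alt in [("A", "ATT"), ("ATT", "A"), ("ACGT", "ATTTT"), ("ACGTA", "ATA")]:
        frame = "in-frame" if is_in_frame(ref, alt) else "frameshift"
        print(ref, alt, classify_variant(ref, alt), frame)
```

The in-frame check matters in coding sequence because a net length change that is not a multiple of three shifts the reading frame downstream of the variant.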
About the speaker: Dr. Kai Ye
Dr. Kai Ye is an assistant professor at the McDonnell Genome Institute at Washington University in St. Louis. He has a broad background in biology, pharmacology and computer science. During his undergraduate studies and graduate research projects, he worked on gene cloning and protein expression. Besides experimental work on protein-drug interactions, he worked on protein sequence analysis, molecular dynamics and docking during his PhD research. As a postdoctoral fellow at the European Bioinformatics Institute, he developed the first split-read software for short reads and applied it to data from the Cancer Genome Project and the 1000 Genomes Project. At the Division of Medical Statistics and Bioinformatics of Leiden University Medical Center in the Netherlands, he expanded his research to include transcriptome and epigenome analysis and the identification of disease-causing genes and pathways using machine learning approaches. Recently, he has been exploring cloud computing and investigating how to port tools to cloud computing environments. He currently chairs the indel/SV/CNV group of the Genome of the Netherlands, coordinating Dutch researchers and international experts on data processing and variant validation.