Week 9

The document discusses gene expression quantification and its applications in biomedical fields such as drug discovery, vaccine design, and biomarker identification. It covers various technologies for analyzing gene expression, including microarray and RNA sequencing, and highlights the importance of data repositories like the Gene Expression Omnibus and the Genomic Data Commons. Additionally, it outlines the history and evolution of genomic sequencing and transcriptomics, emphasizing the significance of projects like The Cancer Genome Atlas in cancer research.

Uploaded by

smishra8094

Gene Expression: Quantification of Information Molecules and their Applications

Prof. Gajendra P.S. Raghava

Head, Center for Computational Biology

Web Site: [Link]

These slides were created using various resources, so no authorship is claimed for any slide.
Biomedical Applications

Concept Level
★ Proteome annotation ★ Drug discovery ★ Vaccine design ★ Biomarkers

Molecules or Objects
• Proteins & Peptides: structure prediction; subcellular localization; therapeutic application; ligand binding
• Gene Expression: disease biomarkers; drug biomarkers; mRNA expression; copy number variation
• Chemoinformatics: drug design; chemical descriptor; QSAR models; personalized inhibitors
• Image annotation: image classification; medical images; disease classification; disease diagnostics
Molecular Biology Overview
[Figure: cell nucleus, chromosome, gene (DNA), gene transcript (mRNA, single strand), protein]
History of genome sequencing
• 1977 bacteriophage φX174 (5386 bp, 11 genes)
• 1981 mitochondrial genome (16,568 bp)
• 1986 chloroplast genome (120,000 bp)
• 1995 Haemophilus influenzae (1.8 Mb)
• 1996 Saccharomyces whole genome (12.1 Mb)
• 1997 E. coli (4.6 Mb; 4200 proteins)
• 1998 Caenorhabditis elegans (97 Mb; 19,000 genes)
• 2000 Arabidopsis thaliana (115 Mb; 30,000 genes)
• 2001 mouse (1 year!)
• 2001 Homo sapiens (2 projects)
• 2005 Pan, rice
• 2006 Populus
Analysing the flow of genetic information

Structural genomics
• DNA (Genome), in the nucleus: genome mapping; genome sequencing; genome annotations

Functional genomics
• pre-mRNA and mRNA (Transcriptome): DNA arrays and chips; RNA sequencing; (semi) qRT-PCR; Northern blot + hybridization; transcriptional fusions
• Proteins (Proteome), in the cytoplasm: 2D electrophoresis; gel-free methods; mass spectrometry; protein sequencing; translational fusions; immunodetection; enzyme activities
• Metabolites (Metabolome): chromatography; mass spectrometry; NMR

Related fields shown in the figure: Metabolomics, Glycomics (Sugars), Lipidomics (Lipids)
World of OMICs
[Figure: organ and tissue, cell nucleus, chromosome (23 pairs), chromatin]
• Genomics: DNA (4 chemicals: A, T, G, C), 3×10^9 bases
• Epigenomics: chromatin marks (labelled M and Ac in the figure)
• Transcriptomics: mRNA (copies), miRNA, non-coding RNA
• Proteomics: protein (20 chemicals: A, C, D ..)
• Glycomics: sugars attached to proteins
The evolution of transcriptomics
• Hybridization-based, P. Brown et al.: gene expression profiling using spotted cDNA microarray: expression levels of known genes
• Hybridization-based, Affymetrix: whole-genome expression profiling using tiling array: identifying and profiling novel genes and splicing variants
• 2008, many groups: mRNA-seq: direct sequencing of mRNAs using next generation sequencing (NGS) techniques
History
➢ 1980s: antibody-based assay (protein chip?)

➢ ~1991: high-density DNA-synthetic chemistry (Affymetrix/oligo chips)

➢ ~1995: microspotting (Stanford Univ/cDNA chips)

➢ Later improvements: replacing the porous surface with a solid surface, replacing the radioactive label with a fluorescent label, and improving sensitivity
Stanford/cDNA chip
[Figure: flow diagram of cDNA chip microarray technology, which detects the relative expression of each gene]
cDNA Microarray Technology

Major Steps
1. Spot cloned cDNAs onto a glass microscope slide

2. Label 2 RNA samples with 2 different colors of fluorescent dye

3. Mix two labeled RNAs and hybridize to the chip

4. Make two scans - one for each color

5. Calculate ratios of amounts of each RNA that bind to each spot

Gene Expression Data

For p genes on n slides: p is O(10,000) and n is O(10-100), but growing.

Gene   slide 1   slide 2   slide 3   slide 4   slide 5   …
1         0.46      0.30      0.80      1.51      0.90   ...
2        -0.10      0.49      0.24      0.06      0.46   ...
3         0.15      0.74      0.04      0.10      0.20   ...
4        -0.45     -1.03     -0.79     -0.56     -0.32   ...
5        -0.06      1.06      1.35      1.09     -1.09   ...

Each entry is a gene expression level. For example, the value 1.09 is the expression level of gene 5 in slide 4:

expression level = log2(Red intensity / Green intensity)

These values are conventionally displayed on a red (>0), yellow (0), green (<0) scale.
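
The log-ratio above can be sketched in a few lines of Python. This is only a minimal illustration: the spot intensities and gene names are made up, and the function names `log2_ratio` and `display_color` are hypothetical, not part of any microarray package.

```python
import math

def log2_ratio(red: float, green: float) -> float:
    """Return log2(red / green) for one spot on a two-colour array."""
    if red <= 0 or green <= 0:
        # Assumption: intensities are already background-corrected and positive.
        raise ValueError("intensities must be positive")
    return math.log2(red / green)

def display_color(value: float) -> str:
    """Map a log ratio to the red / yellow / green convention from the slide."""
    if value > 0:
        return "red"
    if value < 0:
        return "green"
    return "yellow"

# Hypothetical (red, green) intensities for three spots.
spots = {"gene_A": (2130.0, 1000.0), "gene_B": (500.0, 1000.0), "gene_C": (800.0, 800.0)}
ratios = {gene: log2_ratio(r, g) for gene, (r, g) in spots.items()}

for gene, value in ratios.items():
    print(gene, round(value, 2), display_color(value))
```

A value of +1 means the red-labelled sample has twice the signal of the green one, and -1 means half, which is why the log scale treats up- and down-regulation symmetrically.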
Affymetrix Expression Arrays

Flow chart of Affymetrix

Comparison of two technologies
Stanford/cDNA chip and Affymetrix/oligo chip

• Technology
  cDNA Microarray: hybridization-based, uses cDNA probes (longer sequences).
  Affymetrix GeneChip: oligonucleotide-based, uses short 25-mer probes.
• Probe Type
  cDNA Microarray: double-stranded cDNA (500–5,000 bp).
  Affymetrix GeneChip: short oligonucleotides (~25 nucleotides).
• Target Labeling
  cDNA Microarray: fluorescent dyes (Cy3/Cy5) for co-hybridization.
  Affymetrix GeneChip: single label (usually biotin, detected with fluorescent streptavidin).
• Data Acquisition
  cDNA Microarray: relative expression comparison of two samples.
  Affymetrix GeneChip: absolute or relative quantification per sample.
• Sensitivity
  cDNA Microarray: lower sensitivity, especially for low-abundance transcripts.
  Affymetrix GeneChip: higher sensitivity due to specific oligonucleotides.
• Quantification
  cDNA Microarray: ratios of fluorescence intensities between samples.
  Affymetrix GeneChip: signal intensity corresponds to expression level.
• Cost
  cDNA Microarray: relatively lower.
  Affymetrix GeneChip: relatively higher.
• Examples of Use
  cDNA Microarray: comparison of gene expression between two conditions.
  Affymetrix GeneChip: genome-wide expression profiling, genotyping.
Analysis of Microarray Data
• Analysis of images
• Preprocessing of gene expression data
• Normalization of data
   ◦ Subtraction of background noise
   ◦ Global/local normalization
   ◦ Housekeeping genes (or same gene)
   ◦ Expression as a ratio (test/reference) in log scale
• Differential gene expression
   ◦ Use repeats and calculate significance (t-test)
   ◦ Assess the significance of fold change with a statistical method
• Clustering
   ◦ Supervised/Unsupervised (Hierarchical, K-means, SOM)
• Prediction or supervised machine learning (SVM)
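
As a rough sketch of two of the steps listed above (global normalization and a t-test on repeats), the following uses only the Python standard library. It assumes log ratios as input, uses median centering as one possible global normalization, and computes Welch's t statistic. The replicate values are invented for illustration, and the function names are hypothetical.

```python
import math
import statistics

def median_center(slide):
    """Global normalization sketch: subtract the slide median from every log ratio."""
    m = statistics.median(slide)
    return [x - m for x in slide]

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two groups.

    Needs at least two replicates per group and non-zero variance overall.
    """
    va, vb = statistics.variance(a), statistics.variance(b)
    na, nb = len(a), len(b)
    se2 = va / na + vb / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical log2 expression values of one gene across replicate slides.
test_group = [1.51, 1.35, 1.09, 1.20]
reference_group = [0.10, 0.24, 0.06, 0.20]

t, df = welch_t(test_group, reference_group)
print(f"t = {t:.2f}, df = {df:.2f}")
print(median_center([1.0, 2.0, 3.0]))
```

Turning t and df into a p-value requires the t distribution, which the standard library does not provide; in practice a statistics package would be used for that step, along with a multiple-testing correction because thousands of genes are tested at once.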
Videos on Microarray
• [Link] (animation)
• [Link]
• [Link]
What is RNA-Seq?
RNA-Seq is the process of sequencing the transcriptome, which includes protein-coding and non-coding transcripts.

Applications:
• Gene (exon, isoform) expression estimation
• Differential gene (exon, isoform) expression analysis
• Transcriptome assembly: mapping exon and intron boundaries and splice junctions
• Discovery of novel transcribed regions
• Analysis of alternative splicing
Overview of RNA-Seq
Transcriptome profiling using NGS
Sequencing using RNA-Seq technology
(Major Steps)
1. RNA Isolation: Extract total RNA from cells or tissues.
2. mRNA Enrichment: Enrich mRNA using poly(A) selection.
3. cDNA Synthesis: Convert RNA to cDNA using reverse transcription.
4. Fragmentation: Break cDNA into smaller fragments for library preparation.
5. Adapter Ligation: Attach sequencing adapters to both ends of the cDNA fragments.
6. Library Amplification: Amplify the library using PCR to generate enough copies.
7. Quality Control: Assess the library's size and purity using tools like Bioanalyzer.
8. Sequencing: Sequence the cDNA library using a sequencer (e.g., Illumina).
9. Data Preprocessing: Trim adapters and remove low-quality reads.
10. Read Mapping/Assembly: Map or assemble the reads to reconstruct the transcriptome sequence.
FASTQ files
Line 1: Sequence identifier (starts with "@")
Line 2: Raw sequence
Line 3: Separator line starting with "+" (carries no additional information)
Line 4: Quality values for the sequence
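
The four-line record layout can be illustrated with a small parser. This is a minimal sketch, not a replacement for a real FASTQ library: it assumes unwrapped records (exactly four lines each) and Phred+33 quality encoding, and the two reads and the mean-quality threshold of 20 are made up for the example.

```python
import io

def read_fastq(handle):
    """Yield (identifier, sequence, qualities) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record")
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ")
        # Assumption: Phred+33 encoding, so quality = ASCII code - 33.
        yield header[1:], seq, [ord(c) - 33 for c in qual]

# Two hypothetical reads: the first has high quality, the second low.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!#I\n")
records = list(read_fastq(example))

# Simple preprocessing sketch: keep reads whose mean quality is at least 20.
kept = [name for name, seq, q in records if sum(q) / len(q) >= 20]
print(kept)
```

Dedicated trimming tools do more than this (adapter removal, per-base trimming), but the sketch shows how line 4 maps character by character onto line 2.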
Analysis Pipeline
Short reads to differential expression
1. Raw sequence data (FASTQ files); QC by FastQC/R
2. Reads mapping
   • Unspliced mapping: BWA, Bowtie
   • Spliced mapping: TopHat, MapSplice
3. Mapped reads (SAM/BAM files); QC by RNA-SeQC
4. Expression quantification
   • Summarize read counts
   • FPKM/RPKM (Cufflinks)
5. DE testing
   • DESeq, edgeR, etc.
   • Cuffdiff
6. List of DE genes
7. Functional interpretation: function enrichment; infer networks; integrate with other data
8. Biological insights & hypothesis

Quantification of gene expression using RNA-seq
1. Mapping: alignment to genome (HISAT2, STAR)
2. Count reads per transcript, giving read count tables
3. Normalization
   • RPKM: Reads Per Kilobase of transcript per Million mapped reads
   • FPKM: Fragments Per Kilobase of transcript per Million mapped reads
   • TPM: Transcripts Per Million (sometimes expanded as Transcripts Per Kilobase Million)
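
A minimal sketch of these normalizations, using the usual textbook definitions and a made-up count table (the gene names, counts and lengths are hypothetical). FPKM uses the same formula as RPKM, with fragment counts in place of read counts.

```python
def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads."""
    total = sum(counts.values())
    # count / (length in kb) / (total in millions) = count * 1e9 / (total * length in bp)
    return {g: counts[g] * 1e9 / (total * lengths_bp[g]) for g in counts}

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to one million."""
    rate = {g: counts[g] / lengths_bp[g] for g in counts}
    scale = sum(rate.values())
    return {g: rate[g] * 1e6 / scale for g in counts}

# Hypothetical read counts and transcript lengths (bp).
counts = {"geneA": 100, "geneB": 300, "geneC": 600}
lengths_bp = {"geneA": 1000, "geneB": 2000, "geneC": 3000}

rpkm_values = rpkm(counts, lengths_bp)
tpm_values = tpm(counts, lengths_bp)
print(rpkm_values)
print(tpm_values)
```

TPM values always sum to one million within a sample, which makes them easier to compare across samples than RPKM or FPKM values, whose sum varies from sample to sample.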
Patient → Technologies → Data Analysis → Integration and interpretation
• Genomics (WGS, WES): point mutation; small indels; copy number variation; structural variation. Interpretation: functional effect of mutation
• Transcriptomics (RNA-Seq): differential expression; gene fusion; alternative splicing; RNA editing. Interpretation: network and pathway analysis
• Epigenomics (Bisulfite-Seq, ChIP-Seq): methylation; histone modification; transcription factor binding. Interpretation: integrative analysis
Outcome: further understanding of cancer and clinical applications

Shyr D, Liu Q. Biol Proced Online. (2013) 15, 4

Concept of Single Cell

The basic unit of life

Why single cell gene expression?
Improvements in scRNA-seq methods
[Figure: scale increasing from ~10 to ~100 000; from Svensson et al., 2018. [Link]]
Benefits of single cell sequencing
Opens the door to several biological and clinical questions

✓ Understanding heterogeneous samples
   E.g. analyse cellular heterogeneity during immune or stem cell development
✓ Identification and analysis of rare cell types
   E.g. circulating tumor cells from liquid biopsy
✓ Understanding cellular transitions and switches in cell state
✓ Dissecting complex infections and revealing drug resistance genotypes
Gene Expression Omnibus (GEO)
➢ A public repository for gene expression data
➢ It is a primary database: data are submitted by the scientific community.
➢ The repository maintains functional genomics data.
➢ MIAME (Minimum Information About a Microarray Experiment)-compliant data submissions.
➢ Array- and sequence-based data are accepted.
➢ Online resource for retrieval of gene expression data
➢ Convenient for deposition of data, as required by funding agencies/journals.
The GDC Data Portal: An Overview
The Genomic Data Commons (GDC) Data Portal provides users with web-based access to data from cancer genomics studies.
Key GDC Data Portal features
• Open, granular access to information about all datasets available in the GDC.
• Advanced search and visualization-assisted filtering of data files.
• Data visualization tools to support the analysis and exploration of data (including on a gene and mutation level from Open-Access MAF files).
• Cart for collecting data files of interest.
• Authentication using eRA Commons credentials and authorization using dbGaP for access to controlled data files.
• Secure data download directly from the cart or using the GDC Data Transfer Tool.
• For more information about available datasets, see the GDC Website.

Accessing the GDC Data Portal

The GDC Data Portal is accessible using a web browser such as Chrome, Firefox, and Microsoft Edge at the following URL: [Link]
The front page displays a summary of all available datasets.
GDC data portal

• Data Category: SNV, transcriptome profiling, CNV, sequencing reads, biospecimens, clinical, DNA methylation, somatic mutation, combined nucleotide variation

• Data Type: raw simple somatic mutation, annotated somatic mutation, aligned reads, gene expression quantification, and so on

• Clinical data: collection of data related to patient diagnosis, demographics, exposures, laboratory tests, and family relationships

• Data Retrieval: data is searchable in the API, Data Portal, or Legacy Archive.
[Link]
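
As a hedged illustration of retrieving data through the API, the sketch below builds a query for gene expression quantification files from one project. The endpoint URL, the field names (`cases.project.project_id`, `data_type`) and the query parameters are assumptions based on the public GDC API and should be checked against its current documentation. The helper names `build_query` and `fetch` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed public endpoint of the GDC API for file metadata.
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"

def build_query(project_id, data_type, size=5):
    """Build a files query URL filtered by project and data type."""
    filters = {
        "op": "and",
        "content": [
            {"op": "in", "content": {"field": "cases.project.project_id", "value": [project_id]}},
            {"op": "in", "content": {"field": "data_type", "value": [data_type]}},
        ],
    }
    params = {
        "filters": json.dumps(filters),
        "fields": "file_id,file_name,data_type",
        "format": "JSON",
        "size": str(size),
    }
    return GDC_FILES_ENDPOINT + "?" + urllib.parse.urlencode(params)

def fetch(url, timeout=30):
    """Send the request and decode the JSON response (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)

url = build_query("TCGA-BRCA", "Gene Expression Quantification")
print(url)
```

Calling `fetch(url)` would send the request. It is left out of the example so that the sketch runs without network access.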
The Cancer Genome Atlas (TCGA)

• Launched in 2006 as a pilot, expanded in 2009, ended in 2017

• NIH-funded program to perform a comprehensive and integrated analysis of key genomic/molecular features of many cancers

• A ‘marker paper’ in each project to provide fundamental insights

• Make the data publicly available to the research community

• Serves as a model for the power of teamwork in science.

U.K., France, Netherlands, Canada, U.S.

• Uveal melanoma chosen as one of 10 rare cancers included

TCGA history
• Initiated in 2005 to catalogue genetic mutations responsible for cancer

• TCGA is supervised by the Center for Cancer Genomics and the National Human Genome Research Institute.

• A three-year pilot project, begun in 2006, focused on characterization of three types of human cancers: glioblastoma, lung, and ovarian cancer.

• In 2009, it expanded into phase II, which planned to complete the genomic characterization and sequence analysis of 20–25 different tumor types by 2014.

• Contains gene expression, copy number variation, SNP genotyping, DNA methylation data, etc.

• There are 3554 authorized requesters associated with the TCGA study (currently).
Project      Cases     Seq     Exp     SNV     CNV    Meth  Clinical  Clinical Supplement
TCGA-BRCA    1,098   1,098   1,097   1,044   1,098   1,095     1,098                1,098
TCGA-GBM       617     406     166     396     599     423       617                  617
TCGA-OV        608     575     492     443     597     602       608                  608
TCGA-LUAD      585     582     519     569     518     579       585                  585
TCGA-UCEC      560     559     559     542     558     559       560                  560
TCGA-KIRC      537     535     534     339     534     533       537                  537
TCGA-HNSC      528     528     528     510     526     528       528                  528
TCGA-LGG       516     516     516     513     515     516       516                  516
TCGA-THCA      507     507     507     496     505     507       507                  507
TCGA-LUSC      504     504     504     497     504     503       504                  504
TCGA-PRAD      500     498     498     498     498     498       500                  500
TCGA-SKCM      470     470     469     470     470     470       470                  470
TCGA-COAD      461     460     459     433     460     458       461                  461
TCGA-STAD      443     443     439     441     443     443       443                  443
TCGA-BLCA      412     412     412     412     412     412       412                  412
TCGA-LIHC      377     377     376     375     376     377       377                  377
TCGA-CESC      307     307     307     305     302     307       307                  307
TCGA-KIRP      291     291     291     288     290     291       291                  291
TCGA-SARC      261     261     261     255     261     261       261                  261
TCGA-LAML      200     195     188     149     200     140       200                  200
TCGA-ESCA      185     185     184     184     185     185       185                  185
TCGA-PAAD      185     185     178     183     185     184       185                  185
TCGA-PCPG      179     179     179     179     179     179       179                  179
TCGA-READ      172     171     167     158     167     165       172                  172
TCGA-TGCT      150     150     150     150     150     150       150                  150
TCGA-THYM      124     124     124     123     124     124       124                  124
TCGA-KICH      113      66      66      66      66      66       113                  113
TCGA-ACC        92      92      80      92      92      80        92                   92
TCGA-MESO       87      87      87      83      87      87        87                   87
TCGA-UVM        80      80      80      80      80      80        80                   80
TCGA-DLBC       58      48      48      37      50      48        58                   58
TCGA-UCS        57      57      57      57      57      57        57                   57
TCGA-CHOL       51      51      36      51      36      36        51                   51
Total       11,315  10,999  10,558  10,418  11,124  10,943    11,315               11,315
Types of data

• Core dataset:
   ➢ Pathology report
   ➢ Histology images
   ➢ Clinical data
   ➢ Whole exome-seq
   ➢ SNP 6.0 array
   ➢ mRNAseq
   ➢ miRNAseq
   ➢ Methylation array

• Future datasets:
   ❖ 50x Whole-genome sequencing
   ❖ Bisulfite sequencing
   ❖ Protein Array
Single Cell Expression Atlas

Discover and interpret gene expression analysis results at single cell level

[Link]/gxa/sc/
Http://[Link]/raghava/cancerdr/
Overall Architecture of CancerlivER
Cancer Biomarkers
◆ A biomarker is a biological molecule found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or of a condition or disease (National Cancer Institute (NCI))

Biomarkers
• Based on Disease State: Diagnostic, Prognostic, Predictive
• Based on Biomolecules: DNA Biomarker, RNA Biomarker, Protein Biomarker, Glyco Biomarker
Thank You
[Link]

Questions & Answers Session
