Quantification of Gene Expression:  mRNA Transcript Analysis

Research Highlight

Nature Reviews Molecular Cell Biology 8, 946 (2007)

A map of mRNA localization

http://fly-fish.ccbr.utoronto.ca

About Fly-FISH

This database documents the expression patterns of Drosophila mRNAs at the subcellular level during early embryogenesis. A high-resolution, high-throughput fluorescence detection method is used to detect expressed mRNAs. The overall findings and implications of the work performed thus far are summarized in Lécuyer et al. (2007). The data can be accessed by searching the localization categories, searching for specific genes or browsing the list of tested genes.



Contrary to expectations, many Drosophila melanogaster mRNA transcripts appear to be localized prior to translation, report Henry Krause and colleagues in Cell.

Localization of mRNA before translation offers some advantages over immediate translation; for example, several rounds of translation can occur at a subcellular location, bypassing the energetically costly need to move proteins individually. However, estimates had predicted that only 1–10% of mRNA transcripts are localized prior to translation.

To test these estimates, researchers used high-resolution fluorescent in situ hybridization (FISH) to analyse 2,314 embryonically expressed D. melanogaster mRNAs. Strikingly, 71% of the embryonically expressed mRNAs were specifically localized.

mRNA transcripts localized into approximately 35 categories, including subembryonic categories and subcellular categories. Given the diversity and frequency of the localization patterns observed, and given the close correlation between mRNA localization and protein translation, the authors propose that many, if not most, cellular functions are regulated by mRNA localization. For example, the fact that the temporal localization of anillin mRNA, which encodes an actin-interacting protein, resembles subsequent actin-filament distribution suggests that mRNA localization is involved in the organization of cytoskeletal networks. To make the most of their extensive data, the Krause team catalogued their findings on Fly-FISH, which is searchable by genes and localization categories. As such, it offers promise for various lines of inquiry. For instance, the spatio–temporal data it contains may provide insight into gene regulatory networks, and the ability to assess mRNA localization with colocalization of other mRNAs and proteins may help to uncover the functions of uncharacterized genes.

Global Analysis of mRNA Localization Reveals a Prominent Role in Organizing Cellular Architecture and Function

Eric Lécuyer, Hideki Yoshida, Neela Parthasarathy, Christina Alm, Tomas Babak, Tanja Cerovina, Timothy R. Hughes, Pavel Tomancak and Henry M. Krause



Although subcellular mRNA trafficking has been demonstrated as a mechanism to control protein distribution, it is generally believed that most protein localization occurs subsequent to translation. To address this point, we developed and employed a high-resolution fluorescent in situ hybridization procedure to comprehensively evaluate mRNA localization dynamics during early Drosophila embryogenesis. Surprisingly, of the 3370 genes analyzed, 71% of those expressed encode subcellularly localized mRNAs. Dozens of new and striking localization patterns were observed, implying an equivalent variety of localization mechanisms. Tight correlations between mRNA distribution and subsequent protein localization and function, indicate major roles for mRNA localization in nucleating localized cellular machineries. A searchable web resource documenting mRNA expression and localization dynamics has been established and will serve as an invaluable tool for dissecting localization mechanisms and for predicting gene functions and interactions.

Nature - January 2007 - Focus on RNA

RNA has occupied a pivotal position in the 'central dogma' of molecular biology, which states that information flows from DNA through RNA to proteins. In this issue, we feature a collection of articles that discuss the diverse functional roles of RNA in biological systems and highlight recent discoveries in RNA chemical biology, including advances in transcription, RNA structural biology, RNA interference and RNA engineering.
http://www.nature.com/nchembio/focus/rna/index.html
  • Chemical crosshairs on the central dogma - Aseem Z Ansari
  • RNA learns from antisense - David R Corey
  • RNA at Santa Cruz - Mirella Bucci
  • Synthetic RNA circuits - Eric A Davidson and Andrew D Ellington
  • Natural expansion of the genetic code - Alexandre Ambrogelly, Sotiria Palioura and Dieter Söll
  • Slicer and the Argonautes - Niraj H Tolia and Leemor Joshua-Tor


RNA Analysis Tools

http://www.ncbi.nlm.nih.gov/Class/NAWBIS/Modules/RNA/powerpoint/rna_files/frame.html

developed by
Kristi L. Holmes, Ph.D.
The Bernard Becker Medical Library
Washington University in St. Louis School of Medicine
Bioinformatics @ Becker
Rana C. Morris, Ph.D.
National Center for Biotechnology Information
National Institutes of Health
NCBI Home Page
Nicola Gaedeke, Ph.D.
Biotools.info
Berlin, Germany
www.biotools.info
Donna Messersmith, Ph.D.
Scientist and President
Labs-Now LLC (Learn About Biomedical Science-Now)
www.labs-now.com


Quantitative Analysis of Nucleic Acids - the Last Few Years of Progress

Chunming Ding* and Charles R. Cantor*,†

*Bioinformatics Program and Center for Advanced Biotechnology, Boston University, Boston, Massachusetts 02215, USA
†SEQUENOM Inc., San Diego, California 92121, USA

Journal of Biochemistry and Molecular Biology, Vol. 37, No. 1, January 2004, pp. 1-10


DNA and RNA quantifications are widely used in biological and biomedical research. In the last ten years, many technologies have been developed to enable automated and high-throughput analyses. In this review, we first give a brief overview of how DNA and RNA quantifications are carried out. Then, five technologies (microarrays, SAGE, differential display, real time PCR and real competitive PCR) are introduced, with an emphasis on how these technologies can be applied and what their limitations are. The technologies are also evaluated in terms of a few key aspects of nucleic acids quantification such as accuracy, sensitivity, specificity, cost and throughput.






Intra- and Interspecific Variation in Primate Gene Expression Patterns

  Enard et al.  (2002)   Science 296 (5566): 340-343

Although humans and their closest evolutionary relatives, the chimpanzees, are 98.7% identical in their genomic DNA sequences, they differ in many morphological, behavioral, and cognitive aspects. The underlying genetic basis of many of these differences may be altered gene expression. We have compared the transcriptome in blood leukocytes, liver, and brain of humans, chimpanzees, orangutans, and macaques using microarrays, as well as protein expression patterns of humans and chimpanzees using two-dimensional gel electrophoresis. We also studied three mouse species that are approximately as related to each other as are humans, chimpanzees, and orangutans. We identified species-specific gene expression patterns indicating that changes in protein and gene expression have been particularly pronounced in the human brain.

 



Gene expression

by  M.Tevfik Dorak, MD, PhD

http://dorakmt.tripod.com/genetics/realtime.html

Genes are transcribed in the 5' to 3' direction of the sense strand by RNA polymerases. Strictly speaking, it is the antisense (template) strand that is read, in the 3' to 5' direction, and the resulting RNA has the same sequence as the sense strand (with uracil in place of thymine). It is possible that one gene is encoded on the sense strand and another on the antisense strand in the opposite direction. Overlapping genes are also possible and are frequent in viruses, plasmids and phages. The regions relevant to gene expression are as follows.
Enhancers: A sequence on either side of the gene (cis-acting = on the same chromosome) that stimulates a specific promoter. It is not transcribed.
Promoters: A sequence (or sequences) in the close vicinity of the transcription initiation site, 5' (upstream) to the gene. It may be a general (cis-acting) or tissue/cell-specific one (cis-, or trans-acting = on a different chromosome). It is the initial binding site for RNA polymerase. Transcription factors bind to the promoters and allow RNA polymerase to act. The promoter itself is not transcribed. The common promoter elements, the TATA and CAAT boxes, are found about 30 bp and 75 bp, respectively, upstream of the transcription initiation site.
Transcription initiation (cap) site: This is where the transcription of DNA to immature (precursor) pre-mRNA (nuclear RNA or nRNA) starts. It is immediately 5' to the gene. A 7-methylated GTP cap is added to the beginning of the mRNA at this position (to protect it against the activity of 5'-exonuclease). From here to the translation initiation site, the sequence codes for the 5'-untranslated or UT (ribosome-binding) region and the signal peptide. The 5'-UT region is transcribed but not translated. It contains the site (the leader sequence) at which ribosomes initially bind to mRNA to start translation. The signal sequence is translated at the N-terminal and directs the protein to its correct cellular location (endoplasmic reticulum, Golgi apparatus, cell membrane, etc.) or outside the cell through the cell membrane, and is finally removed at the final destination. The events that occur to mRNA before it leaves the nucleus are collectively called RNA processing or post-transcriptional modification (capping, polyadenylation and splicing).
Translation initiation site (ATG): This sequence represents the beginning (N-terminal) of translated protein. [5' of DNA codes for N-terminal of a polypeptide]. It codes for a methionine but methionine is subject to post-translational elimination most of the time. Thus, each mature mRNA's first codon is for methionine (AUG) but not all polypeptides start with methionine. Translation takes place in the ribosome in the cytoplasm. 
Exon-intron boundaries: Each intron starts with GpT and ends with ApG. The introns are spliced out (a post-transcriptional modification, alongside 5' capping and 3' polyadenylation). Although they are not represented in the resultant polypeptide, they may contain some regulatory sequences.
Stop codon: One of three codons that mark the end of translation. The triplet before it codes for the last amino acid of the polypeptide chain (C-terminal). [3' of DNA codes for C-terminal of a polypeptide.]
Untranslated Regions (UTRs): The 5' UTR usually contains gene- or developmental stage-specific and common regulators of expression (motifs, boxes, response or binding elements), and the 3' UTR is also involved in gene expression although it does not contain well-known transcription control sites. 3' UTR sequences (called cytoplasmic polyadenylation elements or adenylation control elements) can control the nuclear export, polyadenylation status, subcellular targeting and rates of translation and degradation of mRNA. The involvement of the 3' UTR is well documented in controlling male and female gametogenesis and in early embryonic development. Myotonic dystrophy is a disease caused by the expansion of triplet repeats in the 3' UTR of a protein kinase gene.
Polyadenylation signal: This sequence lies downstream of the stop codon, in the 3' untranslated region, and directs the addition of a poly-A tail, which varies in length (histone mRNAs lack a poly-A tail). The poly(A) tail is believed to stimulate translation initiation, whereas its shortening triggers entry of the mRNA into the decay pathway.

Position effects and tissue/cell-specific expression of genes in gene therapy depend on the effects of enhancers and promoters. The triplets on the DNA are transcribed to codons on mRNA (in the nucleus) and, after the intronic sequences are spliced out, the codons are read by the anti-codons of tRNA to be translated to amino acids. After various post-translational modifications (which may include phosphorylation, glycosylation, etc.), a protein is made. Thus, the stages of protein synthesis are: transcription, splicing (nuclear processing), translation and post-translational modifications.

See also paramutation (an allelic interaction that results in meiotically heritable changes in gene expression), methylation, genomic imprinting and allelic exclusion (glossary). Such epigenetic changes, and especially their heritability, have been among the most hotly debated topics of recent years (see Hidden Inheritance by G Vines in New Scientist, 28 Nov 1998, pp. 27-30; Epigenetics: Special Issue of Science, 2001). The National Fragile X Foundation website explains the molecular basis of a well-known epigenetic disease, fragile X syndrome.

Common techniques to measure gene expression are Northern blotting, ribonuclease protection assay and reverse-transcription (RT) PCR. These are briefly described below.
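As a rough illustration of how the regions listed above sit along a mature (already spliced) mRNA, the sketch below splits a sequence into 5' UTR, coding region, stop codon and 3' UTR. The function name `split_mrna`, the toy sequence, and the rule of treating the first AUG as the start codon are assumptions made for this example; real start-site selection is more involved.

```python
# Minimal sketch: locate the regions of a mature mRNA (5'->3').
# Assumption: the first AUG is the translation start site.
STOP_CODONS = {"UAA", "UAG", "UGA"}

def split_mrna(mrna):
    """Return the 5' UTR, coding region, stop codon and 3' UTR, or None."""
    start = mrna.find("AUG")
    if start == -1:
        return None  # no start codon found
    # Walk codon by codon, in frame with the start codon.
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:
            return {
                "five_utr": mrna[:start],   # transcribed, not translated
                "coding": mrna[start:i],    # begins with AUG (methionine)
                "stop": codon,              # marks the end of translation
                "three_utr": mrna[i + 3:],  # includes the poly(A) tail
            }
    return None  # no in-frame stop codon

# Hypothetical toy transcript, for illustration only.
example = "GGCACC" + "AUGGCCUUC" + "UAA" + "GCUUAAAAAA"
regions = split_mrna(example)
print(regions)
```

The 5' UTR and 3' UTR returned here are simply the flanking stretches of sequence; the sketch does not search for regulatory motifs inside them.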



Overview: Gene Structure

Talk presentations by David Wishart, Departments of Computing Science and Biological Sciences, University of Alberta, Edmonton, Alberta, Canada:
Genes and Gene Expression

The gene is the fundamental unit of inheritance and the ultimate determinant of all phenotypes. The DNA of a normal human cell contains an estimated 50,000 to 100,000 genes, but only a fraction of these are used (or “expressed”) in any particular cell at any given time. For example, genes specific for erythroid cells, such as the hemoglobin genes, are not expressed in brain cells. According to the “central dogma” of molecular biology, a gene exerts its effects by having its DNA “transcribed” into a messenger RNA (mRNA), which is, in turn, “translated” into a protein, the final effector of the gene's action. Thus, molecular biologists often investigate gene “expression” or “activation,” by which is meant the process of transcribing DNA into RNA, or translating RNA into protein. The process of transcription involves creating a perfect RNA copy of the gene using the DNA of the gene as a “template.” Translation of mRNA into protein is a somewhat more complex process, since the structure of the gene's protein is “encoded” in the mRNA, and that structural message must be decoded during translation.

 

Functional Components of the Gene

Every gene consists of several functional components, each involved in a different facet of the process of gene expression (Figure A). Broadly speaking, however, there are two main functional units: the “promoter” region and the “coding” region. The promoter region controls when and in what tissue a gene is expressed. For example, the promoter of the hemoglobin gene is responsible for its expression in erythroid cells and not in brain cells. How is this tissue-specific expression achieved? In the DNA of the gene's promoter region, there are specific structural elements, “nucleotide sequences” (see “Structural Considerations” below), that permit the gene to be expressed only in an appropriate cell. These are the elements in the hemoglobin gene that instruct an erythroid cell to transcribe hemoglobin mRNA from that gene. These structures are referred to as “cis”-acting elements because they reside on the same molecule of DNA as the gene. In some cases, other tissue type-specific “cis”-acting elements, called “enhancers,” reside on the same DNA molecule, but at great distances from the coding region of the gene. In the appropriate cell, the “cis”-acting elements bind protein factors that are physically responsible for transcribing the gene. These proteins are called “trans”-acting factors because they reside in the cell's nucleus separate from the DNA molecule bearing the gene. For example, brain cells would not have the right “trans”-acting factors that bind to the hemoglobin promoter, and therefore brain cells would not express hemoglobin. They would, however, have “trans”-acting factors that bind to neuron-specific gene promoters.

Figures: Gene expression. A gene's DNA is transcribed into mRNA which is, in turn, translated into protein. The functional components of a gene are schematically diagramed here. Areas of the gene destined to be represented in mature mRNA are called exons, and intervening areas of DNA between exons are called introns. The portion of the gene that controls transcription, and therefore expression, is the promoter. This control is exerted by specific nucleotide sequences in the promoter region (so-called “cis”-acting factors) and by proteins (so-called “trans”-acting factors) that must interact with promoter DNA and/or RNA polymerase II in order for transcription to occur. The primary transcript is the RNA molecule made by RNA polymerase II that is complementary to the entire stretch of DNA containing the gene. Before leaving the nucleus, the primary transcript is modified by splicing together exons (thus removing intron sequences), adding a cap to the 5´ end, and adding a poly-A tail to the 3´ end. Once in the cytoplasm, mature mRNA undergoes translation to yield a protein.


 

The structure of a gene's protein is specified by the gene's “coding” region. The coding region contains the information that directs an erythroid cell to assemble amino acids in the proper order to make the hemoglobin protein. How is this order of amino acids specified? As described in detail below, DNA is a linear polymer consisting of four distinguishable subunits called nucleotides. In the coding region of a gene, the linear sequence of nucleotides “encodes” the amino acid sequence of the protein. This genetic code is in triplet form so that every group of three nucleotides encodes a single amino acid. The 64 triplets that can be formed by four nucleotides exceed the number of amino acids used to make proteins (20). This makes the code degenerate and allows some amino acids to be encoded by several different triplets. The nucleotide sequence of any gene can now be determined (see below). By translating the code, one can derive a predicted amino acid sequence for the protein encoded by a gene.
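The triplet code and its degeneracy can be made concrete with a short sketch. It builds the standard genetic code table (64 codons mapped to 20 amino acids plus stop signals) and translates a toy coding sequence. The names `CODON_TABLE` and `translate` and the example sequence are chosen for illustration and are not part of the original text.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
# "*" marks a stop codon; letters are one-letter amino acid codes.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(coding_dna):
    """Translate a sense-strand coding sequence (5'->3') until a stop codon."""
    protein = []
    for i in range(0, len(coding_dna) - 2, 3):
        aa = CODON_TABLE[coding_dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

# Degeneracy: several codons can encode the same amino acid.
leucine_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "L")
distinct_amino_acids = set(CODON_TABLE.values()) - {"*"}

print(len(CODON_TABLE), len(distinct_amino_acids))  # 64 codons, 20 amino acids
print(leucine_codons)
print(translate("ATGGTGCACCTGACTTAA"))
```

Reading a sequence three bases at a time from a fixed starting point is what makes the choice of reading frame matter: shifting the start by one base yields an entirely different series of codons.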


Structural Considerations

Fine Structure 

The basic repeating units of the DNA polymer are nucleotides (Figure B). Nucleotides consist of an invariant portion, a five-carbon deoxyribose sugar with a phosphate group, and a variable portion, the “base.” Of the four bases that appear in the nucleotides of DNA, two are purines, adenine (A) and guanine (G), and two are pyrimidines, cytosine (C) and thymine (T). Nucleotides are connected to each other in the polymer through their phosphate groups, leaving the bases free to interact with each other through hydrogen bonding. This “base pairing” is specific, so that A interacts with T, and C interacts with G. DNA is ordinarily double-stranded, that is, two linear polymers of DNA are aligned so that the bases of the two strands face each other. Base pairing makes this alignment specific so that one DNA strand is a perfectly complementary copy of the other.

In every strand of a DNA polymer, the phosphate substitutions alternate between the 5´ and 3´ carbons of the deoxyribose molecules. Thus, there is a directionality to DNA: the genetic code reads in the 5´ to 3´ direction. In double-stranded DNA, the strand that carries the translatable code in the 5´ to 3´ direction is called the “sense” strand, while its complementary partner is the “antisense” strand.
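Base pairing and strand directionality together determine how the antisense strand is derived from the sense strand. The sketch below, with illustrative function names, pairs each base with its partner and then reverses the result so that the antisense strand is also written 5´ to 3´.

```python
# Watson-Crick pairing rules for DNA: A pairs with T, C pairs with G.
DNA_PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Base-by-base partner strand, in the same left-to-right order."""
    return "".join(DNA_PAIR[base] for base in strand)

def reverse_complement(sense_5_to_3):
    """Antisense strand written in its own 5'->3' direction."""
    return complement(sense_5_to_3)[::-1]

sense = "AACG"
print(complement(sense))          # partner bases, read 3'->5'
print(reverse_complement(sense))  # the same strand, read 5'->3'
```

The reversal is needed only because both strands are conventionally written 5´ to 3´; physically, the two strands run antiparallel and each base faces its partner directly.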

Figures: Structure of base-paired, double-stranded DNA. Each strand of DNA consists of a backbone of 5-carbon deoxyribose sugars connected to each other through phosphate bonds. Note that as one follows the sequence down the left-hand strand (A to C to G to T), one is also following the carbons of the deoxyribose ring, going from the 5´ carbon to the 3´ carbon. This is the basis for the 5´ to 3´ directionality of DNA. The 1´ carbon of each deoxyribose is substituted with a purine or pyrimidine base. In double-stranded DNA, bases face each other in the center of the molecule and base-pair via hydrogen bonds (dotted lines). Base-pairing is specific so that adenine pairs with thymine, and guanine pairs with cytosine.


Gross Structure

In eukaryotes, the coding regions of most genes are not continuous. Rather, they consist of areas that are transcribed into mRNA, the “exons,” which are interrupted by stretches of DNA that do not appear in mature mRNA, the “introns” (see Figures above). The functions of introns are not known with certainty. A purpose of some sort is implied by their conservation in evolution. However, their overall physical structure might be more important than their specific nucleotide sequences, since the nucleotide sequences of introns diverge more rapidly in evolution than do the sequences of exons. Overall, DNA that contains genes comprises a minority of total DNA. Between genes, there are vast stretches of untranscribed DNA that are assumed to play an important structural role. In the nucleus, DNA is not present as naked nucleic acid. Rather, DNA is found in close association with a number of accessory proteins, such as the histones, and in this form is called chromatin. Although many of DNA's accessory proteins have no known specific function, they generally appear to be involved in the correct packaging of DNA. For example, DNA's double helix is ordinarily twisted on itself to form a supercoiled structure. This structure must unwind partially during DNA replication and transcription. Some of the accessory proteins, for example, topoisomerases and histone acetylases, are involved in regulating this process. 

Summary

Genes specify the structure of proteins that are responsible for the phenotype associated with a particular gene. While the nucleus of every human cell contains 30,000 to 40,000 genes, only a fraction of them are expressed in any given cell at any given time. The “promoter” (with or without an “enhancer”) is the part of the gene that determines when and where it will be expressed. The “coding region” is the part of the gene that dictates the amino acid sequence of the protein encoded by the gene. DNA is a linear polymer of nucleotides. Ordinarily, the nucleotide bases of one strand of DNA interact with those of another strand (A with T, C with G) to make double-stranded DNA. In the cell's nucleus, DNA is associated with accessory proteins to make the structure called chromatin.



mRNA Transcript Analysis

Structural Considerations

The first step in gene expression is transcription of the genetic information in DNA into RNA. The individual building blocks of RNA, ribonucleotides, have the same structure as the deoxyribonucleotides in DNA, except that (1) the 2' carbon of the ribose sugar is substituted with an OH group instead of H; and (2) there are no thymine bases in RNA, only uracil (demethylated thymine), which also pairs with adenine by hydrogen bonding. Just like the DNA polymerases described above, the enzyme RNA polymerase II uses the nucleotide sequence of the gene's DNA as a template to form a polymer of ribonucleotides with a sequence complementary to the DNA template.

In order for transcription to be “correct,” RNA polymerase II must use the antisense strand of DNA as a template, begin transcription at the start of the gene, and end transcription at the end of the gene. The signals that ensure correct transcription are provided to the RNA polymerase II by DNA in the form of specific nucleotide sequences in the promoter of the gene. After reading and interpreting these signals, the RNA polymerase generates a primary RNA transcript that extends from the initiation site to the termination site in a perfect complementary match to the DNA sequence used as a template. However, not all transcribed RNA is destined to arrive in the cytoplasm as mRNA. Rather, by an incompletely understood process, sequences complementary to introns (see above) are excised from the primary transcript, and the ends of exon sequences are joined together in a process termed “splicing.”
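The two steps described above, copying the antisense strand into a complementary primary transcript and then excising intron sequences, can be sketched as follows. The function names, the toy gene, and the intron coordinates are all hypothetical; real splice sites are recognized by the splicing machinery rather than supplied as coordinates.

```python
# Pairing of a DNA template base with the RNA base incorporated opposite it.
DNA_TO_RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Build the RNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(DNA_TO_RNA_PAIR[base] for base in template_3_to_5)

def splice(primary_transcript, introns):
    """Remove introns given as sorted, half-open (start, end) intervals."""
    exons, position = [], 0
    for start, end in introns:
        exons.append(primary_transcript[position:start])
        position = end
    exons.append(primary_transcript[position:])
    return "".join(exons)

# Toy gene (sense strand, 5'->3'): exon 1, an intron beginning GT and
# ending AG, then exon 2.
sense = "ATGGCC" + "GTAAGTTTCAG" + "TTCTAA"
# The antisense (template) strand, written 3'->5' so it aligns with sense.
template = "".join(DNA_PAIR[base] for base in sense)

primary = transcribe(template)
mature = splice(primary, [(6, 17)])
print(primary)
print(mature)
```

Because the transcript is complementary to the antisense strand, it carries the same sequence as the sense strand with U in place of T, which the example confirms.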

In addition to splicing, the primary transcript is further modified by the addition of a methylated GTP “cap” at the 5´ end, and the addition of a stretch of anywhere from 20 to 40 A bases at the 3´ end. These modifications appear to promote the “translatability” and relative stability of mRNAs and help direct the subcellular localization of mRNAs destined for translation.



Northern Blotting

The fundamental question in the analysis of gene expression at the RNA level is whether RNA sequences derived from a gene of interest are present in cells or tissues. Detecting specific RNA sequences can be accomplished by Northern blotting, the whimsically named analogue of Southern blotting, when applied to RNA analysis. RNA can be isolated from cells in its intact form, free from significant amounts of DNA. Messenger RNA is much smaller than genomic DNA, so it can be analyzed by agarose gel electrophoresis without the enzymatic digestion steps that are necessary for the analysis of high molecular weight DNA. RNA is single stranded and has a tendency to fold back on itself. This allows complementary bases on the same stretch of RNA to base-pair with each other and form what is termed “secondary structure.” Because secondary structure can lead to aberrant electrophoretic behavior, RNA is electrophoretically separated by size in the presence of a denaturing agent, such as formaldehyde or glyoxal/DMSO. After electrophoresis through a denaturing agarose gel, the RNA is transferred to a nitrocellulose or nylon-based membrane in the same manner as DNA for Southern blotting (see Figure 1). Hybridization schemes and blot washing are essentially the same for Northern blotting as for Southern blotting. In this manner, specific RNA sequences corresponding to those in cloned DNA probes can easily be identified.


Figure 1: Genomic Southern blotting. Genomic DNA is digested with a single restriction endonuclease resulting in a complex mixture of DNA fragments of different sizes, that is, molecular weights. Digested DNA is arrayed by size using electrophoresis through a semisolid agarose gel. Because DNA is negatively charged, fragments will migrate toward the anode, but their progress is variably impeded by interactions with the agarose gel. Small fragments interact less and migrate farther; large fragments interact more and migrate less. The arrayed fragments are then transferred to a sheet of nitrocellulose or nylon-based filter paper by forcing buffer through the gel as shown. The DNA fragments are carried by capillary action and can be made to bind irreversibly to the filter. Now the DNA fragments, still arrayed by size on the filter, can be probed for specific nucleotide sequences using a 32P-radiolabeled nucleic acid probe. The probe will hybridize to complementary sequences in the DNA, and the position of the fragment that contains these sequences can be revealed by exposing the filter to x-ray film.


 

There is a lower limit to the sensitivity of Northern blotting, so that only moderately abundant mRNAs can be detected using this technique. One way to increase the sensitivity of Northern blotting is to enrich the RNA preparation for mRNA. Ordinarily, mRNA makes up less than 10% of the total RNA content of a cell or tissue. When RNA is isolated from these sources, all RNA species are being isolated, that is, ribosomal and transfer RNA as well as mRNA. As noted above, most mRNAs destined for the cytoplasm and translation are modified by the addition of a 3´ poly(A) tract. An RNA preparation can, therefore, be greatly enriched for mRNA species by removing all RNA molecules that lack the 3´ poly(A) tail. This can be done by exposing the RNA preparation to a tract of poly(U) or poly(T) bound to an immobilized support, such as a plastic bead. The poly(A) portion of mRNA will bind to the poly(U) or poly(T) material, and non–poly(A)-containing RNA can be washed away. After washing, the poly(A)-containing mRNA can be recovered from the solid support and used in Northern blot analysis. This procedure improves the sensitivity of Northern blotting by nearly two orders of magnitude.
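The selection principle behind poly(A) enrichment can be mimicked in a few lines: keep only the molecules that end in a sufficiently long run of A residues. This is a toy model with made-up sequences, and the `min_tail` threshold is an arbitrary assumption; it says nothing about the binding chemistry or the yield of a real oligo(dT) or poly(U) column.

```python
def poly_a_tail_length(rna):
    """Number of consecutive A residues at the 3' end of an RNA (5'->3')."""
    return len(rna) - len(rna.rstrip("A"))

def select_poly_a(rnas, min_tail=10):
    """Keep RNAs whose 3' poly(A) tract is at least min_tail residues long."""
    return [rna for rna in rnas if poly_a_tail_length(rna) >= min_tail]

# Hypothetical mixture: one polyadenylated mRNA and two other RNA species.
total_rna = [
    "AUGGCCUUCUAAGCU" + "A" * 20,  # mRNA with a poly(A) tail
    "GGCUAGCUAGCCGAU",             # no tail (stands in for rRNA or tRNA)
    "ACGUACGUAAAA",                # ends in A, but the tract is too short
]
enriched = select_poly_a(total_rna)
print(len(total_rna), "->", len(enriched))
```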

A dramatic use of Northern blotting in cancer research has been the demonstration of oncogene expression in some human tumors. RNA was isolated from human tumor samples and analyzed by Northern blotting using cloned DNA probes derived from various oncogenes. The earliest observations included expression of c-abl and c-myc in human tumor cell lines and leukemic blasts. Since then, however, a large number of proto-oncogenes have been shown to be transcribed in primary human tumor tissue.



Nuclease Protection Assays (RPA)

Another technique used in the analysis of mRNA is the nuclease protection assay. This assay differs from Northern blotting in two general respects: (1) it is more sensitive than Northern blotting and is therefore used for the detection of rare mRNA species; and (2) it provides detailed structural information about the mRNA being analyzed, and is thus often referred to as “transcript mapping.”

Nuclease protection assays (Figure 2) use a single-stranded radioactive DNA or RNA probe. The nucleotide sequence of the probe contains at least some nucleotides that are complementary to the mRNA being analyzed. The probe is annealed to the target mRNA by base-pairing, and the regions of the probe that are complementary to the target mRNA now become double-stranded, while the noncomplementary regions of the probe remain single-stranded. The annealed mixture is then subjected to digestion with an enzyme specific for single-stranded DNA (usually S1 nuclease), when using a DNA probe, or RNA (usually a mixture of RNase A and RNase T1), when using an RNA probe. The double-stranded annealed areas resist digestion, while all the single-stranded noncomplementary parts of the probe are digested away. In essence, areas in the probe that anneal to the mRNA are “protected” from digestion by the nucleases. The surviving, undigested parts of the probe can then be analyzed by electrophoresis through an agarose or polyacrylamide gel. The amount of radiolabeled probe resistant to digestion is proportional to the amount of target mRNA in the sample.

Figure 2: Nuclease protection assay. In this example, an mRNA containing a point mutation (indicated by the inverted triangle in the mRNA on the right) is distinguished from its normal, non-mutated counterpart (mRNA on the left). The mRNA is mixed with a single-stranded 32P-labeled DNA or RNA probe that (1) has sequences perfectly complementary to the nonmutated region of interest in the mRNA, and (2) extends for some length beyond the mRNA. The mixture is heated then cooled to allow the probe to anneal to its complementary sequences in the mRNA. The annealed mixture is then treated with single-strand specific nucleases (S1 nuclease for a DNA probe, or RNases for an RNA probe). This results in digestion of the probe at all single-stranded areas: the extension beyond the mRNA sequences, and the single base-pair mismatch overlying the mutation (right). The radioactive digestion products are then separated by electrophoresis through a urea-containing polyacrylamide gel. The probe that annealed to normal, nonmutated mRNA is smaller than the undigested probe (by the length of the extended region not complementary to the mRNA) and will therefore migrate farther than undigested probe. The probe that annealed to the mutated mRNA will have been digested into two fragments whose summed length will equal that of the digested probe that annealed to nonmutated mRNA.


 

Nuclease protection assays can also provide structural information about target mRNA sequences. If there are any mismatches in the sequence of the target mRNA compared with the probe, the areas corresponding to the mismatches will generate small single-stranded loops (see Figure 2). Since the nucleases that digest the annealed probe/mRNA hybrid are specific for single-stranded nucleotides, any mismatches between probe and target are susceptible to digestion. Thus a mismatch can be detected if the nuclease-digested radiolabeled probe is smaller than would have been expected, or when the probe has been digested into multiple fragments. In fact, by careful measurement of the length of the digested probe, one can determine exactly where the mismatch has occurred in the target mRNA.
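The logic of transcript mapping can be illustrated with a simplified model in which every unpaired probe position is digested and the lengths of the surviving double-stranded runs are reported. The sequences, the function names, and the assumption that a single mismatched base is always cleaved are illustrative only; real nucleases do not cut every single-base mismatch with full efficiency.

```python
# DNA probe base -> the RNA base it pairs with, and the inverse mapping.
PROBE_PAIRS_WITH = {"A": "U", "T": "A", "G": "C", "C": "G"}
RNA_TO_PROBE = {"A": "T", "U": "A", "G": "C", "C": "G"}

def make_probe(mrna, extension):
    """DNA probe complementary to mrna, plus a non-complementary extension.

    The probe is written so that position i faces position i of the mRNA.
    """
    return "".join(RNA_TO_PROBE[base] for base in mrna) + extension

def protected_fragments(probe, mrna):
    """Lengths of contiguous probe runs that stay base-paired (protected)."""
    lengths, run = [], 0
    for i, base in enumerate(probe):
        paired = i < len(mrna) and PROBE_PAIRS_WITH[base] == mrna[i]
        if paired:
            run += 1
        else:  # single-stranded position: digested by the nuclease
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths

normal_mrna = "AUGGCCUUCUAA"
mutant_mrna = "AUGGCAUUCUAA"  # hypothetical point mutation at index 5
probe = make_probe(normal_mrna, extension="GGGG")

normal_result = protected_fragments(probe, normal_mrna)
mutant_result = protected_fragments(probe, mutant_mrna)
print(normal_result)
print(mutant_result)
```

In this toy model the extension is trimmed from the probe annealed to normal mRNA, while the mutant yields two shorter fragments; the length of the first fragment gives the position of the mismatch, which is the basis of the mapping described above.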

This technique has been used to detect single base mutations or small deletions in cellular mRNAs. For example, the proposed pathogenetic role of tumor suppressor genes such as p53 in cancer depends on the inactivation of these genes, which can occur by point mutation. Nuclease protection assays have been used to demonstrate the presence of point mutations in the mRNA for p53 in primary human lung cancer samples.



cDNA

The flow of genetic information usually runs from DNA to RNA to protein, according to the so-called “central dogma” of molecular biology. There are, however, exceptions to this rule, the most prominent of which involves the life cycle of retroviruses. These viruses encode their genetic information in RNA rather than DNA. When they invade a susceptible host cell, they direct the synthesis of a DNA intermediate that is a complementary copy of their genomic RNA. The enzyme that accomplishes this task, reverse transcriptase, is a DNA polymerase (see above) that uses RNA, rather than DNA, as a template to form a complementary DNA (cDNA) copy of the RNA. This enzyme can be used in vitro to make cDNA copies of any available RNA.
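As a small illustration of what "complementary copy" means at the sequence level, the following Python sketch derives the first-strand cDNA for a short RNA. The example sequence and the function name are invented for illustration; the only rule used is standard base pairing (A with T, U with A, G with C, C with G) and the antiparallel orientation of the two strands.

```python
# Base-pairing rules for copying RNA into complementary DNA.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna):
    """Return the first-strand cDNA, written 5'->3', for an mRNA written 5'->3'."""
    # The cDNA is antiparallel to its template, so read the RNA backwards.
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna.upper()))


# Hypothetical mini-transcript ending in a short poly(A) tail.
print(first_strand_cdna("AUGGCCAAAAAA"))  # TTTTTTGGCCAT
```

Note that the cDNA begins with a run of Ts opposite the poly(A) tail, which is why an oligo(dT) primer can be used to start reverse transcription.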

One important application of cDNA synthesis has been the construction of cDNA libraries, analogous to the genomic libraries described above (see Figures 3 and 4). A valuable tool for the analysis of gene expression would be a gene library that consisted only of the genes that were expressed in a cell or tissue of interest. Most of the time, one is not concerned with all the DNA in the genome, such as intron sequences, promoters, and vast regions of “uninformative” DNA that lie between genes. Furthermore, if one were interested in analyzing the genes expressed in a brain cell, why bother making a library that contained sequences for the hemoglobin gene? One way to construct a library comprising only tissue-specific expressed genes would be to clone all the mRNA in a specific cell or tissue of interest. Unfortunately, there is no way to ligate single-stranded RNA to a double-stranded DNA cloning vector. However, one can use all the mRNA in a cell as a template for making double-stranded cDNA, which can then be inserted into a cloning vector.

Figure 3: Constructing a genomic library. Genomic DNA and plasmid DNA are cut with EcoRI in preparation for cloning, as in Figure 4. (The vector DNA could also be bacteriophage DNA rather than plasmid DNA.) In this case, all of the variously sized EcoRI-produced genomic DNA fragments are cloned individually into the EcoRI site of the plasmid, and the recombinant DNA is introduced into E. coli by transformation. Transformed bacteria are selected by growth in the presence of ampicillin, as in Figure 4. Since each bacterium can be transformed by only one recombinant plasmid, and since each colony on the agar plate arose from a single transformed bacterium, each colony (or clone) contains amplified plasmid bearing a single genomic EcoRI fragment. Taken together, all the bacterial colonies represent the entire genetic complement of the organism from which the original genomic DNA was isolated. Thus, all of the clones on all of the plates can be thought of as a genomic library, with each individual clone representing one volume.

Figure 4: Gene cloning. In this example, a small amount of foreign DNA (a few nanograms) is digested with EcoRI. This foreign DNA can come from any source, the only requirement being that it contains the same restriction endonuclease recognition sites as the vector. Plasmid vector is also digested with EcoRI to create a linear DNA molecule. The “sticky” single-stranded ends of the foreign DNA can align and base-pair with the complementary “sticky ends” of the plasmid, after which DNA ligase covalently bonds foreign DNA to plasmid DNA. This recombinant DNA is introduced into E. coli by a process called transformation. Since the bacteria themselves are not resistant to ampicillin, growth in ampicillin will select only those bacteria that have taken up the plasmid DNA (which carries an ampicillin resistance gene). The plasmid contains a bacterial origin of replication so that as the bacterial culture grows, plasmids replicate, resulting in several copies in each bacterium. When the culture has grown to sufficient size, plasmid DNA can be isolated biochemically, foreign DNA can be cut from the plasmid using EcoRI, and the resulting yield will often be milligrams of DNA, that is, greater than a 10^6-fold amplification.


 

To make a cDNA library, one isolates all the mRNA from a cell or tissue. Then, using this mRNA as a template, reverse transcriptase makes cDNA copies of each mRNA molecule in the mixture. The cDNA is ligated into a plasmid or phage vector as described above for genomic libraries, and the recombinant vectors are introduced into bacteria. After growth on agar plates, each bacterial colony or phage plaque of a cDNA library houses a unique recombinant vector containing the cDNA copy of a single mRNA. Desired clones can be detected by nucleic acid hybridization to the plaques or colonies using a radiolabeled gene probe. Alternatively, if the vector containing the cDNA molecules can direct transcription of mRNA by host bacterial cells, mRNA will be synthesized, and that mRNA will be translated. In this case, each bacterial colony or plaque will produce a different protein, and each protein will have been encoded by an mRNA from the original cell or tissue being investigated. If an antibody directed against a protein of interest is available, the cDNA clone corresponding to the mRNA that encodes that protein can be identified by binding the antibody to the colonies or plaques of the cDNA library. This technique, called “expression cloning,” often employs the bacteriophage λgt11 as the cloning vector.

cDNA libraries can be used to clone cDNA for a known gene to discover the sequence of the mRNA it encodes. Alternatively, these libraries can be used to identify previously unknown genes. In a process called “differential screening,” cDNAs can be discovered that owe their existence to a particular differentiation or activation state in the cell of origin. For example, this technique has been used to identify genes whose expression is turned on by hormones or by growth factors. A rapid modification of this technique using PCR (called “differential display”) is described in the next section.



DNA Microarray Analysis

Another approach to comparative gene expression profiling employs the use of DNA microarrays, often referred to as DNA “chips.” Two basic types of DNA microarrays are currently available: oligonucleotide arrays and cDNA arrays. Both approaches involve the immobilization of DNA sequences in a gridded array on the surface of a solid support, such as a glass microscope slide or silicon wafer. In the case of oligonucleotide arrays, 25-nucleotide long fragments of known DNA sequence are synthesized in situ on the surface of the chip using a series of light-directed coupling reactions similar to photolithography. Using this method, as many as 300,000 distinct sequences representing over 6,000 genes can be synthesized on a single 1.3 cm × 1.3 cm microarray. In the case of cDNA microarrays, cDNA fragments are deposited onto the surface of a glass slide using a robotic spotting device. For both microarray approaches, the next step involves the purification of RNA from the source of interest (e.g., from a tumor), enzymatic fluorescent labeling of the RNA, and hybridization of the fluorescently labeled material to the microarray. Hybridization events are then captured by scanning the surface of the microarray with a laser scanning device and measuring the fluorescence intensity at each position in the microarray. The fluorescence intensity of each spot on the array is proportional to the level of expression of the gene represented by that spot. This process is illustrated in Figure 5.

Figure 5: DNA microarray analysis. In this example, RNA extracted from a tumor is end-labeled with a fluorescent marker, then allowed to hybridize to a chip derivatized with cDNAs or oligonucleotides as described in the text. The precise location of RNA hybridization to the chip can be determined using a laser scanner. Since the position of each unique cDNA or oligonucleotide is known, the presence of a cognate RNA for any given unique sequence can be determined.


 

DNA microarray technology is evolving rapidly, with improvements in miniaturization, reproducibility, production capabilities, and the development of alternative approaches to microarray synthesis. The application of gene expression profiling methods to important questions in biology and medicine is also emerging. For example, DNA microarrays have been recently demonstrated to be useful in understanding the cell cycle, hematopoietic differentiation, responses to serum stimulation, interferon gamma treatment, and cancer classification. The ability to monitor the expression levels of thousands of genes simultaneously offers the potential opportunity to expand the analysis of cancer genetics beyond single–candidate gene approaches, toward considering genetic networks. It is becoming increasingly clear that while some tumors appear to be caused by mutations in a single gene (e.g., oncogene or tumor suppressor gene), most cancers likely arise through the collaboration of multiple genes, none of which, when considered alone, are sufficient for transformation. Until recently, the analysis of such genetic networks has been impractical, in that methods for measuring the expression levels of multiple genes in parallel have not been available. The development of DNA microarrays may, in large part, have solved this problem. Microarrays capable of monitoring the expression levels of the entire human genome (estimated to contain approximately 100,000 genes) are likely to become available in the near future.

The challenge now is not so much how to generate complex gene expression data, but rather how to interpret it. The key is to develop methods for recognizing meaningful gene expression patterns and distinguishing those patterns from noise. Such noise (random variation in measured gene expression levels) can be generated by (1) variability among microarrays, (2) variability in RNA labeling and hybridization methods, and perhaps most importantly, (3) biological variability among samples. It is likely that all of the above sources of variability are significant. It has become clear that the successful elucidation of genetic networks through expression profiling will require the expertise of a new generation of scientists, namely, computational biologists. Improvements in DNA microarray fabrication will only become valuable if pattern recognition algorithms are similarly developed. Nonetheless, it is likely that the future of cancer diagnostics will include the analysis of gene expression profiles, which might help guide treatment planning for individual patients.
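To make the idea of separating a real expression change from noise more concrete, here is a deliberately simple Python sketch. It is not a method described in the text: the function names, the thresholds, and the replicate intensities are all hypothetical, and real analyses use considerably more careful statistics. The sketch compares the log2 intensity change between two conditions with the spread seen among replicate measurements.

```python
import math
from statistics import mean, stdev


def log2_change_and_spread(tumor, normal):
    """Return (mean log2 change, largest replicate spread) for one gene."""
    tumor_log = [math.log2(x) for x in tumor]
    normal_log = [math.log2(x) for x in normal]
    change = mean(tumor_log) - mean(normal_log)
    spread = max(stdev(tumor_log), stdev(normal_log))
    return change, spread


def is_candidate(tumor, normal, min_abs_change=1.0, noise_factor=2.0):
    """Flag a gene only if its change is large AND clearly exceeds replicate noise.

    Both thresholds are arbitrary illustrative choices, not recommended values.
    """
    change, spread = log2_change_and_spread(tumor, normal)
    return abs(change) >= min_abs_change and abs(change) >= noise_factor * spread


# Hypothetical fluorescence intensities from three replicate arrays per condition.
normal_reps = [200, 210, 190]
print(is_candidate([800, 820, 790], normal_reps))  # consistent ~4-fold increase
print(is_candidate([400, 100, 900], normal_reps))  # erratic replicates
```

The second gene has a higher average intensity in the tumor replicates, but the replicates disagree so strongly that the apparent change cannot be distinguished from variability among arrays or samples.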



Polymerase Chain Reaction (PCR & RT-PCR)


Roche Applied Science - PCR Application Manual 3rd Edition

Another important use of cDNA technology has allowed PCR to be applied to RNA. Since the Taq polymerase is a DNA polymerase (see above), it cannot use RNA as a template. Simply adding primers and Taq polymerase to an RNA preparation will not result in amplification. However, if an RNA of interest is first copied into DNA, then PCR can proceed as usual. The first step in this analysis is generating a cDNA copy of the mRNA of interest using reverse transcriptase. This can be done using a primer consisting of Ts (complementary to the poly(A) tail) or of a sequence complementary to some portion of the 3´ region of the mRNA. The 5´ primer can then be added along with Taq polymerase, and the single-stranded cDNA made in the first step will be amplified as described above (see Figure 6). In one of the first applications of this technique, Philadelphia chromosome (Ph1)-positive leukemias were diagnosed by identifying chimeric bcr-abl mRNA species in clinical material using PCR. Since then, so-called reverse transcriptase (RT) PCR has come into widespread use.

Figure 6: Polymerase chain reaction (PCR). DNA is mixed with short (10–20 base) single-stranded oligonucleotide primers that are complementary to the 5´ and 3´ ends of the sequence to be amplified. The mixture is heated to dissociate or “melt” all double-stranded DNA, and then cooled to permit the primers to anneal to their complementary sequences on the DNA to be amplified. Note that the 5´ primer will anneal to the “lower” strand, and the 3´ primer will anneal to the “upper” strand. A heat-resistant (thermostable) DNA polymerase (Taq polymerase, see text) was present in the original mixture, and it now synthesizes DNA by starting at the primers and using the strands to which the primers are annealed as templates. There are now two double-stranded DNA copies for every molecule of double-stranded DNA in the original mixture. The reaction is then heated to melt double-stranded DNA, cooled to allow reannealing, and the polymerase makes new double-stranded DNA again. There are now four double-stranded DNA copies for each original DNA molecule. This process can be repeated n times (usually 20–50) to result in 2^n copies of double-stranded DNA.


 

One inherent problem in using PCR to monitor mRNA expression is quantitation of the amplified PCR products. In Northern blotting or nuclease protection analysis, the intensity of the hybridization signal is directly proportional to the amount of target RNA in the sample. Thus, one can compare the number of RNA molecules in one sample with another. With PCR, a slight change in the efficiency of polymerization in an early cycle in one sample will lead to a geometrically increasing discrepancy between the amount of amplified product in that sample compared with another sample. Fortunately, a number of techniques have been described for normalizing the products of PCR reactions to allow quantitative comparisons. In general, they involve amplifying an easily distinguishable control RNA template in the same reaction as the RNA of interest. Normalization of the amplified experimental PCR products to the control products then allows comparisons to be made.

One application of RT-PCR is a simple method for differential screening (see above) called differential display. Two cell populations to be compared are identified, and mRNA is isolated from both. Reverse transcription and PCR are performed using a poly-T primer, which will anneal to the 3´ poly-A tail of all the mRNA species, and a set of primers with random sequences, which by chance will anneal to sequences upstream of the poly-A tail in all the mRNA species. Since the upstream primer will anneal at random to different mRNA species, the lengths of the PCR products will vary for nearly every mRNA. If the amplification is performed in the presence of radiolabeled nucleotides, the products from the two reactions can be separated on a high-resolution gel. Bands that are much darker in one lane compared with another represent mRNA species that were overexpressed in one cell population compared with another. The cDNA representing this band can be recovered from the gel for further analysis and identification.
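The quantitation problem with PCR, and the reason a co-amplified control helps, can be illustrated numerically. The sketch below assumes the common idealized model in which each cycle multiplies the product by (1 + E), where E is the per-cycle efficiency (E = 1 corresponds to the perfect doubling of Figure 6). The function names, copy numbers, and efficiencies are hypothetical, and the model further assumes that target and control amplify with the same efficiency within one reaction tube.

```python
import math


def pcr_copies(start_copies, efficiency, cycles):
    """Idealized product after `cycles` rounds: start * (1 + E)^cycles."""
    return start_copies * (1.0 + efficiency) ** cycles


# Two samples with identical starting template but slightly different efficiency.
cycles = 30
sample_a = pcr_copies(1000, 1.00, cycles)
sample_b = pcr_copies(1000, 0.95, cycles)
discrepancy = sample_a / sample_b  # roughly two-fold despite equal input
print(round(discrepancy, 2))


def normalized_signal(target_start, control_start, efficiency, cycles):
    """Target product divided by control product from the same reaction."""
    target = pcr_copies(target_start, efficiency, cycles)
    control = pcr_copies(control_start, efficiency, cycles)
    return target / control


# The efficiency term cancels, so the ratio reflects the starting amounts.
print(normalized_signal(1000, 500, 1.00, cycles))
print(normalized_signal(1000, 500, 0.95, cycles))
```

Under these assumptions, a 5% difference in efficiency produces about a two-fold difference in raw product after 30 cycles, whereas the target-to-control ratio stays at the true starting ratio.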



Serial Analysis of Gene Expression (SAGE)

Every cell type is thought to have a unique pattern of gene expression, the analysis of which can reveal the underlying mechanism of disease. The most straightforward way to display this unique pattern of gene expression would be to construct a cDNA library from the tissue of interest and sequence every clone. This is obviously an impossible task. Rather, a technique called “serial analysis of gene expression (SAGE)” achieves the same end in a practical manner. In SAGE, the investigator sequences a small and unique fragment of each expressed gene (called a SAGE tag) and quantifies the number of times it appears (called the SAGE tag number). The SAGE tag numbers, therefore, directly reflect the abundance of the corresponding transcript.

The sensitivity and the quantitative accuracy of SAGE are theoretically unlimited. The generation of a SAGE library does not require any prior knowledge of what genes are expressed in the cell of interest. Therefore, unlike DNA chip analysis, SAGE is able to detect and quantify the expression of previously uncharacterized genes. SAGE is based on two fundamental principles:

1. A short (10–11 bp) oligonucleotide fragment (SAGE tag) is sufficient to uniquely identify a specific mRNA transcript or its cognate cDNA. A 10-bp oligonucleotide sequence has a complexity of 4^10 (1,048,576) different combinations. Because there are only about 100,000 genes encoded by the human genome, a 10-bp sequence tag corresponding to a defined position of a cDNA is sufficient to uniquely identify any transcribed human gene.

2. Multiple 10-base-pair SAGE tags can be concatenated in a single plasmid, thereby greatly compressing the number of actual plasmid preparations and DNA sequencing reactions that are required to analyze a large number of genes. In practice, a single sequencing reaction can provide information on 30 to 35 different SAGE tags, and therefore 30 to 35 different genes.
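The arithmetic behind these two principles is easy to check. The short Python sketch below uses only the figures quoted above (10-bp tags, about 100,000 genes, 30 to 35 tags per sequencing reaction); the target of 100,000 sequenced tags is an arbitrary illustrative choice.

```python
import math

TAG_LENGTH = 10            # bp, from principle 1
GENE_ESTIMATE = 100_000    # gene count quoted in the text

# Four possible bases at each of 10 positions.
possible_tags = 4 ** TAG_LENGTH
print(possible_tags)                  # 1048576
print(possible_tags / GENE_ESTIMATE)  # about 10 possible tags per gene

# Sequencing reactions needed to read an illustrative 100,000 tags
# when each reaction yields 30 to 35 tags (principle 2).
tags_wanted = 100_000
reactions_best = math.ceil(tags_wanted / 35)
reactions_worst = math.ceil(tags_wanted / 30)
print(reactions_best, reactions_worst)  # 2858 3334
```

Without concatenation, the same 100,000 tags would require one sequencing reaction each, which is the saving that principle 2 describes.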

The generation of a SAGE library is a technically demanding, multi-step procedure that has been described in detail. Figure 7 outlines the essence of the method. SAGE has been used to characterize the yeast “transcriptome” (defined as the identity and expression level of all the genes expressed in a cell population at any given time), and to monitor alterations in gene expression patterns following ionizing radiation and during apoptosis induced by the p53 and the APC tumor suppressor proteins. In all of these cases, the ability to measure the expression levels of thousands of different transcripts simultaneously was extremely useful for the understanding of these physiologic processes. For example, in the case of p53, the analysis of over 100,000 SAGE tags identified not only several novel genes transcriptionally induced by p53, but also the concurrent induction of a group of genes involved in the regulation of cellular redox status. This led the authors to propose a novel mechanism of p53-induced cell death. The comparison of the expression profiles of normal and tumor tissues is probably the most attractive application of SAGE, since a comprehensive comparison of normal and cancer cells makes it possible to identify genes or subsets of genes that could be used as potential diagnostic/prognostic markers or therapeutic targets.

Figure 7: Construction and analysis of SAGE libraries. In step 1, a cDNA library is constructed from the cells or tissue of interest, and the cDNAs are immobilized on magnetic beads at their 3' ends. In step 2, the cDNAs are subjected to restriction enzyme digestion with a so-called “anchoring enzyme.” This anchoring enzyme is a “frequent cutter” restriction endonuclease (usually NlaIII) that ensures that all the cDNAs are cut at least once. In step 3, cleaved cDNAs are divided into two pools. Short oligonucleotide linkers are ligated to the newly cut 5' ends of the tags. A different linker is used for each pool (linkers A and B as shown). These oligonucleotide linkers contain a recognition site for a “tagging enzyme.” This tagging enzyme is a type IIS restriction endonuclease (usually BsmFI) that cuts at some distance to the 3' side of the actual recognition site. In step 4, “SAGE tags” are released by cleavage with the tagging enzyme and further processed to yield blunt ends. In step 5, the free, blunt-ended SAGE tags are dimerized to yield ditags. These ditags are amplified by PCR using linker A and B primers. Note that each ditag is flanked by the recognition site for the frequent cutter anchoring enzyme used in step 2. These flanking recognition sites serve as “punctuation marks” for sequence analysis of the concatenated SAGE tags. Once sufficient amounts of ditags are generated, they are ligated together in linear arrays containing several ditags (i.e., they are concatemerized) and subcloned into a plasmid that can be used as template for automated sequencing. Each sequenced plasmid can yield data on 30 to 35 SAGE tags. Data are analyzed using the SAGE software, which reads the sequence obtained, derives the SAGE tags, matches them to their cognate cDNAs, and gives the gene expression profile in a numeric format.
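The core bookkeeping of SAGE, deriving one tag per transcript and counting how often each tag occurs, can be mimicked in silico. The Python sketch below is not the SAGE software mentioned in the caption. It is a minimal illustration that assumes NlaIII (recognition sequence CATG) as the anchoring enzyme and defines the tag as the 10 bases immediately 3' of the 3'-most CATG in each cDNA; the function name and the toy sequences are invented.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # NlaIII recognition sequence
TAG_LENGTH = 10       # bases taken immediately 3' of the anchoring site


def sage_tag(cdna):
    """Return the tag next to the 3'-most anchoring site, or None if unavailable.

    `cdna` is the sense-strand sequence written 5'->3'.
    """
    site = cdna.rfind(ANCHOR_SITE)  # the 3'-most site stays bound to the bead
    if site == -1:
        return None
    start = site + len(ANCHOR_SITE)
    tag = cdna[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None


# Toy transcript pool: gene_x is present three times, gene_y once.
gene_x = "GGCATGAAACCCGGGTTT" + "CATG" + "ACGTACGTACGG" + "AAAAAA"
gene_y = "TTGGCC" + "CATG" + "GGGTTTCCCAAA" + "AAAAAA"
pool = [gene_x, gene_y, gene_x, gene_x]

tag_counts = Counter(tag for tag in map(sage_tag, pool) if tag is not None)
print(tag_counts)
```

The count for each tag plays the role of the SAGE tag number: a tag seen three times as often reflects a transcript that is three times as abundant in the pool.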



Ribozymes

One of the more surprising discoveries of the past decade was that some RNA molecules have enzymatic activity. These RNAs, called “ribozymes,” can cleave RNA at sequence-specific sites. They were originally discovered in Tetrahymena, when it appeared that some of the primary RNA molecules in that species were capable of splicing out their introns without the aid of any protein enzymes. Ribozymes have also recently been described in higher organisms, and it is likely that they will be found to play a universal and important role in RNA processing. Sequence-specific ribozymes which will destroy specific mRNAs can be synthesized. One application of this technology is the introduction into malignant cells of ribozymes directed against activated oncogenes. In the laboratory, this technique can reverse the malignant phenotype of some cancer cells.



Gene expression profiling using a novel method:
amplified differential gene expression (ADGE)

Zhijian J. Chen, Hongxie Shen & Kenneth D. Tew
     Nucleic Acids Res. 2001; 29: e46

Amplified differential gene expression (ADGE) is a novel technique, designed to profile gene expression of the whole transcriptome or to compare expression of a set of genes between two samples. ADGE employs hybridization to quadratically amplify the ratio of an expressed gene between control and tester samples before displaying. The subtle structures of adapters and primers are designed for displaying the amplified ratio of an expressed gene between two samples. Four selective nucleotides at the 3' end of primers are used to increase PCR efficiency for targeted molecules and to improve detection of PCR products. Double PCR with the same pair of primers expands the detection range, especially for genes of low abundance. Integration of these steps makes ADGE sensitive and accurate. Application to drug resistant human tumor cell lines showed that ADGE accurately profiled expression levels for induced, repressed or unchanged genes. The qualitative expression patterns for ADGE were verified with RT–PCR. 
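One way to see why hybridization could amplify a ratio quadratically is a simple random-reassociation model. The sketch below is an interpretation for illustration only, not the authors' derivation: it assumes that strands from the tester and control samples are mixed and reanneal at random, so that tester-tester and control-control duplexes form in proportion to the square of each sample's share of the strands. The function names and amounts are hypothetical.

```python
import math


def duplex_fractions(tester, control):
    """Fractions of tester-tester, hybrid, and control-control duplexes.

    Assumes random pairing of strands (an illustrative simplification).
    """
    total = tester + control
    t, c = tester / total, control / total
    return t * t, 2 * t * c, c * c


def amplified_ratio(tester, control):
    """Ratio of tester-tester to control-control duplexes: (tester/control)^2."""
    tt, _hybrid, cc = duplex_fractions(tester, control)
    return tt / cc


# A gene three times more abundant in the tester sample.
print(amplified_ratio(3.0, 1.0))  # about 9, up from an input ratio of 3
print(amplified_ratio(1.0, 1.0))  # unchanged genes stay at 1
```

Under this model a three-fold difference appears as roughly a nine-fold difference, while genes expressed equally in both samples keep a ratio of 1.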

Summary

The genetic information in DNA is copied, or “transcribed,” into mRNA by the enzyme RNA polymerase II. Before being transported to the cytoplasm, primary transcripts in the nucleus are modified by splicing out introns, adding a 5´ cap and adding a 3´ poly(A) tract. Cytoplasmic mRNA can be detected by Northern blotting, nuclease protection assays, or by modified PCR. Although nuclease protection assays are technically somewhat more demanding than Northern blotting, they are more sensitive and can provide structural information about mRNA transcripts. A retroviral enzyme called reverse transcriptase can make cDNA copies of mRNA transcripts. These cDNAs can be cloned into cDNA libraries, which are useful for isolating and analyzing expressed genes. In the future, ribozymes may be useful for the selective elimination of specific mRNA species.


Gene System 320 - Gene Expression Profiling system

by Capital Genomix





Overview of Technology
The GeneSystem320™ (GS320) gene expression analysis system is based upon the technology developed at the MD Anderson Cancer Center (University of Texas) by Dr. Michael MacLeod. This system, also known as RAGE (rapid analysis of gene expression) and combinatorial oligo PCR, enables the analysis of virtually any gene or set of genes using a defined set of reagents including 320 PCR primers. It follows a simple protocol that yields useful data on mRNA expression profiles within a couple of weeks in the lab. The object of GS320 is to provide a method for gene expression analysis that exceeds the capabilities of the state of the art. The GS320 technology is rapid and cost-effective, allows for easily reproducible results, has an adequate sensitivity to detect and quantify moderately rare transcripts, and identifies amplification products without additional cloning or sequencing steps. GS320 is flexible enough to analyze either a subset or virtually the complete genome. GS320 has the capability for detecting the frequency distribution of all polyadenylated mRNAs in a sample at any selected time. The method reduces the complexity of analysis because only a single unique fragment is derived from each molecular species of polyadenylated mRNA. The genome or a subset can be analyzed with a single set of reagents and reaction conditions. The technique allows for multiple samples to be analyzed simultaneously. The results generated from GS320 are proportional to the level of expression of the particular gene.

GS320 utilizes a defined set of 320 PCR primers and a unique message fragmentation protocol that enables analysis of virtually any eukaryotic gene or gene set. The protocol involves isolation of actively transcribing genes and recovery of a single fragment per mRNA molecule, conducting specific combinatorial oligo PCR on the fragments and then positively identifying the genes (PCR products) using dedicated software. The unique gene-specific fragments, called RAGEtags, are generated from mRNA using reverse transcription, a pair of restriction enzymes and 2 universal linkers.

RAGEtags may be generated in two orientations, A/B and B/A. The orientation refers to the relative location of the 3’ most paired Hsp92II and DpnII sites to each other and the poly(A) tail. The fragments of cDNA isolated by digestion with Hsp92II and ligation to the A linker followed by digestion with DpnII and ligation to the B linkers are referred to as A/B RAGEtags. The second orientation of RAGEtags addresses cDNA species that have the Hsp92II recognition site in proximity to the poly(A) tail. B/A RAGEtags are obtained by reversing the order of restriction digestion and linker ligation. The linkers are composed of common PCR primer annealing sites and four base overhangs complementary to the ends created by restriction with either Hsp92II (A-linker) or DpnII (B-linker).

The GeneSystem320™ Database Search Engine was developed to provide mRNA sequence data specific to GS320. The program can be used in two ways: to predict the pair of RAGE primers and the RAGEtag orientation to be used for amplification of each specific gene, and to determine the identity of RAGE amplimers after combinatorial RAGE analysis. The program utilizes sequence data collected from the NCBI Entrez GenBank and UniGene databases, which are processed to extract the data appropriate to GS320. The data depend on the integrity of the 3’ end of the mRNA, and the database includes a validation system to assess sequences for the likelihood that the last nucleotides in the sequence do in fact represent the 3’ end. A verification process is employed to select a representative sequence from each UniGene cluster that is most likely to correspond to a mature mRNA including the 3’ end. The database includes data for human, mouse and rat mRNAs. The database is updated monthly to add new and revised GenBank sequences and to maintain current UniGene data.

 ©  editor@gene-quantification.info