The vast majority of eukaryotic genes contain introns, segments of unknown function that do not code for polypeptides.
Introns are present not only in protein-coding genes but also in some rRNA and even tRNA genes. Introns are removed from the primary transcript while RNA is still being synthesized and after the cap has been added, but before the transcript is transported into the cytoplasm.
The removal of introns and the joining of exons is called splicing. Splicing brings together the coding regions, the exons, so that the mRNA now contains a coding sequence that is completely colinear with the protein that it encodes.
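As a rough illustration of what splicing does to the sequence, the sketch below treats a primary transcript as a string and joins its exon segments in order. The sequence, the exon coordinates, and the function name `splice` are invented for this example and do not correspond to any real gene; the cell itself recognizes introns by sequence signals rather than by stored coordinates.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments in order, discarding everything between them.

    Coordinates are 0-based and end-exclusive (an assumption of this sketch).
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Toy primary transcript: uppercase = exon, lowercase = intron (made-up sequence).
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUGGA" + "guaugucccuag" + "CCCUAA"

# Hypothetical exon positions within the toy transcript.
exons = [(0, 6), (18, 24), (36, 42)]

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCUUUGGACCCUAA
```

In the output the three exons sit side by side, so the coding sequence can be read without interruption, which is the sense in which the spliced mRNA is colinear with the protein.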
Number and size of introns
The number and size of introns vary from gene to gene and from species to species.
For example, only about 235 of the 6000 genes in yeast have introns, whereas typical genes in mammals, including humans, have several.
The average size of an intron is about 2000 nucleotides; thus, in most multicellular organisms, introns make up a larger percentage of a gene's DNA than exons do.
Duchenne muscular dystrophy gene
An extreme example is the human Duchenne muscular dystrophy gene.
This gene has 79 exons and 78 introns spread across 2.5 million base pairs.
When spliced together, its 79 exons produce an mRNA of 14,000 nucleotides, which means that introns account for the vast majority of the 2.5 million base pairs.
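The arithmetic behind this conclusion can be checked with a short calculation. This is a back-of-the-envelope sketch that uses only the rounded figures quoted above, and it assumes that the 2.5 million base pairs consist solely of exons and introns and that the 14,000-nucleotide mRNA is entirely exon sequence.

```python
gene_length_bp = 2_500_000   # total span of the gene, as quoted above
mrna_length_nt = 14_000      # spliced mRNA, i.e. the 79 exons combined
intron_count = 78

exon_fraction = mrna_length_nt / gene_length_bp
intron_fraction = 1 - exon_fraction
avg_intron_length = (gene_length_bp - mrna_length_nt) / intron_count

print(f"exons:   {exon_fraction:.2%}")    # exons:   0.56%
print(f"introns: {intron_fraction:.2%}")  # introns: 99.44%
print(f"average intron length: about {avg_intron_length:,.0f} nucleotides")
```

Under these assumptions, exons make up well under 1 percent of the gene, and the average intron comes to roughly 32,000 nucleotides, far above the typical intron size of about 2000 nucleotides.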
Alternative splicing
Alternative pathways of splicing can produce different mRNAs and, subsequently, different proteins from the same primary transcript. The altered forms of the same protein that are generated by alternative splicing are usually used in different cell types or at different stages of development.
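One simple way to picture this is to treat the primary transcript as a fixed set of exons and each mRNA as a choice of which exons to keep. The sketch below is purely illustrative: the exon names, the sequences, and the function `splice_isoform` are hypothetical, and it models only the skipping of a whole exon, which is an assumption about how the alternative forms arise.

```python
def splice_isoform(exons: dict[str, str], included: list[str]) -> str:
    """Join the named exons in the order given to form one mRNA."""
    return "".join(exons[name] for name in included)


# Made-up exons from a single toy primary transcript.
exons = {"E1": "AUGGCC", "E2": "UUUGGA", "E3": "CCCUAA"}

full_length = splice_isoform(exons, ["E1", "E2", "E3"])
exon2_skipped = splice_isoform(exons, ["E1", "E3"])

print(full_length)    # AUGGCCUUUGGACCCUAA
print(exon2_skipped)  # AUGGCCCCCUAA
```

Both mRNAs come from the same transcript, but they differ in sequence and would therefore be translated into different forms of the protein.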
Mechanism of exon splicing