7.3 Genes and DNA


Definition of Genes

Although it is often said that genes are DNA or conversely that DNA is composed of genes, genes are not spread throughout macromolecular DNA from end to end. A gene is defined as a region of macromolecular DNA carrying information that determines the primary structure (amino acid sequence) of proteins or the structure (base sequence) of non-coding RNA (ncRNA).*1 In general, genes in prokaryotes are densely packed with extremely narrow intervals between them. In eukaryotes, the genes are sparsely scattered throughout the total DNA, with wide intervals between them.

*1 ncRNA does not contain structural information for proteins. It is a general term for RNA with other functions, including the classical rRNA and tRNA and the more recently discovered miRNA, of which there are many types. Refer to Chapters 8 and 10 for details.




A genome is the total genetic information of a particular organism. In particular, the set of genes (DNA) contained per cell is termed the genome. Prokaryotes generally contain one set of DNA per cell. A cell with one set of the genome (or an individual made up of such cells) is termed haploid. Eukaryotic somatic cells, on the other hand, are often diploid, with two sets of the genome. Human somatic cells are also diploid, with one genome set from the mother and the other from the father.

Plasmid DNA in prokaryotes and the DNA contained in eukaryotic mitochondria and chloroplasts are often treated separately, as plasmid, mitochondrial, or chloroplast genomes. This type of genome is termed an extrachromosomal genome. In contrast, the cell’s inherent DNA is termed the chromosomal genome.



DNA Content in Organisms

DNA content varies widely among organisms. Fig. 7-2 shows the DNA content per haploid genome of a number of organisms. In general, the DNA content per eukaryotic cell is higher than that per prokaryotic cell. For example, the DNA content of the human haploid genome is about 700 times that of Escherichia coli. For diploid somatic cells, the disparity is about 1400 times, corresponding to about 6 pg (picograms) of DNA per cell. Among eukaryotes, organisms regarded as higher-order generally have higher DNA content, but large disparities are observed within the same group. For example, among vertebrates, DNA content varies enormously between species of fish and amphibians, and some of these species have higher DNA content than humans. Some higher plant species also have higher DNA content than humans. Thus, for individual species, it cannot be said that those with higher DNA content are more advanced. Neither is it true that humans have the highest DNA content and are thus the most advanced organism.

Fig. 7-2 Distribution of DNA content per cell (Haploid)
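
The figures quoted above can be checked with a rough calculation. The sketch below is only illustrative: the genome sizes in base pairs and the base-pair-to-picogram conversion factor are approximate values assumed here, not values given in this section.

```python
# Rough check of the DNA-content comparison between humans and E. coli.
# All three constants are assumptions (approximate, commonly quoted values),
# not figures taken from this section.
HUMAN_HAPLOID_BP = 3.2e9   # assumed: about 3.2 billion base pairs
E_COLI_BP = 4.6e6          # assumed: about 4.6 million base pairs
BP_PER_PG = 0.978e9        # assumed: base pairs in 1 pg of double-stranded DNA


def bp_to_pg(base_pairs: float) -> float:
    """Convert a DNA length in base pairs to a mass in picograms."""
    return base_pairs / BP_PER_PG


# Ratio of one human genome set to the single E. coli genome.
haploid_ratio = HUMAN_HAPLOID_BP / E_COLI_BP
# A diploid somatic cell carries two genome sets, so the ratio doubles.
diploid_ratio = 2 * haploid_ratio
# Mass of DNA in one diploid human cell.
diploid_pg = bp_to_pg(2 * HUMAN_HAPLOID_BP)

print(f"haploid ratio: about {haploid_ratio:.0f}-fold")
print(f"diploid ratio: about {diploid_ratio:.0f}-fold")
print(f"human diploid DNA: about {diploid_pg:.1f} pg per cell")
```

With these assumed inputs the results fall close to the rounded values in the text (roughly 700-fold, 1400-fold, and 6 pg); slightly different genome-size estimates shift the numbers a little.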



Number of Genes in Organisms

The Human Genome Project, which aimed to determine the base sequence of all human DNA, is almost complete, and the genome base sequences of a number of other organisms are also available. One surprising finding is that the gene number in E. coli is only about one-sixth of that in humans: the estimated gene number is 4300 for E. coli and 25,000 for humans. There is not much difference in gene number among Drosophila, Arabidopsis thaliana, and humans.

Although the gene number in humans is not overwhelmingly larger than that in E. coli, eukaryotes, including humans, can synthesize a wide range of different proteins from one gene, with the ability to produce an estimated 100,000 types of proteins. This mechanism will be covered later (Chapters 8 and 24).
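
As a quick check of the arithmetic behind these statements, the following sketch uses only the estimates quoted in this section. The variable names are chosen here for illustration.

```python
# Estimates quoted in this section.
E_COLI_GENES = 4300
HUMAN_GENES = 25_000
HUMAN_PROTEIN_TYPES = 100_000

# How many times more genes humans have than E. coli.
gene_ratio = HUMAN_GENES / E_COLI_GENES

# Average number of protein types produced per human gene.
proteins_per_gene = HUMAN_PROTEIN_TYPES / HUMAN_GENES

print(f"human / E. coli gene number: about {gene_ratio:.1f}")
print(f"protein types per human gene: about {proteins_per_gene:.0f}")
```

On these estimates, humans have a little under six times as many genes as E. coli, and each human gene gives rise to about four protein types on average.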



Eukaryotes Contain Many Non-gene DNA Regions

While the DNA content in humans is much higher than that in E. coli, the disparity in gene number is comparatively small. Indeed, large regions of eukaryotic DNA do not include genes (amino acid sequence information). As indicated in Fig. 7-3, only about 1.3% of human DNA carries amino acid sequences for proteins, with a few exceptions.
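
Combining the two disparities gives a sense of how much more DNA there is per gene in humans. The sketch below uses only the rounded figures quoted in this chapter (a 700-fold difference in haploid DNA content, 4300 and 25,000 genes, and a 1.3% coding fraction), so the result is an order-of-magnitude estimate rather than a measured value.

```python
# Rounded figures quoted in this chapter.
DNA_CONTENT_RATIO = 700        # human haploid DNA / E. coli DNA
E_COLI_GENES = 4300
HUMAN_GENES = 25_000
HUMAN_CODING_FRACTION = 0.013  # 1.3% of human DNA codes for amino acids

# Humans have only about six times as many genes as E. coli ...
gene_ratio = HUMAN_GENES / E_COLI_GENES

# ... so the amount of DNA per gene is far larger in humans.
dna_per_gene_ratio = DNA_CONTENT_RATIO / gene_ratio

# Share of human DNA that does not code for amino acid sequences.
noncoding_fraction = 1 - HUMAN_CODING_FRACTION

print(f"DNA per gene, human vs E. coli: about {dna_per_gene_ratio:.0f}-fold")
print(f"non-coding share of human DNA: about {noncoding_fraction:.1%}")
```

On these figures, each human gene is accompanied by roughly 120 times as much DNA as an E. coli gene, which is consistent with the wide spacing between eukaryotic genes described earlier.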

In contrast to prokaryotic DNA, eukaryotic DNA characteristically contains a large number of repeated sequences (Table 7-2). In some species, they comprise more than half of all DNA. Short repeated base sequences may be present many times in the same area or scattered throughout the genome, and little is known about their significance or function.

Fig. 7-3 What types of DNA do humans possess?

Table 7-2 Three types of DNA sequences found in eukaryotes at different frequencies



Exons and Introns

Fig. 7-4 Exons and introns

Fig. 7-4 is a schematic diagram of a eukaryotic gene. The region that determines the structure (amino acid sequence) of a eukaryotic protein is contained in exons, which are separated from each other by introns (see Chapter 8, Fig. 8-8). Introns do not contain coding information for amino acid sequences; under the classical definition they are therefore not genes, although in eukaryotes the regions containing introns are generally referred to as genes. Some genes have introns that are tens to thousands of times longer than their exons.



Transcriptional Regulatory Domains

While the regions that regulate expression of specific genes are short in prokaryotes, at tens to hundreds of base pairs, they are often extremely long in eukaryotes, on the order of tens of kilobase pairs. This is another major difference between prokaryotes and eukaryotes. We will cover transcriptional regulation in Chapter 10.


Exon Shuffling

The repeated sequences and introns in eukaryotic DNA are not present in bacteria. The presence of these elements was probably useful for eukaryotes in creating new genes through the evolutionary process. DNA recombination occurs at an extremely high frequency during meiosis (see Chapter 18), the process in which germ cells are produced from gametocytes. This is homologous recombination, which occurs between different DNA molecules at regions with the same base sequence. When recombination occurs between the first intron of one gene and the second intron of the same gene on another DNA molecule, one DNA molecule loses an exon while the other gains one. These opportunities increase when there are repeated sequences within the introns. By analogy with the shuffling of playing cards, this mixing of exons is termed exon shuffling. It may have been a powerful method for creating new genes during the evolutionary process.
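
The exchange described above can be illustrated with a toy model. The sketch below is a simplification under stated assumptions: a gene is represented as a list of labeled segments, the function name `unequal_crossover` and the segment labels are hypothetical, and the hybrid intron formed at the breakpoint is simply given the label of the left-hand copy.

```python
from typing import List, Tuple

Gene = List[str]  # a gene as an ordered list of exon/intron labels


def unequal_crossover(copy_a: Gene, copy_b: Gene,
                      site_a: str, site_b: str) -> Tuple[Gene, Gene]:
    """Recombine copy_a at intron site_a with copy_b at intron site_b.

    Each product keeps the left part of one copy (up to and including its
    breakpoint intron) joined to the right part of the other copy (everything
    after that copy's breakpoint intron).
    """
    i = copy_a.index(site_a)
    j = copy_b.index(site_b)
    product_1 = copy_a[: i + 1] + copy_b[j + 1:]
    product_2 = copy_b[: j + 1] + copy_a[i + 1:]
    return product_1, product_2


def exons(gene: Gene) -> List[str]:
    """Return only the exon segments of a gene, in order."""
    return [segment for segment in gene if segment.startswith("exon")]


# The same three-exon gene on two different DNA molecules.
gene = ["exon1", "intron1", "exon2", "intron2", "exon3"]

# Crossover between intron 1 on one molecule and intron 2 on the other.
shorter, longer = unequal_crossover(gene, list(gene), "intron1", "intron2")

print("lost an exon:  ", exons(shorter))
print("gained an exon:", exons(longer))
```

In this model one product retains only exons 1 and 3, while the other carries exon 2 twice; the total number of exons across the two molecules is unchanged.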

Modern eukaryotes have many genes in which exons seem to have been added by the process described above. By acquiring the nucleus, a novel structure that can store DNA, eukaryotes became capable of stably maintaining large amounts of DNA and of actively modifying genes to create new ones, for example through exon shuffling during meiosis, thereby increasing the gene clusters important for development and differentiation. Diploidy, which allows cells carrying genes for novel products to survive rather than be eliminated, may also have contributed to this process. In this way, eukaryotes are believed to have become prepared for explosive development.
