What is the structure and function of DNA? The biological role of DNA and how the molecule is "packaged"

Almost everyone has heard that living cells contain DNA and knows that this molecule is responsible for transmitting hereditary information. A great many films build their plots, to one degree or another, on the properties of this small but very important molecule.

However, few people can explain even roughly what the DNA molecule is made of and how the information it holds about the "structure of the whole organism" is read. Only a few can even pronounce "deoxyribonucleic acid" without hesitation.

Let's try to figure out what the most important molecule for each of us consists of and what it looks like.

Structure of the building block: the nucleotide

DNA is a biopolymer, so the molecule is built from many structural units. A polymer is a macromolecule made of many small repeating fragments connected in series, just as a chain is made up of links.

The structural unit of the DNA macromolecule is the nucleotide. Each DNA nucleotide contains the residues of three substances: phosphoric acid, a sugar (deoxyribose) and one of four possible nitrogenous bases.

The nitrogenous bases found in DNA are adenine (A), guanine (G), cytosine (C) and thymine (T).

The composition of a nucleotide chain is written as the sequence of its bases, for example -AAGCGTTAGCACGT-. The sequence can be arbitrary. Such a chain forms a single strand of DNA.

Helical molecule. The phenomenon of complementarity

The human DNA molecule is monstrously huge (on the scale of other molecules, of course)! The haploid human genome contains approximately 3.1 billion base pairs, and a body cell with its 46 chromosomes carries two copies of it. Stretched end to end, the DNA of a single cell would be approximately two meters long. It is difficult to imagine how such a bulky molecule fits inside a tiny cell.

But nature took care of compact packaging and protection for the genome: two chains are joined through their nitrogenous bases and form the well-known double helix. This coiling is said to shorten the molecule almost sixfold.

The order of interaction of nitrogenous bases is strictly determined by the phenomenon of complementarity. Adenine can only bind to thymine, while cytosine can only bind to guanine. These complementary pairs fit together like a key and lock, like puzzle pieces.

Now let's calculate how much computer memory (or space on a flash drive) all the information in this small (on the scale of our world) molecule would occupy. The number of base pairs is 3.1 × 10^9. Each position can take one of 4 values, so 2 bits of information are enough for one pair (2^2 = 4 values). Multiplying the two gives 6,200,000,000 bits, or 775,000,000 bytes, or 775,000 kilobytes, or 775 megabytes. That roughly corresponds to the capacity of a CD or the size of a 40-minute TV episode in average quality.
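The arithmetic above can be reproduced in a few lines. This is only a sketch of the same estimate: the function name is made up for illustration, and it follows the text in using decimal units (1 megabyte = 1,000,000 bytes) and 2 bits per base pair.

```python
BASE_PAIRS = 3_100_000_000  # approximate figure quoted in the text
BITS_PER_PAIR = 2           # 4 possible values -> 2 bits


def genome_storage(base_pairs, bits_per_pair=BITS_PER_PAIR):
    """Return the raw storage needed for a sequence, in several units."""
    bits = base_pairs * bits_per_pair
    size_bytes = bits // 8
    return {
        "bits": bits,
        "bytes": size_bytes,
        "megabytes": size_bytes / 1_000_000,  # decimal megabytes, as in the text
    }


if __name__ == "__main__":
    for unit, value in genome_storage(BASE_PAIRS).items():
        print(unit, value)
```

The estimate ignores any compression and any metadata, so it is an upper bound for a plain 2-bit encoding rather than a statement about real genome file formats.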

Chromosome formation. Determination of the human genome

In addition to coiling into a helix, the molecule is compacted several more times. The double helix is wound up like thread on a spool: this process is called supercoiling and takes place with the help of special histone proteins, around which the chain is wound as on a coil.

This process shortens the molecule another 25-30 times. After several more levels of packaging, each more compact than the last, one DNA molecule together with its auxiliary proteins forms a chromosome.

All information about the form, type and functioning of our body is determined by a set of genes. A gene is a strictly defined section of a DNA molecule consisting of a fixed sequence of nucleotides. Moreover, a gene is defined not only by its composition but also by its position relative to other parts of the chain.

Ribonucleic acid and its role in protein synthesis

In addition to DNA, there are other types of nucleic acids: messenger, transfer and ribosomal RNA (ribonucleic acid). RNA chains are much shorter, which allows them to pass through the nuclear membrane.

The RNA molecule is also a biopolymer. Its structural units are similar to those of DNA, with one small difference in the sugar (ribose instead of deoxyribose). It also has four types of nitrogenous bases: the familiar A, G and C, plus uracil (U) instead of thymine. The picture above shows all this clearly.

A DNA macromolecule can pass its information to RNA only in an untwisted form. The helix is unwound by a special enzyme that separates the double helix into individual chains, like the two halves of a zipper.

At the same time, a complementary RNA chain is built along the DNA chain. After copying the information and leaving the nucleus for the cytoplasm, the RNA chain initiates the synthesis of the protein encoded by the gene. Protein synthesis takes place in special cell organelles, the ribosomes.
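The copying step described above can be illustrated with a toy model. The sketch below only applies the pairing rules (A-U, T-A, G-C, C-G) letter by letter. The function name is invented for this example, and it assumes the template strand is written in the order in which it is read, so that the RNA comes out in the order in which it is built.

```python
# Pairing rules for building RNA on a DNA template (uracil replaces thymine).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template):
    """Return the RNA strand complementary to a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)


if __name__ == "__main__":
    print(transcribe("TACGGC"))
```

Real transcription also involves promoters, strand orientation and RNA processing, none of which this sketch attempts to model.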

As the ribosome reads the chain, it determines the order in which amino acids must be joined, one after another, according to the information written in the RNA. The synthesized chain of amino acids then folds into a specific 3D shape.

This three-dimensional molecule is a protein, capable of performing its encoded function as an enzyme, hormone, receptor or building material.
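The way a ribosome reads RNA three letters at a time can be sketched as follows. The table below is deliberately partial: it lists only a handful of codons from the standard genetic code, enough for a short example, and the function name is made up for illustration.

```python
# A small subset of the standard genetic code (the full table has 64 codons).
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def translate(mrna):
    """Read an mRNA string codon by codon and return the amino acid chain."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"codon {codon} is not in this partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "Stop":
            break  # a stop codon ends the chain
        peptide.append(amino_acid)
    return peptide


if __name__ == "__main__":
    print(translate("AUGUUUGGCUAA"))
```

Folding of the finished chain into its 3D shape is a separate physical process and is not represented here.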

Conclusions

For any living being, protein is the end product of most genes. It is proteins that determine all the variety of forms, properties and qualities encoded in our cells.


Biochemical bases of heredity.

Genetic role of nucleic acids.

Nucleic acids are biological polymers found in all cells, from primitive to complex ones. They were first discovered by Johann Friedrich Miescher in 1868 in cells rich in nuclear material (leukocytes, salmon spermatozoa). The term "nucleic acids" was coined in 1889.

There are two types of nucleic acids: DNA and RNA (ATP is a mononucleotide). DNA and RNA are template molecules. A somatic cell contains about 6 × 10^-12 g of DNA, located in the nucleus and mitochondria. RNA is part of the ribosome and is found in the nucleus and cytoplasm.

The leading role of nucleic acids in transmitting hereditary information was studied and proved on viral particles. Tobacco mosaic virus is known to infect both tobacco and plantain. The virus particle consists of 95% protein and 5% nucleic acid. When the protein capsids of the viral particles were swapped, the protein in both strains reverted after a while to its previous form.

In bacteriophages that infect E. coli, the phage coat proteins were labeled with radioactive sulfur (S) and the phage DNA with radioactive phosphorus (P). The phage particles formed in an infected bacterial cell contained only radioactive P.

Structure and functions of DNA and RNA molecules.

Nucleic acids are biopolymers of irregular structure whose monomers are nucleotides. A nucleotide consists of the residues of three substances: phosphoric acid, a carbohydrate (pentose) and a nitrogenous base. DNA nucleotides contain deoxyribose, while RNA nucleotides contain ribose. The purine and pyrimidine bases that make up DNA are adenine, guanine, cytosine and thymine. RNA molecules contain adenine, guanine, cytosine and uracil.

Nucleotides are joined to each other through the phosphoric acid residue of one nucleotide and the carbohydrate of the next by a strong covalent ester (phosphodiester) bond, called the "oxygen bridge". The bond runs from the 5' carbon atom of the carbohydrate of one nucleotide to the 3' carbon atom of the carbohydrate of another. The nucleotide sequence is the primary structure of a nucleic acid. RNA is a single polynucleotide chain. DNA is a double polynucleotide chain coiled into a helix.

The secondary structure of DNA arises when a second DNA strand is formed, built according to the principle of complementarity with respect to the first. The second strand runs in the opposite direction to the first (antiparallel). The nitrogenous bases lie in a plane perpendicular to the axis of the molecule, so the structure resembles a spiral staircase. The railings of this staircase are the phosphoric acid and carbohydrate residues, and the steps are the nitrogenous bases.

The nitrogenous bases of the nucleotides in opposite chains are able to form complementary hydrogen bonds with each other (thanks to the functional groups present in each base). Adenine nucleotides are complementary to thymine nucleotides and guanine nucleotides to cytosine nucleotides, and vice versa. Taken individually these bonds are weak, but a DNA molecule "stitched" along its entire length by many such bonds is held together very firmly.

Complementarity is the spatial, structural and chemical correspondence of nitrogenous bases to each other: they fit together "like a key to a lock."

One DNA molecule can contain 10^8 or more nucleotides.

The structure of the DNA molecule as a double antiparallel helix was proposed in 1953 by the American biologist James Watson and the English physicist Francis Crick.

The DNA molecule of any living organism on the planet consists of only four types of nucleotides, which differ in the nitrogenous base they contain: adenine, guanine, thymine and cytosine. This is the universality of DNA. The order of the nucleotides varies, and the number of possible sequences is practically unlimited.

For each species of living organism and for each individual organism, this sequence is unique and strictly specific.

A distinctive feature of DNA structure is that the chemically active parts of the molecule, the nitrogenous bases, are buried in the center of the helix and form complementary bonds with each other, while the chemically inactive deoxyribose and phosphoric acid residues lie on the periphery and shield the bases. Such a structure can remain chemically stable for a long time. What else is needed to store hereditary information? It is these structural features of DNA that determine its ability to encode and reproduce genetic information.

The strong structure of DNA is difficult to destroy. Nevertheless, this happens regularly in the cell: during the synthesis of RNA and during the doubling of the DNA molecule itself before cell division.

Duplication, or DNA replication, begins when a special enzyme, helicase, unwinds the double helix and separates it into individual strands, forming a replication fork. The enzyme acts like the slider of a zipper. On each single strand of the replication fork, a new chain is synthesized from free nucleotides in the karyoplasm according to the principle of complementarity. In each of the two new DNA molecules, one strand is the original parent strand and the other is the new daughter strand. As a result, instead of one DNA molecule there are two molecules with exactly the same nucleotide composition as the original.
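The outcome of this process, two molecules that each keep one parent strand, can be shown with a short sketch. The names below are invented for illustration, and the two strands of a duplex are written position by position (base i of one strand faces base i of the other), which ignores the antiparallel orientation for simplicity.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Build the strand that pairs with the given one, base by base."""
    return "".join(PAIR[base] for base in strand)


def replicate(duplex):
    """Split a duplex and rebuild a partner for each parent strand.

    Returns two daughter duplexes; each contains one original strand
    and one newly synthesized strand.
    """
    parent_a, parent_b = duplex
    daughter_1 = (parent_a, complement(parent_a))
    daughter_2 = (complement(parent_b), parent_b)
    return daughter_1, daughter_2


if __name__ == "__main__":
    original = ("AAGCGT", "TTCGCA")
    print(replicate(original))
```

Both daughters come out identical to the original duplex, which is the point of copying by complementarity.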

In living systems we encounter a type of reaction unknown in inanimate nature: template (matrix) synthesis reactions. Template synthesis resembles casting in a mold: new molecules are synthesized exactly according to the plan laid down in the structure of existing molecules. These reactions ensure the exact sequence of monomer units in the synthesized polymer. The monomers are brought to a specific place on the molecule that serves as the template, and the reaction takes place there. If such reactions depended on random collisions of molecules, they would proceed extremely slowly. Synthesis on a template, assisted by enzymes, is fast and accurate. Template synthesis underlies the most important reactions in the synthesis of nucleic acids and proteins. The role of the template in the cell is played by nucleic acid molecules, DNA or RNA. The monomers from which the polymer is built, nucleotides or amino acids, are positioned and fixed on the template in a strictly defined order in accordance with the principle of complementarity. The monomer units are then joined into a polymer chain, and the finished polymer leaves the template. After that, the template is ready to assemble another, identical polymer molecule.

Matrix-type reactions are a specific feature of a living cell. They are the basis of the fundamental property of all living things - the ability to reproduce their own kind.

Functions of nucleic acids: storage and transmission of hereditary information. DNA molecules encode information about the primary structure of proteins. mRNA molecules are synthesized on the DNA template; this process is called "transcription". In the process of "translation", mRNA realizes this information as a sequence of amino acids in a protein molecule.

The DNA of each cell carries information not only about the structural proteins that determine the shape of the cell, but also about all enzyme proteins, hormone proteins and other proteins, as well as the structure of all types of RNA.

It is possible that nucleic acids provide various types of biological memory - immunological, neurological, etc., and also play an essential role in the regulation of biosynthetic processes.




There are primary, secondary and tertiary structures of RNA and DNA.

The primary structure of RNA and DNA is the same: a linear polynucleotide chain in which the nucleotides are joined by phosphodiester bonds, formed by phosphoric acid residues between the 3' carbon atom of one nucleotide and the 5' carbon atom of the next.

The secondary structure of DNA is characterized by the rules of E. Chargaff (regularity of the quantitative content of nitrogenous bases).

DNA molecules consist of two antiparallel strands with a complementary nucleotide sequence. The chains are twisted relative to each other in a right-handed helix so that there are approximately 10 base pairs per turn.

Based on X-ray diffraction data and Chargaff's rules, in 1953 J. Watson and F. Crick proposed a model of the secondary structure of DNA in the form of a double helix.

According to this model, the DNA molecule consists of two strands twisted into a right-handed helix around a common axis. The nitrogenous bases are inside, and the phosphate and carbohydrate components are outside. The helix diameter is 1.8 nm. The bases lie at a right angle to the axis of the helix. The helix pitch is 3.4 nm, and each turn contains 10 base pairs. The polynucleotide chains run in opposite directions (antiparallel).

Nitrogenous bases in the DNA molecule are arranged in a strictly specific way, according to the principle of complementarity: A interacts only with T and G only with C, i.e. thymine is always opposite adenine, and cytosine is always opposite guanine. A-T and G-C are called complementary base pairs.

The secondary structure of DNA is stabilized by hydrogen bonds, stacking and hydrophobic interactions.

Hydrogen bonds between pairs of complementary nucleotides (two for the A-T pair and three for the G-C pair) are relatively weak. They act across the helix. Therefore, the complementary strands of a DNA molecule can separate and rejoin when conditions change (for example, temperature or salt concentration).
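Since each A-T pair contributes two hydrogen bonds and each G-C pair three, the total number of bonds holding a duplex together follows directly from the sequence of one strand. The helper below is a hypothetical illustration of that count, not a model of duplex stability.

```python
# Hydrogen bonds contributed by the pair that each base forms (A-T: 2, G-C: 3).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def hydrogen_bonds(strand):
    """Count hydrogen bonds between a strand and its complementary partner."""
    return sum(H_BONDS[base] for base in strand)


if __name__ == "__main__":
    for seq in ("ATATAT", "GCGCGC"):
        print(seq, hydrogen_bonds(seq))
```

For strands of equal length, the one richer in G and C has more hydrogen bonds, which is one reason such regions are harder to separate.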

Stacking interaction of bases is the interplanar non-covalent interaction of bases located one above another in nucleic acids; it helps maintain the secondary structure of a double-stranded DNA molecule. These interactions act along the helix.

Hydrophobic interactions occur between adjacent bases of the same chain, which contributes to the peculiar stacking of the chain.

The tertiary structure of DNA is a helix and supercoil in complex with proteins. DNA can exist in a linear form (in eukaryotic chromosomes) and in a circular form (in prokaryotes and mitochondria). Spiralization is characteristic of both forms.

The material of chromosomes - chromatin - contains, in addition to DNA itself, also histones, non-histone proteins, and a small amount of RNA. Chromatin is a complex of proteins with the nuclear DNA of cells.


The properties of DNA are determined by its structure:

1. Universality: the principles of DNA construction are the same for all organisms.

2. Specificity: determined by the ratio of nitrogenous bases (A + T) / (G + C), which is specific to each species. In humans it is 1.35, in bacteria 0.39.

Specificity depends on:

the number of nucleotides

the type of nucleotide

the arrangement of nucleotides in the DNA chain

3. Replication, or DNA self-duplication: DNA → DNA. The genetic program of cellular organisms is written in the nucleotide sequence of DNA. To preserve the unique properties of an organism, this sequence must be reproduced accurately in each subsequent generation. During cell division the DNA content must double so that each daughter cell receives the full complement of DNA, i.e. in any dividing human somatic cell, 6.4 × 10^9 nucleotide pairs must be copied. The process of DNA duplication is called replication. Replication is a template synthesis reaction: each of the two strands of DNA serves as a template for the formation of a complementary (daughter) strand. It takes place in the S period of the interphase of the cell cycle. The high reliability of replication guarantees almost error-free transmission of genetic information over many generations. The signal that triggers DNA synthesis in the S period is the so-called S-factor (specific proteins). Knowing the replication rate and the length of a eukaryotic chromosome, one can calculate the replication time: in theory it would be several days, but in practice replication takes 6-12 hours. It follows that replication in eukaryotes begins simultaneously at several places on a single DNA molecule.

The unit of replication is the replicon, a section of DNA within which replication occurs. The number of replicons per interphase chromosome in eukaryotes can reach 100 or more. A mammalian cell can have 20-30 thousand replicons, a human cell about 50 thousand. At a fixed chain growth rate (100 nucleotides per second in eukaryotes), multiple initiation makes the process fast and shortens the time needed to duplicate long sections of chromosomes, i.e. replication in eukaryotes is polyreplicon (Fig. 21).
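The effect of multiple initiation on replication time can be estimated roughly. The sketch below uses the rate of 100 nucleotides per second given in the text; the chromosome length of 150 million base pairs, the even spacing of origins, their simultaneous start and the two forks per origin are all assumptions made only for this example.

```python
def replication_hours(length_bp, origins, rate_nt_per_s=100, forks_per_origin=2):
    """Estimate hours needed to copy a chromosome with evenly spaced origins.

    Assumes every origin fires at the same moment and every fork moves
    at the same constant rate, which real cells do not strictly do.
    """
    seconds = length_bp / (origins * forks_per_origin * rate_nt_per_s)
    return seconds / 3600


if __name__ == "__main__":
    chromosome = 150_000_000  # hypothetical chromosome length in base pairs
    for n in (1, 100):
        print(n, "origin(s):", round(replication_hours(chromosome, n), 1), "hours")
```

Under these assumptions a single origin would need more than a week, while a hundred origins bring the time down to a few hours, which is the qualitative argument made in the text.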

The replicon contains all the necessary genes and regulatory sequences that enable replication. Each replicon in the process of cell division is activated once. Replication is controlled at the initiation stage. Once the doubling process has begun, it will continue until the entire replicon has been doubled.

In prokaryotes, all DNA is one replicon.

Fig. 21. Replication of eukaryotic chromosomal DNA. Replication proceeds in both directions from different origins of replication (Ori), forming bubbles. A "bubble" or "eye" is a region of replicated DNA within unreplicated DNA. (A. S. Konichev, G. A. Sevastyanova, 2005, p. 213)

Enzymes involved in the replication process are combined into a multi-enzymatic complex. 15 enzymes are involved in DNA replication in prokaryotes, and more than 30 in eukaryotes, i.e. replication is an extremely complex and super-precise multi-stage enzymatic process. Enzyme complexes include the following enzymes:

1) DNA polymerases (I, III) catalyze complementary copying, i.e. they are responsible for the growth of the daughter strand (Fig. 22). Prokaryotes replicate at a rate of 1000 nucleotides per second, and eukaryotes at 100 nucleotides per second. The lower rate of synthesis in eukaryotes is associated with the slow dissociation of histone proteins, which must be removed for DNA polymerase to advance along the DNA strand in the replication fork.

2) DNA primase. DNA polymerases can only lengthen an existing polynucleotide chain by adding nucleotides to it. Therefore, to start DNA synthesis, DNA polymerase needs a primer. DNA primase synthesizes such a primer, which is later replaced by DNA segments (Fig. 22).

3) DNA ligase joins Okazaki fragments to each other by forming a phosphodiester bond.

4) DNA helicase unwinds the DNA helix by breaking the hydrogen bonds between its strands. As a result, two single, oppositely directed DNA branches are formed (Fig. 22).

5) SSB proteins bind to single-stranded DNA and stabilize it, i.e. they create conditions for complementary pairing.

DNA replication does not begin at a random point in the molecule but at specific sites called origins of replication (Ori). They have particular nucleotide sequences that make it easier for the chains to separate (Fig. 21). When replication is initiated at an Ori, one or two replication forks form; these are the sites where the parental DNA strands separate. Copying continues until the DNA is completely duplicated or until the replication forks of two adjacent origins merge. Replication origins in eukaryotes are scattered along the chromosome at intervals of about 20,000 base pairs (Fig. 21).

Fig.22. DNA replication (explanation in text). (B. Alberts et al., 1994, vol. 2, p. 82)

The enzyme helicase breaks hydrogen bonds, i.e. unwinds the double strand, forming two oppositely directed branches of DNA (Fig. 22). The single-stranded regions are bound by special SSB proteins, which line up on the outside of each parent chain and keep the chains apart. This makes the nitrogenous bases available for binding to complementary nucleotides. At the point where these branches diverge, moving in the direction of replication, sits the enzyme DNA polymerase, which catalyzes the process and controls the accuracy of complementary synthesis. A feature of this enzyme is that it works in one direction only: the daughter strand of DNA is built from the 5' end to the 3' end. On one parent strand, the daughter DNA is synthesized continuously (the leading strand). It grows from the 5' to the 3' end in the direction in which the replication fork moves and therefore needs only one act of initiation. On the other parent strand, the daughter strand is synthesized as short fragments with the usual 5'-3' polarity, which the enzyme ligase then joins into one continuous lagging strand. The synthesis of the lagging strand therefore requires several acts (points) of initiation.

This method of synthesis is called discontinuous replication. The fragments synthesized on the lagging strand are called Okazaki fragments in honor of their discoverer. They are found in all replicating DNA, in both prokaryotes and eukaryotes. Their length is 1000-2000 nucleotides in prokaryotes and 100-200 in eukaryotes. Thus, replication produces 2 identical DNA molecules, in each of which one strand is parental and the other is newly synthesized. This type of replication is called semi-conservative. This mode of replication was proposed by J. Watson and F. Crick and proved in 1958 by M. Meselson and F. Stahl. After replication, chromatin consists of 2 decompacted DNA molecules joined at the centromere.

Errors can occur during replication; in prokaryotes and eukaryotes they have the same frequency, one per 10^8-10^10 nucleotides, i.e. an average of 3 errors per genome. This attests to the high accuracy and coordination of the replication process.
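The figure of about 3 errors per genome can be checked with simple arithmetic. The sketch below assumes a genome of 3.2 billion nucleotide pairs (half of the 6.4 × 10^9 pairs quoted earlier for a somatic cell) and one error per 10^9 nucleotides, a value in the middle of the quoted range; both choices are assumptions for illustration.

```python
def expected_errors(genome_size, nucleotides_per_error):
    """Expected number of copying errors for one pass over the genome."""
    return genome_size / nucleotides_per_error


if __name__ == "__main__":
    genome = 3_200_000_000  # assumed genome size in nucleotide pairs
    for per_error in (10**8, 10**9, 10**10):
        print(f"1 error per {per_error}:", expected_errors(genome, per_error))
```

Across the quoted range the estimate spans two orders of magnitude, so "about 3" should be read as a typical value rather than a precise count.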

Replication errors are corrected by DNA polymerase III (the "proofreading" mechanism) or by the repair system.

4. Repair is the property of DNA to restore its integrity, i.e. to repair damage. Transmitting hereditary information undistorted is the most important condition for the survival of both the individual organism and the species as a whole. Most changes are harmful to the cell: they lead to mutations, block DNA replication, or cause cell death. DNA is constantly exposed to spontaneous factors (replication errors, disruption of nucleotide structure, etc.) and induced environmental factors (UV irradiation, ionizing radiation, chemical and biological mutagens). In the course of evolution, a system has developed that corrects defects in DNA: the DNA repair system. As a result of its activity, only one in every 1000 DNA lesions leads to a mutation. Damage is any change in DNA that causes a deviation from the normal double-stranded structure:

1) the appearance of single-strand breaks;

2) removal of one of the bases, as a result of which its homologue remains unpaired;

3) replacement of one base in a complementary pair with another incorrectly paired with a partner base;

4) the appearance of covalent bonds between the bases of one DNA chain or between bases on opposite chains.

Repair can take place before DNA doubling (pre-replicative repair) and after DNA doubling (post-replicative). Depending on the nature of the mutagens and the degree of DNA damage in the cell, there is light (photoreactivation), dark, SOS-repair, etc.

It is thought that photoreactivation occurs in the cell when DNA damage is caused by natural conditions (physiological characteristics of the organism, common environmental factors, including ultraviolet rays). In this case DNA integrity is restored with the participation of visible light: the repair enzyme is activated by quanta of visible light, binds to the damaged DNA, splits the pyrimidine dimers at the damaged site, and restores the integrity of the DNA strand.

Dark (excision) repair is observed after exposure to ionizing radiation, chemicals, etc. It involves removing the damaged region and restoring the normal structure of the DNA molecule (Fig. 23). This type of repair requires the second, complementary strand of DNA. Dark repair is a multi-stage process involving a complex of enzymes, namely:

1) an enzyme that recognizes a damaged section of a DNA chain

2) DNA - endonuclease, makes a break in the damaged DNA chain

3) exonuclease removes the altered part of the DNA strand

4) DNA - polymerase I synthesizes a new DNA segment to replace the removed one

5) DNA ligase joins the end of the old DNA strand with the newly synthesized one, i.e. closes two ends of DNA (Fig. 23). 25 enzyme proteins are involved in dark repair in humans.
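The five steps above can be mimicked on strings to show why the intact complementary strand is needed. This is a toy model with invented names: damage is represented by any symbol that does not pair correctly with the base opposite it (here "X"), and the two strands are written position by position.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def excision_repair(damaged, intact):
    """Restore a damaged strand using its intact complementary strand."""
    repaired = list(damaged)
    for i, (base, partner) in enumerate(zip(damaged, intact)):
        # Step 1: recognise a position that no longer pairs correctly.
        if PAIR.get(base) != partner:
            # Steps 2-4: cut out the altered base and resynthesize it
            # from the information in the complementary strand.
            repaired[i] = PAIR[partner]
    # Step 5: the repaired pieces are joined back into one strand.
    return "".join(repaired)


if __name__ == "__main__":
    print(excision_repair("AAXCGT", "TTCGCA"))
```

If both strands were damaged at the same position, this approach would have no correct information to copy from, which matches the statement that dark repair requires a second complementary strand.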

When DNA damage is extensive enough to threaten the life of the cell, SOS repair is switched on. SOS repair was discovered in 1974. This type of repair is observed after exposure to large doses of ionizing radiation. A characteristic feature of SOS repair is inaccuracy in restoring the primary structure of DNA, which is why it is called error-prone repair. The main goal of SOS repair is to maintain cell viability.

Defects in the repair system can lead to premature aging, cancer, autoimmune diseases, and the death of a cell or organism.

Fig. 23. Repair of damaged DNA by replacing modified nucleotide residues (dark or excision repair). (M. Singer, P. Berg, 1998, v. 1, p. 100)

DNA logic is a DNA computing technology that is in its infancy today but holds great promise for the future. Biological nanocomputers implanted in living organisms still seem fantastic and unreal to us. But what is unrealistic today may tomorrow turn out to be so ordinary and natural that it will be hard to imagine how anyone managed without it.

So, DNA computing is a branch of molecular computing at the boundary between molecular biology and computer science. Its main idea is to build a new paradigm and new computational algorithms based on knowledge of the structure and functions of the DNA molecule and of the operations that enzymes perform on DNA molecules in living cells. The prospects of DNA computing include a biological nanocomputer able to store terabytes of information while being only a few micrometers in size. Such a computer could be implanted into a cell of a living organism, and its performance would amount to billions of operations per second with an energy consumption of no more than one billionth of a watt.

Benefits of DNA in Computer Technology

Silicon is used as a building material for modern processors and microcircuits. But the possibilities of silicon are not unlimited, and eventually we will come to the point where further growth in the processing power of processors will be exhausted. Therefore, humanity is already facing the acute problem of finding new technologies and materials that could replace silicon in the future.

DNA molecules may turn out to be the very material that will subsequently replace silicon transistors with their binary logic. Suffice it to say that just one pound (453 g) of DNA molecules has a data storage capacity that exceeds the total capacity of all modern electronic data storage systems, and the processing power of a drop-sized DNA processor will be higher than the most powerful modern supercomputer.

More than 10 trillion DNA molecules occupy a volume of only 1 cm^3. Yet this number of molecules is enough to store 10 TB of information and can perform 10 trillion operations per second.

Another advantage of DNA processors compared to conventional silicon processors is that they can perform all calculations in parallel, not sequentially, which ensures that the most complex mathematical calculations can be performed literally in a matter of minutes. Traditional computers would take months and years to perform such calculations.

The structure of DNA molecules

As you know, modern computers work with binary logic, which has only two states: logical zero and one. Using a binary code, that is, a sequence of zeros and ones, you can encode any information. DNA molecules contain four bases: adenine (A), guanine (G), cytosine (C) and thymine (T), linked to each other in a chain. A DNA molecule (single strand) can therefore look like this, for example: ATTTACGGCC. Here the logic is quaternary rather than binary. And just as any information can be encoded in binary logic as a sequence of zeros and ones, in DNA molecules any information can be encoded by combining the four bases.
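To make the idea of quaternary encoding concrete, here is a minimal sketch that stores each byte as four bases, two bits per base. The mapping A=00, C=01, G=10, T=11 and the function names are arbitrary choices for this example, not an established standard.

```python
BASES = "ACGT"  # index of each letter is its 2-bit value: A=0, C=1, G=2, T=3


def encode(data):
    """Encode bytes as a DNA string, four bases per byte."""
    letters = []
    for byte in data:
        for shift in (6, 4, 2, 0):  # most significant pair of bits first
            letters.append(BASES[(byte >> shift) & 0b11])
    return "".join(letters)


def decode(dna):
    """Decode a DNA string produced by encode() back into bytes."""
    values = [BASES.index(base) for base in dna]
    return bytes(
        (values[i] << 6) | (values[i + 1] << 4) | (values[i + 2] << 2) | values[i + 3]
        for i in range(0, len(values), 4)
    )


if __name__ == "__main__":
    strand = encode(b"DNA")
    print(strand, decode(strand))
```

Any binary file can be round-tripped this way; practical DNA storage schemes add constraints and error correction that this sketch leaves out.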

The bases in DNA molecules are spaced 0.34 nanometers apart, which gives them an enormous information capacity: the linear density is about 18 megabytes per inch (at 2 bits per base). As for surface density, assuming an area of 1 square nanometer per base, it exceeds a million gigabits per square inch. For comparison, the areal recording density of modern hard drives is about 7 Gbit/inch^2.
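These density figures can be checked from the base spacing alone. The sketch below assumes 2 bits per base and decimal units; under those assumptions the linear density comes out at roughly 18.7 megabytes per inch, which suggests the figure of 18 refers to megabytes per inch of strand. The function names are made up for illustration.

```python
NM_PER_INCH = 2.54e7     # 1 inch = 2.54 cm = 2.54e7 nm
BASE_SPACING_NM = 0.34   # distance between neighbouring bases, from the text
BITS_PER_BASE = 2        # four possible bases -> 2 bits (assumption)


def linear_density_megabytes_per_inch():
    """Information stored along one inch of a single DNA strand."""
    bases = NM_PER_INCH / BASE_SPACING_NM
    return bases * BITS_PER_BASE / 8 / 1e6


def surface_density_gigabits_per_square_inch(area_nm2_per_base=1.0):
    """Information per square inch if each base occupies the given area."""
    bases = NM_PER_INCH ** 2 / area_nm2_per_base
    return bases * BITS_PER_BASE / 1e9


if __name__ == "__main__":
    print(round(linear_density_megabytes_per_inch(), 1), "MB per inch")
    print(f"{surface_density_gigabits_per_square_inch():.3g} Gbit per square inch")
```

The surface figure exceeds a million gigabits per square inch, in line with the claim in the text.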

Another important property of DNA molecules is that they can take the form of a regular double helix only 2 nm in diameter. Such a helix consists of two chains (sequences of bases), and the content of the first chain strictly corresponds to the content of the second.

This correspondence is achieved due to the presence of hydrogen bonds between the bases of two strands directed towards each other - in pairs G and C or A and T. Describing this property of the double helix, molecular biologists say that DNA strands are complementary due to the formation of G-C and A-T pairs.

For example, if the sequence S is written as ATTACGTCG, then the sequence S' that complements it will be of the form TAATGCAGC.
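The example with S and S' can be turned into a small check of whether two strands are able to form a duplex. The function names are invented for this sketch, and the strands are compared position by position, exactly as they are written in the example above.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the complementary sequence S' for a sequence S."""
    return "".join(PAIR[base] for base in strand)


def can_renature(strand_a, strand_b):
    """True if every base of one strand pairs with the base opposite it."""
    return len(strand_a) == len(strand_b) and complement(strand_a) == strand_b


if __name__ == "__main__":
    s = "ATTACGTCG"
    print(complement(s))
    print(can_renature(s, "TAATGCAGC"))
```

A single mismatched position is enough for the check to fail, which is the property that error-detection schemes built on complementary strands rely on.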

The process of connecting two single strands of DNA into a regular double helix by pairing complementary bases is called renaturation, and the reverse process, that is, separating the double strand into two single strands, is called denaturation (Fig. 1).

Fig. 1. Processes of renaturation and denaturation

The complementarity of DNA strands can be put to use in DNA computing. For example, sequences that complement each other can serve as the basis for a powerful error correction mechanism that is somewhat reminiscent of RAID Level 1 data mirroring.

Basic operations on DNA molecules

Various enzymes are used to manipulate DNA molecules. Just as modern microprocessors have a set of basic operations such as addition, shift and the logical operations AND, OR, NOT and NOR, DNA molecules under the influence of enzymes can undergo basic operations such as cutting, copying and pasting. Moreover, all operations on DNA molecules can be performed in parallel and independently of one another. For example, a DNA strand is extended when the initial molecule is exposed to enzymes called polymerases. For a polymerase to work, it needs a single-stranded molecule (the template), which determines the added strand according to the principle of complementarity, a primer (a short double-stranded region) and free nucleotides in solution. The process of completing the DNA strand is shown in Fig. 2.

Fig. 2. Completion of a DNA strand when a polymerase acts on the original molecule

There are polymerases that do not require templates for DNA chain elongation. For example, a terminal transferase adds single strands of DNA to both ends of a double stranded molecule. In this way, an arbitrary DNA strand can be constructed (Fig. 3).

Fig. 3. The process of DNA chain elongation

Enzymes called nucleases shorten and cut DNA molecules. There are endonucleases and exonucleases. Exonucleases can shorten both single-stranded and double-stranded molecules from one or both ends (Fig. 4), while endonucleases cut within the molecule rather than at its ends.

Fig. 4. Shortening of a DNA molecule under the influence of an exonuclease

DNA molecules can be cut by site-specific endonucleases called restriction enzymes, which cut at a specific place determined by a nucleotide sequence (the recognition site). The cut can be blunt or staggered, and it can pass through the recognition site or outside it. Endonucleases break internal bonds in the DNA molecule (Fig. 5).

Fig. 5. Cutting of a DNA molecule by restriction enzymes

Ligation, the operation opposite to cutting, is carried out by enzymes called ligases. The "sticky ends" first join together by forming hydrogen bonds. Ligases then seal the nicks, that is, they promote the formation of phosphodiester bonds in the right places, connecting neighboring nucleotides within the same strand (Fig. 6).

Fig. 6. Ligation of DNA molecules under the influence of ligases

Another interesting operation on DNA molecules, which can be classified as basic, is modification. It is used to prevent restriction enzymes from finding a particular site and destroying the molecule. There are several types of modifying enzymes - methylases, phosphatases, etc.

A methylase has the same recognition site as the corresponding restriction enzyme. When it finds the desired site, the methylase modifies it so that the restriction enzyme can no longer recognize it.

DNA molecules are copied, or replicated, by means of the polymerase chain reaction (PCR), shown in Fig. 7. The copying process can be divided into several stages: denaturation, priming and elongation. It proceeds like an avalanche: at the first step two molecules are formed from one, at the second four molecules are formed from two, and after n steps 2^n molecules are obtained.

Fig. 7. The process of copying a DNA molecule
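The avalanche-like growth of PCR is plain repeated doubling. The sketch below assumes an idealized reaction in which every cycle doubles every molecule; the function name `copies_after` is made up for this illustration.

```python
def copies_after(cycles: int, start: int = 1) -> int:
    """Number of molecules after a given number of ideal PCR cycles."""
    # Each cycle doubles the population, so the count grows as 2 to the power of cycles.
    return start * 2 ** cycles

for n in (1, 2, 10, 30):
    print(n, copies_after(n))
```

Thirty ideal cycles already give more than a billion copies of a single starting molecule.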

Another operation that can be performed on DNA molecules is sequencing, that is, determining the order of nucleotides in DNA. Different methods are used for strands of different lengths. With the primer walking method, 250-350 nucleotides can be sequenced in one step. After the discovery of restriction enzymes, it became possible to sequence long strands piece by piece.

Well, the last procedure that we will mention is gel electrophoresis, used to separate DNA molecules by length. If the molecules are placed in a gel and a constant electric field is applied, they will move towards the anode, with shorter molecules moving faster. Using this phenomenon, it is possible to implement the sorting of DNA molecules by length.

DNA computing

DNA molecules, with their unique structure and capacity for parallel computation, let us take a different look at the problem of computing. Traditional processors execute programs sequentially. Despite the existence of multiprocessor systems, multi-core processors and various technologies aimed at increasing parallelism, all computers built on the von Neumann architecture are at their core devices with sequential instruction execution. All modern processors implement the same processing cycle: fetch instructions and data from memory, then execute the instructions on the fetched data. This cycle is repeated many times and at great speed.

DNA computing is based on a completely different, parallel architecture, and in some cases this is precisely what allows it to handle problems that would take computers based on the von Neumann architecture years to solve.

Adleman's experiment

The history of DNA computing begins in 1994. It was then that Leonard M. Adleman tried to solve a fairly simple mathematical problem in a completely non-trivial way - using DNA computing. In fact, this was the first demonstration of a prototype biological computer based on DNA computing.

The problem that Adleman chose to solve using DNA computing is known as finding a Hamiltonian path in a graph, or choosing a travel route (it is closely related to the traveling salesman problem). Its meaning is as follows: there are several cities that you need to visit, and you can visit each city only once.

Knowing the point of departure and the final point, it is necessary to determine the route of travel (if it exists). At the same time, the route is compiled taking into account possible flights and connections of various flights.

So, let's assume that there are only four cities (Adleman's experiment used seven): Atlanta, Boston, Detroit and Chicago. The traveler's task is to choose a route from Atlanta to Detroit that visits each city exactly once. The possible connections between the cities are shown in Fig. 8.

Fig. 8. Possible connections between cities

It is easy to see (it takes only a few seconds to do this) that the only possible route (Hamiltonian path) is the following: Atlanta - Boston - Chicago - Detroit.

Indeed, with a small number of cities, compiling such a route is quite simple. But with an increase in their number, the complexity of solving the problem grows exponentially and becomes difficult not only for a person, but also for a computer.
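For the four-city case, a conventional computer can simply try every ordering of the intermediate cities. The sketch below does that. The flight list is an assumption: the connection diagram is not reproduced in the text, so `FLIGHTS` contains only the connections that the article mentions by name, and the function name is hypothetical.

```python
from itertools import permutations

# ASSUMPTION: one-way flights reconstructed from those named in the article.
FLIGHTS = {
    ("Atlanta", "Boston"), ("Atlanta", "Detroit"),
    ("Boston", "Atlanta"), ("Boston", "Chicago"), ("Boston", "Detroit"),
    ("Chicago", "Detroit"),
}
CITIES = ["Atlanta", "Boston", "Chicago", "Detroit"]

def hamiltonian_paths(start, end):
    """Try every ordering of the intermediate cities and keep the valid routes."""
    middle = [c for c in CITIES if c not in (start, end)]
    found = []
    for order in permutations(middle):
        route = (start, *order, end)
        # A route is valid only if every consecutive pair is an existing flight.
        if all((a, b) in FLIGHTS for a, b in zip(route, route[1:])):
            found.append(route)
    return found

print(hamiltonian_paths("Atlanta", "Detroit"))
# [('Atlanta', 'Boston', 'Chicago', 'Detroit')]
```

With n cities this loop checks (n - 2)! orderings, which is why the approach stops being practical as the graph grows.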

Fig. 9 shows a graph of seven vertices with the possible transitions between them marked. It takes an ordinary person no more than a minute to find the Hamiltonian path. This is the graph that was used in Adleman's experiment. Fig. 10 shows a graph of 12 vertices - in this case, finding a Hamiltonian path is no longer such an easy task. In general, the complexity of finding a Hamiltonian path grows exponentially with the number of vertices in the graph. For example, for a graph of 10 vertices there are 10^6 possible paths; for a graph of 20 vertices, 10^12; and for a graph of 100 vertices, 10^100. It is clear that in the latter case generating all possible paths and checking them would take a huge amount of time even for a modern supercomputer.

Fig. 9. Finding the best travel route

Fig. 10. A graph consisting of 12 vertices

So, let's return to our example of finding a Hamiltonian path in the case of four cities (see Fig. 8).

To solve this problem using DNA computing, Adleman encoded the name of each city as a single strand of DNA containing 20 bases. For simplicity, we will encode each city with an eight-base DNA strand. The DNA codes of the cities are shown in Table 1. Note that eight bases are more than is needed to encode only four cities.

Table 1. DNA codes of cities

Note that for each city DNA code, which defines a single DNA strand, there is also a complementary strand, that is, a complementary city DNA code. The city DNA code and its complementary code are entirely equivalent ways of identifying the city.

Next, all possible flights (Atlanta - Boston, Boston - Detroit, Chicago - Detroit, etc.) must be encoded using single DNA strands. The following approach was used: the last four bases are taken from the code of the departure city, and the first four bases are taken from the code of the arrival city.

For example, the flight Atlanta - Boston will correspond to the following sequence: GCAG TCGG (Fig. 11).

Fig. 11. Encoding flights between cities

The DNA encoding of all possible flights is shown in Table 2.

Table 2. DNA codes for all possible flights
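The flight-encoding rule (last four bases of the departure city, first four bases of the arrival city) is easy to express in code. In this sketch the codes for Atlanta, Boston and Chicago are the ones that can be read off the sequences quoted in the text. For Detroit the text gives only the first four bases (CCGA), so its last four bases here are an assumption. `CITY_CODES` and `flight_code` are illustrative names.

```python
CITY_CODES = {
    "Atlanta": "ACTTGCAG",
    "Boston": "TCGGACTG",
    "Chicago": "GGCTATGT",
    "Detroit": "CCGAGCAA",  # ASSUMPTION: only the first half, CCGA, appears in the text
}

def flight_code(departure: str, arrival: str) -> str:
    """Last four bases of the departure code followed by the first four of the arrival code."""
    return CITY_CODES[departure][-4:] + CITY_CODES[arrival][:4]

print(flight_code("Atlanta", "Boston"))   # GCAGTCGG
print(flight_code("Boston", "Chicago"))   # ACTGGGCT
print(flight_code("Chicago", "Detroit"))  # ATGTCCGA
```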

So, after the codes of cities and possible flights between them are ready, you can directly proceed to the calculation of the Hamiltonian path. The calculation process consists of four steps:

  1. Generate all possible routes.
  2. Select routes that start in Atlanta and end in Detroit.
  3. Select routes whose length corresponds to the number of cities (in our case, the length of the route is four cities).
  4. Select routes in which each city is present only once.
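The four steps can be imitated in software as a generate-and-filter pipeline. This is a sketch under stated assumptions, not a description of the laboratory protocol. The flight list is reconstructed from the connections the article names, because the connection diagram is not reproduced. The `max_cities` cap is a hypothetical parameter that stands in for the finite length of strands formed in the test tube.

```python
# ASSUMPTION: one-way flights reconstructed from those named in the article.
FLIGHTS = {
    ("Atlanta", "Boston"), ("Atlanta", "Detroit"),
    ("Boston", "Atlanta"), ("Boston", "Chicago"), ("Boston", "Detroit"),
    ("Chicago", "Detroit"),
}
CITIES = sorted({city for flight in FLIGHTS for city in flight})

def all_routes(max_cities):
    """Step 1: every walk along existing flights, up to max_cities cities long."""
    routes = [(c,) for c in CITIES]
    frontier = list(routes)
    for _ in range(max_cities - 1):
        # Extend each walk by every flight that departs from its last city.
        frontier = [r + (b,) for r in frontier for (a, b) in FLIGHTS if a == r[-1]]
        routes.extend(frontier)
    return routes

def solve(start, end, max_cities=6):
    pool = all_routes(max_cities)
    # Step 2 (PCR in the experiment): keep routes with the right endpoints.
    pool = [r for r in pool if r[0] == start and r[-1] == end]
    # Step 3 (gel electrophoresis): keep routes of the right length.
    pool = [r for r in pool if len(r) == len(CITIES)]
    # Step 4 (affinity purification): keep routes that contain every city.
    pool = [r for r in pool if set(r) == set(CITIES)]
    return pool

print(solve("Atlanta", "Detroit"))
# [('Atlanta', 'Boston', 'Chicago', 'Detroit')]
```

The software version builds the routes one after another, whereas in the test tube all of them form at the same time.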

So, in the first step, we must generate all possible routes. Recall that the correct route corresponds to flights Atlanta - Boston - Chicago - Detroit. This route corresponds to the DNA molecule GCAG TCGG ACTG GGCT ATGT CCGA.

In order to generate all possible routes, it is enough to put all the necessary, pre-prepared ingredients into a test tube, that is, DNA molecules corresponding to all possible flights and DNA molecules corresponding to all cities. But instead of the single DNA strands corresponding to the names of the cities, we must use the DNA strands complementary to them; that is, instead of the DNA strand ACTT GCAG corresponding to Atlanta, we will use the complementary DNA strand TGAA CGTC, and so on, since the DNA code of a city and its complementary code are equivalent.

Then we put all these molecules (literally a pinch, which will contain about 10^14 different molecules) into water, add ligases, cast a spell and ... literally in a few seconds we get all possible routes.

The process of formation of DNA chains corresponding to different routes occurs as follows. Consider, for example, the GCAG TCGG chain responsible for the flight Atlanta - Boston. Due to the high concentration of different molecules, this strand will definitely meet with the complementary DNA strand AGCC TGAC corresponding to Boston. Since the TCGG and AGCC groups are complementary to each other, these chains will link with each other due to the formation of hydrogen bonds (Fig. 12).

Fig. 12. Linking of the strands corresponding to the Atlanta - Boston flight and to Boston

Now the resulting chain will inevitably meet the ACTG GGCT DNA strand corresponding to the Boston - Chicago flight, and since the ACTG group (the first four bases in this strand) is complementary to the TGAC group (the last four bases in the complementary Boston code), the ACTG GGCT strand will join the already formed chain. Next, the DNA strand corresponding to Chicago (its complementary code) will join this chain in the same way, followed by the strand for the Chicago - Detroit flight. The route formation process is shown in Fig. 13.

Fig. 13. The process of formation of a DNA chain corresponding to the route Atlanta - Boston - Chicago - Detroit
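The role of the complementary city strand as a bridge between two flight strands can be verified directly from the sequences quoted above. In this sketch, `can_bridge` is a hypothetical helper that only compares four-base groups; it does not model chemistry such as temperature or partial matches.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Base-by-base complement, written in the same order as in the article."""
    return "".join(PAIRS[base] for base in strand)

def can_bridge(flight_in: str, city_complement: str, flight_out: str) -> bool:
    """True if the complementary city strand can hold both flight strands side by side."""
    return (complement(flight_in[-4:]) == city_complement[:4]
            and complement(flight_out[:4]) == city_complement[4:])

atlanta_boston = "GCAGTCGG"
boston_chicago = "ACTGGGCT"
chicago_detroit = "ATGTCCGA"
boston_complement = "AGCCTGAC"

print(can_bridge(atlanta_boston, boston_complement, boston_chicago))   # True
print(can_bridge(atlanta_boston, boston_complement, chicago_detroit))  # False
```

The second call fails because the Chicago - Detroit flight does not start in Boston, so nothing holds it next to the Atlanta - Boston strand.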

We considered an example of the formation of only one route (and this is precisely the Hamiltonian route). All other possible routes are obtained in a similar way (for example, Atlanta - Boston - Atlanta - Detroit). It is important that all routes are formed simultaneously, that is, in parallel. Moreover, the time required to create all possible routes in a given problem and all routes in a problem with 10 or 20 cities is exactly the same (if only there were enough initial DNA molecules). Actually, it is in the parallel algorithm of DNA computing that the main advantage lies in comparison with the von Neumann architecture.

So, DNA molecules corresponding to all possible routes are formed in a test tube. However, this is not yet a solution to the problem - we need to isolate the only DNA molecule that is responsible for the Hamiltonian route. Therefore, the next step is to select molecules corresponding to routes starting in Atlanta and ending in Detroit.

For this, a polymerase chain reaction (PCR) is used, as a result of which many copies of only those DNA strands that begin with the Atlanta code and end with the Detroit code are created.

Two primers are used for the polymerase chain reaction: GCAG and GGCT. The process of copying DNA molecules that start with the DNA code of Atlanta and end with the DNA code of Detroit is shown in Fig. 14.

Fig. 14. The process of copying DNA molecules during PCR

Note that in the presence of the primers GCAG and GGCT, copies will also be made of DNA molecules that begin with the DNA code of Atlanta but do not end with the DNA code of Detroit (under the action of the GCAG primer), as well as of DNA molecules that end with the DNA code of Detroit but do not begin with the DNA code of Atlanta (under the action of the GGCT primer). However, such molecules are copied much more slowly than DNA molecules that start with the DNA code of Atlanta and end with the DNA code of Detroit. Therefore, after PCR the predominant DNA molecules, in the form of regular double helices, will be those corresponding to routes that start in Atlanta and end in Detroit.

At the next stage, it is necessary to isolate molecules of the required length, that is, those that contain the DNA codes of exactly four cities. For this, gel electrophoresis is used, which allows the molecules to be sorted by length. As a result, we get molecules of the required length (exactly four cities), starting with the code for Atlanta and ending with the code for Detroit.

Now we need to make sure that in the selected molecules the code of each city is present only once. This operation is carried out using a process known as affinity purification.

For this operation, a microscopic magnetic bead about one micron in diameter is used. Complementary DNA codes of a given city are attached to it and act as probes. For example, to check whether the code of Boston is present in the DNA strands under study, you first place the magnetic bead in a test tube with DNA molecules corresponding to the DNA code of Boston. The result is a magnetic bead covered with the probes we need. This bead is then placed in a test tube with the DNA strands under study, and the strands containing the complementary Boston code attach to it (through hydrogen bonds between complementary groups). Next, the bead with the captured molecules is taken out and placed in a new solution, where the temperature is raised so that the DNA molecules fall off the bead, and the bead is removed. This procedure is repeated in turn for each city, and in the end only those molecules remain that contain the DNA codes of all the cities, and hence correspond to the Hamiltonian path. In fact, the problem is solved - it only remains to read out the answer.

Conclusion

Adleman demonstrated the solution of the Hamiltonian path problem using only seven cities as an example, and it took him seven days. This was the first experiment to demonstrate the capabilities of DNA computing. In effect, Adleman showed that DNA computing can effectively solve enumeration (exhaustive search) problems, and he outlined a technique that later served as the basis for the parallel filtering model.

However, many researchers are not optimistic about the future of biological computers. Here is just one small example. If a similar method were used to find a Hamiltonian path in a graph of 200 vertices, it would require a quantity of DNA molecules comparable in weight to our entire planet! This fundamental limitation is, of course, something of a dead end. Therefore, many research laboratories (for example, IBM) have chosen to focus on other ideas for alternative computers, such as carbon nanotubes and quantum computers.

Since Adleman's experiment, there have been many other studies of the possibilities of DNA computing. One example is the experiment of E. Shapiro, which implemented a finite automaton that can be in two states, S0 and S1, and answers the question of whether the input sequence contains an even or odd number of characters.
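A two-state parity automaton of this kind takes only a few lines in software. The sketch below is a plain Python illustration, not a model of the molecular implementation. It assumes that S0 means "even so far" and S1 means "odd so far", which the text does not specify, and `parity_state` is a made-up name.

```python
def parity_state(symbols: str) -> str:
    """Run a two-state automaton over the input and return its final state."""
    state = "S0"  # ASSUMPTION: S0 stands for an even number of symbols read so far
    for _ in symbols:
        # Every symbol read flips the automaton to the other state.
        state = "S1" if state == "S0" else "S0"
    return state

print(parity_state("abab"))  # S0
print(parity_state("aba"))   # S1
```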

Today, DNA computing is no more than a promising technology at the stage of laboratory research, and it will remain at this stage for years to come. In fact, at the present stage of development the following fundamental question has to be answered: what class of problems can be solved using DNA, and is it possible to build a general model of DNA computing suitable both for implementation and for use?
