# Implementations of computational state transitions with biomolecules

Post-publication activity

Curator: Masami Hagiya

Various attempts have been made to implement state machines with biomolecules. In particular, those using DNA molecules have been successful.

## State Machine and State Transitions

The phrase state machine denotes either a mathematical object with some internal states among which it takes its current state, or a physical implementation of such a mathematical object. For the former, the phrases state transition system and automaton are also used. If the number of states is finite, a state machine is called a finite state machine or a finite automaton. The current state of a state machine is changed to another internal state either spontaneously or by an input to the state machine. There may also be an output from a state machine. A change of the current state is called a state transition. Mathematically, a relation called a state transition relation is defined as a set of tuples consisting of the current state, the next state, an optional input, and an optional output.

Such a state machine is a part of a molecular system that observes its external environment, makes judgment, and produces some effect on the environment, where the state machine plays the role of a computer for making judgment. For example, so-called smart drug is a molecular system that is put into body, detects a cause of a disease by making judgment on signals from its sensors, and finally emits drug. This kind of molecular system is called autonomous, and is currently one of the main targets of the research in the field of molecular computing, including DNA computing.

This article focuses on the problem of physically implementing a state machine by DNA molecules. Given a mathematical definition of a state machine, as mentioned above, is it possible to systematically design its physical implementation by DNA molecules?

## Implementation of State Transitions

In general it is possible to implement a state transition by a molecular reaction of the following form. $M + I + C \rightarrow M' + C'$ The molecule $$M$$ implements a state machine in the current state, and $$M'$$ implements the same machine in the next state. It is assumed that there is only a slight difference between $$M$$ and $$M'\ ,$$ and the main part of $$M$$ is not changed. That is, the part of $$M$$ that encodes the current state is changed to the corresponding part of $$M'$$ that encodes the next state. In the reaction, $$I$$ denotes an input. It may be a single molecule or a group of molecules. It may also be a physical signal such as light or heat. $$I$$ is not present in the case of a spontaneous state transition. $$C$$ and $$C'$$ denote the context before and after the reaction. They are usually omitted in the reaction, because the context consists of the entire solution in which the reaction takes place, and its change is negligible. It may contain enzymes that are necessary for but not consumed by the reaction, or substrates that are abundant in the solution. So, the reaction is usually written as follows. $M + I \rightarrow M'$ If there exists an output in a state transition, it is implemented by a reaction of the following form. $M + I \rightarrow M' + O$ Therefore, the problem of designing an implementation of a state machine reduces to the problem of designing the structure of the molecules $$M$$ and $$M'\ ,$$ and the reactions that implement state transitions.

In the following, implementations of state machines by DNA are classified according to how internal states are encoded and how state transitions take place. As for inputs, the cases where inputs are represented by DNA molecules are first explained. It is then briefly described how inputs other than DNA can be implemented. How to implement outputs are finally discussed.

### Terminal Encoding and Growing Strands

In this classification, a terminal sequence of a DNA strand encodes the current state, and a state transition is implemented by addition of a new terminal sequence to the DNA strand. In this approach, therefore, a DNA strand grows as state transitions are made.

#### DNA

A single-stranded DNA molecule is a polymer consisting of four bases A, T, G and C that are connected by the backbone of phosphate groups and sugars called deoxyribose. Different sequences of bases define different DNA molecule species. Complementarity is defined between bases (and sequences): A and T are complementary, and so are G and C.

A double strand of DNA is formed by two single-stranded DNA molecules that are complementary to each other. For example, A-T-C-T-A-G-C-C-T-T and A-A-G-G-C-T-A-G-A-T are complementary to each other and form a double strand by hybridization as follows.

 A-T-C-T-A-G-C-C-T-T
| | | | | | | | | |
T-A-G-A-T-C-G-G-A-A


Note that the two ends of a single strand of DNA are called the $$5'$$-end and the $$3'$$-end. A single strand is usually written from the $$5'$$-end to the $$3'$$-end. For example, in A-T-C-T-A-G-C-C-T-T, the leftmost A is at the $$5'$$-end. When a double strand is formed, strands with opposition directions hybridize together. In the above double strand, therefore, T-A-G-A-T-C-G-G-A-A is written from the $$3'$$-end to the $$5'$$-end.

#### Ligation

Let $$M$$ denote the following double strand of DNA.

 A-T-C-T-A-G-C-C-T-T
| | | | | |
T-A-G-A-T-C


This is not a complete double strand, but has a single-stranded part, C-C-T-T, called a sticky end. Let $$I$$ be the following DNA complex (or hybrid).

         A-G-A-T-G-A-A-C
| | | |
G-G-A-A-T-C-T-A


This also has sticky ends. Mixing $$M$$ and $$I$$ together yields the following complex by hybridization.

 A-T-C-T-A-G-C-C-T-T A-G-A-T-G-A-A-C
| | | | | | | | | | | | | |
T-A-G-A-T-C G-G-A-A-T-C-T-A


By the reaction called ligation, the covalent bond between T and A and that between C and G are made with the help of the enzyme called ligase as follows.

 A-T-C-T-A-G-C-C-T-T-A-G-A-T-G-A-A-C
| | | | | | | | | | | | | |
T-A-G-A-T-C-G-G-A-A-T-C-T-A


This is the double strand $$M'$$ after the state transition. In this example, the sticky end C-C-T-T of $$M$$ is thought to encode the current state of the machine and the new sticky end G-A-A-C of $$M'$$ the next state, because hybridization with an input depends only on these sticky ends. In the above transition, the current state is changed from C-C-T-T to G-A-A-C.

In this approach, computation proceeds as the state machine $$M$$ hybridizes with an input $$I\ .$$ The resulting DNA complex further hybridizes with another input. Therefore, if all inputs are present from the beginning, a chain of hybridization occurs. Such a process is called self-assembly. In DNA computing, self-assembly of structures made of DNA has been one of the most active research topics.

#### Polymerization

The reaction called polymerization (or extension) can also be used to extend a DNA molecule. In this approach, $$M$$ is the single-stranded DNA molecule A-T-C-T-A-G-C-C-T-T, and $$I$$ is the single-stranded DNA molecule G-T-T-C-A-A-G-G. After hybridization,

 A-T-C-T-A-G-C-C-T-T
| | | |
G-G-A-A-C-T-T-G


is obtained, and polymerization yields the following.

 A-T-C-T-A-G-C-C-T-T-G-A-A-C
| | | | | | | | | | | | | |
T-A-G-A-T-C-G-G-A-A-C-T-T-G


This reaction is executed by the enzyme called polymerase, which adds bases to the $$3'$$-end of a single-stranded DNA molecule to form a double strand. In contrast to ligation, polymerization can only add new bases at the $$3'$$-end. In order to promote the next transition, the above double strand should be broken into single strands, e.g., by raising the temperature.

#### Whiplash Machine

The Whiplash Machine proposed by Hagiya et al. 1999 is an example of a state machine taking this approach of polymerization (Fig.#F1). The Whiplash Machine is a single-stranded DNA molecule that forms a hairpin. The $$3'$$-end of the single strand is extended by polymerase until the new state is added to the $$3'$$-end when polymerization stops due to the machinery called polymerization stop. In this way, the Whiplash Machine makes a spontaneous state transition, i.e., a state transition without any input. The rules for spontaneous transitions are encoded inside the machine itself, and can be thought of as a program of the machine. By using this machine, program-parallel computation was experimentally implemented by Komiya et al. 2006.

### Shrinking Strands

In this classification, a DNA strand implementing a state machine gets shorter and shorter as more and more state transitions are made.

#### Restriction Enzyme

A restriction enzyme can be used to cut existing covalent bonds in a double-stranded DNA. For example, let $$M$$ be the following DNA complex.

 A-T-C-T-A-G-C-C-T-T
| | | | | | | | | |
T-A-G-A-T-C-G-G-A-A-C-T-T-G


The sequence G-T-T-C is single-stranded. Let $$I$$ be the following complex.

 G-A-A-C------------
| | | | | |
-----------


The unspecified part is assumed to contain the recognition site of a restriction enzyme. After hybridization and ligation, the following double strand is obtained.

 A-T-C-T-A-G+C-C-T-T-G-A-A-C------------
| | | | | | | | | | | | | | | | | | | |
T-A-G-A-T-C-G-G-A-A+C-T-T-G------------


It is assumed that the restriction enzyme recognizes the site mentioned above and cuts the covalent bonds specified by the plus symbols. This kind of restriction enzyme is called Type IIS and cuts DNA strands at positions away from the recognition site. After the double strand is broken, the following complex $$M'$$ is obtained.

 A-T-C-T-A-G
| | | | | |
T-A-G-A-T-C-G-G-A-A


The positions at which covalent bonds are cut can be controlled by the input $$I\ .$$ For example, the following complex $$M''$$ might have been obtained.

 A-T-C-T-A
| | | | |
T-A-G-A-T-C-G-G-A


#### DNA Automaton

This approach is taken by the DNA automaton proposed by Benenson et al. 2001, although in the original proposal, the double strand $$M$$ is regarded as an input string while the double strand $$I$$ represents a state transition and is called a transition molecule. In their second proposal in Benenson et al. 2004, the double strand $$M$$ is regarded as a state machine executing a simple flowchart program. More concretely, each transition molecule becomes active when a specific mRNA exists or does not exist, and $$M$$ encodes a flowchart computing a conjunction of such conditions. They proposed that this machine could be used for smart drug.

### Conformational Encoding

Another approach to encoding internal states by DNA molecules would be called conformational encoding. It is possible to construct a molecular complex consisting of many single-stranded DNA molecules hybridizing with one another. It is usually the case that if a new single strand is added to the solution, it hybridizes with an existing complex and the complex gets bigger. However, due to the reaction called strand exchange (traditionally called branch migration or strand displacement), a new single strand can also be used to make the complex smaller by removing a single strand from the complex.

#### DNA Tweezers

The DNA tweezers developed by Yurke et al. 2000 are the first machine that takes this approach, and has two internal states (Fig.#F2). After the single strand $$F$$ is added to the solution, it hybridizes with the tweezers and closes them. After the single strand $$\overline{F}\ ,$$ which is complementary to $$F\ ,$$ is added to the solution, it hybridizes with $$F$$ and removes $$F$$ from the tweezers, which are opened again. In this way, $$F$$ and $$\overline{F}$$ form a double strand and become inactive in the solution, because the double strand does not have any single-stranded part and cannot react with other strands.

After the DNA tweezers were publicized, many DNA machines that make use of strand exchange have been proposed. For example, Yurke et al. soon developed a three-state machine while the DNA tweezers are a two-state machine.

## Inputs

So far, inputs to a state machine implemented by DNA molecules are also DNA molecules. In some implementations inputs are single-stranded DNA molecules, while in other implementations inputs are DNA complexes consisting of several single strands. If DNA molecules or DNA complexes are used as inputs in a reaction, they should not interfere with other reactions, because not all molecules or complexes are consumed by the reaction. Outputs of the reaction should also be independent from other reactions. If inputs or outputs of a reaction interfere with other reactions, they should be removed from the solution after the completion of the reaction.

If physical signals such as light and heat are used as inputs to a state machine, they need not be removed after a state transition. In the case of light, for example, simply turning off irradiation shuts off the signal. In order to allow such physical signals, modified DNA molecules have been used to implement a state machine. In particular, azobenzene-tethered DNA molecules, developed by Asanuma, et al., have been used to implement a state machine that receives light as an input.

Azobenzene molecules have two conformations: trans and cis. The trans conformation promotes hybridization of DNA while the cis conformation prohibits it. Irradiation of ultraviolet light changes the conformation of azobenzene molecules from trans to cis while visible light changes the conformation from cis to trans. Therefore, by introducing azobenzenes into DNA molecules, it is possible to control their hybridization by ultraviolet light or visible light. For example, Liang et al. 2008 showed that DNA tweezers can be opened and closed by ultraviolet light or visible light. This is a direct application of azobenzene-tethered DNA molecules. In any of the above approaches to implementing state machines by DNA molecules, by introducing azobenzenes into some strand(s) in the input $$I\ ,$$ it is possible to control the reaction between the state machine M and I because it is based on hybridization between them.

Any kinds of signals that influence hybridization between $$M$$ and $$I$$ can be used as inputs to state machines. In this sense, such signals should be called (real) inputs, and $$I$$ should be called a transition molecule. Input signals make the transition molecule active. For example, Benenson et al. 2004 used RNA molecules to control the activity of transition molecules. In their application, mRNAs from specific genes are used as input signals to their state machine. Other possibilities of inputs include heat and pH change.

## Outputs

In the terminal encoding, since the DNA molecule implementing the state machine is only extended, another operation is needed to produce outputs. For example, in the case of Whiplash Machine, outputs can be produced by cutting the machine at a specific position by a restriction enzyme. On the other hand, in the approach of shrinking strands, the part removed from the double strand of DNA can be naturally regarded as the output of the state transition.

In the conformational encoding, strands may be removed from the DNA complex of the state machine in a state transition. Those strands can be regarded as outputs. While complete double strands are inactive, single strands can also be produced and they can be fed as inputs to another state machine.

In general, DNA molecules produced as outputs of a state machine can hybridize with other machines and initiate their state transitions. In addition to becoming inputs to state machines, DNA molecules may trigger various phenomena. For example, DNA molecules having special sequences have chemical activities. They can bind to molecules other than DNA, or show enzymatic activities such as those of restriction enzymes. In other words, they produce physical signals as outputs.