Biologically relevant molecular finite automata

Post-publication activity

Dr. Sivan Shoshani, Technion - Israel Institute of Technology, Haifa, Israel
Tamar Ratner, Technion -- Israel Institute of Technology, Haifa, Israel
Dr. Ron Piran, Sanford-Burnham Medical Research Institute, La Jolla, California, USA
Prof. Ehud Keinan, Technion-Israel Institute of Technology, Technion City, Haifa 32000, Israel

1 Introduction
2 Finite State Automata
3 The First Realization of an Autonomous, DNA-Based Finite Automaton
4 3-Symbol-3-State DNA-Based Automata
5 Molecular Computing Device for Medical Diagnosis and Treatment In Vitro
6 DNA-Based Automaton with Bacterial Phenotype Output
7 Molecular Computing with Plant Cell Phenotype
8 Applications of Molecular Finite Automata in Developmental Biology
9 Outlook
10 References

Introduction

The growing interest in the design of new computing systems is attributed to the notion that although the silicon based computing provides outstanding speed, it cannot meet some of the challenges posed by the developing world of biotechnology and synthetic biology. New abilities, such as direct interface between computation processes and biological environment, are necessary. In addition, the challenges of parallelism (Livstone et al. 2003) and miniaturization are still driving forces for developing innovative computing technologies. Moreover, the growth in speed for silicon-based computers, as described by Moor’s law, may be nearing its limit (Ruben and Landweber, 2000). The design of new computing architectures involves two main challenges: reduction of computation time and solving intractable problems. Most of the celebrated computationally intractable problems can be solved with electronic computers by an exhaustive search through all possible solutions. However, an insurmountable difficulty lies in the fact that such a search is too vast to be carried out using the currently available technology.

Although the general idea of using molecular-scale components for building computing devices dates back to 1959 (Feynman, 1961), it was not until 1994 that an active use of DNA molecules was presented in the form of computation (Adleman, 1994). Bio-Molecular Computing (BMC) has evolved since then as an independent field at the interface of computer science, mathematics, chemistry, and biology ((Seeman, 2003),(Reif, 2002), (Chen and Wood, 2000)). Living organisms carry out complex physical processes dictated by molecular information. For example, biochemical reactions — and ultimately the entire organism’s operations — are ruled by instructions stored in its genome, encoded in sequences of nucleic acids. It is tempting to draw an analogy between the intracellular processing of DNA and RNA and the processing of information stored in the tape of the Turing machine. Both systems process information stored in a string of symbols built upon a fixed alphabet, and both operate by moving step-by-step along those strings, modifying or adding symbols according to a given set of rules. These parallels have inspired the idea that biological molecules could become the raw material of new computer species.

DNA molecules enjoy many advantages over other biological molecules as building blocks for the construction of bio-computers. These highly stable molecules can be translated to proteins and thereby create biological structure and function without being consumed. Furthermore, DNA strands can store a very high density of information; one gram of DNA in a volume of 1 cm³ (i.e. cubic centimeters) can store as much information as a trillion compact discs, approximately 750 terabytes ((Feynman, 2003), (Ulam, 1972), (Holland, 1992)). The combination of the remarkable information storage capacity with the base-pairing rules, which can serve as a programming language, offers attractive opportunities in BMC. In addition, the DNA molecules can be easily copied and amplified to countless copies and can be easily manipulated with high fidelity using readily available enzymes. These advantages can be conveniently used not only for the design of biomolecular computing devices but also for dictating the computation rules.

Biological computers would not necessarily offer greater performance and speed in traditional computing tasks. Since natural molecular machines depend on the catalytic rates of enzymes, they are obviously slower than electronic computers. Moreover, the dependence on operations such as gel electrophoresis and polymerase chain reactions (PCR) renders the biological computers slower than conventional computers by orders of magnitude (Tagore et al. 2010). However, DNA-based computing devices have numerous advantages, including the direct interaction with biological systems, the miniaturization of the computing devices to a molecular scale, and the potential for massive parallelism that allows for large number of operations per second.

Over the years, numerous architectures for autonomous molecular computing devices have been developed on the basis of opportunities offered by molecular biology techniques ((Lipton, 1995), (Liu et al. 2000), (Sakamoto et al. 2000), (Faulhammer, 2000), (Braich et al. 2002), (Roweis et al. 1998), (Winfree et al. 1998), (LaBean et al. 1999), (Winfree, 1999)). Several of these have been explored experimentally and proven feasible ((Lipton, 1995), (Benenson, 2001), (Mao et al. 2000), (Rose et al. 2002), (Komiya et al. 2001). This account focuses on the realization of programmable DNA-based finite-state automata that can compute autonomously upon mixing all their components in solution.

Finite State Automata

A finite-state automaton is a notional computing machine that operates on finite sequences of symbols. It is a simplified version of a Turing machine, which can read its input but cannot write new information. Also, unlike the Turing machine, it can move to only one direction. The automaton can comprise a finite number of internal states in which one is defined as the initial state, and one or more are defined as accepting states. Its software consists of transition function rules, each specifying the next state according to the current state and the current symbol. It begins on the leftmost symbol on a tape and in each transition the machine moves one symbol to the right, changing its internal state, or leaving it unchanged, according to one of the applicable transition rules. The machine can also halt without completing the computation if no transition rule applies. The computation terminates on processing the last input symbol. An automaton is said to accept an input if a computation on this input terminates in an accepting final state.

Mathematically, a finite-state automaton M is a quintuple M = (∑, δ, Q, q₀, F), where: ∑ is a finite input symbols; δ is transition function δ:Q×M→Q; Q is a finite set of states; q₀ is the initial state (taken from the set Q) and F is a set of accepting states (subset of Q). A common way to present finite automaton is a transition diagram composed of vertices and arrows connecting them. The vertices represent the states while each arrow describes a transition between vertices as a result of reading a specific symbol. The initial state is distinguished by an inward pointing arrow and accepting states usually by double circles.

The First Realization of an Autonomous, DNA-Based Finite Automaton

Several theoretical models preceded the first realization of a lab-tested molecular finite automaton. One of these was a proposed Turing machine that was based on DNA and restriction enzymes (Rothemund, 1996). The basic idea was to use circular, double-stranded DNA (dsDNA) to represent the tape of symbols, while all enzyme-catalyzed manipulations, including restriction, insertion, and ligation, represented the computational steps, the position of the head, and its internal state. Small dsDNA fragments where designed to represent the transition rules. This structural architecture inspired much of the experimental work on DNA-based finite-state automata.

Programmable finite automata that solve computational problems autonomously were prepared using dsDNA and DNA-manipulating enzymes. The automaton hardware consisted of a restriction enzyme and ligase, the software and input were encoded by dsDNA, and the program was defined by the appropriate choice of software molecules. Upon mixing solutions containing these components, the automaton processed the input molecule via a cascade of restriction, hybridization, and ligation cycles, producing a detectable output molecule that encoded for the automaton final state, and thus the computational result.

For example, a soluble mixture of molecules was designed to represent a 2-symbol-2-state finite automaton ( Figure 1)(Benenson et al. 2001). The hardware comprised of a type-II endonuclease (FokI), T4 DNA ligase, and ATP. The software included a set of transition rules, represented by an appropriate set of transition molecules, all in the form of short dsDNA oligomers. A dsDNA molecule encoded for the input, with each input symbol being coded by a six base-pair (bp) dsDNA sequence (Figure 2). The input molecule included the initial state of the automaton as well. The computing mixture contained the required “peripherals”, namely, two output-detection molecules of different lengths. Each of these could hybridize and ligate selectively to a different output molecule, thus forming an output-reporting molecule. The latter indicated a final state and could be readily detected by gel electrophoresis.

Figure 1: A 2-symbol-2-state automaton. A) All 8 possible transition rules of a finite automaton that has 2 internal states (S₀ and S₁) and 2 input symbols (a and b). B) A selected subset of 4 transition rules that represent a specific finite automaton. C) A graphic description of this automaton that includes 2 states: S₀ and S₁ (circles), 2 symbols: a and b, an initial state, S₀ (indicated by a straight arrow), and 4 transition rules (indicated by curved arrows). This automaton answers the question whether the number of b’s in a given input is even. A positive answer to this question will result in the accepting state S₀ (indicated by a double circle). On the other hand, an odd number of b’s in the input will result in a final state S₁

Figure 2: Molecular design of a 2-symbol-2-state finite automaton. The 10 components of the automaton included an input molecule, 2 enzymes, ATP, 4 transition molecules and 2 detection molecules. The transition molecules shown here construct the automaton presented in Figure 6).

The two different internal states, either S₀ or S₁, were represented by two distinguishable restriction modes of any 6 bp symbol, either at the beginning of the symbol domain or 2 bp deeper into that domain, respectively (Figure 3). The different cleavage site was achieved by using a 4-cutter endonuclease, FokI, which cut 9 and 13 bases away from its recognition site. In this system, there were 6 unique 4-nucleotide 5-prime sticky-end sequences that could be obtained by two restriction modes of 2 symbols and a terminator.

Figure 3: Two different internal states. A) Two restriction modes of a 6 bp domain by a 4-cutter enzyme. B) The symbols a and its two restriction products produced by a 4-cutter endonuclease.

The automaton processed the input, first, by cleaving it with FokI, thereby exposing a 4-nucleotide sticky-end. This result reflected the initial state and the first input symbol. The computation continued via a cascade of transition cycles (Figure 4). In each cycle the sticky-end of an appropriate transition molecule ligated to the sticky-end of the input molecule. This operation indicated the response of the system to the current state and symbol. At each restriction step the most upstream symbol in the input molecule was cleaved by FokI, exposing a new four nucleotide sticky-end. The design of the transition molecules (Figure 2) ensured that the 6 bp-long encodings of the input symbols a and b were cleaved by FokI at the appropriate mode, either at the leftmost part, encoding the state S₁ or at the rightmost part, encoding S₀ (Figure 3). The exact next restriction site, and thus the next internal state, was determined by the current state and the size of the spacers in an applicable transition molecule. The computation proceeded until no transition molecule matched the exposed sticky-end of the input or until the special terminator symbol was cleaved, forming an output molecule that had a sticky-end encoding the final state. This sticky-end ligated to one of two output detecting molecules (Figure 2) and the resultant output reporter was identified after purification by gel electrophoresis. The operation of several automata was tested on various inputs followed by detection of the outputs by gel electrophoresis (Figure 5).

Figure 4: Description of a computing process with input bab. The input is cleaved by FokI, forming the initial state. The complementary transition molecule (<S_0-1>) is hybridized and in the presence of T4 DNA ligase it is ligated to the restricted input. Similarly, the input is processed via repetitive cycles of restriction, hybridization, and ligation with complementary transition molecules. In the final step the output ligates to a S₀ detection molecule (<S₀-D>).

Figure 5: Output detection by gel electrophoresis. The processed mixtures of 3 inputs, aa, aba, and aabb, after ligation with the detection molecules, S₀-D and S₁-D, were analyzed by gel electrophoresis. The ands marked by I&IC represent the inputs and various other dsDNA molecules in the mixture. The relevant bands are the ligation products, marked S₀-R or S₁-R, indicating the final state, either S₀ or S₁, respectively. The left lane shows a molecular weight ladder and the second lane from left shows an intact input, aba, for reference. As expected, the outputs were S₀ for input aa, S₁ for input aba, and S₀ for input aabb.

Based on this design, a ligase-free system was also demonstrated. In this manner, the transition molecules were not consumed but the input consumption drove the computation process to completion without the need to invest external energy in the form of ATP (Benenson et al. 2003). Furthermore, these principles allowed for the design of biomolecular automata that can perform stochastic computing, reminiscent of natural biological processes (Adar et al. 2004).

3-Symbol-3-State DNA-Based Automata

Significant expansion of the complexity of the above-described automata was achieved by the demonstration that a type-II, 4-cutter endonuclease, could restrict a 6 bp symbol in three distinguishable modes (Soreni et al. 2005). Three internal states, S₀, S₁ or S₂, could thus be represented by restriction at the beginning of the symbol domain, 1 bp deep, or 2 bp deep into that domain, respectively. The increased number of states and symbols resulted in significant enhancement of the computing power. The applicability of the 3-symbol-3-state automaton was further enhanced by employing surface anchored input molecules, using surface plasmon resonance (SPR) technology to monitor the computation steps in real time. The realization of this computational design was achieved in a stepwise manner, with automatic, real-time detection of each step, all carried out on a Biacore© chip.

Computation was performed by alternating the feed solutions between the endonuclease solution and a mixture containing the ligase, ATP, and the appropriate transition molecules. The binding and computing events were monitored while they were taking place at the sensor surface. The flow cell was first fed with a solution of BbvI endonuclease, and then fed with a mixture of transition molecules and ligase, and so forth. The computation was terminated after executing a sufficient number of such cycles. Detection of the final state was carried out by sequential feed of three mixtures, each containing one of the detection molecules, d-S₀, d-S₁ or d-S₂, together with T4-DNA ligase and ATP. Since the Biacore chip contained four independent sectors, it was possible to perform parallel computing with 4 different input molecules. This advantage was demonstrated by stepwise monitoring of the computation using one automaton and four different inputs: bc, a, ac, acb (Figure 6 and Figure 7). This work presented significant enhancement of the computational power in comparison with the initially reported 2-symbol-2-state automata. The major improvements included a) increase in the number of internal states, b) increase in the number of symbols, c) real-time detection of the output signal as well as real-time monitoring of stepwise computing by the use of the SPR technology, and d) parallel computing on different inputs, which resulted from the ability to immobilize different input molecules on different sectors on the chip.

Figure 6: The finite automaton shown both graphically (left) and in the form of a table of nine transition rules (right).

Figure 7: Monitoring the computation process by SPR. Stepwise computing with one automaton shown in Figure 6 and four input molecules: bc, a, ac, and acb. The transition molecules were supplied only to satisfy the computation needs of the latter three inputs, while no transition molecules were available for computation with the bc input. The differential RU values represent the changes in the SPR response between two consecutive steps, R represents restriction, L represents ligation. The computation was followed by detection with the soluble detection molecules, d-S₀ and d-S₁.

Molecular Computing Device for Medical Diagnosis and Treatment In Vitro

An in vitro molecular computing device was constructed on the basis of a finite automaton model, capable of detecting and analyzing the levels of disease RNA indicators (Benenson et al. 2004). Upon proper detection of these indicators the device could induce the release of an active drug. The automaton was programmed to identify and analyze mRNA of disease related genes associated with small-cell lung cancer and prostate cancer. It was also designed to produce single-strand DNA (ssDNA) molecules modeled after an anticancer drug, which can affect levels of gene expression via anti-sense activity.

The operation of this computing device was divided into three parts: a computing module that performed stochastic calculations; an input module in which specific mRNA levels or point mutations regulated the concentrations of active transition molecules; and an output module that controlled the release of a suitable drug. The system included input molecules encoding for diagnostic rules, hardware that was comprised of FokI, and software consisting of transition molecules that were regulated by molecular indicators (Figure 8). The dsDNA input molecule had two main regions, a diagnostic part and a drug administration part. The first part was comprised of several symbols, each sensitive to a certain indicator. The automaton processed the diagnostic part one symbol at a time, determining, in each step, whether or not the corresponding indicator was present. After all symbols were processed, the computing proceeded to the drug administration region.

Two input molecules were used, each containing a different drug-administration region, capable of releasing either a drug or a drug-suppressor. Both consisted of a dsDNA stem which protected the ssDNA loop containing the drug or the drug suppressor. The stem prevented undesired interactions between the drug, drug suppressor, and the target mRNA. After positive diagnosis, the stem of the drug-release region was processed with special Yes-verification transition molecules and released the drug. At negative diagnosis this stem remained intact, protecting the drug. After negative diagnosis, the stem of the drug-suppressor release region was processed with special No-verification transition molecules and then released the drug suppressor. At positive diagnosis this stem remained intact, protecting the drug suppressor. The ratio between the drug and drug-suppressor molecules released determined the concentration of the active drug.

This work presented a molecular computing device capable of logical analysis of mRNA disease indicators in vitro and of controlled administration of biologically active ssDNA molecules. Despite of the fact that this work was carried out in vitro, it represented a convincing proof of concept concerning the medical applications of BMC.

Figure 8: Molecular computing for diagnosis and treatment of prostate cancer.(Benenson et al. 2004, Shoshani et al. 2011) A) Part of the computing process of the input molecule ending in drug release. The initial input molecule consists of a diagnosis region (highlighted with blue background) and a drug-administration region (orange background), which includes an inactive drug loop (red). At each computation step, the most probable transition is shown, except for the processing of the symbol PIM1, for which the stochastic choice with two competing transition molecules is shown. B) Regulation of the two transitions for PIM1↑ by sub-sequences (tags) of over-expressed PIM1 mRNA, resulting in a relatively high level of the yes-to-yes transition molecule and low level of the yes-to-no transition molecule. Each transition molecule contains regulation (dark green, orange) and computation (light green, brown) fragments. The inactivation tag of PIM1 mRNA (orange) displaces the strand of the transition molecule yes-to-no and destroys its functionality. The activation tag of PIM1 mRNA (green) activates the yes-to-yes molecule. Initially, a protecting oligonucleotide (green) partially hybridizes to the larger strand of the transition molecule and thereby blocks its function. The activation tag displaces the protecting oligonucleotide, allowing such annealing of the two stands and rendering an active yes-to-yes transition. c) Combining of the computation results of both types of input molecules, both with high Yes and low No final states, results in high release of drug and low release of drug suppressor, and consequently in the administration of the drug. Arrows pointing up and down represent high and low concentration, respectively.

DNA-Based Automaton with Bacterial Phenotype Output

The next step towards actual involvement of molecular computing devices with biological systems was the demonstration that the computation output produced by a molecular finite automaton could be a visible bacterial phenotype (Kossy et al. 2007). As was the case with the above-mentioned finite automata (Figure 1 and Figure 6), the new system also processed linear dsDNA inputs, transforming them into linear output molecules. The difference, however, was that the resultant molecules were transformed into plasmids, thus becoming meaningful genes that could be expressed in E. coli, leading to a visible output. Construction of extended input molecules was carried out by cloning of an insert into a multiple cloning site (MCS) of the vector pUC18. This insert comprised all the previously described components, including several 6 bp symbols, a 6 bp terminator and a recognition site of the restriction enzyme FaqI. The described input plasmids were planned to be cleaved with suitable restriction enzyme to produce specific linear plasmids which will then be used for computation. The terminator sequence was CGCGCG so that the finally obtained sticky-end (both final states possible) would close the linear plasmid (Figure 9), independent of the final state (5’-CGCG). The chosen automaton accepted inputs with an even number of b symbols, as described earlier in Figure 6, which meant that if the dsDNA input contained an even number of b symbols, the computation would lead to a 9 bp insert in the plasmid and the formation of blue colonies on X-gal medium. In contrast, if the input string contained an odd number of b symbols, the computation would result in an 11 bp insert, open reading frame (ORF) shift, and hence formation of white colonies on X-gal medium (Figure 10).

Figure 9: Computations that resulted in bacterial phenotype outputs. Computation with an input that contained an even number of b symbols, leading to blue bacterial colonies, meaning S₀ output (left), and with an input that contains an odd number of b symbols, leading to white bacterial colonies, meaning S₁ output (right).

Figure 10: A) Computation with aaba input in the presence of the transition molecules, which resulted in white bacterial colonies (S₁). B) Control experiment with input aaba without transition molecules. C) Computation with abba input in the presence of transition molecules, which resulted in blue bacterial colonies (S₀). D) Control experiment with abba input without transition molecules.

Molecular Computing with Plant Cell Phenotype

One step further towards in vivo computing in eukaryotic cells was recently demonstrated by the expression of fluorescent proteins in living plant cells (Shoshani et al. 2011). Each of the two possible molecular results of a 2-symbol 2-state finite automaton led to the creation of a circular plasmid that contained the gene of either green fluorescent protein (GFP) or cyan fluorescent protein (CFP). Insertion of these plasmids into living onion cells resulted in either green fluorescent or cyan fluorescent cells as phenotypical output signals.

The mathematical model, as well as its implementation using DNA molecules, was based on the above-mentioned finite automata ((Benenson et al. 2001), (Soreni et al. 2005), (Kossoy et al. 2007)). Each input molecule was represented by a dsDNA that included the following segments: a recognition site for FokI, a string of 6 bp symbols, a 6 bp terminator, and a tail that included a recognition site for the endonuclease PstI. The two detection molecules, DM₀ and DM₁, were also dsDNA, each containing a 4-base sticky end complementary to the appropriately restricted terminator, a gene encoding for a fluorescent protein, either eGFP or eCFP, and a recognition site for the endonuclease NcoI.

Mixing together all of the computation components, including input, hardware, and software, in a single solution, resulted in an autonomous cascade of chemical reactions, leading to a specific dsDNA. This dsDNA was digested with PstI and NcoI in order to generate unique sticky ends for specific ligation with an appropriate linear vector. This vector was mixed with the DNA content from the computation process in the presence of T4 DNA ligase, giving rise to a circular plasmid (Figure 11). In order to obtain the amounts of plasmid required for insertion into onion cells by particle bombardment, the plasmids were first amplified in E. coli on ampicillin-containing medium. Since only the computation output could form a plasmid by ligation with the linear vector, this procedure provided an opportunity to amplify only the plasmids containing output information, eliminating all other non-relevant DNA molecules in the mixture.

Computation with input-1 with automaton-1 (I1-A1), followed by transformation to E. coli, yielded bacterial colonies with correct sequence of S₀. Similarly, computation with Input-1 with automaton 2 (I1-A2) yielded colonies exhibiting the correct plasmid sequence of S₁. The plasmids extracted from the E. coli colonies were coated on tungsten microparticles and used for biolistic delivery into epidermis cells of onion bulbs. Bombardment with the plasmid containing processed I1–A1 resulted in green fluorescent onion cells (Figure 12A). Conversely bombardment with the plasmid containing the processed I1–A2 resulted in cyan fluorescent cells (Figure 12B).

This study demonstrated the ability of autonomous biomolecular computing devices to interact directly with living organisms without any interface. It also demonstrated that the expression of fluorescent proteins in living plant cells could be utilized as a highly accurate visual output of DNA-based computing devices. The output processing procedure involved the creation of a circular dsDNA, which was an important step of the computation procedure, serving as a quality control gate that transformed a rather noisy output into a clean signal. Out of the mixture of many DNA components produced during the computing process, only those defined as true output molecules could evolve into expressible plasmids.

Figure 11: Computation process that ends up with green fluorescent onion cells (S₀). The processed input molecule undergoes ligation to the proper detection molecule (in this case, S₀), following restriction with PstI and NcoI to produce two unique sticky ends. The product is then ligated to a linear vector that leads to the expression of a relevant gene in plant cells.

Figure 12: The final output in the form of fluorescent onion cells. Bombardment with the plasmid containing processed I1-A1 resulted in green fluorescent onion cells 28 h after delivery, representing S₀ output (A), whereas the same experiment with the plasmid containing the processed I1-A2 resulted in cyan fluorescent cells, representing S₁ output (B).

Applications of Molecular Finite Automata in Developmental Biology

Since every biological system can be described as a computing device, it is conceivable that mathematical models could represent complex biological systems and processes. This is particularly true for decision-making processes in living organisms, which obey distinct rules. The intricate process of myogenesis during embryonic development represents an interesting case of a decision-making course of events (Piran et al. 2009). Describing this process in terms of SAT (satisfiability) formalism, particularly in the disjunctive normal form (DNF), is advantageous because it provides an overall view of this phenomenon (Gopalakrishnan, 2006). This formalism has already been employed in non-biological disciplines for solving decision-making problems because it can describe the relationship between causes and effects. The DNF methodology and ternary Łukasiewicz logic were applied to describe the combined signaling effect of four diffusible proteins that leads to myogenesis, which is one of the most fundamental problems in developmental biology.

Creation of an automaton that describes the myogenesis SAT problem has led to a comprehensive overview of this non-trivial phenomenon, and also to a hypothesis that was subsequently verified experimentally. The myogenesis example demonstrated the power of applying Łukasiewicz logic in describing and predicting any decision-making problem in general, and developmental processes in particular.

Myogenesis was represented in SAT formalism with the given concentration set of the morphogens (the signaling molecules) in the environment of the forming myogenic tissue. The concentrations of the four morphogens, Wnts, Shh, Noggin, and BMP4 (denoted by the variables w, s, g, and b, respectively, ), represented the input, whereas myogenesis was defined as the output. Based on the classical Łukasiewicz logic the three input values referred to concentration that was either lower than the normal physiological value (state 0), the normal physiological concentration (state 1) or higher the normal physiological value (state 2). This formalism described appropriately the experimental biological systems, in which genes and proteins are either downregulated (0), unchanged (1) or overexpressed (2).

Figure 13: Myogenesis as a SAT problem presented in DNF using TNS. A. A summary of all possible scenarios that lead to myogenesis in SAT formalism. w, Wnts level; s, Shh level; g, Noggin level; b, BMP4 level; 0, less than the normal physiological concentration; 1, the normal physiological concentration; 2, above the normal physiological concentration. B. Myogenesis SAT Problem. Note that in this formulation, values are not assigned to the variables. Any combination of values could be processed by this function using the truth tables in C. C. The truth tables of myogenic developmental pathway.. D. A 3-symbol-4-state finite automaton that describes the SAT problem F. State S<syb>0: multipotent paraxial cell; States S₁/S₁': unspecified paraxial cell; State S₂: determined paraxial cell (myogenic cell). Uncolored arrow represents the initial state, arrows represent transition rules, and their colors represent the symbols. The green circle represents the accepting state. The order of signals has no importance. Yet, a cell could be exposed to only Wnt signaling, thus accepting state S₁'. Alternatively, a cell could be exposed to only Shh or Noggin minus BMP4, thus accepting state S₁. Note that although we have used 4 soluble proteins, only 3 symbols are used due to the close dependence of Noggin and BMP4 signals. Therefore, noggin minus BMP4 represents the stoichiometric ratio between the two components. If this ratio is negative, then the direction of progression is opposite to the arrow direction. E. A summary of all possible solutions leading to myogenesis in SAT formalism generated from the function F in 2B. The colored clauses indicate an input that collapses into another. The green clauses represent scenarios that have no biological meaning.

In the more widely known Boolean logic, the digit 1 represents the true value and 0 represents the false value. In the less used Łukasiewicz logic, applied to the myogenesis model, the digit 1 defined the normal physiological concentration of a morphogen, sufficient for signaling and therefore representing the true value. The digit 0 represented insufficient concentration for signaling, obviously the false value. The digit 2 represented higher concentration than the normal physiological value, obviously defined as true. A system of 4 variables (proteins w, s, g, and b) and 3 values (0, 1 and 2), yielded a matrix of 3⁴ = 81 scenarios, which may or may not result in myogenesis. Twenty out of these 81 scenarios were found to result in myogenesis (true value). When expressed in the form of DNF clauses, they satisfied the myogenesis function F (Figure 13A). They were abridged to a general formulation described in Figure 13B. Thus, two clauses were sufficient to describe myogenesis, defining the interplay between all four morphogens in this developmental process.

The universal truth tables described in Figure 13C served as a toolbox for defining the function F and assigning the different values to this function (Figure 13B). The appropriate combinations of values that satisfied F were expected to result in myogenesis. A 3-symbol-4-state finite automaton was created to describe this SAT problem (Figure 13D). In this device the morphogens represented the symbols, and the cell types along developmental lineages represented the internal states of the automaton.

This automaton was based on three symbols: 1) Wnts, referring to the local concentration of this morphogen above a certain threshold, 2) Shh, referring to its local concentration in a similar way, and 3) Noggin-minus-BMP4, referring to the stoichiometric ratio between these two morphogens. The automaton was described graphically on the basis of the true clauses, each illustrated by symbols, represented by arrows (Figure 13D). The states, represented by circles, described intermediates along the way from a multipotent paraxial cell (S₀, Figure 13D) through an unspecified paraxial cell (S₁ or S₁’) and finally, to a determined myogenic cell (S₂).

Although the automaton was created independent of the SAT function F, both logic methods resulted in the same conclusions. Both methods predicted that more clauses then those described in Figure 13A could lead to myogenesis (E). In one such predicted group of clauses (shown in red), g = 0 while both w and s are either 1 or 2 and b is either 0 or 1. In the second group (green), w is either 1 or 2, s = 0, g = 1, and b is either 0 or 1. Both the automaton and the SAT formalism indicated that the Shh and Noggin signals were redundant, suggesting an intriguing hypothesis that these two factors were located on the same biochemical pathway.

Three experiments were carried out in order to examine the hypothesis that Noggin expression required Shh signaling: a) insertion of a barrier between the midline tissues and the presomitic mesoderm (PSM), b) ablation of the notochord from the PSM anterior area, and c) implanting Wnt1-secreting cells lateral to a barrier that was inserted between the midline tissues and the PSM. The results strongly supported the hypothesis that Noggin expression was downstream of Shh signaling. These finding also revealed the unique status of the green scenarios in Figure 13E. While mathematically these scenarios must lead to myogenesis, the biological system could not allow Noggin level 1 while Shh level is 0, thus forcing Noggin levels to “collapse” into the level 0. Therefore these unique scenarios did not lead to myogenesis.

The entire process could be described by the simple but powerful formula: Wnts + (–BMP4) = myogenesis, where Shh levels could control the biological outcome of this formula by regulating Wnts signaling and controlling the expression of Noggin.

Outlook

The ever-increasing interest in BMC devices has not arisen from the hope that such machines could compete with their electronic counterparts by offering greater computation speed, higher fidelity and power or better performance in traditional computing tasks. The main advantage of autonomous BMC devices over the electronic computers arises from their ability to interact directly with biological systems and even with living organisms without any interface. The importance of the above-described biologically relevant automata was that they produced a computational output in the form of meaningful and expressible in vivo dsDNA. These results demonstrate that an appropriately designed computing machine can produce an output signal in the form of a specific biological function via direct interaction with living organisms. The next steps along this line would be the insertion of a complete computing device into a living cell or a tissue, with the long-term goal of utilizing BMC devices for in vivo diagnostics and disease control, or a design of new types of biological regulation.

References

R. Adar, Y. Benenson, G. Linshiz, A. Rosner, N. Tishby, E. Shapiro, Proc. Nat. Acad. Sci. USA, 2004, 101, 9960-9965.
L. M. Adleman, Science, 1994, 266, 1021-1024.
Y. Benenson, R. Adar, T. Paz-Elizur, Z. Livneh, E. Shapiro, Proc. Natl. Acad. Sci. USA, 2003, 100, 2191-2196.
Y. Benenson, B. Gil, U. Ben-Dor, R. Adar, E. Shapiro, Nature, 2004, 429, 423-429.
Y. Benenson, T. Paz-Elizur, R. Adar, E. Keinan, Z. Livneh, E. Shapiro, Nature, 2001, 414, 430-4.
R. S. Braich, N. Chelyapov, C. Johnson, P. W. Rothemund, L. Adleman, Science, 2002. 296, 499-502.
J. Chen, D. H. Wood, Proc. Natl Acad. Sci.USA, 2000, 97, 1328-1330.
D. Faulhammer, A. R. Cukras, R. J. Lipton, L. F. Landweber, Proc. Natl Acad. Sci. USA, 2000, 97, 1385-1389.
R. Feynman, in Miniaturization (Ed. D. Gilbert), Reinhold, New York, 1961, 282-296.
G. L. Gopalakrishnan, Computation Engineering Applied Automata Theory and Logic. Springer, New York, 2006.
J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, 1992.
K. Komiya, K. Sakamoto, H. Gouzu, S. Yokoyama, M. Arita, A. Nishikawa, M. Hagiya, in DNA Computing 6th international workshop on DNA-based computers (Eds. A. Condon, G. Rozenberg), Springer-Verlag, Berlin, 2001, 19-26.
E. Kossoy, N. Lavid, M. Soreni-Harari, Y. Shoham, E. Keinan, ChemBioChem, 2007, 8, 1255-1260.
T. H. LaBean, E. Winfree, J. H. Reif, in DIMACS Series in Discrete Mathematics and Theoretical Computer Science (Eds. E. Winfree, D. K. Gifford), American Mathematical Society, Providence, 1999, 54, 123-140.
R. J. Lipton, Science, 1995, 268, 542-545.
Q. Liu, L. Wang, A. G. Frutos, A. E. Condon, R. M. Corn, L. M. Smith, Nature, 2000, 403, 175-179.
S. Livstone, D. van Noort, L. F. Landweber, Trends Biotechnol., 2003, 21, 98-101.
C. Mao, T. H. LaBean, J. H. Reif, N. C. Seeman, Nature, 2000, 407, 493-6.
R. Piran, E. Halperin, N. Guttmann-Raviv, E. Keinan, R. Reshef, Development, 2009, 136, 3831-3840.
J. H. Reif, Science, 2002, 296, 478-479.
J. H. Rose, R. J. Deaton, M. Hagiya, A. Suyama, Phys. Rev. E. Stat. Nonlin. Soft Matter. Phys., 2002, 65, 021910.
P. W. K. Rothemund, in DNA Based Computers: Proceedings of a DIMACS Workshop (Eds. R. Lipton, B. B. Eric), American Mathematical Soc. 1996, 27, 297-302.
S. Roweis, E. Winfree, R. Burgoyne, N. V. Chelyapov, M. F. Goodman, P. W. Rothemund, L. M. Adleman, J. Comput. Biol., 1998, 5, 615-629.
J. A. Ruben, L. F. Landweber, Nat. Rev. Mol. Cell Biol., 2000, 1, 69-72.
K. Sakamoto, H. Gouzu, K. Komiya, D. Kiga, S. Yokoyama, T. Yokomori, M. Hagiya, Science, 2000, 288, 1223-1226.
N. C. Seeman, Chem Biol., 2003, 10, 1151-1159.
S. Shoshani, T. Ratner, R. Piran, E. Keinan, Isr. J. Chem., 2011, 51, 67-86.
S. Shoshani, S. Wolf, E. Keinan, Mol. Biosyst., 2011, 7, 1113-1120.
M. Soreni, S. Yogev, E. Kossoy, Y. Shoham, E. Keinan, J. Am. Chem. Soc., 2005, 127, 3935-3943.
S. Tagore, S. Bhattacharya, M. A. Islam, M. L. Islam, J. Proteomics Bioinform., 2010, 3, 234-243
S. Ulam, in Essays on cellular automata (Ed. A. E. Burks) U. of Illinois Press, Chicago, 1972, 219-231.
E. Winfree, F. Liu, L. A. Wenzler, N. C. Seeman, Nature, 1998, 394, 539-44.
E. Winfree, J. Biomol. Struct. Dyn., 1999, 11, 263-270.

Biologically relevant molecular finite automata

Contents

Introduction

Finite State Automata

The First Realization of an Autonomous, DNA-Based Finite Automaton

3-Symbol-3-State DNA-Based Automata

Molecular Computing Device for Medical Diagnosis and Treatment In Vitro

DNA-Based Automaton with Bacterial Phenotype Output

Molecular Computing with Plant Cell Phenotype

Applications of Molecular Finite Automata in Developmental Biology

Outlook

References

Personal tools

Namespaces

Variants

Views

Actions

Search

Navigation

Focal areas

Activity

Tools