Computing by observing

Post-publication activity

Matteo Cavaliere,

The Framework

"Computing by observing" is a framework where the computation is obtained by observing and interpreting the trajectories of an observed system. In this framework, the entire observed progress of a system is considered the result of a computation. This is usually done in the following way: an external observer associates to each state (configuration) of an observed system an unique symbol. As the observed system progresses, the symbols are concatenated obtainining a sequence of symbols (string). The string produced is considered the result of a computation. The paradigm, in the most general case, is presented in Fig.<ref>framework</ref>.

Figure 1: The Computing by Observing Framework

The motivations for the paradigm come from experimental science: there, the usual procedure is to conduct an experiment, observe the entire dynamics of a system and then take the result of this observation as the final output. A momentary picture of the observed system is irrelevant, and rather the entire dynamics of the system is evaluated. Computing by observing introduces an explicit separation of the observer from the observed system, stressing in this way the role of the observer in computation. This allows to define the computation by an opportune computational balance between the observed system and of the external observer. In several applications of the framework, it has been shown that "simple" systems observed by "simple" observers can form very powerful computing devices (i.e., Turing machines) (see, e.g., Alhazov and Cavaliere(2004) and Cavaliere and Leupold(2004)). Moreover, as shown by Cavaliere et al(2006) if the observed system is sufficiently complex, one can obtain every possible computational device without changing the observed system but only by finding the "correct" (simple) observer. These results seem to be relevant especially in the area of natural and molecular computing where one could take advantage of the fact that simple experiments observed in a clever manner by simple tools could compute the same functions that much more complex experiments can compute.

Applications and Results

Computing by observing has been applied in several contexts, by considering different kinds of observers and of observed systems. The framework was originally introduced by Cavaliere and Leupold(2004b) in the framework of membrane computing (P systems), an abstract model of cellular processes. Following the same idea, new bio-inspired models of computation have been obtained in Alhazov and Cavaliere(2004) (observing sticker systems) and in Cavaliere et al(2009) (observing DNA splicing). The generalization of the framework to formal languages theory has been proposed in Cavaliere and Leupold(2004), to string-rewriting in Cavaliere and Leupold(2006). In general, in all these mentioned works, already rather simple systems observed by simple observers were proved to be computationally universal (i.e., equivalent to Turing machines). However, most of these results were obtained by designing an opportune observed system and an opportune observer to produce a specified result. This contrasts with the fact that, often, natural systems cannot be easily designed or modified. However, their dynamics can be usually monitored by using different observers. In this respect, as shown in Cavaliere et al(2006), if the observed system is sufficiently complex, every computational device can be obtained by observing in the ``right" manner a fixed observed system, that is by only changing the observer (with both - observers and observed system of very limited computational power).

Observing Formal Grammars

One can often abstract (idealize) a concrete discrete dynamical system by representing the configurations of the system as sequences of objects/symbols (strings) and the rules that determine the transitions between different configurations by means of opportune rewriting rules (e.g., productions of a formal grammar). For this reason, many results on the computing by observing paradigm can be understood in the abstract framework of formal languages theory. In "Grammar/Observer (G/O) systems", introduced in Cavaliere and Leupold(2004), a formal grammar plays the role of the observed system and a finite state automaton plays the role of the observer. The grammar's configurations are the sentential forms of its derivations. The trajectories of the grammar are its derivations. The observer is a then device mapping the sentential forms into one singular symbol. Precisely, the observer is an appropriate variant of finite state automata where the set of states is labeled with the symbols of a specified output alphabet: Any computation of the observer produces as output the label of the state it halts in (one can assume as observers only deterministic and complete automata). As we do in this article, an observer can always be described by opportune regular expressions (we assume the standard notations used for formal grammars and regular expressions).

A Grammar/Observer (G/O) system is a pair $ \Omega = (G,A) $ constituted by a generative grammar $ G =(N, T, S, P) $ and an observer $ A $ with output alphabet $\Sigma \cup \{\bot\}$ which is also the output alphabet of the entire system $\Omega$. The observer's input alphabet is the union of $N$ and $T$. Several modes of generation can be defined by considering different restrictions on the observer. The most general observer can write an empty and non-empty symbol in an arbitrary manner (free G/O systems).

A free G/O system generates a language in the following manner\[L_f(\Omega) = \{A(w_0,w_1,\dots ,w_n) \mid S=w_0 \Rightarrow w_1 \Rightarrow \dots \Rightarrow w_n,\ w_n\in T^* \}.\]

Here $ A(w_0,w_1,\dots ,w_n) $ is used as a more compact way of writing the catenation $ A(w_0)A(w_1)\cdots A(w_n) $. In other words, the language contains all those strings corresponding to all the possible terminating derivations (trajectories) of the observed grammar. Derivations which do not terminate do not produce a valid output. One can consider the variant where the language produced by $ \Omega $ is defined as $ L_{\bot,f}(\Omega) = {L}_f(\Omega) \cap \Sigma^* . $ In this way the strings in $ L_f(\Omega) $ containing $ \bot $ are filtered out and they are not present in $ L_{\bot,f}(\Omega) $. Thus the observer has in some sense the ability to reject a computation, when configurations of a certain type appear. The functioning of a simple free G/O system presented in in Fig. <ref>example</ref> is the following one.

Figure 2: Functioning of a Grammar/Observer System

To each sentential form produced by the grammar $ G $ the observer $ A $ associates a symbol that can be $ a,b,c,\bot $ or the empty symbol $ \lambda $ (the vertical arrow is the observer mapping). Thus every computation of the observer produces a unique symbol and the chronological catenation of these symbols is then the output string. In the figure the output string is $\lambda a \cdots a b\cdots b c \cdots c$. The mapping defined by the observer is specified by the corresponding regular expressions. The language $L_f(\Omega)$ is obtained by considering all possible halting derivations of $G$ and collecting all the output strings. In this case $L_{\bot,f} (\Omega)$ is $\{a^n b^n c^n \mid n >0\}$. One can assume that $ \mathcal REG $ and $ \mathcal CF$ denote the classes of regular and context-free grammars, respectively and that $REG$, $CF$ and $RE$ denote the classes of languages generated by regular, context-free, and type-0 grammars, respectively. Commutative context-free grammar, as defined in Cavaliere and Leupold(2004) is a class of grammars weaker than $\mathcal CF$ and are denoted by $LCCF$. In Cavaliere and Leupold(2004) it has been shown that a G/O system composed of very simple components, namely a locally commutative context-free grammar ($LCCF$) as observed system and a finite state automaton as observer is computationally complete.

Theorem : For each $L \in RE $ there exists a G/O system $\Omega = (G,A)$, with $G$ an $LCCF$, such that $ L_{\bot,f}(\Omega)=L. $

An interesting fact is that the observer's ability to produce $ \bot $, i.e., to eliminate certain observed derivations, seems intuitively to be a powerful and essential feature to get high computational power. However, one can obtain all recursively enumerable languages over $ \Sigma $ simply by intersection of a language over $\Sigma \cup \{\bot\}$ with the regular language $\Sigma^*$. Notice that recursive languages are closed under intersection with regular sets. Therefore, there must exist some G/O system $\Omega$ generating a non-recursive $L_{f}(\Omega)$ (i.e., without using the filtering with $\bot$).

An interesting restriction of the general framework is to allow only changes in the observer, keeping unchanged the observed system (computing by only observing Cavaliere et al(2006)). The following simple example shows that the changes of the observer can be important. Let us consider the context-free grammar $ G = (\{S,A,B,C\},\{t,p\},S,\{S \rightarrow pS, S \rightarrow p, S\rightarrow A, A\rightarrow AB, A\rightarrow C, B\rightarrow C, C\rightarrow t\})$.

If $G$ is coupled with the observer $A'$ such that $A'(w) = a $ if $ w\in \{S,A,B,C,t,p\}^+ $, then $\Omega=(G,A')$ defines the language $L_{\bot,f}(\Omega)= \{a^i \mid i \geq 2\}$, a regular language. In fact, the derivation $S\Rightarrow pS \stackrel{n-2}\Rightarrow p^{n-1}S \Rightarrow p^n$ produces (when observed) the string $a^{n+1}$. Keeping the same grammar $G$ we change the observer into $A''$ such that: \[A''(w) = \left\{ \begin{array}{ll} \lambda & \textrm{ if } w=S,\\ a & \textrm{ if } w\in AB^*,\\ b & \textrm{ if } w\in C^+B^*,\\ c & \textrm{ if } w\in t^+C^*,\\ \bot & \textrm{ else} \end{array}\right.\] One can verify that $\Omega=(G,A'')$ generates the language $L_{\bot,f}(\Omega)= \{a^nb^nc^n \mid n > 0 \}$, a context-sensitive language. This example suffices to underline that part of the computation can be done by selecting the right observer, keeping unchanged the observed system. This part can be crucial: As shown Cavaliere et al(2006) one can construct an universal context-free grammar that can generate all recursively enumerable languages when observed in the correct manner.

Computing by Observing Bio-Systems

The computing by observing framework has been applied in the area of molecular computing to define "bio-inspired accepting machines". This has been generally achieved by taking (a model of) a biological system, introducing an input to such a system and observing its trajectory. If the trajectory is of an "expected" type (for example, it follows a predetermined pattern), the input is considered accepted by the system, otherwise it can be considered rejected. In this case, the external observer is used for extracting the observed trajectory of the system. A decider is then an automata-like machine that checks whether the observed trajectory is of the expected type. In this articile, we present the application (Cavaliere et al(2009)) of the paradigm to to splicing systems, a formal model of the recombination of double stranded DNA sequences (for simplicity, one refers to them as DNA strands) under the action of a ligase and restriction enzymes (endonucleases). An abstract accepting device, “splicing recognizer” (in short, SR), constructed by joining a decider, an observer, and a splicing system is depicted in Fig.<ref>RSS</ref>.

Error creating thumbnail: Invalid thumbnail parameters

Figure 3: Computing by Observing a Splicing System

The SR works in the following way. An input "marked" DNA strand (represented by a string $w$) is inserted in a test tube. Due to the presence of restriction enzymes, the input strand changes, as it starts to recombine with other DNA strands present in the test tube. A sequence of intermediate marked DNA strands is then produced: Each new marked strand is obtained by splicing the previous one with other strands present in the test tube. This constitutes the trajectory of the input marked DNA strand. Schematically this is presented with the sequence of $w, w', w'', w'''$ in Fig.<ref>RSS</ref>. The external observer associates to each intermediate marked strand a certain symbol taken from a finite set of possible symbol. It writes these symbols onto an output tape in their chronological order. In Fig.<ref>RSS</ref> this corresponds to the string $a_1a_2a_3a_4$. This string then represents a code of the obtained trajectory. When the marked strand becomes of a certain predetermined ``type" the observation stops. At this point the decider checks if the entire observed trajectory of the input marked DNA strand described by the string $a_1a_2a_3a_4$ has followed a certain expected pattern, i.e., if it is in a certain language. If this is true, the input string $w$ is accepted by the SR; otherwise it is considered to be rejected. Using this strategy it is possible to obtain very powerful accepting systems even when very simple components are used. For instance, observing a finite splicing system (i.e., finite set of rules) with observer and decider as finite state automata is enough to simulate an accepting Turing machine. This is a remarkable jump in acceptance power since it is well known (e.g., Paun et al(1998)) that a finite splicing system by itself can generate only a subclass of the class of regular languages.

These ideas can be formalized in the following way (we assume the notation of splicing systems, more precisely their formal H scheme, as presented in Paun et al(1998)).

Consider an alphabet $V$ (splicing alphabet) and two special symbols $\# $ and $\$$ not in $V$. For a splicing rule $r=u_1 \# u_2 \$ u_3 \# u_4$ and strings $x,y,z_1,z_2 \in V^*$ one writes $(x,y) \Longrightarrow_{r} (z_1,z_2)$ iff $x=x_1u_1u_2x_2, y=y_1u_3u_4y_2,z_1=x_1u_1u_4y_2, z_2=y_1u_3u_2x_2$. One refers to $z_1$ ($z_2$) as the first (second) string obtained by applying the splicing rule $r$. An H scheme is a pair $\nu=(V,R)$ where $V$ is an alphabet, and $R\subseteq V^* \#V^*\$V^*\#V^*$ is a set of splicing rules.

For a given H scheme $\nu=(V,R)$ and a language $L \subseteq V^*$ one defines $\nu(L)=\{z_1,z_2 \in V^* \mid (x,y) \Longrightarrow_{r} (z_1,z_2), $ for some $ x, y \in L,r \in R \}$. When restriction enzymes (and a ligase) are present in a test tube, they do not stop acting after one cut and paste operation, but they act iteratively. Then, given an “initial language” $L\subseteq V^*$, and an H scheme $\nu=(V,R)$ one defines the iterated splicing as\[\nu^0(L)=L, \nu^{i+1}(L)= \nu^{i}(L) \cup \nu(\nu^{i}(L)), i \geq 0.\]

In the proposed paradigm, we are interested in observing the trajectory of a specific marked string introduced, at the beginning, in the initial language $L$ and called “input marked string”.

Given an initial language $L$, an “input marked string” $w \in L$, a “target marked language” $L_t$ and an H scheme $\nu$, the scheme defines a sequence of marked strings that represents the trajectory of the input marked string $w$, according to the splicing rules defined in $\nu$.

The sequence of marked strings, $\langle w_0=w,w_1, \cdots, w_k \rangle$, for $k \geq 1$ and $w_k \in L_t$, is constructed in the following iterative way ($w_i$ is the marked string associated to the set $\nu^{i}(L), 0 \leq i \leq k$). Each new marked string is obtained by splicing the old marked string, until a marked string $w_k$ from the target marked language $L_t$ is reached or the marked string cannot be spliced. The first string of the sequence is the input marked string, $w_0=w$. If $w_i \in L_t$, $i \geq 1$, then the sequence stops (the marked string is among the ones of the target marked language). If there is no $x \in \nu^{i}(L)$, $i \geq 0$, such that $(w_i,x) \Longrightarrow_{r} (z_1,z_2)$ or $(x,w_i) \Longrightarrow_{r} (z_1,z_2)$ for some $r\in R$, then the sequence stops (the marked string cannot be spliced). If $x,y \in \nu^{i}(L)$, $i\geq 0$, with $w_i=x$ (or $w_i=y$) and there exists a rule $ r \in R$ such that $(x,y) \Longrightarrow_{r} (z_1,z_2)$, then $w_{i+1}=z_1$. In this case, if the marked string can be subject to more than one splicing rule, producing different strings, the choice of the next marked string is done in a non-deterministic way (notice that it is assumed that the first string produced as the new marked one). Because the update of a marked string is made in a non-deterministic way, given an input marked string $w$, an initial language $L$, a target marked language $L_t$, and an H scheme $\nu$, it is possible to get different sequences of intermediate marked strings. The collection of all these sequences is denoted by $\nu(w,L,L_t)$.

Moreover we use an "observer" that is a finite state automaton with output as defined for G/O systems.

As “deciders” we require devices accepting a certain language over the output alphabet of the corresponding observer and for this we can rely on conventional finite automata. The output of a decider $D$, for a word $w$ in input, is denoted by $D(w)$. It consists of a simple “yes” or “no”.

These components can be then composed to constitute a “splicing recognizer” that formalizes the machine presented in Fig.<ref>RSS</ref>.

A splicing recognizer (SR) is a quintuple $\Omega = (\nu, A, D, L, L_t)$; $\nu =(V,R)$ is an H scheme, $A$ is an observer with input alphabet $V$ and output alphabet $\Sigma$, $D$ is a decider with input alphabet $\Sigma$, $L$ and $L_t$ are finite languages, respectively, the initial and the target marked language for $\nu$.

The “language accepted” by the SR $\Omega $ is the set of all words $w\in V^*$ provided there exists a sequence of marked strings accepted by the decider. Formally, \[ L(\Omega) = \{w \in V^*\mid \] there is $ s\in \nu(w,L,L_t) $ such that $ D(A(s)) = $” yes” }.

It is known in the splicing literature (e.g.,Paun et al(1998) ) that the family of languages generated by splicing systems using only a finite set of splicing rules and a finite initial language is strictly included in the family of regular languages. However we can show that an SR composed by an H scheme with a finite set of rules, finite initial language, finite target marked language and finite state automata as observer and decider, can recognize non-regular languages. This example is just a hint towards the fact that the combination splicing system-observer-decider can be powerful even when the single components are simple. In particular, we show how to construct an SR accepting the non-regular language $ \{o_l a^nb^n o_r \mid n \geq 0\}$.

The SR $\Omega=(\nu, A, D, L, L_t)$ is defined as follows: the H scheme is $\nu=(V,R)$, with $V=\{o_l,o_r,a,b,a',b',X_1,Y_1,X_2,Y_2\}$ and $R=\{r_1:\#bo_r \$ X_2\#b'o_r, r_2: o_la' \# Y_2\$ o_l a \#, r_3: \#b'o_r \$ X_1\#o_r, r_4:o_l \# Y_1 \$ o_la'\# \}$.

The initial language is $L=\{X_2b'o_r,o_la'Y_2,X_1o_r,Y_1o_l\}$. The target marked language is $L_t=\{o_lo_r\}$.

The observer $A $ has input alphabet $V$ and output alphabet $\Sigma= \{l_0,l_1,l_2,l_3,\bot\}$.

The mapping it implements is\[A(w) = \left\{ \begin{array}{ll} l_0 & w\in o_l(a^*b^*)o_r,\\ l_1 & w\in o_l(a^*b^*b')o_r,\\ l_2 & w\in o_l(a'a^*b^*b')o_r,\\ l_3 & w\in o_l(a'a^*b^*)o_r,\\ \bot & else. \end{array}\right.\]

The decider $D$ is a finite state automaton, with input alphabet $\Sigma$, that gives a positive answer exactly if a word belongs to the regular language $l_0(l_1l_2l_3l_0)^*$.

The functioning of $\Omega$ can be described as follows. The observer can "check" that the splicing rules are applied in the order $r_1,r_2,r_3,r_4$, and this corresponds to remove, in an alternating way, a $b$ from the right and an $a$ from the left of the input marked string. In this way, at least one of the evolutions of the input marked string will be of the kind accepted by the decider if, and only if, the input marked string is in the language $\{o_l a^nb^n o_r \mid n \geq 0\}$. Notice that, at each step, the marked string present is spliced only with one of the strings present in the initial language. To clarify the working of the SR $\Omega$ we show the acceptance of the input marked string $w_0=o_l aabbo_r$. For simplicity, we only show the trajectory of the input marked string and the output of the observer, step by step.

Step $0$: input marked string $w_0= o_l aa bb o_r$; $A(w_0)= l_0$;

Step $1$: apply rule $r_1$; new marked string $w_1=o_l aabb'o_r$; $A(w_1)=l_1$;

Step $2$: apply rule $r_2$; new marked string $w_2=o_la'abb'o_r$; $A(w_2)=l_2$;

Step $3$: apply rule $r_3$; new marked string $w_3=o_la'abo_r$; $A(w_3)=l_3$;

Step $4$: apply rule $r_4$; new marked string $w_4=o_labo_r$; $A(w_4)=l_0$;

Step $5$: apply rule $r_1$; new marked string $w_5=o_l ab'o_r$;$A(w_5)=l_1$;

Step $6$: apply rule $r_2$; new marked string $w_6=o_la'b'o_r$; $A(w_6)=l_2$;

Step $7$: apply rule $r_3$; new marked string $w_7=o_la'o_r$; $A(w_7)=l_3$;

Step $8$: apply rule $r_4$; new marked string (in the target marked language) $w_8=o_lo_r$; $A(w_8) =l_0$.

Obviously, the entire observed trajectory $l_0l_1l_2l_3l_0l_1l_2l_3l_0$ is of the kind accepted by the decider $D$, so the string $w_0$ is accepted by the SR $\Omega$. Use similar reasoning one can see tha the defined SR $\Omega$ accepts exactly the language $ \{o_l a^nb^n o_r \mid n \geq 0\}$.

Using similar approaches, it is possible to prove that SRs are universal. In informal words, this means that it is possible to simulate an accepting Turing machine by observing, with a simple observer, the trajectories of a simple splicing system (this was shown in Cavaliere et al(2009)).

Theorem : For each $RE$ language $L$ over the alphabet $T$ there exists an SR $\Omega$ using a splicing scheme $\nu $ with finite splicing rules and finite initial language, such that $\Omega$ accepts the language $\{o_l' w o'_r \mid w \in L\}$, with $o'_l, o'_r \notin T$.

References

Alhazov(2004). Computing by Observing: The Case of Sticker Systems. DNA 2004 LNCS 3384: 1-13.

Cavaliere(2004). Evolution and Observation: A Non-Standard Way to Generate Formal Languages. Theoretical Computer Science 321: 233-248.

Cavaliere(2004b). Evolution and Observation- A New Way to Look at Membrane Systems. WMC2003 LNCS 2933: 70-87.

Cavaliere, M; Hoogeboom, HJ and Frisco, P (2006). Computing by Only Observing. Developments in Language Theory 2006 LNCS 4036: 1-10.

Cavaliere(2006). Observation of String-Rewriting Systems. Fundamenta Informaticae 74: 447-462.

Cavaliere, M; Jonoska, N and Leupold, P (2009). DNA Splicing: Computing by Observing. Natural Computing 8: 157-170.

Paun, Gh; Rozenberg, G and Salomaa, A (1998). DNA Computing. New Computing Paradigms. Springer-Verlag, Heidelberg.

Surveys

Cavaliere, M (2008). Computing by Observing: A Brief Survey. Computability in Europe 2008 LNCS 5028: 110-119.

Cavaliere(2009). Complexity through the Observation of Simple Systems. Electronic Proceedings Theoretical Computer Science 1: 22-10. arXiv:0906.3332

Computing by observing

Contents

The Framework

Applications and Results

Observing Formal Grammars

Computing by Observing Bio-Systems

References

Surveys

Personal tools

Namespaces

Variants

Views

Actions

Search

Navigation

Focal areas

Activity

Tools