Report

Theory of Computing Lecture 23 MAS 714 Hartmut Klauck The game • • • • Two players, Alice and Bob Alice receives an input x2§* Bob receives y2§* Alice sends a message to Bob, who decides whether x±y2 L – ±: concatenation, we will also just write xy • Cost of the game: number of messages used DFA and the game • Theorem [DFA vs. game]: – If L is regular, then the optimal number of messages in the game is equal to the smallest number of states in a DFA. – If L is not regular, then the smallest number of messages is infinite Application • How to show that we need many messages? • Consider the infinite matrix M, rows labeled with x2§* and columns labeled with y2§* and M[x,y]=1 iff xy2 L • Call this matrix the communication matrix of L • Lemma: The minimum number of messages in the game is equal to the number of distinct rows in the matrix Proof • Two rows labeled x, x’ are distinct if there is a column y where M[x,y]=1 and M[x’,y]=0 – or vice versa • If Alice sends the same message on x and x’ Bob cannot distinguish x and x’, so there is an error on xy or on x’y • On the other hand it is enough for Alice to send the same message for rows that are not distinct Examples • Parity: M has two distinct rows – one for x with odd number of 1’s and for even • {0n 1n}: – Consider the rows labeled 0k for all k=1,…,1 – The rows are all different • Conclusion: {0n 1n: n natural} is not a regular language • Note: We can consider subsets of the rows and columns of M and show that this submatrix has infinitely many distinct rows. Then M also has infinitely many distinct rows Example • L={xx: x2{0,1}* } is not regular • Proof: we need to show the number of distinct rows is infinite • Consider the rows and columns labeled by 0n1 for all natural n • Take two rows labeled by x=0p1 and x’=0r1 • Clearly there is a 1 in M[x,y] for y=0p1 but there is no 1 in M[x’,1] • Again M contains an infinite identity matrix Proof [DFA to game] • Assume L is regular. There is a DFA M for L. • Alice simulates M on x, and sends the state of M reached after reading x to Bob, who continues the simulation • Bob accepts iff M accepts xy • Clearly this is a correct protocol, and the number of messages is the size of the set of states Proof [game to DFA] • Now assume that there is a protocol with m messages • We can assume that Alice uses one message for each distinct row of the matrix, otherwise the protocol is not optimal – i.e., for two rows that are not distinct she uses the same message • We construct a DFA with m states corresponding to the m messages • q0 [the starting state] is the message that Alice sends on the empty string • We use m-1 other states, corresponding to messages • Set of accepting states: messages/states, such that Bob accepts on the message when y is empty string • Remains to define the transition function Proof • Assume that on some input x Alice sends message q, and on input xa she sends message q’ • Then we include the transition q,a q’ • Clearly this defines transitions, but are they consistent? • I.e., If Alice sends q on x and q’ on xa, and Alice also sends q on x’, then she should send q’ on x’a • But x and x’ have the same row, hence xa and x’a also have the same row [if not, there is some y such that xay2 L and x’ay not in L, hence rows of x and x’ not the same] • Hence the transition function is well-defined. • We get a DFA with m states for L Note • The minimum number of messages in our communication game is usually called the Myhill-Nerode index 2-way DFA • Theorem: 2-way DFA can recognize only regular languages • Note: – Sometimes it is easier to show that L is regular by using a 2-way DFA – Sometimes 2-way DFA need fewer states – Proof: We show that the Myhill-Nerode index is finite. Examples • L={w: third last symbol of w is 1} • 2-way DFA: move to the right end of the input, move 3 steps left, accept if there is a 1 • Lrev for a regular language L is the set of w, such that w read in reverse is in L – Move to the right end and then simulate the DFA for L while reading the input backwards Proof (theorem) • Proof: We show that the Myhill-Nerode index is finite. • Language L is decided by a 2-way DFA M, where M has n states • Consider words w=xy • Suppose the tape-head of M comes back from y into x while M is in state q, and the same happens later in state q’ – qq’ otherwise there is an infinite loop • Hence the states on which M crosses from y to x are distinct, and there are at most n such crossings • Simulation in the communication game: Proof • We describe the communication protocol • Let S=[q1,...,qm] be a sequence of any m· n distinct states of M • Alice simulates M on x, assuming that when M first leaves x it later returns to x in state q1 – Etc: If M leaves x for the k-th time it returns to x in state qk • Alice’s message: for every possible S Alice sends the m states in which M leaves x (for even m), and additionally if she would accept, for odd m • Bob finds the sequence S that is consistent with M’s behavior on y and Alice’s message (S is the correct sequence occurring on M and xy) • Bob accepts if M accepts • It is easy to see that this protocol is correct, and uses no more than O(n2n) messages, which is finite 2-way NFA • Definition: a two-way NFA is a nondeterministic Turing machine that cannot write and cannot leave the tape area that contains the input. • Theorem: Two-way NFA can decide only regular languages • Proof: Same as for two-way DFA, Alice and Bob check consistency with respect to the nondeterministic A – Is there a computation that leads to the communicated sequence of states and accepts? Open Problem • Is there a language, for which 2-way NFA need much fewer states than 2-way DFA? Size of Automata? • We know L is regular, if there is a DFA with a finite number of states for L • Minimum number of states is the Myhill-Nerode Index • Questions: 1. How can we find a minimal DFA for L? 2. Is the minimal DFA unique (up to renaming states)? 3. Are NFA sometimes smaller than minimal DFA? 4. Are two-way DFA sometimes smaller than DFA? Question 4 • Consider the language L={xi : x2{0,1}n and i2{0,1} log n and xn-i=1} – n is fix, this is a finite language, hence regular • It is easy to see that there are at least 2n rows in the communication matrix – Consider only columns for the different i=n-1,…,0 – Rows labeled x contains the string x at entries i=n-1…0 • Hence any DFA for L has at least 2n states • Can give a 2-way DFA with O(n2) states: – go to the right end of the input, read i into the state, move i steps left, accept if there is a 1 Question 3 • Consider L={xy: x,y 2 {0,1}n, x y} – This is a FINITE language, n is fixed • Exercise: there is an NFA with O(n2) states for L • DFA-size: the matrix has more than 2n distinct rows – DFA size is exponential • Hence NFA can be exponentially smaller than DFA for some languages • Example where they are not: complement of L 2) Uniqueness • Uniqueness of the minimal DFA (up to vertex names) follows from the Myhill-Nerode characterization • The optimal DFA has exactly one state for each set of equal rows of the communication matrix • Edges/the transition function are also determined uniquely, i.e., if x is in set q and xa is in set q’ then there must be an edge labeled a from q to q’