CMPSCI 501: Theory of Computation

Solutions to First Midterm Exam, Spring 2016

David Mix Barrington

Exam given February 2016

Solutions posted 26 February 2016

Directions:

Answer the problems on the exam pages.
There are twelve problems (some with multiple parts) for 130 total points. Actual scale was A = 115, C = 75.
If you need extra space use the back of a page.
No books, notes, calculators, or collaboration.
The first five questions are statements -- in each case say whether the statement is true or false and give a convincing justification of your answer -- a proof, counterexample, quotation from the book or from lecture, etc. You get five points for the correct boolean answer (so there is no reason not to guess if you don't know) and up to five for the justification.

Question text is in black, solutions in blue.

  Q1: 10 points
  Q2: 10 points
  Q3: 10 points
  Q4: 10 points
  Q5: 10 points
  Q6: 10 points
  Q7: 15 points
  Q8: 10 points
  Q9: 10 points
  Q10: 10 points
  Q11: 10 points
  Q12: 15 points
 Total: 130 points

The language X over the alphabet {a, b, c} is denoted by the regular expression (ab)^*c(ab)^*.

The language Y over the alphabet {a, b, c} is the set {wcw: w ∈ {a, b}^*}.

The grammar G has input alphabet {a, b, c}, nonterminals S and T, start symbol S, and rules S → aTb, T → bSa, and S → c.

Question 1 (10): True or false with justification: If L is any language that is not context-free, and R is any regular language, then L ∩ R cannot be context-free.
FALSE. If, for example, R = ∅, then L ∩ R = ∅ as well, and we know that ∅ is a regular language and hence a context-free one. Another example is that Y above is not a CFL, and X above is a regular language, but Y ∩ X is exactly the context-free language L(G) where G is given above.
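The claim that Y ∩ X = L(G) can be sanity-checked by brute force on short strings. The Python sketch below is only an illustration: the helper names in_X, in_Y, in_LG and the bound MAX_LEN are my own choices, and a finite check is of course not a proof.

```python
import re
from itertools import product

MAX_LEN = 9  # assumed bound for the brute-force check


def in_X(s):
    # X is denoted by the regular expression (ab)*c(ab)*
    return re.fullmatch(r"(ab)*c(ab)*", s) is not None


def in_Y(s):
    # Y = {wcw : w in {a, b}*}: odd length, c exactly in the middle
    if len(s) % 2 == 0:
        return False
    mid = len(s) // 2
    w, u = s[:mid], s[mid + 1:]
    return s[mid] == "c" and w == u and "c" not in w


def in_LG(s):
    # G: S -> aTb, T -> bSa, S -> c, so S derives c or "ab" S "ab"
    if s == "c":
        return True
    return (len(s) >= 5 and s.startswith("ab") and s.endswith("ab")
            and in_LG(s[2:-2]))


def all_strings(max_len):
    # Every string over {a, b, c} of length at most max_len
    for n in range(max_len + 1):
        for t in product("abc", repeat=n):
            yield "".join(t)


agree = all((in_X(s) and in_Y(s)) == in_LG(s) for s in all_strings(MAX_LEN))
print(agree)
```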
Question 2 (10): True or false with justification: The language L(G), where G is the grammar defined above, is regular.
FALSE. Let S = {(ab)ⁱc: i ≥ 0}. If i ≠ j, then (ab)ⁱc(ab)ⁱ is in L(G) but (ab)^jc(ab)ⁱ is not. So the strings in the infinite set S are pairwise L(G)-distinguishable, and thus L(G) is not regular by the Myhill-Nerode Theorem.
You could also prove L(G) to not be regular by the Regular Language Pumping Lemma, letting w = (ab)^pc(ab)^p where p is the alleged pumping length. Pumping down causes the c to no longer be in the exact middle of the string, since the entire non-empty string y must be before the c, and this means that the pumped-down string is not in L(G).
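The distinguishability argument can be illustrated in Python for the first few strings (ab)ⁱc. This is a sketch under my own naming (in_LG, distinguishes, and the bound N are hypothetical), checking that the suffix (ab)ⁱ separates the i-th string from every other one.

```python
def in_LG(s):
    # L(G) = {(ab)^i c (ab)^i : i >= 0}
    left, sep, right = s.partition("c")
    return sep == "c" and left == right and left == "ab" * (len(left) // 2)


def distinguishes(x, y, z):
    # z distinguishes x from y if exactly one of xz, yz is in L(G)
    return in_LG(x + z) != in_LG(y + z)


N = 6  # assumed number of strings to test
prefixes = ["ab" * i + "c" for i in range(N)]

pairwise = all(
    distinguishes(prefixes[i], prefixes[j], "ab" * i)
    for i in range(N)
    for j in range(N)
    if i != j
)
print(pairwise)
```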
Question 3 (10): True or false with justification: For any non-negative integer n, let f(n) be the number of strings of length n in the language X defined above. Then f(n) = O(1), that is, there is some constant c such that for all sufficiently large n, f(n) ≤ c.
FALSE. For any non-negative integers i, j, and n with i + j = n, (ab)ⁱc(ab)^j is a string of length 2n + 1 in X. So there are n + 1 strings of length 2n + 1 in X, that is, f(2n + 1) = n + 1, and thus f exceeds any given constant infinitely often.
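The count can be confirmed for small n by enumerating every string over {a, b, c}. In this sketch the function name f mirrors the question, and Python's regular expression (ab)*c(ab)* is used as a stand-in for the expression defining X.

```python
import re
from itertools import product

PATTERN = re.compile(r"(ab)*c(ab)*")


def f(n):
    # Number of strings of length n over {a, b, c} that lie in X
    return sum(
        1 for t in product("abc", repeat=n)
        if PATTERN.fullmatch("".join(t))
    )


counts = [f(n) for n in range(10)]
print(counts)  # [0, 1, 0, 2, 0, 3, 0, 4, 0, 5]
```

The odd positions show the pattern f(2n + 1) = n + 1, and the even positions are zero because every string in X has odd length.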
Question 4 (10): True or false with justification: For any non-negative integer n, let g(n) be the number of strings of length n in the language Y defined above. Then g(n) is not polynomially bounded, that is, it is not the case that there is a polynomial p such that, for sufficiently large n, we have g(n) ≤ p(n). (Note: The paraphrase of "polynomially bounded" was incorrect on the actual exam sheet.)
TRUE. If w is any string in {a, b}ⁿ, wcw is a string in Y. So there are 2ⁿ strings of length 2n + 1 in Y, or 2^((n-1)/2) of length n whenever n is odd. This exponential function eventually exceeds any polynomial function.
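A brute-force count for small n agrees with the formula 2^((n-1)/2) for odd n. The sketch below uses my own helper in_Y for membership in Y; the function name g mirrors the question.

```python
from itertools import product


def in_Y(s):
    # Y = {wcw : w in {a, b}*}
    if len(s) % 2 == 0:
        return False
    mid = len(s) // 2
    w, u = s[:mid], s[mid + 1:]
    return s[mid] == "c" and w == u and "c" not in w


def g(n):
    # Number of strings of length n over {a, b, c} that lie in Y
    return sum(1 for t in product("abc", repeat=n) if in_Y("".join(t)))


counts = [g(n) for n in range(10)]
print(counts)  # [0, 1, 0, 2, 0, 4, 0, 8, 0, 16]
```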
Question 5 (10): True or false with justification: Let M and N be any two DFA's. Let Z be the language {u₁v₁u₂v₂...u_kv_k: k ≥ 0, u_i ∈ L(M), v_i ∈ L(N)}. Then Z must be a regular language.
TRUE. The easiest proof is to note that Z is exactly the language (L(M)L(N))^*, and that the regular languages are closed under concatenation and star. (It is incorrect to claim that Z is regular just because its strings are formed by concatenating strings in L(M) and L(N) -- "closure under concatenation" is a property of languages, not of strings.) Alternatively, we could build an NFA for Z out of M and N, by taking a union of M's states and N's states, adding a new start state i which is the only final state in the new NFA, and adding ε-moves from i to the start state of M, from every final state of M to the start state of N, and from every final state of N to i.
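The NFA construction can be sketched in Python. Everything concrete here is an assumption made for illustration: the two example machines (M accepting one or more a's, N accepting the single string b), the dictionary encoding of a DFA, and the convention that a missing transition means rejection. For these two machines Z should equal (a⁺b)^*, which the last lines check on short strings.

```python
import re
from itertools import product

# Hypothetical example DFAs; a missing transition means "go to a dead state".
M = {"start": 0, "finals": {1}, "delta": {(0, "a"): 1, (1, "a"): 1}}
N = {"start": 0, "finals": {1}, "delta": {(0, "b"): 1}}


def build_star_nfa(M, N):
    delta, eps = {}, {}
    # Copy both machines, tagging states so the two state sets are disjoint.
    for tag, dfa in (("M", M), ("N", N)):
        for (q, x), r in dfa["delta"].items():
            delta.setdefault(((tag, q), x), set()).add((tag, r))
    # New start state i, which is also the only final state.
    eps.setdefault("i", set()).add(("M", M["start"]))
    for q in M["finals"]:
        eps.setdefault(("M", q), set()).add(("N", N["start"]))
    for q in N["finals"]:
        eps.setdefault(("N", q), set()).add("i")
    return delta, eps


def closure(states, eps):
    # All states reachable from `states` by epsilon-moves alone
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(s, delta, eps):
    current = closure({"i"}, eps)
    for x in s:
        step = set()
        for q in current:
            step |= delta.get((q, x), set())
        current = closure(step, eps)
    return "i" in current


delta, eps = build_star_nfa(M, N)
agree = all(
    accepts("".join(t), delta, eps)
    == (re.fullmatch(r"(a+b)*", "".join(t)) is not None)
    for n in range(9)
    for t in product("ab", repeat=n)
)
print(agree)
```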
Question 6 (10): Prove that there does not exist any PDA M such that Y = L(M), where Y is defined above.
We show that Y does not obey the conclusion of the CFL Pumping Lemma, which means that it cannot be the language of any PDA.
Suppose p were the pumping length for Y in the CFLPL. We let w be the string a^pb^pca^pb^p and suppose that w = uvxyz with |vy| > 0 and |vxy| ≤ p. Neither v nor y can contain the c, or pumping down would bring us out of Y. We cannot have v and y on the same side of the c, or pumping down would cause the c to no longer be in the exact middle of the string. The only remaining case is where x contains the c, and in this case v consists entirely of b's and y consists entirely of a's. Pumping down changes either the number of b's before the c or the number of a's after the c, or both. The changed number cannot equal the size of the corresponding block on the other side of the c, since that block has not changed. In each case the pumped-down string is not in Y, so the conclusion of the CFLPL is false for Y and Y is not a CFL.
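For small values of p the case analysis can be checked exhaustively. The Python sketch below tries every split w = uvxyz of the string a^p b^p c a^p b^p with |vy| > 0 and |vxy| ≤ p and asks whether any split survives pumping with exponents 0, 1, and 2. The function names and the range of p are my own choices, and the check only illustrates the proof for those values of p.

```python
def in_Y(s):
    # Y = {wcw : w in {a, b}*}
    if len(s) % 2 == 0:
        return False
    mid = len(s) // 2
    w, u = s[:mid], s[mid + 1:]
    return s[mid] == "c" and w == u and "c" not in w


def survives_pumping(w, p):
    # True if some split w = uvxyz with |vy| > 0 and |vxy| <= p keeps
    # u v^i x y^i z inside Y for i = 0, 1, 2.
    n = len(w)
    for a in range(n + 1):                          # u = w[:a]
        for d in range(a, min(a + p, n) + 1):       # vxy = w[a:d]
            for b in range(a, d + 1):               # v = w[a:b]
                for k in range(b, d + 1):           # x = w[b:k], y = w[k:d]
                    u, v, x, y, z = w[:a], w[a:b], w[b:k], w[k:d], w[d:]
                    if not v and not y:
                        continue
                    if all(in_Y(u + v * i + x + y * i + z) for i in (0, 1, 2)):
                        return True
    return False


results = {
    p: survives_pumping("a" * p + "b" * p + "c" + "a" * p + "b" * p, p)
    for p in range(1, 6)
}
print(results)  # every value is False: no split can be pumped
```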
Question 7 (15): Suppose that M' is a PDA in which the transition (p, ε, ε; q, ε) occurs. (Recall that ε is the empty string.)
- (a, 5) If we transform M' into a PDA in the correct normal form for the PDA-to-CFG construction, what transitions and other entities will we add to replace the transition (p, ε, ε; q, ε)?
  That transition violates the normal form because it neither pushes nor pops a letter. We add a new state r, a new stack alphabet character c, and two transitions (p, ε, ε; r, c) and (r, ε, c; q, ε).
- (b, 10) What rules of the resulting CFG will come from the transitions added in part (a)?
  From the transitions themselves, there is only one new grammar rule, from combining the push-c transition with the pop-c transition to get A_pq → εA_rrε. (The construction also supplies the rule A_rr → ε, as it does for every state, so that A_pq can derive ε.) There are also new rules of the form A_st → A_srA_rt for every pair of states s and t, but none of these is needed in any derivation of a terminal string, since no string can take M' from any state other than r, with an empty stack, to r with an empty stack, or vice versa.
Question 8 (10): Give an NFA N such that L(N) = X, using the regular expression for X given above. Remember that NFA's may include ε-moves.
The simplest NFA, using an ad hoc construction from the regular expression, would have state set {i, p, q, f}, start state i, only final state f, and transitions (i, a, p), (p, b, i), (i, c, f), (f, a, q), and (q, b, f).
My construction gives eight states {1, 2, 3, 4, 5, 6, 7, 8}, with 1 the start state, 8 the only final state, and transitions (1, ε, 2), (2, a, 3), (2, ε, 4), (3, b, 4), (4, ε, 2), (4, c, 5), (5, a, 6), (5, ε, 7), (6, b, 7), (7, ε, 5), and (7, ε, 8). I believe that following the construction in Sipser gives twelve states.
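The eight-state NFA can be simulated directly to check that it accepts X on short inputs. In this Python sketch the encoding is an assumption of mine: ε is represented by the empty string, and the comparison uses Python's regular expression (ab)*c(ab)* for X.

```python
import re
from itertools import product

EPS = ""  # assumed encoding of an epsilon-move
TRANSITIONS = [
    (1, EPS, 2), (2, "a", 3), (2, EPS, 4), (3, "b", 4), (4, EPS, 2),
    (4, "c", 5), (5, "a", 6), (5, EPS, 7), (6, "b", 7), (7, EPS, 5),
    (7, EPS, 8),
]
START, FINALS = 1, {8}


def closure(states):
    # All states reachable from `states` by epsilon-moves alone
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for (src, label, dst) in TRANSITIONS:
            if src == q and label == EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(s):
    current = closure({START})
    for x in s:
        step = {dst for (src, label, dst) in TRANSITIONS
                if src in current and label == x}
        current = closure(step)
    return bool(current & FINALS)


agree = all(
    accepts("".join(t))
    == (re.fullmatch(r"(ab)*c(ab)*", "".join(t)) is not None)
    for n in range(8)
    for t in product("abc", repeat=n)
)
print(agree)
```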
Question 9 (10): Build (by any valid method) a DFA D such that L(D) = X, where X is given above. Explain why you believe your DFA to be correct.
My construction takes the first NFA given in the solution to Question 8, adds a new death state d with all arrows to itself, and adds an arrow to d for every state-letter pair missing in the NFA. It is easy to see that this DFA is correct for X. First, any string in X goes from i to i after the initial ab's, to f after the c, and then remains in f after the final ab's. Conversely, any string accepted by the DFA must have sometime gone from i to f on a c, and before that could only have taken some number of ab's from i and some number of ab's from f.
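The DFA with the death state is small enough to write out and test. The Python sketch below encodes the five given transitions in a dictionary and sends every missing state-letter pair to d; the names DELTA and dfa_accepts are illustrative only.

```python
import re
from itertools import product

# The five transitions of the four-state machine; all others go to d.
DELTA = {
    ("i", "a"): "p", ("p", "b"): "i", ("i", "c"): "f",
    ("f", "a"): "q", ("q", "b"): "f",
}
START, FINAL = "i", {"f"}


def dfa_accepts(s):
    state = START
    for x in s:
        # A missing pair, or any move out of d, lands in the death state d.
        state = DELTA.get((state, x), "d")
    return state in FINAL


agree = all(
    dfa_accepts("".join(t))
    == (re.fullmatch(r"(ab)*c(ab)*", "".join(t)) is not None)
    for n in range(9)
    for t in product("abc", repeat=n)
)
print(agree)
```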
Question 10 (10): Describe the equivalence classes for the relation of X-equivalence, in the sense of the Myhill-Nerode Theorem. (Again, X is the language given above.) You may use any valid method.
There are five classes, corresponding to the five states of the DFA given in the solution to Question 9, which happens to be minimal. The class of strings going to i is (ab)^*, to p is (ab)^*a, to f is X itself, to q is the language Xa, and to d is the messier set consisting of all strings that cannot be extended to a string in X (or equivalently, the set of all strings not in the other four languages).
We need to establish that this DFA is minimal. We could run the minimization algorithm on it, or just note that the four nonfinal states i, p, q, and d are X-distinguishable because wc is in X if and only if w goes to i, wbc is in X if and only if w goes to p, wb is in X if and only if w goes to q, and all three of these strings fail to be in X if and only if w goes to d. The final state f is distinguished from each of these four by the empty string.
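The distinguishing suffixes can be tabulated mechanically. In this Python sketch each state gets a "signature" recording which of the suffixes ε, c, bc, and b lead to the final state; five different signatures mean that no two states are equivalent. The names DELTA, run, and signature are my own.

```python
# DFA for X: the five listed transitions, everything else goes to d.
DELTA = {
    ("i", "a"): "p", ("p", "b"): "i", ("i", "c"): "f",
    ("f", "a"): "q", ("q", "b"): "f",
}
FINAL = {"f"}
STATES = ["i", "p", "f", "q", "d"]
SUFFIXES = ["", "c", "bc", "b"]


def run(state, s):
    # State reached by reading s starting from `state`
    for x in s:
        state = DELTA.get((state, x), "d")
    return state


signature = {
    q: tuple(run(q, z) in FINAL for z in SUFFIXES)
    for q in STATES
}
for q in STATES:
    print(q, signature[q])

all_distinct = len(set(signature.values())) == len(STATES)
print(all_distinct)
```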
Question 11 (10): Describe a PDA M_G such that L(M_G) = L(G), where G is the grammar given above. You may use any valid method.
The PDA given by the top-down parser construction in the text would have states {i, p, f} (plus extra states to implement the multiple pushes in some of the transitions), stack alphabet {$, S, T, a, b, c}, start state i, only final state f, and transitions (i, ε, ε; p, S$), (p, a, a; p, ε), (p, b, b; p, ε), (p, ε, S; p, aTb), (p, ε, T; p, bSa), (p, ε, S; p, c), and (p, ε, $; f, ε).
There are also simpler PDA's that push a symbol for each ab pair before the c, change states on the c, and pop a symbol for each ab pair after the c. They would also push a $ at the start and pop it at the end, to make sure that acceptance occurs only with an empty stack.
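One of the simpler PDA's can be sketched as a deterministic simulation in Python. The details here are assumptions made for illustration: the state names i, p, f, q, the stack symbol P pushed for each ab pair, and acceptance when the input is exhausted in state f with only the $ left to pop.

```python
from itertools import product


def in_LG(s):
    # L(G) = {(ab)^i c (ab)^i : i >= 0}
    left, sep, right = s.partition("c")
    return sep == "c" and left == right and left == "ab" * (len(left) // 2)


def pda_accepts(s):
    stack = ["$"]          # bottom marker pushed at the start
    state = "i"
    for x in s:
        if state == "i" and x == "a":
            state = "p"                    # first half of an ab pair
        elif state == "p" and x == "b":
            stack.append("P")              # push one symbol per ab pair
            state = "i"
        elif state == "i" and x == "c":
            state = "f"                    # change states on the c
        elif state == "f" and x == "a":
            state = "q"
        elif state == "q" and x == "b" and stack[-1] == "P":
            stack.pop()                    # pop one symbol per ab pair
            state = "f"
        else:
            return False                   # no move available
    # Accept only if the final pop of $ would leave an empty stack.
    return state == "f" and stack == ["$"]


agree = all(
    pda_accepts("".join(t)) == in_LG("".join(t))
    for n in range(10)
    for t in product("abc", repeat=n)
)
print(agree)
```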
Question 12 (15): Let L₁, a subset of {a, b, c}^*, be the set of strings that contain exactly one c.
- (a, 5) Build a DFA D₁ such that L(D₁) = L₁.
  There is a three-state minimal DFA with state set {i, f, d}, start state i, only final state f, and transitions (i, a, i), (i, b, i), (i, c, f), (f, a, f), (f, b, f), (f, c, d), (d, a, d), (d, b, d), and (d, c, d).
- (b, 10) Use the construction given in lecture and in the text to create a regular expression for the language L₁ from your DFA.
  We first add a new start state i' and a new final state f', with ε-moves from i' to i and from f to f'. We now have five states. We eliminate d and need no moves to replace it as it has no edges out. We can then eliminate i, creating one new transition (i', (a ∪ b)^* c, f). We then eliminate f, leaving the single transition (i', (a ∪ b)^*c(a ∪ b)^*, f'). The label on this transition is our final regular expression.
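The final expression can be compared against D₁ on all short strings. In the Python sketch below the character class [ab] stands in for (a ∪ b), and the names DELTA and d1_accepts are illustrative. The check confirms the result of the state elimination but does not carry out the elimination itself.

```python
import re
from itertools import product

# Transition table of the three-state DFA D1 from part (a)
DELTA = {
    ("i", "a"): "i", ("i", "b"): "i", ("i", "c"): "f",
    ("f", "a"): "f", ("f", "b"): "f", ("f", "c"): "d",
    ("d", "a"): "d", ("d", "b"): "d", ("d", "c"): "d",
}


def d1_accepts(s):
    state = "i"
    for x in s:
        state = DELTA[(state, x)]
    return state == "f"


# Python spelling of (a ∪ b)*c(a ∪ b)*
REGEX = re.compile(r"[ab]*c[ab]*")

agree = all(
    d1_accepts("".join(t)) == (REGEX.fullmatch("".join(t)) is not None)
    for n in range(9)
    for t in product("abc", repeat=n)
)
print(agree)
```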

Last modified 26 February 2016