CMPSCI 501: Theory of Computation

Solutions to First Midterm Exam, Spring 2014

David Mix Barrington

19 February 2014

Directions:

Answer the problems on the exam pages.
There are nine problems (some with multiple parts) for 125 total points plus 10 extra credit. Actual scale was A = 115, C = 70.
If you need extra space use the back of a page.
No books, notes, calculators, or collaboration.
The first six questions are statements -- in each case say whether the statement is true or false and give a convincing justification of your answer -- a proof, counterexample, quotation from the book or from lecture, etc. You get five points for the correct boolean answer (so there is no reason not to guess if you don't know) and up to five for the justification.

  Q1: 10 points
  Q2: 10 points
  Q3: 10 points
  Q4: 10 points
  Q5: 10 points
  Q6: 10 points
  Q7: 40 points
  Q8: 25 points
  Q9: +10 points
 Total: 125+10 points

Question text is in black, solutions in blue.

The language X is the set of all strings in {a, b}^* in which every b has an a immediately before it and an a immediately after it.

If w is any string, w^R is w written backwards. For example, (abaab)^R = baaba.

The function f from strings in {a, b}^* to strings in {a, b}^* replaces each a in the input with ab and each b in the input with ba. Thus, f(aba) = abbaab.

Language Y is the set of all strings of the form w(f(w))^R where w is any string in {a, b}^*. For example, taking w = aba, we see that ababaabba is in Y because it is the concatenation of w = aba and f(w)^R = baabba.

Language Z is the set of all strings of the form wf(w) where w is any string in {a, b}^*. For example, taking w = aba, we find that abaabbaab is in Z.
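The definitions above translate directly into a few lines of Python, which can be handy for checking small examples. This is only a sketch: the names f, in_X, in_Y, and in_Z are chosen here for illustration and are not part of the exam.

```python
def f(w):
    """Replace each a with ab and each b with ba."""
    return "".join("ab" if c == "a" else "ba" for c in w)

def in_X(s):
    """Every b has an a immediately before it and immediately after it."""
    return all(
        0 < k < len(s) - 1 and s[k - 1] == "a" and s[k + 1] == "a"
        for k, c in enumerate(s) if c == "b"
    )

def in_Y(s):
    """s = w + reverse(f(w)) for some w; w must be the first third of s."""
    n, r = divmod(len(s), 3)
    return r == 0 and s[n:] == f(s[:n])[::-1]

def in_Z(s):
    """s = w + f(w) for some w; w must be the first third of s."""
    n, r = divmod(len(s), 3)
    return r == 0 and s[n:] == f(s[:n])

print(f("aba"))            # abbaab
print(in_Y("ababaabba"))   # True
print(in_Z("abaabbaab"))   # True
```

Because f doubles the length of its input, a candidate string for Y or Z can only split one way, which is why both membership tests simply cut the string at the one-third mark.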

The NFA N has input alphabet {a}, state set {i, p, q, r, s, t}, final state set {i, p, r, s}, and transition function δ defined as follows: δ(i, a) = {p, r}, δ(p, a) = {q}, δ(q, a) = {p}, δ(r, a) = {s}, δ(s, a) = {t}, δ(t, a) = {r}, and for all states u, δ(u, ε) = ∅. This last definition means that there are no ε-moves in N.

Question 1 (10): True or false with justification: The language Y, defined above, is regular.
FALSE. Y is not regular. If i and j are two different non-negative integers, then u = aⁱ and v = a^j are Y-distinguishable. Letting z be (ba)ⁱ, we see that uz is in Y. But vz is not in Y because a string in a^*(ba)^* can be in Y only if the number of a's exactly equals the number of ba's. (If there are too many a's, the first one past the 1/3 mark of the string does not match the a before it. If there are too few a's, there is a b in the first third without its corresponding ab later.)
Hence the set {ε, a, aa, aaa,...} is an infinite set of pairwise Y-distinguishable strings, and Y is not regular by Myhill-Nerode.
A pumping lemma proof could choose w = a^p(ba)^p, so that pumping would change the number of a's which would take the string out of Y for the reasons outlined above.
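The distinguishability claim can be spot-checked by brute force for small exponents. The sketch below is not a proof, and the bound of 8 is an arbitrary choice made here; it only confirms that a^j(ba)^i lands in Y exactly when i = j within that range.

```python
def f(w):
    """Replace each a with ab and each b with ba."""
    return "".join("ab" if c == "a" else "ba" for c in w)

def in_Y(s):
    """s = w + reverse(f(w)), where w is the first third of s."""
    n, r = divmod(len(s), 3)
    return r == 0 and s[n:] == f(s[:n])[::-1]

BOUND = 8  # arbitrary small bound; the written argument covers all i and j

# For each i, the suffix z = (ba)^i separates a^i from every other a^j.
results = {
    (i, j): in_Y("a" * j + "ba" * i)
    for i in range(BOUND)
    for j in range(BOUND)
}
print(all(member == (i == j) for (i, j), member in results.items()))   # True
```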
Question 2 (10): True or false with justification: The language Z, defined above, is context-free.
FALSE. Z is not context-free, which we can show by the pumping lemma for context-free languages (the CFLPL) using w = a^pb^p(ab)^p(ba)^p, where p is the alleged pumping length. (Note that taking the simpler w = a^p(ab)^p does not work, because it is possible to take u = a^(p-1), v = a, x = ε, y = ab, and z = (ab)^(p-1), whereupon uvⁱxyⁱz is in Z for all nonnegative integers i.)
We will actually show that the language Z', defined to be the intersection of Z with a^*b^*(ab)^*(ba)^*, does not satisfy the CFLPL with this choice of w. This will prove that Z is not a CFL: if it were, then Z' would also be a CFL, as the intersection of a CFL and a regular language.
First note that w is in Z', as the concatenation of the string a^pb^p with its image under f. We must show that any way to write w as uvxyz fails to meet the terms of the CFLPL. If either v or y straddles two of the four blocks of w, pumping up will yield a string that is not in Z' because it is not in the regular language. If they are each in a single block, they must be in adjacent blocks because |vxy| ≤ p. This leaves three cases, and in each case pumping down makes either the number of a's differ from the number of ab's, or the number of b's differ from the number of ba's, or both.
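To see concretely why the simpler string a^p(ab)^p is a poor choice, the following sketch pumps it using the decomposition given in the note and confirms that every pumped string stays in Z. The value p = 5 is an arbitrary stand-in for the pumping length, picked here only for illustration.

```python
def f(w):
    """Replace each a with ab and each b with ba."""
    return "".join("ab" if c == "a" else "ba" for c in w)

def in_Z(s):
    """s = w + f(w), where w is the first third of s."""
    n, r = divmod(len(s), 3)
    return r == 0 and s[n:] == f(s[:n])

p = 5  # illustrative stand-in for the pumping length (any p >= 3 works here)
u, v, x, y, z = "a" * (p - 1), "a", "", "ab", "ab" * (p - 1)

# The pieces really do decompose a^p (ab)^p within the lemma's constraints.
assert u + v + x + y + z == "a" * p + "ab" * p
assert len(v + x + y) <= p and len(v + y) > 0

# Pumping gives a^(p-1+i) (ab)^(p-1+i), which is always in Z.
pumped = [u + v * i + x + y * i + z for i in range(6)]
print(all(in_Z(s) for s in pumped))   # True

# The string actually used in the solution is in Z, as claimed.
w = "a" * p + "b" * p + "ab" * p + "ba" * p
print(in_Z(w))   # True
```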
Question 3 (10): True or false with justification: The language {w: f(w) ∈ X} is regular. (The function f and the language X are defined above.)
TRUE. X is regular by the result of 7(a), so let M be its DFA. Let M' be an NFA with the same state set, start state, and final state set as M. Define, for each state s in M', δ'(s, a) = δ^*(s, ab) and δ'(s, b) = δ^*(s, ba). (Here δ* is the extension of δ to operate on words.) With δ' as the transition function of M', on any input w M' will go to the same state that M would reach on input f(w). So w is in L(M') if and only if f(w) is in L(M), and thus L(M') is the given language, which is thus proved to be regular.
Another approach, simpler in this case, is to observe that no nonempty string can be in the given language, since any nonempty string w either begins with b (so that f(w) begins with b), ends in a (so that f(w) ends with b), or has an ab substring (so that f(w) has an abba substring). The empty string is in this language since f(ε) = ε is in X, so the given language is the regular language ∅^* = {ε}.
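Both approaches can be checked against each other. The sketch below builds the transition function of M' from the four-state DFA of 7(a) and lists the strings of length at most 6 that M' accepts. The dictionary encoding and the helper name run are choices made here for illustration.

```python
from itertools import product

# DFA M for X from 7(a): state 1 is the start state, states 1 and 2 are final.
DELTA = {(1, "a"): 2, (1, "b"): 4, (2, "a"): 2, (2, "b"): 3,
         (3, "a"): 2, (3, "b"): 4, (4, "a"): 4, (4, "b"): 4}
START, FINAL = 1, {1, 2}

def run(delta, state, word):
    """Extended transition function: follow delta letter by letter."""
    for c in word:
        state = delta[(state, c)]
    return state

# M' reads a where M would read ab, and b where M would read ba.
IMAGE = {"a": "ab", "b": "ba"}
DELTA_PRIME = {(s, c): run(DELTA, s, IMAGE[c])
               for s in (1, 2, 3, 4) for c in "ab"}

accepted = [
    "".join(t)
    for n in range(7)
    for t in product("ab", repeat=n)
    if run(DELTA_PRIME, START, t) in FINAL
]
print(accepted)   # ['']
```

From the start state, M' moves to state 3 on a and to state 4 on b, and neither of those can reach a final state again, which matches the observation that only the empty string is in the language.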
Question 4 (10): True or false with justification: The language {f(w): w ∈ X} is not regular. (Again, f and X are defined above.)
FALSE. To get a regular expression for this language, we need only take a regular expression for X (from 7(c)) and "apply f to it", changing all its a's to ab's and all its b's to ba's. Of course, the new ab's and ba's should be each enclosed in parentheses.
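As a sanity check of "applying f to the expression", the sketch below assumes the expression ε ∪ a(a ∪ ba)^* for X (verified against the definition of X inside the code), rewrites it with ab for a and ba for b, and compares the result with the true image of X on short strings. The Python re syntax and the length bounds are choices made here.

```python
import re
from itertools import product

def f(w):
    """Replace each a with ab and each b with ba."""
    return "".join("ab" if c == "a" else "ba" for c in w)

def in_X(s):
    """Every b has an a immediately before it and immediately after it."""
    return all(0 < k < len(s) - 1 and s[k - 1] == "a" and s[k + 1] == "a"
               for k, c in enumerate(s) if c == "b")

X_RE = re.compile(r"(a(a|ba)*)?")                 # epsilon U a(a U ba)*
FX_RE = re.compile(r"((ab)((ab)|(ba)(ab))*)?")    # same shape, with f applied

words = ["".join(t) for n in range(6) for t in product("ab", repeat=n)]
# The assumed expression for X agrees with the definition on short strings.
assert all(bool(X_RE.fullmatch(w)) == in_X(w) for w in words)

# Images of all strings in X of length <= 5 have length <= 10.
image = {f(w) for w in words if in_X(w)}
targets = ["".join(t) for n in range(11) for t in product("ab", repeat=n)]
print(all(bool(FX_RE.fullmatch(s)) == (s in image) for s in targets))   # True
```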
Question 5 (10): True or false with justification: Let M be a pushdown automaton that follows the rules for our construction of an equivalent context-free grammar. (It has exactly one final state that is not its start state, it accepts only with empty stack, and every transition either pushes or pops a letter but not both.) Assume in addition that M reads exactly one input letter on each transition. Then every string in L(M) has even length.
TRUE. In order to accept a string, the computation of the PDA must end with an empty stack, so that it has an equal number of pushes and pops. The number of input letters it reads is equal to the total number of transitions by the additional assumption, and this number is the number of pushes plus the number of pops since each transition is a push or a pop but not both. The length of the accepted input is thus twice the number of pushes, an even number.
Question 6 (10): True or false with justification: There is an example of a string aⁱ that is in the language L(N) while the string aⁱ⁺⁶ is not. (The NFA N is defined above.)
TRUE. The string a⁰ = ε is in L(N) since the start state is final. But for a⁶ there are exactly two possible paths from the start state in N, one leading to the nonfinal state q and the other to the nonfinal state t, so that a⁶ is not in L(N). For every positive integer i, the only length-6 paths following a path on aⁱ must lead back to that same state (as it is not state i), so aⁱ and aⁱ⁺⁶ are either both in or both out of L(N).
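The behavior of N on short inputs can be confirmed by direct simulation. This is a minimal sketch that tracks the set of states reachable after each letter; the function name accepts and the range of lengths tested are choices made here.

```python
# Transition function of N on the single letter a (there are no epsilon-moves).
DELTA = {"i": {"p", "r"}, "p": {"q"}, "q": {"p"},
         "r": {"s"}, "s": {"t"}, "t": {"r"}}
FINAL = {"i", "p", "r", "s"}

def accepts(k):
    """Return True if N accepts a^k, tracking the set of reachable states."""
    current = {"i"}
    for _ in range(k):
        current = set().union(*(DELTA[s] for s in current))
    return bool(current & FINAL)

rejected = [k for k in range(25) if not accepts(k)]
print(rejected)                 # [6, 12, 18, 24]
print(accepts(0), accepts(6))   # True False
```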
Question 7 (40): These questions deal with finite automaton and regular language constructions. The language X and the NFA N are defined above.
- (a, 10) Build a DFA whose language is X. You may use any method, but be sure that your DFA is correct.
  The minimal DFA for X has start state 1 (final) with a-arrow to 2 and b-arrow to 4, state 2 (final) with a-arrow to 2 and b-arrow to 3, state 3 (non-final) with a-arrow to 2 and b-arrow to 4, and death state 4 (non-final) with both arrows to itself.
  There are other correct DFA's with more states. Many people gave DFA's that either failed to accept the empty string (which has no b's and thus no bad b's) or failed to accept ababa.
- (b, 10) Find a set of four strings that are pairwise X-distinguishable, and show that they have this property. Explain (without proof) what the existence of this set implies about the set of correct DFA's for the language X. (Hint: You must choose two strings in X and two strings not in X.)
  Any such set must include two strings in X, one of them the empty string, and two strings not in X, such as b and ab. No string in X can be equivalent to one not in X, since we can distinguish them with z = ε. So to show that {ε, a, b, ab} is pairwise X-distinguishable, we need only distinguish ε from a (with the string z = ba, for example) and distinguish b from ab (with the string z = a, for example).
  The existence of this set implies that every correct DFA for X has at least four states, by the Myhill-Nerode Theorem.
- (c, 10) Find a regular expression for X, using your DFA from part (a).
  Applying the construction to the DFA in 7(a), we add a new final state 5 with new ε-moves from 1 to 5 and from 2 to 5. Since there are no moves into 1, it may remain the start state. We first kill 4, which leads to no new moves because 4 had no move out of it. We then kill 3, leading to a ba-move from 2 to 2 (the b-arrow from 2 to 3 followed by the a-arrow from 3 to 2). This merges with the existing a-loop on 2 to make an (a ∪ ba)-loop. Now killing state 2 makes a new move from 1 to 5 with label a(a ∪ ba)^*. This merges with the existing ε-move from 1 to 5 to give us the final regular expression, ε ∪ a(a ∪ ba)^*.
  I took points off if there was no derivation of the expression from the DFA, even if it was correct for the language. I was fairly generous with incorrect or nonstandard derivations that led to correct expressions.
- (d, 10) Using the Subset Construction, find a DFA equivalent to the NFA N.
  Start state {i} has a-move to {p, r} which has a-move to {q, s} which has a-move to {p, t} which has a-move to {q, r} which has a-move to {p, s} which has a-move to {q, t} which has a-move to {p, r}. The process stops with seven of the possible 64 states. The only non-final state is {q, t}, so we can see that this DFA accepts a^k for every k except the positive multiples of 6.
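The Subset Construction in part (d) can be reproduced mechanically. The sketch below explores only the subsets reachable from {i}, which is how the hand computation proceeds; the function name subset_construction and the dictionary encoding of N are choices made here.

```python
from collections import deque

# Transition function of N on the single letter a (there are no epsilon-moves).
DELTA = {"i": {"p", "r"}, "p": {"q"}, "q": {"p"},
         "r": {"s"}, "s": {"t"}, "t": {"r"}}
FINAL = {"i", "p", "r", "s"}

def subset_construction(start):
    """Build only the DFA states reachable from {start} on the letter a."""
    start_set = frozenset({start})
    dfa_delta = {}
    queue = deque([start_set])
    while queue:
        state = queue.popleft()
        if state in dfa_delta:
            continue  # this subset has already been expanded
        target = frozenset(t for s in state for t in DELTA[s])
        dfa_delta[state] = target
        queue.append(target)
    return start_set, dfa_delta

start_set, dfa_delta = subset_construction("i")
for state, target in dfa_delta.items():
    kind = "final" if state & FINAL else "non-final"
    print(sorted(state), "->", sorted(target), kind)
print(len(dfa_delta))   # 7
```

A DFA state is final when it contains at least one final state of N, so the only non-final subset found is {q, t}.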
Question 8 (25): Here are three questions about the languages Y and Z, which are defined above.
- (a, 5) Give three examples of six-letter strings in Y and three examples of six-letter strings in Z. Explain (without proof) exactly which non-negative integers can be the lengths of strings in Y, and which can be the lengths of strings in Z.
  In Y: Choose from aababa, ababba, babaab, and bbabab.
  In Z: Choose from aaabab, ababba, babaab, and bbbaba.
  Since the lengths of f(w) and f(w)^R are each twice the length of w, the length of any string in either language is three times the length of the string w that put it in the language. Thus the length of a string in Y or Z must be a multiple of 3, and any such multiple is a possible length of a string in either language.
- (b, 10) Give a context-free grammar whose language is Y. You should justify your answer, but you need not prove that your grammar is correct.
  The simplest grammar has three rules: S → aSba, S → bSab, and S → ε. To generate the string in Y arising from any string w, we apply the first two rules once to generate each letter of w to the left of the nonterminal, starting with the first letter of w. These rules generate f(w)^R to the right of the nonterminal, so that applying the third rule completes a derivation of the string w(f(w))^R. And any complete derivation in this grammar must be of this form for some w, since the third rule may be used only once, at the end of the derivation. So a string is derivable from this grammar if and only if it is in Y.
- (c, 10) Describe a pushdown automaton whose language is Y, either by construction from your answer to part (b) or otherwise.
  The construction gives us a PDA with three states, plus additional states to implement rules with multiple pushes. The start state has a single move to the middle state, reading nothing, popping nothing, and pushing S$. There are five moves from the middle state to itself, one reading and popping a, one reading and popping b, and three that read nothing and pop S, pushing aSba, bSab, and ε respectively. Finally there is a move from the middle state to the final state, reading nothing, popping $, and pushing nothing. As shown in the text, this PDA accepts exactly the strings in L(G) where G is the grammar.
  A simpler PDA pushes a $ at the beginning, jumps to a state where it reads letters from the input and pushes them onto the stack, jumps to another state where it reads ab's and ba's from the input, popping a b from the stack for each ab and an a from the stack for each ba, and finally jumps to the final state while popping a $. This PDA can accept only by reading and pushing the first third of the string, which we may call w, then jumping to the next state and clearing the stack while reading f(w)^R. Note that the $ business is necessary so that we can't accept with a nonempty stack after having read only a prefix of the proper string.
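The three parts of this question can be cross-checked in a short script. The sketch below lists the six-letter strings of part (a), follows the grammar rules of part (b) for a given w, and simulates the simpler PDA of part (c) by trying every point at which it might stop pushing. The function names and the length bound of 9 are choices made here, and the simulation is only one way to model the machine's nondeterminism.

```python
from itertools import product

def f(w):
    """Replace each a with ab and each b with ba."""
    return "".join("ab" if c == "a" else "ba" for c in w)

def in_Y(s):
    """s = w + reverse(f(w)), where w is the first third of s."""
    n, r = divmod(len(s), 3)
    return r == 0 and s[n:] == f(s[:n])[::-1]

def derive(w):
    """Apply S -> aSba or S -> bSab once per letter of w, then S -> epsilon."""
    left, right = "", ""
    for c in w:
        left += c
        right = ("ba" if c == "a" else "ab") + right
    return left + right

def pda_accepts(s):
    """Simulate the simpler PDA, guessing where the push phase ends."""
    pop_for = {"ab": "b", "ba": "a"}   # reading ab pops b, reading ba pops a
    for split in range(len(s) + 1):
        stack = ["$"] + list(s[:split])            # push phase
        rest = s[split:]
        pairs = [rest[k:k + 2] for k in range(0, len(rest), 2)]
        ok = True
        for pair in pairs:                         # pop phase
            if pop_for.get(pair) != stack[-1]:
                ok = False
                break
            stack.pop()
        if ok and stack == ["$"]:                  # final move pops the $
            return True
    return False

six_Y = sorted(derive("".join(t)) for t in product("ab", repeat=2))
six_Z = sorted("".join(t) + f("".join(t)) for t in product("ab", repeat=2))
print(six_Y)   # ['aababa', 'ababba', 'babaab', 'bbabab']
print(six_Z)   # ['aaabab', 'ababba', 'babaab', 'bbbaba']

short = ["".join(t) for n in range(10) for t in product("ab", repeat=n)]
print(all(pda_accepts(s) == in_Y(s) for s in short))   # True
```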
Question 9 (10 extra credit): The function f defined above is an example of a homomorphism, a function from strings to strings that satisfies the rule f(uv) = f(u)f(v) for all strings u and v. (A homomorphism is completely determined by its outputs for single-letter inputs.)
The homomorphic image of a language L under a homomorphism f is the set {f(w): w ∈ L}. The inverse homomorphic image of L under f is the set {w: f(w) ∈ L}.
If L is a context-free language and f is any homomorphism, is its homomorphic image guaranteed to be a context-free language? Justify your answer.
If L is a CFL and f is a homomorphism, is its inverse homomorphic image guaranteed to be a context-free language? Justify your answer.
Both languages are guaranteed to be context-free. For the homomorphic image, it is easy to convert a grammar for L into a grammar for {f(w): w ∈ L} by "applying f to the grammar" as in Question 4. We just replace every terminal c on the right-hand side of a rule with the string f(c). Then every derivation of a string w in the old grammar becomes a derivation of f(w) in the new grammar.
For the other language, we convert a PDA M for L into a PDA M' for {w: f(w) ∈ L}. M' will have the same states, start state, final state, and stack alphabet as M, except that it will have some extra states as follows. For every state p of M and every letter c, determine every possible action of M on reading the string f(c). For each of these actions, make a chain of transitions in M' that has the same effect on the stack but reads only the single letter c. Thus any complete accepting computation in M' follows a sequence of such chains, reading a word w from the input, such that there is a corresponding computation in M that reads the string f(w). Similarly, any computation in M that reads f(w) can be broken into a chain of actions for each substring f(c), such that a corresponding chain of actions can be taken in M' while reading w. In summary, L(M') is exactly {w: f(w) ∈ L(M)}.

Last modified 23 February 2014