# Solutions to First Midterm Exam, Spring 2011

### Directions:

• Answer the problems on the exam pages.
• There are eight problems for 120 total points. Actual scale was A=112, C=76.
• If you need extra space use the back of a page.
• No books, notes, calculators, or collaboration.
• The first six questions are true/false, with five points for the correct boolean answer and up to five for a correct justification of your answer -- a proof, counterexample, quotation from the book or from lecture, etc. -- note that there is no reason not to guess if you don't know.

```
Q1: 10 points
Q2: 10 points
Q3: 10 points
Q4: 10 points
Q5: 10 points
Q6: 10 points
Q7: 20 points
Q8: 40 points

Total: 120 points
```

Question text is in black, solutions in blue.

Some questions refer to the context-free grammar G which has rules S → aSb, S → bSa, and S → c.

Several questions also refer to the context-free grammar H which has rules S → aS, S → bS, S → Sa, S → Sb, and S → c.

The regular language X is given by the regular expression a*b*.

The regular language Y is given by the regular expression Σ*baΣ*.

• Question 1 (10): True or false with justification: Any context-free language is the language of some PDA that has exactly one final state and that accepts only when its stack is empty.

TRUE. By definition, any CFL is the language of some context-free grammar G. If we construct a top-down parser for G (as in Sipser's proof that a PDA with language L(G) exists), then this is a PDA that has exactly one final state and accepts only when its stack is empty. The bottom-up parser constructed in lecture has these two properties as well.

• Question 2 (10): True or false with justification: The language L(G), where the grammar G is defined above, is the language of some DFA.

FALSE. For any naturals i and j with i ≠ j, the string a^i c b^i is in L(G), because we can apply the rule S → aSb i times and then the rule S → c once. The string a^j c b^i is not in L(G), because every string in L(G) has c in its exact middle position.

So the set W = {a^i : i ≥ 0} is an infinite set of pairwise L(G)-distinguishable strings. (We call it W to avoid confusion with the regular language X defined above.) Given two strings x = a^i and y = a^j in W with i ≠ j, let z = c b^i. Then xz = a^i c b^i is in L(G), and yz = a^j c b^i is not. Since this set exists, L(G) cannot be the language of any DFA by the Myhill-Nerode Theorem.
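The distinguishability argument can be spot-checked mechanically. The following is a minimal sketch, not part of the original solution: the helper names `in_LG` and `distinguishes` are hypothetical, and the membership test simply peels matching outer letters according to the three rules of G.

```python
def in_LG(s):
    """Membership test for L(G), where G has rules S -> aSb, S -> bSa, S -> c."""
    if s == "c":
        return True
    # A longer string must come from S -> aSb or S -> bSa,
    # so its first and last letters are one 'a' and one 'b'.
    if len(s) >= 3 and {s[0], s[-1]} == {"a", "b"}:
        return in_LG(s[1:-1])
    return False


def distinguishes(i, j):
    """Check that the suffix z = c b^i separates x = a^i from y = a^j."""
    z = "c" + "b" * i
    return in_LG("a" * i + z) and not in_LG("a" * j + z)


# Check every pair i != j in a small range (a finite sample, not a proof).
all_ok = all(distinguishes(i, j) for i in range(8) for j in range(8) if i != j)
print(all_ok)
```

The check only samples finitely many pairs; the proof above is what covers all i ≠ j.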

• Question 3 (10): True or false with justification: The language L(H), where the grammar H is defined above, is the language of some DFA.

TRUE. I claim that L(H) is the language of the regular expression R = (a ∪ b)*c(a ∪ b)* and thus has a DFA by Kleene's Theorem.

To see that L(H) ⊆ L(R), note that every rule except S → c preserves the invariant that the string derived is in (a ∪ b)*S(a ∪ b)*, and this other rule can only be used once, as the last step of a derivation, and thus produces a string in L(R).

To see that L(R) ⊆ L(H), let ucv, with u and v in (a ∪ b)*, be an arbitrary string in L(R). We can derive ucv in H by |u| uses of the rules S → aS or S → bS, followed by |v| uses of the rules S → Sa or S → Sb, followed by one use of the rule S → c.
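As a sanity check on the claim L(H) = L(R), one can enumerate both languages up to a small length bound and compare. This is only a sketch under assumptions: the function names `derive_H` and `regex_R` are hypothetical, and the comparison covers short strings only, so it supports the proof rather than replacing it.

```python
import re
from itertools import product


def derive_H(max_len):
    """All terminal strings of length <= max_len derivable in H."""
    seen = {"S"}
    frontier = ["S"]
    words = set()
    while frontier:
        form = frontier.pop()
        # Every sentential form of H contains exactly one S.
        for rhs in ("aS", "bS", "Sa", "Sb", "c"):
            new = form.replace("S", rhs, 1)
            if len(new) > max_len or new in seen:
                continue
            seen.add(new)
            if "S" in new:
                frontier.append(new)
            else:
                words.add(new)
    return words


def regex_R(max_len):
    """All strings over {a, b, c} of length <= max_len matching (a ∪ b)*c(a ∪ b)*."""
    pattern = re.compile(r"[ab]*c[ab]*")
    return {
        w
        for n in range(max_len + 1)
        for w in map("".join, product("abc", repeat=n))
        if pattern.fullmatch(w)
    }


print(derive_H(5) == regex_R(5))
```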

• Question 4 (10): True or false with justification: Neither of the languages L(G) and L(H) is contained in the other.

FALSE. L(G) is contained in L(H), because every rule of G can be simulated by rules of H. We simulate S → aSb in H by taking S to aS and then to aSb, and similarly we simulate S → bSa in H by taking S to bS and then to bSa. The other rule of G, S → c, is also a rule of H. So every string derivable in G is also derivable in H.
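The containment can also be illustrated on small cases. The sketch below assumes the characterization of L(H) as (a ∪ b)*c(a ∪ b)* from Question 3; the helper `words_G` is a hypothetical name that generates the strings of L(G) obtained with a bounded number of wrapping steps.

```python
import re


def words_G(depth):
    """Strings of L(G) derivable with at most `depth` uses of S -> aSb or S -> bSa."""
    words = {"c"}
    for _ in range(depth):
        # Wrap every string found so far as a w b or b w a.
        words |= {x + w + y for w in words for x, y in (("a", "b"), ("b", "a"))}
    return words


# L(H) as a regular expression, taken from the solution to Question 3.
R = re.compile(r"[ab]*c[ab]*")

print(all(R.fullmatch(w) for w in words_G(4)))
```

The containment is proper: for example, the string ca matches R but is not produced by `words_G`, since every string of L(G) has c in its exact middle.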

• Question 5 (10): True or false with justification: The languages X and Y defined above each have at most three classes in their Myhill-Nerode equivalence relations.

TRUE. We can exhibit a three-state DFA for each language, proving that each language has at most three Myhill-Nerode classes. (In fact these two DFA's are minimal, so there are exactly three classes for each language, but you did not need to show this.) The DFA for X has states 1, 2, and 3, start state 1, final states 1 and 2, a-arrows from 1 to 1, 2 to 3, and 3 to 3, and b-arrows from 1 to 2, 2 to 2, and 3 to 3. It is easy to see that this DFA accepts any string of a's followed by any string of b's, but rejects any string that is not of this form. The DFA for Y is actually identical except that the only final state is 3. Here we see that the first b in the string (if any) takes the DFA to state 2, and the first a after that b takes it to 3, so we get to 3 if and only if we have an a after a b, which is true if and only if we are in Y.
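The two DFA's share one transition table and differ only in their final states, which a short simulation makes concrete. This is a sketch with hypothetical names (`DELTA`, `run`, `check`); it takes state 1 to loop on a, as a*b* requires, and compares the machines against direct tests for a*b* and for containing the substring ba.

```python
import re
from itertools import product

# Shared transitions: state 1 = only a's so far, state 2 = a's then at least
# one b, state 3 = some a has appeared after a b.
DELTA = {
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 3, (2, "b"): 2,
    (3, "a"): 3, (3, "b"): 3,
}


def run(w):
    """Return the state reached from start state 1 on input w."""
    state = 1
    for ch in w:
        state = DELTA[(state, ch)]
    return state


def accepts_X(w):
    return run(w) in {1, 2}


def accepts_Y(w):
    return run(w) == 3


def check(max_len):
    """Compare both DFA's with direct tests on all strings up to max_len."""
    for n in range(max_len + 1):
        for w in map("".join, product("ab", repeat=n)):
            if accepts_X(w) != bool(re.fullmatch(r"a*b*", w)):
                return False
            if accepts_Y(w) != ("ba" in w):
                return False
    return True


print(check(8))
```

Because the final-state sets {1, 2} and {3} are complementary, the simulation also shows that Y is exactly the complement of X over Σ = {a, b}.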

• Question 6 (10): True or false with justification: If M1, M2, and M3 are any three PDA's with languages L1, L2, and L3 respectively, then the language [L1(L2 ∪ L3)]* is the language of some PDA. (Here we are using the operations of concatenation and Kleene star on languages, even if those languages are not given by regular expressions.)

TRUE. The easiest proof is to note that each of these PDA's has an equivalent grammar, and we proved on the homework (by three simple constructions) that the languages of grammars are closed under union, concatenation, and Kleene star. So there is a grammar for the desired language, meaning that there must be a PDA for it.

Another proof would proceed directly from the PDA's, combining them by constructions similar to those we used for NFA's. But this construction must use the fact (proved in Question 1 of this exam) that any PDA has an equivalent PDA that accepts only on empty stack. Otherwise, we could not necessarily connect two PDA's in series, because the first might leave something on the stack that would interfere with the second.

• Question 7 (20):

• (a,5) Prove (by any method) that the string abacbab is in L(G).

We can exhibit the derivation S → aSb → abSab → abaSbab → abacbab, or equivalently draw a parse tree for this derivation.

• (b,5) Describe the language L(G) in English.

L(G) consists of all strings ucv where u and v are each in (a ∪ b)*, u and v are of the same length, and v is obtained from u by first taking the reversal of u and then switching its a's and b's.

• (c,10) Describe (by any method) a PDA whose language is L(G). If you do not use a standard construction, argue that your PDA has exactly this language.

Of course we could create a top-down or bottom-up parser, but here is a perhaps simpler PDA constructed directly. The start state is 1, and has a single transition (ε, ε → \$) to state 2. State 2 has two loops (a, ε → a) and (b, ε → b), and a transition (c, ε → ε) to state 3. State 3 has two loops (a, b → ε) and (b, a → ε), and a transition (ε, \$ → ε) to state 4. State 4 is the only final state and has no transitions.

The only way this PDA can accept is to put a \$ on the stack, read a string u of a's and b's and put it on the stack, read a c, read a string v of a's and b's which must be the reverse of u with a's and b's swapped, then pop the \$. This is clearly possible if and only if the input is a string ucv as described in part (b) of this question, which is true if and only if the input is in L(G).
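Since this PDA never has a real choice of move while input remains, it can be simulated with an ordinary stack. The following is a sketch only: `pda_accepts` is a hypothetical name, and the two ε-moves (pushing the marker at the start, popping it at the end) are folded into the initialization and the final test.

```python
def pda_accepts(w):
    """Simulate the four-state PDA for L(G) on input w."""
    stack = ["$"]   # state 1 -> 2: push the bottom-of-stack marker
    state = 2
    swap = {"a": "b", "b": "a"}
    for ch in w:
        if state == 2 and ch in swap:
            stack.append(ch)        # loops (a, ε → a) and (b, ε → b)
        elif state == 2 and ch == "c":
            state = 3               # transition (c, ε → ε)
        elif state == 3 and ch in swap and stack[-1] == swap[ch]:
            stack.pop()             # loops (a, b → ε) and (b, a → ε)
        else:
            return False            # no transition applies, so this run dies
    # State 4 is reachable only by popping the marker with all input read.
    return state == 3 and stack == ["$"]


for w in ["abacbab", "c", "abcab", "aca", ""]:
    print(repr(w), pda_accepts(w))
```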

• Question 8 (40): This question uses the regular languages X and Y defined before Question 1. X is the language of the regular expression a*b* and Y is the language of the regular expression Σ*baΣ*, and Σ = {a,b}.

• (a,10) Create two NFA's whose languages are X and Y respectively, using any method. It is possible in each case to have a three-state NFA with no ε-moves.

X has a two-state NFA, with an a-loop on the start state 1, a b-arrow from 1 to 2, a b-loop on state 2, and both states final. This NFA can accept in state 1 if and only if it reads a string in a*, and can accept in state 2 if and only if it reads a string in a*bb*. A string is in X if and only if it is in one of these two languages.

Y has a three-state NFA, with an a-loop and b-loop on start state 1, a b-arrow from 1 to 2, an a-arrow from 2 to 3, and an a-loop and b-loop on state 3, which is the only final state. Clearly a string can be accepted by this NFA if and only if it is some string followed by ba followed by some string.

• (b,10) Using your answer to part (a), construct an NFA whose language is X ∪ Y. You may use the method from the text, the method from lecture, or some other construction, but the latter should be justified.

Using Sipser's construction, we create a six-state NFA with all the states and transitions from the two NFA's of part (a) and one more state, the non-final start state, with ε-arrows to the two start states of the two other machines (of course these states stop being start states in the new machine). Clearly this NFA can accept a string if and only if at least one of the two NFA's from part (a) could accept it.

• (c,10) Construct, by any legitimate method, a DFA whose language is X ∪ Y.

Using Sipser's version of the Subset Construction on the NFA of part (b), we call the start state 0, the two states of the X-machine 1 and 2, and the three states of the Y-machine 3, 4, and 5. We create a DFA whose states represent sets of states of this NFA. The start state is E({0}) = {0, 1, 3}. This has an a-arrow to {1, 3} and a b-arrow to {2, 3, 4}. State {1, 3} has an a-arrow to itself and a b-arrow to {2, 3, 4}. State {2, 3, 4} has a b-arrow to itself and an a-arrow to {3, 5}. State {3, 5} has an a-arrow to itself and a b-arrow to {3, 4, 5}. State {3, 4, 5} has an a-arrow to {3, 5} and a b-arrow to itself. We thus complete our DFA with five states, all of which are final because each set contains one of the final states of the NFA, which are 1, 2 and 5.
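The Subset Construction on this NFA is small enough to run by machine. The sketch below encodes the six-state NFA of part (b) with the state numbering used in the solution; the names `EPS`, `DELTA`, `eclose`, and `subset_construction` are hypothetical, and the code is one straightforward way to carry out the construction rather than a transcription of Sipser's presentation.

```python
from collections import deque

# NFA of part (b): 0 is the new start state, 1 and 2 form the X-machine,
# and 3, 4, 5 form the Y-machine.
EPS = {0: {1, 3}}
DELTA = {
    (1, "a"): {1}, (1, "b"): {2}, (2, "b"): {2},
    (3, "a"): {3}, (3, "b"): {3, 4}, (4, "a"): {5},
    (5, "a"): {5}, (5, "b"): {5},
}
FINAL = {1, 2, 5}


def eclose(states):
    """Return the ε-closure of a set of NFA states."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in EPS.get(q, ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def subset_construction():
    """Build the reachable part of the subset DFA."""
    start = eclose({0})
    trans = {}
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for ch in "ab":
            moved = set().union(*(DELTA.get((q, ch), set()) for q in current))
            target = eclose(moved)
            trans[(current, ch)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return start, seen, trans


start, states, trans = subset_construction()
for (src, ch), dst in sorted(trans.items(), key=lambda kv: (sorted(kv[0][0]), kv[0][1])):
    print(sorted(src), ch, "->", sorted(dst))
print("all final:", all(s & FINAL for s in states))
```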

• (d,10) Construct, by any legitimate method, a minimal DFA whose language is X ∪ Y. (A minimal DFA is one such that any other DFA for the same language has at least as many states.) If you believe that your answer to part (c) is minimal, you must justify that claim here.

It is sort of obvious that since all the states are final, the DFA of part (c) will accept every string and is thus equivalent to the one-state DFA with its state final. But the State Minimization Algorithm will also lead us to this conclusion. We begin with a partition of one set F containing all five states. We have no set N because there are no non-final states. Now each of the five states has behavior (F, F) because all the a-arrows and b-arrows lead to states in F. So after one round we conclude that we have found the final partition of states, and form our minimal DFA by merging all five states of F into a single state. This creates a one-state DFA with its state final. (A one-state DFA is always minimal for its language, since there is no such thing as a 0-state DFA.)
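The partition-refinement argument can be replayed in code. This is a sketch under assumptions: the labels A through E are hypothetical names for the subset states {0, 1, 3}, {1, 3}, {2, 3, 4}, {3, 5}, and {3, 4, 5} of part (c), and `minimize` is a plain Moore-style refinement loop, which may differ in detail from the State Minimization Algorithm as presented in lecture.

```python
def minimize(states, alphabet, delta, final):
    """Refine the final/non-final partition until it is stable."""
    partition = {frozenset(final), frozenset(states - final)} - {frozenset()}
    while True:
        block_of = {q: block for block in partition for q in block}
        groups = {}
        for q in states:
            # A state's behavior: its own block plus the block reached on each letter.
            key = (block_of[q],) + tuple(block_of[delta[(q, ch)]] for ch in alphabet)
            groups.setdefault(key, set()).add(q)
        refined = {frozenset(g) for g in groups.values()}
        if refined == partition:
            return partition
        partition = refined


# Five-state DFA of part (c), with hypothetical single-letter state names.
STATES = set("ABCDE")
DFA_DELTA = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "C",
    ("C", "a"): "D", ("C", "b"): "C",
    ("D", "a"): "D", ("D", "b"): "E",
    ("E", "a"): "D", ("E", "b"): "E",
}

blocks = minimize(STATES, "ab", DFA_DELTA, set(STATES))
print(len(blocks), [sorted(b) for b in blocks])
```

With every state final, the initial partition has a single block and the first round leaves it unchanged, matching the one-state result above.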