# Solutions for First Practice Midterm Exam

#### 25 February 2008

Question text is in black, solutions are in blue.

• Question 1 (10): (True/false with justification) If X is any CFL and Y is any regular language, X ∩ Y must be regular.

FALSE. For example, let X be any non-regular CFL, such as the set of palindromes over {a,b}, and let Y be Σ*, which is regular. Then X ∩ Y is X itself, which is not regular.

• Question 2 (10): (True/false with justification) If X and Y are each CFL's and neither X nor Y is regular, then X ∪ Y must be a CFL and X ∪ Y must not be regular.

FALSE. It is true that X ∪ Y must be a CFL because CFL's are closed under union. But it is possible for the union of two non-regular CFL's to be a regular language. For example, let X be $\{a^n b^n : n \ge 0\}$ and let Y be $\{a^i b^j : i \ne j\}$. Neither X nor Y is regular, and both are CFL's, but X ∪ Y is the regular language a*b*.
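The counterexample can be sanity-checked by brute force on short strings. The sketch below is only a finite check, not a proof, and the function names (`in_X`, `in_Y`, `in_a_star_b_star`, `union_matches`) are made up for this illustration.

```python
from itertools import product

def in_X(w):
    # X: strings of n a's followed by n b's
    n = len(w) // 2
    return w == "a" * n + "b" * n

def in_Y(w):
    # Y: strings of i a's followed by j b's with i != j
    i = len(w) - len(w.lstrip("a"))   # number of leading a's
    j = len(w) - i
    return w == "a" * i + "b" * j and i != j

def in_a_star_b_star(w):
    # over {a,b}, a string is in a*b* exactly when no b precedes an a
    return "ba" not in w

def union_matches(max_len):
    # compare X ∪ Y with a*b* on every string over {a,b} up to max_len
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if (in_X(w) or in_Y(w)) != in_a_star_b_star(w):
                return False
    return True
```

Calling `union_matches(8)` compares the two languages on all 511 strings of length at most 8.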

• Question 3 (10): (True/false with justification) If M is a DFA such that all states of M are final states, then L(M) must be equal to Σ*, the set of all strings.

TRUE. Let w be an arbitrary string. Reading w takes M from the start state to some state, and by the assumption that state must be final, so w is in L(M). Since w was arbitrary, we know that L(M) contains all possible strings.

• Question 4 (10): (True/false with justification) If M is an NFA such that all states of M are final states, then L(M) must be equal to Σ*, the set of all strings.

FALSE. Given a string w, there is no reason why any path labeled by the letters of w should exist in M. For example, suppose that M has a single state, both start and final, and no transitions at all. Then L(M) is {ε}, not Σ*, but M meets the given condition.
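To make the counterexample concrete, here is a minimal NFA simulator without ε-moves, run on the one-state machine with no transitions. The encoding (a dictionary from (state, letter) pairs to sets of states) and the name `nfa_accepts` are assumptions chosen for this sketch.

```python
def nfa_accepts(delta, start, finals, w):
    # Track the set of states reachable after each letter of w.
    current = {start}
    for ch in w:
        current = {r for q in current for r in delta.get((q, ch), set())}
    return bool(current & finals)

# One state (0), both start and final, and no transitions at all.
NO_MOVES = {}
```

With this machine, `nfa_accepts(NO_MOVES, 0, {0}, "")` is true, while any nonempty input empties the set of reachable states and is rejected.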

• Question 5 (30): Let the language E be the set of all strings w over the alphabet {a,b} such that the number of a's in w equals the number of b's in w.
• (a,15) Describe a context-free grammar G such that E = L(G).

Let the grammar have start symbol S and rules S → aSb, S → bSa, S → SS, and S → ε. Clearly every string derived in this grammar has an equal number of a's and b's -- it remains to prove that every string with an equal number of a's and b's can be derived. We prove this by strong induction on the length of strings. If the string has no letters we can derive it by the fourth rule. If the string can be divided into two nonempty strings in E, we can derive each from S by the inductive hypothesis and use the third rule to make the two S's we need.

The remaining case is where the string w is nonempty and cannot be divided into two nonempty strings in E, that is, no proper nonempty prefix of w is in E. Suppose the first letter of w is an a. We claim the last letter must be b. Otherwise the net count of a's (the number of a's minus the number of b's) would be 1 after the first letter and -1 just before the last letter, so it would have been 0 at some point in between, giving a proper nonempty prefix of w in E and contradicting the assumption of this case. So we can write w as avb, where v is a string in E and thus can be derived by induction, so we can make w starting with the first rule. Similarly, if w starts with b it must end in a and we can make it using the second rule. Note that this is reminiscent of the argument that the grammar we constructed from a PDA derives all strings accepted by that PDA.
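The claim that the grammar derives exactly E can be checked on short strings with a memoized membership test that tries each rule in turn. This is a sketch for the alphabet {a,b} only, and the names `derivable` and `grammar_matches_E` are invented here. Restricting S → SS to two nonempty parts loses nothing, since a split with an empty part just re-derives the same string.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def derivable(w):
    # Can S derive w? Assumes w is a string over {a,b}.
    if w == "":
        return True                                   # S -> ε
    if len(w) >= 2 and w[0] != w[-1] and derivable(w[1:-1]):
        return True                                   # S -> aSb or S -> bSa
    # S -> SS, trying every split into two nonempty parts
    return any(derivable(w[:k]) and derivable(w[k:]) for k in range(1, len(w)))

def grammar_matches_E(max_len):
    # Compare derivability with "equal number of a's and b's" up to max_len.
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if derivable(w) != (w.count("a") == w.count("b")):
                return False
    return True
```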

• (b,15) Describe a pushdown automaton P such that E = L(P). You may or may not want to use a general construction on your answer to (a).

I think it's easier to construct the PDA directly. The idea is to let the characters on the stack, a's or b's, represent the excess of one type of letter over the other. We need rules to shift either letter from the input onto the stack, and rules to cancel an a from the input with a b on the stack or vice versa. We also need to ensure that we only accept with an empty stack.

The result looks rather like the top-down or bottom-up parsers. We have three states q0, q, and f, where q0 is the start state and f is the only final state. We have a transition from q0 to q pushing a \$ symbol, four transitions from q to q with read-pop-push combinations (a,ε;a), (b,ε;b), (a,b;ε), and (b,a;ε), and finally a transition from q to f that pops the \$.

The PDA can only accept with an empty stack, and the stack can only be empty if all the a's and b's in w have been read onto the stack and/or canceled with the other letter. If the input string is in E, then it is possible for the machine to accept it, keeping only a's on the stack if it has seen more a's than b's and keeping only b's on the stack if it has seen more b's.
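The nondeterministic PDA described above can be simulated by searching over all reachable configurations. The sketch below assumes input over {a,b}, represents the stack as a tuple with its top at the right end, and uses made-up names (`pda_accepts`, `pda_matches_E`). The search is exponential in the worst case, so it is only meant for short inputs.

```python
from itertools import product

def pda_accepts(w):
    # A configuration is (state, input position, stack).
    start = ("q0", 0, ())
    seen, frontier = {start}, [start]
    while frontier:
        state, i, stack = frontier.pop()
        if state == "f" and i == len(w):
            return True                      # in f with all input read
        successors = []
        if state == "q0":
            successors.append(("q", i, stack + ("$",)))       # push $
        elif state == "q":
            if stack and stack[-1] == "$":
                successors.append(("f", i, stack[:-1]))       # pop $
            if i < len(w):
                ch = w[i]
                other = "b" if ch == "a" else "a"
                successors.append(("q", i + 1, stack + (ch,)))     # shift the letter
                if stack and stack[-1] == other:
                    successors.append(("q", i + 1, stack[:-1]))    # cancel with opposite letter
        for c in successors:
            if c not in seen:
                seen.add(c)
                frontier.append(c)
    return False

def pda_matches_E(max_len):
    # Compare acceptance with "equal number of a's and b's" up to max_len.
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if pda_accepts(w) != (w.count("a") == w.count("b")):
                return False
    return True
```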

• Question 6 (20): Let X be the language $\{a^n b^n : n \ge 0\}$ and let Y be the language $\{b^n c^n : n \ge 0\}$. Let Z be the language $\{a^n b^{2n} c^n : n \ge 0\}$.
• (a,10) Is the concatenation product XY a CFL? Justify your answer.

Yes, XY is a CFL because both X and Y are CFL's and CFL's are closed under concatenation. For example, we could have rules S → TU, T → aTb, T → ε, U → bUc, and U → ε.

• (b,10) Is Z a CFL? Justify your answer, by reference to (a) or otherwise.

No, Z is not a CFL. First, note that XY is irrelevant because it is the language $\{a^i b^{i+j} c^j : i, j \ge 0\}$ rather than Z. We prove that Z is not a CFL by using the CFL pumping lemma. Assume that Z is a CFL and let p be its pumping constant. Let w be the string $a^p b^{2p} c^p$. By the CFL pumping lemma, w is equal to uvxyz where vy is nonempty, |vxy| ≤ p, and $uv^ixy^iz$ is in Z for all nonnegative integers i. Since |vxy| ≤ p and the block of b's has length 2p, the string vxy can include parts of at most two of the three blocks of w: the a's and the b's, or the b's and the c's. If either v or y contains any a's or c's, then the pumped-down string uxz will have different numbers of a's and c's and so not be in Z. If v and y are all b's, then uxz will have fewer than 2p b's but still have p a's and p c's, so it is not in Z. All cases lead to the pumping lemma being violated, so Z cannot be a CFL. (Note that this proof is almost identical to the proof that $\{a^n b^n c^n : n \ge 0\}$ is not a CFL.)
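The case analysis can be double-checked for small values of p by enumerating every way to split w into uvxyz with vy nonempty and |vxy| ≤ p, then confirming that pumping down (i = 0) always leaves Z. This is a finite sanity check of the argument for particular p, not a replacement for the proof; `in_Z` and `pumping_down_always_fails` are names invented for the sketch.

```python
def in_Z(s):
    # Z: n a's, then 2n b's, then n c's
    n = len(s) // 4
    return s == "a" * n + "b" * (2 * n) + "c" * n

def pumping_down_always_fails(p):
    w = "a" * p + "b" * (2 * p) + "c" * p
    for start in range(len(w) + 1):                            # vxy = w[start:end]
        for end in range(start, min(start + p, len(w)) + 1):   # enforces |vxy| <= p
            for i in range(start, end + 1):                    # v = w[start:i]
                for j in range(i, end + 1):                    # x = w[i:j], y = w[j:end]
                    v, y = w[start:i], w[j:end]
                    if not v and not y:
                        continue                               # vy must be nonempty
                    pumped_down = w[:start] + w[i:j] + w[end:] # the string uxz
                    if in_Z(pumped_down):
                        return False
    return True
```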

• Question 7 (30): Let X be the language of the regular expression b ∪ Σ*a.
• (a,10) Draw an NFA whose language is X. Make sure your NFA cannot accept the string ab.

Let M have start state 1, ε-moves from 1 to states 2 and 3, a b-move from 2 to a final state 4, no moves out of 4, an a-move from 3 to a final state 5, both an a-move and a b-move from 3 to itself, and no moves out of 5. The only string that can lead to state 4 is b, and the strings that can lead to 5 are exactly the language Σ*a. In particular, ab cannot be accepted using either final state.

• (b,10) Draw a DFA whose language is X, using a general construction on your answer from (a) or otherwise.

Start state 123 is nonfinal. Its a-move is to 35 and its b-move is to 34, both final states. State 35 has a-move to itself and b-move to 3. State 34 has a-move to 35 and b-move to 3. State 3 has an a-move to 35 and a b-move to 3. The final states are 34 and 35.
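The DFA above is what the subset construction produces from the NFA in part (a). A minimal sketch of that construction follows, with the NFA encoded as a dictionary from (state, symbol) pairs to sets of states and the empty string standing for ε. The encoding and the names `eps_closure` and `subset_construction` are assumptions made for this illustration.

```python
EPS = ""   # stands for an ε-move in this encoding

# NFA from part (a): states 1..5, start state 1, final states 4 and 5.
NFA = {
    (1, EPS): {2, 3},
    (2, "b"): {4},
    (3, "a"): {3, 5},
    (3, "b"): {3},
}
NFA_FINALS = {4, 5}

def eps_closure(states):
    # All states reachable from the given ones using only ε-moves.
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in NFA.get((q, EPS), ()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)

def subset_construction(alphabet="ab"):
    # Build only the DFA states reachable from the start state.
    start = eps_closure({1})
    delta, seen, todo = {}, {start}, [start]
    while todo:
        S = todo.pop()
        for ch in alphabet:
            T = eps_closure({r for q in S for r in NFA.get((q, ch), ())})
            delta[(S, ch)] = T
            if T not in seen:
                seen.add(T)
                todo.append(T)
    finals = {S for S in seen if S & NFA_FINALS}
    return start, delta, finals, seen
```

The reachable sets are {1,2,3}, {3,5}, {3,4}, and {3}, matching the states written 123, 35, 34, and 3 in the solution.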

• (c,10) Give a four-state DFA whose language is X, if your answer to (b) is not already a four-state DFA. Argue that any correct DFA for X must have at least four states, either by running the minimization algorithm or by finding four strings that are pairwise X-inequivalent.

The construction in (b) gave us only four states. We begin with the minimization algorithm.

Our initial partition has classes N = {123, 3} and F = {34, 35}. Then 123 has a-move and b-move both to F, while 3 has a-move to F and b-move to N. So 123 and 3 must be separated at the next stage. Turning to F, we see that both 34 and 35 have a-moves to F and b-moves to N. So we need to carry out another round with new classes A = {123}, B = {3}, and F = {34, 35}. Now both 34 and 35 have an a-move to F and a b-move to B, so they are never separated and merge into a single state. The question is incorrectly stated: there are only three states in the minimal DFA.
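The same partition refinement can be run mechanically on the DFA from part (b). The sketch below uses the state names from the solution and repeatedly splits classes by the classes their a-moves and b-moves lead to, stopping once a round produces no new split. The dictionary encoding and the name `minimize_classes` are assumptions of this sketch.

```python
# DFA from part (b), keyed by the state names used in the solution.
DFA = {
    "123": {"a": "35", "b": "34"},
    "35":  {"a": "35", "b": "3"},
    "34":  {"a": "35", "b": "3"},
    "3":   {"a": "35", "b": "3"},
}
FINALS = {"34", "35"}

def minimize_classes(dfa, finals, alphabet="ab"):
    # Start with two blocks: non-final (0) and final (1).
    block = {q: int(q in finals) for q in dfa}
    while True:
        # A state's signature is its own block plus the blocks its moves reach.
        sig = {q: (block[q],) + tuple(block[dfa[q][c]] for c in alphabet)
               for q in dfa}
        ids = {s: k for k, s in enumerate(sorted(set(sig.values())))}
        new_block = {q: ids[sig[q]] for q in dfa}
        # Refinement only ever splits blocks, so an unchanged count means done.
        if len(set(new_block.values())) == len(set(block.values())):
            return sorted(sorted(q for q in dfa if new_block[q] == k)
                          for k in set(new_block.values()))
        block = new_block
```

On this DFA the function returns three classes, with 34 and 35 in the same class.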