Question text is in black, solutions in blue.
Q1: 10 points Q2: 10 points Q3: 10 points Q4: 10 points Q5: 10 points Q6: 15 points Q7: 10 points Q8: 10 points Q9: 15 points Q10: 10 points Q11: 10 points Q12: +10 points Total: 120+10 points
The language X over the alphabet {a, b, c} is the set {a^i b^j c^k: i + j = k}.
The language Y over the alphabet {a, b} is the set {a^n b^n a^n: n ≥ 0}.
The language Z over the alphabet {a, b, c, d} is the language of the following NFA N. N has state set {0, 1, 2, 3}, start state and only final state 0, and transitions (0, a, 1), (0, a, 2), (1, b, 2), (1, b, 3), (2, c, 3), (2, c, 0), (3, d, 0), and (3, d, 1).
The PDA M has state set {i, p, f} with start state i and only final state f. Its input alphabet is {a, b} and its stack alphabet is {a, c}. Its transitions are (i, a, ε; p, c), (p, a, ε; p, a), (p, b, a; p, ε), and (p, b, c; f, ε). Recall that the transition (q, x, y; r, z) means that the PDA can go from state q to state r while reading x, popping y, and pushing z.
The grammar G has rules S → aTb, T → TT, T → aTb, and T → ε.
FALSE. Proof 1: The intersection of X with the regular language a*c* is {a^n c^n: n ≥ 0}, which is isomorphic to {a^n b^n: n ≥ 0}, a language we proved not to be regular. Since the regular languages are closed under intersection, X cannot be regular.
Proof 2: For any naturals i and j with i ≠ j, the strings a^i and a^j are X-distinguishable, using the string c^i. Hence there are infinitely many X-equivalence classes and no DFA for X can exist. (Along the same lines you could quote a correct answer for Question 12 here.)
Proof 3: Use the Regular Language Pumping Lemma with k the alleged pumping length and w = a^k c^k. If X were regular with pumping length k, w could be written as xyz with |xy| ≤ k and |y| > 0, so that for any i the string x y^i z would be in X. Since |xy| ≤ k, the string y consists only of a's, so xz would not be in X, since it would have fewer a's than c's. Hence X is not regular.
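The distinguishability claim in Proof 2 can be spot-checked mechanically. The following is a minimal sketch, not part of the original solutions; the helper names `block_counts`, `in_X`, and `distinguished_by_c_i` are made up for illustration, and the check covers only i, j < 10.

```python
def block_counts(w):
    """Return (i, j, k) if w = a^i b^j c^k, else None."""
    pos, counts = 0, []
    for letter in "abc":
        start = pos
        while pos < len(w) and w[pos] == letter:
            pos += 1
        counts.append(pos - start)
    return tuple(counts) if pos == len(w) else None


def in_X(w):
    """Membership test for X = {a^i b^j c^k : i + j = k}."""
    counts = block_counts(w)
    return counts is not None and counts[0] + counts[1] == counts[2]


def distinguished_by_c_i(i, j):
    """True iff the suffix c^i puts a^i and a^j on opposite sides of X."""
    suffix = "c" * i
    return in_X("a" * i + suffix) != in_X("a" * j + suffix)


# Check every pair of distinct exponents below 10 (an arbitrary bound).
all_pairs_ok = all(
    distinguished_by_c_i(i, j)
    for i in range(10) for j in range(10) if i != j
)
print(all_pairs_ok)
```

A finite check like this is only a sanity test; the proof above is what covers all i and j.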
TRUE. The complement of X is the union of the complement of a*b*c* and the language {a^i b^j c^k: i + j ≠ k}. The former language is the complement of a regular language, and is thus both regular and context-free. It suffices to prove that the latter language is context-free.
Proof 1: The language in question is the union of two languages, one containing the strings with too few c's and one the strings with too many. X itself has the grammar S → aSc, S → T, T → bTc, T → ε. The language of strings with too many c's is the concatenation of X with the regular language cc* and is thus context-free. The language of strings with too few c's has the grammar S → aSc, S → aT, S → bU, T → aT, T → U, U → bUc, U → bU, U → ε.
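The grammar for the strings with too few c's can be tested against a brute-force description of the same language. The sketch below is an illustration only, not part of the original solutions. It relies on the fact that every rule of this grammar has at most one nonterminal on its right side, so a sentential form can be stored as a triple (left terminals, nonterminal, right terminals). The names `RULES`, `generated`, and `brute_force` are hypothetical.

```python
from itertools import product

# Each rule A -> pre B post is stored as (pre, B, post); B is None for A -> pre post.
RULES = {
    "S": [("a", "S", "c"), ("a", "T", ""), ("b", "U", "")],
    "T": [("a", "T", ""), ("", "U", "")],
    "U": [("b", "U", "c"), ("b", "U", ""), ("", None, "")],
}


def generated(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    results, seen = set(), set()
    todo = [("", "S", "")]
    while todo:
        left, nonterminal, right = todo.pop()
        for pre, new_nt, post in RULES[nonterminal]:
            new_left, new_right = left + pre, post + right
            if len(new_left) + len(new_right) > max_len:
                continue  # terminals never disappear, so prune here
            if new_nt is None:
                results.add(new_left + new_right)
            elif (new_left, new_nt, new_right) not in seen:
                seen.add((new_left, new_nt, new_right))
                todo.append((new_left, new_nt, new_right))
    return results


def too_few_cs(w):
    """True iff w = a^i b^j c^k with i + j > k."""
    pos, counts = 0, []
    for letter in "abc":
        start = pos
        while pos < len(w) and w[pos] == letter:
            pos += 1
        counts.append(pos - start)
    return pos == len(w) and counts[0] + counts[1] > counts[2]


def brute_force(max_len):
    words = ("".join(t) for n in range(max_len + 1) for t in product("abc", repeat=n))
    return {w for w in words if too_few_cs(w)}


print(generated(7) == brute_force(7))
```

The bound 7 is arbitrary; agreement up to a fixed length supports the grammar but does not replace an argument for all lengths.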
Proof 2: We can easily build a PDA for X that first pushes a $ onto the stack, then reads a's and pushes them onto the stack, then reads b's and pushes an a onto the stack for each one, then reads c's and pops an a from the stack for each one, then pops the $ and accepts if it is at the end of the string. Some of you wanted to make a PDA for the complement by switching final and non-final states of the PDA for X, which does not work. But we can make a PDA for the complement that is similar to the one for X, except that (1) if it sees an a after the first b, or an a or b after the first c, or a c when the top of the stack is the $ rather than an a, it goes to a state where it reads the rest of the input and then accepts, (2) all the former nonfinal states are now final, and (3) the former final state is now nonfinal.
FALSE. Proof 1: Without using the CFL Pumping Lemma, we can see that this language is similar to our given non-CFL {a^n b^n c^n: n ≥ 0}. If we had a PDA for Y, we could alter it so that after it has seen a b, it interprets any c's it sees as a's and rejects if it sees any real a's. Thus it interprets the input string as being in Y, and thus accepts it, if and only if it is in the given non-CFL.
Proof 2: We can use the CFLPL almost identically to the case of the given non-CFL. Let k be the alleged pumping length and choose w to be the string a^k b^k a^k, which is in Y. If Y were a CFL with pumping length k, w could be written as uvxyz with |vxy| ≤ k, |vy| > 0, and u v^i x y^i z in Y for all i. But in the string uxz (the case i = 0) the three single-letter blocks cannot all have the same length, since deleting v and y must shorten at least one of them but cannot shorten all three. So uxz is not in Y, a contradiction.
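For small k the pumping argument in Proof 2 can be checked exhaustively. The sketch below is an illustration added here, not part of the original solutions, and it assumes Y = {a^n b^n a^n: n ≥ 0}, the reading under which the proof's string a^k b^k a^k is in Y. It enumerates every split of a^k b^k a^k into uvxyz with |vxy| ≤ k and |vy| > 0 and confirms that uxz always falls outside Y. The names `in_Y`, `pumped_down`, and `no_split_survives` are hypothetical.

```python
def in_Y(w):
    """Membership test for Y = {a^n b^n a^n : n >= 0}."""
    n, remainder = divmod(len(w), 3)
    return remainder == 0 and w == "a" * n + "b" * n + "a" * n


def pumped_down(w, k):
    """Yield uxz for every split w = uvxyz with |vxy| <= k and |vy| > 0."""
    n = len(w)
    for v_start in range(n + 1):
        for v_end in range(v_start, n + 1):
            for y_start in range(v_end, n + 1):
                for y_end in range(y_start, n + 1):
                    if y_end - v_start > k:
                        break  # the window vxy is already too long
                    if (v_end - v_start) + (y_end - y_start) == 0:
                        continue  # v and y are both empty
                    yield w[:v_start] + w[v_end:y_start] + w[y_end:]


def no_split_survives(k):
    w = "a" * k + "b" * k + "a" * k
    return all(not in_Y(s) for s in pumped_down(w, k))


print(all(no_split_survives(k) for k in range(1, 6)))
```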
The simplest such DFA has a single state which is final, with all transitions going from that state to itself. That DFA accepts all strings and hence accepts all strings in X. Someone gave a DFA for the language a*b*c*, which also accepts all strings in X and is thus correct. Of course by Question 1 it is impossible to have a DFA that accepts exactly the strings in X, but we are not asked for such a DFA.
Since N has no ε-moves, we can easily inspect it and see that it has no two-step paths where both edges have the same label. Hence the language Z' is equal to the language Z, and since Z has an NFA it has a regular expression by Kleene's Theorem.
Several people did not notice this fact about Z and gave the following valid proof, which works for an arbitrary regular language Z. The set of all strings with no double letters is the complement of the regular language Σ*(aa ∪ bb ∪ cc ∪ dd)Σ* and is thus regular. (It's also easy to design a DFA for this language, which remembers the last letter it has seen and goes to a death state if it is repeated.) Therefore Z' is the intersection of two regular languages, which we have shown to be regular, and by Kleene's Theorem it has a regular expression.
Start state 0 (final) has a-arrow to state 12 (nonfinal) and other arrows to state d (nonfinal). State 12 has b-arrow to state 23 (nonfinal), c-arrow to state 03 (final), and other arrows to d. State 23 has c-arrow to 03, d-arrow to state 01 (final), and other arrows to d. State 03 has a-arrow to 12, d-arrow to 01, and other arrows to d. State 01 has a-arrow to 12, b-arrow to 23, and other arrows to d. State d, the death state, has all arrows to itself. We have completed the construction with six states of the possible 16.
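The subset construction on N can be reproduced in a few lines. This is a sketch for checking the hand construction, not part of the original solutions. The dictionary `NFA` encodes the transitions of N as given in the problem statement; the function names and the naming scheme (sorted digits, with "d" for the empty set) are choices made here to match the state names used above.

```python
# Transitions of N: (state, letter) -> set of successor states.
NFA = {
    (0, "a"): {1, 2},
    (1, "b"): {2, 3},
    (2, "c"): {3, 0},
    (3, "d"): {0, 1},
}


def subset_construction():
    """Return (start, transitions) for the reachable part of the subset DFA."""
    start = frozenset({0})
    dfa = {}
    todo = [start]
    while todo:
        subset = todo.pop()
        if subset in dfa:
            continue
        dfa[subset] = {}
        for letter in "abcd":
            target = frozenset(q for s in subset for q in NFA.get((s, letter), ()))
            dfa[subset][letter] = target
            todo.append(target)
    return start, dfa


def name(subset):
    """Name a subset by its sorted digits; the empty set is the death state d."""
    return "".join(str(q) for q in sorted(subset)) or "d"


start, dfa = subset_construction()
for subset in sorted(dfa, key=name):
    kind = "final" if 0 in subset else "nonfinal"
    arrows = ", ".join(f"{x}->{name(dfa[subset][x])}" for x in "abcd")
    print(f"{name(subset)} ({kind}): {arrows}")
```

Only the reachable subsets are built, which is why the result has far fewer than 16 states.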
This DFA is minimal, which is easiest to see by proving the three final states, and the three nonfinal states, to be Z-distinguishable. The string bc separates state 0 from state 01, and the string d separates both states 0 and 01 from state 03. The string d also separates 12 from 23, and the string c separates both 12 and 23 from d.
We could also just run the minimization algorithm on the DFA. The initial partition has classes N = {12, 23, d} and F = {0, 01, 03}. If we describe behavior of each state by the classes to which the letters a, b, c, and d go from that state, we get that 12 has NNFN, 23 has NNFF, d has NNNN, 0 has NNNN, 01 has NNNN, and 03 has NNNF. Thus class N is split into three singleton classes and class F is split into two classes, the non-singleton one being {0, 01}. But since b sends 0 to d and sends 01 to 23, and d and 23 are now separate, we get all singleton classes at the next (and last) stage of the algorithm.
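The partition refinement just described can be run mechanically on the six-state DFA. The sketch below is an illustration, not part of the original solutions. The table `DFA` transcribes the transitions from the subset construction on N, and the function `refine` and its return format are choices made here.

```python
# The six-state DFA, with state names as in the text ("d" is the death state).
DFA = {
    "0":  {"a": "12", "b": "d",  "c": "d",  "d": "d"},
    "12": {"a": "d",  "b": "23", "c": "03", "d": "d"},
    "23": {"a": "d",  "b": "d",  "c": "03", "d": "01"},
    "03": {"a": "12", "b": "d",  "c": "d",  "d": "01"},
    "01": {"a": "12", "b": "23", "c": "d",  "d": "d"},
    "d":  {"a": "d",  "b": "d",  "c": "d",  "d": "d"},
}
FINAL = {"0", "01", "03"}


def refine():
    """Refine {N, F} until stable; return the final classes and the class counts."""
    block = {q: int(q in FINAL) for q in DFA}
    history = [len(set(block.values()))]
    while True:
        ids, new_block = {}, {}
        for q in DFA:
            # A state's signature is its own class plus the classes its arrows reach.
            signature = (block[q],) + tuple(block[DFA[q][x]] for x in "abcd")
            new_block[q] = ids.setdefault(signature, len(ids))
        if len(ids) == history[-1]:
            return new_block, history  # no class was split in this round
        history.append(len(ids))
        block = new_block


classes, history = refine()
print(history)
```

The list `history` records the number of classes after each stage: the initial partition, the first split, and the final split into singletons.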
There are lots of ways to do this, depending on whether we start from N or from the DFA in Question 6, and on what order we remove states.
I started by adding a new start and final state to N. Removing state 3 then gives us transitions (i, ε, 0), (0, ε, f), (0, a, 1), (0, a, 2), (1, bd, 0), (1, bd, 1), (1, b, 2), (2, c ∪ cd, 0), and (2, cd, 1).
Removing state 2 then gives (i, ε, 0), (0, ε, f), (0, a(c ∪ cd), 0), (0, a ∪ acd, 1), (1, bd ∪ b(c ∪ cd), 0), (1, bd ∪ bcd, 1).
Removing state 1 then gives (i, ε, 0), (0, ε, f), (0, a(c ∪ cd) ∪ (a ∪ acd)(bd ∪ bcd)*(bd ∪ b(c ∪ cd)), 0).
The final regular expression is thus [ac ∪ acd ∪ (a ∪ acd)(bd ∪ bcd)*(bd ∪ bc ∪ bcd)]*.
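The final regular expression can be compared with N on all short strings. This is a sanity check sketched here, not part of the original solutions. The pattern is the expression above transcribed into Python `re` syntax, with | for ∪ and non-capturing groups; `nfa_accepts` and `agree` are hypothetical helper names.

```python
import re
from itertools import product

# Transitions of N; state 0 is both the start state and the only final state.
STEP = {
    (0, "a"): {1, 2},
    (1, "b"): {2, 3},
    (2, "c"): {3, 0},
    (3, "d"): {0, 1},
}

# [ac ∪ acd ∪ (a ∪ acd)(bd ∪ bcd)*(bd ∪ bc ∪ bcd)]* in Python syntax.
PATTERN = re.compile(r"(?:ac|acd|(?:a|acd)(?:bd|bcd)*(?:bd|bc|bcd))*")


def nfa_accepts(w):
    """Simulate N on w by tracking the set of reachable states."""
    states = {0}
    for letter in w:
        states = {q for s in states for q in STEP.get((s, letter), ())}
    return 0 in states


def agree(max_len):
    """True iff N and the regular expression agree on every string up to max_len."""
    for n in range(max_len + 1):
        for letters in product("abcd", repeat=n):
            w = "".join(letters)
            if nfa_accepts(w) != (PATTERN.fullmatch(w) is not None):
                return False
    return True


print(agree(7))
```

Removing the states in a different order gives a different-looking expression, which could be checked the same way.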
The PDA given by the construction has three states, plus more used solely to implement multiple-letter pushes. For convenience we will describe it with these multiple-letter pushes. The state set is {i, p, f}, the start state is i, the only final state is f, the input alphabet is {a, b}, the stack alphabet is {$, a, b, S, T}, and the transitions are (i, ε, ε; p, S$), (p, a, a; p, ε), (p, b, b; p, ε), (p, ε, S; p, aTb), (p, ε, T; p, TT), (p, ε, T; p, aTb), (p, ε, T; p, ε), and (p, ε, $; f, ε).
The PDA M (1) has exactly one final state which is not the start state, (2) can accept only with an empty stack, and (3) either pushes or pops one letter on each transition, but never does both. Most people forgot condition (2).
There are nine nonterminals A_ii, A_ip, A_if (the start symbol), A_pi, A_pp, A_pf, A_fi, A_fp, and A_ff. It turns out that only A_if and A_pp are needed to derive all possible strings in the language.
There are three rules taking A_ii, A_pp, and A_ff each to ε. Of these we will need only A_pp → ε.
There are 27 rules of the form A_xy → A_xz A_zy for each choice of states x, y, and z. Of these we will need only A_pp → A_pp A_pp.
Finally there are two rules arising from matched pairs of transitions pushing and popping the same letter. These are A_if → a A_pp b from pushing and popping a c, and A_pp → a A_pp b from pushing and popping an a.
In Question 9b we showed that L(M) has a grammar which has a copy of G within it, if we identify A_if with S and A_pp with T. So L(G) ⊆ L(M), though we cannot immediately rule out the possibility that L(M) contains strings not in L(G).
To show that L(M) ⊆ L(G), we must examine all possible accepting computations of M. Any such computation begins by pushing a c and reading an a, and ends by popping that c and reading a b. So L(M) = aQb, where Q is the set of strings that can be read while going from state p on an empty stack to state p on an empty stack. This language is the balanced-paren language, which has a grammar with start symbol T and rules T → TT, T → aTb, and T → ε. Any derivation in this grammar can thus be mimicked in G.
We might also characterize L(M) and L(G) each as the set of strings of a's and b's that start with an a, end with a b, have an equal number of a's and b's, and always have more a's than b's in any nonempty proper prefix. Then we can argue separately that L(M) and L(G) are each equal to this language.
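The three descriptions of the language can be compared on short strings. The sketch below is an illustration, not part of the original solutions. It simulates M directly (M has at most one applicable move at each step, so a plain loop suffices), tests membership in aQb with Q balanced as a stand-in for L(G), and tests the prefix characterization with the prefix required to be proper. All function names are hypothetical.

```python
from itertools import product


def m_accepts(w):
    """Run the PDA M on w and accept iff it ends in state f."""
    state, stack = "i", []
    for letter in w:
        if state == "i" and letter == "a":
            state = "p"
            stack.append("c")          # (i, a, ε; p, c)
        elif state == "p" and letter == "a":
            stack.append("a")          # (p, a, ε; p, a)
        elif state == "p" and letter == "b":
            # (p, b, a; p, ε) or (p, b, c; f, ε), depending on the top of the stack
            state = "p" if stack.pop() == "a" else "f"
        else:
            return False               # no move is available, e.g. from state f
    return state == "f"


def balanced(w):
    """True iff w is balanced when a is read as '(' and b as ')'."""
    depth = 0
    for letter in w:
        depth += 1 if letter == "a" else -1
        if depth < 0:
            return False
    return depth == 0


def in_aQb(w):
    """Membership in a Q b, where Q is the balanced language generated from T."""
    return len(w) >= 2 and w[0] == "a" and w[-1] == "b" and balanced(w[1:-1])


def prefix_condition(w):
    """Equal a's and b's, and more a's than b's in every nonempty proper prefix."""
    if not w or w.count("a") != w.count("b"):
        return False
    excess = 0
    for letter in w[:-1]:
        excess += 1 if letter == "a" else -1
        if excess <= 0:
            return False
    return True


def all_agree(max_len):
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if not (m_accepts(w) == in_aQb(w) == prefix_condition(w)):
                return False
    return True


print(all_agree(12))
```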
Any string u in Z has a path in N from state 0 to itself. Thus if any
string v is also in Z, we know that uv is in Z by concatenating
the two paths. (I took off two points for not explaining that Z
is closed under concatenation. One person misquoted the true fact
"The class of regular languages is closed under concatenation" as
the false assertion "Every regular language is closed under
concatenation.")
The length of u must be congruent to either 0, 1, or 2 modulo 3. If it is congruent to 0, we may take v to be the empty string. If it is congruent to 1, we may take v to be the string ac, and then |uv| is congruent to 0. If |u| is congruent to 2, we may take v = abcd and then |uv| is congruent to 0.
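The case analysis above can be checked on every short string of Z. This is a minimal sketch, not part of the original solutions; `in_Z` simulates N as given in the problem statement, and `padding` is a hypothetical name for the choice of v described above.

```python
from itertools import product

# Transitions of N; state 0 is both the start state and the only final state.
STEP = {
    (0, "a"): {1, 2},
    (1, "b"): {2, 3},
    (2, "c"): {3, 0},
    (3, "d"): {0, 1},
}


def in_Z(w):
    """Simulate N on w by tracking the set of reachable states."""
    states = {0}
    for letter in w:
        states = {q for s in states for q in STEP.get((s, letter), ())}
    return 0 in states


def padding(u):
    """The string v chosen above, indexed by |u| modulo 3."""
    return ("", "ac", "abcd")[len(u) % 3]


def check(max_len):
    """For every u in Z up to max_len, uv is in Z and |uv| is divisible by 3."""
    for n in range(max_len + 1):
        for letters in product("abcd", repeat=n):
            u = "".join(letters)
            if in_Z(u):
                uv = u + padding(u)
                if not in_Z(uv) or len(uv) % 3 != 0:
                    return False
    return True


print(check(7))
```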
For any natural i, let A_i be the set {a^i}.
For any positive natural j, let B_j be the set {a^i b^{j-i}: 0 ≤ i < j}.
For any natural k, let C_k be the set {a^i b^j c^{i+j-k}: i and j are naturals with i + j > k}.
Let D be the set of strings that are either not in a*b*c* or are of the form a^i b^j c^k with i + j < k.
Strings in A_i may be followed by any string of the form a^{i'} b^j c^k with i + i' + j = k.
Strings in B_j may be followed by any string of the form b^{j'} c^k with j + j' = k.
Strings in C_k may be followed only by the string c^k.
Strings in D cannot be followed by any strings.
This describes the infinite set of Myhill-Nerode classes for X. We can think of these classes as forming the states of an infinite "minimal automaton" for X, with A_0 as the start state and A_0 and C_0 as the only final states (A_0 is final because the empty string is in X). The class of a string wa, wb, or wc depends only on the class of w.
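The class structure can be exercised on short strings. The sketch below is an illustration, not part of the original solutions. It assigns each string to one of the classes A_i, B_j, C_k, or D as defined above, then checks two things up to a length bound: that the class of wx is determined by the class of w and the letter x, and that membership in X coincides with lying in A_0 or C_0. The function names and the tuple encoding of classes are choices made here.

```python
from itertools import product


def block_counts(w):
    """Return (i, j, k) if w = a^i b^j c^k, else None."""
    pos, counts = 0, []
    for letter in "abc":
        start = pos
        while pos < len(w) and w[pos] == letter:
            pos += 1
        counts.append(pos - start)
    return tuple(counts) if pos == len(w) else None


def classify(w):
    """Return the class of w as ("A", i), ("B", j), ("C", k), or ("D",)."""
    counts = block_counts(w)
    if counts is None:
        return ("D",)
    i, j, k = counts
    if k > 0:                       # at least one c has been read
        return ("C", i + j - k) if i + j >= k else ("D",)
    if j > 0:                       # at least one b, no c yet
        return ("B", i + j)
    return ("A", i)                 # only a's so far


ACCEPTING = {("A", 0), ("C", 0)}


def check(max_len):
    step = {}                       # (class, letter) -> class of the extended string
    for n in range(max_len + 1):
        for letters in product("abc", repeat=n):
            w = "".join(letters)
            cls = classify(w)
            counts = block_counts(w)
            in_x = counts is not None and counts[0] + counts[1] == counts[2]
            if in_x != (cls in ACCEPTING):
                return False
            for x in "abc":
                target = classify(w + x)
                if step.setdefault((cls, x), target) != target:
                    return False    # two strings in one class disagreed
    return True


print(check(7))
```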
Last modified 22 February 2015