These problems deal with four formal languages over the alphabet {0,1}. Define the following function f from {0,1}* to the integers: f(λ) = 0 and for any string w, f(w0) = f(w) - 1 and f(w1) = f(w) + 1. (Recall that λ is the empty string.)
A is defined to be the language {w: f(w) = 0}.
B is the language {w: f(w) = 0 and for all v, if v is a prefix of w then f(v) ≥ 0}.
C is the language {w: w is in B and for all v, if v is a prefix of w then f(v) ≤ 3}.
Finally, D is the language {w: w is in B and for all v, if v is a prefix of w then f(v) ≤ 1}. (This said "f(w) ≤ 1" before, which makes D the same language as B.)
(Recall that string u is a prefix of string v if there is a string x such that ux = v.)
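The definitions above translate almost directly into code. The following is a minimal Python sketch, not part of the original solutions; the names f, prefix_values, in_A, in_B, in_C and in_D are hypothetical helpers chosen for illustration.

```python
# Illustrative sketch only: all function names here are assumptions,
# not notation from the exam.

def f(w):
    """f(λ) = 0, f(w0) = f(w) - 1, f(w1) = f(w) + 1."""
    return sum(1 if c == "1" else -1 for c in w)

def prefix_values(w):
    """f(v) for every prefix v of w, from the empty string up to w itself."""
    values = [0]
    for c in w:
        values.append(values[-1] + (1 if c == "1" else -1))
    return values

def in_A(w):
    return f(w) == 0

def in_B(w):
    return f(w) == 0 and min(prefix_values(w)) >= 0

def in_C(w):
    return in_B(w) and max(prefix_values(w)) <= 3

def in_D(w):
    return in_B(w) and max(prefix_values(w)) <= 1
```

For example, 01 is in A but not in B (its prefix 0 has f-value -1), 1100 is in B and C but not in D, and 11110000 is in B but not in C.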
Languages A and B are not regular, as will be shown in Question 2.
Languages C and D are regular. C has a five-state DFA, with state set
{0,1,2,3,d}, start state 0, and final state set {0}. Its 0-transitions
are from 0 to d, 1 to 0, 2 to 1, 3 to 2, and d to d. Its 1-transitions are
from 0 to 1, 1 to 2, 2 to 3, 3 to d, and d to d. (The "death state" d is
needed because every state must have a 0-transition and a 1-transition.)
A regular expression for C is (1(1(10)*0)*0)*
as is easy to compute from the DFA by the state reduction method.
D has a three-state DFA with state set {0,1,d}, start and only final state
0, 0-transitions from 0 to d, 1 to 0, and d to d, and 1-transitions from
0 to 1, 1 to d, and d to d. D's regular expression is (10)*.
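As a sanity check, the two DFAs and the two regular expressions can be compared by brute force on all short strings. This is a sketch under assumptions: make_dfa, accepts and agree are hypothetical helper names, the parameter cap is the largest allowed f-value (3 for C, 1 for D), and agreement on short strings is evidence rather than a proof.

```python
import re
from itertools import product

def make_dfa(cap):
    """Transition table for states 0..cap plus the death state 'd'."""
    states = list(range(cap + 1)) + ["d"]
    delta = {}
    for q in states:
        if q == "d":
            delta[q, "0"] = delta[q, "1"] = "d"
        else:
            delta[q, "0"] = q - 1 if q > 0 else "d"    # f would go negative
            delta[q, "1"] = q + 1 if q < cap else "d"  # f would exceed cap
    return delta

def accepts(delta, w):
    """Run the DFA from start state 0; state 0 is the only final state."""
    q = 0
    for c in w:
        q = delta[q, c]
    return q == 0

C_DFA, D_DFA = make_dfa(3), make_dfa(1)
C_RE = re.compile(r"(1(1(10)*0)*0)*")
D_RE = re.compile(r"(10)*")

def agree(delta, regex, max_len):
    """True if the DFA and the regex accept the same strings up to max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            w = "".join(chars)
            if accepts(delta, w) != (regex.fullmatch(w) is not None):
                return False
    return True
```

Calling agree(C_DFA, C_RE, 10) and agree(D_DFA, D_RE, 10) checks every string of length at most 10.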
A has an equivalence class for every integer, positive, negative or zero.
This is because membership in A depends only on the value of f, and f is
easily proved to be a homomorphism (that is, f(uv) = f(u) + f(v) for any
strings u and v). So if f(u) = f(v), then f(uw) = f(vw) for any w, and thus
uw and vw are either both in or both out of A. Conversely, if f(u) ≠
f(v), we can find a string w with f(w) = -f(u), and then uw will be in A
and vw will not be in A. Thus u and v are not A-equivalent.
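The distinguishing-suffix argument for A can be illustrated concretely. In this sketch the names cancel and distinguishes_A are hypothetical; cancel(u) builds one particular w with f(w) = -f(u), using 0's when f(u) is non-negative and 1's otherwise.

```python
def f(w):
    """f counts each 1 as +1 and each 0 as -1."""
    return w.count("1") - w.count("0")

def cancel(u):
    """Return a string w with f(w) = -f(u)."""
    k = f(u)
    return "0" * k if k >= 0 else "1" * (-k)

def distinguishes_A(u, v):
    """True if w = cancel(u) puts uw in A and vw outside A."""
    w = cancel(u)
    return f(u + w) == 0 and f(v + w) != 0
```

For instance, u = 11 and v = 1 are separated by w = 00, while u = 10 and v = 01 have the same f-value and so are not separated by this w.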
B has a class for each non-negative integer k, consisting of those
strings w such that f(w) = k and no prefix u of w has f(u) < 0. All
strings u that have a prefix with negative f-value are B-equivalent,
because for each of them uv is not in B for any string v. To prove that the
set of strings with no f-negative prefix is divided into classes by f-value,
note that if u and v are such strings with f(u) = f(v), then again f(uw) = f(vw)
for any w (and likewise f(ux) = f(vx) for every prefix x of w), and thus uw and
vw are both in B or both not in B. If, on the other
hand, u and v are such strings with f(u) ≠ f(v), we may again choose a
string w such that f(w) = -f(u) (for example, w could be 0^f(u), that is, f(u) zeros)
and see that f(uw) = 0 and f(vw) ≠ 0, so that uw is in B and vw is not.
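The description of B's classes can be checked by brute force on short strings. This is only a sketch: b_class, strings and classes_match_suffix_behaviour are hypothetical names, and the check covers bounded lengths, so it supports the argument above without replacing it.

```python
from itertools import product

def in_B(w):
    """w is in B if f(w) = 0 and no prefix has negative f-value."""
    total = 0
    for c in w:
        total += 1 if c == "1" else -1
        if total < 0:
            return False
    return total == 0

def b_class(u):
    """Label 'dead' if some prefix has negative f-value, else f(u)."""
    total = 0
    for c in u:
        total += 1 if c == "1" else -1
        if total < 0:
            return "dead"
    return total

def strings(max_len):
    """All binary strings of length at most max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)

def classes_match_suffix_behaviour(max_len):
    """Same label exactly when no suffix of length <= max_len separates u, v."""
    words = list(strings(max_len))
    for u in words:
        for v in words:
            same_label = b_class(u) == b_class(v)
            same_behaviour = all(in_B(u + w) == in_B(v + w) for w in words)
            if same_label != same_behaviour:
                return False
    return True
```

Suffixes of length at most max_len suffice here because a string u of length at most max_len has f(u) at most max_len, so the separating suffix of f(u) zeros is within the bound.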
C and D have classes corresponding to the states of the DFA's I gave in
the solution to Question 1. The classes 0, 1, 2, and 3 of C correspond to
the four possible values of f(w) for strings w that have no prefix u with
either f(u) < 0 or f(u) > 3. All strings that have such a prefix are
equivalent to each other, since no such string u has uw ∈ C for any
string w. The value of f determines the class by an argument similar to that
given for Languages A and B.
Language D has three Myhill-Nerode classes, similar to those of C and
matching the DFA given above. Justification of this claim is very similar
to that for C.
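The class counts for C and D can also be estimated experimentally by grouping short strings according to which short suffixes lead into the language. In this sketch in_bounded, strings and count_classes are hypothetical names, and cap is the upper bound on f-values (3 for C, 1 for D); the bound max_len = 5 is an assumption that happens to be large enough to reach and separate every class here.

```python
from itertools import product

def in_bounded(w, cap):
    """w is in the language if f(w) = 0 and every prefix v has 0 <= f(v) <= cap."""
    total = 0
    for c in w:
        total += 1 if c == "1" else -1
        if total < 0 or total > cap:
            return False
    return total == 0

def strings(max_len):
    """All binary strings of length at most max_len."""
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            yield "".join(chars)

def count_classes(cap, max_len=5):
    """Number of distinct suffix-acceptance patterns among short strings."""
    words = list(strings(max_len))
    signatures = {
        tuple(in_bounded(u + w, cap) for w in words) for u in words
    }
    return len(signatures)
```

With these bounds, count_classes(3) gives 5 and count_classes(1) gives 3, matching the state counts of the two DFAs.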
Both A and B are context-free. A grammar for A is S → SS,
S → 1S0, S → 0S1, S → ε. A grammar for B is
S → SS, S → 1S0, S → ε. The simplest PDA's for A and
B keep track of the current value of f(w) in unary on the stack. For
language B, the PDA after reading w has f(w) 1's on its stack (on top of a
bottom-of-stack marker) unless a prefix has had negative f-value (in which case
the PDA is in a death state). This condition is easy to maintain by pushing
a 1 when you see a 1 and popping a 1 when you see a 0. The PDA accepts if it
can remove the bottom-of-stack marker at the end of the string.
The PDA for A is similar except that it keeps k 1's on the stack if f(w)
is a positive number k, and keeps k 0's on the stack if f(w) = -k. Both these
conditions are easy to maintain, and we have a bottom-of-stack marker as well.
The PDA accepts if it can remove the marker at the end of the string.
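The two stack disciplines described above can be simulated with an ordinary list. This is a sketch rather than a formal PDA: pda_B and pda_A are hypothetical names, "$" stands in for the bottom-of-stack marker, and "the marker can be removed" is modelled as the stack holding only the marker at the end of the input.

```python
def pda_B(w):
    """Stack holds f(w) 1's above the marker; going negative is fatal."""
    stack = ["$"]                  # bottom-of-stack marker
    for c in w:
        if c == "1":
            stack.append("1")      # push a 1 when we see a 1
        elif stack[-1] == "1":
            stack.pop()            # pop a 1 when we see a 0
        else:
            return False           # death state: a prefix has negative f-value
    return stack == ["$"]          # the marker can be removed, so f(w) = 0

def pda_A(w):
    """Stack holds k 1's if f(w) = k > 0, or k 0's if f(w) = -k < 0."""
    stack = ["$"]
    for c in w:
        if stack[-1] in ("$", c):
            stack.append(c)        # marker or same symbol on top: count grows
        else:
            stack.pop()            # opposite symbol on top: the two cancel
    return stack == ["$"]
```

For example, pda_A accepts 0110 while pda_B rejects it, because its first prefix 0 already has negative f-value.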
Last modified 4 November 2004