Question text is in black, solutions in blue.
Q1: 10 points Q2: 10 points Q3: 10 points Q4: 10 points Q5: 10 points Q6: 10 points Q7: 10 points Q8: 15 points Q9: 40+10 points Total: 125+10 points
The language ABC over the alphabet Σ = {a, b, c} is defined as the set of all strings that contain at least one a, at least one b, and at least one c.
If X and Y are any two languages over the same alphabet, the symmetric difference X Δ Y is defined to be the set of strings that are in either X or Y, but not in both.
TRUE. There are only finitely many strings of length ≤ k: the exact number is the sum for i from 0 to k of |Σ|^i. We can write a regular expression for the singleton set containing each string in L, then union these together to get a regular expression for L.
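As a quick sanity check of the counting formula and the union-of-singletons idea, here is a minimal Python sketch. The finite language `L` and the helper names are hypothetical, chosen only for illustration, and Python's `re` syntax stands in for textbook regular expressions.

```python
import re
from itertools import product

def strings_up_to(alphabet, k):
    """Yield every string over `alphabet` of length at most k."""
    for n in range(k + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def regex_for_finite_language(words):
    """Union of one literal pattern per word (the empty word is an empty alternative)."""
    ordered = sorted(words, key=lambda w: (len(w), w))
    return "(?:" + "|".join(re.escape(w) for w in ordered) + ")"

alphabet = "abc"
k = 3
all_short = list(strings_up_to(alphabet, k))
count_by_formula = sum(len(alphabet) ** i for i in range(k + 1))

# Hypothetical finite language, every string of length <= k.
L = {"", "ab", "cab"}
pattern = re.compile(regex_for_finite_language(L))
accepted = {w for w in all_short if pattern.fullmatch(w)}
```

Here `len(all_short)` agrees with `count_by_formula`, and `accepted` is exactly `L`.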
FALSE. Any L has this property, since any string has a finite length. There exist non-regular languages, so there exist non-regular languages with this property.
FALSE. Let S be the set {ε, a, b, c, ab, ac, bc, abc}. If u and v are any two strings in S, there exists a letter in u but not in v or vice versa. WLOG let there exist a letter x, in u but not in v. Let z be a string containing exactly those letters not in u. Then uz is in L, but vz is not in L because it has no x. Since u and v were arbitrary, we see that S is a set of pairwise L-distinguishable strings. By Myhill-Nerode, any DFA for L must have at least eight states.
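The distinguishability argument can be checked mechanically. The sketch below follows the proof's choice of z (the letters missing from whichever string has a letter the other lacks); the function names are illustrative assumptions, not part of the original solution.

```python
from itertools import combinations

def in_abc(w):
    """Membership in ABC: at least one a, one b, and one c."""
    return all(x in w for x in "abc")

S = ["", "a", "b", "c", "ab", "ac", "bc", "abc"]

def distinguishing_suffix(u, v):
    """Return z such that exactly one of uz, vz is in ABC, or None if the proof's recipe fails."""
    for first, second in ((u, v), (v, u)):
        if set(first) - set(second):           # `first` has a letter that `second` lacks
            return "".join(x for x in "abc" if x not in first)
    return None

witnesses = {(u, v): distinguishing_suffix(u, v) for u, v in combinations(S, 2)}
all_distinguished = all(
    z is not None and in_abc(u + z) != in_abc(v + z)
    for (u, v), z in witnesses.items()
)
```

All 28 pairs receive a witness, which is consistent with the eight-state lower bound.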
TRUE. The grammar S --> S0 | 1S1 | 0 | 1S generates this language. Given
any string 1^k 0 y in the language, I can make it by first using the
rules S --> S0 and S --> 1S1 to make y to the right of the S, making a 1 to the
left for every 1 in y. Then if I need more 1's to the left, I make them with
the rule S --> 1S. Finally I make the 0.
I must also show that every string made by my grammar is in this language.
I must finish by using S --> 0, after using the other three rules some number
of times. So I have a 0, and I made some 1's to the left of it and a string of
0's and 1's to the right of it. If m is the number of times I used the rule
S --> 1S1, the number of 1's to the left of the first 0 in the final string is
at least m, and the number of 1's to the right of it is at most m.
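A brute-force comparison can support both directions of this argument for short strings. The sketch assumes the language is the set of strings 1^k 0 y in which y contains at most k 1's; the question text is not reproduced here, so that reading is inferred from the proof. Because every sentential form contains exactly one S, a form is stored as the pair of terminal strings to its left and right.

```python
from itertools import product

def generated(max_len):
    """All terminal strings of length <= max_len derivable from S."""
    results = set()
    seen = {("", "")}
    frontier = [("", "")]          # (terminals left of S, terminals right of S)
    while frontier:
        left, right = frontier.pop()
        if len(left) + len(right) + 1 <= max_len:
            results.add(left + "0" + right)            # S --> 0
        for nl, nr in ((left, "0" + right),             # S --> S0
                       (left + "1", "1" + right),       # S --> 1S1
                       (left + "1", right)):            # S --> 1S
            if len(nl) + len(nr) + 1 <= max_len and (nl, nr) not in seen:
                seen.add((nl, nr))
                frontier.append((nl, nr))
    return results

def in_language(w):
    """Assumed language: w = 1^k 0 y where y has at most k 1's."""
    if "0" not in w:
        return False
    k = w.index("0")               # number of 1's before the first 0
    return w[k + 1:].count("1") <= k

MAX_LEN = 8
expected = {"".join(t) for n in range(1, MAX_LEN + 1)
            for t in product("01", repeat=n) if in_language("".join(t))}
grammar_matches = generated(MAX_LEN) == expected
```

Agreement up to length 8 is only evidence, not a proof; the argument above is what establishes equality for all lengths.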
TRUE. Here are two proofs:
(1) X Δ Y equals C(X ∪ C(Y)) ∪ C(C(X) ∪ Y), where C is
the complement operator, and we know that the regular languages are closed
under union and complement.
(2) Let X = L(M) and Y = L(N) where M and N are DFA's. Construct a DFA O
whose states are pairs (m, n) with m a state of M and n a state of N. The
start state of O is the pair of start states, and the transition function
uses the function of M on the first component and the function of N on the
second. The final states are pairs (m, n) where exactly one of m and n is
final in their respective machines. This DFA O, when it reads a string w,
goes to a state (m, n) where m and n are the states of M and N respectively
on reading w. O accepts w, therefore, exactly when one of M and N accepts w
and the other doesn't.
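The product construction in proof (2) can be sketched as follows. The dictionary representation of a DFA and the two example machines (even number of a's, and strings ending in b) are assumptions made for illustration only.

```python
from itertools import product

def run(dfa, w):
    """Return True if the DFA accepts w."""
    state = dfa["start"]
    for ch in w:
        state = dfa["delta"][(state, ch)]
    return state in dfa["finals"]

def symmetric_difference_dfa(M, N, alphabet):
    """Product DFA whose final states are pairs with exactly one final component."""
    states = list(product(M["states"], N["states"]))
    delta = {((m, n), ch): (M["delta"][(m, ch)], N["delta"][(n, ch)])
             for (m, n) in states for ch in alphabet}
    finals = {(m, n) for (m, n) in states
              if (m in M["finals"]) != (n in N["finals"])}
    return {"states": states, "start": (M["start"], N["start"]),
            "delta": delta, "finals": finals}

# Hypothetical example machines over {a, b}.
M = {"states": ["even", "odd"], "start": "even", "finals": {"even"},
     "delta": {("even", "a"): "odd", ("odd", "a"): "even",
               ("even", "b"): "even", ("odd", "b"): "odd"}}
N = {"states": ["no", "yes"], "start": "no", "finals": {"yes"},
     "delta": {("no", "a"): "no", ("yes", "a"): "no",
               ("no", "b"): "yes", ("yes", "b"): "yes"}}

O = symmetric_difference_dfa(M, N, "ab")
words = ["".join(t) for n in range(7) for t in product("ab", repeat=n)]
agree = all(run(O, w) == (run(M, w) != run(N, w)) for w in words)
```

Using `!=` on the two membership tests is the "exactly one" condition from the proof.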
The following queue machine's language is the non-CFL
{a^n b^n c^n: n ≥ 0}.
An easy regular expression for the complement of ABC is
(a ∪ b)* ∪ (a ∪ c)* ∪
(b ∪ c)*, because any string not in ABC must contain at most
two different letters.
The simplest regular expression I can think of for ABC itself is the
union of six terms, one for each possible ordering of the first a, first b,
and first c in the string. The first term is
aΣ*bΣ*cΣ*.
It's possible, but tedious, to construct an expression for either
language from its eight-state DFA by state elimination.
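Both expressions can be tested against the definition of ABC on all short strings. This sketch translates them into Python `re` syntax (a character class in place of a union of letters) and assumes the remaining five terms of the ABC expression follow the same pattern as the first, one per ordering of the three letters.

```python
import re
from itertools import permutations, product

SIGMA = "[abc]"

# Six terms, one per ordering x, y, z of the letters: x Σ* y Σ* z Σ*.
abc_pattern = re.compile("|".join(
    f"{x}{SIGMA}*{y}{SIGMA}*{z}{SIGMA}*" for x, y, z in permutations("abc")))

# (a ∪ b)* ∪ (a ∪ c)* ∪ (b ∪ c)*
complement_pattern = re.compile("[ab]*|[ac]*|[bc]*")

def in_abc(w):
    """Direct definition: at least one a, one b, and one c."""
    return all(x in w for x in "abc")

words = ["".join(t) for n in range(7) for t in product("abc", repeat=n)]
abc_ok = all(bool(abc_pattern.fullmatch(w)) == in_abc(w) for w in words)
complement_ok = all(
    bool(complement_pattern.fullmatch(w)) == (not in_abc(w)) for w in words)
```

The check covers every string of length at most 6 over {a, b, c}.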
The strings in L(N) are ε, a, ab, ba, aba, abb, and bab.
Following the version of the construction in Sipser, the start state is {1, 3}, with a-arrow to {2, 3} and b-arrow to {4}. The state {2, 3} has a-arrow to the death state and b-arrow to {2, 3, 4}. The state {4} has a-arrow to {2, 3} and b-arrow to the death state. The state {2, 3, 4} has a-arrow to {2, 3} and b-arrow to itself. The death state, of course, has both arrows to itself. We are done: only five of the 16 potential states are reachable.
The top-down parser has state set {s, p, f}, start state s, only final state f, two transitions (s, ε, ε; p, S$) and (p, ε, $; f, ε), and loops on state p with labels (a, a; ε), (b, b; ε), (c, c; ε), (ε, S; TaT), (ε, T; bS), and (ε, T; c).
We already meet the conditions about the start and final state, so we only have to ensure that each transition either pushes or pops a single character but not both. We break the transition from s to p into two, the first pushing $ and the second pushing S. Two of the loops must be broken up: (ε, S; TaT) into pop-S, push-T, push-a, and push-T, and (ε, T; bS) into pop-T, push-S, and push-b.
The base case is k = 1, for which we can make cac, of length 3(1), by the
derivation S --> TaT --> caT --> cac.
For the inductive case, assume that there is some derivation of a string
w of length 3k from S. Then we can make a string of length 3(k+1) by S
--> TaT --> bSaT --> bSac --> bwac, using the inductive hypothesis for the
last step.
Let f be the function from {S, T, a, b, c}* to the natural numbers that is the homomorphism taking S to 0 and all other letters to 1. The original string S of the derivation has f(S) = 0. Each of the rules preserves f modulo 3: the rule S --> TaT adds 3 and the other two rules preserve f exactly. So the f-value of any string appearing in a derivation from S must be divisible by 3. Any word in L(G) is such a string, and must have length equal to its f-value, which is divisible by 3.
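The invariant can be observed directly by exploring leftmost derivations of G (rules S --> TaT, T --> bS, T --> c) up to a length bound. This is a sketch for illustration: the pruning bound `MIN_YIELD` and the function names are assumptions introduced here, not part of the original solution.

```python
from collections import deque

RULES = {"S": ["TaT"], "T": ["bS", "c"]}
MIN_YIELD = {"S": 3, "T": 1}       # shortest terminal strings: S => cac, T => c

def min_length(form):
    """Length of the shortest terminal string derivable from a sentential form."""
    return sum(MIN_YIELD.get(x, 1) for x in form)

def f(form):
    """The homomorphism from the proof: S counts 0, every other symbol counts 1."""
    return sum(0 if x == "S" else 1 for x in form)

def explore(max_len):
    """Return (terminal words, all sentential forms reached) within the length bound."""
    words, seen, queue = set(), {"S"}, deque(["S"])
    while queue:
        form = queue.popleft()
        i = next((j for j, x in enumerate(form) if x in RULES), None)
        if i is None:              # no nonterminal left: a word of L(G)
            words.add(form)
            continue
        for rhs in RULES[form[i]]:  # expand the leftmost nonterminal
            new = form[:i] + rhs + form[i + 1:]
            if min_length(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words, seen

words, forms = explore(9)
invariant_holds = all(f(x) % 3 == 0 for x in forms)
lengths_ok = all(len(w) % 3 == 0 for w in words)
```

Every sentential form reached has f-value divisible by 3, and so does the length of every word found.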
Using the hint, I will show that L(G) ∩ R is non-regular, where R is the
regular language b*(a ∪ c)*.
I claim that L(G) ∩ R is equal to
X = {b^k c (ac)^{k+1}: k ≥ 0}.
L(G) ∩ R ⊆ X because to make a string in R, I may only use the
rule T --> bS at the beginning of the string. Thus every T that is not the
first one in the string must go to c, and in effect I have the rule
S --> Tac, which I may use only for S --> bSac or S --> cac. Thus any valid
derivation uses the first of these rules k times and then the second once,
getting a string in X. And of course X ⊆ L(G) by this derivation, and
obviously X ⊆ R.
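The claimed equality can be checked on short strings by computing the words of L(G) length by length and filtering with R. The recursion below is an illustrative sketch; `from_S`, `from_T`, and the bound `MAX_LEN` are names introduced here, and R is written in Python `re` syntax.

```python
import re
from functools import lru_cache

@lru_cache(maxsize=None)
def from_S(n):
    """Strings of length n derivable from S via S --> TaT."""
    return frozenset(u + "a" + v
                     for i in range(1, n - 1)
                     for u in from_T(i)
                     for v in from_T(n - 1 - i))

@lru_cache(maxsize=None)
def from_T(n):
    """Strings of length n derivable from T via T --> bS | c."""
    if n == 1:
        return frozenset({"c"})
    return frozenset("b" + w for w in from_S(n - 1))

R = re.compile(r"b*[ac]*")         # b*(a ∪ c)*
MAX_LEN = 15

in_G_and_R = {w for n in range(MAX_LEN + 1) for w in from_S(n) if R.fullmatch(w)}
X = {"b" * k + "c" + "ac" * (k + 1) for k in range(MAX_LEN // 3)}
sets_equal = in_G_and_R == X
```

The two sets coincide for all lengths up to 15, which matches the claim but does not replace the argument above.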
To see that X is non-regular, I can use Myhill-Nerode by observing that
the set {b^i: i ≥ 0} is pairwise X-distinguishable, with
b^i and b^j (i ≠ j) distinguished by c(ac)^{i+1}.
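For a concrete check of these witnesses, the sketch below tests membership in X with a small helper (the name `in_X` and the bound of 7 are assumptions for illustration) and confirms that the suffix c(ac)^{i+1} accepts after b^i but rejects after b^j for each pair i < j.

```python
import re

X_PATTERN = re.compile(r"(b*)c((?:ac)+)")

def in_X(w):
    """Membership in X = {b^k c (ac)^(k+1) : k >= 0}."""
    m = X_PATTERN.fullmatch(w)
    return bool(m) and len(m.group(2)) // 2 == len(m.group(1)) + 1

BOUND = 7
pairs_distinguished = all(
    in_X("b" * i + "c" + "ac" * (i + 1))
    and not in_X("b" * j + "c" + "ac" * (i + 1))
    for i in range(BOUND) for j in range(i + 1, BOUND)
)
```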
Or I could use the Regular Language Pumping Lemma: if p is the alleged
pumping length I take w = b^p c (ac)^{p+1}. The pumped
substring must lie within the initial b's, and pumping either up or down takes
us out of X.
Last modified 15 May 2013