### Restivo

```
ON THE EXPRESSIVE POWER OF
SHUFFLE PRODUCT
Antonio Restivo
Università di Palermo
A very general problem:
Given a basis B of languages, and a
set O of operations, characterize the
family O(B) of languages expressible
from the basis B by using the
operations in O.
The Family REG of Regular Languages:
The basis:
B = { {a} | a ∈ Σ } ∪ {ε}
The operations:
O = {union, concatenation, (Kleene) star}
REG = O(B)
REG is closed also under all Boolean operations
The Family SF of Star-Free Languages
The basis:
B = { {a} | a ∈ Σ } ∪ {ε}
The operations:
O = {Boolean operations , concatenation}
SF = O(B)
Shuffle Product
The shuffle of two words u and v is the set
u ш v = {u1v1…unvn|n≥0, u1…un=u, v1…vn=v}
ab ш ba = {abba, baab, abab, baba}
The shuffle of two languages L and K is the
language
L ш K = ⋃ { u ш v | u ∈ L, v ∈ K }
Expressive Power of the Shuffle
Very little is known about classes of
languages closed under shuffle, and their
study appears to be a difficult problem.
Such a study, apart from its theoretical interest,
is also motivated by applications to the
modeling of process algebras and to
program verification.
The Family INT of Intermixed Languages
The basis:
B = { {a} | a ∈ Σ } ∪ {ε}
The operations:
O = {Boolean operations, concatenation, shuffle}
INT = O(B)
Theorem (Berstel, Boasson, Carton, Pin, R.)
SF ⊊ INT ⊊ REG
[Figure: nested sets, SF inside INT inside REG]
The Problem
Problem 1:
Give a (decidable) characterization of the
family INT.
Proposition
INT is not a variety (in the sense of
Eilenberg)
Remark: REG and SF are varieties
Periodicity
A language L ⊆ Σ* is aperiodic, or non-counting,
if there exists an integer n ≥ 0 such that, for all
x, y, z ∈ Σ*, one has
x y^n z ∈ L ⇔ x y^(n+1) z ∈ L.
Theorem (M.P. Schützenberger)
A regular language L is aperiodic if and only if it is
star-free
Periodicity
The strict inclusion SF ⊊ INT implies that the
shuffle of two star-free languages in general is
not star-free:
«the shuffle creates periodicities»
Problem 2:
Determine conditions under which the shuffle of
two star-free languages is star-free.
Bounded Shuffle
Let k be a positive integer. The k-shuffle of two
languages L1 and L2 is defined as follows:
L1 шk L2 = { u1v1…umvm | m ≤ k, u1…um ∈ L1, v1…vm ∈ L2 }.
Any k-shuffle is called a bounded shuffle.
Theorem (Castiglione, R.)
SF is closed under bounded shuffle
Corollary. The shuffle of a star-free language and
a finite language is a star-free language
Partial Commutations
Let Σ be an alphabet and let θ ⊆ Σ × Σ be a symmetric
and reflexive relation, called (partial) commutation.
Consider the congruence ~θ of Σ* generated by the
set of pairs (ab,ba) with (a,b) ∈ θ.
If L ⊆ Σ* is a language, [L]θ denotes the closure of L by
~θ. L is closed by ~θ if L = [L]θ.
The closed subsets of Σ* are called trace languages.
Partial Commutations
Let L1 and L2 be two languages over the alphabet Σ.
Let Σ1 and Σ2 be two disjoint copies of the alphabet Σ
(colored copies), and σi: Σi → Σ, for i=1,2, the
corresponding bijections. Let L'1 (L'2 resp.) be the subset of
Σ1* (Σ2* resp.) corresponding to L1 (L2 resp.) under the
morphism σ1 (σ2 resp.).
Let Γ = Σ1 ∪ Σ2, consider a partial commutation θ ⊆
Σ1 × Σ2, and let π: Γ* → Σ* be the morphism induced by σ1
and σ2 (delete colours). The θ-product of L1 and L2 is
L1 шθ L2 = π([L'1L'2]θ)
Partial Commutations
bacbcaabca ∈ L1, babcacbab ∈ L2
bacbcaabca babcacbab
θ = {(a,a), (a,b), (b,a), (b,b), (c,a), (c,b), (c,c)}
bbaabcbcaabcacacbab ∈ L1 шθ L2
The θ-product generalizes at the same time
concatenation and shuffle:
If θ = ∅, then L1 шθ L2 = L1L2
If θ = Σ1 × Σ2, then L1 шθ L2 = L1 ш L2
Partial Commutations
Given the partial commutation θ ⊆ Σ1 × Σ2, we
define the partial commutation θ' ⊆ Σ × Σ
as follows (identifying each letter with its colored copies):
(a,b) ∈ θ' ⇔ (a,b) ∈ θ
Theorem (Guaiana, R., Salemi)
Let L1, L2 be languages over Σ, closed under θ'.
If L1, L2 ∈ SF, then L1 шθ L2 ∈ SF.
Corollary. The shuffle of two commutative star-free languages is star-free.
Partial Commutations
If the internal commutation θ' (i.e. the
commutation allowed inside each of the
languages L1, L2) is the «same» as the
external commutation θ (i.e. the
commutations between the letters in L1
and the letters in L2), then the θ-product
preserves star-freeness.
Unambiguous Star-Free Languages
A language L is the marked product of the
languages L0, L1, …, Ln if
L = L0a1L1a2L2 … anLn,
for some letters a1, a2, … , an of Σ.
A marked product L = L0a1L1a2L2 … anLn is
unambiguous if every word of L admits a unique
decomposition u = u0a1u1…anun, with u0 ∈ L0, … , un ∈ Ln.
The product {a,c}*a{ε}b{b,c}* is unambiguous.
Unambiguous Star-Free Languages
SF is the smallest Boolean algebra of languages
which is closed under marked product
The family USF of Unambiguous Star-Free
languages is the smallest Boolean algebra of
languages of Σ* containing the languages of the
form A*, for A ⊆ Σ, which is closed under
unambiguous marked product.
Unambiguous Star-Free Languages
FO : class of languages corresponding to
formulas of first order logic.
FOk : class of languages corresponding to
formulas of first order logic with k variables.
Theorem (McNaughton) SF = FO
Theorem (Immerman, Kozen) FO3 = FO
Theorem (Thérien, Wilke) FO2 = USF
Unambiguous Star-Free Languages
USF ⊊ SF ⊊ INT ⊊ REG
[Figure: nested sets, USF inside SF inside INT inside REG]
Theorem (Castiglione, R.)
If L1, L2 ∈ USF then L1 ш L2 ∈ SF
Cyclic Submonoids
The languages in the class USF correspond to
regular expressions in which the star operation is
restricted to subsets of the alphabet.
The simplest languages not in USF are the
languages of the form L = u*, where u is a word of
length ≥ 2.
Such languages are the cyclic submonoids of Σ*.
We here study the shuffle of cyclic submonoids.
Cyclic submonoids
Theorem (Berstel, Boasson, Carton, Pin, R.)
If a word u contains at least two different letters,
then u* ∈ INT.
A word u ∈ Σ* is primitive if the condition u = v^n, for
some word v and integer n, implies u = v and n = 1.
Theorem (McNaughton, Papert)
The language u* is star-free if and only if u is a
primitive word.
Cyclic submonoids
u = b, v = ab
b* ш (ab)* = (b + ab)* ∈ SF
u = aab, v = bba
((aab)* ш (bba)*) ∩ (ab)* = ((ab)^3)*
⇒ (aab)* ш (bba)* ∉ SF

Problem 3:
Characterize the pairs of primitive words u,v
such that u*ш v* is a star-free language.
Combinatorics on Words
Theorem (Lyndon, Schützenberger)
If u and v are distinct primitive words, then the
word u^n v^m is primitive for all n,m ≥ 2.
Theorem (Shyr, Yu)
If u and v are distinct primitive words, then
there is at most one non-primitive word in the
language u+v+.
Combinatorics on Words
Problem 3 is related to the search for the
powers (non-primitive words) that appear in the
language u+ ш v+.
Denote by Q the set of primitive words.
For u,v,w ∈ Q, let p(u,v,w) be the integer k such
that
(u* ш v*) ∩ w* = (w^k)*
If (u* ш v*) ∩ w* = {ε}, then p(u,v,w) = 0.
Combinatorics on Words
For u,v ∈ Q, define the set of integers
P(u,v) = { p(u,v,w) | w ∈ Q }.
For instance, if u = a^10 b and v = b then
P(u,v) = {0,1,2,5,10}.
Problem 4:
Given two primitive words u, v, characterize the set
P(u,v) in terms of the combinatorial properties of u
and v.
```
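
The word-level definitions in the slides (shuffle of two words, primitivity, and the integer p(u,v,w)) can be explored with a small script. This is a minimal sketch, not part of the talk: the function names `shuffle`, `in_shuffle`, `is_primitive` and `p_bounded` are hypothetical, and `p_bounded` only searches exponents up to an arbitrary bound `max_k`, so a result of 0 means "no power found within the bound", not a proof that the intersection is {ε}.

```python
from functools import lru_cache


def shuffle(u: str, v: str) -> set[str]:
    """All interleavings of u and v that keep the letter order of each word."""
    @lru_cache(maxsize=None)
    def go(i: int, j: int) -> frozenset:
        if i == len(u) and j == len(v):
            return frozenset({""})
        out = set()
        if i < len(u):  # next letter taken from u
            out |= {u[i] + w for w in go(i + 1, j)}
        if j < len(v):  # next letter taken from v
            out |= {v[j] + w for w in go(i, j + 1)}
        return frozenset(out)
    return set(go(0, 0))


def in_shuffle(x: str, y: str, w: str) -> bool:
    """Decide w ∈ x ш y without enumerating the shuffle (interleaving DP)."""
    if len(x) + len(y) != len(w):
        return False
    # After processing row i, reach[j] tells whether w[:i+j] is an
    # interleaving of x[:i] and y[:j].
    reach = [False] * (len(y) + 1)
    for i in range(len(x) + 1):
        for j in range(len(y) + 1):
            if i == 0 and j == 0:
                reach[j] = True
                continue
            from_x = i > 0 and reach[j] and x[i - 1] == w[i + j - 1]
            from_y = j > 0 and reach[j - 1] and y[j - 1] == w[i + j - 1]
            reach[j] = from_x or from_y
    return reach[len(y)]


def is_primitive(u: str) -> bool:
    """A non-empty word is a proper power iff it occurs inside (uu) minus its two end letters."""
    return bool(u) and u not in (u + u)[1:-1]


def p_bounded(u: str, v: str, w: str, max_k: int = 12) -> int:
    """Smallest k in 1..max_k with w^k ∈ u* ш v*, or 0 if none is found (assumed bound)."""
    for k in range(1, max_k + 1):
        target = w * k
        for i in range(len(target) // len(u) + 1):
            rest = len(target) - i * len(u)
            if rest % len(v) == 0 and in_shuffle(u * i, v * (rest // len(v)), target):
                return k
    return 0


if __name__ == "__main__":
    print(sorted(shuffle("ab", "ba")))   # ['abab', 'abba', 'baab', 'baba']
    print(p_bounded("aab", "bba", "ab"))  # 3, matching ((ab)^3)* on the slide
```

With u = a^10 b and v = b, `p_bounded` gives 10 for w = ab and 5 for w = aab, two of the values listed in P(u,v) on the last slide. The membership test uses dynamic programming rather than enumerating `shuffle(u*i, v*j)`, since the number of interleavings grows exponentially with the word lengths.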