The thesis deals with some topics in the theory of formal languages and automata. Specifically, the thesis deals with the theory of context-free languages and the study of their descriptional complexity. The descriptional complexity of a formal structure (e.g., grammar, model of automata, etc) is the number of symbols needed to write down its description. While this aspect is extensively treated in regular languages, as evidenced by numerous references, in the case of context-free languages few results are known. An important result in this area is the Parikh’s theorem. The theorem states that for each context-free language there exists a regular language with the same Parikh image. Given an alphabet Σ = {a1, . . . , am}, the Parikh image is a function ψ : Σ^∗→ N^m that associates with each word w∈Σ^∗, the vector ψ(w)=(|w|_a1, |w|_a2, . . . , |w|_am), where |w|_ai is the number of occurrences of ai in w. The Parikh image of a language L⊆Σ^∗ is the set of Parikh images of its words. For instance, the language {a^nb^n | n ≥ 0} has the same Parikh image as (ab)^∗. Roughly speaking, the theorem shows that if the order of the letters in a word is disregarded, retaining only the number of their occurrences, then context-free languages are indistinguishable from regular languages. Due to the interesting theoretical property of the Parikh’s theorem, the goal of this thesis is to study some aspects of descriptional complexity according to Parikh equivalence. In particular, we investigate the conversion of one-way nondeterministic finite automata and context-free grammars into Parikh equivalent one-way and two-way deterministic finite automata, from a descriptional complexity point of view. We prove that for each one-way nondeterministic automaton with n states there exist Parikh equivalent one-way and two-way deterministic automata with e^O(sqrt(n lnn)) and p(n) states, respectively, where p(n) is a polynomial. Furthermore, these costs are tight. In contrast, if all the words accepted by the given one-way nondeterministic automaton contain at least two different letters, then a Parikh equivalent one-way deterministic automaton with a polynomial number of states can be found. Concerning context-free grammars, we prove that for each grammar in Chomsky normal form with h variables there exist Parikh equivalent one-way and two-way deterministic automata with 2^O(h^2 ) and 2^O(h) states, respectively. Even these bounds are tight. A further investigation is the study under Parikh equivalence of the state complexity of some language operations which preserve regularity. For union, concatenation, Kleene star, complement, intersection, shuffle, and reversal, we obtain a polynomial state complexity over any fixed alphabet, in contrast to the intrinsic exponential state complexity of some of these operations in the classical version. For projection we prove a superpolynomial state complexity, which is lower than the exponential one of the corresponding classical operation. We also prove that for each two one-way deterministic automata A and B it is possible to obtain a one-way deterministic automaton with a polynomial number of states whose accepted language has as Parikh image the intersection of the Parikh images of the languages accepted by A and B.

DESCRIPTIONAL COMPLEXITY AND PARIKH EQUIVALENCE / G.j. Lavado ; tutor: G. Pighizzini, C. Mereghetti ; coordinatore: E. Damiani. DIPARTIMENTO DI INFORMATICA, 2015 Mar 13. 27. ciclo, Anno Accademico 2014. [10.13130/lavado-giovanna-janet_phd2015-03-13].

DESCRIPTIONAL COMPLEXITY AND PARIKH EQUIVALENCE

G.J. Lavado
2015

Abstract

The thesis deals with some topics in the theory of formal languages and automata. Specifically, the thesis deals with the theory of context-free languages and the study of their descriptional complexity. The descriptional complexity of a formal structure (e.g., grammar, model of automata, etc) is the number of symbols needed to write down its description. While this aspect is extensively treated in regular languages, as evidenced by numerous references, in the case of context-free languages few results are known. An important result in this area is the Parikh’s theorem. The theorem states that for each context-free language there exists a regular language with the same Parikh image. Given an alphabet Σ = {a1, . . . , am}, the Parikh image is a function ψ : Σ^∗→ N^m that associates with each word w∈Σ^∗, the vector ψ(w)=(|w|_a1, |w|_a2, . . . , |w|_am), where |w|_ai is the number of occurrences of ai in w. The Parikh image of a language L⊆Σ^∗ is the set of Parikh images of its words. For instance, the language {a^nb^n | n ≥ 0} has the same Parikh image as (ab)^∗. Roughly speaking, the theorem shows that if the order of the letters in a word is disregarded, retaining only the number of their occurrences, then context-free languages are indistinguishable from regular languages. Due to the interesting theoretical property of the Parikh’s theorem, the goal of this thesis is to study some aspects of descriptional complexity according to Parikh equivalence. In particular, we investigate the conversion of one-way nondeterministic finite automata and context-free grammars into Parikh equivalent one-way and two-way deterministic finite automata, from a descriptional complexity point of view. We prove that for each one-way nondeterministic automaton with n states there exist Parikh equivalent one-way and two-way deterministic automata with e^O(sqrt(n lnn)) and p(n) states, respectively, where p(n) is a polynomial. Furthermore, these costs are tight. In contrast, if all the words accepted by the given one-way nondeterministic automaton contain at least two different letters, then a Parikh equivalent one-way deterministic automaton with a polynomial number of states can be found. Concerning context-free grammars, we prove that for each grammar in Chomsky normal form with h variables there exist Parikh equivalent one-way and two-way deterministic automata with 2^O(h^2 ) and 2^O(h) states, respectively. Even these bounds are tight. A further investigation is the study under Parikh equivalence of the state complexity of some language operations which preserve regularity. For union, concatenation, Kleene star, complement, intersection, shuffle, and reversal, we obtain a polynomial state complexity over any fixed alphabet, in contrast to the intrinsic exponential state complexity of some of these operations in the classical version. For projection we prove a superpolynomial state complexity, which is lower than the exponential one of the corresponding classical operation. We also prove that for each two one-way deterministic automata A and B it is possible to obtain a one-way deterministic automaton with a polynomial number of states whose accepted language has as Parikh image the intersection of the Parikh images of the languages accepted by A and B.
13-mar-2015
Settore INF/01 - Informatica
automata and formal languages; context-free languages; semilinear set; Parikh image; descriptional complexity
PIGHIZZINI, GIOVANNI
DAMIANI, ERNESTO
Doctoral Thesis
DESCRIPTIONAL COMPLEXITY AND PARIKH EQUIVALENCE / G.j. Lavado ; tutor: G. Pighizzini, C. Mereghetti ; coordinatore: E. Damiani. DIPARTIMENTO DI INFORMATICA, 2015 Mar 13. 27. ciclo, Anno Accademico 2014. [10.13130/lavado-giovanna-janet_phd2015-03-13].
File in questo prodotto:
File Dimensione Formato  
phd_unimi_R09519.pdf

accesso aperto

Tipologia: Tesi di dottorato completa
Dimensione 691.37 kB
Formato Adobe PDF
691.37 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/263438
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact