Chomsky hierarchy (original) (raw)
The Chomsky hierarchy or Chomsky-Schützenberger hierarchy is a way of classifying formal grammars into four types, with the lower numbered types being more general.
Recall that a formal grammar G=(Σ,N,P,σ) consists of an alphabet Σ, an alphabet N of non-terminal symbols properly included in Σ, a non-empty finite set P of productions, and a symbol σ∈N called the start symbol. The non-empty alphabet T:=Σ-N is the set of terminal symbols. Then G is called a
Type-0 grammar
if there are no restrictions on the productions. Type-0 grammar is also known as an unrestricted grammar, or a phrase-structure grammar.
if the productions are of the form uAv→uWv, where u,v,W∈Σ* with W≠λ, and A∈N, or σ→λ, provided that σ does not occur on the right hand side of any productions in P. As A is surrounded by words u,v, a type-1 grammar is also known as a context-sensitive grammar.
if the productions are of the form A→W, where A∈N and W∈Σ*. Type-2 grammars are also called context-free grammars, because the left hand side of any productions are “free” of contexts.
if the productions are of the form A→u or A→uB, where A,B∈N and u∈T*. Owing to the fact that languages generated by type-3 grammars can be represented by regular expressions
, type-3 grammars are also known as regular grammars.
It is clear that every type-i grammar is type-0, and every type-3 grammar is type-2. A type-2 grammar is not necessarily type-1, because it may contain both σ→λ and A→W, where λ occurs in W. Nevertheless, the relevance of the hierarchy has more to do with the languages generated by the grammars. Call a formal language a type-i language if it is generated by a type-i grammar, and denote ℒi the family of type-i languages. Then it can be shown that
where each inclusion is strict.