develooper Front page | perl.perl6.language | Postings from May 2001

A proposal for more powerful text processing to be built in to Perl: Flex and Pushdown Expressions.

Thread Next
From:
Daniel S. Wilkerson
Date:
May 10, 2001 17:57
Subject:
A proposal for more powerful text processing to be built in to Perl: Flex and Pushdown Expressions.
Message ID:
3AFB23C0.B4F3437D@digital-integrity.com
One of the great strengths of Perl is that, more than any other
language I know, it helps you cross between the "data" space and the
"program" space: eval(), built in regex notation, etc.  Even with the
considerable expressive power already at our disposal, I would like to
suggest that there might be a trifling additional feature or two that
perhaps would round out the language by their addition.  My apologies
if these have been discussed before.

Flex - Put all of flex right into Perl.  Flex is simply an event
engine (-compiler) for driving calls against code according to regular
expression matching events.  While it is often convenient to have the
the flow of control expressed in an imperative way (do this; do this;
do this) as Perl already has, the event-driven regular expression
matching that lexers offer would allow one to easily write
mini-language interpreters or pre-processors in Perl very easily.

I propose a syntax similar to flex, where you can specify a block of
rules and then call a function like yylex() to jump into the event
engine for parsing a stream or a string.  A rule can always return to
jump out.  There could be a package for the rules to all have access
to shared state, or if everything is going to be object-oriented, a
common object for them to share state in (so it could be thread-safe
and have partially-parsed many different strings at once).

Pushdown expressions - Slightly more powerful "regular" expressions.
The term regular expression is from the theory community, but
implementations long ago have exceeded the power of theoretical
regular expressions, the recognition power of which is equivalent to the
set
of languages that can be recognized by Deterministic Finite Automata.

We might as well extend them to "Pushdown Expressions", which would
equal the power of Pushdown Automata (a higher layer in Chomsky's
hierarchy, but not as high as Turing Machines).  One very useful step
in this direction would be to add a modifier at the end of a regex:
p (for pushdown).

This modifier would require that all parentheses-like literals: () []
{} pair-up in order for the trailing one to match.  That is, suppose
you are trying to write a pre-processor for a language like Java (I
did this) the major difficulty that would prevent you from using
regular expressions is that it is impossible to skip over
arbitrarily-nested regions of parentheses.  I propose that one simply
write
$a =~ /\(.*\)/p to force the closing ) to match the opening one only
after nesting depth is taken into account.

Daniel Wilkerson


Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About