Unedited minutes from the p5p meeting 7/23/01
From: Adam Turoff
Date: July 24, 2001 10:20
Subject: Unedited minutes from the p5p meeting 7/23/01
Message ID: 20010724132052.D12181@panix.com
[ NOTE: The first half of Jarkko's discussion of new features in 5.8 is
missing; the discussion of what to do about XML::RPC and Core Modules
will follow in a separate message. All typos and such are
purely my fault; apologies for any errors, inaccuracies or unclear
passages. Z. ]
-------
Agenda:
1) gsar on maintenance
2) jhi on devel
3) gnat on perl6
4) dan on perl6 internals
5) hugo on RE innards
6) japhy on Friedl's stuff
7) simon on magic/B/Unicode
8) jeff okamoto on ipv6
9) core modules -- let the fight BEGIN!
-------------------------------
jhi on devel:
-------------
[....]
microperl: smaller than miniperl; Perl using features found only in ANSI C
currently not used for anything, but useful for experimentation; possible
to do configuration in Perl
parameter reordering from printf
much better SOCKS (firewall proxy library thingy)
better marker for regex compilation errors
Regex debugger backend cooperation between ActiveState and mjd; general
backend for writing debuggers for the regex engine, so anyone can use it
PerlIO: multiarg open
numerous memory leaks plugged after gsar and Alan Burlison agree on what
a memory leak means
more overridable keywords; close to overriding all of the keywords
(gsar: I don't believe all keywords will be overridable, e.g. print, split;
some of them are fixable, but what are the needs...)
now we have an untie method
we have 'use English "-no_match_vars"' ($&, $', $` not included)
some beginnings of source compilation support; the fun part was getting
configure to .... [do something really strange, but useful]. Configure
supports -Dmksymlinks; allows a readonly source code tree
can microperl be useful with cross compilation? (not quite yet; 64-bit
issues)
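A minimal sketch of three of the items above (printf parameter reordering, multiarg open, and English without the match variables), assuming perl 5.8 or later. The in-memory target for open is only there to keep the sketch self-contained, and the variable names are illustrative:

```perl
use strict;
use warnings;
use English qw(-no_match_vars);   # long names, but no $&, $` or $'

# Explicit argument indexes let a format string reorder its parameters.
my $greeting = sprintf('%2$s, %1$s!', 'world', 'hello');

# Three-argument open keeps the mode separate from the target. With PerlIO
# the target can also be a reference to a scalar (an in-memory file).
my $data = "first line\nsecond line\n";
open(my $fh, '<', \$data) or die "cannot open in-memory file: $OS_ERROR";
my @lines = <$fh>;
close($fh);

print "$greeting\n";
print scalar(@lines), " lines read\n";
```

Single quotes around the format matter here: in a double-quoted string perl would try to interpolate `$s`.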
[[ What's Old in 5.8]]
Deprecations:
- EQ
- pseudohashes (Larry: fine)
- suidperl
can snoop a directory tree based on error messages coming from
suidperl
- package;
- bless referent, referant (Simon: this is handy; jhi: you keep saying
that; you can always use quotes)
- self binding arrays and hashes (exercises some less known code paths)
could be possible to fix this, but no volunteers
- obscure regex features (not from Ilya, but from POSIX) [:alpha:],
features defined by posix such as [:anything-equivalent-to-A:]
(all diacriticals with A); could do this with Unicode, but unclear
how to do this with composition and decomposition; no one has requested
this (except Simon); this is not implemented yet, come back later
related to collation: handle this group of characters as one single
character; apparently so obscure that no one has requested it yet.
[[ Future: Things I Hopefully Don't Have To Care About (except 5.8.1) --jhi]]
- Artur's continuing work on ithreads; may lead to a stable threading
implementation; iThreads: main intent was to have concurrent
interpreters (and a pseudo-fork is a natural result); fundamental to
apache2
- continuing work on bignum
- missing pieces of unicode (composition, decomposition, collation, regex
features proposed by Unicode; Unicode standard refers to Perl)
- inline; Brian thinks it shouldn't be part of the core ever, but if there
are hooks needed in Perl, that would make the cooperation more beautiful
than it is now
- hopeful timetable is 3-4 months
Questions:
what's life after 5.8? new pumpkin? 5.8.1 would be jhi, after that, Hugo
will there be a 5.10? what else do we need to do? :-) No major
requirements; could be a phased path towards adoption to 6.0;
Damian: corporate america will be very thankful for more stability and
fewer features; split between programmers who want to use new features,
but 5.004/5.005 is in very common use; many will skip 5.6, and go
straight to 5.8 if it is stable; don't want us to release a new version
they don't have time to support;
RJRay: what redhat goes through before they put something into an official
release is pretty stringent (modulo before Randy started :-); burn-in
periods where they run candidate servers for 3-4 weeks before they
put it on distribution CDs; that's why the product is a couple of months
behind based on what's available at release time; a lot of other
packages that are part of RH7.1 count on Perl working reliably and
consistently (and not worthwhile the 2-3 months of QA)
-------------------------
gsar on 5.6.x
- won't be doing much work on 5.6.x, except for minor patches from 5.8
- probably won't get a 5.6.2 until 5.8 released
- anyone else who would like to take on the maintenance of 5.6.x?
-------------------------
gnat on perl6
[ based on a talk from yapc::montreal ]
started with a burst of fire and enthusiasm last year; learned a lot from
the experience
watched very closely how mhonarc project launched
rfc project confirmed the wisdom of letting there be only one language
designer; closed down the RFC process to stop people having strange ideas
right now going in cycles: lots of discussion after each Apocalypse (not
final judgements on how the language should be)
parallel to language design, the internals design has been progressing with
Dan at the helm;
parser/runtime are decoupled, you should be able to plug in another parser,
so long as it compiles down to whatever runs in the runtime; won't compile
straight to .net; (AS has found that perl doesn't run on .net very well,
designed only for C#, we don't want a Perl6 that's C# under the hood)
probably won't be done as cleanly as we had hoped
ending up with our own runtime, compiler system (with pluggable parsers,
including a perl5 parser); or even python: take python code and turn it
into perl6 bytecode; make this a deliberate feature of the design to allow
the language to be malleable
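A minimal sketch of the pluggable-parser idea, under heavy assumptions: the three-op instruction set, the register layout and both toy front ends are hypothetical and have nothing to do with the actual perl6 design. It only shows two parsers compiling down to the same thing the runtime executes:

```perl
use strict;
use warnings;

# Runtime: a tiny register machine. Each instruction is
# [opcode, destination register, operand, operand].
sub run {
    my (@program) = @_;
    my @reg = (0) x 4;
    for my $ins (@program) {
        my ($op, $dst, $x, $y) = @$ins;
        if    ($op eq 'set') { $reg[$dst] = $x }
        elsif ($op eq 'add') { $reg[$dst] = $reg[$x] + $reg[$y] }
        elsif ($op eq 'mul') { $reg[$dst] = $reg[$x] * $reg[$y] }
        else                 { die "unknown opcode '$op'" }
    }
    return $reg[0];
}

# Both front ends hand the same instruction shape to the runtime.
sub emit {
    my ($op, $x, $y) = @_;
    return (['set', 1, $x], ['set', 2, $y], [$op, 0, 1, 2]);
}

# Front end 1: infix source such as "2 + 3" or "2 * 3".
sub parse_infix {
    my ($src) = @_;
    my ($x, $sym, $y) = $src =~ /^\s*(\d+)\s*([+*])\s*(\d+)\s*$/
        or die "cannot parse '$src'";
    return emit($sym eq '+' ? 'add' : 'mul', $x, $y);
}

# Front end 2: prefix source such as "(add 2 3)" or "(mul 2 3)".
sub parse_prefix {
    my ($src) = @_;
    my ($op, $x, $y) = $src =~ /^\s*\(\s*(add|mul)\s+(\d+)\s+(\d+)\s*\)\s*$/
        or die "cannot parse '$src'";
    return emit($op, $x, $y);
}

my $sum     = run(parse_infix('2 + 3'));
my $product = run(parse_prefix('(mul 2 3)'));
print "$sum $product\n";
```

The runtime never sees source text, so swapping or adding a front end leaves `run` untouched.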
-------------------------
dan on perl6 internals
design as it stands at the moment:
- interpreter looks like a software CPU; complete with register sets,
executable code
- everyone's VM does this, but no one admits it
- assuming the language allows us to do it, we should be able to draw
a lot on literature on optimizing on real CPUs; don't know very much
about optimizing code for a stack machine; it all looks like forth
more or less, but we could do a little bit better
- stackless python is misnamed; perl5 was "stackless" from day 1
- parser, compiler and runtime will all be conceptually separate entities;
let you yank out any of these modularly
- perl6 for small platforms won't have a parser (no string eval, do won't
work except for bytecompiled programs) could bytecompile a Perl parser
in Perl and run that on a visor; be able to hand-craft op trees
- perl6 will probably ship with a perl assembler before we have an
interpreter (write for the VM); easier than writing a parser for perl
- let people experiment; allow other people to write programs that run
on our platform;
- start by writing B::Deparse first
- PR: Compiled Perl on small devices
- when designing the interpreter part, the bytecode we generate will
translate cleanly into other forms (JVM, .NET, Machine code); perl5->C
are a bunch of function calls into Perl; why do that for integer math?
not the primary focus, but a concern
- semi-serious suggestion: MMIX? Tools available
- 64 {int|float|string|pmc} registers
- easier to work with bounded resources, not unbounded register sets
- 64 chosen out of the air, most of the optimization algorithms get
comfortable with mid-30 register count, so should make it easier to
optimize (IA-64 has 64 registers with an infinite register architecture)
- regex engine won't be standalone; will compile down to a chunk of
bytecode that will get executed; to our benefit: when we translate to
native executable code, maps nicely to most processors instead of going
through a state machine (except for dynamic regexes with variable
substitutions)
- garbage collection: full garbage collection, no reference counting in
perl6 (except possibly for creating explicit references: 2 bits for
reference counts (0, 1, 2, many)) [ possibly memory savings with Perl5
if such a scheme were adopted for Perl5 reference counting ]
refcounting is pricy, and a lot of time is spent in the memory system;
guaranteed object destruction; garbage collection; not tied together
- vtbls: we won't be doing things to variables, we'll be asking variables
to do things to themselves; opcodes worried about control flow; this
allows us to cut down the size of the codepath a lot; won't have to have
a variety of code checks; variable will know what state it's in:
get_string will return string and will upgrade only when necessary
if a variable wants to do anything fancy, it can, and expense isn't
shared by everyone;
overloading is simpler; safe threaded variables; no overhead to check to
see if a variable is shared (use a shared variable vtbl; only shared
variables pay the cost of shared access)
update vtbl on a per-object basis and on the fly: core scalar class
should have 9-10 vtbls associated on it (determined by state: simple int
vs. simple string) , objects can override their own vtbls or multiple
vtbls (based on state); each variable will have its own private vtbl
Q: If we're putting it all in the optree into variables, how does this
impact the bytecode? Interpreter will have to come with a set of vtbls;
interpreter may have a set of vtbls built into it; bytecode may load
extra vtbls on the fly; sort of like C# or Java classes
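A minimal sketch of the vtbl idea, written in Perl 5 for illustration only. The names (`%INT_VTBL`, `%STR_VTBL`, `new_int`, `call`) are hypothetical and not part of any perl6 design. A variable carries a pointer to a table of behaviours for its current state, and get_string swaps the table when it upgrades:

```perl
use strict;
use warnings;

# One table of behaviours per variable state.
my (%INT_VTBL, %STR_VTBL);

%INT_VTBL = (
    get_int    => sub { $_[0]{value} },
    get_string => sub {
        my ($var) = @_;
        # Upgrade on demand: cache the string form and switch tables,
        # so later calls take the cheap string path.
        $var->{string} = "$var->{value}";
        $var->{vtbl}   = \%STR_VTBL;
        return $var->{string};
    },
);

%STR_VTBL = (
    get_int    => sub { $_[0]{value} },
    get_string => sub { $_[0]{string} },
);

sub new_int { return { value => $_[0], vtbl => \%INT_VTBL } }

# The caller never inspects the variable's state; it asks the variable.
sub call {
    my ($var, $method) = @_;
    return $var->{vtbl}{$method}->($var);
}

my $v      = new_int(42);
my $before = $v->{vtbl} == \%INT_VTBL ? 'int' : 'string';
my $text   = call($v, 'get_string');
my $after  = $v->{vtbl} == \%INT_VTBL ? 'int' : 'string';
print "$before -> $after, value '$text'\n";
```

Only a variable that is actually asked for its string form pays for the upgrade; plain integers keep using the smaller table.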
-------------------------
hugo on regex internals:
- identified the bits involved in parsing regexes, and the bits outside
of regcomp and regexec in executing regexes
- function summaries would be cool, but better written down
- possible to rewrite the regex engine to use op trees, peek-ahead,
and avoid opcodes; single perl op, but what your regex compiles into is
effectively an optree (a single program which is done to shave some
cycles; complexity it brings in makes it impossible to maintain/debug,
and may be costing us time; something interesting enough to try)
- even if perl6 is just around the corner, there's going to still be a
lifecycle of perl5 in general, including 5.6/5.8
- far too big a change to retrofit into 5.6
- how long do we expect 5.8 out there?
- Dan: best to be pushed off into a module, especially if Ilya's pluggable
regex engine works; you're not changing the core, if it doesn't work,
you haven't changed how the regex engine works (The pluggable regex engine
*does* work; used by ActiveState); hooks are in there now
- could be running in 5.6.2
- hugo & mjd have been looking at how the call to regmatch calls itself
recursively on almost completely simple regexes: causes stack blowouts;
give it its own stack to avoid blowing out the C stack (x* is
processed recursively, 'x+foo' fails as many times as x+ succeeds if not
followed by a 'foo')
- large parts of regmatch doing the same thing different ways [...]
- mjd's brief intro to regex engine is in the proceedings
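A minimal sketch of the "give it its own stack" idea, assuming a toy pattern language (literal characters, each optionally followed by *). This is not how regmatch works; `compile` and `match_at_start` are hypothetical names. Pending backtracking states live in an ordinary array instead of in nested function calls:

```perl
use strict;
use warnings;

# Compile a toy pattern into a list of [character, starred] pairs.
sub compile {
    my ($src) = @_;
    my @pat;
    while ($src =~ /\G(.)(\*?)/gs) {
        push @pat, [$1, $2 eq '*' ? 1 : 0];
    }
    return \@pat;
}

# Match the pattern against the start of $str without recursion.
sub match_at_start {
    my ($pat, $str) = @_;
    my @stack = ([0, 0]);    # pending (pattern index, string index) states
    while (@stack) {
        my ($pi, $si) = @{ pop @stack };
        return 1 if $pi == @$pat;    # pattern exhausted: success
        my ($ch, $star) = @{ $pat->[$pi] };
        my $hit = $si < length($str) && substr($str, $si, 1) eq $ch;
        if ($star) {
            push @stack, [$pi + 1, $si];            # fallback: stop repeating
            push @stack, [$pi, $si + 1] if $hit;    # greedy: tried first
        }
        elsif ($hit) {
            push @stack, [$pi + 1, $si + 1];
        }
    }
    return 0;
}

my $pat  = compile('x*foo');
my $long = ('x' x 100_000) . 'foo';
print match_at_start($pat, $long) ? "matched\n" : "no match\n";
```

The depth of the pending-state list still grows with the input, but it grows on the heap rather than on the C stack.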
-------------------------
japhy on Friedl's stuff
- 3 petitions about the regex engine:
- unicode charclass subtraction
- naming captured subheaders
- sexeger stuff (/r reverses a regex in VB)
- the way unicode handles charclass subtraction is disgusting; - is
overloaded when between 2 other characters: proposed syntax for a more
pleasant alternative, posix implementation of set operators doesn't
exist in Perl yet:
Proposal 1:
[X[&Y]] : intersection
[X[^Y]] : subtraction
Proposal 2: (nested charclasses)
[X] == [[[X]]]
[[X] && [Y]] intersection (maybe w/ or w/o spaces around &&)
[[X] && [^Y]] subtraction
Current kludge (from epp):
[^\D5] double negative
(?!5)\d takes logic outside of the charclass
(?![aeiou])[a-z] vs. [[a-z]&&[^aeiou]]
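A minimal sketch of the current kludges, runnable on any perl 5. Note that the lookahead needs a bracketed class, (?![aeiou]), to reject any single vowel; without the brackets it would only reject the literal sequence "aeiou". Variable names are illustrative:

```perl
use strict;
use warnings;

my @digits = (0 .. 9);

# Double negative: "not (a non-digit or 5)" leaves the digits other than 5.
my @double_negative = grep { /^[^\D5]$/ } @digits;

# Negative lookahead: reject 5 first, then require a digit.
my @lookahead = grep { /^(?!5)\d$/ } @digits;

# Same trick with a class inside the lookahead: lowercase consonants.
my @consonants = grep { /^(?![aeiou])[a-z]$/ } ('a' .. 'z');

print "@double_negative\n";
print "@lookahead\n";
print scalar(@consonants), " consonants\n";
```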
Named Captures:
/age::(?<age>\d+)/ issues: autovivification? my? $ missing? scalars?
lvalue? (?<$s->age>)
Match last of something in a string: (perl isn't as optimized whitespace)
removing trailing whitespace is easier by reversing, removing leading
and reversing again
/(\d+)\D*$/ (match last string of digits; very slow with backtracking)
"123A456B" backtracking to the 3 is pointless, because it matched \d,
so it can't match the \D
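A minimal sketch of the reversal trick described above, done by hand with reverse() rather than with any /r flag. Reverse the string, anchor the mirrored pattern at the start, then reverse the capture back. Variable names are illustrative:

```perl
use strict;
use warnings;

my $str = "123A456B";

# Forward: the engine starts on "123" and has to backtrack before it
# finds the run of digits nearest the end.
my ($forward) = $str =~ /(\d+)\D*$/;

# Reversed: the last run of digits becomes the first one, so the
# mirrored pattern can be anchored at the start of the string.
my $rev         = scalar reverse($str);     # "B654A321"
my ($found)     = $rev =~ /^\D*(\d+)/;      # "654"
my $last_digits = scalar reverse($found);   # "456"

# Trailing whitespace the same way: strip it as leading whitespace.
my $padded  = "text   \t ";
my $trimmed = scalar reverse($padded);
$trimmed =~ s/^\s+//;
$trimmed = scalar reverse($trimmed);

print "$forward $last_digits [$trimmed]\n";
```

Both approaches return the same digits; the difference is only in how much work the engine does to get there.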
:: LARRY: /x is the default in regexes in perl6 (but doesn't apply to
charclasses) ::
- mjd: doesn't need a core patch: overload qr//; Jeff's done all of this
already (but incomplete)
- problem with regex: can't return a recursive datastructure
- in ruby or python: the discussion is easier because regex return objects
but we don't have methods (save $1, $2, $3)
- can /r be implemented by overloading qr//? Yes
-------------------------
simon on magic/B/Unicode
- have people used the unicode stuff available today? specifically IO
Filter stuff
- problem: perl using unicode when not asked to do so in 5.6.x
- unicode::decompose performs normalization; simon wants it in the
core at some point;
- how do we localize? is there any way to get there from here?
[[ deep magic discussion about localized variables ]]
- B: See Simon's tutorial online, or here at the conference
- B::Generate: write your optimizer in pure perl: the optree returned
is readwrite, not readonly. (Should be named D::Generate);
self-modifying code; Lisp-ish code==data;
- turning off constant folding: patch on p5p (for use with B::Deparse)
- Unicode filenames from readdir: will it ever turn on utf8 flags from
what you get from readdir? No support at the moment...could be turned
on through environment variables (Mystical UTF8_LOCALE variables set
properly...); almost can do that now with Linux; hack with taint
(tainting taint)
- don't assume utf8 by default is a BAD idea; but the UTF8... environment
variable declares the user's expectation on how I/O should be handled
with ALL apps on Linux (or wherever)
-------------------------
jeff okamoto on ipv6
- one of those issues that may drive us to 5.10
- apache is ipv6 capable
- how do we make mod_perl talk ipv6? hpux has an ipv6 kit; tru64, freebsd
do; vms doesn't; windows does;
- how is perl going to deal with this? perl scripts will have to change
because they're making assumptions on the format of an IP address?
- if you're given an IPv4 address, call IPv4 routines? Or mask in an IPv6
addr? Determine at configure time? Be dynamic and configure at
runtime (good with NFS mounts; might want to manage with separate
executables...)?
- use ipv6; pragma might be the way to go
- is the scenario likely where a system has ipv6 and ipv4 concurrently?
Very likely; most of the nodes that are ipv6 capable retain ipv4
compatibility
- connect/bind/accept should work with ipv4 in an ipv4 way on an ipv6
machine with ipv4 compatibility
- most ipv4 vestiges won't be too big a deal; still hardcoded references
to ipv4 structures, mostly in VMS; socket extension needs major work
for ipv6: structures have changed, library calls have changed, etc.
- the socket6 extension might have been written with the assumption that
if you want ipv4 it will automatically translate to ipv6.
- Perl? What about xsubs? struct in_addr, etc. compatibility issues.
- when is that likely to happen? as soon as one platform vendor decides
to create problems and drop ipv4 support
- pretty common outside of the us to ship with ipv6 (China)
- jeff willing to work on ipv6 stuff, but still issues to be resolved with
when/how support is built/configured
- reverse name translation has annoying flags to determine string
representation of a numeric address; lots of issues
- perl6 should address this; designing the structure for perl6, that may
help us retrofit it into perl5
- perl6-language is probably a good place to start
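A minimal sketch of why scripts that assume the format of an IP address will have to change, and of what masking an IPv4 address into an IPv6 one looks like (the IPv4-mapped form ::ffff:a.b.c.d). It uses only pack/unpack, so it needs no IPv6 support from the platform; it assumes perl 5.8 or later for the grouped unpack template, and `ipv4_bytes` and `ipv4_mapped` are hypothetical helper names:

```perl
use strict;
use warnings;

# Pack a dotted-quad IPv4 address into its 4-byte network form.
sub ipv4_bytes {
    my ($dotted) = @_;
    return pack('C4', split(/\./, $dotted));
}

# Embed those 4 bytes in a 16-byte IPv4-mapped IPv6 address:
# 80 zero bits, 16 one bits, then the IPv4 address.
sub ipv4_mapped {
    my ($dotted) = @_;
    return pack('x10 n a4', 0xffff, ipv4_bytes($dotted));
}

my $v4 = ipv4_bytes('192.0.2.1');
my $v6 = ipv4_mapped('192.0.2.1');

# Code that assumes "an address is 4 bytes" breaks on the second value.
my $v4_text = join('.', unpack('C4', $v4));
my $v6_text = join(':', unpack('(H4)8', $v6));
printf "%d bytes: %s\n", length($v4), $v4_text;
printf "%d bytes: %s\n", length($v6), $v6_text;
```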
----
Core modules discussion [mailed separately]