Unedited minutes from the p5p meeting 7/23/01

From: Adam Turoff
Date: July 24, 2001 10:20
Subject: Unedited minutes from the p5p meeting 7/23/01
Message ID: 20010724132052.D12181@panix.com

[ NOTE: The first half of Jarkko's discussion of new features in 5.8 is
  missing; the discussion of what to do about XML::RPC and Core Modules
  will follow in a separate message.  All typos and such are
  purely my fault; apologies for any errors, inaccuracies or unclear
  passages.  Z. ]

-------


Agenda:

	1) gsar on maintenance
	2) jhi on devel
	3) gnat on perl6
	4) dan on perl6 internals
	5) hugo on RE innards
	6) japhy on Friedl's stuff
	7) simon on magic/B/Unicode
	8) jeff okamoto on ipv6
	9) core modules -- let the fight BEGIN!

-------------------------------

jhi on devel:
-------------

[....]

microperl: smaller than miniperl; a Perl built using only features found
in ANSI C; currently not used for anything, but useful for experimentation;
possible to do configuration in Perl

parameter reordering from printf
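
A minimal sketch of what parameter reordering looks like, assuming the item
refers to explicit argument indexes in format strings (the %N$ form described
in the sprintf documentation).  The strings are made up for illustration.

```perl
use strict;
use warnings;

# Single quotes keep Perl from interpolating "$s" inside the format.
# %2$s means "the second argument, as a string"; %1$s means the first.
my $line = sprintf('%2$s, %1$s', 'Larry', 'Wall');

print "$line\n";   # prints "Wall, Larry"
```

This is mostly handy for translated messages, where word order differs
between languages but the argument list stays the same.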

much better SOCKS (firewall proxy library thingy)

better marker for regex compilation errors

Regex debugger backend cooperation between ActiveState and mjd; general
backend for writing debuggers for the regex engine, so anyone can use it

PerlIO: multiarg open
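
A small sketch, assuming "multiarg open" means the form of open where the
mode is passed as its own argument.  To keep the example self-contained it
opens an in-memory scalar rather than a real file, which needs a
PerlIO-enabled perl (5.8 or later); the variable names are illustrative.

```perl
use strict;
use warnings;

my $data = "first\nsecond\n";

# Mode ('<') and target are separate arguments, so nothing in the
# target can be misread as a mode character.
open(my $in, '<', \$data) or die "cannot open in-memory handle: $!";
my @lines = <$in>;
close $in;

print scalar(@lines), " lines read\n";
```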

numerous memory leaks plugged after gsar and Alan Burlison agreed on what
a memory leak means

more overridable keywords; close to overriding all of the keywords
(gsar: I don't believe all keywords will be overridable, e.g. print, split;
some of them are fixable, but what are the needs...)

now we have an untie method

we have use English '-no_match_vars' ($&, $', $` not included)
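
For reference, the import flag is spelled -no_match_vars in the English
module's documentation.  A minimal sketch of using the long variable names
without pulling in the match variables; the sample text is made up.

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

my $text = 'perl 5.8 and 5.6';

# Use capture groups instead of $& / $MATCH to get at the matched text.
my ($major, $minor) = $text =~ /(\d+)\.(\d+)/;

# Other long names still work as usual.
my $same_pid = ($PROCESS_ID == $$) ? 1 : 0;

print "version $major.$minor, pid check $same_pid\n";
```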

some beginnings of source compilation support; the fun part was getting
configure to .... [do something really strange, but useful].  Configure
supports -Dmksymlinks; allows a readonly source code tree

can microperl be useful with cross compilation?  (not quite yet; 64-bit
issues)

[[ What's Old in 5.8]]

Deprecations:

 - EQ

 - pseudohashes (Larry: fine)

 - suidperl
 	can snoop a directory tree based on error messages coming from
	suidperl

 - package;

 - bless referent, referant (Simon: this is handy; jhi: you keep saying
   that;  you can always use quotes)

 - self binding arrays and hashes (exercises some lesser-known code paths)
 	could be possible to fix this, but no volunteers

 - obscure regex features (not from Ilya, but from POSIX)  [:alpha:],
	features defined by posix such as [:anything-equivalent-to-A:]
	(all diacriticals with A); could do this with Unicode, but unclear
	how to do this with composition and decomposition; no one has requested
	this (except Simon);  this is not implemented yet, come back later

	related to collation: handle this group of characters as one single
	character; apparently so obscure that no one has requested it yet.

 [[ Future: Things I Hopefully Don't Have To Care About (except 5.8.1) --jhi]]

 - Artur's continuing work on ithreads; may lead to a stable threading
   implementation; iThreads: main intent was to have concurrent
   interpreters (and a pseudo-fork is a natural result).  Fundamental to
   apache2

 - continuing work on bignum

 - missing pieces of unicode (composition, decomposition, collation, regex
    features proposed by Unicode; Unicode standard refers to Perl)

 - inline; Brian thinks it shouldn't be part of the core ever, but if there
   are hooks needed in Perl, that would make the cooperation more beautiful
   than it is now

 - hopeful timetable is 3-4 months
 
Questions: 

what's life after 5.8?  new pumpkin?  5.8.1 would be jhi, after that, Hugo

will there be a 5.10?  what else do we need to do?  :-)  No major
requirements; could be a phased path towards adoption of 6.0

Damian: corporate america will be very thankful for more stability and
	fewer features; split between programmers who want to use new features,
	but 5.004/5.005 is in very common use; many will skip 5.6, and go
	straight to 5.8 if it is stable; don't want us to release a new version
	they don't have time to support; 

RJRay: what redhat goes through before they put something into an official
	release is pretty stringent (modulo before Randy started :-); burn-in
	periods where they run candidate servers for 3-4 weeks before they
	put it on distribution CDs; that's why the product is a couple of months
	behind based on what's available at release time; a lot of other
	packages that are part of RH7.1 count on Perl working reliably and
	consistently (and not worthwhile the 2-3 months of QA)

-------------------------

gsar on 5.6.x

- won't be doing much work on 5.6.x, except for minor patches from 5.8

- probably won't get a 5.6.2 until 5.8 released

- anyone else who would like to take on the maintenance of 5.6.x?

-------------------------

gnat on perl6

[ based on a talk from yapc::montreal ]

started with a burst of fire and enthusiasm last year; learned a lot from
the experience

watched very closely how mhonarc project launched

rfc project confirmed the wisdom of letting there be only one language
designer; closed down the RFC process to stop people having strange ideas

right now going in cycles: lots of discussion after each Apocalypse (not
final judgements on how the language should be)

parallel to language design, the internals design has been progressing with
Dan at the helm;

parser/runtime are decoupled, you should be able to plug in another parser,
so long as it compiles down to whatever runs in the runtime; won't compile
straight to .net; (AS has found that perl doesn't run on .net very well,
designed only for C#, we don't want a Perl6 that's C# under the hood)
probably won't be done as cleanly as we had hoped

ending up with our own runtime, compiler system (with pluggable parsers,
including a perl5 parser); or even python: take python code and turn it
into perl6 bytecode; make this a deliberate feature of the design to allow
the language to be malleable

-------------------------

dan on perl6 internals

design as it stands at the moment:

 - interpreter looks like a software CPU; complete with register sets,
   executable code

 - everyone's VM does this, but no one admits it

 - assuming the language allows us to do it, we should be able to draw
   a lot on literature on optimizing on real CPUs; don't know very much
   about optimizing code for a stack machine; it all looks like forth
   more or less, but we could do a little bit better

 - stackless python is misnamed; perl5 was "stackless" from day 1

 - parser, compiler and runtime will all be conceptually separate entities;
   let you yank out any of these modularly

 - perl6 for small platforms won't have a parser (no string eval, do won't
   work except for bytecompiled programs); could bytecompile a Perl parser
   in Perl and run that on a visor; be able to hand-craft op trees

 - perl6 will probably ship with a perl assembler before we have an 
   interpreter (write for the VM); easier than writing a parser for perl

 - let people experiment; allow other people to write programs that run
   on our platform; 

 - start by writing B::Deparse first

 - PR: Compiled Perl on small devices

 - when designing the interpreter part, the bytecode we generate will
   translate cleanly into other forms (JVM, .NET, Machine code); perl5->C
   are a bunch of function calls into Perl; why do that for integer math?
   not the primary focus, but a concern

 - semi-serious suggestion: MMIX?  Tools available

 - 64 {int|float|string|pmc} registers

 - easier to work with bounded resources, not unbounded register sets

 - 64 chosen out of the air, most of the optimization algorithms get
   comfortable with mid-30 register count, so should make it easier to
   optimize (IA-64 has 64 registers with an infinite register architecture)

 - regex engine won't be standalone; will compile down to a chunk of
   bytecode that will get executed; to our benefit: when we translate to
   native executable code, maps nicely to most processors instead of going
   through a state machine (except for dynamic regexes with variable
   substitutions)

 - garbage collection: full garbage collection, no reference counting in
   perl6 (except possibly for creating explicit references: 2 bits for
   reference counts (0, 1, 2, many)) [ possibly memory savings with Perl5
   if such a scheme were adopted for Perl5 reference counting ]

   refcounting is pricey, and a lot of time is spent in the memory system;
   
   guaranteed object destruction; garbage collection; not tied together

 - vtbls: we won't be doing things to variables, we'll be asking variables
   to do things to themselves; opcodes worried about control flow; this
   allows us to cut down the size of the codepath a lot; won't have to have
   a variety of code checks; variable will know what state it's in:
   get_string will return string and will upgrade only when necessary

   if a variable wants to do anything fancy, it can, and expense isn't
   shared by everyone;

   overloading is simpler; safe threaded variables; no overhead to check to
   see if a variable is shared (use a shared variable vtbl; only shared
   variables pay the cost of shared access)

   update vtbl on a per-object basis and on the fly: core scalar class
   should have 9-10 vtbls associated with it (determined by state: simple int
   vs. simple string), objects can override their own vtbls or multiple
   vtbls (based on state); each variable will have its own private vtbl

   Q: If we're putting it all in the optree into variables, how does this
   impact the bytecode?  Interpreter will have to come with a set of vtbls;
   interpreter may have a set of vtbls built into it; bytecode may load
   extra vtbls on the fly; sort of like C# or Java classes
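
The vtbl idea above can be sketched in plain Perl.  This is only an
illustration under stated assumptions, not the perl6 internals design: the
hash layout, %INT_VTBL, %STR_VTBL, new_int and op_get_string are all
hypothetical names.  It shows an "opcode" asking the variable to do the
work, and the variable swapping its own vtbl when it upgrades.

```perl
use strict;
use warnings;

my (%INT_VTBL, %STR_VTBL);

# vtbl for a variable that currently holds only an integer
%INT_VTBL = (
    name       => 'int',
    get_int    => sub { $_[0]{int} },
    get_string => sub {
        my ($var) = @_;
        $var->{str}  = "$var->{int}";   # upgrade only when a string is needed
        $var->{vtbl} = \%STR_VTBL;      # swap the vtbl on the fly
        return $var->{str};
    },
);

# vtbl for a variable that already has a cached string form
%STR_VTBL = (
    name       => 'int+string',
    get_int    => sub { $_[0]{int} },
    get_string => sub { $_[0]{str} },   # no state checks needed here
);

sub new_int { my ($n) = @_; return { int => $n, vtbl => \%INT_VTBL } }

# The "opcode" has no type checks; it just dispatches through the vtbl.
sub op_get_string { my ($var) = @_; return $var->{vtbl}{get_string}->($var) }

my $v      = new_int(42);
my $before = $v->{vtbl}{name};
my $string = op_get_string($v);
my $after  = $v->{vtbl}{name};
print "$before -> $after, value '$string'\n";
```

A shared-variable vtbl could be added the same way, so that only variables
using it would pay for locking.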

-------------------------

hugo on regex internals:

 - identified the bits involved in parsing regexes, and the bits outside
   of regcomp and regexec in executing regexes

 - function summaries would be cool, but better written down

 - possible to rewrite the regex engine to use op trees, peek-ahead,
   and avoid opcodes; single perl op, but what your regex compiles into is
   effectively an optree (a single program which is done to shave some
   cycles; complexity it brings in makes it impossible to maintain/debug,
   and may be costing us time; something interesting enough to try)

 - even if perl6 is just around the corner, there's going to still be a
   lifecycle of perl5 in general, including 5.6/5.8

 - far too big a change to retrofit into 5.6

 - how long do we expect 5.8 out there?

 - Dan: best to be pushed off into a module, especially if Ilya's pluggable
   regex engine works; you're not changing the core, if it doesn't work,
   you haven't changed how the regex engine works (The pluggable regex
   engine *does* work; used by ActiveState); hooks are in there now

 - could be running in 5.6.2

 - hugo & mjd have been looking at how the call to regmatch calls itself
   recursively on almost completely simple regexes: causes stack blowouts;
   give it its own stack to avoid blowing out the C stack  (x* is
   processed recursively, 'x+foo' fails as many times as x+ succeeds if not
   followed by a 'foo')

 - large parts of regmatch doing the same thing different ways [...]

 - mjd's brief intro to regex engine is in the proceedings
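
The "give it its own stack" idea above can be sketched with a toy matcher.
This is not how regmatch works; it is a hypothetical illustration (parse_toy
and match_toy are made-up names) that handles only literal characters and
x*, and keeps its backtracking states in an ordinary Perl array instead of
recursing once per repetition.

```perl
use strict;
use warnings;

# Split a toy pattern into [char, is_star] items.
sub parse_toy {
    my ($pattern) = @_;
    my @items;
    while ($pattern =~ /\G(.)(\*?)/gs) {
        push @items, [ $1, $2 eq '*' ];
    }
    return \@items;
}

# Match anchored at the start of $text.  Returns the end offset of the
# match (which may be 0 for an empty match), or undef on failure.
sub match_toy {
    my ($pattern, $text) = @_;
    my $items = parse_toy($pattern);
    my @stack = ([0, 0]);    # pending (item index, text offset) states
    while (@stack) {
        my ($i, $pos) = @{ pop @stack };
        return $pos if $i == @$items;
        my ($char, $star) = @{ $items->[$i] };
        my $here = $pos < length($text) && substr($text, $pos, 1) eq $char;
        if ($star) {
            push @stack, [ $i + 1, $pos ];           # stop repeating (tried later)
            push @stack, [ $i, $pos + 1 ] if $here;  # greedy: one more (tried first)
        }
        elsif ($here) {
            push @stack, [ $i + 1, $pos + 1 ];
        }
    }
    return;
}

my $long = ('x' x 50_000) . 'y';
print match_toy('x*y', $long), "\n";   # depth lives in @stack, not the call stack
```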

-------------------------

japhy on Friedl's stuff

 - 3 petitions about the regex engine: 
   - unicode charclass subtraction
   - naming captured subheaders
   - sexeger stuff (/r reverses a regex in VB)

 - the way unicode handles charclass subtraction is disgusting; - is
   overloaded when between 2 other characters: proposed syntax for a more
   pleasant alternative, posix implementation of set operators doesn't
   exist in Perl yet:

   Proposal 1:

   [X[&Y]] : intersection
   [X[^Y]] : subtraction

   Proposal 2: (nested charclasses)

   [X] == [[[X]]] 
   [[X] && [Y]]  intersection  (maybe w/ or w/o spaces around &&)
   [[X] && [^Y]] subtraction

   Current kludge (from epp):
   [^\D5]   double negative
   (?!5)\d  takes logic outside of the charclass

   (?![aeiou])[a-z]  vs. [[a-z]&&[^aeiou]]

   Named Captures:
   /age::(?<age>\d+)/  issues: autovivification?  my?  $ missing?  scalars?
                       lvalue?  (?<$s->age>)

   Match last of something in a string: (perl isn't as optimized whitespace)
   removing trailing whitespace is easier by reversing, removing leading
   and reversing again

   /(\d+)\D*$/  (match last string of digits; very slow with backtracking)

   "123A456B"  backtracking to the 3 is pointless, because it matched \d,
   so it can't match the \D

:: LARRY: /x is the default in regexes in perl6 (but doesn't apply to
charclasses) ::
 
 - mjd: doesn't need a core patch: overload qr//; Jeff's done all of this
   already  (but incomplete)

 - problem with regex: can't return a recursive datastructure

 - in ruby or python: the discussion is easier because regexes return
   objects, but we don't have methods (save $1, $2, $3)

 - can /r be implemented by overloading qr//?  Yes
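
Going back to the character-class subtraction kludges in japhy's notes, here
is a short check that the double-negative form and the lookahead form select
the same characters in today's Perl.  The variable names are illustrative,
and the lookahead for the vowel case is written with a bracketed class,
(?![aeiou]), so that it tests a single character.

```perl
use strict;
use warnings;

# "Digits except 5", two ways.
my @double_negative = grep { /^[^\D5]$/ }  0 .. 9;
my @lookahead       = grep { /^(?!5)\d$/ } 0 .. 9;

# "Lowercase letters except vowels" via a lookahead.
my @consonants = grep { /^(?![aeiou])[a-z]$/ } 'a' .. 'z';

print join('', @double_negative), "\n";   # prints "012346789"
print join('', @lookahead), "\n";         # prints "012346789"
print scalar(@consonants), "\n";          # prints "21"
```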

-------------------------

simon on magic/B/Unicode

 - have people used the unicode stuff available today? specifically IO
   Filter stuff

 - problem: perl using unicode when not asked to do so in 5.6.x

 - unicode::decompose performs normalization; simon wants it in the
   core at some point;

 - how do we localize? is there any way to get there from here?

 [[ deep magic discussion about localized variables ]]

 - B: See Simon's tutorial online, or here at the conference

 - B::Generate: write your optimizer in pure perl: the optree returned
   is readwrite, not readonly.  (Should be named D::Generate);
   self-modifying code; Lisp-ish code==data; 

 - turning off constant folding: patch on p5p (for use with B::Deparse)

 - Unicode filenames from readdir: will it ever turn on utf8 flags from
   what you get from readdir?  No support at the moment...could be turned
   on through environment variables (Mystical UTF8_LOCALE variables set
   properly...); almost can do that now with Linux; hack with taint
   (tainting taint)

 - don't assume utf8 by default is a BAD idea; but the UTF8... environment
   variable declares the user's expectation on how I/O should be handled
   with ALL apps on Linux (or wherever)

-------------------------

jeff okamoto on ipv6

 - one of those issues that may drive us to 5.10

 - apache is ipv6 capable

 - how do we make mod_perl talk ipv6?  hpux has an ipv6 kit; tru64, freebsd
   do; vms doesn't; windows does; 

 - how is perl going to deal with this?  perl scripts will have to change
   because they're making assumptions on the format of an IP address?

 - if you're given an IPv4 address, call IPv4 routines? Or mask in an IPv6
   addr?  Determine at configure time?  Be dynamic and configure at
   runtime (good with NFS mounts; might want to manage with separate
   executables...)?

 - use ipv6; pragma might be the way to go

 - is the scenario likely where a system has ipv6 and ipv4 concurrently?
   Very likely; most of the nodes that are ipv6 capable retain ipv4
   compatibility

 - connect/bind/accept should work with ipv4 in an ipv4 way on an ipv6
   machine with ipv4 compatibility

 - most ipv4 vestiges won't be too big a deal; still hardcoded references
   to ipv4 structures, mostly in VMS;  socket extension needs major work
   for ipv6: structures have changed, library calls have changed, etc.

 - the socket6 extension might have been written with the assumption that
   if you want ipv4 it will automatically translate to ipv6.

 - Perl?  What about xsubs?  struct in_addr, etc. compatibility issues.

 - when is that likely to happen?  as soon as one platform vendor decides
   to create problems and drop ipv4 support

 - pretty common outside of the us to ship with ipv6 (China)

 - jeff willing to work on ipv6 stuff, but still issues to be resolved with
   when/how support is built/configured

 - reverse name translation has annoying flags to determine string
   representation of a numeric address; lots of issues

 - perl6 should address this; designing the structure for perl6, that may
   help us retrofit it into perl5

 - perl6-language is probably a good place to start
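
One reading of "mask in an IPv6 addr" above is the IPv4-mapped IPv6 address
form (::ffff:a.b.c.d).  That reading is an assumption, and the helpers below
(ipv4_mapped, is_ipv4_mapped) are hypothetical names, not part of any socket
module.  The sketch only shows why scripts that assume a 4-byte packed
address would need to change.

```perl
use strict;
use warnings;

# Pack a dotted-quad IPv4 address into the 16-byte IPv4-mapped IPv6 layout:
# 10 zero bytes, then 0xffff, then the 4 IPv4 bytes.
sub ipv4_mapped {
    my ($dotted) = @_;
    my @octets = $dotted =~ /\A(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\z/
        or return;
    return if grep { $_ > 255 } @octets;
    return pack('x10 n C4', 0xffff, @octets);
}

# True if a 16-byte packed address carries the IPv4-mapped prefix.
sub is_ipv4_mapped {
    my ($packed) = @_;
    return length($packed) == 16
        && substr($packed, 0, 12) eq ("\0" x 10) . "\xff\xff";
}

my $packed = ipv4_mapped('192.0.2.1');
print unpack('H*', $packed), "\n";   # prints "00000000000000000000ffffc0000201"
```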

----

Core modules discussion [mailed separately]