CPAN meeting minutes
From: Adam Turoff
Date: August 16, 2000 13:35
Subject: CPAN meeting minutes
Message ID: 20000816163220.A27993@panix.com
Here's my first draft of the minutes of last month's CPAN meeting.
I have a (docbook-based) HTML version available that can be posted
somewhere appropriate.
Z.
* CPAN Categorization
+ Vaults of Parnassus
The general opinion in the community at large about CPAN is that
it is better than the Python community's "Vaults of Parnassus".
Python doesn't have the same distinction between "script/program"
and "module". This is due to a standard Python idiom that allows a
module to be run as a program. Therefore, it is easier to find both
"modules" and "scripts" in the same archive, with one interface.
+ CPAN Scripts
There are 22 top level categories on CPAN for modules. Using this
categorization for scripts would be appreciated. Switching the
breakdown from Module/Script:Category to Category:Module/Script may
help.
+ ppt
Some scripts (and modules) like ppt defy simple categorization. ppt is
a complete distribution, but the individual modules belong in different
categories.
ppt demonstrates that the categorization problem is hard (which is why
there's a 'Master of Library Science' degree), and that the hierarchy
will never be perfect. Ideally, there should be 'sideways' links across
categories.
+ Metadata
Much more metadata is needed with CPAN modules. This can be
accomplished with OSD, PPM or some derived/enhanced format.
+ Use of OSD/PPM
In order to spur adoption of a more metadata-rich format, CPAN should
start refusing uploads if the OSD files are not included. Such a
solution is acceptable if and only if it is announced well in advance
and phased in over a reasonable amount of time.
+ Navigation
CPAN's primary categorization is the by-author directory, which is not
particularly navigable. Some persistent URL interface is required, and
the by-author structure serves this purpose, but perhaps a better
directory structure is available.
Maintaining a sane on-disk structure is important, as Tom and others use
'ls' or ftp to navigate around CPAN. This is quite similar to the *BSD
Ports tree.
+ Searching
Graham's search.cpan.org is an excellent new interface for browsing
CPAN. The possibility of replicating or mirroring this search interface
was discussed.
The categorization problem has an impact on searching CPAN. Adding
symlinks to individual modules from multiple categories may help the
searching problem, but adding more keywording to the search database
would help more.
+ Keywords
CPAN should be extended to categorize modules by keyword. Exactly
what those keywords should be and how that list of keywords should grow
is a separate discussion.
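A minimal sketch of the keyword idea, written in Python purely for
illustration (the minutes do not specify any implementation). The
module-to-keyword table below is made up; the point is only that an
inverted index lets one module appear under several keywords, which a
single directory hierarchy cannot do.

```python
from collections import defaultdict

# Hypothetical keyword metadata; the keyword choices are illustrative only.
MODULE_KEYWORDS = {
    "Text::CSV": {"text", "csv", "parsing"},
    "DBD::CSV": {"database", "csv"},
    "Net::FTP": {"network", "ftp"},
}


def build_keyword_index(module_keywords):
    """Invert module -> keywords into keyword -> sorted module names."""
    index = defaultdict(set)
    for module, keywords in module_keywords.items():
        for keyword in keywords:
            index[keyword.lower()].add(module)
    return {keyword: sorted(modules) for keyword, modules in index.items()}


def search(index, *keywords):
    """Return the modules tagged with every one of the given keywords."""
    hits = [set(index.get(keyword.lower(), ())) for keyword in keywords]
    return sorted(set.intersection(*hits)) if hits else []


index = build_keyword_index(MODULE_KEYWORDS)
print(search(index, "csv"))              # every module filed under "csv"
print(search(index, "csv", "database"))  # narrower two-keyword query
```

How the keyword list itself grows is left open here, as it is in the
discussion above.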
* Site Maintenance
+ CPAN Scalability
CPAN is currently hovering around 700MB of storage, and is mirrored
worldwide through rsync and FTP mirroring. This model is adequate and a
radical switch to something like napster or Coda/Intermezzo isn't
absolutely necessary. And Andreas finds Coda/Intermezzo to be broken
presently.
Currently, about 90% of all public sites mirror from funet.fi, and while
funet may not remain the canonical master, one master site seems
adequate.
CPAN is averaging roughly 20 new uploads or fewer per day, 150 new
uploads over the last 30 days according to Andreas.
+ Private mirroring
Private mirrors of CPAN are not tallied, and Jarkko doesn't want to know
about them.
+ Adding mirrors
Jarkko wants to write simple step-by-step instructions for setting up a
CPAN mirror, or possibly write a web form for setting up new mirrors.
rsync service is available for mirroring, and that needs to be
publicized better.
+ CPAN multiplexer
The CPAN multiplexer on perl.com hasn't been operational for some time
now.
Since the multiplexer was written as a simple CGI script, other
multiplexing devices and services have become available. These include
Akamai's service and a Cisco MUX.
Tom and Jon said they'd talk to the appropriate people at O'Reillynet to
fix the problem.
+ PAUSE Backup?
Perhaps it would be worthwhile to have a redundant PAUSE server in case
Andreas' machine goes down.
Then again, perhaps not. No one seems to mind when the server goes down
for four hours these days.
The main PAUSE server dumps out its MySQL database hourly and produces
1.6 MB files. This is certainly rsync'able.
* Distribution
+ BSD Ports interface
The (Free|Open|Net)BSD projects maintain "ports" collections of software
ported to each operating system. This interface is both powerful and
simple, and encourages using 'ls' and 'cd' to navigate through the
categories and packages.
+ New package formats
MakeMaker supports a 'make ppm' target along with 'make dist'. This
could easily be extended to support 'make dpkg' and 'make rpm' to build
Debian and RPM packages.
Using native package formats (such as ports, dpkg and rpm) will allow
all Perl modules to be better integrated into the host operating
system's software registry system, instead of relying on 'perllocal.pod'
to be the canonical list of modules installed on a system.
+ CPAN::Site
One possible solution to the various problems behind CPAN as we know it
would be to allow and encourage private CPAN mirrors with their own
additional features.
For example, this could be a private/personal CPAN that is
layered on top of CPAN for distribution of local (proprietary?)
modules inside a corporate network. This may also take the form
of carrying only those modules that have been blessed by an
ad-hoc editor (e.g. mjd's private CPAN, ny.pm's QC'd CPAN, etc.)
This allows the advantage of reusing the existing CPAN tools such as
the CPAN.pm shell.
+ PAUSE
The PAUSE scripts can be replicated to help build local CPAN
repositories (e.g. the "Morgan Stanley CPAN"). Andreas says this won't
be done until some minor security issues are resolved (e.g. removing
"eval" from his code.)
Perhaps some of these local PAUSE sites can feed back into the master
PAUSE database.
+ PPM
ActiveState offers ready-to-install binary packages of Perl
modules. Thus far, CPAN has shunned binary files. Should CPAN be
extended to offer PPM or PPM-style binaries? Note that OSD can be used
to point to multiple binary versions of a single module.
Note that there may be licensing issues involved with replicating
ActiveState's PPM repository. One solution may be for ActiveState
to offer multiple PPM servers worldwide that act as customization
layers on top of a generic CPAN repository.
* Developer Issues
+ Integrate OSD/PPM
Perl can be gently upgraded over the next few releases of Perl5, so that
in about a year's time all new uploads will contain OSD/PPM files.
Perl programmers will probably resist a push to include XML markup with
their modules. MakeMaker currently mines the data found in
Makefile.PL and generates a ppm file; this interface needs to remain and
be extended.
+ Bundles
Bundles exist to solve a packaging problem, but they are not widely
used. The community needs to spend more time explaining and supporting
the use of bundle files.
+ PAUSE, New modules
There's an important piece of information that's not reaching the
community: how to create and distribute new modules on CPAN. The
problem isn't uploading to PAUSE, but rather writing the first
Makefile.PL.
Simon Cozens recently wrote 'perlnewmod' to explain this. This needs to
be advertised better.
+ Auto-notification
It would be quite handy to allow users to sign up for an -announce style
mailing list for new upload announcements of modules they care about,
or possibly a new-uploads-daily digest message.
use.perl.org already presents a new uploads listing on a frequent
basis, but many users care about only two or three modules on CPAN.
Similarly, users could register interest in a specific module, category
or author and receive mail whenever "something interesting" happens.
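One way to picture "registering interest" is a small matching step run
for each new upload. The sketch below is in Python for illustration
only; the record fields, addresses and the category name are
assumptions, not part of PAUSE or any existing service.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Upload:
    """One new upload; the field names are assumptions for this sketch."""
    author: str
    distribution: str
    category: str


@dataclass
class Subscriber:
    """A user plus the authors, distributions and categories they follow."""
    email: str
    authors: set = field(default_factory=set)
    distributions: set = field(default_factory=set)
    categories: set = field(default_factory=set)

    def wants(self, upload):
        """True if the upload matches any registered interest."""
        return (
            upload.author in self.authors
            or upload.distribution in self.distributions
            or upload.category in self.categories
        )


def recipients(subscribers, upload):
    """List the addresses that should be mailed about this upload."""
    return [s.email for s in subscribers if s.wants(upload)]


subscribers = [
    Subscriber("a@example.org", distributions={"DBI"}),
    Subscriber("b@example.org", categories={"Networking"}),
    Subscriber("c@example.org", authors={"TIMB"}),
]
upload = Upload(author="TIMB", distribution="DBI", category="Database")
print(recipients(subscribers, upload))
```

A daily digest would simply collect the matching uploads per subscriber
instead of mailing each one immediately.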
+ Module/Distribution mapping problem
A better mapping of module -> distribution and distribution -> module
is required. The problem is mostly solved, but the remaining issues are
really nasty and begin with MakeMaker.
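The two directions of the mapping can be kept consistent by building
both from one list of (module, distribution) pairs. This Python sketch
uses invented names and deliberately ignores the hard cases mentioned
above; it only shows the shape of the data and one simple conflict
check.

```python
from collections import defaultdict

# Hypothetical index rows: (module name, distribution file). Names are made up.
ROWS = [
    ("Acme::Widgets", "Acme-Widgets-1.02.tar.gz"),
    ("Acme::Widgets::Gear", "Acme-Widgets-1.02.tar.gz"),
    ("Acme::Tools", "Acme-Tools-0.10.tar.gz"),
]


def build_maps(rows):
    """Return (module -> distribution, distribution -> [modules])."""
    module_to_dist = {}
    dist_to_modules = defaultdict(list)
    for module, dist in rows:
        known = module_to_dist.get(module)
        if known == dist:
            continue  # exact duplicate row; nothing new to record
        if known is not None:
            raise ValueError(f"{module} is claimed by {known} and {dist}")
        module_to_dist[module] = dist
        dist_to_modules[dist].append(module)
    return module_to_dist, dict(dist_to_modules)


module_to_dist, dist_to_modules = build_maps(ROWS)
print(module_to_dist["Acme::Widgets::Gear"])
print(dist_to_modules["Acme-Widgets-1.02.tar.gz"])
```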
+ Perl Census?
Should PAUSE be extended to act as a registry of all (or most) Perl
developers? Even those without modules on CPAN?
+ Abandoned Modules
Currently, a module author can pro-actively hand over development and
maintenance of an abandoned module. What is the process for taking
over an abandoned module once the original author disappears?
This needs to take into account the fact that some authors may
be on prolonged travel/vacations and cannot or will not check
their mail to see their modules have bugs which need to be fixed.
+ Automatic Probing
Modern Win32 operating systems can now self-probe and identify which
components need to be upgraded. To some degree, CPAN.pm can do this
and identify which modules have been updated on CPAN since they were
installed locally.
Unfortunately, randomly upgrading modules may break existing code, and
there is no consistent mechanism for identifying when a newer module is
experimental and should not overwrite a stable installation.
CPAN.pm (or something similar) should be able to maintain a registry of
some sort to identify which modules and scripts can/should be upgraded.
Note that this also deals with the outstanding issue of maintaining
multiple versions of a single module concurrently.
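A rough sketch of such a registry check, in Python for illustration.
It assumes plain decimal version numbers and a hypothetical "status"
field marking experimental releases; neither the field nor the data
comes from CPAN.pm.

```python
from decimal import Decimal

# Hypothetical registry data; all names, versions and flags are made up.
INSTALLED = {"Acme::Stable": "1.20", "Acme::Lab": "0.90", "Acme::Same": "2.00"}
AVAILABLE = {
    "Acme::Stable": {"version": "1.30", "status": "release"},
    "Acme::Lab": {"version": "0.95", "status": "experimental"},
    "Acme::Same": {"version": "2.00", "status": "release"},
}


def upgrade_candidates(installed, available, allow_experimental=False):
    """Return (name, local, remote) for modules that are safe to offer."""
    candidates = []
    for name, local in sorted(installed.items()):
        remote = available.get(name)
        if remote is None:
            continue  # not in the index; nothing to compare against
        if Decimal(remote["version"]) <= Decimal(local):
            continue  # already current
        if remote["status"] == "experimental" and not allow_experimental:
            continue  # do not overwrite a stable install with experimental code
        candidates.append((name, local, remote["version"]))
    return candidates


print(upgrade_candidates(INSTALLED, AVAILABLE))
print(upgrade_candidates(INSTALLED, AVAILABLE, allow_experimental=True))
```

Keeping several versions installed side by side is a separate problem
that this check does not address.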
* Developer Documentation
+ General Documentation problems
CPAN should open up the job of documenting Perl and CPAN modules to
every user. Currently, the CPAN module list does a fairly good job
advertising and promoting modules, while there is no similar interface
to gather, advertise and promote documentation-only submissions.
+ OSD
If there is some OSD-like mechanism for modules, then documentation can
be tagged and tracked similarly. Similar issues are involved in
handling scripts.
+ Smaller docs
There is a need for a larger amount of smaller pieces of documentation.
These could take the form of annotations of existing documents, hints
files, mini-HOWTOs or general notes.
A tidbit on installing a specific module on Solaris is a good example of
the scope here.
use.perl.org and Perl FAQ Prime (perlfaq.com) may serve as good areas to
handle or develop these pieces of documentation.
+ Document Annotation
The PHP documentation has an area to comment on the documentation and
comment on the commentary. This is much like reinventing the Talmud.
This system appears to work very well, similar to slashdot or book
errata.
This could be useful for bug reports, revisions, amendments.
* CPAN Organization
+ Module Homepages
Each Perl module could have its own "homepage" (Graham has long since
implemented this with search.cpan.org). It could contain user feedback,
rating systems (e.g. Amazon).
Such a feature may be limited to those modules that produce OSD, as a
way of encouraging adoption of OSD.
+ Problematic Submissions
How should documentation-only uploads be categorized? Where do modules
go when they don't fit into any of the existing categories?
Is this a problem that would be solved with better or multiple search
engine interfaces?
+ Module lists
Many of the problems with CPAN are caused by the fact that there's
only one, incomplete module list. Perhaps multiple module lists will
help solve these problems.
"Namespace Pumpkings" may help here; the Apache, Tk and XML namespaces
have their own independently maintained module lists. Accepting that
this is a good idea and extending it may be a way to incorporate
multiple module lists.
+ CPAN API
A better API into the "CPAN service" would be nice. This should include
version control, stability information (experimental, release, bugfix,
etc.) and so forth.
This API could be as simple as a Perl-ready version of the 03- file;
Tom uses a mechanism like this currently.
+ CPAN Quality
Is there too much crap on CPAN? Does the module list steer people away
from the "bad modules"? Should we produce a user interface that steers
people away from the crap and/or towards the better modules?
Would user ranking of modules be useful?
+ "Blessed" Modules?
Perhaps one solution to the Quality issue is to split CPAN into
multiple areas. The Hitchhiker's Guide to the Galaxy (hhgttg.org)
is split into an anything-goes scratch area and an official edited
area. This isolates the quality controlled, tested modules from random
uploads that may not be ready for general usage.
+ Reviews
More reviews of more modules are needed. Perhaps they could be found
on search.cpan.org or some other CPAN site.
* Namespaces, Versioning
+ Namespace issues
modules@perl.org serves as an area to discuss proposed module names.
There have been some complaints that the maintainers of the Perl module
namespace don't always come up with reasonable and intuitive names for
modules, and that anyone trying to refute the decision of this group
faces a losing five-against-one battle.
It was also mentioned that modules@perl.org doesn't always respond to
requests in a timely fashion. Using an autoreply would help solve
this issue.
Concerns like this may be encouraging people to leave Perl and switch to
Python.
+ Module versions
Versioning of modules is important. Being able to identify a specific
module by version number solves part of that problem but creates
others, since only one version of a module may be installed at once
(or at least installing and using multiple versions of a single module
is *very* tricky).
+ Module naming
Flexibility is needed in implementing and versioning modules. Once a
module is released, changing its name is as politically correct as
rescinding a domain name by committee, especially since changing a
module's name will break existing code.
Module names should not mention how the code is implemented (e.g.
Text::CSV_XS).
General module naming guidelines are needed.
+ Namespacing
Currently, CPAN namespaces use the same first-come, first-served model as
domain names. Occasionally, this allows a developer to own a namespace
even though they have written a poorer implementation of an interface.
This problem also ignores the fact that some modules (Text::CSV,
Scalar::Util) are available in all-Perl and XS implementations. In
the case of Text::CSV, choosing the implementation is done by the module
user, while in the case of Scalar::Util, the decision is made by the
installer.
Allowing multiple modules to use the same namespace may solve these
issues.
+ Author-based Namespaces
Perhaps appending the author's CPAN ID to a module name will better
identify the specific version of a module to be used:
"use File::Parse::TIMB;"
Perhaps using the existing version numbering mechanism will help if it
can be extended to using a string, such as "#NI_S".
+ Impact on Perl Syntax
The best solution, which we may not have seen yet, might require
significant changes to the Perl language. Thus, it may need to wait
until Perl6.
+ Using Interfaces
Kevin Lenzo pointed out that Modula 2 solves this problem nicely by
using the 'interface' keyword. That is, implementations aren't named,
but interfaces are. So, any package implementing a known interface is
interchangeable with any other package implementing the same interface.
Interfaces also open up the issue of public vs. private interfaces.
Currently, all Perl modules expose public interfaces, since there is no
(or only cumbersome) data hiding available with Perl modules. A named
interface could identify only the public portion of an interface that
should be used or reimplemented.
Some mechanism for versioning and standardizing interfaces would also
be necessary.
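The idea of naming the interface rather than the implementation can be
illustrated with an abstract base class. This is a Python analogy only,
not a proposal for Perl syntax; the class names are invented, and the
two implementations stand in for an all-Perl and an XS version of the
same module.

```python
import csv
from abc import ABC, abstractmethod


class CsvParser(ABC):
    """A named interface: callers depend on this, not on an implementation."""

    @abstractmethod
    def parse_line(self, line):
        """Split one CSV line into a list of fields."""


class SimpleCsvParser(CsvParser):
    """Naive implementation; it does not understand quoted commas."""

    def parse_line(self, line):
        return line.split(",")


class StdlibCsvParser(CsvParser):
    """Implementation backed by Python's csv module."""

    def parse_line(self, line):
        return next(csv.reader([line]))


def first_field(parser, line):
    """Works with any CsvParser, whichever implementation is installed."""
    return parser.parse_line(line)[0]


for parser in (SimpleCsvParser(), StdlibCsvParser()):
    print(type(parser).__name__, first_field(parser, "a,b,c"))
```

Only the methods declared on the interface form the public surface;
anything else an implementation defines is private by convention.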
+ Unique Identifiers
A few of the issues that stem from uniquely identifying a module have
already been solved. Mozilla's XPCOM uses "IAD" to give the
implementation of a component a unique identifier. CORBA and COM (?)
use a GSID for the same purpose.
These identifiers are intended to be globally unique.
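As a stand-in for the component identifiers mentioned above, the sketch
below uses Python's standard uuid module to derive a stable identifier
from an author ID and a module name. The namespace string is invented,
and this is not the scheme used by XPCOM, CORBA or COM.

```python
import uuid

# Hypothetical namespace for this sketch, derived from a made-up DNS name.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "modules.example.org")


def module_id(name, author):
    """Derive a repeatable identifier from a module name and an author ID."""
    return uuid.uuid5(NAMESPACE, f"{author}/{name}")


print(module_id("File::Parse", "TIMB"))
# The same inputs always give the same identifier; a different author
# gives a different one, even for the same module name.
print(module_id("File::Parse", "TIMB") == module_id("File::Parse", "TIMB"))
print(module_id("File::Parse", "TIMB") == module_id("File::Parse", "OTHER"))
```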
+ Corporate namespaces
Tim Bunce mentioned that Solaris' "kstat" command is now implemented in
Perl. Sun also wants to use and control the "Solaris::" namespace
since it is an extension of their existing trademark on Solaris.
One possible solution to the corporate namespace problem would be to
ensure that any module sitting in an "owned" namespace not maintained
by the namespace owner (e.g. "Solaris::*" modules not written by Sun)
explicitly acknowledge the trademark owner.
Jon Orwant has promised to figure out the legal wording necessary.
+ Removing Modules
The issue of corporate namespaces brings up the issue of removing
modules from CPAN. That is, if there is a corporate namespace such as
OReilly (or O'Reilly), how is a module removed when it doesn't belong
in that namespace?
Misuse of a corporate or otherwise "owned" namespace is one reason for
removing a module off of CPAN. Other reasons may exist, such as
removing unmaintained modules that can no longer work with modern Perls.
+ Cute Names
Modules with names like "IMA::DBI", "Math::ematica" and "D'Oh" need to
be addressed. Should they be renamed to be more conformant with the
module naming guidelines? Should they be left behind if/when CPAN splits
into scratch/edited areas?
* Licensing
+ Additional Disclaimers
Morgan Stanley adds a paragraph to the standard Artistic License that
effectively states "if you use our module, you can never sue us."
+ License Identification
Better identification of the license used with a module needs to be
tracked. This could be done upon upload to PAUSE. That is, the module
author can specify the license flavor used for the module.
Once this is done, identifying which modules can be distributed on
CDROM will help publishers respect a module author's individual
licensing concerns.
* Schwern's Quality Assurance presentation
+ Malicious modules
To prove a point, one module author wrote a module that printed out
the error message "I am deleting all of your files". The point: there
are no security mechanisms for checking modules or for installing
them.
This needs to be fixed.
+ cpan-testers
The cpan-testers effort is great, but it has problems. First,
it is not automated. Second, it is incomplete.
+ CPANTS: The CPAN Testing Service
Schwern proposes instituting an automated series of quality tests.
These tests are intended to identify a possible lack of quality, not
the presence of quality in a module/script.
Perfect identification of all lack-of-quality indicators is not the
goal. Achieving 80% accuracy during automated testing is OK;
cpan-testers could be recast to handle the more difficult 20% of the
problem.
+ Levels of quality
CPANTS is designed to identify quality control problems in simple tiers
of boolean tests. After passing one tier of tests, a module can
proceed to the next tier of tests.
Each tier of testing is designed to identify a specific set of common
problems.
Two tiers of simple boolean tests are currently proposed: "Veto tests"
and "Boolean Kwalitee Tests".
+ Veto tests
The veto tests are intended to find compile errors, incomplete
distributions and incompatibility with common Perls.
This list is incomplete, but is representative of the kind of veto
tests envisioned for a first-pass quality check.
= Are README, INSTALL, MANIFEST and Makefile.PL files present?
= Do the modules have tests?
= Do the modules pass their own tests?
= Does every .pm file compile (perl -c)?
= Does it blow up because it's supposed to, and if so, does it blow up
in the Makefile?
= Does the distribution pass its tests on all stable and popular Perls?
= Does the distribution pass its tests on all stable and popular
configurations? (64bit, malloc, sfio, etc.)
= Does the distribution pass its tests on all popular and sane
architectures? (Linux, Solaris, Win32, etc.)
= Does the distribution play well with others? (e.g. are there
security violations in Makefile.PL)
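The first two veto tests in the list above are simple enough to sketch.
The Python below is an illustration only, not part of CPANTS; the
helper names are invented, and it assumes the common layout of a t/
directory of *.t files or a top-level test.pl.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("README", "INSTALL", "MANIFEST", "Makefile.PL")


def has_required_files(dist):
    """Veto test: every required top-level file exists."""
    return all((dist / name).is_file() for name in REQUIRED_FILES)


def has_tests(dist):
    """Veto test: the distribution ships at least one test script."""
    tests_dir = dist / "t"
    has_t_files = tests_dir.is_dir() and any(tests_dir.glob("*.t"))
    return has_t_files or (dist / "test.pl").is_file()


VETO_TESTS = [
    ("required files present", has_required_files),
    ("tests present", has_tests),
]


def run_veto_tests(dist):
    """Return the names of failed veto tests; an empty list means 'pass'."""
    return [name for name, check in VETO_TESTS if not check(Path(dist))]


# Build a throw-away distribution that lacks an INSTALL file.
with tempfile.TemporaryDirectory() as tmp:
    dist = Path(tmp)
    for name in ("README", "MANIFEST", "Makefile.PL"):
        (dist / name).write_text("placeholder\n")
    (dist / "t").mkdir()
    (dist / "t" / "basic.t").write_text("placeholder\n")
    failures = run_veto_tests(dist)
print(failures)
```

The remaining veto tests need a real Perl installation on each target
platform and are out of scope for a sketch like this.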
+ Boolean Kwalitee tests
The kwalitee tests are designed to examine the code in an
automated fashion to sound alarms upon the presence of questionable
results.
Failing a kwalitee test is non-fatal; it just provides a signal
(a "red-flag") that there may be problems with a module. Upon
failing a test, the author will be notified. The author can
then fix the problem (e.g. unintentional overuse of '$&') or
provide an explanation (e.g. "The code needs to be this way.")
This explanation may be an annotation of the module's OSD file.
Upon failing a test, once an author's explanation is provided, that
test will cease to be run on future versions of a module, and the
author's explanation will serve to document why a specific kwalitee
test isn't run.
= Use Devel::Coverage to determine if at least N% of the code
is tested with the module's test suite.
= Use B::Fathom (or something better) to see if the code has a
complexity rating of less than N.
= Is documentation present?
= Does the code look "yucky"? That is, does it exhibit any signatures
of Ineffective Perl Programming? (Overuse of $_, %_ and @_).
= Does the code use problematic features such as fork, sig, alarm?
= Does the code use experimental features?
= Does the code load in less than N seconds?
= Is a ChangeLog present?
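The red-flag and waiver behaviour described for kwalitee tests can be
sketched as follows, in Python for illustration. The fact keys, the
thresholds standing in for "N", and the waiver text are all
assumptions; a real service would derive the facts from tools such as
those named in the list above.

```python
def run_kwalitee(tests, facts, waivers):
    """Return (red_flags, skipped) for one distribution.

    tests   -- mapping of test name to a predicate over `facts`
    facts   -- measurements about the distribution (hypothetical keys)
    waivers -- mapping of test name to the author's explanation
    """
    red_flags, skipped = [], {}
    for name, passes in tests.items():
        if name in waivers:
            skipped[name] = waivers[name]  # explained, so not run again
        elif not passes(facts):
            red_flags.append(name)  # non-fatal: the author is notified
    return red_flags, skipped


MIN_COVERAGE = 60        # stands in for "N%"; the value is arbitrary
MAX_LOAD_SECONDS = 2.0   # stands in for "N seconds"; also arbitrary

TESTS = {
    "coverage": lambda f: f["coverage_percent"] >= MIN_COVERAGE,
    "has_docs": lambda f: f["has_pod"],
    "has_changelog": lambda f: f["has_changelog"],
    "loads_quickly": lambda f: f["load_seconds"] < MAX_LOAD_SECONDS,
}
facts = {"coverage_percent": 45, "has_pod": True,
         "has_changelog": False, "load_seconds": 0.3}
waivers = {"has_changelog": "Changes are recorded in the README."}

red_flags, skipped = run_kwalitee(TESTS, facts, waivers)
print(red_flags)
print(skipped)
```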
+ Human tests
Some tests can only be done by people looking at the code and
distribution. Again, these are red-flag warnings, not veto tests.
= How complete is the documentation for this module?
= How readable is the documentation for this module?
= How up-to-date is the documentation for this module?
= Is the interface "overdone"?
= Is there a book available for this module?
= Is this module backwards compatible with previous releases?
+ CPAN integration
These tests should not be triggered on a commit into CPAN. Modules
should be useable prior to being committed to CPAN. That is, some of
these tests should be run before a module is accepted into CPAN.
+ Test Results
There are issues to be resolved with keeping a permanent record of all
test results for a module. Keeping a record of every submission that
failed will probably be counterproductive and discourage module
authors from submitting modules to CPAN.
+ Automated Testing
Kurt Starsinic's Perl Labs would probably be the best vehicle for
getting CPANTS off the ground.
Note that this process is a distributed computing effort like
SETI@Home, but much more dangerous since random, unchecked code is
being run against many machines. Being able to run each set of tests
in a secure sandbox or running it on a throw-away, restorable
configuration will be important.
Strange customer configurations should be allowed into the testing
framework.
+ Testing Requirement
Testing like this will be an acceptance test for Perl6. Getting this
ready for CPAN before Perl6 requires it will help improve CPAN earlier.
+ Automate, Automate, Automate
Everything possible that can be automated in this process should be
automated and distributable. CPANTS should spin off as many pieces as
possible for distribution and automation.
+ Perl Metrics
CPANTS can be extended to pick up as many metrics as possible about
the Perl code on CPAN. These metrics need to be publicized and
extended.
= Are deprecated features still being used?
= How much code uses experimental features?
= How "complex" is the average piece of Perl code?
= Of the features slated for removal in Perl6 (e.g. formats, $#), how
commonly are they used?
= How common are object-oriented modules? Non-object-oriented
modules?
+ Trusted groups
Many groups (e.g. ny.pm, Yahoo, etc.) can join in and target a
handful of modules for manual testing. These "trusted groups"
can then publish their blessings for their favorite modules.
+ Karma
CPAN/CPANTS could adopt a Slashdot/Advogato style karma rating to
reward module authors and reviewers.
+ Automated mailing lists
CPANTS can maintain two -announce style lists: one for nagging module
authors/users to fix code, and one "kudos" list for recognizing module
authors and reviewers who make contributions to CPAN/CPANTS.
+ Requirements
CPANTS needs an automated testing framework, possibly something like
bonsai/tinderbox from Mozilla. It also might need an area on
SourceForge for development of the testing framework.
CPANTS also needs a wide variety of configurations, such as
those offered by Perl Labs. It also needs contacts throughout
the Perl community for more specific and obscure yet important
system configurations.
* Merijn Broeren's Report on the CPAN BOF
What follows is a report on the issues raised at the CPAN BOF at TPC5.
Many of these issues have been raised elsewhere in this document.
Some of the points mentioned here come from the CPAN BOF, others come
from our discussion of points raised by the BOF.
+ One page per module on use.perl.org (or possibly use.cpan.org)
This already exists on search.cpan.org:
http://search.cpan.org/dist?ModuleName
+ Module comparisons
Comparisons, reviews and surveys of similar modules would be useful for
many users. This could be done on the web, and might be made available
through some enhancement to CPAN.pm.
+ User education
There are some outstanding questions about CPAN that have been
answered in many places, but the information isn't reaching CPAN users.
+ Namespace Pumpkings / Propagandist
Namespaces like Apache and Tk have independently maintained module
lists and namespace management. This should be extended into other
commonly used and large namespaces.
+ More Metrics
In order to support laziness and impatience of Perl users, CPAN should
identify some common metrics about modules to show some standard of
quality.
+ Upload / Download counts
Since CPAN is distributed, getting accurate download statistics is
difficult, but something is better than nothing.
Since search.cpan.org is one centralized repository, examining the
statistics from CPAN searches may be worthwhile.
The mirror setup can be simplified to provide standard logging that is
more easily harmonized with other CPAN mirrors.
+ More docs
Module authors need to include more usage examples of their modules.
More documentation in general would also be well received.
+ Supersets / Personal SDKs
Some users want to mirror CPAN and add their own modules on top of it.
Some users want to create their own personal groups of modules for
distribution as a complete SDK.
+ Module Lint
A mechanism for identifying common problems with modules would be
appreciated.
+ Revisit CTAN, Debian archives
There are some interesting features in other software archives that
have been developed since CPAN was first released.
+ User Census for which modules are installed
A simple way for a user to submit the details of which (non-core) CPAN
modules they have installed will help identify how popular specific
modules are. This should be an opt-in process.
NB: OpenBSD asks users to send their dmesg output to dmesg@OpenBSD.org
to track what kind of hardware OpenBSD has been installed on.
+ Monger Involvement?
There's an open question on how involved the Perl Mongers should be in
improving CPAN. Individual Perl Monger groups are an invaluable source
of volunteers for some of the more labor intensive aspects of improving
CPAN.
+ Module patch repository?
There is no consistent way of patching a module on CPAN and making that
patch available for general use. The development model on CPAN today
forces users to wait for the next release of a module (which may or may
not fix the bug in question). This is especially problematic for
unsupported modules.
+ Module/Perl version synchronization
Currently, there's no easy way to identify what minimum version
of Perl is required for a specific module. This information may be
mentioned in a module's documentation, in the sources, or absent
entirely. Making this information available on CPAN (especially prior
to download) would be extremely helpful.
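When the requirement is stated in the sources, it usually appears as a
"use 5.006;" or "require 5.005;" statement, so a first approximation is
a text scan. The Python sketch below is a heuristic under that
assumption, not a Perl parser; it will miss requirements that are only
mentioned in documentation.

```python
import re

# Matches statements such as "use 5.006;", "require 5.005_03;" or
# "use v5.6.0;" at the start of a line. Heuristic only.
VERSION_RE = re.compile(
    r"^\s*(?:use|require)\s+v?(5(?:\.[0-9_]+)+)\s*;", re.MULTILINE
)


def declared_perl_versions(source):
    """Return every Perl version string declared in the source text."""
    return VERSION_RE.findall(source)


sample = "package Acme::Widgets;\nuse 5.006;\nuse strict;\n"
print(declared_perl_versions(sample))
print(declared_perl_versions("use strict;\n"))  # no declaration at all
```

An empty result corresponds to the "absent entirely" case described
above, which is exactly the information gap a CPAN-side index would
have to fill some other way.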
+ Upgrade OSD separately?
Upgrading the metadata associated with a module is an open issue. It
is undetermined at this time whether the metadata information for a
module should be included with that module, or if it should be external
to that module.
After a module is released, the metadata information for that module
may change (e.g. versions of Perl known to work with this module,
binary distributions of this module, etc.). For these reasons,
updating the metadata separately (and externally) may need to be
addressed.
+ Save changes for the next CPAN?
Many of the changes proposed here require a significant amount of
effort and could conceivably be held off until CPAN was redesigned and
relaunched.
The consensus at the BOF (and the CPAN meeting) was that these changes
should be implemented on today's CPAN, and not wait for a full CPAN
redesign.
+ Decommissioned modules?
How are old modules identified and/or removed from CPAN? Some modules
are old and unmaintained, but still useful. Should they be removed,
annotated as "old and unmaintained" or moved to a separate area of
CPAN?
+ Forward/Reverse dependencies
Knowing which modules a specific module requires is important.
Similarly, knowing which modules require this module is equally
important.
This involves both knowing modules by name and specific versions
of those modules. Having this information will help administrators
identify when a new module requires an upgrade/downgrade to an
installed module, and when that update will break existing code.
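Reverse dependencies fall out of the forward list by inverting it. The
Python sketch below uses invented module names and ignores version
constraints, which a real index would also need to record.

```python
from collections import defaultdict

# Hypothetical forward dependencies: module -> modules it requires.
REQUIRES = {
    "Acme::App": ["Acme::Net", "Acme::Util"],
    "Acme::Net": ["Acme::Util"],
    "Acme::Util": [],
}


def reverse_dependencies(requires):
    """Invert module -> requirements into module -> direct dependents."""
    required_by = defaultdict(set)
    for module, deps in requires.items():
        for dep in deps:
            required_by[dep].add(module)
    return required_by


def affected_by_upgrade(requires, module):
    """Everything that depends on `module`, directly or indirectly."""
    required_by = reverse_dependencies(requires)
    seen, stack = set(), [module]
    while stack:
        for dependent in required_by.get(stack.pop(), ()):
            if dependent not in seen:
                seen.add(dependent)
                stack.append(dependent)
    return sorted(seen)


print(affected_by_upgrade(REQUIRES, "Acme::Util"))
print(affected_by_upgrade(REQUIRES, "Acme::App"))
```

The first call lists what an administrator would need to retest before
upgrading the shared utility module.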
+ Better descriptions
Both short and long module descriptions are necessary. A better
hierarchy of module classifications would be welcome, as would allowing
a single module to fit into multiple categories.
This issue revolves around creating better metadata for modules, and
specifically identifying how the metadata can be improved.
+ CPAN is a library
The Library of Congress is a library, and has solved many of
these issues already. Much of the discussion of CPAN metadata
revolves around problems that librarians have solved already.
+ Better scripts repository
Increasing the number of scripts and classification of scripts would be
nice.
+ Better README
Currently, the README is an unstructured text file. Adding structure
to it, possibly in POD, would help.
+ Versioning
A clearer mechanism of identifying what module versions are found in a
larger package/distribution would help. Right now, the distinction is
available, but it is often unclear.
+ Inspection
Currently, the only real way to examine a module is to download it.
While search.cpan.org makes the docs available, it would be much better
if all of the relevant information about a module were easily found and
machine readable, so that a complete summary were available on the web
or through CPAN.pm.
* Fixing Today's CPAN
What follows is a list of outstanding issues identified above that need
to be addressed in CPAN today, or can be implemented today to improve
CPAN.
+ Module Versioning
The issue of allowing multiple versions of a module to be installed at
once is a big issue that involves changes to the Perl language. All
discussion of versioning (including namespaces, interfaces, multiple
implementations of a single module) is for Larry to think about.
+ Digital Signatures
Tracking MD5 signatures (or something better) should be implemented
soon. Other software libraries do this today to help ensure that the
software downloaded hasn't been tampered with.
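Computing and checking a digest is straightforward; the sketch below
uses Python's standard hashlib for illustration, with a made-up file
name. Note that a bare digest only helps if the published list of
digests is itself trustworthy, which is where real signatures would
come in.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Hash a file in chunks so large tarballs need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex, algorithm="md5"):
    """True if the file's digest matches the published value."""
    return file_digest(path, algorithm) == expected_hex.lower()


# Demonstration with a throw-away file standing in for a distribution.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Acme-Widgets-1.02.tar.gz")  # hypothetical name
    with open(path, "wb") as handle:
        handle.write(b"not a real tarball")
    recorded = file_digest(path)
    ok = verify(path, recorded)
    tampered = verify(path, "0" * 32)
print(ok, tampered)
```

Passing a different algorithm name, such as "sha256", covers the "or
something better" case without changing the callers.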
+ Globalization / Internationalization / Localization
There are CPAN and PAUSE issues involved in offering improved
i18n support for CPAN.
+ Saving CPAN.pm options
CPAN.pm can already save some options, but the number of things that
can be made optional should expand.
+ SourceForge
SourceForge has solved some of the problems CPAN is facing today.
Adopting some of their solutions can help improve CPAN.
+ Structure/Politics of PAUSE
PAUSE should expand to address some of the social issues that exist
today, such as supporting multiple implementations of one module, etc.
+ Quality Control
Differentiating CPAN into different quality levels may help some of the
issues involved in finding quality modules.
In this manner, CPAN as we know it would be a first-level staging area
(APAN). Once some rudimentary automated quality checking is done, a
distribution is made available on some intermediate staging area
(BPAN). After a module has been reviewed by an editor, it can be made
available on a quality-controlled area (the new CPAN).
+ Easy inclusion of modules
CPAN, the "anything-goes distributed-repository", should remain.
If CPAN morphs into a multi-tier quality controlled framework,
then the scratch area (APAN) serves the purpose of an anything-goes
mirrored repository. Rejecting a crufty or incomplete module
does not prevent it from being mirrored. Such modules are still
propagated (in APAN) as they are today.
+ Derived views of CPAN
Rather than re-centralizing CPAN, the next CPAN should allow and
encourage users and organizations to create their own layers on top of
CPAN. This could be used for adding local, private modules into a
private CPAN mirror, or offering a privately edited list of modules
that one person or group has "blessed".
For example, this might take the form of "my.cpan.org/mjd" or
"cpan.plover.com" for Mark-Jason Dominus' view of the Best of CPAN.
+ APAN/BPAN/CPAN
The names APAN, BPAN and CPAN are functional labels for the
"multiple levels of CPAN" and are not intended to be their final
names.
APAN serves as the initial "anything goes" staging area.
BPAN serves as the area for blessed, well named modules that have gone
through some rudimentary quality checking.
CPAN serves as the area for fully QC'd modules, and acts as a good
baseline for derived (personal/private) views of CPAN.
From here forward, CPAN shall mean either the unified repository as we
know it today, or a hypothetical multi-tiered APAN/BPAN/CPAN.
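The tier a distribution lands in can be pictured as a simple function
of the checks it has passed. The sketch below is only an illustration:
the two flags and the function name are assumptions, and the minutes do
not define what the automated checks or the editorial review consist of.

```python
from enum import Enum


class Tier(Enum):
    APAN = 1  # anything-goes staging area
    BPAN = 2  # passed rudimentary automated quality checks
    CPAN = 3  # additionally reviewed by an editor


def tier_for(passed_automated_checks, editor_reviewed):
    """Return the highest tier a distribution qualifies for."""
    if passed_automated_checks and editor_reviewed:
        return Tier.CPAN
    if passed_automated_checks:
        return Tier.BPAN
    # Everything uploaded is still mirrored in APAN, whatever its state.
    return Tier.APAN
```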
+ Save CPAN's perception
CPAN is very well respected both within the free software
community and within the Perl community.
Any changes to CPAN must maintain that high level of respect outside
the Perl community and level of contribution within the Perl community.
+ Searchability
CPAN should remain searchable. If at all possible the searchability
should improve (possibly through keyword searching).
Should CPAN split into multiple tiers, each tier should be
individually searchable, as well as searchable together with
data from the other tiers.
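A keyword search that works on one tier or on all tiers together might
look like the following sketch. The data layout (tier name mapped to
modules, each with a list of keywords) and the function name are
assumptions made for illustration only.

```python
def search(tiers, keyword, tier=None):
    """Return (tier, module) pairs whose keywords include the query.

    tiers maps a tier name to a dict of module name -> list of keywords.
    If tier is given, only that tier is searched; otherwise all are.
    """
    selected = tiers if tier is None else {tier: tiers[tier]}
    wanted = keyword.lower()
    return [
        (tier_name, module)
        for tier_name, modules in selected.items()
        for module, keywords in sorted(modules.items())
        if wanted in (k.lower() for k in keywords)
    ]
```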
+ Redistribute CPAN code
One goal for improving CPAN may be to package all of the PAUSE and CPAN
programs used for maintaining CPAN for use with other software
libraries like CTAN or the Vaults of Parnassus.
+ Improve the Module List
Currently, the module list contains about 1/4 of all modules on CPAN.
Adding a new module onto the module list can sometimes be a political
issue.
The reason why the module list is limited in size is largely a
historical accident. There is no technical reason why the module list
cannot be expanded and/or split across multiple files.
It would also help if the by-modules directory were comprehensive.
Currently, it is not.
+ Alternative module lists
The master module list should merge in the XML, Apache and Tk module
lists.
+ Strict Formatting
The strict formatting of the module list does not necessarily need to
be maintained, if a better format (or Perl-readable format) is
available.
+ Perl5/Perl6
As Perl6 is developed, some mechanism for identifying both Perl5 and
Perl6 modules in CPAN will be necessary.
Modules specific to Perl5 or Perl6 should both be on the same CPAN.
+ FTP Interface
Should the FTP interface into CPAN be deprecated?
Currently, there is a less problematic replacement:
http://~~~~/get?module=Foo::Bar
The get?module= request may not be as complete as the FTP interface,
but it can be extended and doesn't have the problems some FTP
servers/firewalls have.
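For illustration, a client could build such a request as follows. The
host is elided in the URL above, so the base URL here is a caller
supplied placeholder, and the helper name is hypothetical. Only the
get?module= form shown above is assumed.

```python
from urllib.parse import urlencode


def module_request_url(base_url, module_name):
    """Build a get?module= URL for a module name such as Foo::Bar."""
    # Keep ':' unescaped so the name reads as Foo::Bar in the query string.
    query = urlencode({"module": module_name}, safe=":")
    return f"{base_url.rstrip('/')}/get?{query}"
```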
+ Persistent URLs
One consistent format needs to be chosen to identify a Perl module in
any CPAN mirror. Currently, any module can be found through the
by-author directory structure. This may or may not need to be
revisited, but there should be one canonical URL for any module.
+ Winnipeg Auto-Installer
There is a little-known utility offered by the University of Winnipeg
that will auto-install Perl modules. This should be publicized and
extended, and perhaps integrated into CPAN.
+ Binary Distributions
If binary distributions are available through CPAN, a CPAN overlay or
some similar mechanism, then there needs to be a way to identify the
underlying source distribution for a given binary distribution.
This might be done already through some OSD-like metadata.
+ modules@perl.org
The modules list needs to expand to add more people and more moderators
into the discussion.
Perhaps a broader, more distributed list is required.
+ backpan
Ask maintains a backup of CPAN that contains all versions of a module
posted since backpan was created. This should be publicized and
possibly expanded.
+ CPAN API
The mechanism for finding and downloading CPAN modules is fairly
ad hoc at the moment. Clarifying this API, possibly using SOAP or some
other XML format, would make it easier for more CPAN interfaces to be
created.
CPAN.pm already offers some of these features. This is mostly a
request to improve and/or better advertise CPAN.pm, or otherwise
offer a richer server-side API for searching CPAN and downloading
modules and other distributions.
+ Metadata Distribution
The entire CPAN repository is hovering around 1GB today. Once richer
metadata is available for CPAN modules, this metadata can be replicated
widely, perhaps to a user's local machine. This would offer a
space-efficient mechanism for browsing CPAN locally without constantly
resynchronizing the entire repository.
This mechanism is very similar to the *BSD Ports collection, which
encodes a few GB of data into a few MB of metadata. The metadata is
used to identify dependencies and install all required prerequisites
when installing a single package.
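As a rough sketch of how replicated metadata could drive prerequisite
installation, the following Python orders a package and everything it
depends on so that prerequisites come first. The metadata layout
(package name mapped to a list of prerequisite names) and the function
name are assumptions, not an existing CPAN or ports format.

```python
def install_order(target, prerequisites):
    """Return packages in an order that installs prerequisites first."""
    order, visiting, done = [], set(), set()

    def visit(name):
        if name in done:
            return
        if name in visiting:
            raise ValueError(f"circular prerequisite involving {name}")
        visiting.add(name)
        # Install everything this package needs before the package itself.
        for dep in prerequisites.get(name, ()):
            visit(dep)
        visiting.discard(name)
        done.add(name)
        order.append(name)

    visit(target)
    return order
```

Because only names and prerequisite lists are consulted, this can run
against a local copy of the metadata without the full repository.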
+ More HTML interfaces
Simplifying or promoting the underlying data used to build a CPAN
interface would allow more people to try to create better interfaces
into CPAN. This would be a good thing.
+ Trademarks
CPAN (and CPAN interfaces) need to do a better job of acknowledging
trademarks. This is especially necessary if CPAN moves to acknowledge
corporate namespaces.
+ Licensing
In order to simplify redistribution, the more rigid tiers of CPAN
may require that a module use a standard open source license
(GPL, Artistic, etc.) instead of a custom license that requires
users to examine the license before use.
+ Liability
If CPAN mutates into a multi-tiered, quality controlled repository, the
liability issues with that quality control will probably need to be
expressly disclaimed.
+ Mirror Setup
The setup and policy of creating a new CPAN mirror can be simplified.
This may also include generating standardized logs that can be forwarded
to a central repository for producing more complete CPAN download
statistics.
+ Module Install
Module installation can be improved to allow a user to register their
use of a module, and possibly even sign up for a -announce (or even
-discuss) style mailing list about that module.
+ Module bug report repository
Today, every module needs to maintain its own bug tracking system.
Extending or duplicating Perl's bug tracking system for individual
modules would help track down and hopefully fix bugs in CPAN modules.
+ Integration with SourceForge?
Many modules are already on SourceForge or would benefit from migrating
to SourceForge. A better integration between SourceForge and CPAN
would help module users.
+ Perl Census
There are many reasons why a Perl Census would be helpful. This would
necessarily be an opt-in effort.
Techniques for conducting this census include:
= census.pl: a Perl program for examining a Perl installation and
sending results to a central repository. This could be completely
anonymous, identify the organization or completely identify the
machine/individual.
= mail perllocal.pod: This is a low impact, simple way of seeing what
modules are installed on any particular machine.
= suppress non-CPAN modules: As a security measure, tally only those
modules that are available on CPAN; ignore any private modules.
= (uname -a ; perl -V) | mail: collect statistics on which platforms
are popular, and which build options are popular.
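The "suppress non-CPAN modules" step and the central tally can be
sketched as below. This is an illustration only: the function names are
hypothetical, and it assumes each machine already has a list of its
installed modules and a copy of the set of module names on CPAN.

```python
from collections import Counter


def cpan_only(installed, cpan_index):
    """Drop private modules before a report leaves the machine."""
    return sorted(set(installed) & set(cpan_index))


def tally(reports):
    """Count how many reporting machines have each module installed."""
    counts = Counter()
    for report in reports:
        # A module counts once per machine, however often it is listed.
        counts.update(set(report))
    return counts
```

Filtering on the reporting machine, rather than at the central
repository, means the names of private modules are never transmitted.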
* Action Items
+ use.cpan.org - Elaine Ashton
+ RFC on namespaces - Kevin Lenzo
Kevin will focus on the issues he's seen with the Festival namespace.
+ Naming Guidelines - Jon Orwant
+ Trademark Issues - Jon Orwant
This will focus on issues revolving around Sun owning the Solaris::*
and O'Reilly owning the OReilly::* / O'Reilly::* namespaces.
A policy on removing a module from an owned namespace may come out of
this.
+ Versioning - Larry Wall
This covers many of the issues around implementing namespaces and
versions.
+ Discussion of COM's GUIDs - Gurusamy Sarathy
+ OSD / Cryptographic Signatures - Andreas Koenig and Graham Barr
+ Module Security Issues - Merijn Broeren
+ Structuring master/subsidiary indexes, Global Distribution - Merijn Broeren
+ Module reviews and comparisons - Adam Turoff
+ cpan-workers mailing list - Ask Bjorn Hansen
+ Fixing the CPAN Multiplexer - Tom Christiansen
+ Clean up the CPAN Backbone - Jarkko Hietaniemi
This includes redistributing the programs which maintain CPAN as well
as simplifying the mirroring policy.
+ CPANTS - Michael Schwern
CPANTS is the CPAN testing service.
+ xx.cpan.org - Ask Bjorn Hansen
This involves setting up country code aliases for local cpan.org
mirrors (us, ca, uk, etc.).
This also may involve some measure of load balancing and round-robin
DNS.