Module submission LSI
From: Perl Authors Upload Server
Date: October 13, 2002 14:28
Subject: Module submission LSI
Message ID: 200210132128.g9DLSjb21729@pause.perl.org
The following module was proposed for inclusion in the Module List:
modid: LSI
DSLIP: adpOg
description: Latent semantic indexing toolkit
userid: MACIEJ (Maciej Ceglowski)
chapterid: 11 (String_Lang_Text_Proc)
communities:
Will introduce at O'Reilly bioinformatics conference 2003. General
writeup at http://javelina.cet.middlebury.edu/lsa/out/lsa_intro.htm
similar:
none
rationale:
Latent semantic indexing is a vector-based technique for indexing
large document collections. LSI uses a dimensionality reduction
technique called 'singular value decomposition' to vastly improve
recall in searching such collections. LSI search engines can return
relevant results even when a document does not contain an exact
keyword match to the query.
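To make the idea above concrete, here is a minimal sketch of LSI on a toy corpus. It is not the API of the proposed LSI module: the names `top_eigenpairs`, `lsi_similarities` and `cosine` are hypothetical, the corpus is invented for illustration, and power iteration with deflation stands in for a real SVD routine so the example needs nothing beyond the standard library.

```python
import math

def matvec(m, v):
    # Multiply matrix m (list of rows) by vector v.
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_eigenpairs(sym, k, iters=200):
    # Top-k eigenpairs of a symmetric positive semidefinite matrix,
    # found by power iteration followed by deflation (assumption: the
    # top-k eigenvalues are distinct and nonzero, as in the toy corpus).
    m = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        v = normalize([1.0] * len(m))
        for _ in range(iters):
            v = normalize(matvec(m, v))
        lam = sum(a * b for a, b in zip(v, matvec(m, v)))
        pairs.append((lam, v))
        for i in range(len(m)):
            for j in range(len(m)):
                m[i][j] -= lam * v[i] * v[j]  # remove the found component
    return pairs

def lsi_similarities(A, terms, query, k=2):
    # A is the term-document count matrix (rows: terms, columns: documents).
    n_docs = len(A[0])
    # Eigenvectors of A^T A are the right singular vectors of A.
    gram = [[sum(row[i] * row[j] for row in A) for j in range(n_docs)]
            for i in range(n_docs)]
    pairs = top_eigenpairs(gram, k)
    q = [query.split().count(t) for t in terms]
    q_hat = []
    for lam, v in pairs:
        sigma = math.sqrt(lam)                 # singular value
        u = [x / sigma for x in matvec(A, v)]  # left singular vector
        # Fold the query into the reduced space: (q . u) / sigma.
        q_hat.append(sum(a * b for a, b in zip(q, u)) / sigma)
    # Each document's coordinates in the reduced k-dimensional space.
    doc_vecs = [[v[j] for _, v in pairs] for j in range(n_docs)]
    return [cosine(q_hat, d) for d in doc_vecs]

docs = ["car engine", "automobile engine", "flower petal", "flower petal"]
terms = sorted({w for d in docs for w in d.split()})
A = [[d.split().count(t) for d in docs] for t in terms]

sims = lsi_similarities(A, terms, "car")
for doc, sim in zip(docs, sims):
    print(f"{sim:6.3f}  {doc}")
```

In this toy corpus the query "car" scores close to 1 against "automobile engine", which shares no word with the query, and close to 0 against the two flower documents. The link comes from "engine" co-occurring with both "car" and "automobile".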
"Document" and "keyword" here can be defined very loosely -
although traditionally LSI has been applied to natural language
text, the technique is purely algebraic, and there are potential
applications to DNA sequences, image files, and pretty much anything
you can shoehorn into a vector model.
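As one illustration of what "shoehorning into a vector model" might look like for DNA, the sketch below counts overlapping k-mers in a sequence. The k-mer representation and the function name `kmer_vector` are assumptions made for this example; the submission does not say how the toolkit itself would encode sequences.

```python
from collections import Counter
from itertools import product

def kmer_vector(seq, k=2, alphabet="ACGT"):
    # Count every overlapping substring of length k, then lay the counts
    # out in a fixed order so all sequences share the same dimensions.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] for p in product(alphabet, repeat=k)]

vec = kmer_vector("ACGTAC")
print(len(vec), vec)
```

Each sequence then plays the role of a "document" and each k-mer the role of a "keyword", so the resulting vectors can be stacked into the same kind of matrix that LSI decomposes.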
While LSI has been a theoretical curiosity for many years, this is
the first open-source implementation (to my knowledge), and the
first practical toolkit usable by people outside the computational
linguistics community. We hope to make the task of building
vector-based search engines, visualization tools, and archive
management tools much easier for the casual programmer.
Because the toolkit includes visualization, clustering and other
components that go beyond searching data, I felt the Search::
namespace was overly specific. I do understand that the CPAN team
may feel very strongly about creating a root-level LSI namespace.
enteredby: MACIEJ (Maciej Ceglowski)
enteredon: Sun Oct 13 21:28:44 2002 GMT
The resulting entry would be:
LSI adpOg Latent semantic indexing toolkit MACIEJ
Thanks for registering,
The Pause Team
PS: The following links are only valid for module list maintainers:
Registration form with editing capabilities:
https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=31300000_210131c3f58a35db&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=31300000_210131c3f58a35db&SUBMIT_pause99_add_mod_insertit=1