Module submission LSI

From: Perl Authors Upload Server
Date: October 13, 2002 14:28
Subject: Module submission LSI
Message ID: 200210132128.g9DLSjb21729@pause.perl.org

The following module was proposed for inclusion in the Module List:

  modid:       LSI
  DSLIP:       adpOg
  description: Latent semantic indexing toolkit
  userid:      MACIEJ (Maciej Ceglowski)
  chapterid:   11 (String_Lang_Text_Proc)
  communities:
    Will introduce at O'Reilly bioinformatics conference 2003. General
    writeup at http://javelina.cet.middlebury.edu/lsa/out/lsa_intro.htm

  similar:
    none

  rationale:

    Latent semantic indexing is a vector-based technique for indexing
    large document collections. LSI uses a dimensionality reduction
    technique called 'singular value decomposition' to vastly improve
    recall in searching such collections. LSI search engines can return
    relevant results even when a document does not contain an exact
    keyword match to the query.
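
    To make the idea concrete, here is a minimal sketch in plain Python.
    It is not the proposed module's API: the toy corpus and every function
    name are hypothetical, and a simple power iteration on the
    document-by-document Gram matrix stands in for a real SVD routine.
    The query "car" is folded into a two-dimensional latent space and
    compared against each document by cosine similarity.

```python
import math

# Hypothetical toy corpus: document 1 never mentions "car".
docs = [
    "car engine repair",
    "automobile engine repair",
    "car automobile dealer",
    "banana fruit smoothie",
    "fruit banana bread",
]

def build_matrix(docs):
    """Return (vocabulary, term-by-document count matrix as a list of rows)."""
    vocab = sorted({w for d in docs for w in d.split()})
    rows = [[d.split().count(t) for d in docs] for t in vocab]
    return vocab, rows

def top_eigenpairs(m, k, iters=500):
    """Top-k eigenpairs of a symmetric PSD matrix: power iteration + deflation."""
    n = len(m)
    m = [row[:] for row in m]  # work on a copy; deflation modifies it
    pairs = []
    for _ in range(k):
        v = [1.0] * n
        for _ in range(iters):
            w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * sum(m[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        # Deflate: remove the component just found.
        for i in range(n):
            for j in range(n):
                m[i][j] -= lam * v[i] * v[j]
    return pairs

def cosine(x, y):
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    if nx == 0.0 or ny == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(x, y)) / (nx * ny)

def lsi_scores(docs, query, k=2):
    """Cosine similarity of the query to each document in a rank-k latent space."""
    vocab, a = build_matrix(docs)
    n = len(docs)
    # Gram matrix A^T A; its eigenvectors are the right singular vectors of A
    # and its eigenvalues are the squared singular values.
    gram = [[sum(row[i] * row[j] for row in a) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(gram, k)
    q = [query.split().count(t) for t in vocab]
    # Fold the query in: q_hat_j = (A v_j) . q / lambda_j
    q_hat = []
    for lam, v in pairs:
        av = [sum(row[j] * v[j] for j in range(n)) for row in a]
        q_hat.append(sum(x * y for x, y in zip(av, q)) / lam)
    # Document d has latent coordinates (v_1[d], ..., v_k[d]).
    return [cosine(q_hat, [v[d] for _, v in pairs]) for d in range(n)]

for text, score in zip(docs, lsi_scores(docs, "car")):
    print(f"{score:6.3f}  {text}")
```

    In this sketch the second document scores close to the two that
    literally contain "car", because it shares "engine" and "repair"
    with them, while the two fruit documents score near zero. A real
    collection would need term weighting and a proper sparse SVD.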

    "Document" and "keyword" here can be defined very loosely:
    although traditionally LSI has been applied to natural language
    text, the technique is purely algebraic, and there are potential
    applications to DNA sequences, image files, and pretty much anything
    you can shoehorn into a vector model.

    While LSI has been a theoretical curiosity for many years, this is
    the first open-source implementation (to my knowledge), and the
    first practical toolkit usable by people outside the computational
    linguistics community. We hope to make the task of building
    vector-based search engines, visualization tools, and archive
    management tools much easier for the casual programmer.

    Because the toolkit includes visualization, clustering and other
    components that go beyond searching data, I felt the Search::
    namespace was overly specific. I do understand that the CPAN team
    may feel very strongly about creating a root-level LSI namespace.

  enteredby:   MACIEJ (Maciej Ceglowski)
  enteredon:   Sun Oct 13 21:28:44 2002 GMT

The resulting entry would be:

LSI               adpOg Latent semantic indexing toolkit             MACIEJ


Thanks for registering,
The Pause Team

PS: The following links are only valid for module list maintainers:

Registration form with editing capabilities:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=31300000_210131c3f58a35db&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=31300000_210131c3f58a35db&SUBMIT_pause99_add_mod_insertit=1
