ListWebber II

Table Of Contents

  1. Introduction
  2. User's Guide
    1. "What is your name?"
    2. "What is your email address?"
    3. "What list do you want to query?"
    4. "Do you want to 'Search the list' or 'Retrieve messages from the list?'"
    5. "Enter your query."
  3. Caveats
  4. Administrator's Guide
  5. Possible Improvements
  6. Release Notes
  7. The Code
    1. ListWebber II 1.0b
    2. cgi-lib.pl
    3. sock.pl

Introduction

ListWebber provides the means for searching LISTSERV and ListProcessor lists while reducing the need to know their searching syntax.

Do you subscribe to any lists? Have you ever said to yourself, "Self, I remember seeing some information on that topic from the list, but now I can't find it." Do you want to know whether or not a particular subject is being discussed on a list? Are you frustrated with the "noise to signal" ratio of particular lists? Have you ever wanted to extract a particular idea from a list, but you didn't want to subscribe to the whole thing? I have asked myself these questions, and that is why ListWebber was created.

User's Guide

Using your forms-capable World Wide Web (WWW) browser, you can use ListWebber to search the archives of LISTSERV or ListProcessor lists and extract only the information you want.

To use ListWebber, simply complete the form and submit it. ListWebber then creates an email message (query) on your behalf and sends it to the server. The server processes your query and sends the results back to the email address you specified in the form. Be patient: a server may respond very quickly (within a few minutes), or it may not respond for more than a day. (If the address specified for a server is incorrect, you will get no response at all.) Lastly, repeat this process until you are satisfied with the results.
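
The message that gets created from the form can be pictured with a short sketch. This is only an illustration in Python, not ListWebber's own code: the function name build_query_message, the addresses at example.edu, and the exact command text are all assumptions made for the example, and the message is built but never sent.

```python
from email.message import EmailMessage
from email.utils import formataddr


def build_query_message(name, email_address, server_address, command):
    """Build (but do not send) an email query on the user's behalf.

    Every name here is hypothetical; the real ListWebber code may differ.
    """
    message = EmailMessage()
    # The user's name and real address identify the sender, so the
    # server's reply goes back to the user.
    message["From"] = formataddr((name, email_address))
    message["To"] = server_address
    # The search command travels in the body of the message.
    message.set_content(command + "\n")
    return message


if __name__ == "__main__":
    example = build_query_message(
        "Pat Reader",                 # hypothetical user
        "pat@example.edu",            # hypothetical real address
        "listserv@example.edu",       # hypothetical server address
        "Search Rosemary in MOVIES",  # command text from the form
    )
    print(example)
```

Keeping the name and address separate until the message is built mirrors the form, which asks for them as two different questions.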

Completing the form (constructing a query) is a five-step process made up of four questions and one statement. Each is explained below:

  1. "What is your name?" - The Internet mail protocol likes to know who is sending messages. Additionally, if your message gets corrupted or lost, the postmaster at the other end of the line may have a way of contacting you and forwarding your message appropriately.
  2. "What is your email address?" - Enter your real email address; do not enter an alias. For example, presently, my real email address is "eric_morgan@library.lib.ncsu.edu", but my alias is "eric_morgan@ncsu.edu". If you have an aliased email address, it is translated into your real email address every time you send email. Sometimes servers will check whether or not you are a member of the list being searched. To do this they check your email address against the email addresses they have on record. Since ListWebber does not ask for both your real email address and your alias, the server may not find your address on record if you enter the alias. Thus, play it safe; use your real email address.
  3. "What list do you want to query?" - Select one or more lists from the menu. Take note of the operating system listed in parentheses beside the list. You will need to know this later. Make sure you select lists from similar operating systems. Otherwise your query may not be understood by the server. You may want to consult The Directory of Scholarly Electronic Conferences in order to make your selection(s).
  4. "Do you want to 'Search the list' or 'Retrieve messages from the list?'" - Select either the Search or the Retrieve option. The first time you use ListWebber you will always select the Search option.

    Once you have used the Search option, you will (probably) receive a list of "hits" matching your query. Here is sample output from a non-Unix-based LISTSERV query:

    
    > S Eric Lease Morgan in PACS-L
    --> Database PACS-L, 11 hits.
    
    > I
    Item #   Date   Time  Recs   Subject
    ------   ----   ----  ----   -------
    007617 92/12/11 13:32  217   A Memo From Mr. Serials
    009941 93/10/22 13:18   87   LISTGopher
    010180 93/11/18 18:18   46   Re: First OPAC/PDA/Cellular Link
    010215 93/11/23 10:57  336   Current Cites 4, no. 11 (1993)
    010321 93/12/06 17:33   60   OPAC/Cellular/PDA Test
    010357 93/12/09 13:26   56   Re: WHY NEWSGROUPS? Was: LISTSERVS VS. NEWSGROUPS
    010417 93/12/15 17:20   41   Emerging Technologies Committee
    010944 94/02/23 15:39  251   Conference announcement - Computer conferencing
    011007 94/03/03 18:15   78   Reference Questions via Internet
    011056 94/03/10 17:42   25   NCSU, WWW, gopher, and WAIS
    011535 94/05/26 16:01   41   2 W3 | ~ 2 W3 
    				

    After you have received output similar to the example above, take note of the Item #'s representing the messages you would like to read. You will use these numbers to retrieve the original postings.

    Here is sample output from a Unix-based ListProcessor query:

    
    Matches for pattern public ...
    
    --- Archive: publib (path: listproc/publib) 
    
    >>> File 931203:
         IMPACT OF PUBLIC LIBRARY SERVICES ON CHILDREN
         Improves coordination between school and public
      Exemplary Public Library Reading-Related Programs
    Rollack, Barbara T. Public Library Services for
      Children's Listening Skills," Public Library
    <<< End of matches in file 931203
    
    >>> File 931204:
    The staff of the Burlington Public Library
    The staff of the Burlington Public Library
    <<< End of matches in file 931204
    
    >>> File 931205:
    	At the Burlington Public Library, we ask that children under 16
    The staff of the Burlington Public Library
    Sender: Glendora Public Library 
    exact age but do have a range.  Jill Patterson, Glendora Public
            Baldwinsville Public Library
    Adults in the Public Library" and I am thrilled to learn that this
    <<< End of matches in file 931205
    
    >>> File 931217:
    The staff of the Burlington Public Library
    The staff of the Burlington Public Library
    Rocky River (OH) Public Library
    The staff of the Burlington Public Library
    >   Bellingham Public Library     FAX   676-7795
    to abide by the system.  Many public libraries do not.  At the Onondaga
    County Public Library in Syracuse, New York we have free access to
    The public library may be the last outpost of a democratic society for 
      Bellingham Public Library     FAX   676-7795
    >public library?
    I want to let everyone know we at the Minneapolis Public Library have
    Minneapolis Public Library
    with The New York Public Library?--Thank you Marilee, Sandra, Sharon Bart &
    planned (and hopefully accurate) publication date of May 1, 1994.
    Minneapolis Public Library
    Cedar Rapids Public Library		%   Phone:  319-398-5123
    months I have been the public library consultant at the Oregon State
    Now I am seeking help for two public libraries in Oregon. Personal
      Robin Elbot                     | Reading Public Library 
    republics included.
    The staff of the Burlington Public Library
    <<< End of matches in file 931217
    
    >>> File 931219:
    * US Technology Public Policy
    <<< End of matches in file 931219
    
    >>> File 931227:
      > In the public libraries I've worked in I've had parents ask. They've heard
    arising from children being left unattended in public places.  The
    public facilities; (4) Anything else you think is relevant!
    <<< End of matches in file 931227 
    				

    Similarly, after you have received output similar to the example above, take note of the file numbers representing the messages you would like to read. You will use these numbers to retrieve the original postings.
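
If you save a reply to a file, the item or file numbers can be picked out mechanically instead of by eye. The sketch below is an illustration in Python and is not part of ListWebber; the function names are made up, and the patterns assume only the layouts shown in the two samples above (a six-digit item number followed by a date for LISTSERV, and ">>> File NNNNNN:" lines for ListProcessor).

```python
import re

# LISTSERV index rows in the sample start with a six-digit item number
# followed by a yy/mm/dd date.
LISTSERV_ITEM = re.compile(r"^\s*(\d{6})\s+\d{2}/\d{2}/\d{2}\s")

# ListProcessor marks each group of matches with ">>> File NNNNNN:".
LISTPROC_FILE = re.compile(r"^\s*>>> File (\d+):")


def listserv_item_numbers(output):
    """Return the item numbers found in LISTSERV index output."""
    numbers = []
    for line in output.splitlines():
        match = LISTSERV_ITEM.match(line)
        if match:
            numbers.append(match.group(1))
    return numbers


def listproc_file_numbers(output):
    """Return the file numbers found in ListProcessor search output."""
    numbers = []
    for line in output.splitlines():
        match = LISTPROC_FILE.match(line)
        if match:
            numbers.append(match.group(1))
    return numbers


if __name__ == "__main__":
    sample = (
        "Item #   Date   Time  Recs   Subject\n"
        "007617 92/12/11 13:32  217   A Memo From Mr. Serials\n"
        "009941 93/10/22 13:18   87   LISTGopher\n"
    )
    print(listserv_item_numbers(sample))
```

The numbers are kept as text rather than converted to integers so that leading zeros, as in 007617, are preserved for the Retrieve step.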

  5. "Enter your query." - This is the most difficult part of the form to complete.
    • If you are searching a non-Unix-based LISTSERV, then you can enter a Boolean or soundex query. Below is the help text on searching, taken straight from the LISTSERV instructions:
      
        Basic search functions
        ----------------------
      
        The  two  most important things you have to indicate when you search a
        database are:
      
        1.  The name of the database you want to search.
      
        2.  What you want to search the individual documents for.
      
        The name of the database to be searched is specified after  the  words
        or  phrases  to  be  sought  and  is prefixed with an IN keyword.  For
        example, we might do this:
      
                              Search Rosemary in MOVIES
      
        This would select all the entries from  database  "MOVIES"  containing
        the string "ROSEMARY".
      
        Now  if  you just wanted to see the list of all the movies you can see
        this week, you could have used  an  asterisk  as  search  argument  to
        select all the entries in the database:
      
                                  Search * in MOVIES
      
        Note  that  the  database name doesn't have to be uppercased.  This is
        merely done to make the examples look better.
      
        If you want to "narrow" your previous search, i.e. perform  additional
        tests  on  the  documents that have been previously selected, you must
        omit the IN keyword.  In that case, the search will be applied to  the
        previous "hits" and will create a new "hit list".
      
        But  in  most  cases, we will want to search for something longer than
        one word, for example part of a "key" sentence.
      
                   Search Hardware problem with a 4381 in IBMFORUM
      
        Another problem is that we  might  not  remember  the  exact  original
        sentence.
        This is not very important, since LISTSERV will search each word indi-
        vidually:    in  the above example, any entry that contained the words
        "hardware", "problem", "with", "a" and "4381" would have  matched  the
        search, even if the words appeared in a different order.
      
        But  what  if  the  original  document had "4381-13" in it, instead of
        "4381"?
        This is again no problem, as LISTSERV does not require the word to  be
        surrounded  by  blanks  to  find  a match.   Case is also ignored when
        performing the search operation.  That is, "problem" would have  found
        a  match  on  "problems"...    and  "with" would have found a match on
        "without" or "withstand"!  This may sound like inconsistent behaviour,
        but you should keep in mind that it  is  always  possible  to  "narrow
        down"  a search operation.  However, once a document has been excluded
        from the list of "hits", it is very difficult to bring it back.
      
        Now what if I want to search for an exact string?  For example,  I  am
        interested  in  the  string  "in C".   It is very likely that just any
        document in the database will contain both a "in" and  the  letter  C.
        But  what  I  am  interested  in is things which have been written, or
        programmed, or implemented, "in C".
        In that case, it is possible to force LISTSERV to group words together
        by quoting them, as in:
      
                               Search 'in C' in UTILITY
      
        This method can also be used to insert extra blanks between or  before
        words:    leading  and  trailing  blanks are normally removed automat-
        ically, but they are preserved inside quoted  strings.    Please  note
        that  quotes  must be doubled when specified inside quoted strings, as
        in:
      
                         Search 'Rosemary''s baby' in MOVIES
      
      
        The search for 'in C' resulted in over fifty hits, because a match was
        erroneously found against "in clear", "in core", etc.   However, I  do
        not  want  to  search for 'in C ' because there might be hits with "in
        C." or "in C," in the database and I don't want to miss them.
        If the search respected the capital C, it would  no  longer  find  all
        those  irrelevant  hits.    To  do  this, you must enclose your search
        string in double-quotes instead of single quotes, for example:
      
                               Search "in C" in UTILITY
      
        Note that single quotes should not  be  doubled  inside  double-quoted
        strings, and vice-versa.  Only quotes of the same type than the string
        should be doubled.
      
        It  is important to understand the difference between the two types of
        quoting.  If you request a search for 'TEXT', you will find a match on
        "TEXT", "Text", "text" or even "teXt".  This is the same behaviour  as
        unquoted  text.   However, if you request a search for "TEXT", it will
        only find a match on "TEXT", not on "text" nor "Text".
      
      +
      + Quoting is also the only way to search for  a  reserved  keyword  like
      + "IN":  if you tried "Search in in UTILITY", LISTSERV would report that
      + database  "IN"  does  not exist and would reject the command.  This is
      + because the keyword IN indicates the end of your search arguments.  If
      + you quote it, however, it will not be recognized and will be  searched
      + as  you  wanted  it  done.    Similarly,  if you want to search for an
      + asterisk, you will have to quote it since "Search  *"  indicates  that
      + all entries should be selected.
      +
      
      
        Now the problem is that there may be sentences starting with a capital
        I,  e.g.  "In  C, it would be coded this way:".  How can I catch these
        sentences?
        Actually, you have been using "complex search  expressions"  from  the
        beginning without even being aware of it.  When you specified a search
        on  "Hardware  problem with a 4381", you had in fact been asking LIST-
        SERV for:  "Hardware AND problem AND with AND a AND 4381".  The  "AND"
        is implicit, but it may be overridden.  You may even use parenthesis if
        needed:
      
                   Search ("in C" or "In C") and program in UTILITY
      
        The "AND" can still be implied, as in:
      
        +--------------------------------------------------------------------+
        | Search wooden chair (blue or green) in CHAIRS                      |
        | Search (wooden chair) or (plastic chair) in CHAIRS                 |
        | Search plastic chair (blue or green but not streaked) in CHAIRS    |
        |                                                                    |
        | The following commands are strictly equivalent:                    |
        |  Search (wooden chair) or (plastic chair not blue) in CHAIRS       |
        |  Search chair (wooden or (plastic not blue)) in CHAIRS             |
        |  Search chair (wooden or (plastic but not blue)) in CHAIRS         |
        |  Search chair AND (wooden OR (plastic AND NOT blue)) in CHAIRS     |
        |                                                                    |
        | Figure 4.  Sample  SEARCH  commands  using complex document search |
        |            arguments                                               |
        +--------------------------------------------------------------------+
      
      
      
        Date specifications
        -------------------
      
      
        Since each document has been  assigned  a  "date/time"  field,  it  is
        possible to select documents based on this date field.  This is accom-
        plished  by appending "date search rules" to the search expression, as
        in:
      
        +--------------------------------------------------------------------+
        | Search problem (serious or severe) in BBOARD since july            |
        | Search problem in BBOARD since oct 85                              |
        | Search symptom in BBOARD since 12/28                               |
        | Search error report from 12 january to august in BBOARD            |
        | Search user complaint until 18 sept in BBOARD                      |
        | Search data check since today 11:53 in EREP                        |
        |                                                                    |
        | Figure 5.  Sample SEARCH commands using date search arguments      |
        +--------------------------------------------------------------------+
      
        The default values for omitted arguments are always chosen  so  as  to
        exclude as little entries as possible.  For example, "July" would mean
        "1  July 00:00:00" in a SINCE specification, and "31 July 23:59:59" in
        an UNTIL clause.  The only exception is the year field,  which  always
        defaults to the current year.
      
      
      
        Keyword search specifications
        -----------------------------
      
      
        The  last  thing  you  may wish to search is the "keywords" list.  For
        example, you might want to select those plastic chairs which cost less
        than 50 dollars.  It is assumed that the price will vary often  (maybe
        almost daily), and that it is therefore kept externally from the docu-
        ment  describing  the  chair.   Thus, you would have a "Price" keyword
        which you could search in the following way:
      
                   Search plastic chair in CHAIRS where price < 50
      
        You may of course use complex expressions (with  parenthesis)  in  the
        WHERE  clause.   There are new comparison operators available for this
        clause, like IS, CONTAINS, all the usual arithmetical comparison oper-
        ators, and some more.    However,  the  AND  operation  is  no  longer
        implied, but it can still be specified explicitly of course:
      
             Search plastic chair in CHAIR where price < 50 and avail > 4
      
        The  problem  now is that, as the search commands become more and more
        complex, they will no longer fit in a single  line.    To  solve  this
        problem,  it  has  been  decided that any database command ending in a
        dash indicates that more is to follow on the next line.  This  process
        can be repeated several times if desired.  This applies to both inter-
        active and batch commands.
      
        +--------------------------------------------------------------------+
        | Search chair (wooden or (blue or green but not streaked)) -        |
        |        in CHAIRS -                                                 |
        |        where price < 50 & avail > 4                                |
        |                                                                    |
        | Search chair (wooden or (blue or green but not streaked)) -        |
        | in CHAIRS where price < 50 & avail > 4                             |
        |                                                                    |
        | Search chair (wooden or ( -                                        |
        | blue or green but not streaked) -                                  |
        | ) -                                                                |
        | in CHAIRS where price < 50 & avail > 4                             |
        |                                                                    |
        | Figure 6.  Sample  SEARCH  commands  with continuation lines:  All |
        |            these commands are  strictly  identical,  although  the |
        |            first one is obviously more legible.                    |
        +--------------------------------------------------------------------+
      
        The  only  "trick"  about  this continuation line business is that you
        should always keep quoted strings on a single line.   The  process  of
        identifying  continuation  lines and concatenating them afterwards may
        cause unwanted blanks to be inserted in the command line, which is  no
        problem  outside  a  quoted string since blanks are ignored, but might
        cause erroneous results in a quoted string.
      
        If you want to search for several possible values in a given  keyword,
        you do not have to repeat the keyword name and operator:
      
        +--------------------------------------------------------------------+
        | > Search * in BBOARD where -                                       |
        | > subject contains (PC or (Personal and computer))                 |
        |                                                                    |
        | is strictly equivalent to:                                         |
        |                                                                    |
        | > Search * in BBOARD where -                                       |
        | > subject contains PC or -                                         |
        | > (subject contains Personal and subject contains computer)        |
        |                                                                    |
        | Figure 7.  Sample use of "factorization"                           |
        +--------------------------------------------------------------------+
      
        However,  it  should  be  noted that this "factorization" is performed
        according to the rules of logic, which may not necessarily match those
        of english grammar.  This removes any possible  ambiguity  as  to  the
        meaning of these clauses.  Let's consider the following example:
      
                        machine does not contain (IBM and DEC)
      
        This clause will get translated into:
      
            machine does not contain IBM and machine does not contain DEC
      
        In  english  you  would probably say "machine contains neither IBM nor
        DEC" .  This is how LISTSERV will understand it.  However, if you read
        the clause aloud, you will probably not pronounce the parenthesis  and
        will  end  up  saying "machine does not contain IBM and DEC", in other
        words, "machine does not contain both  IBM  and  DEC"  ,  which  is  a
        totally  different  thing  (and  would  most  probably be true all the
        time).  The "english meaning" could be  obtained  with  the  following
        clause:
      
                         not (machine contains (IBM and DEC))
      
        In  the  former  case,  the  negative  "does  not contain" operator is
        inserted inside the parenthesis.  In the latter,  only  "contains"  is
        moved, and the negation remains outside.
      
        +--------------------------------------------------------------------+
        | > Search gateway problem -                                         |
        | > in BBOARD -                                                      |
        | > since sept 86 -                                                  |
        | > where sender contains (john or paul but not mick) -              |
        | > and subject does not contain lost                                |
        | --> Database BBOARD, 5 hits.                                       |
        |                                                                    |
        | > Index                                                            |
        | Item #   Date   Time  Recs   Subject                               |
        | ------   ----   ----  ----   -------                               |
        | 000012 87/10/18 13:09   12   The gateway has stopped working       |
        | 000017 87/08/24 09:18    9   Glory glory alleluja! Again!!!        |
        | 000018 87/10/18 13:09    8   You know what? It WORKS!!!            |
        | 000024 87/10/18 13:09    7   Guess what happened today?            |
        | 000205 87/10/04 16:59    9   Who's going to babysit it today?      |
        |                                                                    |
        |                                                                    |
        | You  might now wish to narrow your search down to exclude postings |
        | whose subject contains "work".  You can do this  by  specifying  a |
        | new WHERE clause with no associated IN.                            |
        |                                                                    |
        | > Search * where subject does not contain work                     |
        | --> Database BBOARD, 3 hits.                                       |
        |                                                                    |
        | > Index                                                            |
        | Item #   Date   Time  Recs   Subject                               |
        | ------   ----   ----  ----   -------                               |
        | 000017 87/08/24 09:18    9   Glory glory alleluja! Again!!!        |
        | 000024 87/10/18 13:09    7   Guess what happened today?            |
        | 000205 87/10/04 16:59    9   Who's going to babysit it today?      |
        |                                                                    |
        | Figure 8.  Sample SEARCH commands with keyword search clauses      |
        +--------------------------------------------------------------------+
      
        Finally,  the  reason why the database name appears in each reply from
        LISTSERV is that you can specify several database in the IN clause:
      
        +--------------------------------------------------------------------+
        | > Search user complaint in BBOARD1 BBOARD2 -                       |
        | > since august -                                                   |
        | > where sender is charles                                          |
        | --> Database BBOARD1, 2 hits.                                      |
        | --> Database BBOARD2, 8 hits.                                      |
        |                                                                    |
        |                                                                    |
        | Figure 9.  Sample SEARCH commands involving several databases      |
        +--------------------------------------------------------------------+
      
      
      
        Phonetic search
        ---------------
      
      
        There may be cases where you are looking for  a  certain  value  of  a
        keyword,  the  exact  spelling of which you cannot remember.  In these
        cases, it may be useful to try a phonetic search.   A phonetic  search
        will yield a match for anything that "sounds like" your search string,
        as  dictated by a predefined algorithm which is of course not perfect.
        It may give a hit for something which does  not  actually  sound  like
        your  search  string,  or, more rarely, omit a keyword which did sound
        like what you entered.  The main reasons for this are that  the  algo-
        rithm  must  be  fast  to execute on the machine and therefore not too
        sophisticated, and that the way a given word is pronounced depends  on
        the  idiom in which the word was written.  For example, the phonetical
        transcription of the  name  "Landau"  will  be  different  in  French,
        English, German and Russian.  Thus, it is impossible to decide whether
        a  word  sounds  like  another  if the language in which the words are
        pronounced is not known (and of  course  LISTSERV  does  not  have,  a
        priori, any way to know it).
      
        Phonetic searches are performed through the use of the SOUNDS LIKE and
        DOES  NOT  SOUND  LIKE  operators,  which are syntactically similar to
        CONTAINS and DOES NOT CONTAIN.  That is, you could do something like:
      
                  Select * in PHONEBOOK where NAME sounds like WOLF
      
        There is a little trick with the SOUNDS LIKE operator that you  should
        be  aware of.   If your search string (WOLF in our above example) is a
        single word, it will be compared individually to all the words in  the
        reference  string  (i.e.  the  data  from  the  database), and will be
        considered a hit if it "sounds like" any of the words in the reference
        string.   Thus, the search word  "Ekohl"  sounds  like  the  reference
        string  "Ecole  Normale Superieure" because it matches the first word.
        If the search string contains more  than  one  word,  the  search  and
        reference strings will be compared phonetically as a whole (and "Ekohl
        Dzentrahll"  will  therefore  not  match  "Ecole Normale Superieure").
        Note that any search string containing more than a single word must be
        quoted, as explained in the previous sections of this chapter.
      
        +--------------------------------------------------------------------+
        | > Select * in BITEARN where site sounds like (COHRNEAL and LAPORRAD|RY)
        | --> Database BITEARN, 3 hits.                                      |
        |                                                                    |
        | > Index                                                            |
        | Ref# Conn  Nodeid   Site name                                      |
        | ---- ----  ------   ---------                                      |
        | 0292 87/03 CRNLASSP Cornell University Cornell Laboratory of Atomic|
        | 0301 87/03 CRNLION  Cornell University Cornell Laboratory of Plasma|
        | 0307 87/06 CRNLNUC  Cornell University Laboratory of Nuclear Studes|
        |                                                                    |
        | > Select * in BITEARN where SITE sounds like HOPTIKK               |
        | --> Database BITEARN, 2 hits.                                      |
        |                                                                    |
        | > Index                                                            |
        | Ref# Conn  Nodeid   Site name                                      |
        | ---- ----  ------   ---------                                      |
        | 0751 87/09 FRIHAP31 Assistance Publique - Hopitaux de Paris        |
        | 2120 87/04 UOROPT   University of Rochester The Institute of Optics|
        |                                                                    |
        | > Select * in BITEARN where SITE sounds like SCHIKAGO              |
        | --> Database BITEARN, 1 hit.                                       |
        |                                                                    |
        | > Index                                                            |
        | Ref# Conn  Nodeid   Site name                                      |
        | ---- ----  ------   ---------                                      |
        | 0140 86/03 BMLSCK11 Studiecentrum voor Kernenergie (SCK/CEN), Mol, |
        |                                                                    |
        | Figure 10.  Sample SEARCH commands involving phonetic  match:  The |
        |             first  command  shows  an example of accurate phonetic |
        |             match, where the  result  is  exactly  what  the  user |
        |             expected.   In the second example, the user found what |
        |             he was  looking  for  ("Optics"),  but  an  additional |
        |             unwanted  entry was selected.  This is by far the most |
        |             common case.  The last command is a typical example of |
        |             phonetic clash, where the algorithm did not  translate |
        |             the  search string into phonetics as the user expected |
        |             it, with the result that the desired name  ("Chicago") |
        |             was  not  found and that completely irrelevant entries |
        |             were presented instead.                                |
        +--------------------------------------------------------------------+
      
      +
      + The phonetic matching algorithm used by LISTSERV is a  slightly  modi-
      + fied  version  of  SOUNDEX  --  a  well-known  algorithm that provides
      + reasonably accurate matches at a very low CPU cost.  Although it gives
      + best results with the English language, for which  it  was  originally
      + designed, it is not too strongly tied to it and can still be used with
      + other  languages.    It is of course absolutely impossible to write a
      + program that would work for all the languages in the  world,  or  even
      + for  the most widely used ones, since their interpretation of the most
      + common combinations of letters are completely incompatible.
      + 
      						
    • If you are searching a Unix-based ListProcessor, then enter a "regular expression" specifying your query. (In my opinion, regular expressions are really ugly looking things.) Below is the help text taken directly from the ListProcessor instructions on how to formulate search queries. Important! The only parameter you must supply is the "<pattern>". ListWebber supplies the "search" and "<archive | path-to-archive>" arguments. If necessary, you can also supply the "[/password] [-all]" arguments.
      
      Syntax: search <archive | path-to-archive> [/password] [-all] <pattern>
              Search all files of the specified archive (and all of its subarchives
              if -all is specified) for lines that match the pattern. The pattern
              can be an egrep(1)-style regular expression with support for the
              following additional operators: '~', if leading the regular expression
              it reverses its meaning; '|' and '&' separate multiple regular
              expressions (logical OR and AND); '<' '>' group regular expressions
              (we preserve the meaning of the parentheses from ed(1), and remove the
              meaning of < and > from ed(1) since in the ListProcessor context they are
              either the default, or inappropriate). These can be used literally by
              escaping them with '\'. In addition, the following characters should
              be defined in matched pairs: (), <>, []. The pattern may be enclosed
              in single or double quotes.
      
              Pattern matching is case insensitive.
      
              Certain archives may be private, and in this case you have to
              specify a password for accessing them. The slash is required.
              Different archives may have different passwords.
      
              The archiving system is hierarchical. Therefore subarchives
              may have the same names; they can be distinguished by the path
              (the branch in the hierarchy) to them. For example, the archives
              unix and pub/unix are distinct. 'path-to-archive' is a UNIX style
              path (such as pub/unix) -- i.e. a '/' is used to move through the
              branches of the hierarchy. An 'index' request always reports paths
              to archives for your convenience.
      
              Examples:
      
              search listproc -all "<oranges &~apples>|.*andarin[sz]?"
              search ilp '\|.*\|.$'
      
              Notes:
              - . matches ANY character including newline, so to find all lines
                that contain a newline only, one should use '^.$' instead of '^$'
              - If you do not quote your pattern, it will include any blanks you
                may have appended after it by mistake, and it will not include any
                blanks that may start the regular expression. 
      						
    • If you want to "Retrieve messages from the list", then enter the numbers corresponding to the messages you want to retrieve. (Examples of such numbers are illustrated in Step #4.) Separate the numbers with at least one space. No more than about a dozen messages can be specified at one time. It makes no sense to select more than one list when retrieving postings (unless, of course, you want to retrieve random messages). LISTSERVs often impose a limit on the number of bytes they will send to you in one day. If you receive such an error, you will have to wait until the next day and retransmit your requests.
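
The phonetic clash described in the LISTSERV help text above (SCHIKAGO failing to find "Chicago") is easier to see with the codes written out. What follows is a minimal sketch of the classic Soundex algorithm in Python. It is not the modified version LISTSERV uses, and it is not part of ListWebber, so real LISTSERV results may differ from what it predicts. The function name soundex and the test words are my own choices for illustration.

```python
# Classic (American) Soundex. LISTSERV uses a modified variant, so this
# sketch only illustrates the general idea behind "sounds like" matching.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character classic Soundex code for a word."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]                  # the first letter is kept literally
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "HW":
            continue                     # H and W do not separate equal codes
        digit = CODES.get(c, "")         # vowels and Y have no code
        if digit and digit != previous:
            result += digit
        previous = digit
    return (result + "000")[:4]          # pad or truncate to four characters


for name in ("Chicago", "SCHIKAGO", "Optics", "Optiks"):
    print(name, soundex(name))
```

Because the first letter is kept as-is, "Chicago" and "SCHIKAGO" receive codes that differ in their first character and so never match, whereas "Optics" and "Optiks" receive the same code. This is one plausible reason to try several spellings when a phonetic search returns nothing useful.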

Finally, after completing the form, select the "Submit query" option and ListWebber will do the rest. Remember, LISTSERVs respond to these queries at various speeds. Be prepared to wait 24 hours for a reply. On the other hand, the LISTSERV may respond in just a few minutes.

Caveats

There are a number of limitations to searching LISTSERV list archives:

Administrator's Guide

ListWebber is a CGI script written in Perl. Very little of the code should have to be modified for your site; the only part of the Perl scripts that ought to be modified is the footer of the document specifying the name of the ListWebber's administrator (you).

The goal of ListWebber is to create an SMTP message on behalf of the user and send this message to the user-specified LISTSERV for processing. If the user-specified LISTSERV is Unix-based, then ListWebber sends a simple "search" statement. On the other hand, if the LISTSERV is not Unix-based, then ListWebber constructs an SMTP message in the form of a Job Control Language (JCL) script.

To enable the user to select a LISTSERV, you must create a text file listing the user's choices. This is a feature. It allows you to create subject-specific lists of lists. For example, you could have a ListWebber for computer science-related lists as well as library-related, engineering-related, or English-related lists. Combinations of subjects may be in order as well, for example, all the lists in which you, as an individual, are interested.

The text files specifying the user's choices take the form list:address:type where list is the name of a list, address is the name of the computer hosting the list, and type is a "u" denoting a Unix-based ListProcessor or a "v" denoting a non-Unix-based LISTSERV. Here is an example ListWebber specification file:

ATLAS-L:TCUBVM.bitnet:v 
CWIS-L:WUVMD.bitnet:v 
ADAPT-L:AUVM.bitnet:v 
ADVANC-L:IDBSU.bitnet:v 
AFAS-L:KENTVM.bitnet:v 
ALF-L:YORKVM1.bitnet:v 
ARCHIVES:ARIZVM1.bitnet:v 
ARIE-L:IDBSU.bitnet:v 
ARLIS-L:UKCC.bitnet:v 
ASIS-L:UVMVM.bitnet:v
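
If you want to check a specification file before putting it into service, the list:address:type format is simple enough to validate with a few lines of code. The sketch below is in Python rather than ListWebber's Perl, and the names ListEntry and parse_spec are hypothetical, so treat it as an illustration of the format and not as the way ListWebber itself reads the file.

```python
from typing import NamedTuple


class ListEntry(NamedTuple):
    name: str     # name of the list, e.g. "CWIS-L"
    address: str  # computer hosting the list, e.g. "WUVMD.bitnet"
    kind: str     # "u" = Unix-based ListProcessor, "v" = non-Unix LISTSERV


def parse_spec(text):
    """Parse list:address:type lines, skipping blank lines."""
    entries = []
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()              # tolerate trailing blanks
        if not line:
            continue
        parts = line.split(":")
        if len(parts) != 3 or parts[2] not in ("u", "v"):
            raise ValueError(
                f"line {number}: expected list:address:type, got {line!r}")
        entries.append(ListEntry(*parts))
    return entries


SAMPLE = "ATLAS-L:TCUBVM.bitnet:v\nCWIS-L:WUVMD.bitnet:v\n"

for entry in parse_spec(SAMPLE):
    print(entry.name, entry.address, entry.kind)
```

A malformed line (a missing field, or a type other than "u" or "v") raises a ValueError naming the offending line number.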

The initial call to ListWebber is done in a non-standard way: URLs specifying ListWebber services must include the full path name of the ListWebber specification file as their first argument. If a file name is not included in the URL, then ListWebber will report an error and no other processing will take place. Below is a snippet of HTML demonstrating possible calls to ListWebber services:

Search various LISTSERV lists:

Notice how each call to the ListWebber program (listwebber2) is followed by a question mark (?) and then the full-path name to a ListWebber specification file (/usr/usrers/temp/www/httpd/cgi-bin/library.lists, /usr/usrers/temp/www/httpd/cgi-bin/engineering.lists, or /usr/usrers/temp/www/httpd/cgi-bin/computer.lists).
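
To make the URL convention concrete, here is a small Python sketch that builds such a link and shows one way a CGI program could recover the path from the QUERY_STRING environment variable. The server name in SCRIPT_URL and the function names are assumptions made for illustration, and ListWebber's own Perl code may read its argument differently.

```python
import html
import os
from urllib.parse import quote, unquote

# Hypothetical script URL; substitute your own server and cgi-bin location.
SCRIPT_URL = "http://www.example.edu/cgi-bin/listwebber2"


def listwebber_link(spec_path, label):
    """Return an HTML anchor: script URL, a question mark, then the path."""
    href = f"{SCRIPT_URL}?{quote(spec_path)}"
    return f'<a href="{href}">{html.escape(label)}</a>'


def spec_path_from_environment(environ=os.environ):
    """Return the text after the '?' as a CGI program would see it."""
    return unquote(environ.get("QUERY_STRING", ""))


print(listwebber_link("/usr/usrers/temp/www/httpd/cgi-bin/library.lists",
                      "Library-related lists"))
```

An empty result from spec_path_from_environment corresponds to the error case described above, where no file name was included in the URL.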

After modifying the footer of the ListWebber script, and after creating at least one specification file, put the ListWebber code in an executable directory. Lastly, annotate your HTML files to include calls to ListWebber documents.
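
For administrators curious about the message-building step, the following Python sketch composes the kind of one-line "search" request described earlier for a Unix-based ListProcessor. Several details are assumptions on my part: that the server's mailbox is listproc at the host named in the specification file, that the list name doubles as the archive name, and the function name itself. The JCL form sent to non-Unix LISTSERVs is not shown because its layout is not given in this document.

```python
from email.message import EmailMessage
from email.utils import formataddr


def build_listproc_query(user_name, user_email, list_name, host, pattern):
    """Compose a ListProcessor search request on behalf of a user."""
    msg = EmailMessage()
    # Results are mailed back to whoever the message claims to be from.
    msg["From"] = formataddr((user_name, user_email))
    # Assumption: the server is reached at listproc@<host>.
    msg["To"] = f"listproc@{host}"
    # Assumption: the archive has the same name as the list.
    # Quoting the pattern keeps stray blanks out of the expression.
    msg.set_content(f'search {list_name} "{pattern}"\n')
    return msg


query = build_listproc_query("Pat Reader", "pat@example.edu",
                             "cwis-l", "lists.example.edu", "gopher|wais")
print(query)
```

The message is only constructed here, not sent. Handing it to smtplib.SMTP(...).send_message() would deliver it, provided a mail relay is available to your web server.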

Possible Improvements

I could have written ListWebber so that it automatically looks for *.lists files if no specification file were included, but then the specification files would have to reside in a specific location in your server's directory structure and would have to have a consistent extension like .lists. I opted not to impose these restrictions on you.

Another possible improvement would be to allow the user to specify any list/LISTSERV combination. This, in effect, would turn ListWebber into an anonymous mailer, and this is a situation we want to avoid.

Release Notes


Creator: Eric Lease Morgan <eric_morgan@infomotions.com>
Source: This text was never formally published.
Date created: 1996-05-31
Date updated: 2005-04-23
Subject(s): computer programs and scripts;
URL: http://infomotions.com/musings/list-webber/