Xref: feenix.metronet.com alt.surrealism:99
Path: feenix.metronet.com!news.utdallas.edu!hermes.chpc.utexas.edu!cs.utexas.edu!usc!howland.reston.ans.net!agate!msuinfo!netnews.upenn.edu!mjd
From: mjd@saul.cis.upenn.edu (Mark-Jason Dominus)
Newsgroups: alt.surrealism
Subject: Automatic Keywords
Keywords: blunder harvest Jan shipboard
Message-ID: <MJD.93Oct14001800@saul.cis.upenn.edu>
Date: 14 Oct 93 04:17:59 GMT
Sender: news@netnews.upenn.edu
Organization: University of Pennsylvania
Lines: 58
Nntp-Posting-Host: saul.cis.upenn.edu


The enclosed program generates `Keywords:' lines for your mail messages
and news posts.  It accepts as arguments a number of keywords to
generate (default is 4) and a list of files to select keywords from
(default is /usr/dict/words).  It is written in Perl.

Some sample outputs:

Keywords: embedded hidden masterpiece pyramid
Keywords: Halpern keystone losable Poe
Keywords: goatherd metabolic quiver Societe
Keywords: deviant snatch Vietnam viewpoint
Keywords: immovable incite multiply urgency

The program has the delightful property of producing all possible sets
of keywords equiprobably.  
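
A sketch of why this holds, using shorthand that is not in the program
itself: let N be the number of input lines and k the number of keywords
requested, with k <= N.  The loop in the program below keeps a line with
probability $num/$denom and skips it with probability
($denom-$num)/$denom.  Along any run that ends with a particular set S
of k lines, the denominators are N, N-1, ..., 1; the numerators at the
kept lines are k, k-1, ..., 1; and the numerators at the skipped lines
are N-k, N-k-1, ..., 1, because $denom-$num drops by one at each skip
and is unchanged by a keep.  Multiplying the factors together:

```latex
\Pr[\text{output} = S]
  = \frac{\bigl(k (k-1) \cdots 1\bigr)\,\bigl((N-k)(N-k-1) \cdots 1\bigr)}
         {N (N-1) \cdots 1}
  = \frac{k!\,(N-k)!}{N!}
  = \binom{N}{k}^{-1}
```

The result does not depend on which lines are in S, so every k-element
set is equally likely.  For example, N = 4 and k = 2 gives
2! 2! / 4! = 1/6 for each of the six possible pairs.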

#!/usr/local/bin/perl
# select N lines from input (4 by default), equiprobably.

srand;

$lnct=0;
$num=4;			# select this many more lines

$#lines = 30000;	#pre-extend for efficiency.

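# A leading all-digit argument sets the keyword count.  Test for an
# argument first, and require \d+ so an empty string is not a count.
while ($#ARGV != -1 && $ARGV[0] =~ /^\d+$/) {
    $num=$ARGV[0];
    shift;
}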

if ($#ARGV == -1) {
    @ARGV=('/usr/dict/words');
}

while (<>) {
  chomp;	# strip only the newline; chop would eat a letter from an unterminated last line
  $lines[$lnct++] = $_; 
}

$denom=$lnct;		# from a pool of this many more
# so the probability that the next line should be selected is $num/$denom.

print "Keywords:";
for ($i=0; $i<$lnct; $i++)  {
  if (rand() < ($num / $denom) ) {
    print " ", $lines[$i];
    $num--;
  }
  $denom--;
}

print "\n";
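
The equiprobability claim can also be checked empirically.  The
following is a rough sketch, not part of the posted program: pick() is
a hypothetical helper that wraps the same selection loop so that it
returns the chosen items instead of printing them, and the four-letter
pool and the trial count are arbitrary choices made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): the same selection
# loop as above, returning the chosen items instead of printing them.
sub pick {
    my ($num, @pool) = @_;
    my $denom = @pool;          # items not yet considered
    my @chosen;
    for my $item (@pool) {
        # keep this item with probability $num/$denom
        if (rand() < $num / $denom) {
            push @chosen, $item;
            $num--;
        }
        $denom--;
    }
    return @chosen;
}

# Draw 2 of 4 letters many times and tally how often each pair appears.
my %count;
my $trials = 60_000;
$count{ join '', pick(2, qw(a b c d)) }++ for 1 .. $trials;

# Print the observed frequency of each pair.
printf "%s %.3f\n", $_, $count{$_} / $trials for sort keys %count;
```

There are six possible pairs, so each frequency should come out near
1/6, or about 0.167, with small run-to-run variation.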
--

 If you never did, / you should.  /  These things are fun / and fun is good.
Mark-Jason Dominus 	  			    mjd@central.cis.upenn.edu 
