#!/usr/local/bin/perl -w
2
3=head1 NAME
4
5kocos.pl - Find the Kth order co-occurrences of a word
6
7=head1 SYNOPSIS
8
9This program finds the Kth order co-occurrences of a given word.
10
11=head1 DESCRIPTION
12
13=head2 1. What are Kth order co-occurrences?
14
Co-occurrences are words that occur together in the same context. All
words that co-occur with a given target word are called its co-occurrences.
The concept of 2nd order co-occurrences is explained in the paper Automatic
Word Sense Discrimination [Schutze98]. According to this paper, the words
that co-occur with the co-occurring words of a target word are called the
2nd order co-occurrences of that word.
21
22So with each increasing order of co-occurrences, we introduce an extra level
23of indirection and find words co-occurring with the previous order
24co-occurrences.
25
26We generalize the concept of 2nd order co-occurrences from [Schutze98] to find
27the Kth order co-occurrences of a word. These are the words that co-occur
28with the (K-1)th order co-occurrences of a given target word.
29
30We have also found [Niwa&Nitta94] to be related to kocos. While we do not
31exactly reimplement the co-occurrence vectors they propose, we feel that
32kocos is at least similar in spirit.
33
34=head2 2. Usage
35
36Usage: kocos.pl [OPTIONS] BIGRAM
37
38=head2 3. Input
39
40=head3 3.1 BIGRAM
41
42Specify the BIGRAM file name on the command line after the program name and
43options (if any) as shown in the usage note.
44
BIGRAM should be bigram output (normal or extended) created by the NSP
programs count.pl, statistic.pl or combig.pl. When count.pl and statistic.pl
are run to create bigrams (--ngram set to 2 or not specified), they list the
bigrams of all words that co-occur within a certain window. So if a bigram
'word1<>word2<>' is listed in the output of count.pl or statistic.pl, the
words word1 and word2 are co-occurrences of each other.
52
In general you may want to use a stop list (--stop option in count.pl) to
remove very common words such as "the" and "for", and to eliminate low
frequency bigrams (--remove option in count.pl). The stop list is
particularly important because high frequency words such as "the" or "for"
co-occur with many different words and greatly expand the search needed to
find kth order co-occurrences.
59
If you want to run kocos.pl on a source file not created by the count.pl
or statistic.pl program of this package, make sure that each line of the
BIGRAM file lists two words WORD1 and WORD2 as

 WORD1<>WORD2<>

The program minimally requires that there are exactly two words, separated
by the delimiter '<>', with an extra delimiter '<>' after the second word.
So you may convert any non-NSP input to this format, where two words
occurring in the same context are '<>' separated, and provide it to kocos.
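
The format described above can be checked with a few lines of Perl. The
following is only a sketch: the helper names parse_bigram and format_bigram
are hypothetical and are not part of kocos.pl or NSP. It assumes that
anything after the second '<>' (such as NSP frequency counts) can be ignored.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of kocos.pl): split one BIGRAM line into
# its two words, or return an empty list if the line is not in the
# WORD1<>WORD2<> format.
sub parse_bigram {
    my ($line) = @_;
    chomp $line;
    # Count the '<>' delimiters; a bigram line must have exactly two.
    my $delims = () = $line =~ /<>/g;
    return () unless $delims == 2;
    # Anything after the second '<>' is ignored here.
    my ($word1, $word2) = split /<>/, $line;
    return ($word1, $word2);
}

# Hypothetical helper: write a pair of co-occurring words in BIGRAM format.
sub format_bigram {
    my ($word1, $word2) = @_;
    return "$word1<>$word2<>";
}

print format_bigram('line', 'file'), "\n";
my @words = parse_bigram("text<>line<>3 5 4");
print join(' ', @words), "\n";
```

A line such as 'line<>file' (without the trailing delimiter) yields an empty
list in this sketch, which corresponds to the case kocos.pl rejects.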
68
=head3 3.2 Controlling scope of the context
70
You may want to treat two words as co-occurrences of each other only if they
occur within a specific distance of each other. In this case we encourage you
to use the --window w option of the NSP program count.pl while creating the
BIGRAM file. This will create bigrams of all words which co-occur within a
distance w from each other. Thus --window w sets the maximum distance allowed
between two words for them to be called co-occurrences of each other.
77
78Note that if the --window option is not used while creating BIGRAM input
79for kocos, only those words which come immediately next to each other will
80be considered as the co-occurrences (default window size being 2 for bigrams).
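
To make the effect of the window concrete, here is a rough sketch of how
word pairs within a window could be enumerated. It is not count.pl's code:
the helper name window_bigrams is hypothetical, and the sketch assumes that a
window of w tokens includes both words of the pair, so that a window of 2
pairs only adjacent words.

```perl
use strict;
use warnings;

# Hypothetical helper: list every ordered pair of tokens that fits inside
# one window of $window tokens (the window includes both words).
sub window_bigrams {
    my ($window, @tokens) = @_;
    my @bigrams;
    for my $i (0 .. $#tokens) {
        # The second word may be at most ($window - 1) positions to the right.
        my $last = $i + $window - 1;
        $last = $#tokens if $last > $#tokens;
        for my $j ($i + 1 .. $last) {
            push @bigrams, "$tokens[$i]<>$tokens[$j]<>";
        }
    }
    return @bigrams;
}

my @tokens = qw(print the text line);
print "$_\n" for window_bigrams(3, @tokens);
```

With a window of 3, this sketch also pairs words that have one other word
between them, so more words become first order co-occurrences of each other.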
81
82=head2 4. Options
83
84=head3 4.1 --literal WORD
85
86With this option, the target WORD whose kth order co-occurrences are to be
87found can be directly specified on the command line.
88
89e.g.
90        kocos.pl --literal line test.input
91will find the 1st order co-occurrences (by default) of the word 'line' using
92Bigrams listed in file test.input.
93
94	kocos.pl --literal , --order 3 test.input
95will find 3rd order co-occurrences of ',' from file test.input.
96
97=head3 4.2 --regex REGEXFILE
98
With this option, the target word can be specified using one or more Perl
regular expressions. The regexes should be written in a file, each enclosed
in '/' delimiters, and multiple regexes should either appear on separate
lines or be separated by the Perl 'OR' operator (|).

We provide this option to allow the user to specify various morphological
variants of the target word, e.g. line, lines, Line, Lines etc.
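
As a rough illustration of how several regexes from a file can act as one
target pattern, the sketch below strips the enclosing slashes from each line
and joins the patterns with '|'. The helper name combine_regexes is
hypothetical, and the sketch takes the lines as arguments instead of reading
a file.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the lines of a regex file into one combined
# pattern of the form (regex1)|(regex2)|...
sub combine_regexes {
    my @lines = @_;
    my @patterns;
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;       # trim surrounding whitespace
        next if $line eq '';           # skip blank lines
        $line =~ m{\A/(.*)/\z}
            or die "Regular expression <$line> must be enclosed in '/'\n";
        push @patterns, "($1)";
    }
    return join '|', @patterns;
}

my $target = combine_regexes('/^[Ll]ine$/', '', '/^[Ll]ines$/');
print "$target\n";
print "match: $_\n" for grep { /$target/ } qw(line Lines incline);
```

Wrapping each regex in parentheses keeps the alternatives independent, so an
anchor in one regex does not affect the others.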
105
106e.g.
(1) Let test.regex contain a regular expression for the target word, which is -
108 /^[Ll]ines?$/
109
110To use this for finding kocos, run kocos.pl with command
111
112        kocos.pl --regex test.regex --order K test.input
113
(2) To find, say, the 2nd order co-occurrences of any general target word that
occurs in <head> tags in data in Senseval format,
we use a regular expression
 /^<head.*>\w+</head>$/
in our regex file, say test.regex,
and run kocos.pl using the command
120
121        kocos.pl --regex test.regex --order 2 eng-lex-sample.training.xml
122
(3) To find 3rd order co-occurrences of the period '.' itself,
run kocos.pl using

	kocos.pl --literal . --order 3 test.input

To instead match any word that contains a period, write a regex /\./ in a
file, say test.regex, and run kocos using

	kocos.pl --regex test.regex --order 3 test.input
131
132(4) To find 2nd order co-occurrences of all words that are numbers,
133write a regex like /^\d+$/ to a regexfile say test.regex and run kocos
134using,
135
136        kocos.pl --regex test.regex --order 2 test.input
137
138Note: writing a regex /\d+/ will also match words like line20.1.cord, or
139art%10.fine456 that include numbers.
140
Regexes that should match target words exactly should be delimited by
^ and $ as in /^[Ll]ines?$/. Specifying something like /[Ll]ines?/ will
also match words such as 'incline'.
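
The difference between the anchored and unanchored forms can be seen with a
short standalone Perl snippet. The word list is made up for illustration.

```perl
use strict;
use warnings;

# Made-up word list for illustration only.
my @words = qw(line Lines incline online number);

my $loose    = qr/[Ll]ines?/;      # matches anywhere inside a word
my $anchored = qr/^[Ll]ines?$/;    # must match the whole word

my @loose_hits    = grep { /$loose/ }    @words;
my @anchored_hits = grep { /$anchored/ } @words;

print "unanchored: @loose_hits\n";
print "anchored:   @anchored_hits\n";
```

The unanchored pattern also selects 'incline' and 'online', while the
anchored pattern selects only 'line' and 'Lines'.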
144
Note - The program kocos.pl requires that the target word is specified using
either of the options --literal or --regex.
147
148=head3 4.3 --order K
149
150If the value of K is specified using the command line option --order K,
151kocos.pl will find the Kth order co-occurrences of the target word. K can
152take any integer value greater than 0. If the value of K is not specified,
153the program will set K to 1 and will simply find the co-occurrences of the
154target (the word co-occurrence generally means first order co-occurrences).
155
156=head3 4.4 --trace TRACEFILE
157
158To see a detailed report of how each Kth order co-occurrence is reached as a
159sequence of K words, specify the name of a TRACEFILE on the command line
160using --trace TRACEFILE option.
161
162TRACEFILE will show the chains of K+1 words where the first word is the TARGET
163word and every ith word in the chain is a (i-1)th order co-occurrence of target
164which co-occurs with (i-1)th word in the chain. So a chain of K+1 words,
165
166 TARGET->COC1->COC2->COC3....->COCK-1->COCK
167
168shows that COC1 is a first order co-occurrence of the TARGET.
169
170 COC2 is a second order co-occurrence such that COC2 co-occurs with
171 COC1 which in turn co-occurs with the TARGET.
172 COC3 is a third order co-occurrence such that COC3 co-occurs with
173 COC2 which in turn co-occurs with COC1 which co-occurs with TARGET.
174
175and so on......
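
A chain of this form can be rebuilt by following parent links from a Kth
order co-occurrence back to the target. The following is a minimal sketch
with a hypothetical helper chain_for and a hand-written parent table. It
assumes each word records a single parent.

```perl
use strict;
use warnings;

# Hypothetical helper: follow parent links from a word back to the target
# and return the chain in TARGET->...->WORD order.
sub chain_for {
    my ($word, $target, $parent) = @_;
    my @chain = ($word);
    while ($word ne $target) {
        $word = $parent->{$word};
        die "no path back to target\n" unless defined $word;
        unshift @chain, $word;
    }
    return join '->', @chain;
}

# Hand-written parent table: 'text' was reached from 'line',
# and 'the' was reached from 'text'.
my %parent = (text => 'line', the => 'text');
print chain_for('the', 'line', \%parent), "\n";
```

For the table above the sketch prints line->text->the.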
176
=head3 4.5 --help

This option will display the help message.

=head3 4.6 --version

This option will display version information of the program.
184
185=head2 5. Output
186
The program will display a list of Kth order co-occurrences to standard
output such that each co-occurrence occurs on a separate line and is
followed by '<>' (just to be compatible with other programs in NSP).
190
191Note that the output of kocos.pl could be directly used by the program
192nsp2regex of the SenseTools Package (by Satanjeev Banerjee and Ted
193Pedersen) to convert Senseval data instances into feature vectors in ARFF
194format where our Kth order co-occurrences are used as features.
195
196For more information on SenseTools you can refer to its README:
197http://www.d.umn.edu/~tpederse/sensetools.html
198
199                                IMPORTANT NOTE
200
201If there are some kth order co-occurrences which are also the ith order
202co-occurrences (0<i<k) of the target word, program kocos.pl will not
203display them as the Kth order co-occurrences. kocos.pl displays only those
204words as Kth order co-occurrences whose minimum distance from target word
205is K in the co-occurrence graph.
206[Co-occurrence graph shows a network of words where a word is connected to
207all words it co-occurs with.]
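
The minimum distance rule can be illustrated with a breadth-first search
over the co-occurrence graph. This is only a sketch of the idea, not the
code of kocos.pl: the helper name distances_from is hypothetical, and the
bigrams are the ones used as test.input in the examples of this README.

```perl
use strict;
use warnings;

# Hypothetical helper: breadth-first search over the co-occurrence graph.
# Returns a hash mapping each reachable word to its minimum distance
# from the target word.
sub distances_from {
    my ($target, @bigrams) = @_;
    # Build an undirected adjacency list from WORD1<>WORD2<> strings.
    my %neighbors;
    for my $bigram (@bigrams) {
        my ($w1, $w2) = split /<>/, $bigram;
        push @{ $neighbors{$w1} }, $w2;
        push @{ $neighbors{$w2} }, $w1;
    }
    my %dist  = ($target => 0);
    my @queue = ($target);
    while (@queue) {
        my $word = shift @queue;
        for my $next (@{ $neighbors{$word} || [] }) {
            next if exists $dist{$next};   # already reached at a lower order
            $dist{$next} = $dist{$word} + 1;
            push @queue, $next;
        }
    }
    return %dist;
}

my @bigrams = ('print<>in<>', 'print<>line<>', 'text<>the<>', 'text<>line<>',
               'file<>the<>', 'file<>in<>', 'line<>file<>');
my %dist = distances_from('line', @bigrams);
# List the words whose minimum distance from 'line' is exactly 2.
for my $word (sort grep { $dist{$_} == 2 } keys %dist) {
    print "$word<>\n";
}
```

Because a word keeps the first (smallest) distance at which it is reached,
a word that is both a 1st and a 3rd order co-occurrence is reported only at
order 1 in this sketch.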
208
209
210=head2 6. Usage examples
211
212(a)	Using default value of order
213To find the (1st order) co-occurrences of a word 'line' from the BIGRAM file
214test.input, run kocos.pl using the following command.
215 	kocos.pl --literal line test.input
216
217(b)	Using option order
218To find the 2nd order co-occurrences of a word 'line' from the BIGRAM file
219test.input, run kocos.pl using the following command.
220	kocos.pl --literal line --order 2 test.input
221
222(c)	Using the trace option
To see how the 4th order co-occurrences of a word 'line' are reached as a
sequence of words which form a co-occurrence chain, run kocos.pl using the
following command.
226	kocos.pl --literal line --order 4 --trace test.trace test.input
227
228(d) 	Using a Regex to specify the target word
229To find Kth order co-occurrences of a target word 'line' which is specified as
230a Perl regular expression say /^[Ll]ines?$/ in a file test.regex,
231run kocos.pl using
232	kocos.pl --regex test.regex --order K test.input
233
234(e) 	Using a generic Regex for Data like Senseval-2,
235To find 2nd order co-occurrences of a target word that occurs in <head> tags
236in the data file eng-lex-sample.training.xml, use a regular expression like
237/<head>\w+</head>/ from a file say test.regex, and run kocos.pl using
238	kocos.pl --regex test.regex --order 2 test.input
239
240=head2 7. General Recommendations
241
242(a) Create a BIGRAM file using programs count.pl, statistic.pl or combig.pl
243    of NSP Package.
244(b) Use --window W option of program count.pl to specify the scope of the
245    context. Any word that occurs within a distance W from a target word will be
246    treated as its co-occurrence.
247(c) Use either --literal or --regex option to specify the target word. We
248    recommend use of regex support to detect forms of target word other than
249    its base form.
250
251=head2 8. Examples of Kth order co-occurrences
252
253In all the following examples, we assume that the input comes from the file
254test.input and word 'line' is a target word.
255
256 test.input =>
257 ----------------
258 print<>in<>	|
259 print<>line<>	|
260 text<>the<>	|
261 text<>line<>	|
262 file<>the<>	|
263 file<>in<>	|
 line<>file<>	|
265 ----------------
266
(Note that test.input does not look like complete count.pl/statistic.pl output
because kocos.pl minimally requires only two words WORD1 and WORD2 separated by
'<>' with an extra '<>' after WORD2, as described in Section 3.1 of this README.)
270
271(a)	The 1st order co-occurrences of word 'line' can be found by
272	running kocos.pl with either of the following commands -
273
274	kocos.pl --literal line test.input
275		OR
276	kocos.pl --order 1 --literal line test.input
277
278This will display the co-occurrences of 'line' to standard output as shown
279below in the box.
280
281 --------
282 text<>	|
283 file<>	|
284 print<>|
285 --------
286
287This is because the program finds the bigrams
288
289 print<>line<>
290 text<>line<>
291 line<>file<>
292
293where word 'line' co-occurs with the words print, text and file which become
294the 1st order co-occurrences.
295
296(b)     The 2nd order co-occurrences of word 'line' can be found by
297	running kocos.pl with the following command -
298        kocos.pl --literal line --order 2 test.input
299
300This will display the 2nd order co-occurrences of 'line' to standard output
301as shown below in the box.
302
303 --------
304 the<> 	|
305 in<> 	|
306 --------
307
308This is because the program finds the words print, text and file as the
309first order co-occurrences (as explained in case a) and finds bigrams
310
311 print<>in<>
312 text<>the<>
313 file<>the<>
 file<>in<>
315
316where 'the' and 'in' co-occur with the words print, text, file.
317
318(c)     To see how the 2nd order co-occurrences of word 'line' are reached
319	run the program using the following command -
        kocos.pl --literal line --order 2 --trace test.trace test.input
321
322This will display the 2nd order co-occurrences of 'line' to standard output
323as shown below in the box.
324
325 --------
326 the<>   |
327 in<>    |
328 --------
329
330and a detailed report of co-occurrence chains in test.trace file as shown
331in the box below.
332
333 test.trace =>
334
335 ----------------
336 line->text->the|
337 line->file->the|
338 line->file->in	|
339 line->print->in|
340 ----------------
341
342where
343the first line shows that the word 'line' co-occurred with 'text' which
344co-occurred with 'the'. Hence 'the' became a 2nd order co-occurrence.
345Similarly, 'line' co-occurred with 'file' which in turn co-occurred with
346'the' and 'in' which are therefore the 2nd order co-occurrences of 'line'.
347
=head2 9. References
349
350[Niwa&Nitta94] Y. Niwa and Y. Nitta. Co-occurrence vectors from corpora
351vs. distance vectors from dictionaries. COLING-1994.
352
353[Schutze98] H. Schutze. Automatic word sense discrimination. Computational
354Linguistics,24(1):97-123,1998.
355
356=head1 AUTHORS
357
358 Amruta Purandare, pura0010@umn.edu
359 Ted Pedersen, tpederse@umn.edu
360
361 Last updated on 12/05/2003 by TDP
362
363This work has been partially supported by a National Science Foundation
364Faculty Early CAREER Development award (#0092784).
365
366=head1 BUGS
367
368=head1 SEE ALSO
369
370http://www.d.umn.edu/~tpederse/nsp.html
371
372=head1 COPYRIGHT
373
374Copyright (C) 2002-2003, Amruta Purandare and Ted Pedersen
375
376This program is free software; you can redistribute it and/or modify it under
377the terms of the GNU General Public License as published by the Free Software
378Foundation; either version 2 of the License, or (at your option) any later
379version.
380
381This program is distributed in the hope that it will be useful, but WITHOUT
382ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
383FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
384
385You should have received a copy of the GNU General Public License along with
386this program; if not, write to
387
388The Free Software Foundation, Inc.,
38959 Temple Place - Suite 330,
390Boston, MA  02111-1307, USA.
391
392=cut
393
394###############################################################################
395
396#				Changelogs
397
398# Date		Version		By		Changes			Code
399
400# 03/30/2003	0.03		Amruta		Regex support for     ADP.03.1
401#						specifying target
402#						word
403#
404# 07/02/2003	0.05		Amruta		Redesigned algorithm  ADP.05
405#						to improve performance
406#
407###############################################################################
408
409#                               THE CODE STARTS HERE
410
411###############################################################################
412
413#                           ================================
414#                            COMMAND LINE OPTIONS AND USAGE
415#                           ================================
416
417# show minimal usage message if no arguments
418if($#ARGV<0)
419{
420        &showminimal();
421        exit;
422}
423
424# command line options
425use Getopt::Long;
426GetOptions ("help","version","order=i","trace=s","literal=s","regex=s");
427# show help option
428if(defined $opt_help)
429{
430        $opt_help=1;
431        &showhelp();
432        exit;
433}
434
435# show version information
436if(defined $opt_version)
437{
438        $opt_version=1;
439        &showversion();
440        exit;
441}
442# if the order is specified
443if(defined $opt_order)
444{
445	$order=$opt_order;
446}
447# otherwise set default to 1
448else
449{
450	$order=1;
451}
452
453# trace report will show how
454# a Kth order co-occurrence is
455# reached via a chain of
456# lower order co-occurrences
457if(defined $opt_trace)
458{
459	$trace=$opt_trace;
460}
461
462# ----------------
463# ADP.03.1 start
464# ----------------
465# this part has been added during NSP version 0.55 release
466
467# target word is specified via --literal
468if(defined $opt_literal)
469{
470        $target=$opt_literal;
471}
472# target specified as Perl regex/s in a file
473if(defined $opt_regex)
474{
475        $regex_file=$opt_regex;
476        if(!(-e $regex_file))
477        {
478                print STDERR "ERROR($0):
479        Regex file <$regex_file> doesn't exist.\n";
480                exit;
481        }
482        open(REG,$regex_file) || die "ERROR($0):
483        Error(error code=$!) in opening Regex File <$regex_file>.\n";
484        undef $target;
485	while(<REG>)
486        {
487                chomp;
                # trim leading and trailing whitespace
                s/^\s+//;
                s/\s+$//;
490                if(/^\s*$/)
491                {
492                        next;
493                }
494                if(/^\//)
495                {
496                        s/^\///;
497                }
498                else
499                {
500                        print STDERR "ERROR($0):
501        Regular Expression <$_> should start with '/'\n";
502                        exit;
503                }
504                if(/\/$/)
505                {
506                        s/\/$//;
507                }
508                else
509                {
510                        print STDERR "ERROR($0):
511        Regular Expression <$_> should end with '/'\n";
512                        exit;
513                }
514                $target.="(".$_.")|";
515        }
516	if(defined $target)
517	{
518		chop $target;
519	}
520	else
521	{
		print STDERR "ERROR($0):
	No valid Perl regex found in Regex file <$regex_file>.\n";
524		exit;
525	}
526}
527# --------------
528# ADP.03.1 end
529# --------------
530
531#############################################################################
532
533#                       ================================
534#                          INITIALIZATION AND INPUT
535#                       ================================
536
537#$0 contains the program name along with
538#a complete path. Extract just the program
539#name and use in error messages
540$0=~s/.*\/(.+)/$1/;
541
542if(!defined $ARGV[0])
543{
544        print STDERR "ERROR($0):
545        Please specify the SOURCE file name...\n";
546        exit;
547}
548#accept the input file name
549$infile=$ARGV[0];
550#check if exists
551if(!-e $infile)
552{
553        print STDERR "ERROR($0):
554        Source file <$infile> doesn't exist...\n";
555        exit;
556}
557#open if exists
558open(IN,$infile) || die "Error($0):
559        Error(code=$!) in opening <$infile> file.\n";
560
561#check if the target word exists
562if(!defined $target)
563{
564        print STDERR "ERROR($0):
565        Please specify the target word using one of the --literal or --regex
566	options.\n";
567        exit;
568}
569
570#check if order is valid
571if($order<1)
572{
573	print STDERR "ERROR($0):
574	Order should be greater than or equal to 1.\n";
575	exit;
576}
577#check if --trace is used
578if(defined $trace)
579{
580	$ans="n";
581	#check if the TRACE_FILE already exists
582	if(-e $trace)
583	{
584		print STDERR "WARNING($0):
585	Trace file <$trace> already exists, overwrite(y/n)? ";
586		$ans=<STDIN>;
587	}
	# overwrite only if the file is new or the answer starts with y/Y
	if(!-e $trace || $ans=~/^\s*[Yy]/)
589	{
590		#open the TRACE_FILE
591		open(TRACE,">$trace") || die "Error($0):
592        Error(code=$!) in opening Trace file <$trace>.\n";
593	}
594	else
595	{
596		undef $trace;
597	}
598}
599##############################################################################
600
601#			=============================
602#			 Reading and Storing Bigrams
603#			=============================
604
605$line_num=0;
606#creating a coc_store data structure for storing all bigram strings from SOURCE
607#so that co-occurrences can be found looking at this data structure
608while($line=<IN>)
609{
610	$line_num++;
611        chomp $line;
612	# handling blank lines
613        if($line=~/^\s*$/)
614        {
615                next;
616        }
617	#store the bigram strings
618	if($line=~/<>/)
619	{
		#checking if the SOURCE is a valid NSP output for Bigrams
		#count how many times <> occurs
		$cnt = () = $line=~/<>/g;
629		#should be 2 for bigrams
630		if($cnt!=2)
631		{
632			print STDERR "ERROR($0):
633	SOURCE file <$infile> is not a valid Bigram output of NSP at line
634	<$line_num>.\n";
635			exit;
636		}
637		# store bigram in coc_store
638		push @coc_store,$line;
639	}
640}
641
642############################################################################
643
644#		===========================================
645#		 Ranking words according to their distance
646#			     From the target word
647#		===========================================
648
649# start with target word which
650# is at 0th level (rank)
651$rank{$target}=0;
652$word=$target;
653while($rank{$word}<$order)
654{
655	# rank my co-occurrences
656	&rank_cocs($word,$rank{$word});
657	# no more words in queue
658	if($#words < 0)
659	{
660		if($rank{$word}<$order)
661		{
662			print "No co-occurrences at $order th level.\n";
663		}
664		last;
665	}
666	else
667	{
668		# get the first word from queue
669	        $word=shift @words;
670	}
671}
672
673#############################################################################
674
675#			==============================
676#			     Printing Trace Report
677#			==============================
678
679# print trace report
680if(defined $trace)
681{
682	# trace each kth order co-occurrence back
683	foreach $word (@kocs)
684	{
685		# get all parents until the
686		# target word is reached
687		@chain=();
688		push @chain,$word;
689		if(defined $regex_file)
690		{
691			while($word !~ /$target/)
692			{
693				push @chain,$parent{$word};
694				$word=$parent{$word};
695			}
696		}
697		else
698		{
699			while($word ne $target)
700			{
701				push @chain,$parent{$word};
702                                $word=$parent{$word};
703			}
704		}
705		# print reverse chain
706		@reversed=reverse @chain;
707		print TRACE join("->",@reversed);
708		print TRACE "\n";
709	}
710}
711###########################################################################
712
713#                      ==========================
714#                          SUBROUTINE SECTION
715#                      ==========================
716
717#-----------------------------------------------------------------------------
718
719# ranks and queues the co-occurrences of a given word
720sub rank_cocs
721{
722	my $word=$_[0];
723	# co-occurrences of given word
724	# will be at rank(word)+1
725	my $level=$_[1]+1;
726	#string from the coc_store
727	my $coc_string="";
728	# check bigrams and rank words
729	# co-occurring with the given
730	# word
731	foreach $coc_string (@coc_store)
732	{
733		@parts=split(/<>/,$coc_string);
734		$word1=$parts[0];
735		$word2=$parts[1];
736		# current word is the target word
737		# specified via regex option
738		undef $got_coc;
739		if($level==1 && defined $regex_file)
740		{
741			# if exactly one of the words matches
742			# the target, extract the other
743			if($word1=~/$word/ && !defined $rank{$word2} && $word2!~/$target/)
744			{
745				$got_coc=$word2;
746				$parent=$word1;
747			}
748			elsif($word2=~/$word/ && !defined $rank{$word1} && $word1!~/$target/)
749			{
750				$got_coc=$word1;
751				$parent=$word2;
752			}
753		}
754		elsif(defined $regex_file)
755		{
756			# if one of the words matches the
757			# given word and other doesn't match
758			# the target word
759			if($word1 eq $word && !defined $rank{$word2} && $word2!~/$target/)
760                        {
761				$got_coc=$word2;
762				$parent=$word1;
763			}
764                        elsif($word2 eq $word && !defined $rank{$word1} && $word1!~/$target/)
765                        {
766				$got_coc=$word1;
767				$parent=$word2;
768			}
769		}
770		else
771		{
772			# one of the words matches the
773			# given word
774			if($word1 eq $word && !defined $rank{$word2})
775			{
776				$got_coc=$word2;
777				$parent=$word1;
778			}
779			elsif($word2 eq $word && !defined $rank{$word1})
780			{
781				$got_coc=$word1;
782				$parent=$word2;
783			}
784		}
785		if(defined $got_coc)
786                {
787			# rank the obtained coc
788			$rank{$got_coc}=$level;
789			# queue the coc
790			push @words,$got_coc;
791			# print if level is K
792			if($level==$order)
793			{
794				print "$got_coc<>\n";
795			}
796			# store link to parent for tracing
797			if(defined $trace)
798			{
799				$parent{$got_coc}=$parent;
800			        if($level==$order)
801				{
802					push @kocs,$got_coc;
803				}
804			}
805		}
806	}
807}
808
809#-----------------------------------------------------------------------------
810#show minimal usage message
811sub showminimal()
812{
813        print "Usage: kocos.pl [OPTIONS] BIGRAM";
814        print "\nTYPE kocos.pl --help for help\n";
815}
816
817#-----------------------------------------------------------------------------
818#show help
819sub showhelp()
820{
821	print "Usage:  kocos.pl [OPTIONS] BIGRAM
822Displays the kth order Co-occurrences of a given target word.
823Target word should be specified via --literal or --regex option.
824
825BIGRAM
826	A list of bigrams formatted like the output (extended or normal)
827	of NSP programs count.pl or statistic.pl.
828
829OPTIONS:
830--literal LITERAL
831	Specify the target word directly on command line as a literal.
832
833--regex REGEXFILE
834	Specify a file containing Perl regular expression/s that define
835	the target word.
836
837--order K
838	Specify the value of K (K>0) to find the kth order co-occurrences.
839	A Kth order co-occurrence is a word that co-occurs with a (K-1)th
840	order co-occurrence of the target word.
841
842	By default, the value of K is set to 1 which simply lists the
843	words that co-occur with a given target word. When K is 2, all words
844	that co-occur with the words that co-occur with the target word are
845	shown, and so on for higher orders.
846
847--trace TRACEFILE
	Specify the name of a TRACEFILE to see a detailed trace report
	showing the chains of co-occurrences. A chain shows how a Kth
	order co-occurrence is reached as a sequence of K lower order
	co-occurrences.
		e.g. WORD->First->Second->Third..->Kth
	shows that 'First' is a first order co-occurrence of WORD,
	'Second' is a second order co-occurrence of WORD which co-occurs
	with 'First'. 'Third' is a third order co-occurrence of WORD which
	co-occurs with 'Second' and so on until K is reached.

--help
858        To display this message.
859
860--version
861        To display the version information.\n";
862}
863
864#------------------------------------------------------------------------------
865#version information
866sub showversion()
867{
868        print "kocos.pl       -        version 0.05\n";
869        print "Copyright (C) 2002-2003, Amruta Purandare & Ted Pedersen\n";
870        print "Date of Last Update: 07/01/2003\n";
871}
872
873#############################################################################
874
875
876=head1 AUTHORS
877
878 Amruta Purandare, University of Minnesota, Duluth,  pura0010@d.umn.edu
879 Ted Pedersen, University of Minnesota, Duluth,  tpederse@umn.edu
880
908=cut
909
910