PyStemmer
=========

What is PyStemmer?
------------------

PyStemmer is a Python interface to the stemming algorithms from the Snowball
project (https://snowballstem.org/). A stemming algorithm (or stemmer) is a
process for removing the commoner morphological and inflexional endings from
words. Its main use is as part of a term normalisation process that is usually
done when setting up Information Retrieval systems.  A stemmer aims to
conflate words with the same linguistic base form, so that the resulting
"stem" can be taken to represent all words with that base form.

Stemmers can be used to make searches more comprehensive. For example, stemming
can ensure that a search for 'cars' will also find all documents that contain
only 'car'.
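
To make the search example concrete, here is a minimal sketch of an index
keyed on stems. ``toy_stem``, ``build_index`` and ``search`` are hypothetical
names introduced for illustration, and ``toy_stem`` is a deliberately crude
stand-in that only strips a plural "s"; it is not one of the Snowball
algorithms.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: lowercase, strip a plural 's'."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def build_index(documents, stem=toy_stem):
    """Map each stem to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in text.split():
            # Strip surrounding punctuation before stemming the token.
            index[stem(token.strip(".,;:!?'\""))].add(doc_id)
    return index


def search(index, query, stem=toy_stem):
    """Return the sorted ids of documents that match the stemmed query term."""
    return sorted(index.get(stem(query), set()))


documents = {
    1: "Used cars for sale.",
    2: "My car needs a new tyre.",
    3: "Bicycles are quieter.",
}
index = build_index(documents)
print(search(index, "cars"))  # [1, 2]
```

Because both the indexed text and the query pass through the same ``stem``
function, 'cars' and 'car' end up under one key. A real stemmer, such as the
``stemWord`` method of a PyStemmer ``Stemmer`` object, could be passed as
``stem`` instead of the toy function.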

Snowball is a small string processing language designed for creating stemming
algorithms for use in Information Retrieval.  It is also the name of a project
to develop a good base set of stemming algorithms.

PyStemmer uses the "libstemmer_c" C interface to the snowball algorithms,
provided by the snowball project itself.  This library is unmodified, but
contained within the PyStemmer distribution.  If you wish to upgrade PyStemmer
to a more recent version of libstemmer_c (or the snowball algorithms), it
should suffice to download a new copy of libstemmer_c from the snowball
project, and replace the contents of the libstemmer_c subdirectory with the
contents of the download.
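
The replacement step could be scripted along the following lines. This is
only a sketch: ``replace_libstemmer`` is a hypothetical helper, not part of
PyStemmer, and it assumes the download has already been unpacked into a
directory of its own.

```python
import shutil
from pathlib import Path


def replace_libstemmer(download_dir, target_dir):
    """Replace the contents of target_dir with a copy of download_dir."""
    download_dir = Path(download_dir)
    target_dir = Path(target_dir)
    if not download_dir.is_dir():
        raise FileNotFoundError(f"no such directory: {download_dir}")
    if target_dir.exists():
        shutil.rmtree(target_dir)  # remove the bundled copy
    shutil.copytree(download_dir, target_dir)  # copy in the new one


# Example call; both paths are placeholders for your own checkout:
# replace_libstemmer("/tmp/libstemmer_c", "pystemmer/libstemmer_c")
```

Since libstemmer_c is compiled into the extension module, PyStemmer has to be
rebuilt and reinstalled afterwards for the new algorithms to take effect.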

Requirements
------------

The Python header files must be installed in order to build PyStemmer.

This version of PyStemmer has been tested using Python series 2.6, 2.7, 3.3,
3.4, 3.5, 3.6, 3.7 and pypy.  Builds are checked with `travis`_:

.. _travis: https://travis-ci.org/snowballstem/pystemmer

.. image:: https://travis-ci.org/snowballstem/pystemmer.png?branch=master
   :target: https://travis-ci.org/snowballstem/pystemmer

Installation
------------

PyStemmer uses distutils, so all that is necessary to build and install
PyStemmer is the usual distutils invocation::

    python setup.py install

You can also install using ``pip``:

    * from PyPI: ``pip install pystemmer``
    * from a local copy of the code: ``pip install .``
    * from git: ``pip install git+https://github.com/snowballstem/pystemmer``
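
Whichever route is used, one way to confirm that the installation worked is
to check that the ``Stemmer`` module can be found. The helper name
``pystemmer_installed`` below is made up for this sketch.

```python
import importlib.util


def pystemmer_installed():
    """Return True if a module named 'Stemmer' is on the import path."""
    return importlib.util.find_spec("Stemmer") is not None


print("PyStemmer importable:", pystemmer_installed())
```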

API
---

PyStemmer's API is documented by documentation comments in the source.

A brief overview can be found in ``docs/quickstart.txt``.
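
As a rough illustration of basic use, the sketch below lists the available
algorithms and stems two words with the English stemmer. ``stem_words`` is a
hypothetical wrapper written for this example; the ``try``/``except`` guard
is only there so the snippet still runs where PyStemmer is not installed.

```python
try:
    import Stemmer  # the module installed by PyStemmer
except ImportError:
    Stemmer = None


def stem_words(words, language="english"):
    """Stem a list of words with the Snowball algorithm for `language`."""
    if Stemmer is None:
        raise RuntimeError("PyStemmer is not installed")
    stemmer = Stemmer.Stemmer(language)
    return stemmer.stemWords(words)


if Stemmer is not None:
    print(Stemmer.algorithms())         # names of the available algorithms
    print(stem_words(["cars", "car"]))  # both words reduce to the same stem
```

A single word can be stemmed with ``stemWord`` on the same ``Stemmer``
object. Creating one object and reusing it for many words avoids repeating
the setup cost on every call.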

License
-------

PyStemmer is copyright (c) 2006, Richard Boulton, and is licensed under the MIT
license: see the file "LICENSE" for the full text of this.  It was inspired by
an earlier implementation (copyright (c) 2001, Andreas Jung, and also licensed
under the MIT license), which had a different API and no portions of which
remain in this package.

The snowball algorithms, and the snowball library, are copyright (c) 2001-2006,
Dr Martin Porter and Richard Boulton, and are licensed under the BSD license.