| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| .. | 03-May-2022 | - | | |
| normality/ | 21-Jun-2020 | - | 466 | 347 |
| tests/ | 21-Jun-2020 | - | 110 | 82 |
| .bumpversion.cfg | 21-Jun-2020 | 189 | 11 | 8 |
| .gitignore | 21-Jun-2020 | 69 | 7 | 7 |
| .travis.yml | 21-Jun-2020 | 439 | 26 | 19 |
| LICENSE | 21-Jun-2020 | 1.1 KiB | 22 | 18 |
| MANIFEST.in | 21-Jun-2020 | 60 | 3 | 3 |
| Makefile | 21-Jun-2020 | 361 | 21 | 15 |
| README.md | 21-Jun-2020 | 1.2 KiB | 36 | 25 |
| setup.cfg | 21-Jun-2020 | 61 | 6 | 4 |
| setup.py | 03-May-2022 | 1.3 KiB | 49 | 46 |

README.md

# normality

[![Build Status](https://travis-ci.org/pudo/normality.svg?branch=master)](https://travis-ci.org/pudo/normality)

Normality is a Python micro-package that contains a small set of text
normalization functions for easier re-use. These functions accept a
snippet of Unicode or UTF-8 encoded text and remove various classes
of characters, such as diacritics and punctuation. This is useful as
preparation for further text analysis.

**WARNING**: This library works much better when used in combination
with ``pyicu``, a Python binding for the International Components for
Unicode (ICU) C library. ICU provides much better text transliteration than
the default ``text-unidecode``.
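
To see which transliteration backend is likely to be available in a given
environment, one option is to check whether the PyICU bindings can be
imported. The snippet below is a minimal sketch using only the standard
library; the helper name ``icu_available`` is hypothetical and is not part of
``normality``, and the check says nothing about how the package itself selects
its backend.

```python
import importlib.util


def icu_available() -> bool:
    """Return True if the PyICU bindings (imported as ``icu``) can be found."""
    # find_spec() looks the module up without actually importing it.
    return importlib.util.find_spec("icu") is not None


if __name__ == "__main__":
    backend = "PyICU" if icu_available() else "the pure-Python fallback"
    print(f"Transliteration would use {backend}")
```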

## Example

```python
# coding: utf-8
from normality import normalize, slugify, collapse_spaces

# Lower-case the text and strip diacritics and punctuation.
text = normalize('Nie wieder "Grüne Süppchen" kochen!')
assert text == 'nie wieder grune suppchen kochen'

# Build a URL-friendly slug.
slug = slugify('My first blog post!')
assert slug == 'my-first-blog-post'

# Collapse runs of whitespace (spaces, tabs, newlines) into single spaces.
text = 'this \n\n\r\nhas\tlots of \nodd spacing.'
assert collapse_spaces(text) == 'this has lots of odd spacing.'
```
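
For readers curious about what this kind of normalization involves, the
following is a rough standard-library sketch of the general idea: decompose
characters, drop combining marks, replace punctuation and collapse whitespace.
It is not the implementation used by ``normality``; the function names
``strip_diacritics`` and ``rough_normalize`` are hypothetical, and this
approach does not transliterate characters without a decomposition (for
example ``ß`` or non-Latin scripts), which is where ICU helps.

```python
import re
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def rough_normalize(text: str) -> str:
    """Lower-case, strip diacritics, drop punctuation, collapse whitespace."""
    text = strip_diacritics(text.lower())
    # Replace anything that is not a letter, digit or whitespace with a space.
    # (\w also matches "_", so underscores are treated as separators too.)
    text = re.sub(r"[^\w\s]|_", " ", text)
    # Collapse runs of whitespace into single spaces and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


assert rough_normalize('Nie wieder "Grüne Süppchen" kochen!') == \
    'nie wieder grune suppchen kochen'
```

For real workloads the functions shipped with the package are the better
choice, since they handle transliteration cases this sketch ignores.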

## License

``normality`` is open source, licensed under a standard MIT license
(included in this repository as ``LICENSE``).