Metadata-Version: 1.1
Name: parsel
Version: 1.5.1
Summary: Parsel is a library to extract data from HTML and XML using XPath and CSS selectors
Home-page: https://github.com/scrapy/parsel
Author: Scrapy project
Author-email: info@scrapy.org
License: BSD
Description: ===============================
        Parsel
        ===============================

        .. image:: https://img.shields.io/travis/scrapy/parsel/master.svg
           :target: https://travis-ci.org/scrapy/parsel
           :alt: Build Status

        .. image:: https://img.shields.io/pypi/v/parsel.svg
           :target: https://pypi.python.org/pypi/parsel
           :alt: PyPI Version

        .. image:: https://img.shields.io/codecov/c/github/scrapy/parsel/master.svg
           :target: http://codecov.io/github/scrapy/parsel?branch=master
           :alt: Coverage report


        Parsel is a library to extract data from HTML and XML using XPath and CSS selectors

        * Free software: BSD license
        * Documentation: https://parsel.readthedocs.org.

        Features
        --------

        * Extract text using CSS or XPath selectors
        * Regular expression helper methods

        Example::

            >>> from parsel import Selector
            >>> sel = Selector(text=u"""<html>
                    <body>
                        <h1>Hello, Parsel!</h1>
                        <ul>
                            <li><a href="http://example.com">Link 1</a></li>
                            <li><a href="http://scrapy.org">Link 2</a></li>
                        </ul>
                    </body>
                    </html>""")
            >>>
            >>> sel.css('h1::text').get()
            'Hello, Parsel!'
            >>>
            >>> sel.css('h1::text').re(r'\w+')
            ['Hello', 'Parsel']
            >>>
            >>> for e in sel.css('ul > li'):
            ...     print(e.xpath('.//a/@href').get())
            http://example.com
            http://scrapy.org


        History
        -------

        1.5.1 (2018-10-25)
        ~~~~~~~~~~~~~~~~~~

        * ``has-class`` XPath function handles newlines and other separators
          in class names properly;
        * fixed parsing of HTML documents with null bytes;
        * documentation improvements;
        * Python 3.7 tests are run on CI; other test improvements.

        1.5.0 (2018-07-04)
        ~~~~~~~~~~~~~~~~~~

        * New ``Selector.attrib`` and ``SelectorList.attrib`` properties which make
          it easier to get attributes of HTML elements.
        * CSS selectors became faster: compilation results are cached
          (LRU cache is used for ``css2xpath``), so there is
          less overhead when the same CSS expression is used several times.
        * ``.get()`` and ``.getall()`` selector methods are documented and recommended
          over ``.extract_first()`` and ``.extract()``.
        * Various documentation tweaks and improvements.

        One more change is that the ``.extract()`` and ``.extract_first()`` methods
        are now implemented using ``.get()`` and ``.getall()``, not the other
        way around, and all other methods now call ``Selector.get`` internally
        instead of ``Selector.extract``. This can be **backwards incompatible**
        for custom Selector subclasses which override ``Selector.extract``
        without doing the same for ``Selector.get``. If you have such a Selector
        subclass, make sure the ``get`` method is also overridden. For example, this::

            class MySelector(parsel.Selector):
                def extract(self):
                    return super().extract() + " foo"

        should be changed to this::

            class MySelector(parsel.Selector):
                def get(self):
                    return super().get() + " foo"
                extract = get
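
        The ``attrib`` properties and the ``get``/``getall`` methods from the 1.5.0
        list above can be sketched as follows. This is a minimal illustration, not
        an excerpt from the Parsel docs; the HTML snippet and the variable names are
        assumptions made up for the example.

        ```python
        from parsel import Selector

        # Hypothetical document, used only for this illustration.
        sel = Selector(text=u'<ul><li><a href="http://example.com">Link 1</a></li>'
                            u'<li><a href="http://scrapy.org">Link 2</a></li></ul>')

        links = sel.css('a')
        # SelectorList.attrib exposes the attributes of the first matched element.
        first_href = links.attrib['href']
        # Selector.attrib exposes the attributes of one specific element.
        all_hrefs = [a.attrib['href'] for a in links]
        # .get() returns the first match (or None); .getall() returns every match.
        first_text = sel.css('a::text').get()
        all_texts = sel.css('a::text').getall()
        ```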


        1.4.0 (2018-02-08)
        ~~~~~~~~~~~~~~~~~~

        * ``Selector`` and ``SelectorList`` can't be pickled because
          pickling/unpickling doesn't work for ``lxml.html.HtmlElement``;
          parsel now raises TypeError explicitly instead of allowing pickle to
          silently produce wrong output. This is technically backwards-incompatible
          if you're using Python < 3.6.
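
        The pickling behaviour described above can be checked with a short sketch.
        The sample markup and the ``pickling_failed`` flag are assumptions made for
        illustration only.

        ```python
        import pickle

        from parsel import Selector

        sel = Selector(text=u'<p>hello</p>')
        try:
            pickle.dumps(sel)
            pickling_failed = False
        except TypeError:
            # parsel refuses to pickle selectors rather than emit broken output.
            pickling_failed = True
        ```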


        1.3.1 (2017-12-28)
        ~~~~~~~~~~~~~~~~~~

        * Fix artifact uploads to pypi.


        1.3.0 (2017-12-28)
        ~~~~~~~~~~~~~~~~~~

        * ``has-class`` XPath extension function;
        * ``parsel.xpathfuncs.set_xpathfunc`` is a simplified way to register
          XPath extensions;
        * ``Selector.remove_namespaces`` now removes namespace declarations;
        * Python 3.3 support is dropped;
        * ``make htmlview`` command for easier Parsel docs development;
        * CI: PyPy installation is fixed; parsel now runs tests for PyPy3 as well.
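
        As a rough illustration of the ``has-class`` extension function listed above,
        the sketch below selects paragraphs by class name. The markup and variable
        names are assumptions chosen for the example.

        ```python
        from parsel import Selector

        # Hypothetical markup with overlapping class names.
        sel = Selector(text=u'<p class="foo bar-baz">First</p>'
                            u'<p class="foo">Second</p>'
                            u'<p class="bar">Third</p>')

        # Matches every <p> whose class attribute contains the "foo" class.
        foo = sel.xpath('//p[has-class("foo")]/text()').getall()
        # With several arguments, all listed classes must be present.
        foo_and_bar_baz = sel.xpath('//p[has-class("foo", "bar-baz")]/text()').getall()
        ```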


        1.2.0 (2017-05-17)
        ~~~~~~~~~~~~~~~~~~

        * Add ``SelectorList.get`` and ``SelectorList.getall``
          methods as aliases for ``SelectorList.extract_first``
          and ``SelectorList.extract``, respectively
        * Add default value parameter to ``SelectorList.re_first`` method
        * Add ``Selector.re_first`` method
        * Add ``replace_entities`` argument on ``.re()`` and ``.re_first()``
          to turn off replacing of character entity references
        * Bug fix: detect ``None`` result from lxml parsing and fall back to an empty document
        * Rearrange XML/HTML examples in the selectors usage docs
        * Travis CI:

          * Test against Python 3.6
          * Test against PyPy using "Portable PyPy for Linux" distribution
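
        The ``re_first`` additions listed above might be used roughly like this. It
        is a minimal sketch; the sample text, the patterns, and the ``'n/a'``
        default are assumptions picked for the example.

        ```python
        from parsel import Selector

        sel = Selector(text=u'<p>Price: 42 USD</p><p>No digits here</p>')

        # SelectorList.re_first returns the first regex match across all selectors.
        first_number = sel.css('p::text').re_first(r'\d+')
        # When nothing matches, the default value is returned instead of None.
        missing = sel.css('p::text').re_first(r'[A-Z]{5}', default='n/a')
        # Selector.re_first works on a single selector as well.
        from_first_p = sel.css('p')[0].re_first(r'\d+')
        ```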


        1.1.0 (2016-11-22)
        ~~~~~~~~~~~~~~~~~~

        * Change default HTML parser to `lxml.html.HTMLParser <http://lxml.de/api/lxml.html.HTMLParser-class.html>`_,
          which makes it easier to use some HTML-specific features
        * Add ``css2xpath`` function to translate CSS to XPath
        * Add support for ad-hoc namespace declarations
        * Add support for XPath variables
        * Documentation improvements and updates
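
        To make the ``css2xpath`` and XPath variable entries above more concrete,
        here is a small sketch. The markup, the ``url`` variable name, and the
        result names are assumptions made for illustration.

        ```python
        from parsel import Selector, css2xpath

        # Translate a CSS expression into the equivalent XPath expression.
        xpath_query = css2xpath('ul > li')

        sel = Selector(text=u'<ul><li><a href="http://example.com">Link 1</a></li>'
                            u'<li><a href="http://scrapy.org">Link 2</a></li></ul>')
        via_css = sel.css('ul > li').getall()
        via_xpath = sel.xpath(xpath_query).getall()

        # XPath variables are written as $name and passed as keyword arguments.
        link_text = sel.xpath('//a[@href=$url]/text()', url='http://scrapy.org').get()
        ```

        Passing values as variables avoids building XPath strings by hand, which
        helps when the value contains quote characters.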


        1.0.3 (2016-07-29)
        ~~~~~~~~~~~~~~~~~~

        * Add BSD-3-Clause license file
        * Re-enable PyPy tests
        * Integrate py.test runs with setuptools (needed for Debian packaging)
        * Changelog is now called ``NEWS``


        1.0.2 (2016-04-26)
        ~~~~~~~~~~~~~~~~~~

        * Fix bug in exception handling causing original traceback to be lost
        * Added docstrings and other doc fixes


        1.0.1 (2015-08-24)
        ~~~~~~~~~~~~~~~~~~

        * Updated PyPI classifiers
        * Added docstrings for csstranslator module and other doc fixes


        1.0.0 (2015-08-22)
        ~~~~~~~~~~~~~~~~~~

        * Documentation fixes


        0.9.6 (2015-08-14)
        ~~~~~~~~~~~~~~~~~~

        * Updated documentation
        * Extended test coverage


        0.9.5 (2015-08-11)
        ~~~~~~~~~~~~~~~~~~

        * Support for extending SelectorList


        0.9.4 (2015-08-10)
        ~~~~~~~~~~~~~~~~~~

        * Try workaround for travis-ci/dpl#253


        0.9.3 (2015-08-07)
        ~~~~~~~~~~~~~~~~~~

        * Add base_url argument


        0.9.2 (2015-08-07)
        ~~~~~~~~~~~~~~~~~~

        * Rename module unified -> selector and promote root attribute
        * Add create_root_node function


        0.9.1 (2015-08-04)
        ~~~~~~~~~~~~~~~~~~

        * Setup Sphinx build and docs structure
        * Build universal wheels
        * Rename some leftovers from package extraction


        0.9.0 (2015-07-30)
        ~~~~~~~~~~~~~~~~~~

        * First release on PyPI.

Keywords: parsel
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Natural Language :: English
Classifier: Topic :: Text Processing :: Markup
Classifier: Topic :: Text Processing :: Markup :: HTML
Classifier: Topic :: Text Processing :: Markup :: XML
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.4
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy