# Segregation Analysis, Inference, and Decomposition with PySAL

[![codecov](https://codecov.io/gh/pysal/segregation/branch/master/graph/badge.svg?token=1ujvZCI9Ce)](https://codecov.io/gh/pysal/segregation)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/segregation)
![PyPI](https://img.shields.io/pypi/v/segregation)
![Conda (channel only)](https://img.shields.io/conda/vn/conda-forge/segregation)
![GitHub commits since latest release (branch)](https://img.shields.io/github/commits-since/pysal/segregation/latest)
[![DOI](https://zenodo.org/badge/162503796.svg)](https://zenodo.org/badge/latestdoi/162503796)
[![Documentation](https://img.shields.io/static/v1.svg?label=docs&message=current&color=9cf)](http://pysal.org/segregation/)

![](doc/_static/images/heatmaps.png)

The PySAL **segregation** package is a tool for analyzing patterns of urban segregation.
With only a few lines of code, **segregation** users can:

Calculate over 40 segregation measures, from simple to state-of-the-art, including:

- [aspatial segregation indices](https://github.com/pysal/segregation/blob/master/notebooks/aspatial_examples.ipynb)
- spatial segregation indices
  - [using spatial weights matrices, Euclidean distances, or topological relationships](https://github.com/pysal/segregation/blob/master/notebooks/spatial_examples.ipynb)
  - [using street network distances](https://github.com/pysal/segregation/blob/master/notebooks/network_measures.ipynb)
  - [using multiscalar definitions](https://github.com/pysal/segregation/blob/master/notebooks/multiscalar_segregation_profiles.ipynb)
- [local segregation indices](https://github.com/pysal/segregation/blob/master/notebooks/local_measures_example.ipynb)

Test whether segregation estimates are statistically significant:

- [single value inference](https://github.com/pysal/segregation/blob/master/notebooks/inference_wrappers_example.ipynb)
- [comparative inference](https://github.com/pysal/segregation/blob/master/notebooks/inference_wrappers_example.ipynb)

[Decompose](https://github.com/pysal/segregation/blob/master/notebooks/decomposition_wrapper_example.ipynb)
segregation comparisons into

- differences arising from spatial structure
- differences arising from demographic structure

## Installation

Released versions of segregation are available through pip and conda.

pip:

```bash
pip install segregation
```

[anaconda](https://www.anaconda.com/download/):

```bash
conda install -c conda-forge segregation
```

You can also install the current development version from this repository.
First download [anaconda](https://www.anaconda.com/download/), then
`cd` into the repository directory and run the following commands:

```bash
conda env create -f environment.yml
conda activate segregation
python setup.py develop
```

## Getting started

For a complete guide to the `segregation` API, see the online
[documentation](https://pysal.org/segregation/).

For code walkthroughs and sample analyses, see the
[example notebooks](https://github.com/pysal/segregation/tree/master/notebooks).

## Calculating Segregation Measures

Each index in the **segregation** module is implemented as a class, which is built from a `pandas.DataFrame`
or a `geopandas.GeoDataFrame`. To estimate a segregation statistic, a user calls the class for the index
they wish to estimate and passes three arguments:

- the DataFrame containing population data
- the name of the column with population counts for the group of interest
- the name of the column with the total population for each enumeration unit

Every class in **segregation** has `statistic` and `core_data` attributes.
The first gives direct access to the point estimate of the specific segregation measure,
and the second gives access to the main data that the module uses internally to
compute the estimate.

### Single group measures

Suppose, for example, that a user is studying income segregation and wants to know whether
high-income residents tend to be more segregated from others.
This user would want to fit a dissimilarity index (D) for that specific group, using a DataFrame called `df`
with columns like `"hi_income"`, `"med_income"` and `"low_income"` that store counts of people in each income
bracket, and a total column called `"total_population"`. A typical call would be something like this:

```python
from segregation.aspatial import Dissim
# group of interest: "hi_income"; unit totals: "total_population"
d_index = Dissim(df, "hi_income", "total_population")
```

To see the estimated D from the example above, the user just has to run
`d_index.statistic`.

105identical, save for the fact that the `DataFrame` now needs to be a `GeoDataFrame` with an appropriate `geometry` column
106
107```python
108from segregation.spatial import SpatialDissim
109spatial_index = SpatialDissim(gdf, "hi_income", "total_population")
110```
111
112Some spatial indices can also accept either a [PySAL](http://pysal.org) `W` object, or a [pandana](https://github.com/UDST/pandana) `Network` object,
113which allows the user full control over how to parameterize spatial effects.
114The network functions can be particularly useful for teasing out differences in
115segregation measures caused by two cities that have two very different spatial structures,
116like for example Detroit MI (left) and Monroe LA (right):
117
118![](doc/_static/images/networks.png)
119
120For point estimation, all single-group indices available are summarized in the following
121table:
122
123| **Measure**                                       | **Class/Function**              | **Spatial?** |    **Specific Arguments**      |
124|:--------------------------------------------------|:--------------------------------|:------------:|:-----------------------------: |
125| Dissimilarity (D)                                 | Dissim                          |      No      |           -                    |
126| Gini (G)                                          | GiniSeg                         |      No      |           -                    |
127| Entropy (H)                                       | Entropy                         |      No      |           -                    |
128| Isolation (xPx)                                   | Isolation                       |      No      |           -                    |
129| Exposure (xPy)                                    | Exposure                        |      No      |           -                    |
130| Atkinson (A)                                      | Atkinson                        |      No      |           b                    |
131| Correlation Ratio (V)                             | CorrelationR                    |      No      |           -                    |
132| Concentration Profile (R)                         | ConProf                         |      No      |           m                    |
133| Modified Dissimilarity (Dct)                      | ModifiedDissim                  |      No      |       iterations               |
134| Modified Gini (Gct)                               | ModifiedGiniSeg                 |      No      |       iterations               |
135| Bias-Corrected Dissimilarity (Dbc)                | BiasCorrectedDissim             |      No      |           B                    |
136| Density-Corrected Dissimilarity (Ddc)             | DensityCorrectedDissim          |      No      |          xtol                  |
137| Minimun-Maximum Index (MM)                        | MinMax                          |      No      |                                |
138| Spatial Proximity Profile (SPP)                   | SpatialProxProf                 |     Yes      |           m                    |
139| Spatial Dissimilarity (SD)                        | SpatialDissim                   |     Yes      |     w, standardize             |
140| Boundary Spatial Dissimilarity (BSD)              | BoundarySpatialDissim           |     Yes      |      standardize               |
141| Perimeter Area Ratio Spatial Dissimilarity (PARD) | PerimeterAreaRatioSpatialDissim |     Yes      |      standardize               |
142| Distance Decay Isolation (DDxPx)                  | DistanceDecayIsolation          |     Yes      |      alpha, beta, metric       |
143| Distance Decay Exposure (DDxPy)                   | DistanceDecayExposure           |     Yes      |      alpha, beta, metric       |
144| Spatial Proximity (SP)                            | SpatialProximity                |     Yes      |      alpha, beta, metric       |
145| Absolute Clustering (ACL)                         | AbsoluteClustering              |     Yes      |      alpha, beta, metric       |
146| Relative Clustering (RCL)                         | RelativeClustering              |     Yes      |      alpha, beta, metric       |
147| Delta (DEL)                                       | Delta                           |     Yes      |           -                    |
148| Absolute Concentration (ACO)                      | AbsoluteConcentration           |     Yes      |           -                    |
149| Relative Concentration (RCO)                      | RelativeConcentration           |     Yes      |           -                    |
150| Absolute Centralization (ACE)                     | AbsoluteCentralization          |     Yes      |           -                    |
151| Relative Centralization (RCE)                     | RelativeCentralization          |     Yes      |           -                    |
152| Relative Centralization (RCE)                     | RelativeCentralization          |     Yes      |           -                    |
153| Spatial Minimun-Maximum (SMM)                     | SpatialMinMax                   |     Yes      | network, w, decay, distance, precompute |
154
### Multigroup measures

**segregation** also facilitates the estimation of multigroup segregation measures.

In this case, the call is nearly identical to the single-group case, only now we pass a list of
column names rather than a single string.
Reprising the income segregation example above, a call might look like this:

```python
from segregation.aspatial import MultiDissim
index = MultiDissim(df, ['hi_income', 'med_income', 'low_income'])
```

```python
index.statistic
```

Available multi-group indices are summarized in the table below:

| **Measure**                                 | **Class/Function**               | **Spatial?** | **Specific Arguments** |
|:--------------------------------------------|:---------------------------------|:------------:|:----------------------:|
| Multigroup Dissimilarity                    | MultiDissim                      |      No      |           -            |
| Multigroup Gini                             | MultiGiniSeg                     |      No      |           -            |
| Multigroup Normalized Exposure              | MultiNormalizedExposure          |      No      |           -            |
| Multigroup Information Theory               | MultiInformationTheory           |      No      |           -            |
| Multigroup Relative Diversity               | MultiRelativeDiversity           |      No      |           -            |
| Multigroup Squared Coefficient of Variation | MultiSquaredCoefficientVariation |      No      |           -            |
| Multigroup Diversity                        | MultiDiversity                   |      No      |       normalized       |
| Simpson’s Concentration                     | SimpsonsConcentration            |      No      |           -            |
| Simpson’s Interaction                       | SimpsonsInteraction              |      No      |           -            |
| Multigroup Divergence                       | MultiDivergence                  |      No      |           -            |

188
189Also, it is possible to calculate local measures of segregation.
190A `statistics` attribute will contain the values of these indexes. **Note:
191in this case the attribute is in the plural since, many statistics are fitted, one for
192each enumeration unit** Local segregation indices have the same signature as their global
193cousins and are summarized in the table below:
194
195| **Measure**                   | **Class/Function**             | **Spatial?** | **Specific Arguments** |
196|:------------------------------|:-------------------------------|:------------:|:----------------------:|
197| Location Quotient             | MultiLocationQuotient          |      No      |           -            |
198| Local Diversity               | MultiLocalDiversity            |      No      |           -            |
199| Local Entropy                 | MultiLocalEntropy              |      No      |           -            |
200| Local Simpson’s Concentration | MultiLocalSimpsonConcentration |      No      |           -            |
201| Local Simpson’s Interaction   | MultiLocalSimpsonInteraction   |      No      |           -            |
202| Local Centralization          | LocalRelativeCentralization    |     Yes      |           -            |
203
204## Testing for Statistical Significance
205
206Once the segregation indexes are fitted, the user can perform inference to shed light for
207statistical significance in regional analysis.
208The summary of the inference framework is presented in the table below:
209
210| **Inference Type** | **Class/Function** |                    **Function main Inputs**                    |       **Function Outputs**       |
211|:-------------------|:-------------------|:--------------------------------------------------------------:|:--------------------------------:|
212| Single Value       | SingleValueTest    |  seg_class, iterations_under_null, null_approach, two_tailed   |   p_value, est_sim, statistic    |
213| Two Values         | TwoValueTest       | seg_class_1, seg_class_2, iterations_under_null, null_approach | p_value, est_sim, est_point_diff |
214
215### [Single Value Inference](https://github.com/pysal/segregation/blob/master/notebooks/inference_wrappers_example.ipynb)
216
217![](doc/_static/images/singleval_inference.png)
218
219### [Two-Value Inference](https://github.com/pysal/segregation/blob/master/notebooks/inference_wrappers_example.ipynb)
220
221![](doc/_static/images/twoval_inference.png)
222
### [Decomposition](https://github.com/pysal/segregation/blob/master/notebooks/decomposition_wrapper_example.ipynb)

Another useful analysis that can be performed with the **segregation** module is a
decompositional approach, in which the difference between two indexes is broken down into its spatial
component (`c_s`) and attribute component (`c_a`). This framework is summarized in the table
below:

| **Framework** | **Class/Function**   |        **Main Function Inputs**         | **Function Outputs** |
|:--------------|:---------------------|:---------------------------------------:|:--------------------:|
| Decomposition | DecomposeSegregation | index1, index2, counterfactual_approach |       c_a, c_s       |

![](doc/_static/images/decomp_example.png)

In this case, the difference in measured D statistics between Detroit and Monroe is
attributable primarily to their demographic makeup, rather than the spatial structure of
the two cities.
(Note that this is to be expected, since *D* is not a spatial index.)

242
243PySAL-segregation is under active development and contributors are welcome.
244
245If you have any suggestion, feature request, or bug report, please open a new
246[issue](https://github.com/pysal/segregation/issues) on GitHub.
247To submit patches, please follow the PySAL development
248[guidelines](https://github.com/pysal/pysal/wiki) and open a
249[pull request](https://github.com/pysal/segregation). Once your changes get merged, you’ll
250automatically be added to the
251[Contributors List](https://github.com/pysal/segregation/graphs/contributors).
252
253## Support
254
255If you are having issues, please talk to us in the
256[gitter room](https://gitter.im/pysal/pysal).
257
258## License
259
260The project is licensed under the
261[BSD license](https://github.com/pysal/pysal/blob/master/LICENSE.txt).
262
263## Funding
264
265<img src="figs/nsf_logo.jpg" width="50"> Award #1831615
266[RIDIR: Scalable Geospatial Analytics for Social Science Research](https://www.nsf.gov/awardsearch/showAward?AWD_ID=1831615)
267
268<img src="figs/capes_logo.jpg" width="50"> Renan Xavier Cortes is grateful for the support of Coordenação de Aperfeiçoamento de
269Pessoal de Nível Superior - Brazil (CAPES) - Process number 88881.170553/2018-01
270
## Citation
To cite `segregation`, we recommend the following

```latex
@software{renan_xavier_cortes_2020,
  author       = {Renan Xavier Cortes and
                  eli knaap and
                  Sergio Rey and
                  Wei Kang and
                  Philip Stephens and
                  James Gaboardi and
                  Levi John Wolf and
                  Antti Härkönen and
                  Dani Arribas-Bel},
  title        = {PySAL/segregation: Segregation Analysis, Inference, & Decomposition},
  month        = feb,
  year         = 2020,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.3265359},
  url          = {https://doi.org/10.5281/zenodo.3265359}
}
```