Metadata-Version: 2.1
Name: pyshp
Version: 2.1.3
Summary: Pure Python read/write support for ESRI Shapefile format
Home-page: https://github.com/GeospatialPython/pyshp
Author: Joel Lawhead
Author-email: jlawhead@geospatialpython.com
License: MIT
Download-URL: https://github.com/GeospatialPython/pyshp/archive/2.1.1.tar.gz
Description: # PyShp

The Python Shapefile Library (PyShp) reads and writes ESRI Shapefiles in pure Python.

![pyshp logo](http://4.bp.blogspot.com/_SBi37QEsCvg/TPQuOhlHQxI/AAAAAAAAAE0/QjFlWfMx0tQ/S350/GSP_Logo.png "PyShp")

[![Build Status](https://travis-ci.org/GeospatialPython/pyshp.svg?branch=master)](https://travis-ci.org/GeospatialPython/pyshp)

## Contents

[Overview](#overview)

[Version Changes](#version-changes)

[Examples](#examples)
- [Reading Shapefiles](#reading-shapefiles)
  - [The Reader Class](#the-reader-class)
  - [Reading Geometry](#reading-geometry)
  - [Reading Records](#reading-records)
  - [Reading Geometry and Records Simultaneously](#reading-geometry-and-records-simultaneously)
- [Writing Shapefiles](#writing-shapefiles)
  - [The Writer Class](#the-writer-class)
  - [Adding Records](#adding-records)
  - [Adding Geometry](#adding-geometry)
  - [Geometry and Record Balancing](#geometry-and-record-balancing)

[How To's](#how-tos)
- [3D and Other Geometry Types](#3d-and-other-geometry-types)
- [Working with Large Shapefiles](#working-with-large-shapefiles)
- [Unicode and Shapefile Encodings](#unicode-and-shapefile-encodings)

[Testing](#testing)


# Overview

The Python Shapefile Library (PyShp) provides read and write support for the
Esri Shapefile format. The Shapefile format is a popular Geographic
Information System vector data format created by Esri.
For more information
about this format please read the well-written "ESRI Shapefile Technical
Description - July 1998" located at [http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf](http://www.esri.com/library/whitepapers/pdfs/shapefile.pdf).
The Esri document describes the shp and shx file formats. However, a third
file format called dbf is also required. This format is documented on the web
as the "XBase File Format Description" and is a simple file-based database
format created in the 1960's. For more on this specification see: [http://www.clicketyclick.dk/databases/xbase/format/index.html](http://www.clicketyclick.dk/databases/xbase/format/index.html)

Both the Esri and XBase file formats are very simple in design and memory
efficient, which is part of the reason the shapefile format remains popular
despite the numerous ways to store and exchange GIS data available today.

PyShp is compatible with Python 2.7-3.x.

This document provides examples for using PyShp to read and write shapefiles. However,
many more examples are continually added to the blog [http://GeospatialPython.com](http://GeospatialPython.com),
and more can be found by searching for PyShp on [https://gis.stackexchange.com](https://gis.stackexchange.com).

Currently the sample census blockgroup shapefile referenced in the examples is available on the GitHub project site at
[https://github.com/GeospatialPython/pyshp](https://github.com/GeospatialPython/pyshp). These
examples are straightforward and you can also easily run them against your
own shapefiles with minimal modification.

Important: If you are new to GIS you should read about map projections.
Please visit: [https://github.com/GeospatialPython/pyshp/wiki/Map-Projections](https://github.com/GeospatialPython/pyshp/wiki/Map-Projections)

I sincerely hope this library eliminates the mundane distraction of simply
reading and writing data, and allows you to focus on the challenging and FUN
part of your geospatial project.


# Version Changes

## 2.1.3

### Bug fixes:

- Fix recent bug in geojson hole-in-polygon checking (see #205)
- Misc fixes to allow geo interface dump to json (eg dates as strings)
- Handle additional dbf date null values, and return faulty dates as unicode (see #187)
- Add writer target typecheck
- Fix bugs to allow reading shp/shx/dbf separately
- Allow delayed shapefile loading by passing no args
- Fix error with writing empty z/m shapefile (@mcuprjak)
- Fix signed_area() so ignores z/m coords
- Enforce writing the 11th field name character as null-terminator (only first 10 are used)
- Minor README fixes
- Added more tests

## 2.1.2

### Bug fixes:

- Fix issue where warnings.simplefilter('always') changes global warning behavior [see #203]

## 2.1.1

### Improvements:

- Handle shapes with no coords and represent as geojson with no coords (GeoJSON null-equivalent)
- Expand testing to Python 3.6, 3.7, 3.8 and PyPy; drop 3.3 and 3.4 [@mwtoews]
- Added pytest testing [@jmoujaes]

### Bug fixes:

- Fix incorrect geo interface handling of multipolygons with complex exterior-hole relations [see #202]
- Enforce shapefile requirement of at least one field, to avoid writing invalid shapefiles [@Jonty]
- Fix Reader geo interface including DeletionFlag field in feature properties [@nnseva]
- Fix polygons not being auto closed, which was accidentally dropped
- Fix error for null geometries in feature geojson
- Misc docstring cleanup [@fiveham]

## 2.1.0

### New Features:

- Added back read/write support for unicode field names.
- Improved Record representation
- More support for geojson on Reader, ShapeRecord, ShapeRecords, and shapes()

### Bug fixes:

- Fixed error when reading optional m-values
- Fixed Record attribute autocomplete in Python 3
- Misc readme cleanup

## 2.0.0

The newest version of PyShp, version 2.0, introduced some major new improvements.
A great thanks to all who have contributed code and raised issues, and for everyone's
patience and understanding during the transition period.
Some of the new changes are incompatible with previous versions.
Users of the previous version 1.x should therefore take note of the following changes
(Note: Some contributor attributions may be missing):

### Major Changes:

- Full support for unicode text, with custom encoding, and exception handling.
  - Means that the Reader returns unicode, and the Writer accepts unicode.
- PyShp has been simplified to a pure input-output library using the Reader and Writer classes, dropping the Editor class.
- Switched to a new streaming approach when writing files, keeping memory-usage at a minimum:
  - Specify filepath/destination and text encoding when creating the Writer.
  - The file is written incrementally with each call to shape/record.
  - Adding shapes is now done using dedicated methods for each shapetype.
- Reading shapefiles is now more convenient:
  - Shapefiles can be opened using the context manager, and files are properly closed.
  - Shapefiles can be iterated, have a length, and support the geo interface.
  - New ways of inspecting shapefile metadata by printing. [@megies]
  - More convenient accessing of Record values as attributes. [@philippkraft]
  - More convenient shape type name checking. [@megies]
- Add more support and documentation for MultiPatch 3D shapes.
- The Reader "elevation" and "measure" attributes now renamed "zbox" and "mbox", to make it clear they refer to the min/max values.
- Better documentation of previously unclear aspects, such as field types.

### Important Fixes:

- More reliable/robust:
  - Fixed shapefile bbox error for empty or point type shapefiles. [@mcuprjak]
  - Reading and writing Z and M type shapes is now more robust, fixing many errors, and has been added to the documentation. [@ShinNoNoir]
  - Improved parsing of field value types, fixed errors and made more flexible.
  - Fixed bug when writing shapefiles with datefield and date values earlier than 1900 [@megies]
  - Fix some geo interface errors, including checking polygon directions.
  - Bug fixes for reading from case sensitive file names, individual files separately, and from file-like objects. [@gastoneb, @kb003308, @erickskb]
  - Enforce maximum field limit. [@mwtoews]


# Examples

Before doing anything you must import the library.


    >>> import shapefile

The examples below will use a shapefile created from the U.S. Census Bureau
Blockgroups data set near San Francisco, CA and available in the git
repository of the PyShp GitHub site.

## Reading Shapefiles

### The Reader Class

To read a shapefile create a new "Reader" object and pass it the name of an
existing shapefile. The shapefile format is actually a collection of three
files. You specify the base filename of the shapefile or the complete filename
of any of the shapefile component files.


    >>> sf = shapefile.Reader("shapefiles/blockgroups")

OR


    >>> sf = shapefile.Reader("shapefiles/blockgroups.shp")

OR


    >>> sf = shapefile.Reader("shapefiles/blockgroups.dbf")

OR any of the other 5+ formats which are potentially part of a shapefile. The
library does not care about file extensions.
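
Because the extension is ignored, all three spellings above point at the same set of component files. The sketch below only illustrates that idea with the standard library: `component_paths` is a hypothetical helper and is not how PyShp itself resolves file names.

```python
import os


def component_paths(path, extensions=("shp", "shx", "dbf")):
    """Map each component extension to a path sharing the same base name.

    Hypothetical helper for illustration; PyShp's own lookup may differ.
    """
    # Drop a trailing ".shp", ".dbf", etc. if one was given.
    base, _ext = os.path.splitext(path)
    return {ext: "%s.%s" % (base, ext) for ext in extensions}


# The base name and any component file name resolve to the same paths.
print(component_paths("shapefiles/blockgroups.dbf")["shp"])
```

Under this assumption, passing `"shapefiles/blockgroups"`, `"shapefiles/blockgroups.shp"`, or `"shapefiles/blockgroups.dbf"` yields the same dictionary.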

#### Reading Shapefiles Using the Context Manager

The "Reader" class can be used as a context manager, to ensure open file
objects are properly closed when done reading the data:

    >>> with shapefile.Reader("shapefiles/blockgroups.shp") as shp:
    ...     print(shp)
    shapefile Reader
        663 shapes (type 'POLYGON')
        663 records (44 fields)

#### Reading Shapefiles from File-Like Objects

You can also load shapefiles from any Python file-like object using keyword
arguments to specify any of the three files. This feature is very powerful and
allows you to load shapefiles from a url, a zip file, a serialized object,
or in some cases a database.


    >>> myshp = open("shapefiles/blockgroups.shp", "rb")
    >>> mydbf = open("shapefiles/blockgroups.dbf", "rb")
    >>> r = shapefile.Reader(shp=myshp, dbf=mydbf)

Notice in the examples above the shx file is never used. The shx file is a
very simple fixed-record index for the variable-length records in the shp
file. This file is optional for reading. If it's available PyShp will use the
shx file to access shape records a little faster but will do just fine without
it.

#### Reading Shapefile Meta-Data

Shapefiles have a number of attributes for inspecting the file contents.
A shapefile is a container for a specific type of geometry, and this can be checked using the
shapeType attribute.


    >>> sf.shapeType
    5

Shape types are represented by numbers between 0 and 31 as defined by the
shapefile specification and listed below.
It is important to note that the numbering system has
several reserved numbers that have not been used yet, therefore the numbers of
the existing shape types are not sequential:

- NULL = 0
- POINT = 1
- POLYLINE = 3
- POLYGON = 5
- MULTIPOINT = 8
- POINTZ = 11
- POLYLINEZ = 13
- POLYGONZ = 15
- MULTIPOINTZ = 18
- POINTM = 21
- POLYLINEM = 23
- POLYGONM = 25
- MULTIPOINTM = 28
- MULTIPATCH = 31

Based on this we can see that our blockgroups shapefile contains
Polygon type shapes. The shape types are also defined as constants in
the shapefile module, so that we can compare types more intuitively:


    >>> sf.shapeType == shapefile.POLYGON
    True

For convenience, you can also get the name of the shape type as a string:


    >>> sf.shapeTypeName == 'POLYGON'
    True

Other pieces of meta-data that we can check include the number of features
and the bounding box area the shapefile covers:


    >>> len(sf)
    663
    >>> sf.bbox
    [-122.515048, 37.652916, -122.327622, 37.863433]

Finally, if you would prefer to work with the entire shapefile in a different
format, you can convert all of it to a GeoJSON dictionary, although you may lose
some information in the process, such as z- and m-values:


    >>> sf.__geo_interface__['type']
    'FeatureCollection'

### Reading Geometry

A shapefile's geometry is the collection of points or shapes made from
vertices and implied arcs representing physical locations. All types of
shapefiles just store points. The metadata about the points determine how they
are handled by software.

You can get a list of the shapefile's geometry by calling the shapes()
method.


    >>> shapes = sf.shapes()

The shapes method returns a list of Shape objects describing the geometry of
each shape record.


    >>> len(shapes)
    663

To read a single shape by its index, use the shape() method. The index
is the shape's count from 0. So to read the 8th shape record you would use its
index which is 7.


    >>> s = sf.shape(7)

    >>> # Read the bbox of the 8th shape to verify
    >>> # Round coordinates to 3 decimal places
    >>> ['%.3f' % coord for coord in s.bbox]
    ['-122.450', '37.801', '-122.442', '37.808']

Each shape record (except Points) contains the following attributes. Records of
shapeType Point do not have a bounding box 'bbox'.


    >>> for name in dir(shapes[3]):
    ...     if not name.startswith('_'):
    ...         name
    'bbox'
    'parts'
    'points'
    'shapeType'
    'shapeTypeName'

* shapeType: an integer representing the type of shape as defined by the
  shapefile specification.


        >>> shapes[3].shapeType
        5

* shapeTypeName: a string representation of the type of shape as defined by shapeType. Read-only.


        >>> shapes[3].shapeTypeName
        'POLYGON'

* bbox: If the shape type contains multiple points this tuple describes the
  lower left (x,y) coordinate and upper right corner coordinate creating a
  complete box around the points. If the shapeType is a
  Null (shapeType == 0) then an AttributeError is raised.


        >>> # Get the bounding box of the 4th shape.
        >>> # Round coordinates to 3 decimal places
        >>> bbox = shapes[3].bbox
        >>> ['%.3f' % coord for coord in bbox]
        ['-122.486', '37.787', '-122.446', '37.811']

* parts: Parts simply group collections of points into shapes. If the shape
  record has multiple parts this attribute contains the index of the first
  point of each part. If there is only one part then a list containing 0 is
  returned.


        >>> shapes[3].parts
        [0]

* points: The points attribute contains a list of tuples containing an
  (x,y) coordinate for each point in the shape.


        >>> len(shapes[3].points)
        173
        >>> # Get the 8th point of the fourth shape
        >>> # Round coordinates to 3 decimal places
        >>> shape = shapes[3].points[7]
        >>> ['%.3f' % coord for coord in shape]
        ['-122.471', '37.787']

In most cases, however, if you need to do more than just type or bounds checking, you may want
to convert the geometry to the more human-readable [GeoJSON format](http://geojson.org),
where lines and polygons are grouped for you:


    >>> s = sf.shape(0)
    >>> geoj = s.__geo_interface__
    >>> geoj["type"]
    'MultiPolygon'

The results from the shapes() method similarly support converting to GeoJSON:


    >>> shapes.__geo_interface__['type']
    'GeometryCollection'


### Reading Records

A record in a shapefile contains the attributes for each shape in the
collection of geometries. Records are stored in the dbf file. The link between
geometry and attributes is the foundation of all geographic information systems.
This critical link is implied by the order of shapes and corresponding records
in the shp geometry file and the dbf attribute file.

The field names of a shapefile are available as soon as you read a shapefile.
You can access the "fields" attribute of the shapefile as a Python list. Each
field is a Python list with the following information:

* Field name: the name describing the data at this column index.
* Field type: the type of data at this column index. Types can be:
    * "C": Characters, text.
    * "N": Numbers, with or without decimals.
    * "F": Floats (same as "N").
    * "L": Logical, for boolean True/False values.
    * "D": Dates.
    * "M": Memo, has no meaning within a GIS and is part of the xbase spec instead.
* Field length: the length of the data found at this column index. Older GIS
  software may truncate this length to 8 or 11 characters for "Character"
  fields.
* Decimal length: the number of decimal places found in "Number" fields.

To see the fields for the Reader object above (sf) access the "fields"
attribute:


    >>> fields = sf.fields

    >>> assert fields == [("DeletionFlag", "C", 1, 0), ["AREA", "N", 18, 5],
    ... ["BKG_KEY", "C", 12, 0], ["POP1990", "N", 9, 0], ["POP90_SQMI", "N", 10, 1],
    ... ["HOUSEHOLDS", "N", 9, 0],
    ... ["MALES", "N", 9, 0], ["FEMALES", "N", 9, 0], ["WHITE", "N", 9, 0],
    ... ["BLACK", "N", 8, 0], ["AMERI_ES", "N", 7, 0], ["ASIAN_PI", "N", 8, 0],
    ... ["OTHER", "N", 8, 0], ["HISPANIC", "N", 8, 0], ["AGE_UNDER5", "N", 8, 0],
    ... ["AGE_5_17", "N", 8, 0], ["AGE_18_29", "N", 8, 0], ["AGE_30_49", "N", 8, 0],
    ... ["AGE_50_64", "N", 8, 0], ["AGE_65_UP", "N", 8, 0],
    ... ["NEVERMARRY", "N", 8, 0], ["MARRIED", "N", 9, 0], ["SEPARATED", "N", 7, 0],
    ... ["WIDOWED", "N", 8, 0], ["DIVORCED", "N", 8, 0], ["HSEHLD_1_M", "N", 8, 0],
    ... ["HSEHLD_1_F", "N", 8, 0], ["MARHH_CHD", "N", 8, 0],
    ... ["MARHH_NO_C", "N", 8, 0], ["MHH_CHILD", "N", 7, 0],
    ... ["FHH_CHILD", "N", 7, 0], ["HSE_UNITS", "N", 9, 0], ["VACANT", "N", 7, 0],
    ... ["OWNER_OCC", "N", 8, 0], ["RENTER_OCC", "N", 8, 0],
    ... ["MEDIAN_VAL", "N", 7, 0], ["MEDIANRENT", "N", 4, 0],
    ... ["UNITS_1DET", "N", 8, 0], ["UNITS_1ATT", "N", 7, 0], ["UNITS2", "N", 7, 0],
    ... ["UNITS3_9", "N", 8, 0], ["UNITS10_49", "N", 8, 0],
    ... ["UNITS50_UP", "N", 8, 0], ["MOBILEHOME", "N", 7, 0]]

You can get a list of the shapefile's records by calling the records() method:


    >>> records = sf.records()

    >>> len(records)
    663

To read a single record call the record() method with the record's index:


    >>> rec = sf.record(3)

Each record is a list-like Record object containing the values corresponding to each field in
the field list. A record's values can be accessed by positional indexing or slicing.
For example in the blockgroups shapefile the 2nd and 3rd fields are the blockgroup id
and the 1990 population count of that San Francisco blockgroup:


    >>> rec[1:3]
    ['060750601001', 4715]

For simpler access, the fields of a record can also be accessed via the name of the field,
either as a key or as an attribute name. The blockgroup id (BKG_KEY) of the blockgroups shapefile
can also be retrieved as:


    >>> rec['BKG_KEY']
    '060750601001'

    >>> rec.BKG_KEY
    '060750601001'

The record values can be easily integrated with other programs by converting them to a field-value dictionary:


    >>> dct = rec.as_dict()
    >>> sorted(dct.items())
    [('AGE_18_29', 1467), ('AGE_30_49', 1681), ('AGE_50_64', 92), ('AGE_5_17', 848), ('AGE_65_UP', 30), ('AGE_UNDER5', 597), ('AMERI_ES', 6), ('AREA', 2.34385), ('ASIAN_PI', 452), ('BKG_KEY', '060750601001'), ('BLACK', 1007), ('DIVORCED', 149), ('FEMALES', 2095), ('FHH_CHILD', 16), ('HISPANIC', 416), ('HOUSEHOLDS', 1195), ('HSEHLD_1_F', 40), ('HSEHLD_1_M', 22), ('HSE_UNITS', 1258), ('MALES', 2620), ('MARHH_CHD', 79), ('MARHH_NO_C', 958), ('MARRIED', 2021), ('MEDIANRENT', 739), ('MEDIAN_VAL', 337500), ('MHH_CHILD', 0), ('MOBILEHOME', 0), ('NEVERMARRY', 703), ('OTHER', 288), ('OWNER_OCC', 66), ('POP1990', 4715), ('POP90_SQMI', 2011.6), ('RENTER_OCC', 3733), ('SEPARATED', 49), ('UNITS10_49', 49), ('UNITS2', 160), ('UNITS3_9', 672), ('UNITS50_UP', 0), ('UNITS_1ATT', 302), ('UNITS_1DET', 43), ('VACANT', 93), ('WHITE', 2962), ('WIDOWED', 37)]

If at a later point you need to check the record's index position in the original
shapefile, you can do this through the "oid" attribute:


    >>> rec.oid
    3

### Reading Geometry and Records Simultaneously

You may want to examine both the geometry and the attributes for a record at
the same time. The shapeRecord() and shapeRecords() methods let you do just
that.

Calling the shapeRecords() method will return the geometry and attributes for
all shapes as a list of ShapeRecord objects. Each ShapeRecord instance has a
"shape" and "record" attribute. The shape attribute is a Shape object as
discussed in the first section "Reading Geometry". The record attribute is a
list-like object containing field values as demonstrated in the "Reading Records" section.


    >>> shapeRecs = sf.shapeRecords()

Let's read the blockgroup key and the population for the 4th blockgroup:


    >>> shapeRecs[3].record[1:3]
    ['060750601001', 4715]

The result of the shapeRecords() method is a list-like object that can be easily converted
to GeoJSON through the \_\_geo_interface\_\_:


    >>> shapeRecs.__geo_interface__['type']
    'FeatureCollection'

The shapeRecord() method reads a single shape/record pair at the specified index.
To get the 4th shape record from the blockgroups shapefile use index 3:


    >>> shapeRec = sf.shapeRecord(3)

Each individual shape record also supports the \_\_geo_interface\_\_ to convert it to a GeoJSON feature:


    >>> shapeRec.__geo_interface__['type']
    'Feature'

The blockgroup key and population count:


    >>> shapeRec.record[1:3]
    ['060750601001', 4715]


## Writing Shapefiles

### The Writer Class

PyShp tries to be as flexible as possible when writing shapefiles while
maintaining some degree of automatic validation to make sure you don't
accidentally write an invalid file.

PyShp can write just one of the component files such as the shp or dbf file
without writing the others. So in addition to being a complete shapefile
library, it can also be used as a basic dbf (xbase) library. Dbf files are a
common database format which are often useful as a standalone simple database
format. And even shp files occasionally have uses as a standalone format. Some
web-based GIS systems use a user-uploaded shp file to specify an area of
interest. Many precision agriculture chemical field sprayers also use the shp
format as a control file for the sprayer system (usually in combination with
custom database file formats).

To create a shapefile you begin by initiating a new Writer instance, passing it
the file path and name to save to:


    >>> w = shapefile.Writer('shapefiles/test/testfile')
    >>> w.field('field1', 'C')

File extensions are optional when reading or writing shapefiles. If you specify
them PyShp ignores them anyway. When you save files you can specify a base
file name that is used for all three file types.
Or you can specify a name for
one or more file types:


    >>> w = shapefile.Writer(dbf='shapefiles/test/onlydbf.dbf')
    >>> w.field('field1', 'C')

In that case, only the file types that were given a file name are saved;
any file types not assigned are skipped.

#### Writing Shapefiles Using the Context Manager

The "Writer" class automatically closes the open files and writes the final headers once it is garbage collected.
In case of a crash and to make the code more readable, it is nevertheless recommended
you do this manually by calling the "close()" method:


    >>> w.close()

Alternatively, you can also use the "Writer" class as a context manager, to ensure open file
objects are properly closed and final headers written once you exit the with-clause:


    >>> with shapefile.Writer("shapefiles/test/contextwriter") as w:
    ...     w.field('field1', 'C')
    ...     pass

#### Writing Shapefiles to File-Like Objects

Just as you can read shapefiles from python file-like objects you can also
write to them:


    >>> try:
    ...     from StringIO import StringIO
    ... except ImportError:
    ...     from io import BytesIO as StringIO
    >>> shp = StringIO()
    >>> shx = StringIO()
    >>> dbf = StringIO()
    >>> w = shapefile.Writer(shp=shp, shx=shx, dbf=dbf)
    >>> w.field('field1', 'C')
    >>> w.record()
    >>> w.null()
    >>> w.close()
    >>> # To read back the files you could call the "StringIO.getvalue()" method later.

#### Setting the Shape Type

The shape type defines the type of geometry contained in the shapefile. All of
the shapes must match the shape type setting.

There are three ways to set the shape type:
  * Set it when creating the class instance.
  * Set it by assigning a value to an existing class instance.
  * Set it automatically to the type of the first non-null shape by saving the shapefile.
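
The third option can be pictured as scanning the shapes in the order they were added and taking the type of the first one that is not Null. The sketch below illustrates that rule with plain integer codes; `infer_shape_type` is a hypothetical helper, not part of PyShp, and falling back to 0 (NULL) when every shape is null is an assumption made for the example.

```python
NULL = 0  # shape type code for Null shapes, as listed in the specification


def infer_shape_type(shape_types):
    """Return the first non-null shape type code, or NULL if there is none."""
    for code in shape_types:
        if code != NULL:
            return code
    return NULL


# Two null shapes followed by two polylines (code 3).
print(infer_shape_type([0, 0, 3, 3]))
```

Setting the shape type explicitly, as shown next, avoids relying on the order in which shapes happen to be added.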

To manually set the shape type for a Writer object when creating the Writer:


    >>> w = shapefile.Writer('shapefiles/test/shapetype', shapeType=3)
    >>> w.field('field1', 'C')

    >>> w.shapeType
    3

OR you can set it after the Writer is created:


    >>> w.shapeType = 1

    >>> w.shapeType
    1


### Adding Records

Before you can add records you must first create the fields that define what types of
values will go into each attribute.

There are several different field types, all of which support storing None values as NULL.

Text fields are created using the 'C' type, and the third 'size' argument can be customized to the expected
length of text values to save space:


    >>> w = shapefile.Writer('shapefiles/test/dtype')
    >>> w.field('TEXT', 'C')
    >>> w.field('SHORT_TEXT', 'C', size=5)
    >>> w.field('LONG_TEXT', 'C', size=250)
    >>> w.null()
    >>> w.record('Hello', 'World', 'World'*50)
    >>> w.close()

    >>> r = shapefile.Reader('shapefiles/test/dtype')
    >>> assert r.record(0) == ['Hello', 'World', 'World'*50]

Date fields are created using the 'D' type, and their values can be given as
date objects, lists, or YYYYMMDD formatted strings.
Field length or decimal have no impact on this type:


    >>> from datetime import date
    >>> w = shapefile.Writer('shapefiles/test/dtype')
    >>> w.field('DATE', 'D')
    >>> w.null()
    >>> w.null()
    >>> w.null()
    >>> w.null()
    >>> w.record(date(1898,1,30))
    >>> w.record([1998,1,30])
    >>> w.record('19980130')
    >>> w.record(None)
    >>> w.close()

    >>> r = shapefile.Reader('shapefiles/test/dtype')
    >>> assert r.record(0) == [date(1898,1,30)]
    >>> assert r.record(1) == [date(1998,1,30)]
    >>> assert r.record(2) == [date(1998,1,30)]
    >>> assert r.record(3) == [None]

Numeric fields are created using the 'N' type (or the 'F' type, which is exactly the same).
By default the fourth decimal argument is set to zero, essentially creating an integer field.
To store floats you must set the decimal argument to the precision of your choice.
To store very large numbers you must increase the field length size to the total number of digits
(including the decimal point and minus sign).


    >>> w = shapefile.Writer('shapefiles/test/dtype')
    >>> w.field('INT', 'N')
    >>> w.field('LOWPREC', 'N', decimal=2)
    >>> w.field('MEDPREC', 'N', decimal=10)
    >>> w.field('HIGHPREC', 'N', decimal=30)
    >>> w.field('FTYPE', 'F', decimal=10)
    >>> w.field('LARGENR', 'N', 101)
    >>> nr = 1.3217328
    >>> w.null()
    >>> w.null()
    >>> w.record(INT=nr, LOWPREC=nr, MEDPREC=nr, HIGHPREC=-3.2302e-25, FTYPE=nr, LARGENR=int(nr)*10**100)
    >>> w.record(None, None, None, None, None, None)
    >>> w.close()

    >>> r = shapefile.Reader('shapefiles/test/dtype')
    >>> assert r.record(0) == [1, 1.32, 1.3217328, -3.2302e-25, 1.3217328, 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000]
    >>> assert r.record(1) == [None, None, None, None, None, None]


Finally, we can create boolean fields by setting the type to 'L'.
This field can take True or False values, or 1 (True) or 0 (False).
None is interpreted as missing.


    >>> w = shapefile.Writer('shapefiles/test/dtype')
    >>> w.field('BOOLEAN', 'L')
    >>> w.null()
    >>> w.null()
    >>> w.null()
    >>> w.null()
    >>> w.null()
    >>> w.null()
    >>> w.record(True)
    >>> w.record(1)
    >>> w.record(False)
    >>> w.record(0)
    >>> w.record(None)
    >>> w.record("Nonesense")
    >>> w.close()

    >>> r = shapefile.Reader('shapefiles/test/dtype')
    >>> r.record(0)
    Record #0: [True]
    >>> r.record(1)
    Record #1: [True]
    >>> r.record(2)
    Record #2: [False]
    >>> r.record(3)
    Record #3: [False]
    >>> r.record(4)
    Record #4: [None]
    >>> r.record(5)
    Record #5: [None]

You can also add attributes using keyword arguments where the keys are field names.


    >>> w = shapefile.Writer('shapefiles/test/dtype')
    >>> w.field('FIRST_FLD','C','40')
    >>> w.field('SECOND_FLD','C','40')
    >>> w.null()
    >>> w.null()
    >>> w.record('First', 'Line')
    >>> w.record(FIRST_FLD='First', SECOND_FLD='Line')
    >>> w.close()

### Adding Geometry

Geometry is added using one of several convenience methods. The "null" method is used
for null shapes, "point" is used for point shapes, "multipoint" is used for multipoint shapes, "line" for lines,
"poly" for polygons.

**Adding a Null shape**

A shapefile may contain some records for which geometry is not available, and may be set using the "null" method.
Because Null shape types (shape type 0) have no geometry the "null" method is called without any arguments.


    >>> w = shapefile.Writer('shapefiles/test/null')
    >>> w.field('name', 'C')

    >>> w.null()
    >>> w.record('nullgeom')

    >>> w.close()

**Adding a Point shape**

Point shapes are added using the "point" method. A point is specified by an x and
y value.
798 799 800 >>> w = shapefile.Writer('shapefiles/test/point') 801 >>> w.field('name', 'C') 802 803 >>> w.point(122, 37) 804 >>> w.record('point1') 805 806 >>> w.close() 807 808 **Adding a MultiPoint shape** 809 810 If your point data allows for the possibility of multiple points per feature, use "multipoint" instead. 811 These are specified as a list of xy point coordinates. 812 813 814 >>> w = shapefile.Writer('shapefiles/test/multipoint') 815 >>> w.field('name', 'C') 816 817 >>> w.multipoint([[122,37], [124,32]]) 818 >>> w.record('multipoint1') 819 820 >>> w.close() 821 822 **Adding a LineString shape** 823 824 For LineString shapefiles, each shape is given as a list of one or more linear features. 825 Each of the linear features must have at least two points. 826 827 828 >>> w = shapefile.Writer('shapefiles/test/line') 829 >>> w.field('name', 'C') 830 831 >>> w.line([ 832 ... [[1,5],[5,5],[5,1],[3,3],[1,1]], # line 1 833 ... [[3,2],[2,6]] # line 2 834 ... ]) 835 836 >>> w.record('linestring1') 837 838 >>> w.close() 839 840 **Adding a Polygon shape** 841 842 Similarly to LineString, Polygon shapes consist of multiple polygons, and must be given as a list of polygons. 843 The main difference is that polygons must have at least 4 points and the last point must be the same as the first. 844 It's also okay if you forget to repeat the first point at the end; PyShp automatically checks and closes the polygons 845 if you don't. 846 847 It's important to note that for Polygon shapefiles, your polygon coordinates must be ordered in a clockwise direction. 848 If any of the polygons have holes, then the hole polygon coordinates must be ordered in a counterclockwise direction. 849 The direction of your polygons determines how shapefile readers will distinguish between polygon outlines and holes. 850 851 852 >>> w = shapefile.Writer('shapefiles/test/polygon') 853 >>> w.field('name', 'C') 854 855 >>> w.poly([ 856 ... 
    ...     [[113,24], [112,32], [117,36], [122,37], [118,20]], # poly 1
    ...     [[116,29],[116,26],[119,29],[119,32]], # hole 1
    ...     [[15,2], [17,6], [22,7]] # poly 2
    ... ])
    >>> w.record('polygon1')

    >>> w.close()

**Adding from an existing Shape object**

Finally, geometry can be added by passing an existing "Shape" object to the "shape" method.
You can also pass it any GeoJSON dictionary or \_\_geo_interface\_\_ compatible object.
This can be particularly useful for copying from one file to another:


    >>> r = shapefile.Reader('shapefiles/test/polygon')

    >>> w = shapefile.Writer('shapefiles/test/copy')
    >>> w.fields = r.fields[1:] # skip first deletion field

    >>> # adding existing Shape objects
    >>> for shaperec in r.iterShapeRecords():
    ...     w.record(*shaperec.record)
    ...     w.shape(shaperec.shape)

    >>> # or GeoJSON dicts
    >>> for shaperec in r.iterShapeRecords():
    ...     w.record(*shaperec.record)
    ...     w.shape(shaperec.shape.__geo_interface__)

    >>> w.close()


### Geometry and Record Balancing

Because every shape must have a corresponding record it is critical that the
number of records equals the number of shapes to create a valid shapefile. You
must take care to add records and shapes in the same order so that the record
data lines up with the geometry data. For example:


    >>> w = shapefile.Writer('shapefiles/test/balancing', shapeType=shapefile.POINT)
    >>> w.field("field1", "C")
    >>> w.field("field2", "C")

    >>> w.record("row", "one")
    >>> w.point(1, 1)

    >>> w.record("row", "two")
    >>> w.point(2, 2)

To help prevent accidental misalignment PyShp has an "auto balance" feature to
make sure when you add either a shape or a record the two sides of the
equation line up. This way if you forget to update an entry the
shapefile will still be valid and handled correctly by most shapefile
software.
Autobalancing is NOT turned on by default. To activate it set
the attribute autoBalance to 1 or True:


    >>> w.autoBalance = 1
    >>> w.record("row", "three")
    >>> w.record("row", "four")
    >>> w.point(4, 4)

    >>> w.recNum == w.shpNum
    True

You also have the option of manually calling the balance() method at any time
to ensure the other side is up to date. When balancing is used,
null shapes are created on the geometry side, or records
with a value of "NULL" for each field are created on the attribute side.
This gives you flexibility in how you build the shapefile.
You can create all of the shapes and then create all of the records or vice versa.


    >>> w.autoBalance = 0
    >>> w.record("row", "five")
    >>> w.record("row", "six")
    >>> w.record("row", "seven")
    >>> w.point(5, 5)
    >>> w.point(6, 6)
    >>> w.balance()

    >>> w.recNum == w.shpNum
    True

If you do not use the autoBalance attribute or the balance() method and forget to manually
balance the geometry and attributes, the shapefile will be viewed as corrupt by
most shapefile software.



# How To's

## 3D and Other Geometry Types

Most shapefiles store conventional 2D points, lines, or polygons. But the shapefile format is also capable
of storing various other types of geometries, including complex 3D surfaces and objects.

**Shapefiles with measurement (M) values**

Measured shape types are shapes that include a measurement value at each vertex, for instance
speed measurements from a GPS device. Shapes with measurement (M) values are added with the following
methods: "pointm", "multipointm", "linem", and "polym". The M-values are specified by adding a
third M value to each XY coordinate. Missing or unobserved M-values are specified with a None value,
or by simply omitting the third M-coordinate.


    >>> w = shapefile.Writer('shapefiles/test/linem')
    >>> w.field('name', 'C')

    >>> w.linem([
    ...     [[1,5,0],[5,5],[5,1,3],[3,3,None],[1,1,0]], # line with one omitted and one missing M-value
    ...     [[3,2],[2,6]] # line without any M-values
    ... ])

    >>> w.record('linem1')

    >>> w.close()

Shapefiles containing M-values can be examined in several ways:

    >>> r = shapefile.Reader('shapefiles/test/linem')

    >>> r.mbox # the lower and upper bound of M-values in the shapefile
    [0.0, 3.0]

    >>> r.shape(0).m # flat list of M-values
    [0.0, None, 3.0, None, 0.0, None, None]


**Shapefiles with elevation (Z) values**

Elevation shape types are shapes that include an elevation value at each vertex, for instance elevation from a GPS device.
Shapes with elevation (Z) values are added with the following methods: "pointz", "multipointz", "linez", and "polyz".
The Z-values are specified by adding a third Z value to each XY coordinate. Z-values do not support the concept of missing data,
but if you omit the third Z-coordinate it will default to 0. Note that Z-type shapes also support measurement (M) values added
as a fourth M-coordinate. This too is optional.


    >>> w = shapefile.Writer('shapefiles/test/linez')
    >>> w.field('name', 'C')

    >>> w.linez([
    ...     [[1,5,18],[5,5,20],[5,1,22],[3,3],[1,1]], # line with some omitted Z-values
    ...     [[3,2],[2,6]], # line without any Z-values
    ...     [[3,2,15,0],[2,6,13,3],[1,9,14,2]] # line with both Z- and M-values
    ... ])

    >>> w.record('linez1')

    >>> w.close()

To examine a Z-type shapefile you can do:

    >>> r = shapefile.Reader('shapefiles/test/linez')

    >>> r.zbox # the lower and upper bound of Z-values in the shapefile
    [0.0, 22.0]

    >>> r.shape(0).z # flat list of Z-values
    [18.0, 20.0, 22.0, 0.0, 0.0, 0.0, 0.0, 15.0, 13.0, 14.0]

**3D MultiPatch Shapefiles**

Multipatch shapes are useful for storing composite 3-Dimensional objects.
A MultiPatch shape represents a 3D object made up of one or more surface parts.
Each surface in "parts" is defined by a list of XYZM values (Z and M values optional), and its corresponding type is
given in the "partTypes" argument. The part type decides how the coordinate sequence is to be interpreted, and can be one
of the following module constants: TRIANGLE_STRIP, TRIANGLE_FAN, OUTER_RING, INNER_RING, FIRST_RING, or RING.
For instance, a TRIANGLE_STRIP may be used to represent the walls of a building, combined with a TRIANGLE_FAN to represent
its roof:

    >>> from shapefile import TRIANGLE_STRIP, TRIANGLE_FAN

    >>> w = shapefile.Writer('shapefiles/test/multipatch')
    >>> w.field('name', 'C')

    >>> w.multipatch([
    ...     [[0,0,0],[0,0,3],[5,0,0],[5,0,3],[5,5,0],[5,5,3],[0,5,0],[0,5,3],[0,0,0],[0,0,3]], # TRIANGLE_STRIP for house walls
    ...     [[2.5,2.5,5],[0,0,3],[5,0,3],[5,5,3],[0,5,3],[0,0,3]], # TRIANGLE_FAN for pointed house roof
    ...     ],
    ...     partTypes=[TRIANGLE_STRIP, TRIANGLE_FAN]) # one type for each part

    >>> w.record('house1')

    >>> w.close()

For an introduction to the various multipatch part types and examples of how to create 3D MultiPatch objects see [this
ESRI White Paper](http://downloads.esri.com/support/whitepapers/ao_/J9749_MultiPatch_Geometry_Type.pdf).
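
To make the role of the part type more concrete, the following is a minimal sketch of how a TRIANGLE_STRIP and a TRIANGLE_FAN coordinate sequence can be expanded into individual triangles, using the house coordinates from the example above. The helper functions `strip_to_triangles` and `fan_to_triangles` are hypothetical names made up for this illustration and are not part of PyShp; the sketch also ignores triangle winding order.

```python
def strip_to_triangles(vertices):
    """Expand a triangle strip: every vertex after the first two forms
    a triangle with its two immediate predecessors."""
    return [
        tuple(tuple(v) for v in vertices[i:i + 3])
        for i in range(len(vertices) - 2)
    ]


def fan_to_triangles(vertices):
    """Expand a triangle fan: every vertex after the first two forms
    a triangle with the first vertex and its immediate predecessor."""
    first = tuple(vertices[0])
    return [
        (first, tuple(vertices[i]), tuple(vertices[i + 1]))
        for i in range(1, len(vertices) - 1)
    ]


# Same coordinates as the multipatch house example
walls = [[0,0,0],[0,0,3],[5,0,0],[5,0,3],[5,5,0],[5,5,3],[0,5,0],[0,5,3],[0,0,0],[0,0,3]]
roof = [[2.5,2.5,5],[0,0,3],[5,0,3],[5,5,3],[0,5,3],[0,0,3]]

wall_triangles = strip_to_triangles(walls)  # two triangles per wall
roof_triangles = fan_to_triangles(roof)     # one triangle per roof side

print(len(wall_triangles), len(roof_triangles))  # prints: 8 4
```

Both parts list a similar number of vertices, yet the strip links each new vertex to the two before it while the fan always connects back to the first vertex (the roof apex), which is why the part type has to be stored alongside the coordinates.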

## Working with Large Shapefiles

Despite being a lightweight library, PyShp is designed to be able to read and write
shapefiles of any size, allowing you to work with hundreds of thousands or even millions
of records and complex geometries.

When first creating the Reader class, the library only reads the header information
and leaves the rest of the file contents alone. Once you call the records() and shapes()
methods however, it will attempt to read the entire file into memory at once.
For very large files this can result in a MemoryError. So when working with large files
it is recommended to use the iterShapes(), iterRecords(), or iterShapeRecords()
methods instead. These iterate through the file contents one item at a time, enabling you to loop
through them while keeping memory usage at a minimum.


    >>> for shape in sf.iterShapes():
    ...     # do something here
    ...     pass

    >>> for rec in sf.iterRecords():
    ...     # do something here
    ...     pass

    >>> for shapeRec in sf.iterShapeRecords():
    ...     # do something here
    ...     pass

    >>> for shapeRec in sf: # same as iterShapeRecords()
    ...     # do something here
    ...     pass

The shapefile Writer class uses a similar streaming approach to keep memory
usage at a minimum. The library takes care of this under-the-hood by immediately
writing each geometry and record to disk the moment they
are added using shape() or record(). Once the writer is closed, exited, or garbage
collected, the final header information is calculated and written to the beginning of
the file.
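
As a rough sketch of how the streaming Reader and Writer can be combined, the function below copies every shape and record from several sources into one writer, one item at a time. The name `merge_shapefiles` is hypothetical and not part of PyShp. The sketch relies only on the `fields` attribute and the `iterShapeRecords()`, `record()`, and `shape()` methods shown elsewhere in this document, and it assumes that all sources share the same field layout and shape type.

```python
def merge_shapefiles(readers, writer):
    """Stream all shape/record pairs from `readers` into `writer`.

    `readers` is an iterable of opened shapefile.Reader objects and
    `writer` is a shapefile.Writer object. Assumes every source has
    the same fields. Returns the number of items written.
    """
    count = 0
    for index, reader in enumerate(readers):
        if index == 0:
            # Copy the field definitions once, skipping the deletion flag field
            writer.fields = reader.fields[1:]
        for shaperec in reader.iterShapeRecords():
            # Only one shape/record pair is held in memory at a time
            writer.record(*shaperec.record)
            writer.shape(shaperec.shape)
            count += 1
    return count


# Possible usage (the file names are placeholders):
#   import shapefile
#   w = shapefile.Writer('merged')
#   merge_shapefiles([shapefile.Reader('part1'), shapefile.Reader('part2')], w)
#   w.close()
```

Because the function never calls shapes() or records(), memory usage should stay roughly constant regardless of how many items the sources contain.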

Because each item is written to disk as soon as it is added, as long as you are able to iterate through a source file without having
to load everything into memory, such as a large CSV table or a large shapefile, you can
process and write any number of items, and even merge many different source files into a single
large shapefile. If you need to edit or undo any of your writing you would have to read the
file back in, one record at a time, make your changes, and write it back out.

## Unicode and Shapefile Encodings

PyShp has full support for unicode and shapefile encodings, so you can always expect to be working
with unicode strings in shapefiles that have text fields.
Most shapefiles are written in UTF-8 encoding, PyShp's default encoding, so in most cases you don't
have to specify the encoding. For reading shapefiles in any other encoding, such as Latin-1, just
supply the encoding option when creating the Reader class.


    >>> r = shapefile.Reader("shapefiles/test/latin1.shp", encoding="latin1")
    >>> r.record(0) == [2, u'Ñandú']
    True

Once you have loaded the shapefile, you may choose to save it using another more supportive encoding such
as UTF-8. Provided the new encoding supports the characters you are trying to write, reading it back in
should give you the same unicode string you started with.


    >>> w = shapefile.Writer("shapefiles/test/latin_as_utf8.shp", encoding="utf8")
    >>> w.fields = r.fields[1:]
    >>> w.record(*r.record(0))
    >>> w.null()
    >>> w.close()

    >>> r = shapefile.Reader("shapefiles/test/latin_as_utf8.shp", encoding="utf8")
    >>> r.record(0) == [2, u'Ñandú']
    True

If you supply the wrong encoding and the string is unable to be decoded, PyShp will by default raise an
exception.
If however, on rare occasion, you are unable to find the correct encoding and want to ignore
or replace encoding errors, you can specify the "encodingErrors" option to be used by the decode method. This
applies to both reading and writing.


    >>> r = shapefile.Reader("shapefiles/test/latin1.shp", encoding="ascii", encodingErrors="replace")
    >>> r.record(0) == [2, u'�and�']
    True


# Testing

The tests are doctests located in this file, README.md.
To run them, execute the following from the command line in the folder containing README.md and shapefile.py:
```
$ python shapefile.py
```

Linux/Mac and similar platforms will need to run `$ dos2unix README.md` in order
to correct line endings in README.md.

# Contributors

```
Atle Frenvik Sveen
Bas Couwenberg
Casey Meisenzahl
Charles Arnold
David A. Riggs
davidh-ssec
Evan Heidtmann
ezcitron
fiveham
geospatialpython
Hannes
Ignacio Martinez Vazquez
Jason Moujaes
Jonty Wareing
Karim Bahgat
Kyle Kelley
Louis Tiao
Marcin Cuprjak
mcuprjak
Micah Cochran
Michael Davis
Michal Čihař
Mike Toews
Nilo
pakoun
Paulo Ernesto
Raynor Vliegendhart
Razzi Abuissa
RosBer97
Ross Rogers
Ryan Brideau
Tobias Megies
Tommi Penttinen
Uli Köhler
Vsevolod Novikov
Zac Miller
```

Keywords: gis geospatial geographic shapefile shapefiles
Platform: UNKNOWN
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Topic :: Scientific/Engineering :: GIS
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >= 2.7
Description-Content-Type: text/markdown