1# coding=utf-8 2__version__ = '3.5.0' 3__author__ = "Avinash Kak (kak@purdue.edu)" 4__date__ = '2021-May-30' 5__url__ = 'https://engineering.purdue.edu/kak/dist/BitVector-3.5.0.html' 6__copyright__ = "(C) 2021 Avinash Kak. Python Software Foundation." 7 8__doc__ = ''' 9 10BitVector.py 11 12Version: ''' + __version__ + ''' 13 14Author: Avinash Kak (kak@purdue.edu) 15 16Date: ''' + __date__ + ''' 17 18 19@title 20CHANGES IN THIS VERSION: 21 22 Version 3.5.0 makes the module ready for Python 3.9. The methods 23 "array.fromstring" and "array.tostring" will not exist in Python 3.9. 24 One additional change that I have made is that I have gone back to my 25 older implementation for __add__ on account of a bug in the new 26 implementation that I had incorporated in Version 3.4.9. 27 28 Version 3.4.9, in addition to implementing __iadd__, also includes a 29 significantly faster implementation for __add__. For extending a given 30 instance of BitVector, the implementation for __iadd__ adds bits to the 31 existing instance as opposed to creating a new instance. These changes 32 to the module should make it easier to solve larger problems more 33 quickly with BitVector. 34 35 Version 3.4.8 fixes a bug in the slice assignment logic in the 36 implementation of __setitem__. This version also makes the module 37 ready for Version 3.5.3 of Python3 that requires that when you write to 38 a string stream object, you do so with literals of type bytes. 39 40 Version 3.4.7 fixes a Python 3 specific bug in the write_to_file() 41 method of the module. While I was at it, I have also changed the name 42 of the method write_bits_to_fileobject() to 43 write_bits_to_stream_object() so that there is no confusion between 44 write_to_file() and write_bits_to_fileobject(). For backward 45 compatibility, write_bits_to_fileobject() will continue to work if you 46 are writing a bitvector to a string stream object. 47 48 See the "CHANGE HISTORY" section for a complete history of all the 49 changes made in the different versions of this module. 50 51 52@title 53INTRODUCTION: 54 55 The BitVector class is for a memory-efficient packed representation of 56 bit arrays and for logical operations on such arrays. The operations 57 supported on bit vectors are: 58 59 __add__ for concatenation 60 __and__ for bitwise logical AND 61 __contains__ 62 __eq__, __ne__, __lt__, __le__, __gt__, __ge__ 63 __getitem__ for indexed and sliced access 64 __iadd__ for adding to an existing bitvector 65 __int__ for returning integer value 66 __invert__ for inverting the 1's and 0's 67 __iter__ for iterating through 68 __len__ for len() 69 __lshift__ for circular shifts to the left 70 __or__ for bitwise logical OR 71 __rshift__ for circular shifts to the right 72 __setitem__ for indexed and sliced setting 73 __str__ for str() 74 __xor__ for bitwise logical XOR 75 close_file_object 76 count_bits 77 count_bits_sparse faster for sparse bit vectors 78 deep_copy 79 divide_into_two 80 gcd for greatest common divisor 81 gen_random_bits 82 get_bitvector_in_ascii 83 get_bitvector_in_hex 84 gf_divide_by_modulus for modular divisions in GF(2^n) 85 gf_MI for multiplicative inverse in GF(2^n) 86 gf_multiply for multiplications in GF(2) 87 gf_multiply_modular for multiplications in GF(2^n) 88 hamming_distance 89 int_val for returning the integer value 90 is_power_of_2 91 is_power_of_2_sparse faster for sparse bit vectors 92 jaccard_distance 93 jaccard_similarity 94 length 95 min_canonical for min-int-value canonical form 96 multiplicative_inverse 97 next_set_bit 98 pad_from_left 99 pad_from_right 100 permute 101 rank_of_bit_set_at_index 102 read_bits_from_file 103 reset 104 reverse 105 runs 106 set_value 107 shift_left for non-circular left shift 108 shift_right for non-circular right shift 109 test_for_primality 110 unpermute 111 write_bits_to_stream_object 112 write_to_file 113 114 115@title 116CONSTRUCTING BIT VECTORS: 117 118 You can construct a bit vector in the following different ways: 119 120 @tagC0 121 (C0) You construct an EMPTY bit vector using the following syntax: 122 123 bv = BitVector(size = 0) 124 125 @tagC1 126 (C1) You can construct a bit vector directly from either a tuple or a 127 list of bits, as in 128 129 bv = BitVector(bitlist = [1,0,1,0,0,1,0,1,0,0,1,0,1,0,0,1]) 130 131 @tagC2 132 (C2) You can construct a bit vector from an integer by 133 134 bv = BitVector(intVal = 56789) 135 136 The bits stored now will correspond to the binary representation 137 of the integer. The resulting bit vector is the shortest 138 possible bit vector for the integer value supplied. For example, 139 when intVal is 0, the bit vector constructed will consist of just 140 the bit 0. 141 142 @tagC3 143 (C3) When initializing a bit vector with an intVal as shown above, you 144 can also specify a size for the bit vector: 145 146 bv = BitVector(intVal = 0, size = 8) 147 148 will return the bit vector consisting of the bit pattern 149 00000000. The zero padding needed for meeting the size 150 requirement is always on the left. If the size supplied is 151 smaller than what it takes to create the shortest possible bit 152 vector for intVal, an exception is thrown. 153 154 @tagC4 155 (C4) You can create a zero-initialized bit vector of a given size by 156 157 bv = BitVector(size = 62) 158 159 This bit vector will hold exactly 62 bits, all initialized to 160 the 0 bit value. 161 162 @tagC5 163 (C5) You can construct a bit vector from a disk file by a two-step 164 procedure. First you construct an instance of bit vector by 165 166 bv = BitVector(filename = 'somefile') 167 168 This bit vector itself is incapable of holding the bits. To now 169 create bit vectors that actually hold the bits, you need to make 170 the following sort of a call on the above variable bv: 171 172 bv1 = bv.read_bits_from_file(64) 173 174 bv1 will be a regular bit vector containing 64 bits from the disk 175 file. If you want to re-read a file from the beginning for some 176 reason, you must obviously first close the file object that was 177 acquired with a call to the BitVector constructor with a filename 178 argument. This can be accomplished by 179 180 bv.close_file_object() 181 182 @tagC6 183 (C6) You can construct a bit vector from a string of 1's and 0's by 184 185 bv = BitVector(bitstring = '110011110000') 186 187 @tagC7 188 (C7) Yet another way to construct a bit vector is to read the bits 189 directly from a file-like object, as in 190 191 import io 192 x = "111100001111" 193 fp_read = io.StringIO( x ) 194 bv = BitVector(fp = fp_read) 195 print(bv) # 111100001111 196 197 @tagC8 198 (C8) You can also construct a bit vector directly from a text string 199 as shown by the example: 200 201 bv3 = BitVector(textstring = "hello") 202 print(bv3) # 0110100001100101011011000110110001101111 203 mytext = bv3.get_bitvector_in_ascii() 204 print mytext # hello 205 206 The bit vector is constructed by using the one-byte ASCII 207 encoding of the characters in the text string. 208 209 @tagC9 210 (C9) You can also construct a bit vector directly from a string of hex 211 digits as shown by the example: 212 213 bv4 = BitVector(hexstring = "68656c6c6f") 214 print(bv4) # 0110100001100101011011000110110001101111 215 myhexstring = bv4.get_bitvector_in_hex() 216 print myhexstring # 68656c6c6 217 218 @tagC10 219 (C10) You can also construct a bit vector directly from a bytes type 220 object you previously created in your script. This can be useful 221 when you are trying to recover the integer parameters stored in 222 public and private keys. A typical usage scenario: 223 224 keydata = base64.b64decode(open(sys.argv[1]).read().split(None)[1]) 225 bv = BitVector.BitVector(rawbytes = keydata) 226 227 where sys.argv[1] is meant to supply the name of a public key 228 file (in this case an SSH RSA public key file). 229 230 231@title 232OPERATIONS SUPPORTED BY THE BITVECTOR CLASS: 233 234@title 235DISPLAYING BIT VECTORS: 236 237 @tag1 238 (1) Since the BitVector class implements the __str__ method, a bit 239 vector can be displayed on a terminal by 240 241 print(bitvec) 242 243 or, for only Python 2.x, by 244 245 print bitvec 246 247 Basically, you can always obtain the string representation of a 248 bit vector by 249 250 str(bitvec) 251 252 and integer value by 253 254 int(bitvec) 255 256 257@title 258ACCESSING AND SETTING INDIVIDUAL BITS AND SLICES: 259 260 @tag2 261 (2) Any single bit of a bit vector bv can be set to 1 or 0 by 262 263 bv[M] = 1_or_0 264 print( bv[M] ) 265 266 or, for just Python 2.x, by 267 268 bv[M] = 1_or_0 269 print bv[M] 270 271 for accessing (and setting) the bit at the position that is indexed 272 M. You can retrieve the bit at position M by bv[M]. Note that the 273 index 0 corresponds to the first bit at the left end of a bit 274 pattern. This is made possible by the implementation of the 275 __getitem__ and __setitem__ methods. 276 277 @tag3 278 (3) A slice of a bit vector obtained by 279 280 bv[i:j] 281 282 is a bit vector constructed from the bits at index positions from i 283 through j-1. This is made possible by the implementation of the 284 __getitem__ method. 285 286 @tag4 287 (4) You can also carry out slice assignment: 288 289 bv1 = BitVector(size = 25) 290 bv2 = BitVector(bitstring = '1010001') 291 bv1[6:9] = bv2[0:3] 292 bv3 = BitVector(bitstring = '101') 293 bv1[0:3] = bv3 294 295 The first slice assignment will set the 6th, 7th, and the 8th bits 296 of the bit vector bv1 according to the first three bits of bv2. 297 The second slice assignment will set the first three bits of bv1 298 according to the three bits in bv3. This is made possible by the 299 slice setting code in the __setitem__ method. 300 301 @tag5 302 (5) You can iterate over a bit vector, as illustrated by 303 304 for bit in bitvec: 305 print(bit) 306 307 This is made possible by the override definition for the special 308 __iter__() method. 309 310 @tag6 311 (6) Negative subscripts for array-like indexing are supported. 312 Therefore, 313 314 bitvec[-i] 315 316 is legal assuming that the index range is not violated. A negative 317 index carries the usual Python interpretation: The last element of 318 a bit vector is indexed -1 and the first element -(n+1) if n is the 319 total number of bits in the bit vector. Negative subscripts are 320 made possible by special-casing such access in the implementation 321 of the __getitem__ method (actually it is the _getbit method). 322 323 @tag7 324 (7) You can reset a previously constructed bit vector to either the 325 all-zeros state or the all-ones state by 326 327 bv1 = BitVector(size = 25) 328 ... 329 ... 330 bv1.reset(1) 331 ... 332 ... 333 bv1.reset(0) 334 335 The first call to reset() will set all the bits of bv1 to 1's and 336 the second call all the bits to 0's. What the method accomplishes 337 can be thought of as in-place resetting of the bits. The method 338 does not return anything. 339 340 341@title 342LOGICAL OPERATIONS ON BIT VECTORS: 343 344 @tag8 345 (8) Given two bit vectors bv1 and bv2, you can perform bitwise 346 logical operations on them by 347 348 result_bv = bv1 ^ bv2 # for bitwise XOR 349 result_bv = bv1 & bv2 # for bitwise AND 350 result_bv = bv1 | bv2 # for bitwise OR 351 result_bv = ~bv1 # for bitwise negation 352 353 These are made possible by implementing the __xor__, __and__, 354 __or__, and __invert__ methods, respectively. 355 356 357@title 358COMPARING BIT VECTORS: 359 360 @tag9 361 (9) Given two bit vectors bv1 and bv2, you can carry out the following 362 comparisons that return Boolean values: 363 364 bv1 == bv2 365 bv1 != bv2 366 bv1 < bv2 367 bv1 <= bv2 368 bv1 > bv2 369 bv1 >= bv2 370 371 The equalities and inequalities are determined by the integer 372 values associated with the bit vectors. These operator 373 overloadings are made possible by providing implementation code for 374 __eq__, __ne__, __lt__, __le__, __gt__, and __ge__, respectively. 375 376 377@title 378OTHER SUPPORTED OPERATIONS: 379 380 @tag10 381 (10) permute() 382 unpermute() 383 384 You can permute and unpermute bit vectors: 385 386 bv_permuted = bv.permute(permutation_list) 387 388 bv_unpermuted = bv.unpermute(permutation_list) 389 390 391 Both these methods return new bitvector objects. Permuting a 392 bitvector means that you select its bits in the sequence specified 393 by the argument permutation_list. Calling unpermute() with the same 394 argument permutation_list restores the sequence of bits to what it 395 was in the original bitvector. 396 397 @tag11 398 (11) circular shifts 399 400 Left and right circular rotations can be carried out by 401 402 bitvec << N 403 404 bitvec >> N 405 406 for circular rotations to the left and to the right by N bit 407 positions. These operator overloadings are made possible by 408 implementing the __lshift__ and __rshift__ methods, respectively. 409 Note that both these operators return the bitvector on which they 410 are invoked. This allows for a chained invocation of these two 411 operators. 412 413 @tag12 414 (12) shift_left() 415 shift_right() 416 417 If you want to shift in-place a bitvector non-circularly: 418 419 bitvec = BitVector(bitstring = '10010000') 420 bitvec.shift_left(3) # 10000000 421 bitvec.shift_right(3) # 00010000 422 423 As a bitvector is shifted non-circularly to the left or to the 424 right, the exposed bit positions at the opposite end are filled 425 with zeros. These two methods return the bitvector object on which 426 they are invoked. This is to allow for chained invocations of 427 these methods. 428 429 @tag13 430 (13) divide_into_two() 431 432 A bitvector containing an even number of bits can be divided into 433 two equal parts by 434 435 [left_half, right_half] = bitvec.divide_into_two() 436 437 where left_half and right_half hold references to the two returned 438 bitvectors. The method throws an exception if called on a 439 bitvector with an odd number of bits. 440 441 @tag14 442 (14) int_val() 443 444 You can find the integer value of a bitvector object by 445 446 bitvec.int_val() 447 448 or by 449 450 int(bitvec) 451 452 As you expect, a call to int_val() returns an integer value. 453 454 @tag15 455 (15) string representation 456 457 You can convert a bitvector into its string representation by 458 459 str(bitvec) 460 461 @tag16 462 (16) concatenation 463 464 Because __add__ is supplied, you can always join two bitvectors by 465 466 bitvec3 = bitvec1 + bitvec2 467 468 bitvec3 is a new bitvector object that contains all the bits of 469 bitvec1 followed by all the bits of bitvec2. 470 471 Starting with version 3.4.9, the module includes an implementation 472 for __iadd__ that allows for the following compound assignment to 473 be carried out efficiently: 474 475 bitvec1 += bitvec2 476 477 @tag17 478 (17) length() 479 480 You can find the length of a bitvector by 481 482 len = bitvec.length() 483 484 @tag18 485 (18) deep_copy() 486 487 You can make a deep copy of a bitvector by 488 489 bitvec_copy = bitvec.deep_copy() 490 491 Subsequently, any alterations to either of the bitvector objects 492 bitvec and bitvec_copy will not affect the other. 493 494 @tag19 495 (19) read_bits_from_file() 496 497 As mentioned earlier, you can construct bitvectors directly from 498 the bits in a disk file through the following calls. As you can 499 see, this requires two steps: First you make a call as illustrated 500 by the first statement below. The purpose of this call is to 501 create a file object that is associated with the variable bv. 502 Subsequent calls to read_bits_from_file(n) on this variable return 503 a bitvector for each block of n bits thus read. The 504 read_bits_from_file() throws an exception if the argument n is not 505 a multiple of 8. 506 507 bv = BitVector(filename = 'somefile') 508 bv1 = bv.read_bits_from_file(64) 509 bv2 = bv.read_bits_from_file(64) 510 ... 511 ... 512 bv.close_file_object() 513 514 When reading a file as shown above, you can test the attribute 515 more_to_read of the bitvector object in order to find out if there 516 is more to read in the file. The while loop shown below reads all 517 of a file in 64-bit blocks: 518 519 bv = BitVector( filename = 'testinput4.txt' ) 520 print("Here are all the bits read from the file:") 521 while (bv.more_to_read): 522 bv_read = bv.read_bits_from_file( 64 ) 523 print(bv_read) 524 bv.close_file_object() 525 526 The size of the last bitvector constructed from a file corresponds 527 to how many bytes remain unread in the file at that point. It is 528 your responsibility to zero-pad the last bitvector appropriately 529 if, say, you are doing block encryption of the whole file. 530 531 @tag20 532 (20) write_to_file() 533 534 You can write a bit vector directly to a file by calling 535 write_to_file(), as illustrated by the following example that reads 536 one bitvector from a file and then writes it to another file: 537 538 bv = BitVector(filename = 'input.txt') 539 bv1 = bv.read_bits_from_file(64) 540 print(bv1) 541 FILEOUT = open('output.bits', 'wb') 542 bv1.write_to_file(FILEOUT) 543 FILEOUT.close() 544 bv = BitVector(filename = 'output.bits') 545 bv2 = bv.read_bits_from_file(64) 546 print(bv2) 547 548 The method write_to_file() throws an exception if the size of the 549 bitvector on which the method is invoked is not a multiple of 8. 550 This method does not return anything. 551 552 IMPORTANT FOR WINDOWS USERS: When writing an internally generated 553 bit vector out to a disk file, it is important to open 554 the file in the binary mode as shown. Otherwise, the 555 bit pattern 00001010 ('\\n') in your bitstring will be 556 written out as 0000110100001010 ('\\r\\n'), which is 557 the linebreak on Windows machines. 558 559 @tag21 560 (21) write_bits_to_stream_object() 561 562 You can also write a bitvector directly to a stream object, as 563 illustrated by: 564 565 fp_write = io.StringIO() 566 bitvec.write_bits_to_stream_object(fp_write) 567 print(fp_write.getvalue()) 568 569 This method does not return anything. 570 571 @tag22 572 (22) pad_from_left() 573 pad_from_right() 574 575 You can pad a bitvector from the left or from the right with a 576 designated number of zeros: 577 578 bitvec.pad_from_left(n) 579 580 bitvec.pad_from_right(n) 581 582 These two methods return the bitvector object on which they are 583 invoked. So you can think of these two calls as carrying out 584 in-place extensions of the bitvector (although, under the hood, the 585 extensions are carried out by giving new longer _vector attributes 586 to the bitvector objects). 587 588 @tag23 589 (23) if x in y: 590 591 You can test if a bit vector x is contained in another bit vector y 592 by using the syntax 'if x in y'. This is made possible by the 593 override definition for the special __contains__ method. 594 595 @tag24 596 (24) set_value() 597 598 You can call set_value() to change the bit pattern associated with 599 a previously constructed bitvector object: 600 601 bv = BitVector(intVal = 7, size =16) 602 print(bv) # 0000000000000111 603 bv.set_value(intVal = 45) 604 print(bv) # 101101 605 606 You can think of this method as carrying out an in-place resetting 607 of the bit array in a bitvector. The method does not return 608 anything. The allowable modes for changing the internally stored 609 bit pattern for a bitvector are the same as for the constructor. 610 611 @tag25 612 (25) count_bits() 613 614 You can count the number of bits set in a BitVector instance by 615 616 bv = BitVector(bitstring = '100111') 617 print(bv.count_bits()) # 4 618 619 A call to count_bits() returns an integer value that is equal to 620 the number of bits set in the bitvector. 621 622 @tag26 623 (26) count_bits_sparse() 624 625 For folks who use bit vectors with millions of bits in them but 626 with only a few bits set, your bit counting will go much, much 627 faster if you call count_bits_sparse() instead of count_bits(): 628 However, for dense bitvectors, I expect count_bits() to work 629 faster. 630 631 # a BitVector with 2 million bits: 632 bv = BitVector(size = 2000000) 633 bv[345234] = 1 634 bv[233]=1 635 bv[243]=1 636 bv[18]=1 637 bv[785] =1 638 print(bv.count_bits_sparse()) # 5 639 640 A call to count_bits_sparse() returns an integer whose value is the 641 number of bits set in the bitvector. 642 643 @tag27 644 (27) jaccard_similarity() 645 jaccard_distance() 646 hamming_distance() 647 648 You can calculate the similarity and the distance between two 649 bitvectors using the Jaccard similarity coefficient and the Jaccard 650 distance. Also, you can calculate the Hamming distance between two 651 bitvectors: 652 653 bv1 = BitVector(bitstring = '11111111') 654 bv2 = BitVector(bitstring = '00101011') 655 print bv1.jaccard_similarity(bv2) # 0.675 656 print(str(bv1.jaccard_distance(bv2))) # 0.375 657 print(str(bv1.hamming_distance(bv2))) # 4 658 659 For both jaccard_distance() and jaccard_similarity(), the value 660 returned is a floating point number between 0 and 1. The method 661 hamming_distance() returns a number that is equal to the number of 662 bit positions in which the two operand bitvectors disagree. 663 664 @tag28 665 (28) next_set_bit() 666 667 Starting from a given bit position, you can find the position index 668 of the next set bit by 669 670 bv = BitVector(bitstring = '00000000000001') 671 print(bv.next_set_bit(5)) # 13 672 673 In this example, we are asking next_set_bit() to return the index 674 of the bit that is set after the bit position that is indexed 5. If 675 no next set bit is found, the method returns -1. A call to 676 next_set_bit() always returns a number. 677 678 @tag29 679 (29) rank_of_bit_set_at_index() 680 681 You can measure the "rank" of a bit that is set at a given 682 position. Rank is the number of bits that are set up to the 683 position of the bit you are interested in. 684 685 bv = BitVector(bitstring = '01010101011100') 686 print(bv.rank_of_bit_set_at_index(10)) # 6 687 688 The value 6 returned by this call to rank_of_bit_set_at_index() is 689 the number of bits set up to the position indexed 10 (including 690 that position). This method throws an exception if there is no bit 691 set at the argument position. Otherwise, it returns the rank as a 692 number. 693 694 @tag30 695 (30) is_power_of_2() 696 is_power_of_2_sparse() 697 698 You can test whether the integer value of a bit vector is a power 699 of two. The sparse version of this method will work much faster 700 for very long bit vectors. However, the regular version may work 701 faster for dense bit vectors. 702 703 bv = BitVector(bitstring = '10000000001110') 704 print(bv.is_power_of_2()) 705 print(bv.is_power_of_2_sparse()) 706 707 Both these predicates return 1 for true and 0 for false. 708 709 @tag31 710 (31) reverse() 711 712 Given a bit vector, you can construct a bit vector with all the 713 bits reversed, in the sense that what was left to right before now 714 becomes right to left. 715 716 bv = BitVector(bitstring = '0001100000000000001') 717 print(str(bv.reverse())) 718 719 A call to reverse() returns a new bitvector object whose bits are 720 in reverse order in relation to the bits in the bitvector on which 721 the method is invoked. 722 723 @tag32 724 (32) gcd() 725 726 You can find the greatest common divisor of two bit vectors: 727 728 bv1 = BitVector(bitstring = '01100110') # int val: 102 729 bv2 = BitVector(bitstring = '011010') # int val: 26 730 bv = bv1.gcd(bv2) 731 print(int(bv)) # 2 732 733 The result returned by gcd() is a bitvector object. 734 735 @tag33 736 (33) multiplicative_inverse() 737 738 This method calculates the multiplicative inverse using normal 739 integer arithmetic. [For such inverses in a Galois Field GF(2^n), 740 use the method gf_MI().] 741 742 bv_modulus = BitVector(intVal = 32) 743 bv = BitVector(intVal = 17) 744 bv_result = bv.multiplicative_inverse( bv_modulus ) 745 if bv_result is not None: 746 print(str(int(bv_result))) # 17 747 else: print "No multiplicative inverse in this case" 748 749 What this example says is that the multiplicative inverse of 17 750 modulo 32 is 17. That is because 17 times 17 modulo 32 equals 1. 751 When using this method, you must test the returned value for 752 None. If the returned value is None, that means that the number 753 corresponding to the bitvector on which the method is invoked does 754 not possess a multiplicative-inverse with respect to the modulus. 755 When the multiplicative inverse exists, the result returned by 756 calling multiplicative_inverse() is a bitvector object. 757 758 @tag34 759 (34) gf_MI() 760 761 To calculate the multiplicative inverse of a bit vector in the 762 Galois Field GF(2^n) with respect to a modulus polynomial, call 763 gf_MI() as follows: 764 765 modulus = BitVector(bitstring = '100011011') 766 n = 8 767 a = BitVector(bitstring = '00110011') 768 multi_inverse = a.gf_MI(modulus, n) 769 print multi_inverse # 01101100 770 771 A call to gf_MI() returns a bitvector object. 772 773 @tag35 774 (35) gf_multiply() 775 776 If you just want to multiply two bit patterns in GF(2): 777 778 a = BitVector(bitstring='0110001') 779 b = BitVector(bitstring='0110') 780 c = a.gf_multiply(b) 781 print(c) # 00010100110 782 783 As you would expect, in general, the bitvector returned by this 784 method is longer than the two operand bitvectors. A call to 785 gf_multiply() returns a bitvector object. 786 787 @tag36 788 (36) gf_multiply_modular() 789 790 If you want to carry out modular multiplications in the Galois 791 Field GF(2^n): 792 793 modulus = BitVector(bitstring='100011011') # AES modulus 794 n = 8 795 a = BitVector(bitstring='0110001') 796 b = BitVector(bitstring='0110') 797 c = a.gf_multiply_modular(b, modulus, n) 798 print(c) # 10100110 799 800 The call to gf_multiply_modular() returns the product of the two 801 bitvectors a and b modulo the bitvector modulus in GF(2^8). A call 802 to gf_multiply_modular() returns is a bitvector object. 803 804 @tag37 805 (37) gf_divide_by_modulus() 806 807 To divide a bitvector by a modulus bitvector in the Galois Field 808 GF(2^n): 809 810 mod = BitVector(bitstring='100011011') # AES modulus 811 n = 8 812 a = BitVector(bitstring='11100010110001') 813 quotient, remainder = a.gf_divide_by_modulus(mod, n) 814 print(quotient) # 00000000111010 815 print(remainder) # 10001111 816 817 What this example illustrates is dividing the bitvector a by the 818 modulus bitvector mod. For a more general division of one 819 bitvector a by another bitvector b, you would multiply a by the MI 820 of b, where MI stands for "multiplicative inverse" as returned by 821 the call to the method gf_MI(). A call to gf_divide_by_modulus() 822 returns two bitvectors, one for the quotient and the other for the 823 remainder. 824 825 @tag38 826 (38) runs() 827 828 You can extract from a bitvector the runs of 1's and 0's in the 829 vector as follows: 830 831 bv = BitVector(bitlist = (1,1, 1, 0, 0, 1)) 832 print(str(bv.runs())) # ['111', '00', '1'] 833 834 The object returned by runs() is a list of strings, with each 835 element of this list being a string of 1's and 0's. 836 837 @tag39 838 (39) gen_random_bits() 839 840 You can generate a bitvector with random bits with the bits 841 spanning a specified width. For example, if you wanted a random 842 bit vector to fully span 32 bits, you would say 843 844 bv = BitVector(intVal = 0) 845 bv = bv.gen_random_bits(32) 846 print(bv) # 11011010001111011010011111000101 847 848 As you would expect, gen_random_bits() returns a bitvector object. 849 850 @tag40 851 (40) test_for_primality() 852 853 You can test whether a randomly generated bit vector is a prime 854 number using the probabilistic Miller-Rabin test 855 856 bv = BitVector(intVal = 0) 857 bv = bv.gen_random_bits(32) 858 check = bv.test_for_primality() 859 print(check) 860 861 The test_for_primality() methods returns a floating point number 862 close to 1 for prime numbers and 0 for composite numbers. The 863 actual value returned for a prime is the probability associated 864 with the determination of its primality. 865 866 @tag41 867 (41) get_bitvector_in_ascii() 868 869 You can call get_bitvector_in_ascii() to directly convert a bit 870 vector into a text string (this is a useful thing to do only if the 871 length of the vector is an integral multiple of 8 and every byte in 872 your bitvector has a print representation): 873 874 bv = BitVector(textstring = "hello") 875 print(bv) # 0110100001100101011011000110110001101111 876 mytext = bv3.get_bitvector_in_ascii() 877 print mytext # hello 878 879 This method is useful when you encrypt text through its bitvector 880 representation. After decryption, you can recover the text using 881 the call shown here. A call to get_bitvector_in_ascii() returns a 882 string. 883 884 @tag42 885 (42) get_bitvector_in_hex() 886 887 You can directly convert a bit vector into a hex string (this is a 888 useful thing to do only if the length of the vector is an integral 889 multiple of 4): 890 891 bv4 = BitVector(hexstring = "68656c6c6f") 892 print(bv4) # 0110100001100101011011000110110001101111 893 myhexstring = bv4.get_bitvector_in_hex() 894 print myhexstring # 68656c6c6 895 896 This method throws an exception if the size of the bitvector is not 897 a multiple of 4. The method returns a string. 898 899 @tag43 900 (43) close_file_object() 901 902 When you construct bitvectors by block scanning a disk file, after 903 you are done, you can call this method to close the file object 904 that was created to read the file: 905 906 bv = BitVector(filename = 'somefile') 907 bv1 = bv.read_bits_from_file(64) 908 bv.close_file_object() 909 910 The constructor call in the first statement creates a file object 911 for reading the bits. It is this file object that is closed when 912 you call close_file_object(). 913 914 @tag44 915 (44) min_canonical() 916 917 This methods returns the canonical form of a bitvector, which 918 corresponds to a circularly rotated version of the bitvector with the 919 largest number of leading zeros. 920 921 bv = BitVector(intVal = 5678, size = 14) 922 min_bv = bv.min_canonical() 923 print(bv) # 01011000101110 924 print(min_bv) # 00010111001011 925 print(int(min_bv)) # 1483 926 927 928@title 929CHANGE HISTORY: 930 931 Version 3.4.6 932 933 Version 3.4.6 fixes what was hopefully the last remaining bug in using 934 negative index values for slice assignments. 935 936 Version 3.4.5 937 938 Version 3.4.5 fixes an important bug in the code meant for slice 939 assignment. This bug made itself evident when using negative 940 start/stop values for slice assignment. Additionally, this version 941 includes a new method named min_canonical() that returns a circularly 942 rotated form of a BitVector with the maximum number of leading zeros. 943 Such canonical forms of bit patterns are used in the "Local Binary 944 Pattern" algorithm for characterizing image textures. 945 946 Version 3.4.4 947 948 This version fixes the behavior of the module for the edge case of an 949 empty BitVector instance. (An empty BitVector has no bits at all.) 950 Previously, invoking the count_bits() and runs() methods on an empty 951 BitVector instance produced results that were inconsistent with those 952 from regular instances. 953 954 Version 3.4.3 955 956 This is a quick release that fixes the problem with the relative imports 957 in the previous version. Python3 does not like relative imports. 958 959 Version 3.4.2 960 961 Unfortunately, the packaging of the previous version was not exporting 962 the module metadata. That problem has been fixed in this version. 963 964 Version 3.4.1 965 966 This version fixes two module packaging errors in the previous version. 967 One error related to the specification of the "packages" keyword in 968 setup.py and the other error related to not updating the manifest with 969 the HTML documentation file for the module. 970 971 Version 3.4 972 973 This version includes a bug fix and significant improvements to the 974 documentation. The new documentation is clearer about what is returned 975 by each method and, when a method throws an exception, that fact is 976 stated in the documentation associated with the method. The condition 977 that triggers the exception is also stated. The bug fix was in the 978 test_for_primality() method. If invoked for testing the primality of 979 1, it would get trapped in an infinite loop. Additionally, when 980 constructing a bitvector from a hex string, this version allows the hex 981 characters to be in either case. Previously, only lowercase hex 982 characters were acceptable. Finally, I have changed the names of a 983 couple of methods to better reflect their function. The old names 984 would, however, continue to work for backward compatibility. 985 986 Version 3.3.2: 987 988 This version fixes a bug in the constructor code for creating a bit 989 vector from a text string. The bug was triggered by character escapes 990 in such strings. 991 992 Version 3.3.1: 993 994 This is a minor upgrade to make the syntax of the API method 995 declarations more uniform. Previously, while most of the method names 996 used underscores to connect multiple words, some used camelcasing. Now 997 all use underscores. For backward compatibility, the old calls will 998 continue to work. 999 1000 Version 3.3: 1001 1002 This version includes: (1) One additional constructor mode that allows 1003 a bit vector to be constructed directly from the bytes type objects in 1004 the memory. (2) A bugfix in the slice function for the case when the 1005 upper and the lower bounds of the slice range are identical. (3) A 1006 bugfix for the next_set_bit() method. 1007 1008 Version 3.2: 1009 1010 This version includes support for constructing bit vectors directly 1011 from text strings and hex strings. This version also includes a safety 1012 check on the sizes of the two argument bit vectors when calculating 1013 Jaccard similarity between the two. 1014 1015 Version 3.1.1: 1016 1017 This version includes: (1) a fix to the module test code to account for 1018 how string input is handled in the io.StringIO class in Python 2.7; (2) 1019 some improvements to the documentation. 1020 1021 Version 3.1: 1022 1023 This version includes: (1) Correction for a documentation error; (2) 1024 Fix for a bug in slice assignment when one or both of the slice limits 1025 were left unspecified; (3) The non-circular bit shift methods now 1026 return self so that they can be chained; (4) A method for testing a 1027 bitvector for its primality; and (5) A method that uses Python's 1028 'random.getrandbits()' to generate a bitvector that can serve as 1029 candidate for primes whose bitfield size is specified. 1030 1031 Version 3.0: 1032 1033 This is a Python 3.x compliant version of the latest incarnation of the 1034 BitVector module. This version should work with both Python 2.x and 1035 Python 3.x. 1036 1037 Version 2.2: 1038 1039 Fixed a couple of bugs, the most important being in the bitvector 1040 initialization code for the cases when the user-specified value for 1041 size conflicts with the user-specified int value for the vector. 1042 Version 2.2 also includes a new method runs() that returns a list of 1043 strings of the consecutive runs of 1's and 0's in the bitvector. The 1044 implementation of the circular shift operators has also been improved 1045 in Version 2.2. This version allows for a chained invocation of these 1046 operators. Additionally, the circular shift operators now exhibit 1047 expected behavior if the user-specified shift value is negative. 1048 1049 Version 2.1: 1050 1051 Includes enhanced support for folks who use this class for computer 1052 security and cryptography work. You can now call on the methods of the 1053 BitVector class to do Galois Field GF(2^n) arithmetic on bit arrays. 1054 This should save the users of this class the bother of having to write 1055 their own routines for finding multiplicative inverses in GF(2^n) 1056 finite fields. 1057 1058 Version 2.0.1: 1059 1060 Fixed numerous typos and other errors in the documentation page for the 1061 module. The implementation code remains unchanged. 1062 1063 Version 2.0: 1064 1065 To address the needs of the folks who are using the BitVector class in 1066 data mining research, the new version of the class includes several 1067 additional methods. Since the bitvectors used by these folks can be 1068 extremely long, possibly involving millions of bits, the new version of 1069 the class includes a much faster method for counting the total number 1070 of set bits when a bitvector is sparse. [But note that this new bit 1071 counting method may perform poorly for dense bitvectors. So the old bit 1072 counting method has been retained.] Also for data mining folks, the 1073 new version of the class is provided with similarity and distance 1074 calculation metrics such as the Jaccard similarity coefficient, the 1075 Jaccard distance, and the Hamming distance. Again for the same folks, 1076 the class now also has a next_set_bit(from_index) method. Other 1077 enhancements to the class include methods for folks who do research in 1078 cryptography. Now you can directly calculate the greatest common 1079 divisor of two bitvectors, or find the multiplicative inverse of one 1080 bitvector modulo another bitvector. 1081 1082 Version 1.5.1: 1083 1084 Removed a bug from the implementation of the right circular shift 1085 operator. 1086 1087 Version 1.5: 1088 1089 This version should prove to be much more efficient for long 1090 bitvectors. Efficiency in BitVector construction when only its size is 1091 specified was achieved by eliminating calls to _setbit(). The 1092 application of logical operators to two BitVectors of equal length was 1093 also made efficient by eliminating calls to the padding function. 1094 Another feature of this version is the count_bits() method that returns 1095 the total number of bits set in a BitVector instance. Yet another 1096 feature of this version is the setValue() method that alters the bit 1097 pattern associated with a previously constructed BitVector. 1098 1099 Version 1.4.1: 1100 1101 The reset() method now returns 'self' to allow for cascaded invocation 1102 with the slicing operator. Also removed the discrepancy between the 1103 value of the __copyright__ variable in the module and the value of 1104 license variable in setup.py. 1105 1106 Version 1.4: 1107 1108 This version includes the following two upgrades: 1) code for slice 1109 assignment; and 2) A reset function to reinitialize a previously 1110 constructed BitVector. Additionally, the code was cleaned up with the 1111 help of pychecker. 1112 1113 Version 1.3.2: 1114 1115 Fixed a potentially misleading documentation issue for the Windows 1116 users of the BitVector class. If you are writing an internally 1117 generated BitVector to a disk file, you must open the file in the 1118 binary mode. If you don't, the bit patterns that correspond to line 1119 breaks will be misinterpreted. On a Windows machine in the text mode, 1120 the bit pattern 000001010 ('\\n') will be written out to the disk as 1121 0000110100001010 ('\\r\\n'). 1122 1123 Version 1.3.1: 1124 1125 Removed the inconsistency in the internal representation of bitvectors 1126 produced by logical bitwise operations vis-a-vis the bitvectors created 1127 by the constructor. Previously, the logical bitwise operations 1128 resulted in bitvectors that had their bits packed into lists of ints, 1129 as opposed to arrays of unsigned shorts. 1130 1131 Version 1.3: 1132 1133 (a) One more constructor mode included: When initializing a new 1134 bitvector with an integer value, you can now also specify a size for 1135 the bitvector. The constructor zero-pads the bitvector from the left 1136 with zeros. (b) The BitVector class now supports 'if x in y' syntax to 1137 test if the bit pattern 'x' is contained in the bit pattern 'y'. (c) 1138 Improved syntax to conform to well-established Python idioms. (d) What 1139 used to be a comment before the beginning of each method definition is 1140 now a docstring. 1141 1142 Version 1.2: 1143 1144 (a) One more constructor mode included: You can now construct a 1145 bitvector directly from a string of 1's and 0's. (b) The class now 1146 constructs a shortest possible bit vector from an integer value. So 1147 the bit vector for the integer value 0 is just one bit of value 0, and 1148 so on. (c) All the rich comparison operators are now overloaded. (d) 1149 The class now includes a new method 'intValue()' that returns the 1150 unsigned integer value of a bit vector. This can also be done through 1151 '__int__'. (e) The package now includes a unittest based framework for 1152 testing out an installation. This is in a separate directory called 1153 "TestBitVector". 1154 1155 Version 1.1.1: 1156 1157 The function that does block reads from a disk file now peeks ahead at 1158 the end of each block to see if there is anything remaining to be read 1159 in the file. If nothing remains, the more_to_read attribute of the 1160 BitVector object is set to False. This simplifies reading loops. This 1161 version also allows BitVectors of size 0 to be constructed 1162 1163 1164 Version 1.1: 1165 1166 I have changed the API significantly to provide more ways for 1167 constructing a bit vector. As a result, it is now necessary to supply 1168 a keyword argument to the constructor. 1169 1170 1171@title 1172INSTALLATION: 1173 1174 The BitVector class was packaged using setuptools. For installation, 1175 execute the following command-line in the source directory (this is the 1176 directory that contains the setup.py file after you have downloaded and 1177 uncompressed the tar archive): 1178 1179 sudo python3 setup.py install 1180 1181 On Linux distributions, this will install the module file at a location 1182 that looks like 1183 1184 /usr/local/lib/python3.8/dist-packages/ 1185 1186 If you do not have root access, you have the option of working directly 1187 off the directory in which you downloaded the software by simply 1188 placing the following statements at the top of your scripts that use 1189 the BitVector class 1190 1191 import sys 1192 sys.path.append( "pathname_to_BitVector_directory" ) 1193 1194 To uninstall the module, simply delete the source directory, locate 1195 where BitVector was installed with "locate BitVector" and delete those 1196 files. As mentioned above, the full pathname to the installed version 1197 is likely to look like /usr/local/lib/python3.8/dist-packages/BitVector* 1198 1199 If you want to carry out a non-standard install of BitVector, look up 1200 the on-line information on Disutils by pointing your browser to 1201 1202 http://docs.python.org/dist/dist.html 1203 1204 1205@title 1206HOW THE BIT VECTORS ARE STORED: 1207 1208 The bits of a bit vector are stored in 16-bit unsigned ints 1209 following Josiah Carlson's recommendation to that effect on the 1210 Pyrex mailing list. As you can see in the code for `__init__()', 1211 after resolving the argument with which the constructor is called, 1212 the very first thing the constructor does is to figure out how many 1213 of those 2-byte ints it needs for the bits (see how the value is 1214 assigned to the variable `two_byte_ints_needed' toward the end of 1215 `__init__()'). For example, if you wanted to store a 64-bit array, 1216 the variable 'two_byte_ints_needed' would be set to 4. (This does 1217 not mean that the size of a bit vector must be a multiple of 16. 1218 Any sized bit vectors can be constructed --- the constructor will 1219 choose the minimum number of two-byte ints needed.) Subsequently, 1220 the constructor acquires an array of zero-initialized 2-byte ints. 1221 The last thing that is done in the code for `__init__()' is to 1222 shift the bits into the array of two-byte ints. 1223 1224 As mentioned above, note that it is not necessary for the size of a 1225 bit vector to be a multiple of 16 even though we are using C's 1226 unsigned short as as a basic unit for storing the bit arrays. The 1227 class BitVector keeps track of the actual number of bits in the bit 1228 vector through the "size" instance variable. 1229 1230 Note that, except for one case, the constructor must be called with 1231 a single keyword argument, which determines how the bit vector will 1232 be constructed. The single exception to this rule is for the 1233 keyword argument `intVal' which can be used along with the `size' 1234 keyword argument. When `intVal' is used without the `size' option, 1235 the bit vector constructed for the integer is the shortest possible 1236 bit vector. On the other hand, when `size' is also specified, the 1237 bit vector is padded with zeroes from the left so that it has the 1238 specified size. The code for `__init__()' begins by making sure 1239 your constructor call only uses the acceptable keywords. The 1240 constraints on how many keywords can be used together in a 1241 constructor call are enforced when we process each keyword option 1242 separately in the rest of the code for `__init__()'. 1243 1244 The first keyword option processed by `__init__()' is for 1245 `filename'. When the constructor is called with the `filename' 1246 keyword, as in 1247 1248 bv = BitVector(filename = 'myfilename') 1249 1250 the call returns a bit vector on which you must subsequently invoke 1251 the `read_bits_from_file()' method to actually obtain a bit vector 1252 consisting of the bits that constitute the information stored in 1253 the file. 1254 1255 The next keyword option considered in `__init__()' is for `fp', 1256 which is for constructing a bit vector by reading off the bits from 1257 a file-like object, as in 1258 1259 x = "111100001111" 1260 fileobj = StringIO.StringIO( x ) 1261 bv = BitVector( fp = fileobj ) 1262 1263 The keyword option `intVal' considered next is for converting an 1264 integer into a bit vector through a constructor call like 1265 1266 bv = BitVector(intVal = 123456) 1267 1268 The bits stored in the bit vector thus created correspond to the 1269 big-endian binary representation of the integer argument provided 1270 through `intVal' (meaning that the most significant bit will be at 1271 the leftmost position in the bit vector.) THE BIT VECTOR 1272 CONSTRUCTED WITH THE ABOVE CALL IS THE SHORTEST POSSIBLE BIT VECTOR 1273 FOR THE INTEGER SUPPLIED. As a case in point, when `intVal' is set 1274 to 0, the bit vector consists of a single bit is 0 also. When 1275 constructing a bit vector with the `intVal' option, if you also 1276 want to impose a size condition on the bit vector, you can make a 1277 call like 1278 1279 bv = BitVector(intVal = 46, size = 16) 1280 1281 which returns a bit vector of the indicated size by padding the 1282 shortest possible vector for the `intVal' option with zeros from 1283 the left. 1284 1285 The next option processed by `__init_()' is for the `size' keyword 1286 when this keyword is used all by itself. If you want a bit vector 1287 of just 0's of whatever size, you make a call like 1288 1289 bv = BitVector(size = 61) 1290 1291 This returns a bit vector that will hold exactly 61 bits, all 1292 initialized to the zero value. 1293 1294 The next constructor keyword processed by `__init__()' is 1295 `bitstring'. This is to allow a bit vector to be constructed 1296 directly from a bit string as in 1297 1298 bv = BitVector(bitstring = '00110011111') 1299 1300 The keyword considered next is `bitlist' which allows a bit vector 1301 to be constructed from a list or a tuple of individual bits, as in 1302 1303 bv = BitVector(bitlist = (1, 0, 1, 1, 0, 0, 1)) 1304 1305 The last two keyword options considered in `__init__()' are for 1306 keywords `textstring' and `hexstring'. If you want to construct a 1307 bitvector directly from a text string, you call 1308 1309 bv = BitVector(textstring = "hello") 1310 1311 The bit vector created corresponds to the ASCII encodings of the 1312 individual characters in the text string. 1313 1314 And if you want to do the same with a hex string, you call 1315 1316 bv = BitVector(hexstring = "68656c6c6f") 1317 1318 Now, as you would expect, the bits in the bit vector will 1319 correspond directly to the hex digits in your hex string. 1320 1321 1322@title 1323ACKNOWLEDGMENTS: 1324 1325 The author is grateful to Oleg Broytmann for suggesting many 1326 improvements that were incorporated in Version 1.1 of this package. 1327 The author would like to thank Kurt Schwehr whose email resulted in 1328 the creation of Version 1.2. Kurt also caught an error in my 1329 earlier version of 'setup.py' and suggested a unittest based 1330 approach to the testing of the package. Kurt also supplied the 1331 Makefile that is included in this distribution. The author would 1332 also like to thank all (Scott Daniels, Blair Houghton, and Steven 1333 D'Aprano) for their responses to my comp.lang.python query 1334 concerning how to make a Python input stream peekable. This 1335 feature was included in Version 1.1.1. 1336 1337 With regard to the changes incorporated in Version 1.3, thanks are 1338 owed to Kurt Schwehr and Gabriel Ricardo for bringing to my 1339 attention the bug related to the intVal method of initializing a 1340 bit vector when the value of intVal exceeded sys.maxint. This 1341 problem is fixed in Version 1.3. Version 1.3 also includes many 1342 other improvements that make the syntax better conform to the 1343 standard idioms of Python. These changes and the addition of the 1344 new constructor mode (that allows a bit vector of a given size to 1345 be constructed from an integer value) are also owing to Kurt's 1346 suggestions. 1347 1348 With regard to the changes incorporated in Version 1.3.1, I would 1349 like to thank Michael Haggerty for noticing that the bitwise 1350 logical operators resulted in bit vectors that had their bits 1351 packed into lists of ints, as opposed to arrays of unsigned shorts. 1352 This inconsistency in representation has been removed in version 1353 1.3.1. Michael has also suggested that since BitVector is mutable, 1354 I should be overloading __iand__(), __ior__(), etc., for in-place 1355 modifications of bit vectors. Michael certainly makes a good 1356 point. But I am afraid that this change will break the code for the 1357 existing users of the BitVector class. 1358 1359 I thank Mathieu Roy for bringing to my attention the problem with 1360 writing bitstrings out to a disk files on Windows machines. This 1361 turned out to be a problem more with the documentation than with 1362 the BitVector class itself. On a Windows machine, it is 1363 particularly important that a file you are writing a bitstring into 1364 be opened in binary mode since otherwise the bit pattern 00001010 1365 ('\\n') will be written out as 0000110100001010 ('\\r\\n'). This 1366 documentation fix resulted in Version 1.3.2. 1367 1368 With regard to Version 1.4, the suggestions/bug reports made by John 1369 Kominek, Bob Morse, and Steve Ward contributed to this version. I wish 1370 to thank all three. John wanted me to equip the class with a reset() 1371 method so that a previously constructed class could be reset to either 1372 all 0's or all 1's. Bob spotted loose local variables in the 1373 implementation --- presumably left over from a debugging phase of the 1374 code. Bob recommended that I clean up the code with pychecker. That has 1375 been done. Steve noticed that slice assignment was not working. It 1376 should work now. 1377 1378 Version 1.4.1 was prompted by John Kominek suggesting that if reset() 1379 returned self, then the slice operation could be combined with the reset 1380 operation. Thanks John! Another reason for 1.4.1 was to remove the 1381 discrepancy between the value of the __copyright__ variable in the module 1382 and the value of license variable in setup.py. This discrepancy was 1383 brought to my attention by David Eyk. Thanks David! 1384 1385 Version 1.5 has benefited greatly by the suggestions made by Ryan Cox. 1386 By examining the BitVector execution with cProfile, Ryan observed that my 1387 implementation was making unnecessary method calls to _setbit() when just 1388 the size option is used for constructing a BitVector instance. Since 1389 Python allocates cleaned up memory, it is unnecessary to set the 1390 individual bits of a vector if it is known in advance that they are all 1391 zero. Ryan made a similar observation for the logical operations applied 1392 to two BitVector instances of equal length. He noticed that I was making 1393 unnecessary calls to _resize_pad_from_left() for the case of equal 1394 arguments to logical operations. Ryan also recommended that I include a 1395 method that returns the total number of bits set in a BitVector instance. 1396 The new method count_bits() does exactly that. Thanks Ryan for all your 1397 suggestions. Version 1.5 also includes the method setValue() that allows 1398 the internally stored bit pattern associated with a previously 1399 constructed BitVector to be changed. A need for this method was 1400 expressed by Aleix Conchillo. Thanks Aleix. 1401 1402 Version 1.5.1 is a quick release to fix a bug in the right circular shift 1403 operator. This bug was discovered by Jasper Spaans. Thanks very much 1404 Jasper. 1405 1406 Version 2.0 was prompted mostly by the needs of the folks who play with 1407 very long bit vectors that may contain millions of bits. Such bit 1408 vectors are encountered in data mining research and development. Towards 1409 that end, among the new methods in Version 2.0, the count_bits_sparse() 1410 was provided by Rhiannon Weaver. She says when a bit vector contains 1411 over 2 million bits and, say, only five bits are set, her method is 1412 faster than the older count_bits() method by a factor of roughly 18. 1413 Thanks Rhiannon. [The logic of the new implementation works best for very 1414 sparse bit vectors. For very dense vectors, it may perform more slowly 1415 than the regular count_bits() method. For that reason, I have retained 1416 the original method.] Rhiannon's implementation is based on what has 1417 been called the Kernighan way at the web site 1418 http://graphics.stanford.edu/~seander/bithacks.html. Version 2.0 also 1419 includes a few additional functions posted at this web site for 1420 extracting information from bit fields. Also included in this new 1421 version is the next_set_bit() method supplied by Jason Allum. I believe 1422 this method is also useful for data mining folks. Thanks Jason. 1423 Additional methods in Version 2.0 include the similarity and the distance 1424 metrics for comparing two bit vectors, method for finding the greatest 1425 common divisor of two bit vectors, and a method that determines the 1426 multiplicative inverse of a bit vector vis-a-vis a modulus. The last two 1427 methods should prove useful to folks in cryptography. 1428 1429 With regard to Version 2.2, I would like to thank Ethan Price for 1430 bringing to my attention a bug in the BitVector initialization code for 1431 the case when both the int value and the size are user- specified and the 1432 two values happen to be inconsistent. Ethan also discovered that the 1433 circular shift operators did not respond to negative values for the 1434 shift. These and some other shortcomings discovered by Ethan have been 1435 fixed in Version 2.2. Thanks Ethan! 1436 1437 For two of the changes included in Version 3.1, I'd like to thank Libor 1438 Wagner and C. David Stahl. Libor discovered a documentation error in the 1439 listing of the 'count_bits_sparse()' method and David discovered a bug in 1440 slice assignment when one or both of the slice limits are left 1441 unspecified. These errors in Version 3.0 have been fixed in Version 3.1. 1442 1443 Version 3.1.1 was triggered by two emails, one from John-Mark Gurney and 1444 the other from Nessim Kisserli, both related to the issue of compilation 1445 of the module. John-Mark mentioned that since this module did not work 1446 with Python 2.4.3, the statement that the module was appropriate for all 1447 Python 2.x was not correct, and Nessim reported that he had run into a 1448 problem with the compilation of the test portion of the code with Python 1449 2.7 where a string of 1's and 0's is supplied to io.StringIO() for the 1450 construction of a memory file. Both these issues have been resolved in 1451 3.1.1. 1452 1453 Version 3.2 was triggered by my own desire to include additional 1454 functionality in the module to make it more useful for experimenting with 1455 hash functions. While I was at it, I also included in the module a 1456 couple of safety checks on the lengths of the two argument bit vectors 1457 when computing their Jaccard similarity. I could see the need for these 1458 checks after receiving an email from Patrick Nisch about the error 1459 messages he was receiving during Jaccard similarity calculations. Thanks 1460 Patrick! 1461 1462 Version 3.3 includes a correction by John Gleeson for a bug in the 1463 next_set_bit() method. Thanks, John! 1464 1465 Version 3.3.1 resulted from Thor Smith observing that my naming 1466 convention for the API methods was not uniform. Whereas most used the 1467 underscore for joining multiple words, some were based on 1468 camelcasing. Thanks, Thor! 1469 1470 Version 3.3.2 was in response to a bug discovery by Juan Corredor. The 1471 bug related to constructing bit vectors from text strings that include 1472 character escapes. Thanks, Juan! 1473 1474 Version 3.4.1 was triggered by an email from Kurt Schwehr who spotted an 1475 error in the setup.py of Version 3.4. Thanks, Kurt! 1476 1477 Version 3.4.4 resulted from an email from Adam Foltzer who noticed that 1478 an empty BitVector instance did not behave like a regular instance. 1479 Previously, the module simply aborted if you called count_bits() on an 1480 empty BitVector instance and threw an exception if you called runs() on 1481 such an instance. The new version returns 0 for the former case and an 1482 empty list for the latter. I should also mention that the new 1483 implementation of count_bits() was first suggested by Kurt Schwehr in an 1484 email a couple of months back. My original implementation called on the 1485 reduce() method of the functools module to do the job. Thanks to both 1486 Adam and Kurt! 1487 1488 Version 3.4.5 was triggered by an email from William Pietri who 1489 discovered a bug in the code for making slice assignments. While I was 1490 at it, thought I should also include in the module a new method that I 1491 have named min_canonical(). This method returns the "canonical" form of a 1492 bit pattern, which corresponds to a circularly shifted version of the 1493 pattern with the largest number of leading zeros. Such canonical forms 1494 of bit patterns play important role in image texture characterization 1495 algorithms. If you are curious as to how, see my "Texture and Color 1496 Tutorial" at 1497 https://engineering.purdue.edu/kak/Tutorials/TextureAndColor.pdf 1498 1499 The two bug fixes made in Version 3.4.8 were triggered by emails from 1500 Aurélien Buhrig and Bob Janssen. Aurélien found an error in one of my 1501 statements in the implementation of __setitem__(). And Bob reported that 1502 my implementation of the method write_bits_to_stream_object() was not 1503 working with Version 3.5.3 of Python. Version 3.4.8 of BitVector 1504 incorporates the fix provided by Aurélien and contains special-case logic 1505 for writing to stream objects in Version 3.5.3 of Python3. 1506 1507 The two changes made in Version 3.4.9 were supplied by Elliot James 1508 Edmunds. These include a new implementation for __iadd__, and a 1509 significantly faster implementation for __add__. Thanks, Elliot! 1510 1511 Version 3.5.0 was prompted by emails from Aaron Holtzman at Purdue and 1512 Chunlei Li from the University of Bergen, Norway. Aaron brought to my 1513 attention the fact that "array.fromstring" and "array.tostring" were 1514 going to be dropped in Python 3.9. And Chunlei Li noticed that that the 1515 version 3.4.9 implementations for __add__ and __iadd__ could occasionally 1516 give wrong answers. 1517 1518 1519@title 1520ABOUT THE AUTHOR: 1521 1522 The author, Avinash Kak, finished not too long ago his 17 years long 1523 Objects Trilogy project with the publication of the book "Designing with 1524 Objects" by John-Wiley. If interested, check out his web page at Purdue 1525 to find out what this long project was all about. You might like 1526 "Designing with Objects" especially if you enjoyed reading Harry Potter 1527 as a kid (or even as an adult, for that matter). 1528 1529 1530@title 1531SOME EXAMPLE CODE: 1532 1533 #!/usr/bin/env python 1534 import BitVector 1535 1536 # Construct a bit vector from a list or tuple of bits: 1537 bv = BitVector.BitVector( bitlist = (1, 0, 0, 1) ) 1538 print(bv) # 1001 1539 1540 # Construct a bit vector from an integer: 1541 bv = BitVector.BitVector( intVal = 5678 ) 1542 print(bv) # 0001011000101110 1543 1544 # Construct a bit vector of a given size from a given 1545 # integer: 1546 bv = BitVector( intVal = 45, size = 16 ) 1547 print(bv) # 0000000000101101 1548 1549 # Construct a zero-initialized bit vector of a given size: 1550 bv = BitVector.BitVector( size = 5 ) 1551 print(bv) # 00000 1552 1553 # Construct a bit vector from a bit string: 1554 bv = BitVector.BitVector( bitstring = '110001' ) 1555 print(bv[0], bv[1], bv[2], bv[3], bv[4], bv[5]) # 1 1 0 0 0 1 1556 print(bv[-1], bv[-2], bv[-3], bv[-4], bv[-5], bv[-6]) # 1 0 0 0 1 1 1557 1558 # Construct a bit vector from a file like object: 1559 import io 1560 x = "111100001111" 1561 fp_read = io.StringIO( x ) 1562 bv = BitVector( fp = fp_read ) 1563 print(bv) # 111100001111 1564 1565 # Experiments with bitwise logical operations: 1566 bv3 = bv1 | bv2 1567 bv3 = bv1 & bv2 1568 bv3 = bv1 ^ bv2 1569 bv6 = ~bv5 1570 1571 # Find the length of a bit vector 1572 print( str(len( bitvec ) ) ) 1573 1574 # Find the integer value of a bit vector 1575 print( bitvec.intValue() ) 1576 1577 # Open a file for reading bit vectors from 1578 bv = BitVector.BitVector( filename = 'TestBitVector/testinput1.txt' ) 1579 print( bv ) # nothing yet 1580 bv1 = bv.read_bits_from_file(64) 1581 print( bv1 ) # first 64 bits from the file 1582 1583 # Divide a bit vector into two equal sub-vectors: 1584 [bv1, bv2] = bitvec.divide_into_two() 1585 1586 # Permute and Un-Permute a bit vector: 1587 bv2 = bitvec.permute( permutation_list ) 1588 bv2 = bitvec.unpermute( permutation_list ) 1589 1590 # Try circular shifts to the left and to the right 1591 bitvec << 7 1592 bitvec >> 7 1593 1594 # Try 'if x in y' syntax for bit vectors: 1595 bv1 = BitVector( bitstring = '0011001100' ) 1596 bv2 = BitVector( bitstring = '110011' ) 1597 if bv2 in bv1: 1598 print( "%s is in %s" % (bv2, bv1) ) 1599 else: 1600 print( "%s is not in %s" % (bv2, bv1) ) 1601 1602 ..... 1603 ..... 1604 1605 (For a more complete working example, see the 1606 example code in the BitVectorDemo.py file in the 1607 Examples sub-directory.) 1608 1609@endofdocs 1610''' 1611 1612 1613import array 1614import operator 1615import sys 1616 1617_hexdict = { '0' : '0000', '1' : '0001', '2' : '0010', '3' : '0011', 1618 '4' : '0100', '5' : '0101', '6' : '0110', '7' : '0111', 1619 '8' : '1000', '9' : '1001', 'a' : '1010', 'b' : '1011', 1620 'c' : '1100', 'd' : '1101', 'e' : '1110', 'f' : '1111' } 1621 1622def _readblock(blocksize, bitvector): 1623 ''' 1624 If this function succeeds in reading all blocksize bits, it uses the 1625 tell-read-seek mechanism to peek ahead to see if there is anything more to be 1626 read in the file. If there is nothing further to be read, it sets the more_to_read 1627 attribute of the BitVector instance to False. Obviously, this can only be done for 1628 seekable streams such as those connected with disk files. According to Blair 1629 Houghton, a similar feature could presumably be implemented for socket streams by 1630 using recv() or recvfrom() if you set the flags argument to MSG_PEEK. 1631 ''' 1632 global _hexdict 1633 bitstring = '' 1634 i = 0 1635 while ( i < blocksize / 8 ): 1636 i += 1 1637 byte = bitvector.FILEIN.read(1) 1638 if byte == b'': 1639 if len(bitstring) < blocksize: 1640 bitvector.more_to_read = False 1641 return bitstring 1642 if sys.version_info[0] == 3: 1643 hexvalue = '%02x' % byte[0] 1644 else: 1645 hexvalue = hex( ord( byte ) ) 1646 hexvalue = hexvalue[2:] 1647 if len( hexvalue ) == 1: 1648 hexvalue = '0' + hexvalue 1649 bitstring += _hexdict[ hexvalue[0] ] 1650 bitstring += _hexdict[ hexvalue[1] ] 1651 file_pos = bitvector.FILEIN.tell() 1652 # peek at the next byte; moves file position only if a 1653 # byte is read 1654 next_byte = bitvector.FILEIN.read(1) 1655 if next_byte: 1656 # pretend we never read the byte 1657 bitvector.FILEIN.seek( file_pos ) 1658 else: 1659 bitvector.more_to_read = False 1660 return bitstring 1661 1662 1663#------------------------------ BitVector Class Definition -------------------------------- 1664 1665class BitVector( object ): 1666 1667 def __init__( self, *args, **kwargs ): 1668 if args: 1669 raise ValueError( 1670 '''BitVector constructor can only be called with keyword arguments for the following keywords: ''' 1671 '''filename, fp, size, intVal, bitlist, bitstring, hexstring, textstring, and rawbytes)''') 1672 allowed_keys = 'bitlist','bitstring','filename','fp','intVal', 'size','textstring','hexstring','rawbytes' 1673 keywords_used = kwargs.keys() 1674 for keyword in keywords_used: 1675 if keyword not in allowed_keys: 1676 raise ValueError("Wrong keyword used --- check spelling") 1677 filename=fp=intVal=size=bitlist=bitstring=textstring=hexstring=rawbytes=None 1678 if 'filename' in kwargs : filename=kwargs.pop('filename') 1679 if 'fp' in kwargs : fp = kwargs.pop('fp') 1680 if 'size' in kwargs : size = kwargs.pop('size') 1681 if 'intVal' in kwargs : intVal = kwargs.pop('intVal') 1682 if 'bitlist' in kwargs : bitlist = kwargs.pop('bitlist') 1683 if 'bitstring' in kwargs : bitstring = kwargs.pop('bitstring') 1684 if 'hexstring' in kwargs : hexstring = kwargs.pop('hexstring') 1685 if 'textstring' in kwargs : textstring = kwargs.pop('textstring') 1686 if 'rawbytes' in kwargs : rawbytes = kwargs.pop('rawbytes') 1687 self.filename = None 1688 self.size = 0 1689 self.FILEIN = None 1690 self.FILEOUT = None 1691 if filename: 1692 if fp or size or intVal or bitlist or bitstring or hexstring or textstring or rawbytes: 1693 raise ValueError('''When filename is specified, you cannot give values ''' 1694 '''to any other constructor args''') 1695 self.filename = filename 1696 self.FILEIN = open(filename, 'rb') 1697 self.more_to_read = True 1698 return 1699 elif fp: 1700 if filename or size or intVal or bitlist or bitstring or hexstring or textstring or rawbytes: 1701 raise ValueError('''When fileobject is specified, you cannot give ''' 1702 '''values to any other constructor args''') 1703 bits = self.read_bits_from_fileobject(fp) 1704 bitlist = list(map(int, bits)) 1705 self.size = len( bitlist ) 1706 elif intVal or intVal == 0: 1707 if filename or fp or bitlist or bitstring or hexstring or textstring or rawbytes: 1708 raise ValueError('''When intVal is specified, you can only give a ''' 1709 '''value to the 'size' constructor arg''') 1710 if intVal == 0: 1711 bitlist = [0] 1712 if size is None: 1713 self.size = 1 1714 elif size == 0: 1715 raise ValueError('''The value specified for size must be at least ''' 1716 '''as large as for the smallest bit vector possible ''' 1717 '''for intVal''') 1718 else: 1719 if size < len(bitlist): 1720 raise ValueError('''The value specified for size must be at least ''' 1721 '''as large as for the smallest bit vector ''' 1722 '''possible for intVal''') 1723 n = size - len(bitlist) 1724 bitlist = [0]*n + bitlist 1725 self.size = len(bitlist) 1726 else: 1727 hexVal = hex(intVal).lower().rstrip('l') 1728 hexVal = hexVal[2:] 1729 if len(hexVal) == 1: 1730 hexVal = '0' + hexVal 1731 bitlist = ''.join(map(lambda x: _hexdict[x],hexVal)) 1732 bitlist = list(map( int, bitlist)) 1733 i = 0 1734 while (i < len(bitlist)): 1735 if bitlist[i] == 1: break 1736 i += 1 1737 del bitlist[0:i] 1738 if size is None: 1739 self.size = len(bitlist) 1740 elif size == 0: 1741 if size < len(bitlist): 1742 raise ValueError('''The value specified for size must be at least ''' 1743 '''as large as for the smallest bit vector possible ''' 1744 '''for intVal''') 1745 else: 1746 if size < len(bitlist): 1747 raise ValueError('''The value specified for size must be at least ''' 1748 '''as large as for the smallest bit vector possible ''' 1749 '''for intVal''') 1750 n = size - len(bitlist) 1751 bitlist = [0]*n + bitlist 1752 self.size = len( bitlist ) 1753 elif size is not None and size >= 0: 1754 if filename or fp or intVal or bitlist or bitstring or hexstring or textstring or rawbytes: 1755 raise ValueError('''When size is specified (without an intVal), you cannot ''' 1756 '''give values to any other constructor args''') 1757 self.size = size 1758 two_byte_ints_needed = (size + 15) // 16 1759 self.vector = array.array('H', [0]*two_byte_ints_needed) 1760 return 1761 elif bitstring or bitstring == '': 1762 if filename or fp or size or intVal or bitlist or hexstring or textstring or rawbytes: 1763 raise ValueError('''When a bitstring is specified, you cannot give ''' 1764 '''values to any other constructor args''') 1765 bitlist = list(map(int, list(bitstring))) 1766 self.size = len(bitlist) 1767 elif bitlist is not None: 1768 if filename or fp or size or intVal or bitstring or hexstring or textstring or rawbytes: 1769 raise ValueError('''When bits are specified, you cannot give values ''' 1770 '''to any other constructor args''') 1771 self.size = len(bitlist) 1772 elif textstring or textstring == '': 1773 if filename or fp or size or intVal or bitlist or bitstring or hexstring or rawbytes: 1774 raise ValueError('''When bits are specified through textstring, you ''' 1775 '''cannot give values to any other constructor args''') 1776 hexlist = ''.join(map(lambda x: x[2:], map(lambda x: hex(x) if len(hex(x)[2:])==2 1777 else hex(x)[:2] + '0' + hex(x)[2:], map(ord, list(textstring))))) 1778 bitlist = list(map(int,list(''.join(map(lambda x: _hexdict[x], list(hexlist)))))) 1779 self.size = len(bitlist) 1780 elif hexstring or hexstring == '': 1781 if filename or fp or size or intVal or bitlist or bitstring or textstring or rawbytes: 1782 raise ValueError('''When bits are specified through hexstring, you ''' 1783 '''cannot give values to any other constructor args''') 1784 bitlist = list(map(int,list(''.join(map(lambda x: _hexdict[x], list(hexstring.lower())))))) 1785 self.size = len(bitlist) 1786 elif rawbytes: 1787 if filename or fp or size or intVal or bitlist or bitstring or textstring or hexstring: 1788 raise ValueError('''When bits are specified through rawbytes, you ''' 1789 '''cannot give values to any other constructor args''') 1790 import binascii 1791 hexlist = binascii.hexlify(rawbytes) 1792 if sys.version_info[0] == 3: 1793 bitlist = list(map(int,list(''.join(map(lambda x: _hexdict[x], list(map(chr,list(hexlist)))))))) 1794 else: 1795 bitlist = list(map(int,list(''.join(map(lambda x: _hexdict[x], list(hexlist)))))) 1796 self.size = len(bitlist) 1797 else: 1798 raise ValueError("wrong arg(s) for constructor") 1799 two_byte_ints_needed = (len(bitlist) + 15) // 16 1800 self.vector = array.array( 'H', [0]*two_byte_ints_needed ) 1801 list( map( self._setbit, range(len(bitlist)), bitlist) ) 1802 1803 def _setbit(self, posn, val): 1804 'Set the bit at the designated position to the value shown' 1805 if val not in (0, 1): 1806 raise ValueError( "incorrect value for a bit" ) 1807 if isinstance( posn, (tuple) ): 1808 posn = posn[0] 1809 if posn >= self.size or posn < -self.size: 1810 raise ValueError( "index range error" ) 1811 if posn < 0: posn = self.size + posn 1812 block_index = posn // 16 1813 shift = posn & 15 1814 cv = self.vector[block_index] 1815 if ( cv >> shift ) & 1 != val: 1816 self.vector[block_index] = cv ^ (1 << shift) 1817 1818 def _getbit(self, pos): 1819 'Get the bit from the designated position' 1820 if not isinstance( pos, slice ): 1821 if pos >= self.size or pos < -self.size: 1822 raise ValueError( "index range error" ) 1823 if pos < 0: pos = self.size + pos 1824 return ( self.vector[pos//16] >> (pos&15) ) & 1 1825 else: 1826 slicebits = [] 1827 i,j = pos.start,pos.stop 1828 if i is None and j is None: 1829 return self.deep_copy() 1830 if i is None: 1831 if j >= 0: 1832 if j > len(self): 1833 raise ValueError('illegal slice index values') 1834 for x in range(j): 1835 slicebits.append( self[x] ) 1836 return BitVector( bitlist = slicebits ) 1837 else: 1838 if abs(j) > len(self): 1839 raise ValueError('illegal slice index values') 1840 for x in range(len(self) - abs(j)): 1841 slicebits.append( self[x] ) 1842 return BitVector( bitlist = slicebits ) 1843 if j is None: 1844 if i >= 0: 1845 if i > len(self): 1846 raise ValueError('illegal slice index values') 1847 for x in range(i,len(self)): 1848 slicebits.append( self[x] ) 1849 return BitVector( bitlist = slicebits ) 1850 else: 1851 if abs(i) > len(self): 1852 raise ValueError('illegal slice index values') 1853 for x in range(len(self) - abs(i), len(self)): 1854 slicebits.append( self[x] ) 1855 return BitVector( bitlist = slicebits ) 1856 if (i >= 0 and j >= 0) and i > j: 1857 raise ValueError('illegal slice index values') 1858 if (i < 0 and j >= 0) and (len(self) - abs(i)) > j: 1859 raise ValueError('illegal slice index values') 1860 if (i >= 0 and j < 0): 1861 if len(self) - abs(j) < i: 1862 raise ValueError('illegal slice index values') 1863 else: 1864 for x in range(i, len(self) - abs(j)): 1865 slicebits.append( self[x] ) 1866 return BitVector( bitlist = slicebits ) 1867 if self.size == 0: 1868 return BitVector( bitstring = '' ) 1869 if i == j: 1870 return BitVector( bitstring = '' ) 1871 for x in range(i,j): 1872 slicebits.append( self[x] ) 1873 return BitVector( bitlist = slicebits ) 1874 1875 def __xor__(self, other): 1876 ''' 1877 Take a bitwise 'XOR' of the bit vector on which the method is invoked with 1878 the argument bit vector. Return the result as a new bit vector. If the two 1879 bit vectors are not of the same size, pad the shorter one with zeros from the 1880 left. 1881 ''' 1882 if self.size < other.size: 1883 bv1 = self._resize_pad_from_left(other.size - self.size) 1884 bv2 = other 1885 elif self.size > other.size: 1886 bv1 = self 1887 bv2 = other._resize_pad_from_left(self.size - other.size) 1888 else: 1889 bv1 = self 1890 bv2 = other 1891 res = BitVector( size = bv1.size ) 1892 lpb = map(operator.__xor__, bv1.vector, bv2.vector) 1893 res.vector = array.array( 'H', lpb ) 1894 return res 1895 1896 def __and__(self, other): 1897 ''' 1898 Take a bitwise 'AND' of the bit vector on which the method is invoked with 1899 the argument bit vector. Return the result as a new bit vector. If the two 1900 bit vectors are not of the same size, pad the shorter one with zeros from the 1901 left. 1902 ''' 1903 if self.size < other.size: 1904 bv1 = self._resize_pad_from_left(other.size - self.size) 1905 bv2 = other 1906 elif self.size > other.size: 1907 bv1 = self 1908 bv2 = other._resize_pad_from_left(self.size - other.size) 1909 else: 1910 bv1 = self 1911 bv2 = other 1912 res = BitVector( size = bv1.size ) 1913 lpb = map(operator.__and__, bv1.vector, bv2.vector) 1914 res.vector = array.array( 'H', lpb ) 1915 return res 1916 1917 def __or__(self, other): 1918 ''' 1919 Take a bitwise 'OR' of the bit vector on which the method is invoked with the 1920 argument bit vector. Return the result as a new bit vector. If the two bit 1921 vectors are not of the same size, pad the shorter one with zero's from the 1922 left. 1923 ''' 1924 if self.size < other.size: 1925 bv1 = self._resize_pad_from_left(other.size - self.size) 1926 bv2 = other 1927 elif self.size > other.size: 1928 bv1 = self 1929 bv2 = other._resize_pad_from_left(self.size - other.size) 1930 else: 1931 bv1 = self 1932 bv2 = other 1933 res = BitVector( size = bv1.size ) 1934 lpb = map(operator.__or__, bv1.vector, bv2.vector) 1935 res.vector = array.array( 'H', lpb ) 1936 return res 1937 1938 def __invert__(self): 1939 ''' 1940 Invert the bits in the bit vector on which the method is invoked 1941 and return the result as a new bit vector. 1942 ''' 1943 res = BitVector( size = self.size ) 1944 lpb = list(map( operator.__inv__, self.vector )) 1945 res.vector = array.array( 'H' ) 1946 for i in range(len(lpb)): 1947 res.vector.append( lpb[i] & 0x0000FFFF ) 1948 return res 1949 1950 1951# def __add__(self, other): 1952# ''' 1953# Because __add__ is supplied, you can always join two bitvectors by 1954# 1955# bitvec3 = bitvec1 + bitvec2 1956# 1957# bitvec3 is a new bitvector object that contains all the bits of bitvec1 1958# followed by all the bits of bitvec2. This is a faster implementation 1959# supplied by Elliot James Edmunds as a replacement for the previous version 1960# that you can find in Version 3.4.8. 1961# ''' 1962# new_bv = BitVector( size=0 ) 1963# if isinstance(self.vector, array.array): 1964# if sys.version_info[0] == 3: 1965# new_bv.vector.frombytes( self.vector.tobytes() ) 1966# else: 1967# # The following does not work in Python 3.9 1968# new_bv.vector.fromstring( self.vector.tostring() ) 1969# elif isinstance(self.vector, list) and sys.version_info[0] == 3: 1970# new_bv.vector = self.vector.copy() 1971# else: 1972# out_str = str(self) + str(other) 1973# return BitVector( bitstring = out_str ) 1974# new_bv.size = self.size 1975# new_bv += other 1976# return new_bv 1977# 1978# 1979# def __iadd__(self, other): 1980# ''' 1981# When extending an existing instance of a BitVector, __iadd__ should be faster 1982# than __add__ because we do not need to create a new BitVector. The call to 1983# __iadd__ simply modifies the current bitvector. __iadd__ is invoked when a 1984# user calls: 1985# bitvec1 += bitvec2 1986# Supplied by Elliot James Edmunds. 1987# ''' 1988# if not isinstance(other, type(self)): 1989# raise TypeError("Can only join two BitVector objects, not {}".format(type(other))) 1990# # Calculate number of two-byte ints we will need to add and extend the vector 1991# two_byte_ints_to_add = (self.size + other.size + 15) // 16 - len(self.vector) 1992# self.vector.extend([0] * two_byte_ints_to_add) 1993# # Add the bits 1994# curr_bit = self.size % 16 1995# curr_two_byte_int = self.size // 16 1996# for bit in other: 1997# self.vector[curr_two_byte_int] = self.vector[curr_two_byte_int] | (bit << curr_bit) 1998# curr_bit += 1 1999# curr_two_byte_int += curr_bit // 16 2000# curr_bit %= 16 2001# # Increase the size 2002# self.size += other.size 2003# return self 2004# 2005 2006 def __add__(self, other): 2007 ''' 2008 Restored from BitVector 3.4.5 2009 Because __add__ is supplied, you can always join two bitvectors by 2010 2011 bitvec3 = bitvec1 + bitvec2 2012 2013 bitvec3 is a new bitvector object that contains all the bits of 2014 bitvec1 followed by all the bits of bitvec2. 2015 ''' 2016 i = 0 2017 outlist = [] 2018 while i < self.size: 2019 outlist.append(self[i]) 2020 i += 1 2021 i = 0 2022 while i < other.size: 2023 outlist.append( other[i] ) 2024 i += 1 2025 return BitVector( bitlist = outlist ) 2026 2027 2028 def _getsize(self): 2029 'Return the number of bits in a bit vector.' 2030 return self.size 2031 2032 def read_bits_from_file(self, blocksize): 2033 ''' 2034 You can construct bitvectors directly from the bits in a disk file 2035 through the calls shown below. As you can see, this requires two 2036 steps: First you make a call as illustrated by the first statement 2037 below. The purpose of this call is to create a file object that is 2038 associated with the variable bv. Subsequent calls to 2039 read_bits_from_file(n) on this variable return a bitvector for each 2040 block of n bits thus read. The read_bits_from_file() throws an 2041 exception if the argument n is not a multiple of 8. 2042 2043 bv = BitVector(filename = 'somefile') 2044 bv1 = bv.read_bits_from_file(64) 2045 bv2 = bv.read_bits_from_file(64) 2046 ... 2047 ... 2048 bv.close_file_object() 2049 2050 When reading a file as shown above, you can test the attribute 2051 more_to_read of the bitvector object in order to find out if there 2052 is more to read in the file. The while loop shown below reads all 2053 of a file in 64-bit blocks: 2054 2055 bv = BitVector( filename = 'testinput4.txt' ) 2056 print("Here are all the bits read from the file:") 2057 while (bv.more_to_read): 2058 bv_read = bv.read_bits_from_file( 64 ) 2059 print(bv_read) 2060 bv.close_file_object() 2061 2062 The size of the last bitvector constructed from a file corresponds 2063 to how many bytes remain unread in the file at that point. It is 2064 your responsibility to zero-pad the last bitvector appropriately 2065 if, say, you are doing block encryption of the whole file. 2066 ''' 2067 error_str = '''You need to first construct a BitVector 2068 object with a filename as argument''' 2069 if not self.filename: 2070 raise SyntaxError( error_str ) 2071 if blocksize % 8 != 0: 2072 raise ValueError( "block size must be a multiple of 8" ) 2073 bitstr = _readblock( blocksize, self ) 2074 if len( bitstr ) == 0: 2075 return BitVector( size = 0 ) 2076 else: 2077 return BitVector( bitstring = bitstr ) 2078 2079 def read_bits_from_fileobject( self, fp ): 2080 ''' 2081 This function is meant to read a bit string from a file like 2082 object. 2083 ''' 2084 bitlist = [] 2085 while 1: 2086 bit = fp.read() 2087 if bit == '': return bitlist 2088 bitlist += bit 2089 2090 def write_bits_to_stream_object( self, fp ): 2091 ''' 2092 You can write a bitvector directly to a stream object, as 2093 illustrated by: 2094 2095 fp_write = io.StringIO() 2096 bitvec.write_bits_to_stream_object(fp_write) 2097 print(fp_write.getvalue()) 2098 2099 This method does not return anything. 2100 2101 This function is meant to write a bitvector directly to a file like 2102 object. Note that whereas 'write_to_file' method creates a memory 2103 footprint that corresponds exactly to the bitvector, the 2104 'write_bits_to_stream_object' actually writes out the 1's and 0's 2105 as individual items to the file object. That makes this method 2106 convenient for creating a string representation of a bitvector, 2107 especially if you use the StringIO class, as shown in the test 2108 code. 2109 2110 ''' 2111 for bit_index in range(self.size): 2112 if sys.version_info[0] == 3: 2113 if self[bit_index] == 0: 2114 fp.write('0') 2115 else: 2116 fp.write('1') 2117 else: 2118 if self[bit_index] == 0: 2119 fp.write( unicode('0') ) 2120 else: 2121 fp.write( unicode('1') ) 2122 2123 write_bits_to_fileobject = write_bits_to_stream_object 2124 2125 def divide_into_two(self): 2126 ''' 2127 A bitvector containing an even number of bits can be divided into 2128 two equal parts by 2129 2130 [left_half, right_half] = bitvec.divide_into_two() 2131 2132 where left_half and right_half hold references to the two returned 2133 bitvectors. The method throws an exception when called on a 2134 bitvector with an odd number of bits. 2135 ''' 2136 if self.size % 2 != 0: 2137 raise ValueError( "must have even num bits" ) 2138 i = 0 2139 outlist1 = [] 2140 while ( i < self.size /2 ): 2141 outlist1.append( self[i] ) 2142 i += 1 2143 outlist2 = [] 2144 while ( i < self.size ): 2145 outlist2.append( self[i] ) 2146 i += 1 2147 return [ BitVector( bitlist = outlist1 ), 2148 BitVector( bitlist = outlist2 ) ] 2149 2150 def permute(self, permute_list): 2151 ''' 2152 This method returns a new bitvector object. Permuting a bitvector means 2153 that you select its bits in the sequence specified by the argument 2154 permute_list. 2155 ''' 2156 if max(permute_list) > self.size -1: 2157 raise ValueError( "Bad permutation index" ) 2158 outlist = [] 2159 i = 0 2160 while ( i < len( permute_list ) ): 2161 outlist.append( self[ permute_list[i] ] ) 2162 i += 1 2163 return BitVector( bitlist = outlist ) 2164 2165 def unpermute(self, permute_list): 2166 ''' 2167 This method returns a new bitvector object. As indicated earlier 2168 for the permute() method, permuting a bitvector means that you 2169 select its bits in the sequence specified by the argument 2170 permute_list. Calling unpermute() with the same argument 2171 permute_list restores the sequence of bits to what it was in 2172 the original bitvector. 2173 ''' 2174 if max(permute_list) > self.size -1: 2175 raise ValueError( "Bad permutation index" ) 2176 if self.size != len( permute_list ): 2177 raise ValueError( "Bad size for permute list" ) 2178 out_bv = BitVector( size = self.size ) 2179 i = 0 2180 while i < len(permute_list): 2181 out_bv[ permute_list[i] ] = self[i] 2182 i += 1 2183 return out_bv 2184 2185 def write_to_file(self, file_out): 2186 ''' 2187 You can write a bit vector directly to a file by calling 2188 write_to_file(), as illustrated by the following example that reads 2189 one bitvector from a file and then writes it to another file: 2190 2191 bv = BitVector(filename = 'input.txt') 2192 bv1 = bv.read_bits_from_file(64) 2193 print(bv1) 2194 FILEOUT = open('output.bits', 'wb') 2195 bv1.write_to_file(FILEOUT) 2196 FILEOUT.close() 2197 bv = BitVector(filename = 'output.bits') 2198 bv2 = bv.read_bits_from_file(64) 2199 print(bv2) 2200 2201 Since all file I/O is byte oriented, the method write_to_file() 2202 throws an exception if the size of the bitvector on which the 2203 method is invoked is not a multiple of 8. This method does not 2204 return anything. 2205 2206 IMPORTANT FOR WINDOWS USERS: When writing an internally generated 2207 bit vector out to a disk file, it is important to open 2208 the file in the binary mode as shown. Otherwise, the 2209 bit pattern 00001010 ('\\n') in your bitstring will be 2210 written out as 0000110100001010 ('\\r\\n'), which is 2211 the linebreak on Windows machines. 2212 ''' 2213 err_str = '''Only a bit vector whose length is a multiple of 8 can 2214 be written to a file. Use the padding functions to satisfy 2215 this constraint.''' 2216 if not self.FILEOUT: 2217 self.FILEOUT = file_out 2218 if self.size % 8: 2219 raise ValueError( err_str ) 2220 for byte in range( int(self.size/8) ): 2221 value = 0 2222 for bit in range(8): 2223 value += (self._getbit( byte*8+(7 - bit) ) << bit ) 2224 if sys.version_info[0] == 3: 2225 file_out.write( bytes([value]) ) 2226 else: 2227 file_out.write( chr(value) ) 2228 2229 def close_file_object(self): 2230 ''' 2231 When you construct bitvectors by block scanning a disk file, after 2232 you are done, you can call this method to close the file object 2233 that was created to read the file: 2234 2235 bv = BitVector(filename = 'somefile') 2236 bv1 = bv.read_bits_from_file(64) 2237 bv.close_file_object() 2238 2239 The constructor call in the first statement creates a file object 2240 for reading the bits. It is this file object that is closed when 2241 you call close_file_object(). 2242 ''' 2243 if not self.FILEIN: 2244 raise SyntaxError( "No associated open file" ) 2245 self.FILEIN.close() 2246 2247 def int_val(self): 2248 'Return the integer value of a bitvector' 2249 intVal = 0 2250 for i in range(self.size): 2251 intVal += self[i] * (2 ** (self.size - i - 1)) 2252 return intVal 2253 2254 intValue = int_val 2255 2256 def get_bitvector_in_ascii(self): 2257 ''' 2258 You can call get_bitvector_in_ascii() to directly convert a bit 2259 vector into a text string (this is a useful thing to do only if the 2260 length of the vector is an integral multiple of 8 and every byte in 2261 your bitvector has a print representation): 2262 2263 bv = BitVector(textstring = "hello") 2264 print(bv) # 0110100001100101011011000110110001101111 2265 mytext = bv3.get_bitvector_in_ascii() 2266 print mytext # hello 2267 2268 This method is useful when you encrypt text through its bitvector 2269 representation. After decryption, you can recover the text using 2270 the call shown here. A call to get_bitvector_in_ascii() returns a 2271 string. 2272 ''' 2273 if self.size % 8: 2274 raise ValueError('''\nThe bitvector for get_bitvector_in_ascii() 2275 must be an integral multiple of 8 bits''') 2276 return ''.join(map(chr, map(int,[self[i:i+8] for i in range(0,self.size,8)]))) 2277 2278 # For backward compatibility: 2279 get_text_from_bitvector = get_bitvector_in_ascii 2280 getTextFromBitVector = get_bitvector_in_ascii 2281 2282 def get_bitvector_in_hex(self): 2283 ''' 2284 You can directly convert a bit vector into a hex string (this is a 2285 useful thing to do only if the length of the vector is an integral 2286 multiple of 4): 2287 2288 bv4 = BitVector(hexstring = "68656c6c6f") 2289 print(bv4) # 0110100001100101011011000110110001101111 2290 myhexstring = bv4.get_bitvector_in_hex() 2291 print myhexstring # 68656c6c6 2292 2293 This method throws an exception if the size of the bitvector is not 2294 a multiple of 4. The method returns a string that is formed by 2295 scanning the bits from the left and replacing each sequence of 4 2296 bits by its corresponding hex digit. 2297 ''' 2298 if self.size % 4: 2299 raise ValueError('''\nThe bitvector for get_bitvector_in_hex() ''' 2300 '''must be an integral multiple of 4 bits''') 2301 return ''.join(map(lambda x: x.replace('0x',''), \ 2302 map(hex,map(int,[self[i:i+4] for i in range(0,self.size,4)])))) 2303 2304 # For backward compatibility: 2305 get_hex_string_from_bitvector = get_bitvector_in_hex 2306 getHexStringFromBitVector = get_bitvector_in_hex 2307 2308 def __lshift__( self, n ): 2309 ''' 2310 Left circular rotation of a BitVector through N positions can be 2311 carried out by 2312 2313 bitvec << N 2314 2315 This operator overloading is made possible by implementing the 2316 __lshift__ method defined here. Note that this operator returns 2317 the bitvector on which it is invoked. This allows for a chained 2318 invocation of the operator 2319 2320 ''' 2321 if self.size == 0: 2322 raise ValueError('''Circular shift of an empty vector 2323 makes no sense''') 2324 if n < 0: 2325 return self >> abs(n) 2326 for i in range(n): 2327 self.circular_rotate_left_by_one() 2328 return self 2329 2330 def __rshift__( self, n ): 2331 ''' 2332 Right circular rotation of a BitVector through N positions can be 2333 carried out by 2334 2335 bitvec >> N 2336 2337 This operator overloading is made possible by implementing the 2338 __rshift__ method defined here. Note that this operator returns 2339 the bitvector on which it is invoked. This allows for a chained 2340 invocation of the operator. 2341 ''' 2342 if self.size == 0: 2343 raise ValueError('''Circular shift of an empty vector makes no sense''') 2344 if n < 0: 2345 return self << abs(n) 2346 for i in range(n): 2347 self.circular_rotate_right_by_one() 2348 return self 2349 2350 def circular_rotate_left_by_one(self): 2351 'For a one-bit in-place left circular shift' 2352 size = len(self.vector) 2353 bitstring_leftmost_bit = self.vector[0] & 1 2354 left_most_bits = list(map(operator.__and__, self.vector, [1]*size)) 2355 left_most_bits.append(left_most_bits[0]) 2356 del(left_most_bits[0]) 2357 self.vector = list(map(operator.__rshift__, self.vector, [1]*size)) 2358 self.vector = list(map( operator.__or__, self.vector, \ 2359 list( map(operator.__lshift__, left_most_bits, [15]*size) ))) 2360 self._setbit(self.size -1, bitstring_leftmost_bit) 2361 2362 def circular_rotate_right_by_one(self): 2363 'For a one-bit in-place right circular shift' 2364 size = len(self.vector) 2365 bitstring_rightmost_bit = self[self.size - 1] 2366 right_most_bits = list(map( operator.__and__, 2367 self.vector, [0x8000]*size )) 2368 self.vector = list(map( operator.__and__, self.vector, [~0x8000]*size )) 2369 right_most_bits.insert(0, bitstring_rightmost_bit) 2370 right_most_bits.pop() 2371 self.vector = list(map(operator.__lshift__, self.vector, [1]*size)) 2372 self.vector = list(map( operator.__or__, self.vector, \ 2373 list(map(operator.__rshift__, right_most_bits, [15]*size)))) 2374 self._setbit(0, bitstring_rightmost_bit) 2375 2376 def circular_rot_left(self): 2377 ''' 2378 This is merely another implementation of the method 2379 circular_rotate_left_by_one() shown above. This one does NOT use map 2380 functions. This method carries out a one-bit left circular shift of a bit 2381 vector. 2382 ''' 2383 max_index = (self.size -1) // 16 2384 left_most_bit = self.vector[0] & 1 2385 self.vector[0] = self.vector[0] >> 1 2386 for i in range(1, max_index + 1): 2387 left_bit = self.vector[i] & 1 2388 self.vector[i] = self.vector[i] >> 1 2389 self.vector[i-1] |= left_bit << 15 2390 self._setbit(self.size -1, left_most_bit) 2391 2392 def circular_rot_right(self): 2393 ''' 2394 This is merely another implementation of the method 2395 circular_rotate_right_by_one() shown above. This one does NOT use map 2396 functions. This method does a one-bit right circular shift of a bit vector. 2397 ''' 2398 max_index = (self.size -1) // 16 2399 right_most_bit = self[self.size - 1] 2400 self.vector[max_index] &= ~0x8000 2401 self.vector[max_index] = self.vector[max_index] << 1 2402 for i in range(max_index-1, -1, -1): 2403 right_bit = self.vector[i] & 0x8000 2404 self.vector[i] &= ~0x8000 2405 self.vector[i] = self.vector[i] << 1 2406 self.vector[i+1] |= right_bit >> 15 2407 self._setbit(0, right_most_bit) 2408 2409 def shift_left_by_one(self): 2410 ''' 2411 For a one-bit in-place left non-circular shift. Note that bitvector size 2412 does not change. The leftmost bit that moves past the first element of the 2413 bitvector is discarded and rightmost bit of the returned vector is set to 2414 zero. 2415 ''' 2416 size = len(self.vector) 2417 left_most_bits = list(map(operator.__and__, self.vector, [1]*size)) 2418 left_most_bits.append(left_most_bits[0]) 2419 del(left_most_bits[0]) 2420 self.vector = list(map(operator.__rshift__, self.vector, [1]*size)) 2421 self.vector = list(map( operator.__or__, self.vector, \ 2422 list(map(operator.__lshift__, left_most_bits, [15]*size)))) 2423 self._setbit(self.size -1, 0) 2424 2425 def shift_right_by_one(self): 2426 ''' 2427 For a one-bit in-place right non-circular shift. Note that bitvector size 2428 does not change. The rightmost bit that moves past the last element of the 2429 bitvector is discarded and leftmost bit of the returned vector is set to 2430 zero. 2431 ''' 2432 size = len(self.vector) 2433 right_most_bits = list(map( operator.__and__, self.vector, [0x8000]*size )) 2434 self.vector = list(map( operator.__and__, self.vector, [~0x8000]*size )) 2435 right_most_bits.insert(0, 0) 2436 right_most_bits.pop() 2437 self.vector = list(map(operator.__lshift__, self.vector, [1]*size)) 2438 self.vector = list(map( operator.__or__, self.vector, \ 2439 list(map(operator.__rshift__,right_most_bits, [15]*size)))) 2440 self._setbit(0, 0) 2441 2442 def shift_left( self, n ): 2443 ''' 2444 Call this method if you want to shift in-place a bitvector to the left 2445 non-circularly. As a bitvector is shifted non-circularly to the 2446 left, the exposed bit positions at the right end are filled with 2447 zeros. This method returns the bitvector object on which it is 2448 invoked. This is to allow for chained invocations of the method. 2449 ''' 2450 for i in range(n): 2451 self.shift_left_by_one() 2452 return self 2453 2454 def shift_right( self, n ): 2455 ''' 2456 Call this method if you want to shift in-place a bitvector to the right 2457 non-circularly. As a bitvector is shifted non-circularly to the 2458 right, the exposed bit positions at the left end are filled with 2459 zeros. This method returns the bitvector object on which it is 2460 invoked. This is to allow for chained invocations of the method. 2461 ''' 2462 for i in range(n): 2463 self.shift_right_by_one() 2464 return self 2465 2466 # Allow array like subscripting for getting and setting: 2467 __getitem__ = _getbit 2468 2469 def __setitem__(self, pos, item): 2470 ''' 2471 This is needed for both slice assignments and for index assignments. It 2472 checks the types of pos and item to see if the call is for slice assignment. 2473 For slice assignment, pos must be of type 'slice' and item of type BitVector. 2474 For index assignment, the argument types are checked in the _setbit() method. 2475 ''' 2476 # The following section is for slice assignment: 2477 if isinstance(pos, slice): 2478 if (not isinstance( item, BitVector )): 2479 raise TypeError("For slice assignment, the right hand side must be a BitVector") 2480 if (pos.start is None and pos.stop is None): 2481 return item.deep_copy() 2482 if pos.start is None: 2483 if pos.stop >= 0: 2484 if pos.stop != len(item): 2485 raise ValueError('incompatible lengths for slice assignment 1') 2486 for i in range(pos.stop): 2487 self[i] = item[i] 2488 else: 2489 if len(self) - abs(pos.stop) != len(item): 2490 raise ValueError('incompatible lengths for slice assignment 2') 2491 for i in range(len(self) + pos.stop): 2492 self[i] = item[i] 2493 return 2494 if pos.stop is None: 2495 if pos.start >= 0: 2496 if ((len(self) - pos.start) != len(item)): 2497 raise ValueError('incompatible lengths for slice assignment 3') 2498# for i in range(len(item)-1): 2499 for i in range(len(item)): 2500 self[pos.start + i] = item[i] 2501 else: 2502 if abs(pos.start) != len(item): 2503 raise ValueError('incompatible lengths for slice assignment 4') 2504 for i in range(len(item)): 2505 self[len(self) + pos.start + i] = item[i] 2506 return 2507 if pos.start >=0 and pos.stop < 0: 2508 if ( (len(self) + pos.stop - pos.start) != len(item) ): 2509 raise ValueError('incompatible lengths for slice assignment 5') 2510 for i in range( pos.start, len(self) + pos.stop ): 2511 self[i] = item[ i - pos.start ] 2512 return 2513 if pos.start < 0 and pos.stop >= 0: 2514 if ( (len(self) - pos.stop + pos.start) != len(item) ): 2515 raise ValueError('incompatible lengths for slice assignment 6') 2516 for i in range( len(self) + pos.start, pos.stop ): 2517 self[i] = item[ i - pos.start ] 2518 return 2519 if ( (pos.stop - pos.start) != len(item) ): 2520 raise ValueError('incompatible lengths for slice assignment 7') 2521 for i in range( pos.start, pos.stop ): 2522 self[i] = item[ i - pos.start ] 2523 return 2524 # For index assignment use _setbit() 2525 self._setbit(pos, item) 2526 2527 # Allow len() to work: 2528 __len__ = _getsize 2529 # Allow int() to work: 2530 __int__ = int_val 2531 2532 def __iter__(self): 2533 ''' 2534 To allow iterations over a bit vector by supporting the 'for bit in 2535 bit_vector' syntax: 2536 ''' 2537 return BitVectorIterator(self) 2538 2539 def __str__(self): 2540 'To create a print representation' 2541 if self.size == 0: 2542 return '' 2543 return ''.join(map(str, self)) 2544 2545 def __eq__(self, other): 2546 ''' 2547 Compare two bit vectors 2548 ''' 2549 if self.size != other.size: 2550 return False 2551 i = 0 2552 while ( i < self.size ): 2553 if (self[i] != other[i]): return False 2554 i += 1 2555 return True 2556 2557 def __ne__(self, other): 2558 return not self == other 2559 def __lt__(self, other): 2560 return self.intValue() < other.intValue() 2561 def __le__(self, other): 2562 return self.intValue() <= other.intValue() 2563 def __gt__(self, other): 2564 return self.intValue() > other.intValue() 2565 def __ge__(self, other): 2566 return self.intValue() >= other.intValue() 2567 2568 def deep_copy( self ): 2569 ''' 2570 You can make a deep copy of a bitvector by 2571 2572 bitvec_copy = bitvec.deep_copy() 2573 2574 Subsequently, any alterations to either of the bitvector objects 2575 bitvec and bitvec_copy will not affect the other. 2576 ''' 2577 copy = str( self ) 2578 return BitVector( bitstring = copy ) 2579 2580 # For backward compatibility: 2581 _make_deep_copy = deep_copy 2582 2583 def _resize_pad_from_left( self, n ): 2584 ''' 2585 Resize a bit vector by padding with n 0's from the left. Return the result as 2586 a new bit vector. 2587 ''' 2588 new_str = '0'*n + str( self ) 2589 return BitVector( bitstring = new_str ) 2590 2591 def _resize_pad_from_right( self, n ): 2592 ''' 2593 Resize a bit vector by padding with n 0's from the right. Return the result 2594 as a new bit vector. 2595 ''' 2596 new_str = str( self ) + '0'*n 2597 return BitVector( bitstring = new_str ) 2598 2599 def pad_from_left( self, n ): 2600 ''' 2601 You can pad a bitvector at its the left end with a designated number of 2602 zeros with this method. This method returns the bitvector object on 2603 which it is invoked. So you can think of this method as carrying 2604 out an in-place extension of a bitvector (although, under the hood, 2605 the extension is carried out by giving a new longer _vector 2606 attribute to the bitvector object). 2607 ''' 2608 new_str = '0'*n + str( self ) 2609 bitlist = list(map( int, list(new_str) )) 2610 self.size = len( bitlist ) 2611 two_byte_ints_needed = (len(bitlist) + 15) // 16 2612 self.vector = array.array( 'H', [0]*two_byte_ints_needed ) 2613 list(map( self._setbit, enumerate(bitlist), bitlist)) 2614 2615 def pad_from_right( self, n ): 2616 ''' 2617 You can pad a bitvector at its right end with a designated number of 2618 zeros with this method. This method returns the bitvector object on 2619 which it is invoked. So you can think of this method as carrying 2620 out an in-place extension of a bitvector (although, under the hood, 2621 the extension is carried out by giving a new longer _vector 2622 attribute to the bitvector object). 2623 ''' 2624 new_str = str( self ) + '0'*n 2625 bitlist = list(map( int, list(new_str) )) 2626 self.size = len( bitlist ) 2627 two_byte_ints_needed = (len(bitlist) + 15) // 16 2628 self.vector = array.array( 'H', [0]*two_byte_ints_needed ) 2629 list(map( self._setbit, enumerate(bitlist), bitlist)) 2630 2631 def __contains__( self, otherBitVec ): 2632 ''' 2633 This supports 'if x in y' and 'if x not in y' syntax for bit vectors. 2634 ''' 2635 if self.size == 0: 2636 raise ValueError("First arg bitvec has no bits") 2637 elif self.size < otherBitVec.size: 2638 raise ValueError("First arg bitvec too short") 2639 max_index = self.size - otherBitVec.size + 1 2640 for i in range(max_index): 2641 if self[i:i+otherBitVec.size] == otherBitVec: 2642 return True 2643 return False 2644 2645 def reset( self, val ): 2646 ''' 2647 Resets a previously created BitVector to either all zeros or all ones 2648 depending on the argument val. Returns self to allow for syntax like 2649 bv = bv1[3:6].reset(1) 2650 or 2651 bv = bv1[:].reset(1) 2652 ''' 2653 if val not in (0,1): 2654 raise ValueError( "Incorrect reset argument" ) 2655 bitlist = [val for i in range( self.size )] 2656 list(map( self._setbit, enumerate(bitlist), bitlist )) 2657 return self 2658 2659 def count_bits( self ): 2660 ''' 2661 You can count the number of bits set in a BitVector instance by 2662 2663 bv = BitVector(bitstring = '100111') 2664 print(bv.count_bits()) # 4 2665 2666 A call to count_bits() returns an integer value that is equal to 2667 the number of bits set in the bitvector. 2668 ''' 2669 return sum(self) 2670 2671 def set_value(self, *args, **kwargs): 2672 ''' 2673 You can call set_value() to change the bit pattern associated with 2674 a previously constructed bitvector object: 2675 2676 bv = BitVector(intVal = 7, size =16) 2677 print(bv) # 0000000000000111 2678 bv.set_value(intVal = 45) 2679 print(bv) # 101101 2680 2681 You can think of this method as carrying out an in-place resetting 2682 of the bit array in a bitvector. The method does not return 2683 anything. The allowable modes for changing the internally stored 2684 bit array for a bitvector are the same as for the constructor. 2685 ''' 2686 self.__init__( *args, **kwargs ) 2687 2688 # For backward compatibility: 2689 setValue = set_value 2690 2691 def count_bits_sparse(self): 2692 ''' 2693 For folks who use bit vectors with millions of bits in them but 2694 with only a few bits set, your bit counting will go much, much 2695 faster if you call count_bits_sparse() instead of count_bits(): 2696 However, for dense bitvectors, I expect count_bits() to work 2697 faster. 2698 2699 # a BitVector with 2 million bits: 2700 bv = BitVector(size = 2000000) 2701 bv[345234] = 1 2702 bv[233]=1 2703 bv[243]=1 2704 bv[18]=1 2705 bv[785] =1 2706 print(bv.count_bits_sparse()) # 5 2707 2708 A call to count_bits_sparse() returns an integer whose value is the 2709 number of bits set in the bitvector. Rhiannon, who contributed 2710 this method, estimates that if a bit vector with over 2 millions 2711 bits has only five bits set, this will return the answer in 1/18 of 2712 the time taken by the count_bits() method. Rhianon's implementation 2713 is based on an algorithm generally known as the Brian Kernighan's 2714 way, although its antecedents predate its mention by Kernighan and 2715 Ritchie. 2716 ''' 2717 num = 0 2718 for intval in self.vector: 2719 if intval == 0: continue 2720 c = 0; iv = intval 2721 while iv > 0: 2722 iv = iv & (iv -1) 2723 c = c + 1 2724 num = num + c 2725 return num 2726 2727 def jaccard_similarity(self, other): 2728 ''' 2729 You can calculate the similarity between two bitvectors using the 2730 Jaccard similarity coefficient. 2731 2732 bv1 = BitVector(bitstring = '11111111') 2733 bv2 = BitVector(bitstring = '00101011') 2734 print bv1.jaccard_similarity(bv2) # 0.675 2735 2736 The value returned is a floating point number between 0 and 1. 2737 ''' 2738 assert self.intValue() > 0 or other.intValue() > 0, 'Jaccard called on two zero vectors --- NOT ALLOWED' 2739 assert self.size == other.size, 'bitvectors for comparing with Jaccard must be of equal length' 2740 intersect = self & other 2741 union = self | other 2742 return ( intersect.count_bits_sparse() / float( union.count_bits_sparse() ) ) 2743 2744 def jaccard_distance( self, other ): 2745 ''' 2746 You can calculate the distance between two bitvectors using the 2747 Jaccard distance coefficient. 2748 2749 bv1 = BitVector(bitstring = '11111111') 2750 bv2 = BitVector(bitstring = '00101011') 2751 print(str(bv1.jaccard_distance(bv2))) # 0.375 2752 2753 The value returned is a floating point number between 0 and 1. 2754 ''' 2755 assert self.size == other.size, 'vectors of unequal length' 2756 return 1 - self.jaccard_similarity( other ) 2757 2758 def hamming_distance( self, other ): 2759 ''' 2760 You can compare two bitvectors with the Hamming distance: 2761 2762 bv1 = BitVector(bitstring = '11111111') 2763 bv2 = BitVector(bitstring = '00101011') 2764 print(str(bv1.hamming_distance(bv2))) # 4 2765 2766 This method returns a number that is equal to the number of bit 2767 positions in which the two operand bitvectors disagree. 2768 ''' 2769 assert self.size == other.size, 'vectors of unequal length' 2770 diff = self ^ other 2771 return diff.count_bits_sparse() 2772 2773 def next_set_bit(self, from_index=0): 2774 ''' 2775 Starting from a given bit position, you can find the position index 2776 of the next set bit by 2777 2778 bv = BitVector(bitstring = '00000000000001') 2779 print(bv.next_set_bit(5)) # 13 2780 2781 In this example, we are asking next_set_bit() to return the index 2782 of the bit that is set after the bit position that is indexed 5. If 2783 no next set bit is found, the method returns -1. A call to 2784 next_set_bit() always returns a number. This method was 2785 contributed originally by Jason Allum and updated subsequently by 2786 John Gleeson. 2787 ''' 2788 assert from_index >= 0, 'from_index must be nonnegative' 2789 i = from_index 2790 v = self.vector 2791 l = len(v) 2792 o = i >> 4 2793 s = i & 0x0F 2794 i = o << 4 2795 while o < l: 2796 h = v[o] 2797 if h: 2798 i += s 2799 m = 1 << s 2800 while m != (1 << 0x10): 2801 if h & m: return i 2802 m <<= 1 2803 i += 1 2804 else: 2805 i += 0x10 2806 s = 0 2807 o += 1 2808 return -1 2809 2810 def rank_of_bit_set_at_index(self, position): 2811 ''' 2812 You can measure the "rank" of a bit that is set at a given 2813 position. Rank is the number of bits that are set up to the 2814 position of the bit you are interested in. 2815 2816 bv = BitVector(bitstring = '01010101011100') 2817 print(bv.rank_of_bit_set_at_index(10)) # 6 2818 2819 The value 6 returned by this call to rank_of_bit_set_at_index() is 2820 the number of bits set up to the position indexed 10 (including 2821 that position). This method throws an exception if there is no bit 2822 set at the argument position. Otherwise, it returns the rank as a 2823 number. 2824 ''' 2825 assert self[position] == 1, 'the arg bit not set' 2826 bv = self[0:position+1] 2827 return bv.count_bits() 2828 2829 def is_power_of_2( self ): 2830 ''' 2831 You can test whether the integer value of a bit vector is a power of 2832 two. (The sparse version of this method works much faster for very 2833 long bit vectors.) However, the regular version defined here may 2834 work faster for dense bit vectors. 2835 2836 bv = BitVector(bitstring = '10000000001110') 2837 print(bv.is_power_of_2()) 2838 2839 This predicate returns 1 for true and 0 for false. 2840 ''' 2841 if self.intValue() == 0: return False 2842 bv = self & BitVector( intVal = self.intValue() - 1 ) 2843 if bv.intValue() == 0: return True 2844 return False 2845 2846 # For backward compatibility: 2847 isPowerOf2 = is_power_of_2 2848 2849 def is_power_of_2_sparse(self): 2850 ''' 2851 You can test whether the integer value of a bit vector is a power of 2852 two. This sparse version works much faster for very long bit 2853 vectors. (However, the regular version defined above may work 2854 faster for dense bit vectors.) 2855 2856 bv = BitVector(bitstring = '10000000001110') 2857 print(bv.is_power_of_2_sparse()) 2858 2859 This predicate returns 1 for true and 0 for false. 2860 ''' 2861 if self.count_bits_sparse() == 1: return True 2862 return False 2863 2864 # For backward compatibility: 2865 isPowerOf2_sparse = is_power_of_2_sparse 2866 2867 def reverse(self): 2868 ''' 2869 Given a bit vector, you can construct a bit vector with all the 2870 bits reversed, in the sense that what was left to right before now 2871 becomes right to left. 2872 2873 bv = BitVector(bitstring = '0001100000000000001') 2874 print(str(bv.reverse())) 2875 2876 A call to reverse() returns a new bitvector object whose bits are 2877 in reverse order in relation to the bits in the bitvector on which 2878 the method is invoked. 2879 ''' 2880 reverseList = [] 2881 i = 1 2882 while ( i < self.size + 1 ): 2883 reverseList.append( self[ -i ] ) 2884 i += 1 2885 return BitVector( bitlist = reverseList ) 2886 2887 def gcd(self, other): 2888 ''' 2889 Using Euclid's Algorithm, returns the greatest common divisor of 2890 the integer value of the bitvector on which the method is invoked 2891 and the integer value of the argument bitvector: 2892 2893 bv1 = BitVector(bitstring = '01100110') # int val: 102 2894 bv2 = BitVector(bitstring = '011010') # int val: 26 2895 bv = bv1.gcd(bv2) 2896 print(int(bv)) # 2 2897 2898 The result returned by gcd() is a bitvector object. 2899 ''' 2900 a = self.intValue(); b = other.intValue() 2901 if a < b: a,b = b,a 2902 while b != 0: 2903 a, b = b, a % b 2904 return BitVector( intVal = a ) 2905 2906 def multiplicative_inverse(self, modulus): 2907 ''' 2908 Using the Extended Euclid's Algorithm, this method calculates the 2909 multiplicative inverse using normal integer arithmetic. [For such 2910 inverses in a Galois Field GF(2^n), use the method gf_MI().] 2911 2912 bv_modulus = BitVector(intVal = 32) 2913 bv = BitVector(intVal = 17) 2914 bv_result = bv.multiplicative_inverse( bv_modulus ) 2915 if bv_result is not None: 2916 print(str(int(bv_result))) # 17 2917 else: print "No multiplicative inverse in this case" 2918 2919 What this example says is that the multiplicative inverse of 17 2920 modulo 32 is 17. That is because 17 times 17 modulo 32 equals 1. 2921 When using this method, you must test the returned value for 2922 None. If the returned value is None, that means that the number 2923 corresponding to the bitvector on which the method is invoked does 2924 not possess a multiplicative-inverse with respect to the modulus. 2925 When the multiplicative inverse exists, the result returned by 2926 calling multiplicative_inverse() is a bitvector object. 2927 ''' 2928 MOD = mod = modulus.intValue(); num = self.intValue() 2929 x, x_old = 0, 1 2930 y, y_old = 1, 0 2931 while mod: 2932 quotient = num // mod 2933 num, mod = mod, num % mod 2934 x, x_old = x_old - x * quotient, x 2935 y, y_old = y_old - y * quotient, y 2936 if num != 1: 2937 return None 2938 else: 2939 MI = (x_old + MOD) % MOD 2940 return BitVector( intVal = MI ) 2941 2942 def length(self): 2943 return self.size 2944 2945 def gf_multiply(self, b): 2946 ''' 2947 If you want to multiply two bit patterns in GF(2): 2948 2949 a = BitVector(bitstring='0110001') 2950 b = BitVector(bitstring='0110') 2951 c = a.gf_multiply(b) 2952 print(c) # 00010100110 2953 2954 As you would expect, in general, the bitvector returned by this 2955 method is longer than the two operand bitvectors. A call to 2956 gf_multiply() returns a bitvector object. 2957 ''' 2958 a = self.deep_copy() 2959 b_copy = b.deep_copy() 2960 a_highest_power = a.length() - a.next_set_bit(0) - 1 2961 b_highest_power = b.length() - b_copy.next_set_bit(0) - 1 2962 result = BitVector( size = a.length()+b_copy.length() ) 2963 a.pad_from_left( result.length() - a.length() ) 2964 b_copy.pad_from_left( result.length() - b_copy.length() ) 2965 for i,bit in enumerate(b_copy): 2966 if bit == 1: 2967 power = b_copy.length() - i - 1 2968 a_copy = a.deep_copy() 2969 a_copy.shift_left( power ) 2970 result ^= a_copy 2971 return result 2972 2973 def gf_divide_by_modulus(self, mod, n): 2974 ''' 2975 To divide a bitvector by a modulus bitvector in the Galois Field 2976 GF(2^n): 2977 2978 mod = BitVector(bitstring='100011011') # AES modulus 2979 n = 8 2980 a = BitVector(bitstring='11100010110001') 2981 quotient, remainder = a.gf_divide_by_modulus(mod, n) 2982 print(quotient) # 00000000111010 2983 print(remainder) # 10001111 2984 2985 What this example illustrates is dividing the bitvector a by the 2986 modulus bitvector mod. For a more general division of one 2987 bitvector a by another bitvector b, you would multiply a by the MI 2988 of b, where MI stands for "multiplicative inverse" as returned by 2989 the call to the method gf_MI(). A call to gf_divide_by_modulus() 2990 returns two bitvectors, one for the quotient and the other for the 2991 remainder. 2992 ''' 2993 num = self 2994 if mod.length() > n+1: 2995 raise ValueError("Modulus bit pattern too long") 2996 quotient = BitVector( intVal = 0, size = num.length() ) 2997 remainder = num.deep_copy() 2998 i = 0 2999 while 1: 3000 i = i+1 3001 if (i==num.length()): break 3002 mod_highest_power = mod.length()-mod.next_set_bit(0)-1 3003 if remainder.next_set_bit(0) == -1: 3004 remainder_highest_power = 0 3005 else: 3006 remainder_highest_power = remainder.length() - remainder.next_set_bit(0) - 1 3007 if (remainder_highest_power < mod_highest_power) or int(remainder)==0: 3008 break 3009 else: 3010 exponent_shift = remainder_highest_power - mod_highest_power 3011 quotient[quotient.length()-exponent_shift-1] = 1 3012 quotient_mod_product = mod.deep_copy(); 3013 quotient_mod_product.pad_from_left(remainder.length() - mod.length()) 3014 quotient_mod_product.shift_left(exponent_shift) 3015 remainder = remainder ^ quotient_mod_product 3016 if remainder.length() > n: 3017 remainder = remainder[remainder.length()-n:] 3018 return quotient, remainder 3019 3020 # For backward compatibility: 3021 gf_divide = gf_divide_by_modulus 3022 3023 def gf_multiply_modular(self, b, mod, n): 3024 ''' 3025 If you want to carry out modular multiplications in the Galois 3026 Field GF(2^n): 3027 3028 modulus = BitVector(bitstring='100011011') # AES modulus 3029 n = 8 3030 a = BitVector(bitstring='0110001') 3031 b = BitVector(bitstring='0110') 3032 c = a.gf_multiply_modular(b, modulus, n) 3033 print(c) # 10100110 3034 3035 The call to gf_multiply_modular() returns the product of the two 3036 bitvectors a and b modulo the bitvector modulus in GF(2^8). A call 3037 to gf_multiply_modular() returns is a bitvector object. 3038 ''' 3039 a = self 3040 a_copy = a.deep_copy() 3041 b_copy = b.deep_copy() 3042 product = a_copy.gf_multiply(b_copy) 3043 quotient, remainder = product.gf_divide_by_modulus(mod, n) 3044 return remainder 3045 3046 def gf_MI(self, mod, n): 3047 ''' 3048 To calculate the multiplicative inverse of a bit vector in the 3049 Galois Field GF(2^n) with respect to a modulus polynomial, call 3050 gf_MI() as follows: 3051 3052 modulus = BitVector(bitstring = '100011011') 3053 n = 8 3054 a = BitVector(bitstring = '00110011') 3055 multi_inverse = a.gf_MI(modulus, n) 3056 print multi_inverse # 01101100 3057 3058 A call to gf_MI() returns a bitvector object. 3059 ''' 3060 num = self 3061 NUM = num.deep_copy(); MOD = mod.deep_copy() 3062 x = BitVector( size=mod.length() ) 3063 x_old = BitVector( intVal=1, size=mod.length() ) 3064 y = BitVector( intVal=1, size=mod.length() ) 3065 y_old = BitVector( size=mod.length() ) 3066 while int(mod): 3067 quotient, remainder = num.gf_divide_by_modulus(mod, n) 3068 num, mod = mod, remainder 3069 x, x_old = x_old ^ quotient.gf_multiply(x), x 3070 y, y_old = y_old ^ quotient.gf_multiply(y), y 3071 if int(num) != 1: 3072 return "NO MI. However, the GCD of ", str(NUM), " and ", \ 3073 str(MOD), " is ", str(num) 3074 else: 3075 z = x_old ^ MOD 3076 quotient, remainder = z.gf_divide_by_modulus(MOD, n) 3077 return remainder 3078 3079 def runs(self): 3080 ''' 3081 You can extract from a bitvector the runs of 1's and 0's in the 3082 vector as follows: 3083 3084 bv = BitVector(bitlist = (1,1, 1, 0, 0, 1)) 3085 print(str(bv.runs())) # ['111', '00', '1'] 3086 3087 The object returned by runs() is a list of strings, with each 3088 element of this list being a string of 1's and 0's. 3089 ''' 3090 allruns = [] 3091 if self.size == 0: 3092 return allruns 3093 run = '' 3094 previous_bit = self[0] 3095 if previous_bit == 0: 3096 run = '0' 3097 else: 3098 run = '1' 3099 for bit in list(self)[1:]: 3100 if bit == 0 and previous_bit == 0: 3101 run += '0' 3102 elif bit == 1 and previous_bit == 0: 3103 allruns.append( run ) 3104 run = '1' 3105 elif bit == 0 and previous_bit == 1: 3106 allruns.append( run ) 3107 run = '0' 3108 else: 3109 run += '1' 3110 previous_bit = bit 3111 allruns.append( run ) 3112 return allruns 3113 3114 def test_for_primality(self): 3115 ''' 3116 You can test whether a randomly generated bit vector is a prime 3117 number using the probabilistic Miller-Rabin test 3118 3119 bv = BitVector(intVal = 0) 3120 bv = bv.gen_random_bits(32) 3121 check = bv.test_for_primality() 3122 print(check) 3123 3124 The test_for_primality() methods returns a floating point number 3125 close to 1 for prime numbers and 0 for composite numbers. The 3126 actual value returned for a prime is the probability associated 3127 with the determination of its primality. 3128 ''' 3129 p = int(self) 3130 if p == 1: return 0 3131 probes = [2,3,5,7,11,13,17] 3132 for a in probes: 3133 if a == p: return 1 3134 if any([p % a == 0 for a in probes]): return 0 3135 k, q = 0, p-1 3136 while not q&1: 3137 q >>= 1 3138 k += 1 3139 for a in probes: 3140 a_raised_to_q = pow(a, q, p) 3141 if a_raised_to_q == 1 or a_raised_to_q == p-1: continue 3142 a_raised_to_jq = a_raised_to_q 3143 primeflag = 0 3144 for j in range(k-1): 3145 a_raised_to_jq = pow(a_raised_to_jq, 2, p) 3146 if a_raised_to_jq == p-1: 3147 primeflag = 1 3148 break 3149 if not primeflag: return 0 3150 probability_of_prime = 1 - 1.0/(4 ** len(probes)) 3151 return probability_of_prime 3152 3153 def gen_random_bits(self, width): 3154 ''' 3155 You can generate a bitvector with random bits with the bits 3156 spanning a specified width. For example, if you wanted a random 3157 bit vector to fully span 32 bits, you would say 3158 3159 bv = BitVector(intVal = 0) 3160 bv = bv.gen_random_bits(32) 3161 print(bv) # 11011010001111011010011111000101 3162 3163 As you would expect, gen_random_bits() returns a bitvector object. 3164 3165 The bulk of the work here is done by calling random.getrandbits( 3166 width) which returns an integer whose binary code representation 3167 will NOT BE LARGER than the argument 'width'. When random numbers 3168 are generated as candidates for primes, you often want to make sure 3169 that the random number thus created spans the full width specified 3170 by 'width' and that the number is odd. This we do by setting the 3171 two most significant bits and the least significant bit. 3172 ''' 3173 import random 3174 candidate = random.getrandbits( width ) 3175 candidate |= 1 3176 candidate |= (1 << width-1) 3177 candidate |= (2 << width-3) 3178 return BitVector( intVal = candidate ) 3179 3180 # For backward compatibility: 3181 gen_rand_bits_for_prime = gen_random_bits 3182 3183 def min_canonical(self): 3184 ''' 3185 This method returns the "canonical" form of a BitVector instance that is obtained by 3186 circularly rotating the bit pattern through all possible shifts and returning the 3187 pattern with the maximum number of leading zeros. This is also the minimum int value 3188 version of a bit pattern. This method is useful in the "Local Binary Pattern" 3189 algorithm for characterizing image textures. If you are curious as to how, see my 3190 tutorial on "Measuring Texture and Color in Images." 3191 ''' 3192 intvals_for_circular_shifts = [int(self << 1) for _ in range(len(self))] 3193 return BitVector( intVal = min(intvals_for_circular_shifts), size = len(self)) 3194 3195 3196#-------------------------------- BitVectorIterator Class ----------------------------------- 3197 3198class BitVectorIterator: 3199 def __init__( self, bitvec ): 3200 self.items = [] 3201 for i in range( bitvec.size ): 3202 self.items.append( bitvec._getbit(i) ) 3203 self.index = -1 3204 def __iter__( self ): 3205 return self 3206 def next( self ): 3207 self.index += 1 3208 if self.index < len( self.items ): 3209 return self.items[ self.index ] 3210 else: 3211 raise StopIteration 3212 __next__ = next 3213 3214#----------------------------------- End of Class Definition ------------------------------- 3215 3216#---------------------------------- Test Code Follows -------------------------------- 3217 3218if __name__ == '__main__': 3219 3220 # Construct an EMPTY bit vector (a bit vector of size 0): 3221 print("\nConstructing an EMPTY bit vector (a bit vector of size 0):") 3222 bv1 = BitVector( size = 0 ) 3223 print(bv1) # no output 3224 3225 # Construct a bit vector of size 2: 3226 print("\nConstructing a bit vector of size 2:") 3227 bv2 = BitVector( size = 2 ) 3228 print(bv2) # 00 3229 3230 # Joining two bit vectors: 3231 print("\nConcatenating two previously constructed bit vectors:") 3232 result = bv1 + bv2 3233 print(result) # 00 3234 3235 # Construct a bit vector with a tuple of bits: 3236 print("\nConstructing a bit vector from a tuple of bits:") 3237 bv = BitVector(bitlist=(1, 0, 0, 1)) 3238 print(bv) # 1001 3239 3240 # Construct a bit vector with a list of bits: 3241 print("\nConstruct a bit vector from a list of bits:") 3242 bv = BitVector(bitlist=[1, 1, 0, 1]) 3243 print(bv) # 1101 3244 3245 # Construct a bit vector from an integer 3246 bv = BitVector(intVal=5678) 3247 print("\nBit vector constructed from integer 5678:") 3248 print(bv) # 1011000101110 3249 print("\nBit vector constructed from integer 0:") 3250 bv = BitVector(intVal=0) 3251 print(bv) # 0 3252 print("\nBit vector constructed from integer 2:") 3253 bv = BitVector(intVal=2) 3254 print(bv) # 10 3255 print("\nBit vector constructed from integer 3:") 3256 bv = BitVector(intVal=3) 3257 print(bv) # 11 3258 print("\nBit vector constructed from integer 123456:") 3259 bv = BitVector(intVal=123456) 3260 print(bv) # 11110001001000000 3261 print("\nInt value of the previous bit vector as computed by int_val():") 3262 print(bv.int_val()) # 123456 3263 print("\nInt value of the previous bit vector as computed by int():") 3264 print(int(bv)) # 123456 3265 3266 # Construct a bit vector from a very large integer: 3267 x = 12345678901234567890123456789012345678901234567890123456789012345678901234567890 3268 bv = BitVector(intVal=x) 3269 print("\nHere is a bit vector constructed from a very large integer:") 3270 print(bv) 3271 print("The integer value of the above bit vector is:%d" % int(bv)) 3272 3273 # Construct a bit vector directly from a file-like object: 3274 import io 3275 x = "111100001111" 3276 x = "" 3277 if sys.version_info[0] == 3: 3278 x = "111100001111" 3279 else: 3280 x = unicode("111100001111") 3281 fp_read = io.StringIO(x) 3282 bv = BitVector( fp = fp_read ) 3283 print("\nBit vector constructed directed from a file like object:") 3284 print(bv) # 111100001111 3285 3286 # Construct a bit vector directly from a bit string: 3287 bv = BitVector( bitstring = '00110011' ) 3288 print("\nBit Vector constructed directly from a bit string:") 3289 print(bv) # 00110011 3290 3291 bv = BitVector(bitstring = '') 3292 print("\nBit Vector constructed directly from an empty bit string:") 3293 print(bv) # nothing 3294 print("\nInteger value of the previous bit vector:") 3295 print(bv.int_val()) # 0 3296 3297 print("\nConstructing a bit vector from the textstring 'hello':") 3298 bv3 = BitVector(textstring = "hello") 3299 print(bv3) 3300 mytext = bv3.get_bitvector_in_ascii() 3301 print("Text recovered from the previous bitvector: ") 3302 print(mytext) # hello 3303 print("\nConstructing a bit vector from the textstring 'hello\\njello':") 3304 bv3 = BitVector(textstring = "hello\njello") 3305 print(bv3) 3306 mytext = bv3.get_bitvector_in_ascii() 3307 print("Text recovered from the previous bitvector:") 3308 print(mytext) # hello 3309 # jello 3310 3311 print("\nConstructing a bit vector from the hexstring '68656c6c6f':") 3312 bv4 = BitVector(hexstring = "68656c6c6f") 3313 print(bv4) 3314 myhexstring = bv4.get_bitvector_in_hex() 3315 print("Hex string recovered from the previous bitvector: ") 3316 print(myhexstring) # 68656c6c6f 3317 3318 print("\nDemonstrating the raw bytes mode of constructing a bit vector (useful for reading public and private keys):") 3319 mypubkey = 'ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEA5amriY96HQS8Y/nKc8zu3zOylvpOn3vzMmWwrtyDy+aBvns4UC1RXoaD9rDKqNNMCBAQwWDsYwCAFsrBzbxRQONHePX8lRWgM87MseWGlu6WPzWGiJMclTAO9CTknplG9wlNzLQBj3dP1M895iLF6jvJ7GR+V3CRU6UUbMmRvgPcsfv6ec9RRPm/B8ftUuQICL0jt4tKdPG45PBJUylHs71FuE9FJNp01hrj1EMFObNTcsy9zuis0YPyzArTYSOUsGglleExAQYi7iLh17pAa+y6fZrGLsptgqryuftN9Q4NqPuTiFjlqRowCDU7sSxKDgU7bzhshyVx3+pzXO4D2Q== kak@pixie' 3320 import base64 3321 if sys.version_info[0] == 3: 3322 import binascii 3323 keydata = base64.b64decode(bytes(mypubkey.split(None)[1], 'utf-8')) 3324 else: 3325 keydata = base64.b64decode(mypubkey.split(None)[1]) 3326 bv = BitVector( rawbytes = keydata ) 3327 print(bv) 3328 3329 # Test array-like indexing for a bit vector: 3330 bv = BitVector( bitstring = '110001' ) 3331 print("\nPrints out bits individually from bitstring 110001:") 3332 print(bv[0], bv[1], bv[2], bv[3], bv[4], bv[5]) # 1 1 0 0 0 1 3333 print("\nSame as above but using negative array indexing:") 3334 print(bv[-1], bv[-2], bv[-3], bv[-4], bv[-5], bv[-6]) # 1 0 0 0 1 1 3335 3336 # Test setting bit values with positive and negative 3337 # accessors: 3338 bv = BitVector( bitstring = '1111' ) 3339 print("\nBitstring for 1111:") 3340 print(bv) # 1111 3341 3342 print("\nReset individual bits of above vector:") 3343 bv[0]=0;bv[1]=0;bv[2]=0;bv[3]=0 3344 print(bv) # 0000 3345 print("\nDo the same as above with negative indices:") 3346 bv[-1]=1;bv[-2]=1;bv[-4]=1 3347 print(bv) # 1011 3348 3349 print("\nCheck equality and inequality ops:") 3350 bv1 = BitVector( bitstring = '00110011' ) 3351 bv2 = BitVector( bitlist = [0,0,1,1,0,0,1,1] ) 3352 print(bv1 == bv2) # True 3353 print(bv1 != bv2) # False 3354 print(bv1 < bv2) # False 3355 print(bv1 <= bv2) # True 3356 bv3 = BitVector( intVal = 5678 ) 3357 print(bv3.int_val()) # 5678 3358 print(bv3) # 1011000101110 3359 print(bv1 == bv3) # False 3360 print(bv3 > bv1) # True 3361 print(bv3 >= bv1) # True 3362 3363 # Write a bit vector to a file like object 3364 fp_write = io.StringIO() 3365 bv.write_bits_to_fileobject( fp_write ) 3366 print("\nGet bit vector written out to a file-like object:") 3367 print(fp_write.getvalue()) # 1011 3368 3369 print("\nExperiments with bitwise logical operations:") 3370 bv3 = bv1 | bv2 3371 print(bv3) # 00110011 3372 bv3 = bv1 & bv2 3373 print(bv3) # 00110011 3374 bv3 = bv1 + bv2 3375 print(bv3) # 0011001100110011 3376 bv4 = BitVector( size = 3 ) 3377 print(bv4) # 000 3378 bv5 = bv3 + bv4 3379 print(bv5) # 0011001100110011000 3380 bv6 = ~bv5 3381 print(bv6) # 1100110011001100111 3382 bv7 = bv5 & bv6 3383 print(bv7) # 0000000000000000000 3384 bv7 = bv5 | bv6 3385 print(bv7) # 1111111111111111111 3386 3387 print("\nTry logical operations on bit vectors of different sizes:") 3388 print(BitVector( intVal = 6 ) ^ BitVector( intVal = 13 )) # 1011 3389 print(BitVector( intVal = 6 ) & BitVector( intVal = 13 )) # 0100 3390 print(BitVector( intVal = 6 ) | BitVector( intVal = 13 )) # 1111 3391 3392 print(BitVector( intVal = 1 ) ^ BitVector( intVal = 13 )) # 1100 3393 print(BitVector( intVal = 1 ) & BitVector( intVal = 13 )) # 0001 3394 print(BitVector( intVal = 1 ) | BitVector( intVal = 13 )) # 1101 3395 3396 print("\nExperiments with setbit() and len():") 3397 bv7[7] = 0 3398 print(bv7) # 1111111011111111111 3399 print(len( bv7 )) # 19 3400 bv8 = (bv5 & bv6) ^ bv7 3401 print(bv8) # 1111111011111111111 3402 3403 print("\nConstruct a bit vector from what is in the file testinput1.txt:") 3404 bv = BitVector( filename = 'TestBitVector/testinput1.txt' ) 3405 #print bv # nothing to show 3406 bv1 = bv.read_bits_from_file(64) 3407 print("\nPrint out the first 64 bits read from the file:") 3408 print(bv1) 3409 # 0100000100100000011010000111010101101110011001110111001001111001 3410 print("\nRead the next 64 bits from the same file:") 3411 bv2 = bv.read_bits_from_file(64) 3412 print(bv2) 3413 # 0010000001100010011100100110111101110111011011100010000001100110 3414 print("\nTake xor of the previous two bit vectors:") 3415 bv3 = bv1 ^ bv2 3416 print(bv3) 3417 # 0110000101000010000110100001101000011001000010010101001000011111 3418 3419 print("\nExperiment with dividing an even-sized vector into two:") 3420 [bv4, bv5] = bv3.divide_into_two() 3421 print(bv4) # 01100001010000100001101000011010 3422 print(bv5) # 00011001000010010101001000011111 3423 3424 # Permute a bit vector: 3425 print("\nWe will use this bit vector for experiments with permute()") 3426 bv1 = BitVector( bitlist = [1, 0, 0, 1, 1, 0, 1] ) 3427 print(bv1) # 1001101 3428 3429 bv2 = bv1.permute( [6, 2, 0, 1] ) 3430 print("\nPermuted and contracted form of the previous bit vector:") 3431 print(bv2) # 1010 3432 3433 print("\nExperiment with writing an internally generated bit vector out to a disk file:") 3434 bv1 = BitVector( bitstring = '00001010' ) 3435 FILEOUT = open( 'TestBitVector/test.txt', 'wb' ) 3436 bv1.write_to_file( FILEOUT ) 3437 FILEOUT.close() 3438 bv2 = BitVector( filename = 'TestBitVector/test.txt' ) 3439 bv3 = bv2.read_bits_from_file( 32 ) 3440 print("\nDisplay bit vectors written out to file and read back from the file and their respective lengths:") 3441 print( str(bv1) + " " + str(bv3)) 3442 print(str(len(bv1)) + " " + str(len(bv3))) 3443 3444 print("\nExperiments with reading a file from the beginning to end:") 3445 bv = BitVector( filename = 'TestBitVector/testinput4.txt' ) 3446 print("\nHere are all the bits read from the file:") 3447 while (bv.more_to_read): 3448 bv_read = bv.read_bits_from_file( 64 ) 3449 print(bv_read) 3450 print("\n") 3451 3452 print("\nExperiment with closing a file object and start extracting bit vectors from the file from the beginning again:") 3453 bv.close_file_object() 3454 bv = BitVector( filename = 'TestBitVector/testinput4.txt' ) 3455 bv1 = bv.read_bits_from_file(64) 3456 print("\nHere are all the first 64 bits read from the file again after the file object was closed and opened again:") 3457 print(bv1) 3458 FILEOUT = open( 'TestBitVector/testinput5.txt', 'wb' ) 3459 bv1.write_to_file( FILEOUT ) 3460 FILEOUT.close() 3461 3462 print("\nExperiment in 64-bit permutation and unpermutation of the previous 64-bit bitvector:") 3463 print("The permutation array was generated separately by the Fisher-Yates shuffle algorithm:") 3464 bv2 = bv1.permute( [22, 47, 33, 36, 18, 6, 32, 29, 54, 62, 4, 3465 9, 42, 39, 45, 59, 8, 50, 35, 20, 25, 49, 3466 15, 61, 55, 60, 0, 14, 38, 40, 23, 17, 41, 3467 10, 57, 12, 30, 3, 52, 11, 26, 43, 21, 13, 3468 58, 37, 48, 28, 1, 63, 2, 31, 53, 56, 44, 24, 3469 51, 19, 7, 5, 34, 27, 16, 46] ) 3470 print("Permuted bit vector:") 3471 print(bv2) 3472 3473 bv3 = bv2.unpermute( [22, 47, 33, 36, 18, 6, 32, 29, 54, 62, 4, 3474 9, 42, 39, 45, 59, 8, 50, 35, 20, 25, 49, 3475 15, 61, 55, 60, 0, 14, 38, 40, 23, 17, 41, 3476 10, 57, 12, 30, 3, 52, 11, 26, 43, 21, 13, 3477 58, 37, 48, 28, 1, 63, 2, 31, 53, 56, 44, 24, 3478 51, 19, 7, 5, 34, 27, 16, 46] ) 3479 print("Unpurmute the bit vector:") 3480 print(bv3) 3481 3482 print("\nTry circular shifts to the left and to the right for the following bit vector:") 3483 print(bv3) # 0100000100100000011010000111010101101110011001110111001001111001 3484 print("\nCircular shift to the left by 7 positions:") 3485 bv3 << 7 3486 print(bv3) # 1001000000110100001110101011011100110011101110010011110010100000 3487 3488 print("\nCircular shift to the right by 7 positions:") 3489 bv3 >> 7 3490 print(bv3) # 0100000100100000011010000111010101101110011001110111001001111001 3491 3492 print("Test len() on the above bit vector:") 3493 print(len( bv3 )) # 64 3494 3495 print("\nTest forming a [5:22] slice of the above bit vector:") 3496 bv4 = bv3[5:22] 3497 print(bv4) # 00100100000011010 3498 3499 print("\nTest the iterator:") 3500 for bit in bv4: 3501 print(bit) # 0 0 1 0 0 1 0 0 0 0 0 0 1 1 0 1 0 3502 3503 print("\nDemonstrate padding a bit vector from left:") 3504 bv = BitVector(bitstring = '101010') 3505 bv.pad_from_left(4) 3506 print(bv) # 0000101010 3507 3508 print("\nDemonstrate padding a bit vector from right:") 3509 bv.pad_from_right(4) 3510 print(bv) # 00001010100000 3511 3512 print("\nTest the syntax 'if bit_vector_1 in bit_vector_2' syntax:") 3513 try: 3514 bv1 = BitVector(bitstring = '0011001100') 3515 bv2 = BitVector(bitstring = '110011') 3516 if bv2 in bv1: 3517 print("%s is in %s" % (bv2, bv1)) 3518 else: 3519 print("%s is not in %s" % (bv2, bv1)) 3520 except ValueError as arg: 3521 print("Error Message: " + str(arg)) 3522 3523 print("\nTest the size modifier when a bit vector is initialized with the intVal method:") 3524 bv = BitVector(intVal = 45, size = 16) 3525 print(bv) # 0000000000101101 3526 bv = BitVector(intVal = 0, size = 8) 3527 print(bv) # 00000000 3528 bv = BitVector(intVal = 1, size = 8) 3529 print(bv) # 00000001 3530 3531 print("\nTesting slice assignment:") 3532 bv1 = BitVector( size = 25 ) 3533 print("bv1= " + str(bv1)) # 0000000000000000000000000 3534 bv2 = BitVector( bitstring = '1010001' ) 3535 print("bv2= " + str(bv2)) # 1010001 3536 bv1[6:9] = bv2[0:3] 3537 print("bv1= " + str(bv1)) # 0000001010000000000000000 3538 bv1[:5] = bv1[5:10] 3539 print("bv1= " + str(bv1)) # 0101001010000000000000000 3540 bv1[20:] = bv1[5:10] 3541 print("bv1= " + str(bv1)) # 0101001010000000000001010 3542 bv1[:] = bv1[:] 3543 print("bv1= " + str(bv1)) # 0101001010000000000001010 3544 bv3 = bv1[:] 3545 print("bv3= " + str(bv3)) # 0101001010000000000001010 3546 3547 print("\nTesting reset function:") 3548 bv1.reset(1) 3549 print("bv1= " + str(bv1)) # 1111111111111111111111111 3550 print(bv1[3:9].reset(0)) # 000000 3551 print(bv1[:].reset(0)) # 0000000000000000000000000 3552 3553 print("\nTesting count_bit():") 3554 bv = BitVector(intVal = 45, size = 16) 3555 y = bv.count_bits() 3556 print(y) # 4 3557 bv = BitVector(bitstring = '100111') 3558 print(bv.count_bits()) # 4 3559 bv = BitVector(bitstring = '00111000') 3560 print(bv.count_bits()) # 3 3561 bv = BitVector(bitstring = '001') 3562 print(bv.count_bits()) # 1 3563 bv = BitVector(bitstring = '00000000000000') 3564 print(bv.count_bits()) # 0 3565 3566 print("\nTest set_value idea:") 3567 bv = BitVector(intVal = 7, size =16) 3568 print(bv) # 0000000000000111 3569 bv.set_value(intVal = 45) 3570 print(bv) # 101101 3571 3572 print("\nTesting count_bits_sparse():") 3573 bv = BitVector(size = 2000000) 3574 bv[345234] = 1 3575 bv[233]=1 3576 bv[243]=1 3577 bv[18]=1 3578 bv[785] =1 3579 print("The number of bits set: " + str(bv.count_bits_sparse())) # 5 3580 3581 print("\nTesting Jaccard similarity and distance and Hamming distance:") 3582 bv1 = BitVector(bitstring = '11111111') 3583 bv2 = BitVector(bitstring = '00101011') 3584 print("Jaccard similarity: " + str(bv1.jaccard_similarity(bv2))) # 0.5 3585 print("Jaccard distance: " + str(bv1.jaccard_distance(bv2))) # 0.5 3586 print("Hamming distance: " + str(bv1.hamming_distance(bv2))) # 4 3587 3588 print("\nTesting next_set_bit():") 3589 bv = BitVector(bitstring = '00000000000001') 3590 print(bv.next_set_bit(5)) # 13 3591 bv = BitVector(bitstring = '000000000000001') 3592 print(bv.next_set_bit(5)) # 14 3593 bv = BitVector(bitstring = '0000000000000001') 3594 print(bv.next_set_bit(5)) # 15 3595 bv = BitVector(bitstring = '00000000000000001') 3596 print(bv.next_set_bit(5)) # 16 3597 3598 print("\nTesting rank_of_bit_set_at_index():") 3599 bv = BitVector(bitstring = '01010101011100') 3600 print(bv.rank_of_bit_set_at_index( 10 )) # 6 3601 3602 print("\nTesting is_power_of_2():") 3603 bv = BitVector(bitstring = '10000000001110') 3604 print("int value: " + str(int(bv))) # 826 3605 print(bv.is_power_of_2()) # False 3606 print("\nTesting is_power_of_2_sparse():") 3607 print(bv.is_power_of_2_sparse()) # False 3608 3609 print("\nTesting reverse():") 3610 bv = BitVector(bitstring = '0001100000000000001') 3611 print("original bv: " + str(bv)) # 0001100000000000001 3612 print("reversed bv: " + str(bv.reverse())) # 1000000000000011000 3613 3614 print("\nTesting Greatest Common Divisor (gcd):") 3615 bv1 = BitVector(bitstring = '01100110') 3616 print("first arg bv: " + str(bv1) + " of int value: " + str(int(bv1))) #102 3617 bv2 = BitVector(bitstring = '011010') 3618 print("second arg bv: " + str(bv2) + " of int value: " + str(int(bv2)))# 26 3619 bv = bv1.gcd(bv2) 3620 print("gcd bitvec is: " + str(bv) + " of int value: " + str(int(bv))) # 2 3621 3622 print("\nTesting multiplicative_inverse:") 3623 bv_modulus = BitVector(intVal = 32) 3624 print("modulus is bitvec: " + str(bv_modulus) + " of int value: " + str(int(bv_modulus))) 3625 bv = BitVector(intVal = 17) 3626 print("bv: " + str(bv) + " of int value: " + str(int(bv))) 3627 result = bv.multiplicative_inverse(bv_modulus) 3628 if result is not None: 3629 print("MI bitvec is: " + str(result) + " of int value: " + str(int(result))) 3630 else: print("No multiplicative inverse in this case") 3631 # 17 3632 print("\nTest multiplication in GF(2):") 3633 a = BitVector(bitstring='0110001') 3634 b = BitVector(bitstring='0110') 3635 c = a.gf_multiply(b) 3636 print("Product of a=" + str(a) + " b=" + str(b) + " is " + str(c)) 3637 # 00010100110 3638 3639 print("\nTest division in GF(2^n):") 3640 mod = BitVector(bitstring='100011011') # AES modulus 3641 n = 8 3642 a = BitVector(bitstring='11100010110001') 3643 quotient, remainder = a.gf_divide_by_modulus(mod, n) 3644 print("Dividing a=" + str(a) + " by mod=" + str(mod) + " in GF(2^8) returns the quotient " 3645 + str(quotient) + " and the remainder " + str(remainder)) 3646 # 10001111 3647 3648 print("\nTest modular multiplication in GF(2^n):") 3649 modulus = BitVector(bitstring='100011011') # AES modulus 3650 n = 8 3651 a = BitVector(bitstring='0110001') 3652 b = BitVector(bitstring='0110') 3653 c = a.gf_multiply_modular(b, modulus, n) 3654 print("Modular product of a=" + str(a) + " b=" + str(b) + " in GF(2^8) is " + str(c)) 3655 # 10100110 3656 3657 print("\nTest multiplicative inverses in GF(2^3) with " + "modulus polynomial = x^3 + x + 1:") 3658 print("Find multiplicative inverse of a single bit array") 3659 modulus = BitVector(bitstring='100011011') # AES modulus 3660 n = 8 3661 a = BitVector(bitstring='00110011') 3662 mi = a.gf_MI(modulus,n) 3663 print("Multiplicative inverse of " + str(a) + " in GF(2^8) is " + str(mi)) 3664 3665 print("\nIn the following three rows shown, the first row shows the " +\ 3666 "\nbinary code words, the second the multiplicative inverses," +\ 3667 "\nand the third the product of a binary word with its" +\ 3668 "\nmultiplicative inverse:\n") 3669 mod = BitVector(bitstring = '1011') 3670 n = 3 3671 bitarrays = [BitVector(intVal=x, size=n) for x in range(1,2**3)] 3672 mi_list = [x.gf_MI(mod,n) for x in bitarrays] 3673 mi_str_list = [str(x.gf_MI(mod,n)) for x in bitarrays] 3674 print("bit arrays in GF(2^3): " + str([str(x) for x in bitarrays])) 3675 print("multiplicati_inverses: " + str(mi_str_list)) 3676 3677 products = [ str(bitarrays[i].gf_multiply_modular(mi_list[i], mod, n)) \ 3678 for i in range(len(bitarrays)) ] 3679 print("bit_array * multi_inv: " + str(products)) 3680 3681 # UNCOMMENT THE FOLLOWING LINES FOR 3682 # DISPLAYING ALL OF THE MULTIPLICATIVE 3683 # INVERSES IN GF(2^8) WITH THE AES MODULUS: 3684 3685# print("\nMultiplicative inverses in GF(2^8) with " + \ 3686# "modulus polynomial x^8 + x^4 + x^3 + x + 1:") 3687# print("\n(This may take a few seconds)\n") 3688# mod = BitVector(bitstring = '100011011') 3689# n = 8 3690# bitarrays = [BitVector(intVal=x, size=n) for x in range(1,2**8)] 3691# mi_list = [x.gf_MI(mod,n) for x in bitarrays] 3692# mi_str_list = [str(x.gf_MI(mod,n)) for x in bitarrays] 3693# print("\nMultiplicative Inverses:\n\n" + str(mi_str_list)) 3694# products = [ str(bitarrays[i].gf_multiply_modular(mi_list[i], mod, n)) \ 3695# for i in range(len(bitarrays)) ] 3696# print("\nShown below is the product of each binary code word " +\ 3697# "in GF(2^3) and its multiplicative inverse:\n\n") 3698# print(products) 3699 3700 print("\nExperimenting with runs():") 3701 bv = BitVector(bitlist = (1, 0, 0, 1)) 3702 print("For bit vector: " + str(bv)) 3703 print(" the runs are: " + str(bv.runs())) 3704 bv = BitVector(bitlist = (1, 0)) 3705 print("For bit vector: " + str(bv)) 3706 print(" the runs are: " + str(bv.runs())) 3707 bv = BitVector(bitlist = (0, 1)) 3708 print("For bit vector: " + str(bv)) 3709 print(" the runs are: " + str(bv.runs())) 3710 bv = BitVector(bitlist = (0, 0, 0, 1)) 3711 print("For bit vector: " + str(bv)) 3712 print(" the runs are: " + str(bv.runs())) 3713 bv = BitVector(bitlist = (0, 1, 1, 0)) 3714 print("For bit vector: " + str(bv)) 3715 print(" the runs are: " + str(bv.runs())) 3716 3717 print("\nExperiments with chained invocations of circular shifts:") 3718 bv = BitVector(bitlist = (1,1, 1, 0, 0, 1)) 3719 print(bv) 3720 bv >> 1 3721 print(bv) 3722 bv >> 1 >> 1 3723 print(bv) 3724 bv = BitVector(bitlist = (1,1, 1, 0, 0, 1)) 3725 print(bv) 3726 bv << 1 3727 print(bv) 3728 bv << 1 << 1 3729 print(bv) 3730 3731 print("\nExperiments with chained invocations of NON-circular shifts:") 3732 bv = BitVector(bitlist = (1,1, 1, 0, 0, 1)) 3733 print(bv) 3734 bv.shift_right(1) 3735 print(bv) 3736 bv.shift_right(1).shift_right(1) 3737 print(bv) 3738 bv = BitVector(bitlist = (1,1, 1, 0, 0, 1)) 3739 print(bv) 3740 bv.shift_left(1) 3741 print(bv) 3742 bv.shift_left(1).shift_left(1) 3743 print(bv) 3744 3745 # UNCOMMENT THE FOLLOWING LINES TO TEST THE 3746 # PRIMALITY TESTING METHOD. IT SHOULD SHOW 3747 # THAT ALL OF THE FOLLOWING NUMBERS ARE PRIME: 3748# print("\nExperiments with primality testing. If a number is not prime, its primality " + 3749# "test output must be zero. Otherwise, it should a number very close to 1.0.") 3750# primes = [179, 233, 283, 353, 419, 467, 547, 607, 661, 739, 811, 877, \ 3751# 947, 1019, 1087, 1153, 1229, 1297, 1381, 1453, 1523, 1597, \ 3752# 1663, 1741, 1823, 1901, 7001, 7109, 7211, 7307, 7417, 7507, \ 3753# 7573, 7649, 7727, 7841] 3754# for p in primes: 3755# bv = BitVector(intVal = p) 3756# check = bv.test_for_primality() 3757# print("The primality test for " + str(p) + ": " + str(check)) 3758 3759 print("\nGenerate 32-bit wide candidate for primality testing:") 3760 bv = BitVector(intVal = 0) 3761 bv = bv.gen_random_bits(32) 3762 print(bv) 3763 check = bv.test_for_primality() 3764 print("The primality test for " + str(int(bv)) + ": " + str(check)) 3765