| |
- BitVectorIterator
- __builtin__.object
-
- BitVector
class BitVector(__builtin__.object) |
| |
Methods defined here:
- __add__(self, other)
- Because __add__ is supplied, you can always join two bitvectors by
bitvec3 = bitvec1 + bitvec2
bitvec3 is a new bitvector object that contains all the bits of
bitvec1 followed by all the bits of bitvec2.
- __and__(self, other)
- Take a bitwise 'AND' of the bit vector on which the method is invoked with
the argument bit vector. Return the result as a new bit vector. If the two
bit vectors are not of the same size, pad the shorter one with zeros from the
left.
- __contains__(self, otherBitVec)
- This supports 'if x in y' and 'if x not in y' syntax for bit vectors.
- __eq__(self, other)
- # Compare two bit vectors:
- __ge__(self, other)
- __getitem__ = _getbit(self, pos)
- __getslice__(self, i, j)
- Fetch slices with [i:j], [:], etc.
- __gt__(self, other)
- __init__(self, *args, **kwargs)
- __int__ = int_val(self)
- __invert__(self)
- Invert the bits in the bit vector on which the method is invoked
and return the result as a new bit vector.
- __iter__(self)
- To allow iterations over a bit vector by supporting the 'for bit in
bit_vector' syntax:
- __le__(self, other)
- __len__ = _getsize(self)
- __lshift__(self, n)
- Left circular rotation of a BitVector through N positions can be
carried out by
bitvec << N
This operator overloading is made possible by implementing the
__lshift__ method defined here. Note that this operator returns
the bitvector on which it is invoked. This allows for a chained
invocation of the operator
- __lt__(self, other)
- __ne__(self, other)
- __or__(self, other)
- Take a bitwise 'OR' of the bit vector on which the method is invoked with the
argument bit vector. Return the result as a new bit vector. If the two bit
vectors are not of the same size, pad the shorter one with zero's from the
left.
- __rshift__(self, n)
- Right circular rotation of a BitVector through N positions can be
carried out by
bitvec >> N
This operator overloading is made possible by implementing the
__rshift__ method defined here. Note that this operator returns
the bitvector on which it is invoked. This allows for a chained
invocation of the operator.
- __setitem__(self, pos, item)
- This is needed for both slice assignments and for index assignments. It
checks the types of pos and item to see if the call is for slice assignment.
For slice assignment, pos must be of type 'slice' and item of type BitVector.
For index assignment, the argument types are checked in the _setbit() method.
- __str__(self)
- To create a print representation
- __xor__(self, other)
- Take a bitwise 'XOR' of the bit vector on which the method is invoked with
the argument bit vector. Return the result as a new bit vector. If the two
bit vectors are not of the same size, pad the shorter one with zeros from the
left.
- circular_rot_left(self)
- This is merely another implementation of the method
circular_rotate_left_by_one() shown above. This one does NOT use map
functions. This method carries out a one-bit left circular shift of a bit
vector.
- circular_rot_right(self)
- This is merely another implementation of the method
circular_rotate_right_by_one() shown above. This one does NOT use map
functions. This method does a one-bit right circular shift of a bit vector.
- circular_rotate_left_by_one(self)
- For a one-bit in-place left circular shift
- circular_rotate_right_by_one(self)
- For a one-bit in-place right circular shift
- close_file_object(self)
- When you construct bitvectors by block scanning a disk file, after
you are done, you can call this method to close the file object
that was created to read the file:
bv = BitVector(filename = 'somefile')
bv1 = bv.read_bits_from_file(64)
bv.close_file_object()
The constructor call in the first statement creates a file object
for reading the bits. It is this file object that is closed when
you call close_file_object().
- count_bits(self)
- You can count the number of bits set in a BitVector instance by
bv = BitVector(bitstring = '100111')
print(bv.count_bits()) # 4
A call to count_bits() returns an integer value that is equal to
the number of bits set in the bitvector.
- count_bits_sparse(self)
- For folks who use bit vectors with millions of bits in them but
with only a few bits set, your bit counting will go much, much
faster if you call count_bits_sparse() instead of count_bits():
However, for dense bitvectors, I expect count_bits() to work
faster.
# a BitVector with 2 million bits:
bv = BitVector(size = 2000000)
bv[345234] = 1
bv[233]=1
bv[243]=1
bv[18]=1
bv[785] =1
print(bv.count_bits_sparse()) # 5
A call to count_bits_sparse() returns an integer whose value is the
number of bits set in the bitvector. Rhiannon, who contributed
this method, estimates that if a bit vector with over 2 millions
bits has only five bits set, this will return the answer in 1/18 of
the time taken by the count_bits() method. Rhianon's implementation
is based on an algorithm generally known as the Brian Kernighan's
way, although its antecedents predate its mention by Kernighan and
Ritchie.
- deep_copy(self)
- You can make a deep copy of a bitvector by
bitvec_copy = bitvec.deep_copy()
Subsequently, any alterations to either of the bitvector objects
bitvec and bitvec_copy will not affect the other.
- divide_into_two(self)
- A bitvector containing an even number of bits can be divided into
two equal parts by
[left_half, right_half] = bitvec.divide_into_two()
where left_half and right_half hold references to the two returned
bitvectors. The method throws an exception when called on a
bitvector with an odd number of bits.
- gcd(self, other)
- Using Euclid's Algorithm, returns the greatest common divisor of
the integer value of the bitvector on which the method is invoked
and the integer value of the argument bitvector:
bv1 = BitVector(bitstring = '01100110') # int val: 102
bv2 = BitVector(bitstring = '011010') # int val: 26
bv = bv1.gcd(bv2)
print(int(bv)) # 2
The result returned by gcd() is a bitvector object.
- gen_rand_bits_for_prime = gen_random_bits(self, width)
- gen_random_bits(self, width)
- You can generate a bitvector with random bits with the bits
spanning a specified width. For example, if you wanted a random
bit vector to fully span 32 bits, you would say
bv = BitVector(intVal = 0)
bv = bv.gen_random_bits(32)
print(bv) # 11011010001111011010011111000101
As you would expect, gen_random_bits() returns a bitvector object.
The bulk of the work here is done by calling random.getrandbits(
width) which returns an integer whose binary code representation
will NOT BE LARGER than the argument 'width'. When random numbers
are generated as candidates for primes, you often want to make sure
that the random number thus created spans the full width specified
by 'width' and that the number is odd. This we do by setting the
two most significant bits and the least significant bit.
- getHexStringFromBitVector = get_bitvector_in_hex(self)
- getTextFromBitVector = get_bitvector_in_ascii(self)
- get_bitvector_in_ascii(self)
- You can call get_bitvector_in_ascii() to directly convert a bit
vector into a text string (this is a useful thing to do only if the
length of the vector is an integral multiple of 8 and every byte in
your bitvector has a print representation):
bv = BitVector(textstring = "hello")
print(bv) # 0110100001100101011011000110110001101111
mytext = bv3.get_bitvector_in_ascii()
print mytext # hello
This method is useful when you encrypt text through its bitvector
representation. After decryption, you can recover the text using
the call shown here. A call to get_bitvector_in_ascii() returns a
string.
- get_bitvector_in_hex(self)
- You can directly convert a bit vector into a hex string (this is a
useful thing to do only if the length of the vector is an integral
multiple of 4):
bv4 = BitVector(hexstring = "68656c6c6f")
print(bv4) # 0110100001100101011011000110110001101111
myhexstring = bv4.get_bitvector_in_hex()
print myhexstring # 68656c6c6
This method throws an exception if the size of the bitvector is not
a multiple of 4. The method returns a string that is formed by
scanning the bits from the left and replacing each sequence of 4
bits by its corresponding hex digit.
- get_hex_string_from_bitvector = get_bitvector_in_hex(self)
- get_text_from_bitvector = get_bitvector_in_ascii(self)
- gf_MI(self, mod, n)
- To calculate the multiplicative inverse of a bit vector in the
Galois Field GF(2^n) with respect to a modulus polynomial, call
gf_MI() as follows:
modulus = BitVector(bitstring = '100011011')
n = 8
a = BitVector(bitstring = '00110011')
multi_inverse = a.gf_MI(modulus, n)
print multi_inverse # 01101100
A call to gf_MI() returns a bitvector object.
- gf_divide = gf_divide_by_modulus(self, mod, n)
- gf_divide_by_modulus(self, mod, n)
- To divide a bitvector by a modulus bitvector in the Galois Field
GF(2^n):
mod = BitVector(bitstring='100011011') # AES modulus
n = 8
a = BitVector(bitstring='11100010110001')
quotient, remainder = a.gf_divide_by_modulus(mod, n)
print(quotient) # 00000000111010
print(remainder) # 10001111
What this example illustrates is dividing the bitvector a by the
modulus bitvector mod. For a more general division of one
bitvector a by another bitvector b, you would multiply a by the MI
of b, where MI stands for "multiplicative inverse" as returned by
the call to the method gf_MI(). A call to gf_divide_by_modulus()
returns two bitvectors, one for the quotient and the other for the
remainder.
- gf_multiply(self, b)
- If you want to multiply two bit patterns in GF(2):
a = BitVector(bitstring='0110001')
b = BitVector(bitstring='0110')
c = a.gf_multiply(b)
print(c) # 00010100110
As you would expect, in general, the bitvector returned by this
method is longer than the two operand bitvectors. A call to
gf_multiply() returns a bitvector object.
- gf_multiply_modular(self, b, mod, n)
- If you want to carry out modular multiplications in the Galois
Field GF(2^n):
modulus = BitVector(bitstring='100011011') # AES modulus
n = 8
a = BitVector(bitstring='0110001')
b = BitVector(bitstring='0110')
c = a.gf_multiply_modular(b, modulus, n)
print(c) # 10100110
The call to gf_multiply_modular() returns the product of the two
bitvectors a and b modulo the bitvector modulus in GF(2^8). A call
to gf_multiply_modular() returns is a bitvector object.
- hamming_distance(self, other)
- You can compare two bitvectors with the Hamming distance:
bv1 = BitVector(bitstring = '11111111')
bv2 = BitVector(bitstring = '00101011')
print(str(bv1.hamming_distance(bv2))) # 4
This method returns a number that is equal to the number of bit
positions in which the two operand bitvectors disagree.
- intValue = int_val(self)
- int_val(self)
- Return the integer value of a bitvector
- isPowerOf2 = is_power_of_2(self)
- isPowerOf2_sparse = is_power_of_2_sparse(self)
- is_power_of_2(self)
- You can test whether the integer value of a bit vector is a power of
two. (The sparse version of this method works much faster for very
long bit vectors.) However, the regular version defined here may
work faster for dense bit vectors.
bv = BitVector(bitstring = '10000000001110')
print(bv.is_power_of_2())
This predicate returns 1 for true and 0 for false.
- is_power_of_2_sparse(self)
- You can test whether the integer value of a bit vector is a power of
two. This sparse version works much faster for very long bit
vectors. (However, the regular version defined above may work
faster for dense bit vectors.)
bv = BitVector(bitstring = '10000000001110')
print(bv.is_power_of_2_sparse())
This predicate returns 1 for true and 0 for false.
- jaccard_distance(self, other)
- You can calculate the distance between two bitvectors using the
Jaccard distance coefficient.
bv1 = BitVector(bitstring = '11111111')
bv2 = BitVector(bitstring = '00101011')
print(str(bv1.jaccard_distance(bv2))) # 0.375
The value returned is a floating point number between 0 and 1.
- jaccard_similarity(self, other)
- You can calculate the similarity between two bitvectors using the
Jaccard similarity coefficient.
bv1 = BitVector(bitstring = '11111111')
bv2 = BitVector(bitstring = '00101011')
print bv1.jaccard_similarity(bv2) # 0.675
The value returned is a floating point number between 0 and 1.
- length(self)
- multiplicative_inverse(self, modulus)
- Using the Extended Euclid's Algorithm, this method calculates the
multiplicative inverse using normal integer arithmetic. [For such
inverses in a Galois Field GF(2^n), use the method gf_MI().]
bv_modulus = BitVector(intVal = 32)
bv = BitVector(intVal = 17)
bv_result = bv.multiplicative_inverse( bv_modulus )
if bv_result is not None:
print(str(int(bv_result))) # 17
else: print "No multiplicative inverse in this case"
What this example says is that the multiplicative inverse of 17
modulo 32 is 17. That is because 17 times 17 modulo 32 equals 1.
When using this method, you must test the returned value for
None. If the returned value is None, that means that the number
corresponding to the bitvector on which the method is invoked does
not possess a multiplicative-inverse with respect to the modulus.
When the multiplicative inverse exists, the result returned by
calling multiplicative_inverse() is a bitvector object.
- next_set_bit(self, from_index=0)
- Starting from a given bit position, you can find the position index
of the next set bit by
bv = BitVector(bitstring = '00000000000001')
print(bv.next_set_bit(5)) # 13
In this example, we are asking next_set_bit() to return the index
of the bit that is set after the bit position that is indexed 5. If
no next set bit is found, the method returns -1. A call to
next_set_bit() always returns a number. This method was
contributed originally by Jason Allum and updated subsequently by
John Gleeson.
- pad_from_left(self, n)
- You can pad a bitvector at its the left end with a designated number of
zeros with this method. This method returns the bitvector object on
which it is invoked. So you can think of this method as carrying
out an in-place extension of a bitvector (although, under the hood,
the extension is carried out by giving a new longer _vector
attribute to the bitvector object).
- pad_from_right(self, n)
- You can pad a bitvector at its right end with a designated number of
zeros with this method. This method returns the bitvector object on
which it is invoked. So you can think of this method as carrying
out an in-place extension of a bitvector (although, under the hood,
the extension is carried out by giving a new longer _vector
attribute to the bitvector object).
- permute(self, permute_list)
- This method returns a new bitvector object. Permuting a bitvector means
that you select its bits in the sequence specified by the argument
permute_list.
- rank_of_bit_set_at_index(self, position)
- You can measure the "rank" of a bit that is set at a given
position. Rank is the number of bits that are set up to the
position of the bit you are interested in.
bv = BitVector(bitstring = '01010101011100')
print(bv.rank_of_bit_set_at_index(10)) # 6
The value 6 returned by this call to rank_of_bit_set_at_index() is
the number of bits set up to the position indexed 10 (including
that position). This method throws an exception if there is no bit
set at the argument position. Otherwise, it returns the rank as a
number.
- read_bits_from_file(self, blocksize)
- You can construct bitvectors directly from the bits in a disk file
through the calls shown below. As you can see, this requires two
steps: First you make a call as illustrated by the first statement
below. The purpose of this call is to create a file object that is
associated with the variable bv. Subsequent calls to
read_bits_from_file(n) on this variable return a bitvector for each
block of n bits thus read. The read_bits_from_file() throws an
exception if the argument n is not a multiple of 8.
bv = BitVector(filename = 'somefile')
bv1 = bv.read_bits_from_file(64)
bv2 = bv.read_bits_from_file(64)
...
...
bv.close_file_object()
When reading a file as shown above, you can test the attribute
more_to_read of the bitvector object in order to find out if there
is more to read in the file. The while loop shown below reads all
of a file in 64-bit blocks:
bv = BitVector( filename = 'testinput4.txt' )
print("Here are all the bits read from the file:")
while (bv.more_to_read):
bv_read = bv.read_bits_from_file( 64 )
print(bv_read)
bv.close_file_object()
The size of the last bitvector constructed from a file corresponds
to how many bytes remain unread in the file at that point. It is
your responsibility to zero-pad the last bitvector appropriately
if, say, you are doing block encryption of the whole file.
- read_bits_from_fileobject(self, fp)
- This function is meant to read a bit string from a file like
object.
- reset(self, val)
- Resets a previously created BitVector to either all zeros or all ones
depending on the argument val. Returns self to allow for syntax like
bv = bv1[3:6].reset(1)
or
bv = bv1[:].reset(1)
- reverse(self)
- Given a bit vector, you can construct a bit vector with all the
bits reversed, in the sense that what was left to right before now
becomes right to left.
bv = BitVector(bitstring = '0001100000000000001')
print(str(bv.reverse()))
A call to reverse() returns a new bitvector object whose bits are
in reverse order in relation to the bits in the bitvector on which
the method is invoked.
- runs(self)
- You can extract from a bitvector the runs of 1's and 0's in the
vector as follows:
bv = BitVector(bitlist = (1,1, 1, 0, 0, 1))
print(str(bv.runs())) # ['111', '00', '1']
The object returned by runs() is a list of strings, with each
element of this list being a string of 1's and 0's.
- setValue = set_value(self, *args, **kwargs)
- set_value(self, *args, **kwargs)
- You can call set_value() to change the bit pattern associated with
a previously constructed bitvector object:
bv = BitVector(intVal = 7, size =16)
print(bv) # 0000000000000111
bv.set_value(intVal = 45)
print(bv) # 101101
You can think of this method as carrying out an in-place resetting
of the bit array in a bitvector. The method does not return
anything. The allowable modes for changing the internally stored
bit array for a bitvector are the same as for the constructor.
- shift_left(self, n)
- Call this method if you want to shift in-place a bitvector to the left
non-circularly. As a bitvector is shifted non-circularly to the
left, the exposed bit positions at the right end are filled with
zeros. This method returns the bitvector object on which it is
invoked. This is to allow for chained invocations of the method.
- shift_left_by_one(self)
- For a one-bit in-place left non-circular shift. Note that bitvector size
does not change. The leftmost bit that moves past the first element of the
bitvector is discarded and rightmost bit of the returned vector is set to
zero.
- shift_right(self, n)
- Call this method if you want to shift in-place a bitvector to the right
non-circularly. As a bitvector is shifted non-circularly to the
right, the exposed bit positions at the left end are filled with
zeros. This method returns the bitvector object on which it is
invoked. This is to allow for chained invocations of the method.
- shift_right_by_one(self)
- For a one-bit in-place right non-circular shift. Note that bitvector size
does not change. The rightmost bit that moves past the last element of the
bitvector is discarded and leftmost bit of the returned vector is set to
zero.
- test_for_primality(self)
- You can test whether a randomly generated bit vector is a prime
number using the probabilistic Miller-Rabin test
bv = BitVector(intVal = 0)
bv = bv.gen_random_bits(32)
check = bv.test_for_primality()
print(check)
The test_for_primality() methods returns a floating point number
close to 1 for prime numbers and 0 for composite numbers. The
actual value returned for a prime is the probability associated
with the determination of its primality.
- unpermute(self, permute_list)
- This method returns a new bitvector object. As indicated earlier
for the permute() method, permuting a bitvector means that you
select its bits in the sequence specified by the argument
permute_list. Calling unpermute() with the same argument
permute_list restores the sequence of bits to what it was in
the original bitvector.
- write_bits_to_fileobject(self, fp)
- You can write a bitvector directly to a stream object, as
illustrated by:
fp_write = io.StringIO()
bitvec.write_bits_to_fileobject(fp_write)
print(fp_write.getvalue())
This method does not return anything.
This function is meant to write a bitvector directly to a file like
object. Note that whereas 'write_to_file' method creates a memory
footprint that corresponds exactly to the bitvector, the
'write_bits_to_fileobject' actually writes out the 1's and 0's as
individual items to the file object. That makes this method
convenient for creating a string representation of a bitvector,
especially if you use the StringIO class, as shown in the test
code.
- write_to_file(self, file_out)
- You can write a bit vector directly to a file by calling
write_to_file(), as illustrated by the following example that reads
one bitvector from a file and then writes it to another file:
bv = BitVector(filename = 'input.txt')
bv1 = bv.read_bits_from_file(64)
print(bv1)
FILEOUT = open('output.bits', 'wb')
bv1.write_to_file(FILEOUT)
FILEOUT.close()
bv = BitVector(filename = 'output.bits')
bv2 = bv.read_bits_from_file(64)
print(bv2)
Sicne all file I/O is byte oriented, the method write_to_file()
throws an exception if the size of the bitvector on which the
method is invoked is not a multiple of 8. This method does not
return anything.
IMPORTANT FOR WINDOWS USERS: When writing an internally generated
bit vector out to a disk file, it is important to open
the file in the binary mode as shown. Otherwise, the
bit pattern 00001010 ('\n') in your bitstring will be
written out as 0000110100001010 ('\r\n'), which is
the linebreak on Windows machines.
Data descriptors defined here:
- __dict__
- dictionary for instance variables (if defined)
- __weakref__
- list of weak references to the object (if defined)
|
|