python - Why does my Sieve of Eratosthenes work faster with integers than with booleans? -


i wrote simple sieve of eratosthenes, uses list of ones , turns them zeros if not prime, so:

def esieve(n): #where m fixed-length list of integers n     '''creates list of primes less or equal n'''     m = [1]*(n+1)     in xrange(2,int((n)**0.5)+1):         if m[i]:             j in xrange(i*i,n+1,i):                 m[j]=0     return [i in xrange(2,n) if m[i]] 

i tested speed ran %timeit , got:

#n: t #10**1: 7 μs #10**2: 26.6 μs #10**3: 234 μs #10**4: 2.46 ms #10**5: 26.4 ms #10**6: 292 ms #10**7: 3.27 s 

i assumed, if changed [1] , 0 booleans, run faster... opposite:

#n: t #10**1: 7.31 μs #10**2: 29.5 μs #10**3: 297 μs #10**4: 2.99 ms #10**5: 29.9 ms #10**6: 331 ms #10**7: 3.7 s 

why booleans slower?

this happens because true , false looked globals in python 2. 0 , 1 literals constants, looked quick array reference, while globals dictionary lookups in global namespace (falling through built-ins namespace):

>>> import dis >>> def foo(): ...     = true ...     b = 1 ...  >>> dis.dis(foo)   2           0 load_global              0 (true)               3 store_fast               0 (a)    3           6 load_const               1 (1)               9 store_fast               1 (b)              12 load_const               0 (none)              15 return_value         

the true value looked load_global bytecode, while 1 literal value copied stack load_const.

if make true , false locals can make them fast again:

def esieve(n, true=true, false=false):     m = [true]*(n+1)     in xrange(2,int((n)**0.5)+1):         if m[i]:             j in xrange(i*i,n+1,i):                 m[j]=false     return [i in xrange(2,n) if m[i]] 

assigning true , false default values arguments gives function names locals, exact same values; again using simplified version:

>>> def bar(true=true, false=false): ...     true == false ...  >>> dis.dis(bar)   2           0 load_fast                0 (true)               3 load_fast                1 (false)               6 compare_op               2 (==)               9 pop_top                           10 load_const               0 (none)              13 return_value         

note load_fast opcodes, indices load_const bytecodes; locals in cpython function stored in array bytecode constants.

with change, using booleans wins out, albeit small margin; timings:

# n      integers  globals  locals # 10**1  4.31 µs   4.2 µs   4.2 µs # 10**2  17.1 µs   17.3 µs  16.5 µs # 10**3  147 µs    158 µs   144 µs # 10**4  1.5 ms    1.66 ms  1.48 ms # 10**5  16.4 ms   18.2 ms  15.9 ms # 10**6  190 ms    215 ms   189 ms    # 10**7  2.21 s    2.47 s   2.18 s 

the difference isn't because python booleans int subclass.

note in python 3, true , false have become keywords , can no longer assigned to, making possible treat them integer literals.


Comments