# TD 3 - Divide and Conquer¶

In [46]:
import numpy as np
import random


# 1. Dominating element¶

Function: instances
INPUT : value t, list T
OUTPUT : counts the number of instance of t in T

Function: isDominant
INPUT : value t, list T
OUTPUT : evaluates (recursively) if t is dominant in T (i.e., present more than len(T)//2 times)

Function: dominant
INPUT : list T
OUTPUT : find a dominant value of T or returns None if none
In [24]:
def instances(t,T):
if len(T)==0:
return 0
return (t==T[0]) + instances(t,T[1:])

def isDominant(t,T):
if len(T)==0:
return False
return instances(t,T)>len(T)/2

def dominant(T):
for t in T[:(len(T)+1)//2]:
if isDominant(t,T):
return t
return None # no dominant term

In [48]:
Tsize = 8
T=np.random.randint(0,2,size=Tsize)
print("Random table T:\n",T,"\n")

print("Number of 1s in T:",instances(1,T))
print("\nIs 1 dominant?:",isDominant(1,T))

print("\nDominant term:",dominant(T))

Random table T:
[1 1 0 1 0 1 1 1]

Number of 1s in T: 6

Is 1 dominant?: True

Dominant term: 1


The algorithm is quite naive and has worst case complexity of $O(n^2/2)$.

## 1.1. Divide and Conquer¶

We use here a recursive dichotomic approach using the fact that a dominant value of T must dominate one of the two subvision of T in two half-lists.

Function: divideAndConquer
INPUT : list T
OUTPUT : divideAndConquer version of dominant
In [28]:
def divideAndConquer(T):
if len(T)==1:
return T[0]
center= len(T)//2
left  = divideAndConquer(T[:center])
right = divideAndConquer(T[center:])
if left == right:
return left
if isDominant(right,T):
return right
if isDominant(left,T):
return left
return None


In [58]:
Tsize = 10
T=np.random.randint(0,3,size=Tsize)
print("Random table T:\n",T,"\n")
print("Number of 0s, 1s, 2s: ",sum(T==0),sum(T==1),sum(T==2))

print("Dominant term by divide and conquer: ",divideAndConquer(T))

Random table T:
[1 2 2 1 2 2 2 0 2 2]

Number of 0s, 1s, 2s:  1 2 7
Dominant term by divide and conquer:  2


The complexity is here such that $$C(n)=2C(n/2)+2n$$ in the worst case. By the master theorem, this gives: $$C(n)=O(n\log n)$$

## 1.2. Variation¶

If we compare the values of T by successive pairs, removing those which have different values and only keeping one of the two in case of equality, starting from $n$ values, a maximum of $n/2$ is maintained (the case of all pairwise equalities).

To confirm that a dominant element e of number $k$ of occurrences in T (i.e., $k>n/2$, or more accurately $k\geq n/2-1/2$ in both odd and even $n$'s) is still dominant after this process, consider a given pair:

• if it does not contain e, $n$ is decreased but not $k$
• if it contains e and another value, then $n\to n-2$ and $k\to k-1$ and of course $k-1>(n-2)/2=n/2-1$
• if it contains twice e, then $n\to n-1$ and $k\to k-1$, and $k-1\geq (n-1)/2-1/2=n/2-1$, that is $k\geq n/2\geq n/2-1/2$ both for even and off $n$'s.
Function: divideAndConquerPairs
INPUT : list T
OUTPUT : finds a dominant term in T using the previous remark
In [59]:
def divideAndConquerPairs(T):
if len(T)==0:
return None
if len(T)==1:
return T[0]
if len(T)%2 == 1 and isDominant(T[-1],T):   # in the odd case, last one could be "tipping" the dominance
return T[-1]

halfT = [ t1 for t1,t2 in zip(T[::2],T[1::2]) if t1==t2 ]
candidate = divideAndConquerPairs(halfT)
if isDominant(candidate,T):
return candidate

In [62]:
Tsize = 10
T=np.random.randint(0,3,size=Tsize)
print("Random table T:\n",T,"\n")
print("Number of 0s, 1s, 2s: ",sum(T==0),sum(T==1),sum(T==2))

print("Dominant term by divide and conquer: ",divideAndConquerPairs(T))

Random table T:
[1 1 0 1 0 2 0 2 1 0]

Number of 0s, 1s, 2s:  4 4 2
Dominant term by divide and conquer:  None


For this code, say in the even $n$ case, $C(n)=C(n/2)+n$ so that, by the master theorem, $$C(n)=O(n)$$

# 2. Multiplying large integers – The Karatsuba algorithm¶

## 2.1. Naive algorithm¶

To compute the product of two large integers $A$ and $B$, an idea is to use the binary representation of each as $A=A_1A_2$ and $B=B_1B_2$ on $2\times n/2$ bits and see that $$AB = (A_1B_1)2^n + (A_1B_2+A_2B_1)2^{n/2}+A_2B_2$$ which reduces the calculus to the products $A_1B_1$, $A_1B_2$, $A_2B_1$ and $A_2B_2$, since the products by powers of two are just bit shifts.

The associated cost is then such that $C(n)=4C(n/2)+O(1)$, that is: $$C(n)=O(n^2).$$

Function: naiveProduct
INPUT : integers A, B
OUTPUT : product AB via binary representation
In [63]:
def naiveProduct(A,B):
# Removes the indicator of sign and 'b' in xbxxx
a=bin(A)[2:]
b=bin(B)[2:]

len_a=len(a)
len_b=len(b)

if min(len_a,len_b) == 0:
return '0'

if len_a == 1:
if a[0] == '1':
return b
else:
return '0'

if len_b == 1:
if b[0] == '1':
return a
else:
return '0'

center_a = len_a//2
center_b = len_b//2

#Be careful to properly fill with zeros
return bin(int(naiveProduct(int(a[center_a:],2),int(b[center_b:],2)),2)\
+int(naiveProduct(int(a[:center_a],2),int(b[:center_b],2))+'0'.zfill(len_a-center_a+len_b-center_b),2)\
+int(naiveProduct(int(a[center_a:],2),int(b[:center_b],2))+'0'.zfill(len_b-center_b),2)\
+int(naiveProduct(int(a[:center_a],2),int(b[center_b:],2))+'0'.zfill(len_a-center_a),2))[2:]

In [64]:
A,B=37,14
print(naiveProduct(A,B))
print(bin(A*B)[2:])

1000000110
1000000110


## 2.2. Karatsuba¶

Here we observe that $$(A_1 + A_2)(B_1 + B_2) = A_1 B_1 + A_2 B_2 + (A_1 B_2 + A_2 B_1 )$$ so that $A_1 B_2 + A_2 B_1$ can be computed as $(A_1 + A_2)(B_1 + B_2) - A_1 B_1 - A_2 B_2$, which reduces the number of products to compute from $4$ to $3$.

The resulting complexity is $$C(n) = 3C(n/2)+O(1)$$ or equivalently $$C(n) = O(n^{\log_2(3)})$$

Function: karatsuba
INPUT : integers A, B
OUTPUT : product AB via karatsuba algorithm based on above remark
In [65]:
def karatsuba(A,B):
a=bin(A)[2:]
b=bin(B)[2:]

if len(a)>len(b):
b=b.zfill(len(a))
if len(b)>len(a):
a=a.zfill(len(b))

len_=len(a)

if len_ == 0:
return '0'

if len_ == 1:
if a[0] == '1':
return b
else:
return '0'

center = len_//2

P1 = naiveProduct(int(a[center:],2),int(b[center:],2))
P2 = naiveProduct(int(a[:center],2),int(b[:center],2))
P3 = naiveProduct(int(a[center:],2)+int(a[:center],2),int(b[center:],2)+int(b[:center],2))

return bin(int(P1,2)\
+int(P2+'0'.zfill(2*(len_-center)),2)\
+int(P3+'0'.zfill(len_-center),2)-int(P1+'0'.zfill(len_-center),2)-int(P2+'0'.zfill(len_-center),2))[2:]

In [66]:
A,B=37,14
print(karatsuba(A,B))
print(bin(A*B)[2:])

1000000110
1000000110


## EXTRAS¶

### Reminder of the Master Theorem¶

If $$C(n)=a C\left({\frac {n}{b}}\right)+O(n^{d})$$ then,

• if $d < \log_b a$, $$C(n)=O(n^{\log _{b}a})$$
• if $d = \log_b ⁡a$, $$C(n)=O(n^d \log ⁡n )$$
• if $d > \log_b ⁡a$, $$C(n)=O(n^d).$$
In [ ]: