{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# TD11 - Implementations of Dijkstra's algorithm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 0. Preliminaries"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import sys"
]
},
{
"cell_type": "code",
"execution_count": 169,
"metadata": {},
"outputs": [],
"source": [
"class G:\n",
" def __init__(self,value=None,index=None,children_nodes=None,weights=None):\n",
" self.value = value\n",
" self.index = index\n",
" self.children = []\n",
" for child,weight in zip(children_nodes,weights):\n",
" self.children.append([child,weight])\n",
" \n",
" def print_graph(self,depth=0,node_list=[]):\n",
" if self in set(node_list):\n",
" print(\"|\",\"---\" * depth,\">\",self.value)\n",
" return\n",
" else:\n",
" print(\"|\",\"---\" * depth,\">\",self.value)\n",
" node_list.append(self)\n",
" for child,weight in self.children:\n",
" if child != None: \n",
" child.print_graph(depth+1,node_list) "
]
},
{
"cell_type": "code",
"execution_count": 224,
"metadata": {},
"outputs": [],
"source": [
"def dijkstra(nodes,start_node):\n",
" A=[]\n",
" notA=[start_node]\n",
" distances=np.zeros(len(nodes))\n",
" predecessors = []\n",
" for node in nodes:\n",
" predecessors.append([[]])\n",
"\n",
" for v in [v for v in nodes if v!= start_node]:\n",
" distances[v.index]=sys.maxsize//2\n",
" \n",
" while notA != []:\n",
" u = min(notA,key=lambda v:distances[v.index])\n",
" \n",
" A.append(u)\n",
" notA.remove(u)\n",
" \n",
" for v in [ v for v in nodes if v not in A]:\n",
" for c,w in v.children:\n",
" if (u==c and (distances[v.index]>distances[u.index]+w)): \n",
" if distances[v.index]==sys.maxsize//2:\n",
" notA.append(v)\n",
" distances[v.index]=distances[u.index]+w\n",
" predecessors[v.index]=u \n",
" \n",
" return distances"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The complexity of Dikjstra's algorithm for a graph $\\mathcal G=(\\mathcal E,\\mathcal V)$ can be evaluated as:\n",
" * the initialization step costs $O(|\\mathcal V|)$\n",
" * the while loop is run at most $O(|\\mathcal V|)$ times since vertices in $\\bar A$ (`notA`) are successively deleted and can never reappear.\n",
" * within each loop, a search of the highest distance node is performed, which by default is also $O(|\\mathcal V|)$ in the worse case. But can be improved! say the resulting cost is $O(x)$: the total cost over the while loops is then $O(|\\mathcal V|x)$\n",
" * again within each while loop, we conditionally update the distance for vertices not in $A$ (`A`). These are in number also $O(|\\mathcal V|)$ in total. However, only \"neighboring nodes\" will ever be considered, so the total number of iterations of this part of the while loop, summed over all while loops, is rather $O(|\\mathcal E|)$ in total (which in sparse graphs would be much less than $O(|\\mathcal V|^2)$. Assuming that the actual updating operation has cost $O(y)$, the maximum cost for this part of the loop is then $O(|\\mathcal E|y)$.\n",
" \n",
"In total, the complexity is $$O(|\\mathcal V|x+|\\mathcal E|y).$$"
]
},
{
"cell_type": "code",
"execution_count": 225,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"List of values in the graph: [ 8 66 18 78 69 1 81 31 32 14]\n",
"\n",
"Adjacency matrix:\n",
" [[ 0 0 68 0 78 0 47 0 35 0]\n",
" [ 0 0 0 0 0 0 0 0 50 22]\n",
" [68 0 0 0 39 0 0 75 47 90]\n",
" [ 0 0 0 0 0 76 15 46 0 0]\n",
" [78 0 39 0 0 16 0 28 0 0]\n",
" [ 0 0 0 76 16 0 0 67 50 0]\n",
" [47 0 0 15 0 0 0 16 2 0]\n",
" [ 0 0 75 46 28 67 16 0 4 0]\n",
" [35 50 47 0 0 50 2 4 0 44]\n",
" [ 0 22 90 0 0 0 0 0 44 0]]\n",
"\n",
"Tree representation of the graph starting from node[ 0 ]:\n",
"\n",
"| > 8\n",
"| --- > 18\n",
"| ------ > 8\n",
"| ------ > 69\n",
"| --------- > 8\n",
"| --------- > 18\n",
"| --------- > 1\n",
"| ------------ > 78\n",
"| --------------- > 1\n",
"| --------------- > 81\n",
"| ------------------ > 8\n",
"| ------------------ > 78\n",
"| ------------------ > 31\n",
"| --------------------- > 18\n",
"| --------------------- > 78\n",
"| --------------------- > 69\n",
"| --------------------- > 1\n",
"| --------------------- > 81\n",
"| --------------------- > 32\n",
"| ------------------------ > 8\n",
"| ------------------------ > 66\n",
"| --------------------------- > 32\n",
"| --------------------------- > 14\n",
"| ------------------------------ > 66\n",
"| ------------------------------ > 18\n",
"| ------------------------------ > 32\n",
"| ------------------------ > 18\n",
"| ------------------------ > 1\n",
"| ------------------------ > 81\n",
"| ------------------------ > 31\n",
"| ------------------------ > 14\n",
"| ------------------ > 32\n",
"| --------------- > 31\n",
"| ------------ > 69\n",
"| ------------ > 31\n",
"| ------------ > 32\n",
"| --------- > 31\n",
"| ------ > 31\n",
"| ------ > 32\n",
"| ------ > 14\n",
"| --- > 69\n",
"| --- > 81\n",
"| --- > 32\n",
"\n",
"Output of Dijkstra's algorithm from node 0 : [ 0. 85. 68. 52. 67. 83. 37. 39. 35. 79.]\n"
]
}
],
"source": [
"### Random graph generator\n",
"\n",
"graph_size = 10\n",
"list_values = np.random.randint(0,100,graph_size)\n",
"\n",
"# Affinity matrix (undirected) of the graph with 'p' probability of connection\n",
"# and integer weights from 0 to 'max_weight'\n",
"p = .5\n",
"max_weight = 100\n",
"A = np.random.binomial(1,p,(graph_size,graph_size))*np.random.randint(0,max_weight+1,(graph_size,graph_size))\n",
"A = np.triu(A,1)+np.triu(A,1).T\n",
"\n",
"nodes = []\n",
"\n",
"print(\"List of values in the graph:\",list_values)\n",
"\n",
"for i in range(len(list_values)):\n",
" nodes.append(G(value=list_values[i],index=i,children_nodes=[],weights=[]))\n",
"\n",
"for i in range(len(nodes)):\n",
" for j in range(len(nodes)):\n",
" if A[i,j]>0:\n",
" nodes[i].children.append([nodes[j],A[i,j]])\n",
" \n",
"print(\"\\nAdjacency matrix:\\n\",A)\n",
" \n",
"node_index = 0\n",
"print(\"\\nTree representation of the graph starting from node[\",node_index,\"]:\\n\") \n",
"nodes[node_index].print_graph(node_list=[])\n",
"\n",
"print(\"\\nOutput of Dijkstra's algorithm from node\",node_index,\":\",dijkstra(nodes,nodes[node_index]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Chained lists\n",
"\n",
"As seen above, if the set `notA` is a list, one has to browse through the complete list to find the smallest distance, inducing a cost $x=O(|\\mathcal V|)$ in the worst case. As for the updating rule, it costs in general $y=O(1)$, and the resulting total cost is then:\n",
"$$O(|\\mathcal V|^2+|\\mathcal E|)$$\n",
"which, since $|\\mathcal E|\\leq |\\mathcal V|^2$, is simply\n",
"$$O(|\\mathcal V|^2).$$\n",
"\n",
"\n",
"## 2. Dial's implementation\n",
"\n",
"Say `u` is the node added to `A` at iteration $i$. This means it has the smallest distance to the starting node `start_node` among the nodes in `notA`. During this same iteration, the distances of the nodes (be they in `notA` or not) are only updated by addition of a positive value. So at iteration $i+1$, the next node added to `A` must have a greater (or equal) distance to `u`.\n",
"\n",
"For integer distances, this remark can be used to improve Dikjstra's algorithm by creating a list `c` of all nodes at distances $0,1,\\ldots,w_{\\rm max}(|\\mathcal V|-1)$, where $w_{\\rm max}$ is the maximum weight value, and looping over `c`.\n",
"\n",
"The resulting complexity a priori seems larger because we need to loop over all `c` which is of large size. However, `c` is rather empty and only $O(|\\mathcal E|)$ distance updates occur. Together, the complexity may be evaluated as $O(|\\mathcal V|w_{\\rm max}+|\\mathcal E|)$ but, again, knowing that most of the $O(|\\mathcal V|w_{\\rm max})$ operations come at negligible cost. This implementation is at any rate particularly convenient if $w_{\\rm max}$ is small, such as with only binary weights.\n"
]
},
{
"cell_type": "code",
"execution_count": 232,
"metadata": {},
"outputs": [],
"source": [
"def dijkstra_Dial(nodes,start_node,maxW):\n",
" A=[]\n",
" notA=[start_node]\n",
" n=len(nodes)\n",
" distances=np.zeros(n).astype(int)\n",
" \n",
" ## creation of table 'c'\n",
" c=[]\n",
" for _ in range(maxW*(n-1)+1):\n",
" c.append([])\n",
" c[0].append(start_node)\n",
"\n",
" predecessors = []\n",
" for node in nodes:\n",
" predecessors.append([[]])\n",
"\n",
" for v in [v for v in nodes if v!= start_node]:\n",
" distances[v.index]=maxW*(n-1)\n",
" c[maxW*(n-1)].append(v)\n",
" \n",
" i=0\n",
" while i<=(n-1)*maxW:\n",
" if c[i]!=[]:\n",
" u = c[i][0]\n",
" c[i].remove(u)\n",
" du = i\n",
"\n",
" for v in nodes:\n",
" for child,w in v.children:\n",
" if (u==child and distances[v.index]>du+w):\n",
" c[distances[v.index]].remove(v)\n",
" distances[v.index]=du+w\n",
" c[distances[v.index]].append(v)\n",
" predecessors[v.index]=u\n",
" else:\n",
" i+=1\n",
" \n",
" return distances"
]
},
{
"cell_type": "code",
"execution_count": 233,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"Output of Dijkstra-Dial's algorithm from node 0 : [ 0 85 68 52 67 83 37 39 35 79]\n"
]
}
],
"source": [
"print(\"\\nOutput of Dijkstra-Dial's algorithm from node\",node_index,\":\",dijkstra_Dial(nodes,nodes[node_index],max_weight))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Coming back to the original Dikjstra's method, we can show that, at any iteration of the while loop,\n",
"$$ {\\rm max}({\\tt distances[v],~v}\\in \\bar A) - {\\rm min}({\\tt distances[v],~v}\\in \\bar A) \\leq w_{\\rm max}. $$\n",
"\n",
"This follows by induction on the while loop. Assumption this holds at a given stage of the loop, two situations may occur that change the set of distances in $\\bar A$: calling `u` the vertex selected having minimal distance within $\\bar A$,\n",
" * for every `v` such that $({\\tt u},{\\tt v})\\in \\mathcal E$, if `v` already belongs to $\\bar A$, its distance is either maintained (when ${\\tt distances[v]}\\leq {\\tt distances[u]}+w_{uv}$) or reduced by at most ${\\tt distances[v]}-{\\tt distances[u]}-w_{uv}$ which is less than $w_{\\rm max}$ by the induction assumption, so the condition still holds;\n",
" * for every `v` such that $({\\tt u},{\\tt v})\\in \\mathcal E$, if `v` did not belong to $\\bar A$ (its distance was set to $+\\infty$ thus far), then its distance becomes exactly ${\\tt distances[u]}+w_{uv}$ which is at most $w_{\\rm max}$ away from ${\\tt distances[u]}$, again confirming the induction.\n",
" \n",
"As a consequence to this remark, the array `c` introduced in the previous Dial approach may be reduced to a size $w_{\\rm max}+1$ array (there is no need to maintain the complete $(|\\mathcal V|-1)w_{\\rm max}+1$ sized array, which can be discarded and replaced at each loop update).\n",
"\n",
"\n",
"## 3. Heap implementation\n",
"\n",
"For a heap implementation based on the distances, it suffices to create a heap whose entries are the couple `(node,distances[node.index])` and values the distances. Then we replace the calls (see TP8 for a reference):\n",
" * `notA=[start_node]` by a binary heap constructor `notA=BH([start_node])`\n",
" * `u = min(notA,key=lambda v:distances[v.index])` by a \"min extraction\" from the heap (`notA.remove_minimum()`)\n",
" * `notA.append(v)` by a heap insertion (`notA.insert(v)`)\n",
" * after the update `distances[v.index]=distances[u.index]+w`, not forgetting to also adapt the heap \"piority\" (`notA.reduce_value(v,distances[v.index],distances[v.index]-(distances[u.index]+w)`)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}