|
1 |
| -Source: https://www.hackerrank.com/challenges/torque-and-development/ |
| 1 | +# Roads and Libraries |
2 | 2 |
|
3 |
| -TODO(domfarolino): revise this post. |
| 3 | +Source: https://www.hackerrank.com/challenges/torque-and-development/ |
4 | 4 |
|
5 | 5 | This is a pretty interesting graph problem. It vexed me for a bit until I made some crucial realizations.
|
6 | 6 |
|
7 | 7 | # Divide the problem into connected components
|
8 | 8 |
|
9 |
| -When starting with this problem I fumbled around quite a bit. Eventually I came to some good realizations: |
| 9 | +When starting with this problem I fumbled around quite a bit, but eventually I came to some realizations that all |
| 10 | +focus on each connected component in the given graph: |
10 | 11 |
|
11 |
| - - We'll need at least one library per connected component |
12 |
| - - In each component, there are two extremes: |
| 12 | + - We'll need at least one library per connected component in the graph |
| 13 | + - In each connected component, there are two extremes: |
13 | 14 | - Every city in a connected component has a library
|
14 | 15 | - Only one city in a connected component has a library
|
15 | 16 |
|
16 |
| -My next thought was that the naive solution would be to find all possible combinations of library/road |
17 |
| -allocations in between the extremes, which seems combinatorially explosive. For example, what if there |
18 |
| -were not the extreme `n` libraries and `0` roads in a component, but instead `n - 1` libraries and `1` |
19 |
| -road, or `n - 2` libraries and `2` roads. How many different ways can we |
20 |
| -[*choose*](https://en.wikipedia.org/wiki/Binomial_coefficient) how to allocate which cities have libraries |
21 |
| -and which cities to connect, and more importantly, does the choosing of these actually affect the cost? |
22 |
| -Determining the number of possible choices we can make when allocating libraries to cities is actually pretty |
23 |
| -easy (it's just the summation of binomial coefficients, [see here](https://math.stackexchange.com/questions/519832/)), |
24 |
| -it would just be combinatorially explosive to go through each one; was it necessary? |
| 17 | +My next thought was that the naïve solution would be to find all possible combinations of allocated libraries |
| 18 | +and built roads between the aforementioned extremes, which seems combinatorially explosive. For example, what |
| 19 | +if there were not the extreme `n` libraries and `0` roads in a component, but instead `n - 1` libraries and `1` |
| 20 | +road, or maybe `n - 2` libraries and `2` roads? It is easy to determine how many different ways we can |
| 21 | +[*choose*](https://en.wikipedia.org/wiki/Binomial_coefficient) to allocate `n - k` libraries amongst `n` cities |
| 22 | +in a component. But how many different ways can we decide what roads to build given a particular allocation? Do |
| 23 | +the different choices of which roads to build for some given library allocation actually affect the cost? |
| 24 | + |
| 25 | +When looking at an example graph with five (once-) connected cities, I realized that the allocation of libraries |
| 26 | +doesn't matter at all and won't affect the cost. (I had considered the idea that perhaps the degree of each city |
| 27 | +might have an effect on, or indicate the priority of, library assignment.) Which cities we choose to build libraries |
| 28 | +in is irrelevant as long as we don't waste a road connecting two library-bearing cities when we could use it to |
| 29 | +connect a non-library-bearing city to a library-bearing one. |
25 | 30 |
|
26 |
| -When looking at an example graph with five (once-) connected cities I realized that the allocation of libraries |
27 |
| -doesn't matter at all and won't affect the cost. (I was considering the idea that perhaps the degree of each city |
28 |
| -might have an affect on, or indicate priority of library assignment). The allocation makes no difference as long |
29 |
| -as we don't waste a road connecting two library-bearing cities, because why would we do that? |
| 31 | +> Warning, entering a tangent: |
30 | 32 |
|
31 |
| -[Enter a tangent]... |
| 33 | + |
32 | 34 |
|
33 | 35 | The whole reason this accidentally-connecting-two-library-bearing-cities issue came up is because I was examining a
|
34 |
| -quite feasible 5-city graph with a cycle trying to allocate `3` libraries and `2` roads. I wondered if I could choose |
35 |
| -a "bad" allocation of libraries and roads, namely one that doesn't actually connect each city in the component. This is |
36 |
| -certainly possible in a graph with cycles when only dealing with `numberOfCities` resources (`3` libraries and `2` roads). |
| 36 | +quite feasible 5-city graph with a cycle, trying to allocate `3` libraries and `2` roads. I wondered what would happen |
| 37 | +if I chose to waste the building of a road on connecting two cities that both bore libraries, instead of building a road |
| 38 | +from a city that was not connected to a library to a city in the same component that was. This "bad allocation" can happen |
| 39 | +because of this cycle (effectively wasting a road on a group of cities that don't need another built). |
37 | 40 |
|
38 |
| -I was then worried about making sure my implementation would not accidentally theoretically waste a road on two |
39 |
| -library-bearing cities, and then I realized well yeah, if the allocation doesn't matter, we just have to know that |
40 |
| -some working allocation exists, and that will be the minimum total cost for such choices of the number of libraries |
41 |
| -and roads for that connected component. |
| 41 | +I was wondering how I could ensure my implementation would not accidentally do this, but then I realized it wouldn't be |
| 42 | +necessary. *Some* proper allocation of built roads exists, and as long as I know it exists, I don't have to worry about |
| 43 | +accidentally choosing a "bad allocation", because I'll be using the same number of roads either way! And I know that by |
| 44 | +using that number of roads, it *is* possible to connect all cities, therefore the exact layout is meaningless. |
42 | 45 |
|
43 | 46 | # A connected component is at least a tree
|
44 | 47 |
|
45 |
| -The "choice" of which roads to build dissolves when you realize that the connected component by definition is at least a |
46 |
| -tree, and thus always has valid allocations of libraries and roads in the form of: |
| 48 | +There is definitely *a* working allocation of roads to build, because we can ignore the cycles that would |
| 49 | +otherwise form a "bad allocation": the connected component contains *at least* the roads necessary to connect all |
| 50 | +cities without roads that form cycles. In other words, the connected component is at least a tree (it always contains a |
| 51 | +spanning tree), so *some* non-wasteful layout of roads exists. |
| 52 | + |
| 53 | +A group of connectable cities can therefore have all of its cities connected to libraries in `N` different ways, where `N` |
| 54 | +is the number of cities in the group: |
47 | 55 |
|
48 | 56 | `N - K` libraries + `K` roads, `∀ K < N` (remember, we need at least one library).
|
49 | 57 |
|
50 |
| -This means each connected component had `N` possible solutions, and for each of the values of `K`, we needed to choose the |
51 |
| -minumum one. Going through some examples I realized the best answer always seemed to be one of the extreme allocations, namely |
52 |
| -an allocation with all `N` libraries or only `1` library. I tried to find an example where one of the middleground less |
53 |
| -extreme allocations could be more optimal, but I came to the conclusion that that will never be the case, because we greedily |
54 |
| -want to choose to employ as many of the cheapest resource (either libraries or roads) as possible. In other words, if roads were |
55 |
| -cheaper to build then libraries, and there exist the possible roads to repair to connect the entire component (the definition!), |
56 |
| -then we'd want to only build `1` library, and as many remaining roads as we'd need. We could build two libraries, and one less |
57 |
| -road, but that would give us the same connected result but with a higher cost, unnecessarily. |
| 58 | +This means each connected component has `N` possible solutions, and among the values of `K` (ranging from `0` to `N - 1`), |
| 59 | +we need to choose the most cost-efficient one. Going through some examples I realized the best answer always seemed to be one of |
| 60 | +the extremes, namely a layout containing all `N` libraries and `0` roads, or only `1` library and `N - 1` roads. Trying to find |
| 61 | +an example in which one of the middle-ground distributions was cheaper, I eventually came to the conclusion that this will never |
| 62 | +be the case, because we greedily want to employ as much of the cheaper resource (libraries or roads) as possible! |
| 63 | + |
| 64 | +In other words, if roads were cheaper to build than libraries, and there exist the possible roads to repair to connect the entire |
| 65 | +component (the definition!), then we'd want to only build `1` library and as many remaining roads as we'd need. We could build two |
| 66 | +libraries and one fewer road, but that would give us the same connected result at an unnecessarily higher cost. |
58 | 67 |
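To see why the middle-ground layouts never win, it helps to note that the cost of `N - K` libraries plus `K` roads is linear in `K`, so the cheapest value has to sit at one end of the range. Here is a minimal sketch that lists every cost for a single component; the function name `allocation_costs` and the prices used are made up purely for illustration.

```python
def allocation_costs(n, cost_lib, cost_road):
    """Cost of building n - k libraries and k roads, for each k in 0..n-1."""
    return [(n - k) * cost_lib + k * cost_road for k in range(n)]


# Roads cheaper than libraries: every extra road lowers the total.
print(allocation_costs(5, cost_lib=6, cost_road=2))  # [30, 26, 22, 18, 14]

# Libraries cheaper than roads: every extra road raises the total.
print(allocation_costs(5, cost_lib=2, cost_road=6))  # [10, 14, 18, 22, 26]
```

Each step from `K` to `K + 1` changes the total by the same amount, `cost_road - cost_lib`, so the list is always monotonic and its minimum is either the first entry (all libraries) or the last (one library).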
|
59 | 68 | # Implementation design
|
60 | 69 |
|
61 | 70 | When thinking about the implementation, I knew the number of connected components was relevant to this problem. I also knew
|
62 |
| -we could get an entire connected component (but more importantly its size) using a trivial-to-implement BFS algorithm. I figured |
| 71 | +we could get an entire connected component (but more importantly, its size) using good ole BFS (DFS suffices too). I figured |
63 | 72 | I'd use an adjacency list to store the graph, since I wasn't going to perform any operations that a matrix would be more suited
|
64 | 73 | for. The necessary steps were something like this:
|
65 | 74 |
|
66 | 75 | - Build the graph's adjacency list
|
67 |
| - - For each connected component |
| 76 | + - For each connected component: |
68 | 77 | - Get the size of the component
|
69 |
| - - Minimal cost of connecting this component was `min(a, b)` where: |
70 |
| - - `a = numCities * costLib` |
71 |
| - - `b = costLib + (numCities - 1) * costRoad` |
72 |
| - - With the minimal cost of the component in hand, add the value to the running some, and perform the same operation for the next component. |
| 78 | + - Compute the minimal cost of connecting this component (rebuilding existing roads), which is `min(a, b)` where: |
| 79 | + - `a = componentSize * costLib` |
| 80 | + - `b = costLib + (componentSize - 1) * costRoad` |
| 81 | + - With the minimal cost of the component in hand, add the value to a running sum, and perform the same operation for the next component. |
73 | 82 |
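The cost part of the steps above could be sketched roughly like this, assuming the component sizes have already been found somehow. The function name `total_cost` and the example sizes are hypothetical; only `costLib`, `costRoad` and the `min(a, b)` formula come from the list above.

```python
def total_cost(component_sizes, cost_lib, cost_road):
    """Sum the cheaper of the two extreme layouts over all components."""
    total = 0
    for size in component_sizes:
        all_libraries = size * cost_lib                  # a: a library in every city
        one_library = cost_lib + (size - 1) * cost_road  # b: one library, size - 1 roads
        total += min(all_libraries, one_library)
    return total


# Two components of sizes 4 and 2, with libraries cheaper than roads.
print(total_cost([4, 2], cost_lib=2, cost_road=5))  # 12

# One component of size 3, with roads cheaper than libraries.
print(total_cost([3], cost_lib=2, cost_road=1))  # 4
```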
|
74 |
| -Moving from component-to-component is as easy as just using BFS with some sort of global visitation store. |
| 83 | +Moving from component-to-component is as easy as just using BFS with some sort of global visitation data structure. |
75 | 84 | We can try to find a connected component from each given city. The first time we run BFS, we'll mark *all* nodes in
|
76 | 85 | the discovered component as visited. Then in the next given city, we'll try to find another connected component *if*
|
77 |
| -the city has not already been visited (does not exist as a part of an already-discovered connected component). We keep |
78 |
| -a running sum, adding to it the minimum cost required to connect a once-connected component, and eventually return the |
79 |
| -final value. |
| 86 | +the city has not already been visited (does not exist as a part of an already-discovered connected component). Eventually |
| 87 | +we'll return the value of a running sum we've kept (as mentioned above). |
80 | 88 |
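Here is one way the component discovery described above might look: a sketch rather than the actual solution code. It assumes cities are numbered `1..num_cities` and that roads arrive as pairs of cities; the names `component_sizes`, `adjacency` and `visited` are my own placeholders.

```python
from collections import deque


def component_sizes(num_cities, roads):
    """Return the size of every connected component, found with BFS."""
    # Adjacency list; index 0 is unused because cities are assumed to start at 1.
    adjacency = [[] for _ in range(num_cities + 1)]
    for u, v in roads:
        adjacency[u].append(v)
        adjacency[v].append(u)

    visited = [False] * (num_cities + 1)  # shared across every BFS run
    sizes = []
    for start in range(1, num_cities + 1):
        if visited[start]:
            continue  # already part of a discovered component
        visited[start] = True
        queue = deque([start])
        size = 0
        while queue:
            city = queue.popleft()
            size += 1
            for neighbor in adjacency[city]:
                if not visited[neighbor]:
                    visited[neighbor] = True
                    queue.append(neighbor)
        sizes.append(size)
    return sizes


roads = [(1, 3), (3, 4), (2, 4), (1, 2), (2, 3), (5, 6)]
print(component_sizes(6, roads))  # [4, 2]
```

Marking a city as visited when it is enqueued (rather than when it is dequeued) keeps any city from entering the queue twice, even when the component contains cycles.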
|
81 |
| -Time complexity: O(n) (by marking nodes as visited, we're repeating ourselves) |
| 89 | +Time complexity: O(n) |
82 | 90 | Space complexity: O(n)
|
83 | 91 |
|
84 | 92 | *It should be noted that the complexity of this algorithm could easily be O(n^2) (due to edge processing in the complete
|
|