Tasks 1, 2, 3

There should not be much that is confusing in these tasks.

  1. Should you add two vertices with the same id to the graph? Probably not.
  2. To implement AMGraph, you may assume that there is a maximum number of vertices and that the ids will be in the range [1, maxVertices]. You should not, however, assume that vertices will be added with ids in that order (1 to maxVertices). You should use a matrix to get the benefits of this representation. Using a Map or a List of Lists really defaults to a representation that is closer to an adjacency list.
  3. Can the length of an edge be 0? Yes, this is possible.
  4. You may assume that all subtypes of Vertex implement clone correctly.
  5. getEdge: What should one do if there is no edge between a given pair of vertices?
    1. One could return null but this is problematic because a client may have an unguarded use of the return value.
    2. One could throw a RuntimeException (unchecked) but it is quite possible that someone asks for an edge when it does not exist. Throwing a RuntimeException is not a good choice.
    3. One can throw a checked exception and this requires that a client wrap the call within a try and catch block. Moreover, we should try to avoid the overhead of exception processing when possible. Throwing a checked exception also weakens the spec.
    4. A last option is to add a precondition that getEdge should be called only when edge returns true. Adding the precondition weakens the specification but the impact on the client is less significant (the check is easy) and the overhead is restricted. The best option may well be to weaken the specification in the interface and implement to this weakened spec. (The goal of leaving these methods somewhat underdetermined at the start is to allow people to think about whether a strong spec is always the best option.)
  6. shortestPath: What should one do if there is no path between a given pair of vertices? What should one do if the source and sink are the same? What should one do if one or both vertices are not in the graph? The discussion is somewhat similar to getEdge.
    1. If the source and sink are the same then this is easy for the client to verify. One can return a list with a single vertex to disambiguate the next two cases. This also means that pathLength should behave correctly for a list with one vertex.
    2. What if the two vertices are in the graph and no path exists between them? One can return an empty list and the client can check that the two vertices are in the graph and are not the same with limited effort. One should also implement pathLength such that an empty list implies no path ⇒ a path of infinite length (maybe Integer.MAX_VALUE) with a suitable spec for the postcondition.
    3. What if the one (or both) vertices are not in the graph? It is easy enough for a client to check if the vertices are in the graph so we could use the same behaviour as for when a path does not exist.
    4. In effect, by carefully reasoning about the behaviour we can specify the behaviour and avoid throwing exceptions.

Task 4

  1. minimumSpanningComponents
    1. For k = 1, you should return a minimum spanning tree.
    2. For other values of k, you are essentially making a small modification to the MST algorithm.
    3. It is acceptable to return a component with exactly one vertex.
    4. Do we need a precondition here? Very likely yes. We can add a precondition that the number of components in the graph is at least k. If we believe that a client may not be able to verify this requirement (because it is a computationally-expensive verification) then we can throw a checked exception. Both these approaches weaken the specification and so the spec in the interface needs to be updated.
  2. getCenter: Do not use the Marvel dataset for your own tests unless you are using a small subset. The Marvel dataset can be used separately to test the performance of your algorithm but your test suite needs to be efficient to run (should not take much time). The test suite for Task 1-4 should not take more than 3 minutes. You should pick tests suitably. Small tests are enough to achieve coverage and (in almost all cases) find bugs.

Task 5

  1. You should implement clone() for Link.
  2. This task is quite open-ended but your algorithm should not spend too much time “thinking”. It should be possible to execute the gather stage algorithm with low latency (< 1 minute, for instance).
  3. You can add observers to ImGraph to help with this task. (You should not need to do so for Tasks 1-4.)