Graphs: The Basic Building Blocks (1.1)
At the heart of graph theory lies the concept of a graph, a structure used to model relationships between objects. Formally, a graph is an ordered pair $G = (V, E)$, where $V$ is a non-empty, finite set of vertices (or nodes), and $E$ is a set of edges. In our initial definition, we consider undirected graphs, where each edge connects a pair of vertices without specifying a direction. Mathematically, edges are defined as two-element subsets of the vertex set: $E \subseteq \binom{V}{2}$.
Each element $e = \{u, v\} \in E$, an edge, is thus a set representing a connection between vertices $u$ and $v$. This formulation implies that the order of vertices in an edge does not matter, and self-loops (edges connecting a vertex to itself) are not allowed in simple graphs.
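To make the formal definition concrete, here is a minimal, purely illustrative Python sketch (not part of the original text) that stores a small graph directly as a vertex set and a set of two-element edges; frozensets are used so that $\{u, v\}$ and $\{v, u\}$ denote the same edge.

```python
# A small undirected graph G = (V, E), stored directly as sets.
# Edges are frozensets, so {u, v} and {v, u} are the same edge.
V = {1, 2, 3, 4}
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4}), frozenset({1, 3})}

def is_edge(u, v):
    """Check whether u and v are joined by an edge."""
    return frozenset({u, v}) in E

print(is_edge(1, 2))  # True
print(is_edge(2, 4))  # False
```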
Graphs provide an intuitive way to visualize and analyze relationships. We can represent a graph by drawing vertices as points or circles and edges as lines connecting these points. A line is drawn between two points if and only if the corresponding vertices are joined by an edge in the graph. This visual representation, as exemplified in Figure 1.1, greatly aids in understanding the structure and properties of graphs.
Types of Graphs and Examples
To illustrate the versatility of graphs, let’s consider some fundamental graph types. These examples help solidify the abstract definition and introduce common graph structures we will encounter.
- Complete Graph ($K_n$): A complete graph on $n$ vertices, denoted $K_n$, is a graph in which every pair of distinct vertices is connected by an edge. In a complete graph, there are no missing edges; it is maximally connected.
- Cycle Graph ($C_n$): A cycle graph on $n$ vertices, denoted $C_n$, forms a closed loop. Vertices are connected in a cyclic manner, where each vertex is connected to exactly two others, forming a ring structure.
- Path Graph ($P_n$): A path graph on $n$ vertices, denoted $P_n$, is formed by a sequence of vertices where each vertex is connected to the next in a linear chain. It can be visualized as a cycle graph with one edge removed, resulting in a path with two endpoints.
- d-dimensional Hypercube ($Q_d$): The $d$-dimensional hypercube $Q_d$ is a more complex graph structure defined on the vertex set $\{0, 1\}^d$, the set of all binary sequences of length $d$. Two vertices in $Q_d$ are adjacent if and only if their binary sequences differ in exactly one position. The hypercube generalizes the familiar cube ($Q_3$) and square ($Q_2$) to higher dimensions.
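To make these families concrete, the following hypothetical Python sketch constructs the edge sets of $K_n$, $C_n$, $P_n$, and $Q_d$ for given parameters; the function names are illustrative and not part of any standard library.

```python
from itertools import combinations

def complete_graph(n):
    """K_n: every pair of distinct vertices is joined by an edge."""
    V = list(range(n))
    E = {frozenset(pair) for pair in combinations(V, 2)}
    return V, E

def cycle_graph(n):
    """C_n: vertices 0..n-1 arranged in a ring."""
    V = list(range(n))
    E = {frozenset({i, (i + 1) % n}) for i in range(n)}
    return V, E

def path_graph(n):
    """P_n: a linear chain, i.e. C_n with one edge removed."""
    V = list(range(n))
    E = {frozenset({i, i + 1}) for i in range(n - 1)}
    return V, E

def hypercube(d):
    """Q_d: binary strings of length d, adjacent iff they differ in one bit."""
    V = [tuple((i >> k) & 1 for k in range(d)) for i in range(2 ** d)]
    E = {frozenset({u, v}) for u, v in combinations(V, 2)
         if sum(a != b for a, b in zip(u, v)) == 1}
    return V, E

# Example: K_4 has 6 edges, Q_3 has 12 edges.
print(len(complete_graph(4)[1]), len(hypercube(3)[1]))
```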
Bipartite Graphs
A significant class of graphs is bipartite graphs. A graph $G = (V, E)$ is called bipartite if its vertex set can be partitioned into two disjoint sets, say $A$ and $B$ (denoted $V = A \uplus B$), such that every edge in $E$ connects a vertex in $A$ to a vertex in $B$. There are no edges within $A$ or within $B$. Bipartite graphs are crucial in modeling relationships where two distinct types of entities are interconnected, but entities of the same type are not.
Hypercubes and path graphs are examples of bipartite graphs. Cycles, however, are bipartite only if they have an even number of vertices.
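One standard way to test bipartiteness is to attempt a 2-coloring of the vertices with a breadth-first search; an edge whose endpoints receive the same color witnesses an odd cycle. The sketch below is an assumed illustration, with the graph given as an adjacency-list dictionary.

```python
from collections import deque

def is_bipartite(adj):
    """Try to 2-color the graph given as {vertex: set_of_neighbors}.
    Returns True iff no edge joins two vertices of the same color."""
    color = {}
    for start in adj:                        # handle each connected component
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]  # alternate colors along edges
                    queue.append(w)
                elif color[w] == color[u]:   # same color on both endpoints
                    return False
    return True

# C_4 (even cycle) is bipartite, C_3 (odd cycle) is not.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
c3 = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
print(is_bipartite(c4), is_bipartite(c3))  # True False
```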
Multigraphs and Loops
While our initial definition of a graph excludes self-loops and multiple edges between the same pair of vertices, a more general structure, called a multigraph, allows for these features.
- Loops: A loop is an edge that connects a vertex to itself, i.e., an edge of the form $\{v, v\}$.
- Multiple Edges: Multiple edges occur when more than one edge exists between the same pair of vertices.
Graphs that permit loops and multiple edges are termed multigraphs. While multigraphs are useful in certain contexts, in this course, unless explicitly stated otherwise, we will primarily focus on simple graphs, which are graphs without loops and multiple edges. When we do consider multigraphs, we will clearly indicate it.
Degree and Neighborhood
To further describe the properties of vertices within a graph, we introduce the concepts of neighborhood and degree.
- Neighborhood ($N_G(v)$): The neighborhood of a vertex $v$ in a graph $G$, denoted $N_G(v)$, is the set of all vertices adjacent to $v$, i.e., $N_G(v) = \{u \in V : \{u, v\} \in E\}$.
- Degree ($\deg_G(v)$): The degree of a vertex $v$, denoted $\deg_G(v)$, is the number of vertices in its neighborhood, $|N_G(v)|$, or equivalently, the number of edges incident to $v$.

When the graph $G$ is clear from the context, we often omit the subscript and simply write $N(v)$ and $\deg(v)$.

A graph $G$ is called $k$-regular if every vertex in $G$ has the same degree $k$. Complete graphs $K_n$ are $(n-1)$-regular, cycles are 2-regular, and hypercubes $Q_d$ are $d$-regular.
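In code, neighborhood and degree fall out directly from the set-of-edges representation used earlier; the following minimal sketch (not from the original text) illustrates both on the 2-regular cycle $C_4$.

```python
def neighborhood(V, E, v):
    """N(v): all vertices joined to v by an edge."""
    return {u for u in V if frozenset({u, v}) in E}

def degree(V, E, v):
    """deg(v) = |N(v)|."""
    return len(neighborhood(V, E, v))

# On C_4, every vertex has degree 2 (the cycle is 2-regular).
V = {0, 1, 2, 3}
E = {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 0})}
print([degree(V, E, v) for v in sorted(V)])  # [2, 2, 2, 2]
```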
Adjacency and Incidence
- Adjacent Vertices: Two vertices $u$ and $v$ are said to be adjacent if there is an edge connecting them, i.e., $\{u, v\} \in E$.
- Incident Edge and Vertex: A vertex $v$ and an edge $e$ are said to be incident if $v$ is one of the endpoints of $e$. For an edge $e = \{u, v\}$, the vertices $u$ and $v$ are the endpoints of $e$.
Handshaking Lemma
A fundamental result in graph theory that relates the degrees of vertices to the number of edges is the Handshaking Lemma (Theorem 1.2). It states that the sum of the degrees of all vertices in a graph is equal to twice the number of edges.
Theorem 1.2 (Handshaking Lemma): For any graph $G = (V, E)$, the sum of the degrees of all vertices is equal to twice the number of edges: $\sum_{v \in V} \deg(v) = 2|E|$.
Proof: This result arises from a simple counting principle, often referred to as double counting. When we sum the degrees of all vertices, each edge $\{u, v\}$ is counted exactly twice: once when we consider the degree of $u$ and once when we consider the degree of $v$. Therefore, the sum of degrees counts each edge twice, leading to the factor of 2 in the formula.
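The double-counting argument can also be checked numerically on any example graph: summing the degrees yields exactly twice the edge count. A small assumed sketch, with the graph given as an adjacency-list dictionary:

```python
def handshake_check(adj):
    """Compare the degree sum with twice the edge count for an undirected
    graph given as {vertex: set_of_neighbors}."""
    degree_sum = sum(len(neighbors) for neighbors in adj.values())
    edges = {frozenset({u, w}) for u in adj for w in adj[u]}
    return degree_sum == 2 * len(edges)

# A path on 4 vertices: degrees 1, 2, 2, 1 sum to 6 = 2 * 3 edges.
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
print(handshake_check(p4))  # True
```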
Corollary 1.3: An immediate consequence of the Handshaking Lemma is that in any graph, the number of vertices with an odd degree must be even.
Proof: Let $V_{\text{even}}$ be the set of vertices with even degrees, and $V_{\text{odd}}$ be the set of vertices with odd degrees. Then,
$$\sum_{v \in V} \deg(v) = \sum_{v \in V_{\text{even}}} \deg(v) + \sum_{v \in V_{\text{odd}}} \deg(v) = 2|E|.$$
The sum of even numbers is always even. For the total sum to be even (as it is equal to $2|E|$), the sum of odd degrees, $\sum_{v \in V_{\text{odd}}} \deg(v)$, must also be even. A sum of odd numbers is even only if it has an even number of terms, so $|V_{\text{odd}}|$ must be even.
Corollary 1.4: In any graph $G = (V, E)$, the average degree of a vertex is $\frac{2|E|}{|V|}$. Consequently, there must exist two vertices, say $u$ and $v$, such that $\deg(u) \ge \frac{2|E|}{|V|}$ and $\deg(v) \le \frac{2|E|}{|V|}$. This follows from the fact that in any finite set of numbers, there must be at least one number greater than or equal to the average and at least one number less than or equal to the average.
Subgraphs
A subgraph of a graph $G = (V, E)$ is a graph $H = (V_H, E_H)$ where the vertex set $V_H$ is a subset of $V$, and the edge set $E_H$ is a subset of $E$ such that both endpoints of every edge in $E_H$ are in $V_H$. We write $H \subseteq G$ to denote that $H$ is a subgraph of $G$.
An induced subgraph is a special type of subgraph where the edge set consists of all edges from $E$ that connect vertices in $V_H$. If $H$ is the induced subgraph of $G$ with vertex set $V_H$, we write $H = G[V_H]$.
To obtain a subgraph, we can remove vertices and/or edges from the original graph. However, to obtain an induced subgraph, we can only remove vertices; the edges are then determined by the remaining vertices.
The notation $G - v$ denotes the induced subgraph $G[V \setminus \{v\}]$ obtained by removing vertex $v$ and all edges incident to $v$ from the graph $G$.
Connectivity and Trees (1.1.1)
Walks, Paths, and Cycles
Within a graph, we can traverse sequences of vertices and edges. These traversals are fundamental to understanding connectivity and structure.
- Walk: A walk in a graph $G$ is a sequence of vertices $v_0, v_1, \ldots, v_k$ such that for each $i \in \{1, \ldots, k\}$, there is an edge between $v_{i-1}$ and $v_i$. The length of a walk is the number of steps, which is $k$ in this case. Vertices $v_0$ and $v_k$ are the start and end vertices of the walk, respectively. A walk with start vertex $u$ and end vertex $v$ is called a $u$-$v$-walk.
- Path: A path is a walk in which all vertices are distinct, except possibly the start and end vertices. A $u$-$v$-path is a path with start vertex $u$ and end vertex $v$.
- Cycle: A cycle is a walk with $v_0 = v_k$ (start and end vertices are the same) and length at least 3, in which the vertices $v_0, \ldots, v_{k-1}$ are distinct. A cycle is thus a special kind of closed walk (a walk whose start and end vertices coincide).
Connectedness
A graph $G$ is connected if for every pair of vertices $u, v \in V$, there exists a path between $u$ and $v$. If a graph is not connected, it consists of several connected components. A connected component is a maximal connected subgraph. The vertex sets of the connected components of a graph form a partition of the vertex set $V$.
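Connected components can be computed with a straightforward traversal; the following hypothetical sketch grows one component at a time by breadth-first search on an adjacency-list dictionary.

```python
from collections import deque

def connected_components(adj):
    """Return the vertex sets of the connected components of an undirected
    graph given as {vertex: set_of_neighbors}."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        comp = {start}                # grow one component by BFS
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        components.append(comp)
    return components

# Two components: a triangle {0, 1, 2} and a single edge {3, 4}.
g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}}
print(connected_components(g))  # [{0, 1, 2}, {3, 4}]
```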
Trees
A graph that contains no cycles is called acyclic or cycle-free. A tree is defined as a connected, acyclic graph. Trees are fundamental structures in graph theory and have many important properties.
Lemma 1.5: For a tree $T$ with $n \ge 2$ vertices:
- (a) $T$ contains at least two leaves (vertices of degree 1).
- (b) If $v$ is a leaf, then $T - v$ is also a tree.
Proof of (a): Consider any edge $\{u, v\} \in E$ and extend it to a maximal path in $T$ by walking from $u$ in one direction and from $v$ in the other, never repeating a vertex. Since $T$ is acyclic, such a walk can never return to a vertex already on the path, and since $T$ is finite, it must terminate on both sides. An endpoint of this maximal path cannot have a neighbor outside the path (otherwise the path could be extended) and cannot have a second neighbor on the path (otherwise $T$ would contain a cycle), so its only neighbor is its predecessor on the path and it has degree 1. The two endpoints are distinct because the path contains the edge $\{u, v\}$, so $T$ contains at least two leaves.
Proof of (b): Since $T$ is acyclic, removing a vertex and its incident edges cannot create any cycles, so $T - v$ remains acyclic. It remains to show that $T - v$ is connected when $v$ is a leaf. Let $x, y \in V \setminus \{v\}$. Since $T$ is connected, there is a path $P$ between $x$ and $y$ in $T$. If $P$ contained $v$, then $v$ would be an intermediate vertex of $P$ (it is neither $x$ nor $y$) and would therefore have degree at least 2, contradicting the assumption that $v$ is a leaf. Thus every $x$-$y$-path in $T$ avoids $v$ and is also a path in $T - v$. Hence $T - v$ remains connected.
Theorem 1.6: For a graph $G = (V, E)$ with $n$ vertices, the following statements are equivalent:
- (a) $G$ is a tree.
- (b) $G$ is connected and acyclic.
- (c) $G$ is connected and $|E| = n - 1$.
- (d) $G$ is acyclic and $|E| = n - 1$.
- (e) For every pair of vertices $u, v \in V$, there is exactly one $u$-$v$-path in $G$.
Proof: The proof demonstrates the equivalence of these properties through a chain of implications, using induction and contradiction to relate connectedness, acyclicity, edge count, and path uniqueness. Working through the implications is an excellent exercise in understanding the fundamental properties of trees.
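Characterization (c) gives a particularly cheap tree test in code: check connectedness and count edges. A minimal sketch, assuming the connected_components helper from the earlier sketch is in scope:

```python
def is_tree(adj):
    """Test characterization (c) of Theorem 1.6: connected and |E| = n - 1."""
    n = len(adj)
    num_edges = sum(len(neighbors) for neighbors in adj.values()) // 2
    connected = len(connected_components(adj)) == 1
    return connected and num_edges == n - 1

# A path on 4 vertices is a tree; adding the edge {0, 3} closes a cycle.
p4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
print(is_tree(p4), is_tree(c4))  # True False
```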
Lemma 1.7: A forest $F = (V, E)$ (a graph without cycles, but not necessarily connected) contains exactly $|V| - |E|$ connected components.
Proof: The proof uses induction on the number of edges $|E|$. The base case is $|E| = 0$: each vertex is its own connected component, so the number of components is $|V| = |V| - 0$. For the inductive step, assume the lemma holds for a forest $F = (V, E)$ and consider adding an edge $e$ to form $F' = (V, E \cup \{e\})$ such that $F'$ is still acyclic. The edge $e$ must connect two vertices in different connected components of $F$ (otherwise, adding it would create a cycle). Adding $e$ merges these two components into one, reducing the number of components by one while increasing the number of edges by one, so the count $|V| - |E|$ remains correct. Thus the lemma holds for $F'$.
Directed Graphs (1.1.2)
In directed graphs (or digraphs), edges have a direction. Instead of two-element sets, edges are ordered pairs of vertices, called arcs. An arc $(u, v)$ represents a directed edge from vertex $u$ to vertex $v$. The set of arcs in a directed graph is a subset of the Cartesian product $V \times V$.
A directed edge $(u, v)$ is graphically represented as an arrow from $u$ to $v$. While simple graphs do not allow loops or multiple edges, in directed graphs loops (arcs from a vertex to itself) and multiple arcs between the same pair of vertices are formally allowed, although in this course we will generally consider directed graphs without loops and multiple arcs unless explicitly stated otherwise. Note that even without multiple arcs in the same direction, we may have both $(u, v)$ and $(v, u)$ in a directed graph; these are distinct arcs.
In-degree and Out-degree
For vertices in directed graphs, we distinguish between incoming and outgoing edges.
- Out-degree ($\deg^+(v)$): The out-degree of a vertex $v$ is the number of arcs originating from $v$, i.e., arcs of the form $(v, u)$.
- In-degree ($\deg^-(v)$): The in-degree of a vertex $v$ is the number of arcs terminating at $v$, i.e., arcs of the form $(u, v)$.
Theorem 1.8: In any directed graph $G = (V, E)$, the sum of the in-degrees is equal to the sum of the out-degrees, and both are equal to the total number of arcs: $\sum_{v \in V} \deg^-(v) = \sum_{v \in V} \deg^+(v) = |E|$.
Directed Walks, Paths, and Cycles
Analogous to undirected graphs, we define directed walks, paths, and cycles in directed graphs, respecting the direction of arcs.
- Directed Walk: A directed walk is a sequence of vertices $v_0, v_1, \ldots, v_k$ such that for each $i \in \{1, \ldots, k\}$, $(v_{i-1}, v_i) \in E$.
- Directed Path: A directed path is a directed walk where all vertices are distinct.
- Directed Cycle: A directed cycle is a directed walk with $v_0 = v_k$ in which the vertices $v_0, \ldots, v_{k-1}$ are distinct.
Acyclic Digraphs (DAGs)
A directed graph that contains no directed cycles is called an acyclic digraph or simply a DAG (Directed Acyclic Graph). DAGs have many special properties and are used to model dependencies and hierarchies. A key property of DAGs is that their vertices can be topologically sorted, i.e., numbered $v_1, \ldots, v_n$ so that every arc goes from a vertex with a smaller number to a vertex with a larger number.
Theorem 1.9: For any DAG $G = (V, E)$, a topological sorting can be computed in $O(|V| + |E|)$ time.
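One standard way to achieve such a bound is Kahn's algorithm: repeatedly output and remove a vertex of in-degree 0. The sketch below is an assumed illustration (not from the original text), with the digraph given as a dictionary of outgoing-arc lists.

```python
from collections import deque

def topological_sort(adj):
    """Kahn's algorithm on a digraph {vertex: list_of_successors}.
    Returns a topological order, or None if the digraph contains a cycle."""
    indeg = {v: 0 for v in adj}
    for u in adj:
        for w in adj[u]:
            indeg[w] += 1
    queue = deque(v for v in adj if indeg[v] == 0)    # current sources
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for w in adj[u]:                  # "remove" u and its outgoing arcs
            indeg[w] -= 1
            if indeg[w] == 0:
                queue.append(w)
    return order if len(order) == len(adj) else None  # None => directed cycle

dag = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
print(topological_sort(dag))  # e.g. ['a', 'b', 'c', 'd']
```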
Strong and Weak Connectivity
Connectivity in directed graphs has two main variations:
- Strongly Connected: A directed graph is strongly connected if for every pair of vertices $u, v \in V$, there is a directed path from $u$ to $v$ and a directed path from $v$ to $u$.
- Weakly Connected: A directed graph is weakly connected if the underlying undirected graph (obtained by ignoring the direction of the arcs) is connected.
Every strongly connected graph is also weakly connected, but the converse is not necessarily true. DAGs, for instance, can be weakly connected if their underlying undirected graph is connected, but they cannot be strongly connected (except in the trivial case of a single vertex), because any strongly connected digraph on at least two vertices must contain a directed cycle.
Data Structures (1.1.3)
To efficiently work with graphs in algorithms, we need appropriate data structures to represent them. The two most common representations are:
- Adjacency Matrix: An adjacency matrix for a graph with $n$ vertices (labeled $1, \ldots, n$) is an $n \times n$ matrix $A = (a_{ij})$, where $a_{ij} = 1$ if there is an edge (or arc) from vertex $i$ to vertex $j$, and $a_{ij} = 0$ otherwise. For undirected graphs, the adjacency matrix is symmetric. For graphs without self-loops, the diagonal entries are zero.
- Adjacency List: An adjacency list representation stores, for each vertex $v$, a list of its neighbors (or, in a directed graph, the vertices reachable from $v$ by an outgoing arc). Adjacency lists are generally more space-efficient for sparse graphs (graphs with relatively few edges).
The choice between adjacency matrices and adjacency lists often depends on the specific algorithm and the density of the graph. Adjacency lists are typically preferred for algorithms like breadth-first search (BFS) and depth-first search (DFS) on sparse graphs, offering a time complexity of $O(|V| + |E|)$. Adjacency matrices, while potentially less space-efficient for sparse graphs, allow for constant-time adjacency checks and can be beneficial for dense graphs or algorithms that rely on matrix operations. Adjacency matrices also lend themselves to methods from algebraic graph theory.
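The two representations are easy to build from an edge list; the following assumed sketch constructs both for a small undirected graph.

```python
def to_adjacency_matrix(n, edges):
    """n x n 0/1 matrix for an undirected graph on vertices 0..n-1."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1
        A[v][u] = 1          # symmetric, since the graph is undirected
    return A

def to_adjacency_lists(n, edges):
    """Neighbor lists for the same graph; space proportional to n + |E|."""
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]   # the cycle C_4
print(to_adjacency_matrix(4, edges))
print(to_adjacency_lists(4, edges))
```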
Theorem 1.13: If $A$ is the adjacency matrix of a graph (or digraph), then the entry $(A^k)_{ij}$ of the $k$-th matrix power equals the number of walks of length exactly $k$ from vertex $i$ to vertex $j$.
This theorem highlights the utility of adjacency matrices in analyzing walk and connectivity properties of graphs, particularly when combined with matrix multiplication techniques.
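As a small illustration of Theorem 1.13, the sketch below raises the adjacency matrix of $C_4$ to the $k$-th power by repeated multiplication and reads off walk counts; it assumes the to_adjacency_matrix helper from the previous sketch is in scope.

```python
def mat_mult(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def walk_counts(A, k):
    """Return A^k, whose (i, j) entry counts walks of length k from i to j."""
    result = A
    for _ in range(k - 1):
        result = mat_mult(result, A)
    return result

A = to_adjacency_matrix(4, [(0, 1), (1, 2), (2, 3), (3, 0)])  # C_4
print(walk_counts(A, 2))  # entry (0, 0) is 2: the walks 0-1-0 and 0-3-0
```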