A use for Sankey Diagrams?

A Sankey diagram is a type of flow chart used to visualize the distribution or movement of quantities through a system. Named after Captain Matthew Sankey, who first used it in 1898 to depict energy efficiency in steam engines, the diagram uses arrows or paths whose widths are proportional to the quantity they represent. This makes Sankey diagrams particularly effective for showing how resources, energy, information etc, are allocated across categories. They are especially useful for identifying dominant flows, bottlenecks, or imbalances within a process or dataset.

It occured to be that it might be an interesting way to display the breakdown of marks in an exam paper.

This Sankey diagram provides a visual breakdown of how marks were distributed across the A2 Component 2 (2023) Physics paper. It begins with a single “Total Marks” block, which then splits into individual physics topics, such as Circular Motion, Electric Fields, and Nuclear Decay. Each topic node flows into different styles of questions—Calculation, Explanation, Graphing/Data, or Recall—based on how the marks were awarded.

The diagram makes it easy to see which topics carried the most weight (e.g., Oscillations and Electric Fields received high totals) and which styles of questions were most common across the paper. Calculation-heavy topics dominate the assessment (obviously), while Graphing/Data and Recall play smaller but still significant roles.

Show the code

# Install if needed
library(networkD3)

# Define nodes with Total Marks first
nodes <- data.frame(name = c(
  "Total Marks",
  "Circular Motion", "Gravitational Fields", "Electric Fields", "Capacitance",
  "Magnetic Fields", "Nuclear Decay", "Nuclear Energy", "Thermodynamics",
  "Oscillations", "Astrophysics", "Data Analysis",
  "Calculation", "Explanation", "Graphing/Data"
))

# Define total marks per topic (names MUST match exactly with nodes$name)
topic_totals <- c(
  8,  # Circular Motion
  7,  # Gravitational Fields
  9,  # Electric Fields
  8,  # Capacitance
  7,  # Magnetic Fields
  8,  # Nuclear Decay
  8,  # Nuclear Energy
  9,  # Thermodynamics
  9,  # Oscillations
  9,  # Astrophysics
  8   # Data Analysis
)

# Topics names must align with node indices 1–11
topic_links <- data.frame(
  source = 0,
  target = 1:11,
  value = topic_totals
)

# Style links from topics (1–11) to styles (12–14)
style_links <- data.frame(
  source = c(
    1, 1,  # Circular Motion
    2, 2,  # Gravitational Fields
    3, 3,  # Electric Fields
    4, 4,  # Capacitance
    5, 5,  # Magnetic Fields
    6, 6,  # Nuclear Decay
    7, 7,  # Nuclear Energy
    8, 8,  # Thermodynamics
    9, 9,  # Oscillations
    10,10, # Astrophysics
    11,11  # Data Analysis
  ),
  target = c(
    12, 13,
    12, 13,
    12, 14,
    14, 13,
    12, 13,
    12, 13,
    12, 13,
    12, 13,
    12, 13,
    12, 13,
    14, 13
  ),
  value = c(
    6, 2,
    5, 2,
    5, 4,
    5, 3,
    5, 2,
    4, 4,
    6, 2,
    5, 4,
    7, 2,
    5, 4,
    6, 2
  )
)

# Combine both sets of links
links <- rbind(topic_links, style_links)

# Draw Sankey diagram
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
              Target = "target", Value = "value", NodeID = "name",
              fontSize = 12, nodeWidth = 30,
              sinksRight = FALSE)