SankeyMATIC Gallery: A Monthly Budget
Walking through one diagram, from concept to completion
In July 2013, many media outlets ran stories about a proposed “Sample Monthly Budget” put together by McDonald's and Visa. The sample budget (in its revised edition, which includes a line item for Heating) is shown below.
The original source website has been taken down, but here are a few stories that were still available as of Spring 2014:
- Forbes: “Why McDonald's Employee Budget Has Everyone Up In Arms”
- Consumerist: “We Have Some Problems With Visa’s Sample Budget For McDonald’s Employees”
- The Atlantic: “McDonald's Can't Figure Out How Its Workers Survive on Minimum Wage”
While a list of several labeled numbers has a long history as a popular method of data presentation, let's see what kind of picture it makes with a Sankey diagram.
First, let's diagram the Income.
Each job can be treated as one “flow” of money into a shared pool from which the expenses will be taken out.
For our first line of data, let's take the income from the first job and put it into the shared pool. You can call the pool anything; I'll label it “Monthly Budget”.
Flows are entered in the format:
Source [Amount] Target
In this first line, the Source of money is called “Job 1”, the Amount of income from that job is $1,105, and the Target is “Monthly Budget”. Leave the symbols and commas out and just enter the raw number “1105”; labeling the values will come later:
Job 1 [1105] Monthly Budget
Clicking the Preview button produces this diagram:
This is as simple as a Sankey diagram can get: a single Amount is flowing from one Source to one Target.
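If your figures start out in a spreadsheet or a script, a few lines of Python can turn them into input lines in the Source [Amount] Target format. This is only a sketch: `raw_amount` and `flow_line` are hypothetical helper names, not part of SankeyMATIC, and the result is just text to paste into the input box.

```python
def raw_amount(text: str) -> str:
    """Strip currency symbols, commas, and spaces: '$1,105' becomes '1105'."""
    return "".join(ch for ch in text if ch.isdigit() or ch == ".")


def flow_line(source: str, amount_text: str, target: str) -> str:
    """Format one flow as 'Source [Amount] Target' using the bare number."""
    return f"{source} [{raw_amount(amount_text)}] {target}"


# The first flow from the walkthrough: Job 1 pays $1,105 into the shared pool.
first_flow = flow_line("Job 1", "$1,105", "Monthly Budget")
print(first_flow)  # Job 1 [1105] Monthly Budget
```

Keeping the cleanup in its own function means the same rule ("raw number only") applies to every line you generate.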
The data line for the second flow (Job 2) looks almost exactly like the first:
Now Job 1 and Job 2 are both flowing into Monthly Budget, which is showing the sum of the two.
(If you look closely, you can also see that Job 1's flow is just barely wider than Job 2's.)
Now the Income side is done.
Turning to the Expenses: For these data lines, the money's Source will be the “Monthly Budget”, and the Target will be the name of the Expense.
Starting with the first one from the list, $100 from Monthly Budget to Savings:
Monthly Budget [100] Savings
At this point, SankeyMATIC alerts us that the amount flowing into the “Monthly Budget” node does not (yet) match the amount flowing out of it.
This cross-checking can be useful to verify that you have entered your flows accurately; in a complicated diagram it can be hard to spot an imbalance visually. (In this simple diagram, the imbalance is obvious.)
Since we are still entering data, we can ignore these cautionary messages for now.
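To make the cross-check concrete, here is a rough Python sketch of the same idea: add up what flows into and out of each node and report any node where the two totals differ. It is not SankeyMATIC's own code; the `FLOW` pattern and the `imbalances` function are hypothetical, and the Job 2 amount (955) is an assumption taken from the widely reported sample budget rather than from the text above.

```python
import re
from collections import defaultdict

# Hypothetical pattern for a basic flow line: Source [Amount] Target
FLOW = re.compile(r"^(?P<source>.+?)\s*\[(?P<amount>[\d.]+)\]\s*(?P<target>.+)$")


def imbalances(lines):
    """Return {node: (total_in, total_out)} for nodes whose totals differ."""
    total_in = defaultdict(float)
    total_out = defaultdict(float)
    for line in lines:
        match = FLOW.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a flow
        amount = float(match["amount"])
        total_out[match["source"]] += amount
        total_in[match["target"]] += amount
    # Only nodes with flows on both sides can be out of balance.
    middle = set(total_in) & set(total_out)
    return {n: (total_in[n], total_out[n])
            for n in middle if total_in[n] != total_out[n]}


flows = [
    "Job 1 [1105] Monthly Budget",
    "Job 2 [955] Monthly Budget",    # assumed amount, see the note above
    "Monthly Budget [100] Savings",
]
report = imbalances(flows)
print(report)  # {'Monthly Budget': (2060.0, 100.0)}
```

Nodes at the far left or right (the Jobs, Savings) have flows on one side only, so the sketch never flags them.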
After entering the rest of the Expenses list, including the $750 for Monthly Spending Money, the diagram is “balanced” again. (Each node in the middle has the same amount flowing in & out.)
After all the data is entered:
The data is all present. Time to start tweaking the diagram's look.
First, the diagram is pretty crowded. Let's give it some room to spread out.
After changing Width & Height from 400x200 to 500x250:
Even with the larger space, the labels on the right are pretty tightly spaced. Increasing the “Space between Nodes” can help with that:
After changing Space between Nodes from 8px to 12px:
Now there's a bit more room for the text, so under the Labels & Units section you can increase its Size.
While there, let's give the amounts their proper Unit labels.
After changing Label Size from 13px to 14px and setting Units Prefix = $:
At this point, the diagram is fairly readable and could be considered complete.
However, you can go further to emphasize specific amounts using colors.
One common approach is to set the most favorable numbers to green and the least favorable numbers to red. There isn't a particularly bad flow or group of flows here, but let's try highlighting the “Spending Money” and “Savings” flows in green.
As laid out in the Manual, you can control the colors of individual nodes using “Node Definition” lines. A Node Definition is a type of data line that lets you assign a color to any Node and (optionally) a direction in which that color is ‘inherited’ by its flows.
The format of a Node Definition line is:
:Node-name #Color >> <<
In this case, we use "<<" to tell SankeyMATIC that any flows to the left of the node should inherit its color. (As there are no flows to the right of the nodes, there is no need to include the ">>" token.)
Added two Node Definition lines to the input data, setting two Nodes to green and coloring the flows into them.
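As a sketch of what those two lines might look like, the Python below assembles Node Definition lines in the `:Node-name #Color >> <<` format. The helper `node_definition` is hypothetical, the hex value is just one possible green (the exact shade used here is not stated), and the node names are assumed to match whatever names appear in your flow lines.

```python
def node_definition(name, color, inherit_left=False, inherit_right=False):
    """Build ':Node-name #Color >> <<', including only the requested tokens."""
    parts = [f":{name}", color]
    if inherit_right:
        parts.append(">>")  # flows to the right of the node take its color
    if inherit_left:
        parts.append("<<")  # flows to the left of the node take its color
    return " ".join(parts)


GREEN = "#009900"  # hypothetical shade; any HTML color code should do

# Both nodes sit at the right edge, so only the left-hand flows need coloring.
definitions = [
    node_definition(name, GREEN, inherit_left=True)
    for name in ("Savings", "Spending Money")
]
print("\n".join(definitions))
# :Savings #009900 <<
# :Spending Money #009900 <<
```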
A different coloring approach is to communicate something other than just ‘good’ or ‘bad’ flows.
Here's an experiment in styling the Job flows using McDonald's signature colors and the colors of another likely employer (the largest private employer in the US, Walmart).
This is achieved by assigning the Job flows and the Job nodes each their own specific color.
For the Job nodes, use a Node Definition line as above (but without any color inheritance in either direction). Then to assign a color to a specific flow, you add an HTML color code to the end of that flow's input line, like so:
Source [Amount] Target #Color
Assigning custom colors to the Job nodes and to the Job flows.
Hang on, why are the flow lines so pale?
Because, by default, flows are displayed as semi-transparent: if some of your flows overlap, you can still follow each one's path when they are partly see-through.
You have control over this, though; adding a suffix between .0 and .9 to your custom flow color lets you set the opacity for each flow, a la:
Source [Amount] Target #Color.Opacity
After adding opacity .8 to both custom-colored flows:
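For reference, here is a small Python sketch that builds flow lines in the `Source [Amount] Target #Color.Opacity` format. Everything in it is illustrative: `colored_flow` is a hypothetical helper, the two hex codes are guesses at brand-like colors rather than the values used in this diagram, and the Job 2 amount (955) is an assumption taken from the widely reported sample budget.

```python
def colored_flow(source, amount, target, color, opacity_tenths=None):
    """Build 'Source [Amount] Target #Color', optionally with '.Opacity'.

    opacity_tenths is a single digit 0-9, so 8 produces the suffix '.8'.
    """
    line = f"{source} [{amount}] {target} {color}"
    if opacity_tenths is not None:
        # The text describes suffixes from .0 to .9, i.e. one digit.
        if not (isinstance(opacity_tenths, int) and 0 <= opacity_tenths <= 9):
            raise ValueError("opacity_tenths must be an integer from 0 to 9")
        line += f".{opacity_tenths}"
    return line


job_flows = [
    # Hypothetical color codes; substitute your own.
    colored_flow("Job 1", 1105, "Monthly Budget", "#ffc72c", 8),
    colored_flow("Job 2", 955, "Monthly Budget", "#0071ce", 8),  # assumed amount
]
print("\n".join(job_flows))
# Job 1 [1105] Monthly Budget #ffc72c.8
# Job 2 [955] Monthly Budget #0071ce.8
```

Leaving `opacity_tenths` out produces a plain `#Color` suffix, which keeps the default semi-transparent look.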
Whatever the merits of this particular coloring scheme (or of this diagram as a whole), I hope I've conveyed that SankeyMATIC makes it fairly easy to:
- sketch out a Sankey diagram and thereby find the shape of your data
- control your diagram's size and style in various ways
- try out your own coloring ideas rapidly