How to find patterns in supply chain data?

Case study

Obtaining insight by using clustering techniques

This case study describes how clustering techniques can be applied to gain insight into data. Clustering, in general, aims to divide a dataset into groups with similar properties, which is often very difficult to do manually when a large number of variables is involved.

As an example, consider the coverage rate. Figure 1a shows that the coverage pattern of a single order makes it easy to identify what drives the performance of that particular order: the coverage rate decreases sharply at the end of the ‘Consolidation for export’ and ‘Shipping from export terminal’ processes. This is a reason to investigate these processes further.

Figure 1a: An example of a coverage pattern. The coverage decreases sharply at 5 and 10 weeks, as indicated by the red arrows.

With a large number of coverage patterns, however, it becomes more complicated. A single coverage pattern provides the insight you need for a single order, but going through every order individually is time-consuming and cumbersome, and it makes it very easy to get caught up in details and lose track of the big picture. This is shown in Figure 1b.

Figure 1b: All 26,000 coverage patterns plotted in one figure. Although the patterns contain a lot of information, it is not feasible to go through them one by one.

Obtaining insight into the coverage patterns

How can we obtain insight into the coverage patterns of a large number of orders? Averaging the coverage patterns over a long time period, for example one year, would seem like a logical step, but this inevitably leads to a loss of information, as shown in Figure 1c. This panel shows the average coverage of one year of data. While the coverage rate varies a bit over the year, the graph conveys little information overall. For example, it is hard to see in which month or for which product the coverage rate decreased or increased.

Figure 1c: The average of the coverage over a period of 1 year. Small variations are visible, but overall the loss of information compared to a single pattern is large.
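
Why averaging hides the drivers can be illustrated with a toy example. The two coverage patterns below are made up for illustration and are not taken from the case study data; each has one value per week, with 1.0 meaning fully covered. One pattern drops early and the other drops late, yet their point-wise average suggests a gradual decline that matches neither order.

```python
# Two hypothetical coverage patterns (assumed values, one per week).
early_drop = [1.0, 0.5, 0.5, 0.5]   # coverage collapses after week 1
late_drop = [1.0, 1.0, 1.0, 0.5]    # coverage collapses after week 3

# Point-wise average of the two patterns.
average = [(a + b) / 2 for a, b in zip(early_drop, late_drop)]

print(average)
```

In the average, the sharp drop that points to a specific process in each individual pattern is smeared out over several weeks.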

A lot of information can be obtained if we automatically group the patterns into buckets, where each bucket contains patterns of similar shape. This can be achieved with k-means clustering. In this method, the first step is to randomly select k points as the center points, where each point represents a single coverage pattern.

Then you assign each of the other points to its nearest center point, that is, the center whose pattern it resembles most. After that, you determine the actual center of each group, which in k-means is the mean of the points assigned to that group, and use these k centers as the new center points. These last two steps are repeated until the center points no longer change.
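
The steps above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the case study: the function names, the toy patterns, the use of squared Euclidean distance as the similarity measure, and the handling of empty clusters are all assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equally long patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_pattern(patterns):
    """Point-wise mean of a non-empty list of equally long patterns."""
    n = len(patterns)
    return [sum(values) / n for values in zip(*patterns)]


def k_means(patterns, k, seed=0, max_iter=100):
    """Return (centers, labels) for a list of equally long patterns."""
    rng = random.Random(seed)
    # Step 1: pick k patterns at random as the initial center points.
    centers = [list(p) for p in rng.sample(patterns, k)]
    labels = []
    for _ in range(max_iter):
        # Step 2: assign every pattern to its nearest center point.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in patterns
        ]
        # Step 3: move each center to the mean of its assigned patterns.
        new_centers = []
        for c in range(k):
            members = [p for p, label in zip(patterns, labels) if label == c]
            # Assumption: an empty cluster simply keeps its old center.
            new_centers.append(mean_pattern(members) if members else centers[c])
        # Stop once the center points no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical coverage patterns: three drop early, three drop late.
toy_patterns = [
    [1.0, 0.4, 0.3, 0.3], [1.0, 0.5, 0.3, 0.2], [0.9, 0.4, 0.4, 0.3],
    [1.0, 1.0, 0.9, 0.3], [1.0, 0.9, 0.9, 0.2], [0.9, 1.0, 1.0, 0.4],
]
centers, labels = k_means(toy_patterns, k=2)
print(labels)
```

Because the starting centers are chosen at random, different seeds can lead to different groupings. In practice the algorithm is often run several times, or a library implementation is used instead of a hand-written loop.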

The end result is a set of clusters, each containing points that are similar to each other. As said before, each point represents a coverage pattern, so we end up with clusters of similar coverage patterns. Two of the resulting clusters are shown in Figure 1d. It is clear that each cluster contains coverage patterns of similar shape; a cluster now represents a coverage pattern that is shared across many orders.

Figure 1d: Two clusters of coverage patterns. Each cluster contains a great number of similar coverage patterns.

More information about Supply Chain Analytics?

Would you like to know more about Supply Chain Analytics? Please contact Naser Bakhshi at +31 (0)88 288 3874 or Robert Jan Huizing at +31 (0)88 288 3154.
