One of the things that I have noticed as I look out at the world is – things change.

The weather changes, the tides change, and customer demand changes.

Of these three, variation in customer demand is perhaps the most interesting to lean practitioners because, after all, if customer demand were constant, the challenge of designing and managing the value stream would be much simpler.

So, accepting that in most cases customer demand does change, sensible follow-up questions are: By how much? And is the demand at least stable?

The answer to these questions is important because it often determines the strategy that we use to deal with that variation.

Fortunately, lean math offers some guidance. If we take the standard deviation of the customer demand and divide it by the average customer demand, the resulting dimensionless number is called the coefficient of variation (*C_v*). Low values (i.e., less than 0.2) are associated with stable customer demand, and higher values (i.e., greater than 1.0) are associated with unstable customer demand.
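A minimal sketch of the calculation, using made-up weekly demand figures for two hypothetical products (the data and function name are illustrative, not from any real data set):

```python
import statistics

def coefficient_of_variation(demand):
    """Cv = standard deviation of demand / average demand (dimensionless)."""
    # Sample standard deviation (n - 1 denominator), the usual choice
    # when the data is a slice of demand history rather than the whole population.
    return statistics.stdev(demand) / statistics.mean(demand)

# Hypothetical weekly demand for two products:
stable = [100, 105, 98, 102, 97, 101, 103, 99]
lumpy = [0, 250, 10, 0, 180, 5, 300, 0]

print(coefficient_of_variation(stable))  # well under 0.2 -> stable demand
print(coefficient_of_variation(lumpy))   # above 1.0 -> unstable demand
```

The first series hovers tightly around its mean, so its Cv lands well in the "stable" range; the intermittent, lumpy series produces a Cv above 1.0.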

Knowing the coefficient of variation for your products can be useful, especially in conjunction with the product's demand volume, because it will help determine the fulfillment or inventory replenishment strategy. For example, finished goods strategies may generically go as follows:

- high volume, low *C_v*: rate-based production
- moderate volume, moderate *C_v*: kanban (a.k.a. supermarket pull)
- low volume, high *C_v*: make-to-order

We will be addressing this subject in greater detail in a future post about demand segmentation.

Furthermore, the coefficient of variation is necessary, depending upon the formula(s) used, for sizing kanban. Kanban sizing will be addressed in future Lean Math™ posts.

## There are 14 Comments

A client of mine uses the Cv in their calculation of recommended minimum buffer stock levels. This metric is more accurate when based on a larger base of historical demand data. How do you know what sample size is appropriate? To compound the matter, their historical demand shows a lot of seasonality so what you pick as "history" can skew the Cv. While not really strictly math, can you comment on how math can be skewed by the selection of data and how to address that?

Hi Phil,

Thanks for the question. The question of what is an appropriate sample size is a bit tricky. As you point out, when you have seasonal effects, using a data sample that covers a few weeks, or months, will only give you the short-term average and the short-term variation. This is also the case when calculating things like process capability metrics.

I find the key to determining how large a sample size should be is to be very clear on what exactly it is you want to know, and how precisely you have to know it. Once that is established, there are sample size formulas that will estimate the size of the sample needed. These formulas generally leave out two things however, the cost of collecting the sample, and the population size - meaning they could suggest you collect a sample that is way beyond your budget or worse yet, larger than the actual population. So, like many things, in the end, the best way to determine these things seems to be to use some math combined with good judgement.
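One such sample size formula is the classic one for estimating a mean to within a chosen margin of error. This is a generic textbook formula, not one endorsed in the comment above, and it has exactly the blind spots Michael mentions: it ignores both the cost of sampling and the population size.

```python
import math

def sample_size_for_mean(sigma, margin, z=1.96):
    """n = (z * sigma / margin)**2: sample size to estimate a mean
    to within +/- margin, at the confidence level implied by z
    (z = 1.96 corresponds to roughly 95%). Ignores sampling cost
    and finite-population effects."""
    return math.ceil((z * sigma / margin) ** 2)

# e.g., demand std dev of ~40 units, and we want the mean within +/- 10 units:
print(sample_size_for_mean(sigma=40, margin=10))  # 62 periods of history
```

Note the formula requires a guess at sigma before you have the sample, so in practice it gets applied iteratively with judgment, as the reply suggests.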

Thanks for the post!

All the best,

Michael

Phil,

My perhaps over-simplistic answer relative to sample size is the bigger the better. If there is little seasonality, then 6 months may be fine. If there is seasonality, 12 months. What we're looking for here typically is normal variation. If there's a one-off special cause event, then we may remove the impact of that special cause when determining the Cv (thus impacting the calculated buffer stock level). It is good practice to revisit kanban/buffer stock sizing at least quarterly and whenever there are known significant changes/anticipated changes (>20% or so). This may be due to seasonality or other market dynamics.

That said, all of this math is just math. In the end we should do table top simulations of the systems we build, using historical data and other prospective "monkey wrenches," to see where, when and why it breaks, make the adjustments and then...take it to the gemba, where more PDCA will follow.

Best regards,

Mark

Phil,

Another thought. And I am sure you are aware of this, but perhaps others are not - Cv is used for calculating demand variation. It's based upon history.

When talking about kanban sizing, specifically the cycle stock portion (average period demand x lead time to replenish), we should use the better of forecasted or historical demand; the buffer portion applies the Cv (and Z score) to the cycle stock.
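The structure described in this comment could be sketched as below. Kanban sizing formulas vary by source, and the function name, Z default, and container rounding are all assumptions added here for illustration, not a definitive Lean Math formula:

```python
import math

def kanban_quantity(avg_demand, lead_time, cv, z=1.65, container_size=1):
    """Sketch of the sizing logic in the comment above:
    cycle stock covers average demand over the replenishment lead time,
    and the buffer applies the Cv and a Z score (service level factor)
    to the cycle stock. z = 1.65 roughly corresponds to ~95% service."""
    cycle_stock = avg_demand * lead_time
    buffer_stock = z * cv * cycle_stock
    # Round total up to whole containers (i.e., kanban cards).
    return math.ceil((cycle_stock + buffer_stock) / container_size)

# e.g., 100 units/day average demand, 3-day replenishment lead time,
# Cv of 0.3, 50 units per container:
print(kanban_quantity(avg_demand=100, lead_time=3, cv=0.3,
                      container_size=50))  # 9 cards
```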

Thanks,

Mark

Interesting stuff! Your blog definitely stands out from the other Lean blogs with your focus on the math. Thanks for sharing.

Hi Chris,

Thanks for the comment. Yes, this blog is definitely a niche thing. And, we're definitely putting ourselves out there. Talking about math is one thing, writing about it is another. Math is precise...until we start looking at the variety of context and experience. Real-life application definitely makes it more interesting.

Best regards,

Mark

Hi,

I might be a purist, but a measurement without an appropriate sampling strategy (e.g., decisions on how to handle seasonality) is not really acceptable in my eyes. The same is true of reporting/using an estimated value without confidence intervals. This is a bit tricky in the case of the Cv, as it is not easy to find the sample size and confidence interval calculations, but it is doable and should be done too.

Sandor, there is also the question of bucket size or time unit of measure when looking at customer demand, i.e., month, week, day, etc. For seasonality it might be sufficient to use 36 months; 1,095 days might be overkill. The whole range of topics of descriptive statistics, confidence intervals, and sampling will be covered. Perhaps you can tell us how you would determine an appropriate sample size?

For me, I use Minitab/stat/power & sample size/sample size for estimation/ one-sided upper and 95%.

Hi Mark! Big fan of your site (and book). Just one thing on the COV formula above. It appears that you are mixing greek (population parameter symbols) and Roman (sample data statistic symbols). Since we are inferring about the future from some subset of the actual past (or foreseeable future) demand, shouldn't the symbols both be Roman letters (COV = s/x-bar)?

Dave - You make a good point. You are correct that statisticians typically use Greek letters to refer to population parameters and Roman letters to refer to sample statistics. But my sense is that most non-statisticians tend to use the Greek letter sigma regardless of whether they are referring to a sample or a population. So I believe the notation used is in line with common usage. In all mathematical writing, symbols need to be defined and understood in their context, and since the meaning was clearly defined, I am comfortable with the notation used.

But excellent point, and a good post!

Curious about the treatment of seasonality within a CoV calculation. I've historically used CoV and let the seasonality ride through sample, but that artificially raises the StDev since the average is forced to incorporate a natural (and relatively predictable) peak and trough in demand.

I suppose the answer would be to measure the variation against some kind of polynomial function that allows for a seasonal peak and trough and then calculates variation from that curve.

Anyone else dealt with this type of demand pattern?
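One way to sketch the idea in this comment is to measure variation against a seasonal profile rather than a flat mean. The version below uses simple periodic seasonal indices (average for each position in the cycle divided by the grand mean) instead of the polynomial fit suggested above; function name and data are illustrative assumptions:

```python
import statistics

def deseasonalized_cv(demand, period=12):
    """Cv of demand after dividing out a simple seasonal index per
    position in the cycle. A polynomial or formal decomposition model
    could be substituted for the index calculation."""
    grand_mean = statistics.mean(demand)
    # Seasonal index for each position: its average relative to the grand mean.
    indices = [statistics.mean(demand[pos::period]) / grand_mean
               for pos in range(period)]
    adjusted = [d / indices[i % period] for i, d in enumerate(demand)]
    return statistics.stdev(adjusted) / statistics.mean(adjusted)

# Two years of hypothetical monthly demand with a summer peak:
demand = [82, 78, 91, 103, 118, 152, 158, 149, 122, 99, 92, 81,
          79, 81, 88, 98, 123, 147, 161, 152, 117, 102, 89, 78]

raw_cv = statistics.stdev(demand) / statistics.mean(demand)
print(raw_cv)                   # inflated by the seasonal swing
print(deseasonalized_cv(demand))  # much lower once the profile is removed
```

Letting the seasonality "ride through" inflates the raw Cv exactly as described; measured against the seasonal profile, the remaining variation is far smaller.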

Robert,

Excellent point, and I agree with you. The terms high/mid/low volume are relative, so I would consider them in context. For example, Chipotle by most measures would be considered high volume, and it could have a low Cv, but their business model is make-to-order, not rate-based production.

They could even be seen as an example of the emerging "mass customization" businesses that Hirono has written about. Perhaps the ideal for many businesses is one of high-volume mass customization, in which case the guidelines that I suggested in my blog post (e.g., high volume, low Cv products would use rate-based production) do not apply. And clearly other factors also come into play. Things like product shelf life come to mind. If Chipotle started making burritos for the lunchtime rush at 10 a.m., they would end up with a lot of burritos and a lot of unhappy customers. Whereas products that have a long shelf life (think compressors for air conditioners), high volume, and low Cv would be perfectly suited for rate-based production.

So, I think the answer is, like many things, that one needs to use good judgement.