Vanessa Winspeare and Sam Green examine whether basic linear regression reduces the inherent uncertainty when predicting the future.

When forensic accountants quantify damages they may need to analyse accounting and non-accounting information in order to answer the question: "What has the claimant lost as a result of the defendant's actions?"

A major barrier to answering this is the inherent uncertainty in predicting future events and their impact, for example, on future income or profit. In order to reduce this uncertainty, forensic accountants sometimes apply statistical techniques. But how accurate are these techniques and can the results be misleading?

Linear regression

One of the most common statistical tools that forensic accountants use is linear regression, or the 'line of best fit'. This technique models the relationship between two variables: one dependent and one independent. An example is the cost of production in relation to units of output, where you would expect the level of output (the independent variable) to determine the level of costs (the dependent variable), assuming a constant marginal unit cost.
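The cost and output example can be sketched in a few lines of Python. This is an illustration only: the figures are hypothetical, the helper name fit_line is our own, and the calculation is the ordinary least-squares method that spreadsheet trend lines are generally understood to use.

```python
# Least-squares line of best fit for one independent variable.
# Standard library only; all figures are hypothetical.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical monthly data: units produced and total production cost.
units = [100, 200, 300, 400, 500]
costs = [1520, 2480, 3510, 4490, 5500]

slope, intercept = fit_line(units, costs)

# Use the fitted line to estimate the cost of an output level not yet observed.
predicted_cost_600 = intercept + slope * 600

print(f"cost per extra unit (slope): {slope:.2f}")
print(f"cost at zero output (intercept): {intercept:.2f}")
print(f"predicted cost at 600 units: {predicted_cost_600:.2f}")
```

On these made-up figures the slope comes out at about 9.97 per unit and the intercept at about 509. Under the constant marginal cost assumption, these could be read as the variable cost per unit and the fixed cost respectively.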

Linear regression, as the name suggests, is based on the assumption that a linear relationship exists between the two variables. Widely used computer packages, such as Microsoft Excel, can easily perform linear regression. Forensic accountants sometimes employ linear regression models to determine losses because the models use historical data to predict future trends. However, this approach has its limitations.

The reliability of a future trend predicted by the linear regression of historical data depends on the assumption that past trends will continue into the future. The further the projection extends into the future, the more caution is needed.
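One way to see why caution should grow with the projection horizon is to look at the standard error of a predicted value, which widens as the prediction point moves away from the centre of the historical data. The sketch below uses the textbook formula for simple linear regression. The profit figures and the function names are hypothetical. The measure only captures statistical scatter around an assumed straight line, so it does not allow for the past trend ceasing to hold altogether.

```python
import math

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def prediction_std_error(xs, ys, x0):
    """Standard error of a single predicted value at x0 (simple linear regression)."""
    n = len(xs)
    slope, intercept = fit_line(xs, ys)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    residual_sd = math.sqrt(ss_res / (n - 2))  # n - 2 degrees of freedom
    return residual_sd * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)

# Hypothetical annual profits for six historical years.
years = [1, 2, 3, 4, 5, 6]
profits = [102, 108, 118, 121, 133, 138]

se_centre = prediction_std_error(years, profits, 3.5)  # middle of the history
se_year_7 = prediction_std_error(years, profits, 7)    # one year ahead
se_year_12 = prediction_std_error(years, profits, 12)  # six years ahead

for label, se in [("centre", se_centre), ("year 7", se_year_7), ("year 12", se_year_12)]:
    print(f"standard error of prediction at {label}: {se:.2f}")
```

The standard error is smallest at the centre of the historical data and increases steadily as the projection moves further out.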

A linear regression model is also based on the assumption that changes in one variable can be explained solely by changes in the other. In reality, there are many influences on the data that the model does not take into account. These influences, such as market conditions, exchange rates, prices and the competitive environment, can alter the relationship and thus undermine the quality of the prediction.

Linear regression will always draw a straight line through a set of data points (the 'best fit') regardless of whether a linear relationship exists in reality. But best fit does not necessarily signify a 'good fit', bearing in mind the purpose for which the trend line is being derived, namely to use historical trends to project into the future.
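The point that a 'best fit' line is produced whatever the shape of the data can be illustrated with a deliberately non-linear example. In the sketch below the data follow an exact curve (y equals x squared), yet the calculation still returns a straight line. The data and the helper name fit_line are hypothetical and chosen only to make the effect obvious.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# A perfectly predictable but non-linear relationship: y = x squared.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)

# Residuals show how far each actual point lies from the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(f"fitted line: y = {intercept:.1f} + {slope:.1f} * x")
print("residuals:", residuals)
```

Here the fitted line is flat, suggesting that y does not change with x at all, even though y is entirely determined by x. The large residuals at both ends are the warning sign that the straight line is the wrong model.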

The coefficient of determination

To determine the quality of the fit, it is necessary to analyse the strength of the relationship between the two variables. The most commonly used measure is the coefficient of determination (r²).

The value of r² ranges from 0 to 1, where 1 represents a perfect linear fit and 0 means no fit at all. A low value thus implies a weak linear relationship between the variables. This may result from trying to derive a linear trend from a non-linear relationship, from other variables not being taken into account in the model, or simply from there being no relationship between the two variables.
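As an illustration, the coefficient of determination can be calculated as one minus the ratio of the unexplained variation (the squared residuals) to the total variation in the dependent variable. The sketch below applies that definition to two hypothetical data sets: one close to a straight line and one that follows a curve. The function names and all figures are assumptions made for the example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys):
    """Coefficient of determination of the least-squares line: 1 - SS_res / SS_tot."""
    slope, intercept = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data that is close to linear: units produced and total cost.
units = [100, 200, 300, 400, 500]
costs = [1520, 2480, 3510, 4490, 5500]

# Hypothetical data with a strong but non-linear relationship: y = x squared.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]

r2_linear = r_squared(units, costs)
r2_curve = r_squared(curve_x, curve_y)

print(f"near-linear data: {r2_linear:.4f}")
print(f"curved data:      {r2_curve:.4f}")
```

The near-linear data give a value very close to 1, while the curved data give 0 despite the two variables being perfectly related. A low value therefore says that a straight line is a poor description of the data, not necessarily that the variables are unrelated.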

A relatively high level of correlation could lead to the conclusion that, for a given value of the independent variable, the dependent variable could be reasonably predicted. But caution still needs to be applied; there may be other influencing factors.

Consider the following example. We statistically analyse the correlation between the number of times a man sleeps with his socks on and the number of times he wakes up with a headache. A high correlation would suggest that if he wanted to avoid waking up with a headache, he shouldn't go to sleep with socks on. But there would be no logic to this inference unless there were other indicators that the wearing of socks in bed can cause headaches. Both variables might be dependent on a third factor that is not included in the model but results in the high correlation. In this example, the third and independent factor is actually that when he goes to bed drunk he forgets to take his socks off!
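The socks and headaches example can be mimicked with a small simulation. The sketch below is hypothetical throughout: it treats each night as a separate observation, and the probabilities are invented. Going to bed drunk makes both wearing socks and waking with a headache more likely, but socks have no effect of their own on headaches.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)  # fixed seed so the run is repeatable

# Each night: (drunk, socks, headache). Invented probabilities.
nights = []
for _ in range(5000):
    drunk = rng.random() < 0.3
    socks = rng.random() < (0.9 if drunk else 0.1)     # depends only on drunk
    headache = rng.random() < (0.8 if drunk else 0.1)  # depends only on drunk
    nights.append((drunk, int(socks), int(headache)))

def socks_headache_correlation(rows):
    """Correlation between socks and headache over the given nights."""
    return pearson([s for _, s, _ in rows], [h for _, _, h in rows])

overall = socks_headache_correlation(nights)
drunk_nights = socks_headache_correlation([row for row in nights if row[0]])
sober_nights = socks_headache_correlation([row for row in nights if not row[0]])

print(f"all nights:   {overall:.2f}")
print(f"drunk nights: {drunk_nights:.2f}")
print(f"sober nights: {sober_nights:.2f}")
```

Across all nights the correlation between socks and headaches is substantial. Once drunk and sober nights are looked at separately, it largely disappears, which is the pattern to expect when a third factor drives both variables.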

Provided its limitations are borne in mind, simple linear regression can be a useful tool to assist the court in matters of quantum. However, caution should always be exercised when reviewing or accepting calculations based on these techniques. This is the scepticism a good forensic accountant will always apply.
