
What are missing values, and why are they a problem?

Having missing values means that some variables do not have a measurement.

For 2D analysis, this is caused by a match series having 'gaps'. That is, some of the images have no measured spot volume. This can be because a spot wasn't detected there, or because it was detected but matched incorrectly. Both of these causes can be corrected manually, but doing so introduces subjectivity and is error-prone (as well as very time-consuming).

The screenshot below of a measurements table shows the missing values in pink (each row is a match series).

Measurements table with missing values

Missing values can cause serious problems. Most statistical procedures automatically eliminate cases with missing values, so you may not have enough data to perform the statistical analysis. Alternatively, although the analysis might run, the results may not be statistically significant because of the small amount of input data. Missing values can also cause misleading results by introducing bias.
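The effect of eliminating cases with missing values (often called listwise deletion) can be sketched in a few lines of Python. This is only an illustration: the spot names and volumes are invented, `None` stands in for a missing value, and `listwise_delete` is a hypothetical helper, not part of any particular statistics package.

```python
# Hypothetical spot volumes: one row per match series, one column per gel.
# None marks a missing value (spot not detected, or mismatched, on that gel).
match_series = {
    "spot_1": [1520.0, 1498.0, 1610.0, 1575.0],
    "spot_2": [830.0, None, 790.0, 812.0],
    "spot_3": [2210.0, 2185.0, None, None],
    "spot_4": [455.0, 470.0, 462.0, 449.0],
}


def listwise_delete(table):
    """Keep only the rows that have a measurement in every column."""
    return {
        name: values
        for name, values in table.items()
        if all(v is not None for v in values)
    }


complete = listwise_delete(match_series)
print(f"{len(complete)} of {len(match_series)} match series survive")
# prints "2 of 4 match series survive"
```

A single gap is enough to drop a whole match series, so in this toy data set half of the rows never reach the statistical analysis.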

With the traditional approach to 2D analysis, running more replicates makes this problem worse:

Fully matched spots vs. number of gels, from Houtman et al., Proteomics 2003, 3, 2008-2018

With each added gel, the chance of getting a missing value increases.

Typical levels of missing values introduced as the number of replicates per experiment increases, using the traditional analysis approach:

No. of gels   No. of spots detected   No. matched in all gels   % Missing values
2             1000                    900                       10
5             1000                    750                       25
10            1000                    600                       40
20            1000                    400                       60
100           1000                    <100                      >90
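One way to see why the numbers fall so quickly is a simple compounding model. The sketch below is an assumption made for illustration, not the method behind the table: it supposes that each gel independently yields a usable measurement for a given spot with probability `P_PER_GEL` (an invented value of 0.95), so the chance of a spot being matched in all `n` gels is that probability raised to the power `n`.

```python
def fraction_fully_matched(p_per_gel, n_gels):
    """Fraction of spots expected to be matched in every one of n_gels gels,
    assuming each gel succeeds independently with probability p_per_gel."""
    return p_per_gel ** n_gels


P_PER_GEL = 0.95  # assumed per-gel success rate, chosen only for illustration

for n in (2, 5, 10, 20, 100):
    complete = fraction_fully_matched(P_PER_GEL, n)
    missing_pct = 100 * (1 - complete)
    print(f"{n:>3} gels: about {missing_pct:.0f}% of spots have a missing value")
```

Even with a 95% success rate per gel, this model loses the majority of spots by 20 gels, which is the same general pattern as the typical figures in the table. Real gels are unlikely to behave independently, so the model should be read as a rough intuition only.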

The approach used by SameSpots avoids the missing values problem completely.