[buug] Determining which records were dropped?

Claude Rubinson cmsclaud at uga.edu
Wed Jul 2 10:49:58 PDT 2003


I've inherited a statistical analysis project from a former
employee who didn't leave behind any documentation detailing
the analysis procedure.  The dataset has 58 records.  Eight of
those records were dropped prior to the analysis -- but I
don't know which eight (I only have a copy of the aggregate
results).  I've tried locating the original analysis
specifications, to no avail.

I'm able to replicate the methods that he used; it's just the
subsetting procedure that's missing.

The obvious solution is to systematically remove eight records
from the dataset, run the analysis, and check whether the
output matches the original results, repeating until a match
is found.  (The chance of a false positive is actually very
low, as there are a number of additional analyses which can be
run to confirm that the subset has been correctly identified.)
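
For concreteness, here is a minimal sketch of that brute-force
loop in Python.  The names find_dropped, analyze, target and
tol are hypothetical, and the "analysis" is assumed to reduce
to a single number that can be compared against the known
aggregate result; the real analysis would be plugged in where
the toy mean is used.

```python
from itertools import combinations


def find_dropped(records, n_dropped, analyze, target, tol=1e-9):
    """Yield every tuple of dropped indices whose remaining
    records reproduce the target result (within tol)."""
    for dropped in combinations(range(len(records)), n_dropped):
        skip = set(dropped)
        kept = [r for i, r in enumerate(records) if i not in skip]
        if abs(analyze(kept) - target) <= tol:
            yield dropped


def mean(values):
    # Stand-in for the real analysis (assumption for this sketch).
    return sum(values) / len(values)


# Toy data, not the real dataset: 6 made-up values, 2 of them dropped.
toy = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
target = mean([1.0, 2.0, 8.0, 32.0])  # result with indices 2 and 4 removed

matches = list(find_dropped(toy, 2, mean, target))
print(matches)
```

Collecting every match, rather than stopping at the first one,
leaves room for the confirming analyses to weed out any false
positives.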

The problem is that there are approximately 1.9 billion ways
to choose 8 records out of 58 (C(58,8) = 1,916,797,311).
Since generating and testing that many combinations will take
a bit longer than is budgeted for this project, I'm hoping
that someone might be able to point me in a better direction.
Thanks.
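
The size of the search space can be checked directly.  In the
sketch below, the rate of 1000 analysis runs per second is a
purely hypothetical figure chosen for illustration, not a
measurement; math.comb needs Python 3.8 or later.

```python
from math import comb

n_records = 58
n_dropped = 8

# Number of distinct ways to choose which 8 of the 58 records were dropped.
n_subsets = comb(n_records, n_dropped)
print(n_subsets)  # 1916797311

# Hypothetical throughput (assumption): 1000 full analysis runs per second.
runs_per_second = 1000
hours = n_subsets / runs_per_second / 3600
print(f"about {hours:.0f} hours to try every subset")
```

Even at that assumed rate, an exhaustive pass would take on
the order of three weeks.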

Claude


