Need for Randomization

We may recall that in Chapter 7, in discussing the experiment on the benefit of a hypothetical plant food, and also in other contexts, we mentioned the word “random” quite a few times. We may now ask, Why random? Why not pick the first forty or so plants of one kind that the experimenter comes across and use the first twenty as subjects and the rest as controls? There are several answers to these questions. First, there is the dichotomy of “subjective” and “objective” in an investigation. What distinguishes scientific investigations is that they are meant to be objective, meaning that an individual’s findings as a result of his investigation are not tainted by his prejudices, biases, and wishful thinking, which are subjective elements. As we pointed out before, every inference, every conclusion that is offered, is yet another “brick” in the structure of science and is open for anyone interested to believe or to doubt, to accept as given, or to check for its truth-value before acceptance. A particular experimenter may be inclined to accept those objects, phenomena, or situations that are favorable to his wishful thinking and to reject those that are not. The ideal, as a code of conduct that a scientist is expected to strive for, is to free himself from this inclination by deliberately practicing randomness at all levels, thus giving equal chance to those factors that are favorable to his wishful thinking and to those that are not.

Another circumstance that gives credence to randomization is the fact that nature is filled with variety, with possibly no two “similar” things or events being “exactly similar.” Add to this the fact that an experimenter is limited in time and place, whereas the inference or conclusion drawn from his experiments is meant to hold beyond the limitations of time and place. Hence, the best he can do is to have “representatives” to experiment on, from different places and, if possible, from different times, and to create an assembly of such representatives; such an assembly becomes his “universe.” When he finds the truth-value that he surmises within his limited universe, he dares to project it as good for the outside, real world. This truth now has to contend with the world, in which variety is more the rule than the exception. It cannot be expected to be a “perfect fit” in any specific domain, but it can be a fairly close fit in most domains. Experimentally derived truth is thus essentially statistical in nature; it cannot be absolute. The way the experimenter collects representatives, often called sampling, reflects this situation: the truth derived as the end, with these representatives serving as the means, is expected to be a close fit, with varying degrees of closeness, in all the domains represented in the sample.

A simple example may illustrate the point. Suppose an experimenter is called upon to find the grain-size distribution of sand on a two-mile-long sea beach. He should not take a bucketful of sand at any convenient location on the beach and proceed to the lab for testing. He needs to collect “representative” samples from different locations over the entire two-mile stretch. Further, he cannot say to himself, Every one hundred yards I will pick a handful at the surface to be representative. He needs some sand samples from the surface, some from, let us say, a foot below the surface, some from flat surfaces, some from crevices, some dry and away from the waterfront, some wet and near the waterfront, and so forth, and to cap it off, the samples need to be collected at “random,” that is, not in conformance with any order. The necessity of this last criterion is twofold. First, the experimenter is liable to use any order as a tool for “fixing” the sampling, and thereby “fixing the truth.” Second, the size analysis he comes up with may not be found exactly anywhere on the entire beach, but it may be fairly close to the analysis found at various locations, with different degrees of closeness. Those locations where the closeness is very high, as well as those where the difference is very high, cannot be predicted. They are likely to be spread along the beach “at random.” The nature of the results the experimenter is going to derive is reflected in the way he collects the samples.
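The sampling scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure given in the text: the three groupings (depth, moisture, terrain), the number of samples per grouping, and the name `draw_sampling_plan` are all hypothetical. The sketch draws a random position along the two-mile stretch for every sample in every combination of conditions, then shuffles the order of collection so that no pattern chosen by the experimenter governs it.

```python
import random
from itertools import product

BEACH_LENGTH_YARDS = 2 * 1760  # two miles, as in the example

# Hypothetical groupings, loosely following the conditions named in the text.
DEPTHS = ["surface", "one foot below"]
MOISTURE = ["dry, away from waterfront", "wet, near waterfront"]
TERRAIN = ["flat", "crevice"]


def draw_sampling_plan(samples_per_stratum, seed=None):
    """Return sampling sites, each with a random position along the beach."""
    rng = random.Random(seed)  # a seed makes the plan reproducible
    plan = []
    for depth, moisture, terrain in product(DEPTHS, MOISTURE, TERRAIN):
        for _ in range(samples_per_stratum):
            plan.append({
                "position_yards": round(rng.uniform(0, BEACH_LENGTH_YARDS), 1),
                "depth": depth,
                "moisture": moisture,
                "terrain": terrain,
            })
    rng.shuffle(plan)  # the order of collection is randomized as well
    return plan


if __name__ == "__main__":
    for site in draw_sampling_plan(samples_per_stratum=2, seed=1)[:5]:
        print(site)
```

Recording the seed along with the results lets another investigator regenerate the same plan and confirm that the locations were not chosen by hand.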

Yet another benefit attributed to randomization is that, when used in combination with replication, the effects of nuisance factors, particularly when such factors are difficult to detect, can be reduced, though not nullified. This rests on the idea that randomization scatters, instead of focusing, the intensity of effects caused by nuisance factors. For instance, consider a paired comparison experiment in which, let us say, twenty pairs of plants are being tested. If all the control plants (A, B, C, . . .) are placed along the south edge of the fenced plot, and all the subjects (A1, B1, C1, . . .) are placed along the north edge, factors such as sunshine, wind current, and shadow coverage, which are known to influence plant growth and yield but are not controllable, may be different on the south edge from those on the north edge. When it comes to comparing the controls as a group against the subjects as a group, a bias is likely to be introduced unintentionally in favor of one group. Randomization is the remedy for such a defect. If plants were placed either at the south edge or the north edge, as decided by the toss of a coin, regardless of whether the particular plant was a subject or a control, the effect of the uncontrollable factors would be likely to even out.
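The coin-toss placement can likewise be sketched in Python. The sketch below adopts one possible reading, which is an assumption and not the text's own prescription: a single toss per pair decides which member goes to the south edge, and the other member goes to the north edge, so that both edges stay equally filled. The function name `place_pairs` and the seed are hypothetical.

```python
import random


def place_pairs(n_pairs, seed=None):
    """Place each pair's control and subject on opposite edges by a coin toss."""
    rng = random.Random(seed)  # a seed makes the layout reproducible
    layout = []
    for pair_id in range(1, n_pairs + 1):
        control_goes_south = rng.random() < 0.5  # the coin toss
        layout.append({
            "pair": pair_id,
            "control": "south" if control_goes_south else "north",
            "subject": "north" if control_goes_south else "south",
        })
    return layout


if __name__ == "__main__":
    layout = place_pairs(n_pairs=20, seed=7)
    controls_south = sum(1 for row in layout if row["control"] == "south")
    print(f"controls on south edge: {controls_south} of {len(layout)}")
```

With twenty pairs, the controls will seldom split exactly ten and ten between the edges, but neither edge is reserved for one group, which is what allows the uncontrolled differences between north and south to even out over the pairs.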

Source: Srinagesh K (2005), The Principles of Experimental Research, Butterworth-Heinemann; 1st edition.
