Abstract:
This paper describes the statistical analysis of a varietal trial with two unusual characteristics: (i) The plant (coffee) is one of those which show strong maxima and minima of production in alternate years. This phenomenon must be prevented from masking or biasing the other varietal comparisons in which we are interested. (ii) The design of the experiment is systematic. It was laid down in Campinas, Brazil, in 1933, at a time when the principles of randomisation were not so widely known as they are today.

THE EXPERIMENT AND DATA. Six varieties are compared, denoted by A, B, C, D, E and F (see page 104). They are planted in thirty rows, each with 50 plants, according to the systematic design:

A B C D E F A B C D E F A B C D E F A B C D E F A B C D E F

Data for twelve years are given in quadro 1 (Table 1), but the small and irregular yields of the first two years were discarded. Figure 1 shows that the mean yields of the remaining ten years (1935-1946) are fairly regular and consistent in their behaviour. Most of the plants, but by no means all, showed their maxima in the even years.

STATISTICAL ANALYSIS. The quantity of primary interest is the mean yield over the whole period. It is essential that these means should be based (as here) on an even number of years, in order to eliminate from their comparisons the effect of the alternation of maxima and minima. The magnitude of the oscillation is conveniently measured by the total of the even years minus the total of the odd years. Finally, we need a linear function of the annual yields for measuring secular trend, in order to pick out varieties which are slowly gaining on the others. The usual linear orthogonal polynomial (with coefficients -9, -7, -5, etc.) is unsuitable because it is not orthogonal to the component of oscillation. A suitable function is obtained instead by using the coefficients -2, -2, -1, -1, 0, 0, +1, +1, +2, +2.
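The orthogonality of these three functions, and the failure of the usual linear polynomial, can be checked numerically. The sketch below is only an illustration and not the paper's own computation: the variable names are hypothetical, and it assumes that the ten retained years are taken in order, beginning with an odd year. Reversing the sign of the oscillation coefficients would not affect the conclusion.

```python
import numpy as np

# Coefficients applied to the ten retained annual yields, in year order.
# Assumption: the first retained year is an odd year, so it gets -1.
total = np.ones(10, dtype=int)                          # sum over the period
oscillation = np.array([-1, 1] * 5)                     # even years minus odd years
trend = np.array([-2, -2, -1, -1, 0, 0, 1, 1, 2, 2])    # trend function from the text

# The usual linear orthogonal polynomial for ten points: -9, -7, ..., +9.
usual_linear = np.arange(-9, 10, 2)

# Matrix of inner products between the three chosen functions.
# Mutual orthogonality means every off-diagonal entry is zero.
contrasts = np.vstack([total, oscillation, trend])
gram = contrasts @ contrasts.T
off_diagonal = gram - np.diag(np.diag(gram))

# Inner product of the usual linear polynomial with the oscillation:
# a non-zero value means a trend measured this way would pick up oscillation.
linear_vs_oscillation = int(usual_linear @ oscillation)

print(gram)
print("usual linear . oscillation =", linear_vs_oscillation)
```

The paired coefficients in the trend function give each odd year the same weight as the even year that follows it, which is why its inner product with the alternating pattern vanishes.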
The coefficients of the three linear functions thus defined are set out in quadro 2 (page 107), where it may be verified that they are mutually orthogonal. The effect of the heterogeneity of the soil is eliminated as far as possible (separately for each of the three functions) by an analysis of covariance, using the number of the row (1-30) as the concomitant observation. A simple linear regression formula is, however, inadequate, so the regressions were taken to the fifth degree by means of orthogonal polynomials. Since the "between varieties" contribution must be removed from the sums of squares and products, the regression coefficients are no longer independently obtainable. It is found, however, that the normal equations fall into two sets, one yielding the regression coefficients of odd degree and the other those of even degree. Consequently the use of orthogonal polynomials still effects a considerable saving of work. The computations for the total are set out in full in quadro 3, and those for the oscillation and the trend in abbreviated form in quadros 4 and 5 respectively. (Note that in these tables the comma indicates the decimal point.) We find that a quadratic regression is adequate for the total and that cubic regressions are needed for the oscillation and the trend. For the sake of uniformity, a cubic regression was used in every case. The residuals found by subtracting the varietal means from the row values are plotted in figures 2a, 3 and 4a respectively, together with the regression curves and the 2.5% control limits. These control charts suggest that it is not unreasonable to suppose that the remaining variation is random. Next we use the regression formulae to correct the varietal means. The approximate 80% fiducial intervals of the mean annual yields (kg per row) and of the rate of increase of yield (kg per row per year per year) are shown in figures 2b and 4b respectively. In the case of the component of oscillation, the analysis of covariance failed to show the slightest suggestion of differences between varieties.
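The splitting of the normal equations into an odd-degree set and an even-degree set, described above, can be illustrated numerically. The sketch below uses no data from the trial. It assumes that the rows are numbered 1 to 30 in the order in which the design is written, so that row 1 carries variety A. The function names are hypothetical, and the orthogonal polynomials are built by a QR decomposition rather than taken from tables. Once the varietal means are removed from each polynomial, the sums of products between polynomials of opposite parity are zero, while those between polynomials of the same parity are not.

```python
import numpy as np

N_ROWS, N_VARIETIES, MAX_DEGREE = 30, 6, 5


def orthogonal_polynomials(n_rows, max_degree):
    """Orthonormal polynomials of degree 1..max_degree on the row numbers."""
    x = np.arange(1, n_rows + 1) - (n_rows + 1) / 2   # centred row number
    x = x / np.abs(x).max()                           # rescale for numerical stability
    powers = np.vander(x, max_degree + 1, increasing=True)
    q, _ = np.linalg.qr(powers)                       # orthonormalise 1, x, x^2, ...
    return q[:, 1:]                                   # drop the constant column


def remove_variety_means(columns, variety):
    """Subtract, within each variety, the mean of every column."""
    adjusted = columns.astype(float).copy()
    for v in np.unique(variety):
        mask = variety == v
        adjusted[mask] -= adjusted[mask].mean(axis=0)
    return adjusted


# Systematic design: A B C D E F repeated five times (coded 0..5).
variety = np.arange(N_ROWS) % N_VARIETIES

polys = orthogonal_polynomials(N_ROWS, MAX_DEGREE)
adjusted = remove_variety_means(polys, variety)

# Sums of squares and products after removing the "between varieties" part.
gram = adjusted.T @ adjusted

degrees = np.arange(1, MAX_DEGREE + 1)
odd = np.where(degrees % 2 == 1)[0]
even = np.where(degrees % 2 == 0)[0]
cross_parity = gram[np.ix_(odd, even)]

print("largest odd-even product:", np.abs(cross_parity).max())
print("degree 1 with degree 3:  ", gram[0, 2])
```

The decoupling follows from the symmetry of the layout: reversing the order of the rows exchanges A with F, B with E and C with D, so the varietal means of an odd-degree polynomial change sign under the exchange while those of an even-degree polynomial do not.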
DISCUSSION. An examination of the regressions on row number reveals the interesting fact that the more fertile portions of the field produce lower yields in the odd years than the less fertile portions. The reason is presumably that the heavier yields in the even years, by exhausting the plant, depress the yields in the following years. The major differences between varietal means over the ten years were sufficiently clear even before the analysis, though some of the adjustments are appreciable. A striking fact is that, although there are large general differences between varieties, there are no significant differences between them in respect of the amplitude of oscillation. In other words, the increment of yield in the better varieties is obtained equally in odd and even years. In spite of the large component of oscillation, it is possible to discriminate between varieties in respect of their rate of increase of yield (figure 4b).

CONCLUSIONS. (i) The extra difficulty introduced by the strong alternations of yield from year to year can be overcome by the choice of suitable orthogonal functions of the yearly yields. (ii) Once again a systematic design is found wanting: it fails to eliminate the effect of soil heterogeneity from varietal comparisons. This defect can, however, be removed for practical purposes by an adequate analysis of covariance on row number.