GMichael

George A. MICHAEL
PhD in Neuropsychology
University Professor
Université Lyon 2
Dept. of Cognitive Psychology & Neuropsychology

E-mail: george.michael@univ-lyon2.fr



 


14 February 2007, 13:07

Case-Group comparisons with the Q’ test

The comparison of the scores obtained by an individual in multiple tests (or conditions of a test), as well as the direct comparison of the performance of two or more individuals relative to a control group, is of central importance in clinical research, mainly in cognitive neuropsychology. The existing inferential methods allow the comparison of one, two or k scores of an individual to a control group, but there are no satisfactory statistical tools for the analysis of factorial designs. On the other hand, the direct comparison of two or more individuals (once referred to the control group) is sometimes necessary for some conclusions to be drawn. For instance, a firm conclusion that two cognitive subsystems are independent is only possible when two patients are directly compared with each other, and the comparison shows that, “…on task I, patient A performs significantly better than patient B, but on task II, the position is reversed” (Shallice, 1988, p.235). The existing statistical tools do not allow such a direct comparison. The Q’ test (Michael, 2007) presented here is based on the analysis of the proportions of subjects having obtained less extreme scores than the individual, and allows the comparison of one, two, or k scores obtained by an individual to the scores obtained by a control group, whatever its size. It also allows similar comparisons of data obtained in 2x2 and 2xk designs. Of course, in the last two cases, it is also possible to compare 2 or k scores obtained by 2 individuals, or even 2 scores obtained by k individuals. With some simple changes, this can be extended to a 2x2xk design too.

(a) One score
The main question is whether an individual’s single score is more extreme than expected. The first step is to find the proportion of the population that would obtain scores less extreme than that obtained by the individual. This proportion corresponds to the unilateral probability of the z-score computed as follows:


$$z = \frac{x - \bar{X}}{S}$$
(eq.1)

where x is the individual’s score, $\bar{X}$ denotes the mean group score, and S denotes the standard deviation of the group. The z-score is referred to the standard normal distribution, from which the proportion of subjects having less extreme scores can be found (for example, with the aid of a statistical table of the normal distribution like the one presented at the end of this article). This proportion should then be entered in the following equation


(eq.2)

where p denotes the proportion of subjects having less extreme scores than the individual, and N denotes the number of subjects in the control group. The Q’ follows the normal distribution, thus the P-value is the corresponding probability read in a table of the normal distribution. Let’s say that the individual’s score is 45, the mean group (N=16) score is 52 and the standard deviation is 14. The z-score is -0.5 and p=0.309. The equation is then as follows:

The corresponding bilateral P-value is 0.088 (for unilateral values, just divide the bilateral P-value by 2). There is a trend for the individual’s score to be lower than the mean group score. Note that a t-test would yield a P-value of 0.064 which is not so different from the one obtained with the Q’ test. When comparing the results of the Q’ and the t-test for diverse data, they are very close with the Q’ being slightly more conservative.
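The first step of the one-score procedure can be sketched in a few lines of Python. This is only an illustration: it assumes the usual z-score definition (which reproduces the z=-0.5 and p=0.309 of the example), the function names are hypothetical, and it does not implement eq.2 itself. The last call simply converts the Q’ value of -1.7 reported later in the article for this score into a bilateral P-value under the normal distribution.

```python
from statistics import NormalDist


def less_extreme_proportion(x, group_mean, group_sd):
    """Return (z, p): the z-score and its unilateral normal probability."""
    z = (x - group_mean) / group_sd
    p = NormalDist().cdf(z)  # proportion of the population below z
    return z, p


def bilateral_p_from_normal(q):
    """Two-sided P-value for a statistic referred to the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(q)))


# Values from the example: score 45, group mean 52, standard deviation 14.
z, p = less_extreme_proportion(45, 52, 14)
print(round(z, 2), round(p, 3))

# Bilateral P-value for the rounded Q' of -1.7 quoted in the article.
print(round(bilateral_p_from_normal(-1.7), 3))
```

The P-value obtained from the rounded Q’ is about 0.089, very close to the 0.088 quoted above; the small gap presumably comes from rounding Q’ to one decimal.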

(b) Two or k scores
For analyses other than the comparison of a single score, the procedure is somewhat more complex. The logic is that if, as referenced to the control group, the position of all scores is equally extreme (i.e., if the proportion of subjects obtaining less extreme scores than the individual is similar in all the tested conditions), then the performance pattern of the individual follows the pattern of the controls. That is, even if the individual’s scores in each condition differ significantly from the mean scores of the control group, this can be attributed to a general change due to the pathological condition. On the other hand, if the proportion of subjects obtaining less extreme scores than the individual is different in some (or all) of the tested conditions, then the performance pattern of the individual does not follow the pattern of the controls. The Q’ test will yield a significant difference, meaning that, compared to the pattern of the controls, at least one of the individual’s scores differs markedly from the others. For k scores (and factorial designs), post-hoc comparisons can localize the sources of significant differences.

For these tests to be correctly carried out, some adjustments of the initial scores are required. Suppose that the proportions of individuals obtaining less extreme scores than the individual are 0.01 and 0.02. These proportions are too extreme and too weak to yield any difference when entered as such in the equations of Q’ test given below (Michael, 2007), even though a real difference may exist. In order to avoid this problem, (a) the mean score of the individual (all scores included) and the mean score of the control group (all scores included) must be first averaged. Let’s say that the individual’s scores are 45 and 55, respectively, and the corresponding mean scores of the control group are 52 and 52. Thus, the mean score is 50 and 52 for the individual and the group, respectively, and the grand mean is 51. Then, (b) the difference from the grand mean is computed for both the individual and the group. In the example, the individual is at -1 point from the grand mean (50-51=-1) and the group at +1 (52-51=1). Finally, (c) this difference is subtracted from each score of the individual and from each mean score of the controls. Thus, the adjusted scores are 46 [45-(-1)=46] and 56 [55-(-1)=56] for the individual, and 51 [52-(1)=51] and 51 [52-(1)=51] for the control group. This adjustment procedure results in the controls and the individual having exactly the same mean yet keeping the difference between the scores the same, avoiding therefore the extreme proportions and keeping the performance pattern stable. All the computations are carried out on these adjusted values.
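The three adjustment steps (a) to (c) can be written as a short Python sketch. The function name `adjust_scores` is hypothetical and the code only restates the arithmetic of the paragraph above, using the same example values.

```python
def adjust_scores(individual, control_means):
    """Shift both sets of scores so that they share the same overall mean."""
    ind_mean = sum(individual) / len(individual)
    ctl_mean = sum(control_means) / len(control_means)
    # (a) grand mean of the two overall means
    grand_mean = (ind_mean + ctl_mean) / 2
    # (b) distance of each overall mean from the grand mean
    ind_shift = ind_mean - grand_mean
    ctl_shift = ctl_mean - grand_mean
    # (c) subtract that distance from every score
    adj_individual = [s - ind_shift for s in individual]
    adj_controls = [m - ctl_shift for m in control_means]
    return adj_individual, adj_controls, grand_mean


adj_individual, adj_controls, grand_mean = adjust_scores([45, 55], [52, 52])
print(grand_mean)      # 51.0
print(adj_individual)  # [46.0, 56.0]
print(adj_controls)    # [51.0, 51.0]
```

After the adjustment both sets of scores have a mean of 51, while the 10-point difference between the individual’s two scores is preserved.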

As previously, the z-score is computed for each adjusted value using eq.1, in which the standard deviation is the original one, and the proportion (p) of subjects that obtain less extreme scores than the individual is found using the table of the normal distribution.

Each proportion will be compared to the pooled proportion of subjects obtaining less extreme scores

$$p = \frac{1}{k}\sum_{i=1}^{k} p_i$$

where k denotes the number of scores. Of course, the sample size of the pooled proportion is the sum of the sample sizes of all the conditions

$$N = \sum_{i=1}^{k} N_i$$
The variance of each proportion is computed with the following equation


(eq.3)

For each proportion, the difference with the pooled proportion is derived

$$d_i = p_i - p$$
(eq.4)

The variance of the difference between each proportion and the pooled proportion is the sum of their respective variances (each one computed with eq.3), denoted $D_i$ for the i-th proportion. The difference $d_i$ of each proportion from the pooled proportion is then divided by the corresponding variance of the difference, and all the resulting values are summed up:

$$\sum_{i=1}^{k} \frac{d_i}{D_i}$$
(eq.5)

Subsequently, the value 1 is divided by each variance of difference and all the resulting values are summed up:

$$\sum_{i=1}^{k} \frac{1}{D_i}$$
(eq.6)

Now, dividing eq.5 by eq.6 results in the expected difference if the difference between each proportion and the pooled proportion was 0. This difference is denoted d0. The formula of the Q’ is then


(eq.7)

The Q’ follows the chi-square distribution with k-1 degrees of freedom. The critical value above which the difference of the tested scores is significant can be read in the corresponding chi-square table.

Let’s consider the above example, where the adjusted scores of the individual are, respectively, x=46 and y=56, and the corresponding adjusted mean scores of the controls x=51 and y=51. The sample sizes of the control group are 16 and 16, respectively, and, for ease, the standard deviation will be kept at 14 for both conditions. The z-scores are thus zx=-0.36 and zy=0.36, and the corresponding proportions are, respectively, px=0.36 and py=0.64. The pooled proportion is p=(0.36+0.64)/2=0.5 and its sample size is N=16+16=32 .
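The z-scores, proportions and pooled values of this example can be checked with the following Python sketch. It assumes the usual z-score definition and uses the normal distribution from the standard library in place of the printed table; the variable names are illustrative only.

```python
from statistics import NormalDist

adjusted_individual = [46, 56]   # adjusted scores x and y of the individual
adjusted_controls = [51, 51]     # adjusted mean scores of the controls
sds = [14, 14]                   # original standard deviations
sizes = [16, 16]                 # control sample size in each condition

z_scores = [(x - m) / s
            for x, m, s in zip(adjusted_individual, adjusted_controls, sds)]
proportions = [NormalDist().cdf(z) for z in z_scores]

# Pooled proportion: mean of the k proportions; pooled N: sum of sample sizes.
pooled_p = sum(proportions) / len(proportions)
pooled_n = sum(sizes)

print([round(z, 2) for z in z_scores])     # [-0.36, 0.36]
print([round(p, 2) for p in proportions])  # [0.36, 0.64]
print(round(pooled_p, 2), pooled_n)        # 0.5 32
```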

Using eq.3, the variance of each proportion, as well as the variance of the pooled proportion, is as follows:




The difference of each proportion from the pooled proportion, is




and the variances of their differences are, respectively




The value of the d0 is thus




This value is 0 because only two scores are examined and because the sample sizes and the standard deviations of the controls are identical in both cases. CAUTION! This is absolutely not the case when the comparison concerns k scores and when the sample sizes and the standard deviations differ. Finally, the value of the Q’ is
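The caution above can be illustrated with a small Python sketch of the ratio of eq.5 to eq.6 as described in the text. The differences come from the example (0.36 and 0.64 against a pooled 0.5); the variances of difference passed in are arbitrary placeholders chosen for illustration, not values computed with eq.3, and the function name is hypothetical.

```python
def expected_difference(diffs, variances):
    """d0: sum of d/D (eq.5) divided by sum of 1/D (eq.6)."""
    weighted_sum = sum(d / D for d, D in zip(diffs, variances))
    weight_total = sum(1 / D for D in variances)
    return weighted_sum / weight_total


diffs = [-0.14, 0.14]  # 0.36 - 0.5 and 0.64 - 0.5

# Identical (placeholder) variances: the two terms cancel and d0 is 0.
d0_equal = expected_difference(diffs, [0.02, 0.02])

# Different (placeholder) variances: d0 is pulled toward the more precise term.
d0_unequal = expected_difference(diffs, [0.01, 0.03])

print(round(d0_equal, 6), round(d0_unequal, 6))
```

With equal variances the result is 0, as in the worked example; with the unequal placeholder variances it is -0.07, which shows why d0 cannot be assumed to vanish in general.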




The number of scores being k=2, the degrees of freedom are df=2-1. The critical chi-square value for df=1 being 3.84, it can be said that the pattern of results of the individual does not differ significantly from the pattern of the controls. The precise P-value is 0.15, and the overall result is presented as Q’(1)=2.07; P<0.15.
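When no chi-square table is at hand, the P-value for df=1 can be obtained from the standard library alone, because a chi-square variable with one degree of freedom is the square of a standard normal variable. The sketch below is valid for df=1 only, and the function name is hypothetical.

```python
import math


def chi2_sf_df1(q):
    """Upper-tail probability of the chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(q / 2.0))


p_value = chi2_sf_df1(2.07)    # Q' of the worked example
p_critical = chi2_sf_df1(3.84) # critical value quoted in the text

print(round(p_value, 2))     # 0.15
print(round(p_critical, 2))  # 0.05
```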

The one-score test described earlier can be subsequently used for assessing each score separately. In the example, the first score differs only marginally from the score of the controls (Q’=-1.7; P<0.088) but the second one does not (Q’=0.76; P<0.45).


(c) Factorial designs: 2x2 and 2xk
Up until now, we have examined the case of one, two and k scores. Tests that perform these analyses already exist in the literature, but using the Q’ test for them keeps the analyses coherent, especially when they are combined with the tests described below. The real strength of the Q’ test is the analysis of some factorial designs.

The very first step involves the same adjustment of the original scores, just as described earlier for 2 and k scores. It is important to note that ALL the scores of the individual and the controls should be used for the computation of the grand mean. Furthermore, ALL the scores are used for this computation even if the analysis concerns the comparison of 2 or more individuals (e.g., 2 or k scores of 2 individuals, or 2 scores of k individuals). The second step is also identical, and it concerns the computation of the z-scores and the use of the tables of the normal distribution for finding the corresponding proportions.

For the analysis of the factorial design to begin, the proportions have to be arranged in 2 columns and k rows, so that levels 1 and 2 of the 2-level factor are next to each other for each level of the k-level factor (see proportions in the table). This arrangement is the same for the 2x2 design. The variance (Var) of each proportion is computed using eq.3. For each level of the k-level factor, the difference (d) between the proportions of levels 1 and 2 of the 2-level factor is computed using eq.4. For each level of the k-level factor, the variance of the difference in proportions (D, which is the sum of their variances, as described earlier) for levels 1 and 2 of the 2-level factor is computed. Using eq.5 and eq.6 allows computing the expected difference (d0), and, finally, eq.7 allows computing the value of the Q’, with k-1 degrees of freedom.

An example of a 2xk design is presented in the table. Suppose that a patient and sixteen controls participated in a test where they had to identify pictures of objects. The pictures were either colored or grey-scaled (levels 1 and 2 of the 2-level factor, which we can name “color”), and were either degraded by a lot of added noise, degraded by a little added noise, or intact (levels a, b and c of the 3-level factor, which we can name “noise”). The scores are the number of pictures correctly identified from a total of 100 pictures per condition. For ease, the standard deviation is assumed to be the same in each condition. The scores obtained by the individual and the controls are presented in the table (see original scores). The grand mean is 50.5. The individual’s distance from this mean is -3.2 and that of the controls is +3.2.




A detailed visual inspection of the proportions (which are also plotted in the accompanying graph) suggests that, in the case of the colored pictures, the pattern changes little as a function of noise. At the beginning of the description of the analysis of 2 and k scores, it was stated that if the proportion of subjects having obtained less extreme scores is the same over the tested conditions, the pattern of performance of the individual is close to the pattern of the controls. This may be the case for the colored pictures. Conversely, there are large differences in the proportions observed in the grey-scaled condition, suggesting that, this time, the performance of the individual might differ from that of the controls. The visual inspection also suggests that there might be a significant color X noise interaction, since there are large differences between the colored and grey-scaled pictures at least for levels a and b of noise. Indeed, the color X noise interaction reaches significance (Q’(2)=7.23; P<0.027), suggesting that the patient’s picture identification performance varies as a function of the color and the presence of noise. The analysis of each score (with eq.2) shows that the patient identifies fewer pictures than the controls in the grey-scaled/a lot of noise condition (Q’=-3.57; P<0.0004, bilateral) and in the grey-scaled/little noise condition (Q’=-4.01; P<0.0001, bilateral), whilst there is no difference in the grey-scaled/no noise condition (Q’=0; P<1, bilateral), nor in the three colored conditions.


(d) Main Effects in 2x2 and 2xk designs
Usually, factorial designs allow the analysis of the main effect of each factor. This is also the case for the Q’ test. The procedure is identical to the one used for the analysis of 2 or k scores. The proportions collapsed across the factor that is not currently examined are compared to the pooled proportion. In the previous example, for the analysis of the main effect of noise, the proportions for each of the three levels are collapsed across the color factor. Thus, the proportions corresponding to the three levels are 0.37, 0.432 and 0.733, respectively, and their sample sizes are 32, 32 and 32, respectively (i.e., the sum of the sample sizes of the two levels of the collapsed factor). The pooled proportion to which these three values are compared is 0.512 and its sample size is 96 (N=32+32+32). The degrees of freedom correspond to k-1. The Q’ test yields a significant main effect of noise (Q’(2)=8.87; P<0.012), suggesting that, independently of the color, the level of noise affects the performance pattern of the patient in a way that differs from the pattern of the controls.

The same procedure is used for the main effect of the 2-level factor, but this time the proportions are collapsed through the k-level factor. In the example, there is a significant main effect of color (Q’(1)=9.66; P<0.0019) suggesting that there is a difference between the colored (p=0.692) and grey-scaled (p=0.331) pictures whatever the noise, and that this pattern is different from the one observed in the performance of the controls. In point of fact, an inspection of the table (see original scores) suggests that the controls obtain similar mean scores for colored (score=52.3) and grey-scaled (score=55) pictures, whilst the mean score of the patient for grey-scaled pictures (score=41.3) is lower than the mean score for colored pictures (score=53.3).


(e) Multiple Comparisons
The presence of a significant interaction or a significant main effect of a k-level factor does not by itself reveal where the effect comes from. The following equation allows multiple comparisons to be conducted in order to localize the sources of significance


(eq.8)

where d denotes the difference of the two proportions to compare, and D denotes the variance of their difference. The value of the q’ follows the chi-square distribution with k-1 degrees of freedom. In the previous example, the main effect of noise reached significance. However, it is difficult to localize the sources of significance, unlike the main effect of the “color” factor, which was easy to interpret because it has only 2 levels. Post-hoc comparisons suggest that there is no difference between the high and low quantities of noise (q’=0.29; P>0.866), but there are significant differences between the high noise and the no noise conditions (q’=10.82; P<0.005), as well as between the low noise and the no noise conditions (q’=7.24; P<0.027). It can thus be concluded that the presence of noise, whatever its quantity, degrades the performance of the patient more than the performance of the controls. Similar analyses can be done to localize the sources of the significant color X noise interaction.
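The P-values reported for the 3-level factor (df=k-1=2) can be checked without a table, because the upper-tail probability of the chi-square distribution with 2 degrees of freedom has the closed form exp(-q/2). The Python sketch below applies only to df=2 and takes the Q’ and q’ values as quoted in the text; it does not recompute them, and the function name is hypothetical.

```python
import math


def chi2_sf_df2(q):
    """Upper-tail probability of the chi-square distribution with 2 df."""
    return math.exp(-q / 2.0)


reported = {
    "color X noise interaction": 7.23,
    "main effect of noise": 8.87,
    "high vs low noise": 0.29,
    "high vs no noise": 10.82,
    "low vs no noise": 7.24,
}

p_values = {label: chi2_sf_df2(q) for label, q in reported.items()}
for label, p in p_values.items():
    print(f"{label}: P = {p:.4f}")
```

The values obtained this way agree with the thresholds given in the text (for instance, a P-value just under 0.012 for the main effect of noise and about 0.865 for the high versus low noise comparison).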



REFERENCES
- Michael, G.A. (2007). A significance test of interaction in 2xK designs with proportions. Tutorials in Quantitative Methods for Psychology, 3, 1-7.
- Shallice, T. (1988). From Neuropsychology to Mental Structure. Cambridge: Cambridge University Press.



Unilateral Probabilities For The Normal Distribution
The following table gives the unilateral probabilities for z-scores from -3 to 3. The leftmost column contains the z-scores with the first decimal, and the upper row contains the second decimal. If the z-score is 0.46, then the proportion of subjects having obtained less extreme scores than the individual is given at the intersection of 0.40 (on the left) and 0.06 (on the top): 0.67724.
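The same unilateral probabilities can be computed directly rather than read from a table. The following Python sketch uses the normal distribution from the standard library; the function names and the row layout (first decimal on the left, second decimal across the top, shown here for non-negative z-scores only) are assumptions made for illustration.

```python
from statistics import NormalDist


def unilateral_probability(z):
    """Proportion of the normal population falling below the z-score."""
    return NormalDist().cdf(z)


def table_row(z_first_decimal):
    """One table row: probabilities for second decimals .00 to .09."""
    return [round(unilateral_probability(z_first_decimal + j / 100), 5)
            for j in range(10)]


lookup = round(unilateral_probability(0.46), 5)
row = table_row(0.4)

print(lookup)  # 0.67724
print(row[6])  # 0.67724, the intersection of 0.40 and 0.06
```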







