Fig. A. An abundance-dependent error model derived from regression of observed variances. To generate variance estimates more robust than those obtainable from three or four replicate measurements of an individual gene at a single timepoint, we pooled variance estimates from large groups of genes with very similar abundances at any timepoint and used these pooled (or regressed) estimates for ANOVA testing. Blue dots indicate the observed variances $s_{ij}^2$ of 6617 genes on the A array (one of three total arrays), each measured in 12 experimental groups (timepoints). Only one-tenth of the data (7941 blue dots) are plotted to avoid crowding the graph. The black line is the windowed median fit $r_{ij}^2$ to the observed variances. Regressed error estimates were obtained by referring the observed means to this fitted line. The y-axis is truncated to show detail of the fit; some observed variances exceed 1.
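The pooling step described in Fig. A can be illustrated with a minimal sketch. It assumes the windowed median is a running median over genes sorted by mean abundance and that "referring the observed means to this fitted line" amounts to linear interpolation. The function names, the window size and the interpolation choice are illustrative assumptions, not details given in the caption.

```python
import numpy as np

def windowed_median_fit(means, variances, window=101):
    """Running median of observed variances over genes sorted by mean abundance.

    `window` (number of neighbouring points) is a hypothetical parameter;
    the caption does not state the window size that was used.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    order = np.argsort(means)
    m_sorted = means[order]
    v_sorted = variances[order]
    half = window // 2
    n = len(m_sorted)
    fitted = np.empty(n)
    for k in range(n):
        lo = max(0, k - half)          # window is clipped at the edges
        hi = min(n, k + half + 1)
        fitted[k] = np.median(v_sorted[lo:hi])
    return m_sorted, fitted

def regressed_variance(mean_value, fit_means, fit_vars):
    """Look up the fitted variance for an observed mean (assumed: linear interpolation)."""
    return np.interp(mean_value, fit_means, fit_vars)
```

Because the fit uses a median, a single aberrant variance within a window does not shift the pooled estimate, which is the robustness property the caption relies on.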
Fig. B. Cluster analysis reveals coordinate expression patterns. Of the 3157 genes with P<0.001 in either multi-sample ANOVA, 97% are included in 106 clusters that meet coherence criteria (average pairwise correlation coefficient in the presence of simulated noise greater than 0.7). Clusters are numbered according to size and labeled by number. Axis labels are included for Cluster 1 (bottom right). Clusters are arranged using the first two principal components of their centroids, so that similar patterns are placed near each other.
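The coherence criterion in Fig. B can be sketched as follows. The caption does not specify the noise model, so this sketch assumes additive Gaussian noise with a user-supplied standard deviation, averaged over repeated simulations. The function names and the parameters `noise_sd`, `n_sim` and `seed` are hypothetical.

```python
import numpy as np

def mean_pairwise_correlation(profiles):
    """Average Pearson correlation over all distinct pairs of gene profiles.

    `profiles` has one row per gene and one column per timepoint.
    """
    profiles = np.asarray(profiles, dtype=float)
    corr = np.corrcoef(profiles)                  # genes x genes matrix
    upper = np.triu_indices(corr.shape[0], k=1)   # each pair counted once
    return float(corr[upper].mean())

def coherence_with_noise(profiles, noise_sd, n_sim=100, seed=0):
    """Mean pairwise correlation after adding simulated noise (assumed Gaussian)."""
    rng = np.random.default_rng(seed)
    profiles = np.asarray(profiles, dtype=float)
    scores = []
    for _ in range(n_sim):
        noisy = profiles + rng.normal(0.0, noise_sd, size=profiles.shape)
        scores.append(mean_pairwise_correlation(noisy))
    return float(np.mean(scores))
```

Under these assumptions, a cluster would be retained when `coherence_with_noise(...)` exceeds 0.7, the threshold given in the caption.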
Fig. C. Expression classes can be subdivided into non-overlapping kinetic subclasses. Gene expression profiles for the average of each of five maternal degradation subclasses (A), 10 embryonic subclasses (B), 10 strictly embryonic subclasses (C) and eight embryonic transient subclasses (D). Each gene was mean-normalized before computing the subclass average. The legend in A applies to A-D. Legend labels correspond to the defining time point in each subclassification scheme, i.e. time of earliest significant decrease (A), earliest significant increase (B,C) and maximum abundance (D).