SnPM is an SPM toolbox developed by Andrew Holmes & Tom Nichols
|
This page... introduction example data background design setup computation viewing results SnPM pages... SnPM manual PET example fMRI example FSL users |   |
SnPM exampleIn this section we analyze a simple motor activation experiment with the SnPM software. The aim of this example is three-fold:
|
You can down load the data from ftp://www.fil.ion.ucl.ac.uk/spm/data/PET_motor.tar.gz or, in North America, from ftp://rowdy.pet.upmc.edu/pub/outgoing/PET_motor.tar.gz. We are indebted to Paul Kinahan and Doug Noll for sharing this data. See this reference for details: Noll D, Kinahan et al. (1996) "Comparison of activation response using functional PET and MRI" NeuroImage 3(3):S34.
Currently this data is normalized with SPM94/95 templates, so the activation site will not map correctly to ICBM reference images.
Exchangeability
First we need some definitions.
We'll make this concrete by defining these terms with our data, considering just one voxel (i.e 12 values). Our labels are 6 A's and 6 B's. A reasonable null hypothesis is ``The A observations have the same distribution as the B observations''. For a simple statistic we'll use the difference between the sample means of the A & B observations. If we say that all 12 scans are exchangeable under the null hypothesis we are asserting that for any permutation of A's and B's applied to the data the expected value of the difference between A's & B's would be zero.
This should seem reasonable: If there is no experimental effect the labels A and B are arbitrary, and we should be able to shuffle them with out changing the expected outcome of a statistic.
But now consider a confound. The most ubiquitous confound is time. Our example data took over two hours to collect, hence it is reasonable to suspect that the subject's mental state changed over that time. In particular we would have reason to think that difference between the sample means of the A's and B's for the labeling
Exchangeability Blocks
The permutation approach requires exchangeability under the null hypothesis. If all scans are not exchangeable we are not defeated, rather we can define exchangeability blocks (EBs), groups of scans which can be regarded as exchangeable, then only permute within EB.
We've made a case for the non-exchangeability of all 12 scans, but what if we considered groups of 4 scans. While the temporal confound may not be eliminated its magnitude within the 4 scans will be smaller simply because less time elapses during those 4 scans. Hence if we only permute labels within blocks of 4 scans we can protect ourselves from temporal confounds. In fact, the most temporally confounded labeling possible with an EB size of 4 is
This brings us to the impact of EB size on the number of permutations. The table below shows how EB size affects the number of permutations for our 12 scan, 2 condition activation study. As the EB size gets smaller we have few possible permutations.
| EB size | Num EB's | Num Permutations | ||
| 12 | 1 | 12C6 | = | 924 |
| 6 | 2 | (6C3)2 | = | 400 |
| 4 | 3 | (4C2)3 | = | 216 |
| 2 | 6 | (2C1)6 | = | 64 |
Hence we have to make a trade off. We want small EBs to ensure exchangeability within block, but very small EBs yield insufficient numbers of permutations to describe the permutation distribution well, and hence assign significance finely. We usually will use the smallest EB that allows for at least hundreds of permutations (unless, of course, we were untroubled by temporal effects).
First, if you have a choice, choose a machine with lots of memory. We found that this example causes the Matlab process to grow to at least 90MB.
Create a new directory where the results from this analysis will go. Either start Matlab in this directory, or cd to this directory in an existing Matlab session.
Start SnPM by typing
Now we come to
Next you need to enter the ``conditions index.'' This is a sequence of A's and B's (A's for activation, B's for baseline) that describe the labeling used in the experiment. Since this experiment was not randomized we have a nice neat arrangement:
Exchangeability Business
Next you are asked about variance smoothing. If there are fewer than 20 degrees of freedom available to estimate the variance, variance smoothing is a good idea. If you have around 20 degrees of freedom you might look at at the variance from a SPM run (Soon we'll give a way to look at the variance images from any SPM run). This data has 12-2-1=9 degrees of freedom at each voxel, so we definately want to smooth the variance.
It is our experience that the size of the variance smoothing is not critical, so we suggest 10 mm FWHM variance smoothing. Values smaller than 4 mm won't do much smoothing and smoothing probably won't buy you anything and will take more time. Specify 0 for no smoothing.
The next question is "Collect Supra-Threshold stats?" The default statistic is the maximum intensity of the t-statistic image, or max pseudo-t if variance smoothing is used. If you would like to use the maximum supra-threshold cluster size statistic you have to collect extra data at the time of the analysis. Beware, this can take up a tremendous amount of disk space; the more permutations the more disk space required. This example generates a 70MB suprathreshold statistics mat file. Answer 'yes' to collect these stats. Or click on 'no' to save some disk space.
The remaining questions are the standard SPM questions. You can choose global normalization (we choose 3, Ancova), choose 'global calculation' (we choose 2, mean voxel value), then choose 'Threshold masking' (we choose proportion), and keep 'Prop'nal threshod' as its default 0.8. Choose 'grand mean scaling' (we choose 1, scaling of overall grand mean) and keep 'scale overall grand mean' at its default 50. the gray matter threshold (we choose the default, 0.8) and the value for grand mean (again we choose the default, 50).
Now SnPM will run a short while while it builds a configuration file that will completely specify the analysis. When finished it will display a page (or pages) with file names and design information. When it is finished you are ready to run the SnPM engine.
On fast new machines, like Sun Sparc Ultras or a Hewlett Packard C180, the computation of permutation should only take about 5 minutes.
One of the reasons that SnPM is divided into 3 discrete operations (Configure, Compute, Results) is to allow the Compute operation to be run at a later time or in the background. To this end, the 'Compute' function does not need any the windows of SPM and can be run with out initializing SPM (though the MATLABPATH environment variable must be set). This maybe useful to remember if you have trouble with running out of memory
To maximize the memory available for the 'Compute' step, and to see how to run it in batch mode, follow these steps.
On a Sun UltraSPARC 167 MHz this took under 6 minutes.
Then, you will be asked questions such as 'Write filtered statistic img?' and 'Write FWE-corrected p-value img?'. You can choose 'no' for both of them. It will save time and won't change the final result.
Next it will ask for a corrected p value for filtering. The uncorrected and FWE-corrected p-values are exact, meaning if the null hypothesis is true exactly 5% of the time you'll find a P-value 0.05 or smaller (assuming 0.05 is a possible nonparameteric P-value, as the permutaiton distribution is discrete and all p-values are multiples of 1/nPerm). FDR p-values are valid based on an assumption of positive dependence between voxels; this seems to be a reasonable assumption for image data. Note that SPM's corrected p values derived from the Gaussian random field theory are only approximate.
Next, if you collected supra-threshold stats, it will ask if you want to assess spatial extent. For now, let's not assess spatial extent.
You will be given the opportunity to write out the statistic image and the p-value image. Examining the location of activation on an atlas image or coregistered anatomical data is one of the best way to understand your data.
Shortly the Results screen will first show the permutation distributions.

The figure is titled ``SnPM{Pseudo-t}'' to remind you that the variance has been smoothed and hence the intensity values listed don't follow a t distribution. The tabular listing indicates that there are 68 voxels significant at the 0.05 level; the maximum pseudo-t is 6.61 and it occurs at (38, -28, 48).

This information at the bottom of the page documents the parameters of the analysis. The ``bhPerms=1'' is noting that only half of the permutations were calculated; this is because this simple A-B paradigm gives you two labelings for every calculation. For example, the maximum pseudo-t of the
Now click again on
When SnPM saves supra-threshold stats, it saves all pseudo-t values above a given threshold for all permutations. The lower lower limit shown (it is 1.23 for the motor data) is this ``data collection'' threshold. The upper threshold is the pseudo-t value that corresponds to the corrected p-value threshold (4.98 for our data); there is no sense entering threshold above this value since any voxels above it are already significant by intensity alone.
Trying a couple different thresholds, we found 2.5 to be a good threshold. This, though, is a problem. The inference is strictly only valid when the threshold is specified a priori. If this were a parametric t image (i.e. we had not smoothed the variance) we could specify a univariate p-value which would translate to specify a t threshold; since we are using a pseudo t, we have no parametric distributional results with which to convert a p-value into a pseudo t. The only strictly valid approach is to determine a threshold from one dataset (by fishing with as many thresholds as desired) and then applying that threshold to a different dataset. We are working to come up with guidelines which will assist in the threshold selection process.
Now we see that we have identified one 562 voxel cluster as being significant at 0.005 (all significances must be multiples of 1 over the number of permutations, so this significance is really 1/216=0.0046). This means that when pseudo-t images from all the permutations were thresholded at 2.5, no permutation had a maximum cluster size greater than 562.