Home > FMAToolbox > Analyses > CompareDistributions.m

CompareDistributions

PURPOSE ^

CompareDistributions - Compare two N-dimensional distributions (e.g. prospective place fields).

SYNOPSIS ^

function [h,stats] = CompareDistributions(group1,group2,varargin)

DESCRIPTION ^

CompareDistributions - Compare two N-dimensional distributions (e.g. prospective place fields).

  This non-parametric test determines in which dimensions two distributions are
  significantly different. For example, the two distributions could be
  leftward vs rightward trajectories in a T maze (each 'dimension' of these
  distributions corresponds to a spatial bin), and the test would determine
  where the trajectories are significantly different (diverge).

  This test is based on the bootstrap method developed in Fujisawa et al.
  (2008). This compares the observed difference between the two distribution
  averages vs the averages and confidence intervals for surrogate data (where
  individual observations are shuffled across groups). Both pointwise and
  global confidence intervals are computed.

  USAGE

    [h,stats] = CompareDistributions(group1,group2,<options>)

    group1,group2  values for the two groups: lines are observations, columns
                   are dimensions (e.g. spatial bins)
    <options>      optional list of property-value pairs (see table below)

    =========================================================================
     Properties    Values
    -------------------------------------------------------------------------
     'nShuffles'   number of shuffles to estimate the distribution
                   (default = 5000)
     'alpha'       confidence level (default = 0.05)
     'max'         maximum number of iterations for global confidence
                   intervals (default = 6)
     'tolerance'   maximum difference (in %) between target and computed
                   global alpha (default = 0.8)
     'tail'        one or two-tailed ditributions (default = two)
     'show'        show figure (default 'off')
     'verbose'     show information about ongoing processing (default 'off')
    =========================================================================

  OUTPUT

    h                  h = 1 if the null hypothesis can be rejected
    stats.observed     observed difference between the means of the two groups
    stats.null         expected difference between the means of the two groups
                       under the null hypothesis
    stats.pointwise    pointwise confidence intervals at alpha level
    stats.global       global confidence intervals at alpha level
    stats.above        logical vector indicating where the null hypothesis can
                       be rejected because the observed difference exceeds the
                       upper confidence limits
    stats.below        logical vector indicating where the null hypothesis can
                       be rejected because the observed difference lies below
                       the lower confidence limits
    stats.alpha        successive pointwise alpha values used to target
                       the required global alpha level (one for each iteration)
    stats.p            successive resulting global alpha values (one for each iteration)

CROSS-REFERENCE INFORMATION ^

This function calls: This function is called by:

SOURCE CODE ^

0001 function [h,stats] = CompareDistributions(group1,group2,varargin)
0002 
0003 %CompareDistributions - Compare two N-dimensional distributions (e.g. prospective place fields).
0004 %
0005 %  This non-parametric test determines in which dimensions two distributions are
0006 %  significantly different. For example, the two distributions could be
0007 %  leftward vs rightward trajectories in a T maze (each 'dimension' of these
0008 %  distributions corresponds to a spatial bin), and the test would determine
0009 %  where the trajectories are significantly different (diverge).
0010 %
0011 %  This test is based on the bootstrap method developed in Fujisawa et al.
0012 %  (2008). This compares the observed difference between the two distribution
0013 %  averages vs the averages and confidence intervals for surrogate data (where
0014 %  individual observations are shuffled across groups). Both pointwise and
0015 %  global confidence intervals are computed.
0016 %
0017 %  USAGE
0018 %
0019 %    [h,stats] = CompareDistributions(group1,group2,<options>)
0020 %
0021 %    group1,group2  values for the two groups: lines are observations, columns
0022 %                   are dimensions (e.g. spatial bins)
0023 %    <options>      optional list of property-value pairs (see table below)
0024 %
0025 %    =========================================================================
0026 %     Properties    Values
0027 %    -------------------------------------------------------------------------
0028 %     'nShuffles'   number of shuffles to estimate the distribution
0029 %                   (default = 5000)
0030 %     'alpha'       confidence level (default = 0.05)
0031 %     'max'         maximum number of iterations for global confidence
0032 %                   intervals (default = 6)
0033 %     'tolerance'   maximum difference (in %) between target and computed
0034 %                   global alpha (default = 0.8)
0035 %     'tail'        one or two-tailed ditributions (default = two)
0036 %     'show'        show figure (default 'off')
0037 %     'verbose'     show information about ongoing processing (default 'off')
0038 %    =========================================================================
0039 %
0040 %  OUTPUT
0041 %
0042 %    h                  h = 1 if the null hypothesis can be rejected
0043 %    stats.observed     observed difference between the means of the two groups
0044 %    stats.null         expected difference between the means of the two groups
0045 %                       under the null hypothesis
0046 %    stats.pointwise    pointwise confidence intervals at alpha level
0047 %    stats.global       global confidence intervals at alpha level
0048 %    stats.above        logical vector indicating where the null hypothesis can
0049 %                       be rejected because the observed difference exceeds the
0050 %                       upper confidence limits
0051 %    stats.below        logical vector indicating where the null hypothesis can
0052 %                       be rejected because the observed difference lies below
0053 %                       the lower confidence limits
0054 %    stats.alpha        successive pointwise alpha values used to target
0055 %                       the required global alpha level (one for each iteration)
0056 %    stats.p            successive resulting global alpha values (one for each iteration)
0057 %
0058 
0059 % Copyright (C) 2010-2011 by Erika Cerasti, 2013-2018 by Michaƫl Zugaro
0060 %
0061 % This program is free software; you can redistribute it and/or modify
0062 % it under the terms of the GNU General Public License as published by
0063 % the Free Software Foundation; either version 3 of the License, or
0064 % (at your option) any later version.
0065 
0066 % Default values
0067 nShuffles = 5000;
0068 alpha = 0.05;
0069 tail = 'two';
0070 show = false;
0071 maxIterations = 6;
0072 tolerance = 0.8;
0073 verbose = false;
0074 
0075 % Check number of parameters
0076 if nargin < 2 | mod(length(varargin),2) ~= 0,
0077     error('Incorrect number of parameters (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0078 end
0079 
0080 if isempty(group1) || isempty(group2), return; end
0081 
0082 % Check parameter sizes
0083 if size(group1,2) ~= size(group2,2),
0084     error('The two groups do not have the same number of columns (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0085 end
0086 
0087 isNDimensional = size(group1,2) > 1; % N dimensions (N>1) => compute global confidence intervals
0088 
0089 % Parse parameter list
0090 for i = 1:2:length(varargin),
0091     if ~ischar(varargin{i}),
0092         error(['Parameter ' num2str(i+1) ' is not a property (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).']);
0093     end
0094     switch(lower(varargin{i})),
0095         case 'nshuffles',
0096             nShuffles = lower(varargin{i+1});
0097             if ~isdscalar(nShuffles,'>0'),
0098                 error('Incorrect value for property ''nShuffles'' (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0099             end
0100         case 'alpha',
0101             alpha = varargin{i+1};
0102             if ~isdscalar(alpha,'>0','<1'),
0103                 error('Incorrect value for property ''alpha'' (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0104             end
0105         case 'max',
0106             maxIterations = varargin{i+1};
0107             if ~isdscalar(maxIterations,'>1'),
0108                 error('Incorrect value for property ''max'' (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0109             end
0110         case 'tolerance',
0111             tolerance = varargin{i+1};
0112             if ~isdscalar(tolerance,'>0'),
0113                 error('Incorrect value for property ''tolerance'' (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0114             end
0115         case 'tail',
0116             tail = varargin{i+1};
0117             if ~isastring(tail,'one','two'),
0118                 error('Incorrect value for property ''tail'' (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0119             end
0120         case 'show',
0121             show = varargin{i+1};
0122             if ~isastring(show,'on','off'),
0123                 error('Incorrect value for property ''show'' (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0124             end
0125             show = strcmp(show,'on');
0126         case 'verbose',
0127             verbose = varargin{i+1};
0128             if ~isastring(verbose,'on','off'),
0129                 error('Incorrect value for property ''verbose'' (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).');
0130             end
0131             verbose = strcmp(verbose,'on');
0132         otherwise,
0133             error(['Unknown property ''' num2str(varargin{i}) ''' (type ''help <a href="matlab:help CompareDistributions">CompareDistributions</a>'' for details).']);
0134     end
0135 end
0136 
0137 if verbose,
0138     disp('Shuffling the data...');
0139 end
0140 
0141 n1 = size(group1,1);
0142 n2 = size(group2,1);
0143 
0144 differences = nan(nShuffles,size(group1,2));
0145 
0146 g = [group1;group2];
0147 for i = 1:nShuffles,
0148     
0149    % Shuffle the data
0150     shuffled = randperm(n1+n2);
0151     shuffled1 = g(shuffled(1:n1),:);
0152     shuffled2 = g(shuffled(n1+1:end),:);
0153 
0154     % Compute the mean difference
0155     m1 = mean(shuffled1);
0156     m2 = mean(shuffled2);
0157     differences(i,:) = m1 - m2;
0158 
0159     if verbose,
0160         step = i - floor(i/1000)*1000;
0161         if step == 0,
0162             disp([' Iteration # ' int2str(i) '/' int2str(nShuffles)]);
0163         end
0164     end
0165 
0166 end
0167 
0168 if verbose,
0169     disp('Computing confidence intervals...');
0170 end
0171 
0172 nBins = size(group1,2);
0173 deviation = 1;
0174 iteration = 0;
0175 while deviation*100 > tolerance,
0176 
0177     % Check number of iterations
0178     iteration = iteration + 1;
0179     if iteration > maxIterations,
0180         warning(['Reached maximum number of iterations (' int2str(maxIterations) ')']);
0181         break;
0182     end
0183 
0184     % Update quantiles based on the value of alpha computed during the previous iteration
0185     if strcmp(tail,'one'),
0186         quantiles = 1-alpha;
0187     else
0188         quantiles = [alpha/2 1-(alpha/2)];
0189     end
0190 
0191     % Pointwise confidence intervals at alpha level, using the value of alpha computed during the previous iteration
0192     confidenceIntervals = quantile(differences,quantiles);
0193     confidenceIntervals = confidenceIntervals([2 1],:);
0194 
0195     if ~isNDimensional,
0196         pointwise = confidenceIntervals;
0197         break;
0198     end
0199     
0200     % Global confidence intervals, using the pointwise intervals
0201     significant = differences > repmat(confidenceIntervals(1,:),nShuffles,1) | differences < repmat(confidenceIntervals(2,:),nShuffles,1);
0202     n = sum(any(significant,2));
0203     p = n/nShuffles;
0204     pGlobal(iteration) = p;
0205     alphaGlobal(iteration) = alpha;
0206 
0207     % Update alpha for next iteration
0208     deviation = abs(p-alpha);
0209     if iteration == 1,
0210         pointwise = confidenceIntervals;
0211         alpha = 0.003;
0212     else
0213         s = sign(p-alpha);
0214         if deviation > 0.05,
0215             alpha = alpha - s*deviation*0.00075;
0216         else
0217             alpha = alpha - s*deviation*0.033;
0218         end
0219         if alpha < 0.001,
0220             warning('alpha is becoming too small, global confidence bands will not be accurate');
0221             break;
0222         end
0223     end
0224 
0225 end
0226 
0227 % Statistics
0228 stats.observed = nanmean(group1) - nanmean(group2);
0229 stats.null = nanmean(differences,1);
0230 stats.pointwise = pointwise;
0231 stats.global = [];
0232 stats.alpha = [];
0233 stats.p = [];
0234 if(iteration > 1)
0235     stats.global = confidenceIntervals;
0236     stats.alpha = alphaGlobal;
0237     stats.p = pGlobal;
0238 end
0239 
0240 % Determine significant segments
0241 
0242 if ~isNDimensional,
0243 
0244     h = stats.observed > stats.pointwise(1) | stats.observed < stats.pointwise(2);
0245     stats.above = [];
0246     stats.below = [];
0247 
0248 else
0249 
0250     % Where does the observed difference exceed the pointwise upper confidence limit? (etc.)
0251     abovePointwise = stats.observed > stats.pointwise(1,:);
0252     belowPointwise = stats.observed < stats.pointwise(2,:);
0253     aboveGlobal = stats.observed > stats.global(1,:);
0254     belowGlobal = stats.observed < stats.global(2,:);
0255     above = abovePointwise & aboveGlobal;
0256     below = belowPointwise & belowGlobal;
0257 
0258     % Segments above the upper limit: a segment is considered significant if exceeds the global upper limit,
0259     % but its extent is determined using the pointwise upper limit
0260 
0261     if any(above),
0262         % Points where global upper limit is exceeded (when several points are adjacent, keep only the first one)
0263         start = find(diff([0 above])==1);
0264         for i = start,
0265             % Examine points to the left: do they exceed the pointwise upper limit?
0266             j = 1;
0267             while i-j ~= 0 && abovePointwise(i-j),
0268                 above(i-j) = 1;
0269                 j = j+1;
0270             end
0271             % Examine points to the right: do they exceed the pointwise upper limit?
0272             j = 1;
0273             while i+j <= length(abovePointwise) && abovePointwise(i+j),
0274                 above(i+j) = 1;
0275                 j = j+1;
0276             end
0277         end
0278     end
0279     stats.above = above;
0280 
0281     % Segments below the lower limit
0282     if any(below),
0283         start = find(diff([0 below])==1);
0284         for t=1: length(start)
0285             i=start(t);
0286             j=1;
0287             while(i-j)~=0 && (belowPointwise(i-j)==1)
0288                     below(i-j)=1;
0289                     j=j+1;
0290             end
0291             j=1;
0292             while(i+j)<=length(belowPointwise) && (belowPointwise(i+j)==1)
0293                     below(i+j)=1;
0294                     j=j+1;
0295             end
0296         end
0297     end
0298     stats.below = below;
0299 
0300     % Reject H0?
0301     h = any(above) | any(below);
0302 
0303 end
0304 
0305 if isNDimensional && show,
0306 
0307     orange = [1 0.4 0];
0308     lightGreen = [0.6 0.9 0.2];
0309     darkGreen = [0.15 0.35 0.15];
0310     
0311     x = 1:length(stats.observed);
0312     figure; hold on;
0313     plot(x,stats.global,'o-','MarkerSize',4,'Color',darkGreen,'MarkerFaceColor',darkGreen,'MarkerEdgeColor',darkGreen);
0314     plot(x,stats.pointwise,'o-','MarkerSize',4,'Color',lightGreen,'MarkerFaceColor',lightGreen,'MarkerEdgeColor',lightGreen);
0315     plot(x,stats.observed,'Color',orange,'LineWidth',2);
0316     PlotHVLines(0,'h','--','Color','k');
0317     ylabel('Mean Differences');
0318     xlabel('Bins');
0319     
0320     PlotIntervals(ToIntervals(stats.above|stats.below),'rectangles');
0321 
0322 end

Generated on Fri 16-Mar-2018 13:00:20 by m2html © 2005