npcdf - Non-parametric cumulative distribution function.

Given a list of samples drawn from an unknown (non-parametric) distribution,
the corresponding cumulative distribution function is evaluated at a set of
given points.

USAGE

    p = npcdf(X,x)

    X              list of samples drawn from the unknown distribution
    x              set of points where the cdf should be evaluated
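As a cross-check on the description above, the same quantity can be sketched by brute force: for each point, count the fraction of samples lying below it. This is only an illustrative sketch, not the function's own method; the data values are hypothetical. The comparison is strict (`<`) because the function's sort order places a point ahead of samples of equal value; the conventional empirical cdf would use `<=` (`@le`) instead.

```matlab
% Hypothetical data, chosen only for illustration
X = [1 2 3 4];        % samples
x = [2.5 0 5 2];      % points where the cdf is evaluated

% Drop NaNs and force column vectors, as npcdf does
X = X(~isnan(X)); X = X(:);
x = x(~isnan(x)); x = x(:);

% One row per sample, one column per point:
% true where the sample lies strictly below the point
below = bsxfun(@lt, X, x');

% Fraction of samples below each point (row vector, one entry per point)
p = sum(below,1)/length(X);
% p is [0.5 0 1 0.25]
```

The brute-force comparison builds a matrix with one entry per sample-point pair, so it is only practical for small inputs; sorting samples and points together avoids that cost.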
function p = npcdf(X,x)

%npcdf - Non-parametric cumulative distribution function.
%
% Given a list of samples drawn from an unknown (non-parametric) distribution,
% the corresponding cumulative distribution function is evaluated at a set of
% given points.
%
%  USAGE
%
%    p = npcdf(X,x)
%
%    X              list of samples drawn from the unknown distribution
%    x              set of points where the cdf should be evaluated
%

% Copyright (C) 2010-2011 by Michaël Zugaro
%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 3 of the License, or
% (at your option) any later version.

if nargin < 2,
  error('Incorrect number of parameters (type ''help <a href="matlab:help npcdf">npcdf</a>'' for details).');
end

% Get rid of NaNs, reshape inputs as vectors
X(isnan(X)) = [];
X = X(:);
nX = length(X);
x(isnan(x)) = [];
x = x(:);
nx = length(x);

% Construct a matrix where the first column is a concatenation of the samples and points,
% the second is 1 for samples and 0 for points, and the third is the order of the points
% (this will be necessary to keep track of the points after the matrix is reordered)
Y = [X ones(nX,1) zeros(nX,1);x zeros(nx,1) (1:nx)'];
% Sort samples and points in ascending order, so that the position of the points in the matrix
% (row numbers) corresponds to their cdf values
Y = sortrows(Y);

% Actually, row numbers should only take samples into account; this is precisely what the
% second column of Y is for (it has 1 for samples and 0 for points)
F = cumsum(Y(:,2))/nX;
% Now that we have the value of the cdf at each sample and each point, we need to extract
% the value at the points; again, we use the second column of Y (it has 0 for points)
tested = Y(:,2) == 0;
% Because Y was reordered (sortrows), we need to reorder the values of F back to the order
% in which the points were listed in x (p is preallocated as a row vector, one entry per point)
p = zeros(1,nx);
p(Y(tested,3)) = F(tested);
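To make the sort-based steps concrete, here is a short trace on a tiny input, with the intermediate values written out in comments. The sample and point values are hypothetical; the statements mirror the listing, with `p` preallocated as a row vector.

```matlab
X = [3;1;2];   % hypothetical samples
x = [2.5;0];   % hypothetical points
nX = length(X);
nx = length(x);

% Columns: value, 1 for samples / 0 for points, original index of the point
Y = [X ones(nX,1) zeros(nX,1);x zeros(nx,1) (1:nx)'];
% Y = [3 1 0; 1 1 0; 2 1 0; 2.5 0 1; 0 0 2]

% Sort by value (ties are broken by the second, then the third column)
Y = sortrows(Y);
% Y = [0 0 2; 1 1 0; 2 1 0; 2.5 0 1; 3 1 0]

% Running count of samples seen so far, as a fraction of all samples
F = cumsum(Y(:,2))/nX;
% F = [0; 1/3; 2/3; 2/3; 1]

% Keep only the rows that hold points, restored to their order in x
tested = Y(:,2) == 0;
p = zeros(1,nx);
p(Y(tested,3)) = F(tested);
% p = [2/3 0]
```

Because `sortrows` breaks ties on the second column, a point that equals a sample is placed before that sample, so the value returned at such a point counts only the samples strictly below it.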