Description of DBMatchValues

DBMatchValues - Match values for two different sets of variables.

  It is often necessary to match (associate) variables that were computed
  independently, e.g. the phase precession slope and field size of hippocampal
  place cells. Typically, such data are computed over repeated experiments
  using batch processing tools (see StartBatch) and stored in databases for
  subsequent group analyses. Once retrieved using DBGetValues, they can be
  matched using this function.

  USAGE

    [in1,in2,out1,out2,m1,m2] = DBMatchValues(eid1,eid2,rexp)

    eid1,eid2      experiment IDs (see DBGetValues)
    rexp           optional regular expression (see Example 2)

  OUTPUT

    in1            indices of elements in eid1 that are also in eid2
    in2            indices of elements in eid2 that are also in eid1
    out1           indices of elements in eid1 that are not in eid2
    out2           indices of elements in eid2 that are not in eid1
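    m1             patterns extracted from eid1 using rexp ('' if no match);
                   eid1 itself if rexp is omitted
    m2             patterns extracted from eid2 using rexp ('' if no match);
                   eid2 itself if rexp is omitted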
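
    As a rough picture of what the 'in' and 'out' indices mean, the sketch
    below uses only the built-in ismember on made-up eids. It assumes all
    eids are unique and is not the function's actual code:

```matlab
% Hypothetical eids, for illustration only
eid1 = {'20120213-Maze-(1,2)','20120213-Maze-(1,3)','20120214-Maze-(2,1)'};
eid2 = {'20120213-Maze-(1,2)','20120214-Maze-(2,1)','20120215-Maze-(4,1)'};

% Elements of eid1 found (or not found) in eid2, and vice versa
in1  = find(ismember(eid1,eid2));    % elements of eid1 also in eid2
out1 = find(~ismember(eid1,eid2));   % elements of eid1 missing from eid2
in2  = find(ismember(eid2,eid1));    % elements of eid2 also in eid1
out2 = find(~ismember(eid2,eid1));   % elements of eid2 missing from eid1
```

    Here eid1(in1) and eid2(in2) list the same two cells, while out1 and out2
    point to the unpaired entries.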

  EXAMPLE 1

    In this example, place cells were recorded on successive days while the
    animal explored a maze. Phase precession slopes and firing field sizes
    were stored in a database using eids like '20120213-Maze-(1,2)' (where 1,2
    corresponds to tetrode 1, cluster 2) and named 'PPSlope' and 'FieldSize',
    respectively.

    Get slopes and sizes:

    [slopes,eid1] = DBGetValues('eid like "%Maze%" and name="PPSlope"');
    [sizes,eid2] = DBGetValues('eid like "%Maze%" and name="FieldSize"');

    Discard values that cannot be correlated (incomplete pairs), and reorder
    so that each line in both variables corresponds to the same data:

    [in1,in2] = DBMatchValues(eid1,eid2);
    slopes = slopes(in1);
    sizes = sizes(in2);
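
    To make 'discard incomplete pairs and reorder' concrete without a
    database, here is a minimal sketch on made-up values. It uses the built-in
    intersect as a stand-in (an assumption for illustration, not what
    DBMatchValues does internally) and assumes unique eids:

```matlab
% Hypothetical data standing in for what DBGetValues would return
eid1   = {'20120213-Maze-(1,2)','20120213-Maze-(1,3)','20120214-Maze-(2,1)'};
slopes = [-0.5;-0.2;-0.8];
eid2   = {'20120214-Maze-(2,1)','20120213-Maze-(1,2)','20120215-Maze-(4,1)'};
sizes  = [30;45;25];

% i1 and i2 index the common eids in the same (sorted) order,
% so that eid1(i1) and eid2(i2) are pairwise identical
[common,i1,i2] = intersect(eid1,eid2);

% Keep only complete pairs, in matching order
slopes = slopes(i1);
sizes  = sizes(i2);
```

    After this step, each row of slopes and sizes refers to the same cell, so
    the two vectors can be correlated directly.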

  EXAMPLE 2

    In this example, cells were recorded during both wake and sleep, and their
    average firing rates should be compared across behavioral conditions.
    Data were stored in a database using eids like '20120213-Maze-(1,2)' or
    '20130214-Sleep-(1,2)' and named 'MeanRate'. Matching them is trickier
    because their eids are not pairwise identical. Here we need to extract
    the relevant portions of the eids, i.e. discard 'Sleep' or 'Maze'.

    Get mean rates:

    [maze,eid1] = DBGetValues('eid like "%Maze%" and name="MeanRate"');
    [sleep,eid2] = DBGetValues('eid like "%Sleep%" and name="MeanRate"');

    Discard values that cannot be correlated (incomplete pairs), and reorder
    so that each line in both variables corresponds to the same data:

    [in1,in2] = DBMatchValues(eid1,eid2,'([0-9]{8}).*([0-9]*,[0-9]*)');
    maze = maze(in1);
    sleep = sleep(in2);

    The regular expression includes two tokens (indicated by the parentheses),
    namely [0-9]{8} (eight successive occurrences of a digit between 0 and 9)
    and [0-9]*,[0-9]* (any number of digits, a comma, any number of digits).
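
    The extraction can be checked on a single eid with the built-in regexp
    before running the match. The sketch below concatenates the tokens, as
    the helper in the source code does. Note that the leading .* is greedy,
    so with this pattern the digits before the comma end up in .* rather than
    in the second token. The stricter pattern shown afterwards is only a
    suggestion (not part of the toolbox), and the two eids assume the same
    recording day:

```matlab
% Example eids in the style used above (same recording day assumed)
maze  = '20120213-Maze-(1,2)';
sleep = '20120213-Sleep-(1,2)';

% Pattern from Example 2: the greedy .* also consumes the tetrode number,
% leaving only the comma and the cluster number in the second token
rexp = '([0-9]{8}).*([0-9]*,[0-9]*)';
t    = regexp(maze,rexp,'tokens','once');
key  = [t{:}];                          % '20120213,2'

% Possible stricter variant (hypothetical): anchor the second token on the
% parentheses and require at least one digit on each side of the comma
strict = '([0-9]{8}).*\(([0-9]+,[0-9]+)\)';
t1   = regexp(maze,strict,'tokens','once');
t2   = regexp(sleep,strict,'tokens','once');
key1 = [t1{:}];                         % '201202131,2'
key2 = [t2{:}];                         % same as key1, so the two eids match
```

    With the stricter variant, cells recorded on different tetrodes but with
    the same cluster number keep distinct keys.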

  SEE

    See also DBGetValues, DBGetVariables, DBAddVariable.

function [in1,in2,out1,out2,eid1,eid2] = DBMatchValues(eid1,eid2,rexp)

%DBMatchValues - Match values for two different sets of variables.
%
%  It is often necessary to match (associate) variables that were computed
%  independently, e.g. the phase precession slope and field size of hippocampal
%  place cells. Typically, such data are computed over repeated experiments
%  using batch processing tools (see <a href="StartBatch">StartBatch</a>) and stored in databases for
%  subsequent group analyses. Once retrieved using <a href="DBGetValues">DBGetValues</a>, they can be
%  matched using this function.
%
%  USAGE
%
%    [in1,in2,out1,out2,m1,m2] = DBMatchValues(eid1,eid2,rexp)
%
%    eid1,eid2      experiment IDs (see <a href="DBGetValues">DBGetValues</a>)
%    rexp           optional regular expression (see Example 2)
%
%  OUTPUT
%
%    in1            indices of elements in eid1 that are also in eid2
%    in2            indices of elements in eid2 that are also in eid1
%    out1           indices of elements in eid1 that are not in eid2
%    out2           indices of elements in eid2 that are not in eid1
%    m1             matches (extracted patterns) in eid1
%    m2             matches (extracted patterns) in eid2
%
%  EXAMPLE 1
%
%    In this example, place cells were recorded on successive days while the
%    animal explored a maze. Phase precession slopes and firing field sizes
%    were stored in a database using eids like '20120213-Maze-(1,2)' (where 1,2
%    corresponds to tetrode 1, cluster 2) and named 'PPSlope' and 'FieldSize',
%    respectively.
%
%    Get slopes and sizes:
%
%    [slopes,eid1] = DBGetValues('eid like "%Maze%" and name="PPSlope"');
%    [sizes,eid2] = DBGetValues('eid like "%Maze%" and name="FieldSize"');
%
%    Discard values that cannot be correlated (incomplete pairs), and reorder
%    so that each line in both variables corresponds to the same data:
%
%    [in1,in2] = DBMatchValues(eid1,eid2);
%    slopes = slopes(in1);
%    sizes = sizes(in2);
%
%  EXAMPLE 2
%
%    In this example, cells were recorded during both wake and sleep, and their
%    average firing rates should be compared across behavioral conditions.
%    Data were stored in a database using eids like '20120213-Maze-(1,2)' or
%    '20130214-Sleep-(1,2)' and named 'MeanRate'. Matching them is trickier
%    because their eids are not pairwise identical. Here we need to extract
%    the relevant portions of the eids, i.e. discard 'Sleep' or 'Maze'.
%
%    Get mean rates:
%
%    [maze,eid1] = DBGetValues('eid like "%Maze%" and name="MeanRate"');
%    [sleep,eid2] = DBGetValues('eid like "%Sleep%" and name="MeanRate"');
%
%    Discard values that cannot be correlated (incomplete pairs), and reorder
%    so that each line in both variables corresponds to the same data:
%
%    [in1,in2] = DBMatchValues(eid1,eid2,'([0-9]{8}).*([0-9]*,[0-9]*)');
%    maze = maze(in1);
%    sleep = sleep(in2);
%
%    The regular expression includes two tokens (indicated by the parentheses),
%    namely [0-9]{8} (eight successive occurrences of a digit between 0 and 9)
%    and [0-9]*,[0-9]* (any number of digits, a comma, any number of digits).
%
%  SEE
%
%    See also DBGetValues, DBGetVariables, DBAddVariable.
%


% Copyright (C) 2013 by Michaël Zugaro
%
% This program is free software; you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation; either version 3 of the License, or
% (at your option) any later version.

% Transform eids if necessary
if nargin >= 3
    eid1 = DoRegexp(eid1,rexp);
    eid2 = DoRegexp(eid2,rexp);
else
    disp([ 'Example eid: ' eid1{1}]);
    disp([ 'Example eid: ' eid2{1}]);
end

% Make sure at least one of the lists contains unique values (keys)
key1 = length(eid1) == length(unique(eid1));
key2 = length(eid2) == length(unique(eid2));
if ~key1 && ~key2
    error('Neither list contains unique values.');
end

% Check which eids in list 1 are in list 2, making sure empty eids always fail the test
[~,in2] = ismember(eid1,eid2);
empty = cellfun('isempty',eid1);
in2(empty) = 0;
% The code above returns locations (items found) intermixed with zeros (items not found)
% (for details, see help for 'ismember'). Split this information into separate variables.
out1 = find(in2==0);
in2 = in2(in2~=0);
% If values in list 2 are not unique, duplicate entries in list 1 where necessary
if ~key2
    in2 = find(ismember(eid2,eid2(in2)));
end

% Check which eids in list 2 are in list 1, making sure empty eids always fail the test
[~,in1] = ismember(eid2,eid1);
empty = cellfun('isempty',eid2);
in1(empty) = 0;
% The code above returns locations (items found) intermixed with zeros (items not found)
% (for details, see help for 'ismember'). Split this information into separate variables.
out2 = find(in1==0);
in1 = in1(in1~=0);
% If values in list 1 are not unique, duplicate entries in list 2 where necessary
if ~key1
    in1 = find(ismember(eid1,eid1(in1)));
end

% Helper function: transform all eids, i.e. extract tokens using the regular expression then concatenate
% the tokens (eids that do not match are set to the empty string '')

function r = DoRegexp(s,rexp)

% Extract tokens from each eid (empty cell if the eid does not match)
r = cellfun(@(x) regexp(x,rexp,'tokens'),s,'uniformoutput',false);
empty = cellfun('isempty',r);
r(empty) = {''};
% Concatenate the tokens of the first match into a single string
r(~empty) = cellfun(@(x) horzcat(x{1}{:}),r(~empty),'uniformoutput',false);

disp([ 'Example pattern extraction: ' s{1} ' -> ' r{1}]);
