
[ENH] CompCor enhancement #2878


Merged · 42 commits · Apr 29, 2019

Conversation

@rciric (Contributor) commented Feb 9, 2019

Summary

Continuing #2859

As a step toward implementing nipreps/fmriprep#1458, this PR updates the CompCor algorithm to allow greater flexibility.

List of changes proposed in this PR

  • As before, num_components encodes the number of components to return if it takes an integer value greater than 1.
  • If num_components is all, then CompCor returns all component time series.
  • Alternatively, a new option, variance_threshold, allows automatic selection of components on the basis of their capacity to explain variance in the data. It is mutually exclusive with num_components and takes values between 0 and 1.
  • If variance_threshold is set, CompCor returns the minimum number of components necessary to explain the indicated variance in the data, computed as the cumulative sum of (singular_value^2)/sum(singular_values^2).
  • For backward compatibility, if neither num_components nor variance_threshold is defined, then CompCor executes as though num_components were set to 6 (the previous default behaviour).
  • CompCor optionally writes out a metadata table, which includes for each component the source mask, the singular value, the explained variance, and the cumulative explained variance.
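The selection rule described above can be sketched as follows (`components_to_retain` is a hypothetical helper for illustration, not the actual interface code):

```python
import numpy as np

def components_to_retain(singular_values, variance_threshold):
    """Return the minimum number of components whose cumulative
    explained variance meets variance_threshold (0 < threshold <= 1)."""
    s = np.asarray(singular_values, dtype=float)
    variance_explained = s ** 2 / np.sum(s ** 2)
    cumulative = np.cumsum(variance_explained)
    # Index of the first component at which the cumulative sum
    # reaches the threshold, converted to a count
    return int(np.searchsorted(cumulative, variance_threshold) + 1)

# Example: components explaining 50%, 30%, 15%, 5% of the variance
s = np.sqrt([0.50, 0.30, 0.15, 0.05])
print(components_to_retain(s, 0.9))  # 3 components reach 90%
```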

Acknowledgment

  • (Mandatory) I acknowledge that this contribution will be available under the Apache 2 license.

@codecov-io commented Feb 9, 2019

Codecov Report

Merging #2878 into master will increase coverage by 0.22%.
The diff coverage is 70.78%.


@@            Coverage Diff            @@
##           master   #2878      +/-   ##
=========================================
+ Coverage   67.57%   67.8%   +0.22%     
=========================================
  Files         343     344       +1     
  Lines       43645   44494     +849     
  Branches     5428    5557     +129     
=========================================
+ Hits        29494   30168     +674     
- Misses      13447   13606     +159     
- Partials      704     720      +16
Flag Coverage Δ
#smoketests 50.38% <12.35%> (-0.1%) ⬇️
#unittests 65.3% <70.78%> (+0.29%) ⬆️
Impacted Files Coverage Δ
nipype/workflows/rsfmri/fsl/resting.py 85.24% <ø> (-0.24%) ⬇️
nipype/info.py 93.93% <ø> (ø) ⬆️
nipype/algorithms/confounds.py 67.52% <70.78%> (+1.17%) ⬆️
nipype/interfaces/dipy/registration.py 93.33% <0%> (-6.67%) ⬇️
nipype/interfaces/dipy/base.py 76.28% <0%> (-0.3%) ⬇️
nipype/interfaces/dipy/stats.py 100% <0%> (ø)
nipype/interfaces/dipy/reconstruction.py 33.49% <0%> (+0.48%) ⬆️
nipype/interfaces/afni/preprocess.py 85.42% <0%> (+3.02%) ⬆️
nipype/interfaces/dipy/tracks.py 37.05% <0%> (+4.66%) ⬆️
nipype/interfaces/dipy/preprocess.py 29.11% <0%> (+5.58%) ⬆️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 44ed184...b80a3d7.

@oesteban (Contributor) left a comment

Worried about that i variable containing names. Other than that, this is shaping up!

@rciric (Contributor, Author) commented Mar 1, 2019

Interestingly, I'm now getting

ValueError: Inputs for CompCor, timeseries and mask, do not have matching spatial dimensions ((64, 64, 46) and (64, 64, 46, 1), respectively)

for one of my test masks. Probably an oddity on my end -- I didn't add the squeeze back in, since it seemed to be causing trouble earlier.

@rciric (Contributor, Author) commented Mar 1, 2019

Looks like the empty mask is now breaking cosine_filter -- I'll try and fix that next.

@effigies (Member):
@rciric Check out nibabel.squeeze_image.
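For illustration, a numpy-only sketch of the shape fix being suggested here (`squeeze_trailing` is a hypothetical helper; the real nibabel.squeeze_image operates on image objects and preserves the header):

```python
import numpy as np

def squeeze_trailing(arr):
    """Drop trailing length-1 axes, mimicking what nibabel.squeeze_image
    does at the image level."""
    shape = arr.shape
    while shape and shape[-1] == 1:
        shape = shape[:-1]
    return arr.reshape(shape)

# The (64, 64, 46, 1) mask from the error above becomes (64, 64, 46),
# matching the time series' spatial dimensions
mask = np.zeros((64, 64, 46, 1))
print(squeeze_trailing(mask).shape)  # (64, 64, 46)
```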

@mgxd (Member) commented Mar 27, 2019

hi @rciric - do you have a moment to finish this up sometime this week? We hope to release next week and would love to get this in.

@rciric (Contributor, Author) commented Mar 28, 2019

Happy to do what it takes to get this over the finish line. It looks like I'm getting errors related to diffusion processing now -- specifically an import failure from dipy.workflows.tracking for DetTrackPAMFlow. The only workflows in dipy.workflows.tracking are PFTrackingPAMFlow and LocalFiberTrackingPAMFlow. I guess it's now LocalFiberTrackingPAMFlow with sh_strategy="deterministic"?

@mgxd (Member) commented Mar 28, 2019

@rciric yeah that is unrelated to this PR - I just merged #2903 which should fix it so if you pull the latest master it should be good.

@oesteban (Contributor) commented Apr 4, 2019

@rciric what is the status of this PR? /cc @mgxd @effigies

@rciric (Contributor, Author) commented Apr 5, 2019

I tested it with my fmriprep install before the latest push, so I think we just need verification from reviewers that it works. Additional feedback is also always welcome.

@oesteban (Contributor) commented Apr 9, 2019

@rciric are there any public tests (i.e. CircleCI) where we can check how this is playing along with fMRIPrep?

@oesteban (Contributor) commented Apr 17, 2019

Okay, so it seems this PR is working out as expected (https://circleci.com/workflow-run/3c1ae1ca-27db-43e6-897e-54cc5d323dcb, the red light on ds005 is unrelated to this PR).

If @rciric promises that the correlation between a_comp_cor_xx regressors does not pertain to this PR and should be addressed in the niworkflows/fmriprep context, then we can go ahead with this. Could you double-check whether we get the same outputs if you force the CompCor interfaces in fmriprep to behave like the former implementation (i.e., leave both num_components and variance_threshold undefined)?

Do you want to have a final review @effigies, @mgxd ?

@satra, it would be great if you could take a quick look, we don't want to regret having this PR merged in the future.

@satra (Member) commented Apr 18, 2019

@rciric and @oesteban - this looks good to me!

@effigies (Member) left a comment

No serious complaints, but a few minor suggestions. Feel free to ignore if you just want to get this in.

@mgxd @satra Just a note on the #2913 situation here. Do you agree with my proposed traits constraint, or did you have something else in mind?


components_data = [line.split('\t') for line in components_file]
components_data = [re.sub('\n', '', line).split('\t')

This is just stripping the newline? If so, re is overly heavy machinery.

Suggested change
components_data = [re.sub('\n', '', line).split('\t')
components_data = [line.rstrip().split('\t')
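A quick illustrative check that the lighter form parses a TSV line identically (the sample line is made up):

```python
import re

line = 'comp_00\t0.5421\t0.1873\n'
# Both forms strip the trailing newline before splitting on tabs
assert re.sub('\n', '', line).split('\t') == line.rstrip().split('\t')
print(line.rstrip().split('\t'))  # ['comp_00', '0.5421', '0.1873']
```

One caveat: rstrip() with no argument also removes trailing tabs and spaces, so a line ending in empty fields would parse differently; line.rstrip('\n') is the exact equivalent in that case.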

assert os.path.getsize(expected_metadata_file) > 0

with open(ccresult.outputs.metadata_file, 'r') as metadata_file:
components_metadata = [re.sub('\n', '', line).split('\t')

Again, if I understand the purpose:

Suggested change
components_metadata = [re.sub('\n', '', line).split('\t')
components_metadata = [line.rstrip().split('\t')

nipype/info.py Outdated
@@ -141,7 +141,8 @@ def get_nipype_gitversion():
'numpy>=%s ; python_version >= "3.7"' % NUMPY_MIN_VERSION_37,
'python-dateutil>=%s' % DATEUTIL_MIN_VERSION,
'scipy>=%s' % SCIPY_MIN_VERSION,
'traits>=%s' % TRAITS_MIN_VERSION,
'traits>=%s,<%s ; python_version == "2.7"' % (TRAITS_MIN_VERSION, '5.0.0'),

This is #2913. The fix should be just to skip 5.0, and we can do that for all versions.

Suggested change
'traits>=%s,<%s ; python_version == "2.7"' % (TRAITS_MIN_VERSION, '5.0.0'),
'traits>=%s,!=5.0' % TRAITS_MIN_VERSION,

Member:

I'm fine with this, though we should probably only restrict 5.0 if the user is running py27 - to my knowledge everything was working fine with py3.

Member:

I don't have a strong opinion here (this seems simpler than splitting out by Python version, but yes there may be someone who really needs Py3 + traits 5.0), but I think maybe let's make that change after this is merged?

nipype/info.py Outdated
@@ -141,7 +141,8 @@ def get_nipype_gitversion():
'numpy>=%s ; python_version >= "3.7"' % NUMPY_MIN_VERSION_37,
'python-dateutil>=%s' % DATEUTIL_MIN_VERSION,
'scipy>=%s' % SCIPY_MIN_VERSION,
'traits>=%s' % TRAITS_MIN_VERSION,
'traits>=%s,<%s ; python_version == "2.7"' % (TRAITS_MIN_VERSION, '5.0.0'),
'traits>=%s ; python_version >= "3.0"' % TRAITS_MIN_VERSION,

Suggested change
'traits>=%s ; python_version >= "3.0"' % TRAITS_MIN_VERSION,

@@ -1,6 +1,7 @@
# emacs: -*- mode: python; py-indent-offset: 4; indent-tabs-mode: nil -*-
# vi: set ft=python sts=4 ts=4 sw=4 et:
import os
import re

The suggestions below were prompted by a suspicion this might be heavier than needed. If you accept those changes, also revert:

Suggested change
import re

raise ValueError('No components found')
components = np.ones((M.shape[0], num_components), dtype=np.float32) * np.nan
return components, basis
components = np.full((M.shape[0], num_components), np.nan)

Are you intentionally promoting this to a np.float64?
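For context, a small demonstration of the dtype behaviour in question (shapes are arbitrary):

```python
import numpy as np

a = np.ones((3, 2), dtype=np.float32) * np.nan  # stays float32
b = np.full((3, 2), np.nan)                     # fill value is a Python float -> float64
c = np.full((3, 2), np.nan, dtype=np.float32)   # explicit dtype keeps float32

print(a.dtype, b.dtype, c.dtype)  # float32 float64 float32
```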

np.hstack((metadata['cumulative_variance_explained'],
cumulative_variance_explained)))
metadata['retained'] = (
metadata['retained'] + [i < num_components for i in range(len(s))])

If these can get large and you do more than 2 or 3 loops, this will be significantly less efficient than aggregating these things into lists and concatenating/stacking them at the end. e.g.,

# Outside loop
md_mask = []
md_sv = []
md_var = []
md_cumvar = []
md_retained = []
for ...
    # here
    md_mask.append([name] * len(s))
    md_sv.append(s)
    md_var.append(variance_explained)
    md_cumvar.append(cumulative_variance_explained)
    # The lack of square brackets is intentional
    md_retained.append(i < num_components for i in range(len(s)))
# below
metadata = {
    'mask': list(itertools.chain(*md_mask)),
    'singular_value': np.hstack(md_sv),
    'variance_explained': np.hstack(md_var),
    'cumulative_variance_explained': np.hstack(md_cumvar),
    'retained': list(itertools.chain(*md_retained))}
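The aggregate-then-concatenate pattern can be exercised with toy data (mask names and values below are made up). One pitfall, possibly what broke the generator variant mentioned later in the thread: a generator expression reads free variables such as num_components when it is consumed, not when it is created, so generators appended in the loop and chained afterwards would all see the final loop iteration's values. Lists evaluate eagerly and avoid this:

```python
import itertools
import numpy as np

# Toy singular values and per-mask component counts
masks = {'csf': np.array([4.0, 2.0]), 'wm': np.array([3.0, 1.0])}
num_components_per_mask = {'csf': 1, 'wm': 2}

md_mask, md_sv, md_retained = [], [], []
for name, s in masks.items():
    num_components = num_components_per_mask[name]
    md_mask.append([name] * len(s))
    md_sv.append(s)
    # Use a list, not a generator: a generator would read num_components
    # lazily and see only the final loop value when chained below
    md_retained.append([i < num_components for i in range(len(s))])

metadata = {
    'mask': list(itertools.chain(*md_mask)),
    'singular_value': np.hstack(md_sv),
    'retained': list(itertools.chain(*md_retained)),
}
print(metadata['retained'])  # [True, False, True, True]
```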

Numpy array containing the requested set of noise components
basis: numpy array
Numpy array containing the (non-constant) filter regressors
metadata: OrderedDict{str: numpy array}

Why is this an OrderedDict as opposed to a dict?

@rciric (Contributor, Author) commented Apr 19, 2019

Hello, and thanks for the comments! I think they've cleaned up the code considerably -- I've updated the code to reflect those comments with the following exceptions:

  • I wasn't able to get md_retained working as a generator on my local test machine, so I ended up using a list.
  • I am using an OrderedDict because the call that prints the metadata to TSV currently requires its entries to be in a particular order.
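For context: on Python 3.7+, a plain dict also preserves insertion order, so OrderedDict mainly matters for the Python 2 support nipype still carried at the time. Either way, a column-ordered TSV write might look like this sketch (field names and values are made up):

```python
from collections import OrderedDict

metadata = OrderedDict([
    ('mask', ['csf', 'wm']),
    ('singular_value', [4.0, 3.0]),
    ('retained', [True, True]),
])

# Column order in the TSV follows the dict's insertion order
header = '\t'.join(metadata)
rows = ['\t'.join(str(v) for v in row) for row in zip(*metadata.values())]
tsv = '\n'.join([header] + rows)
print(tsv)
```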

@effigies (Member):
Oh, looks like tests are failing. Sorry if I've led you astray.

@rciric (Contributor, Author) commented Apr 19, 2019

No, this is totally me not updating my own unit tests.

@effigies merged commit d353f0d into nipy:master on Apr 29, 2019
@oesteban (Contributor):
yay! @rciric

@effigies mentioned this pull request on May 8, 2019