Skip to content

[ENH] CompCor enhancement #2859

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from
Closed

[ENH] CompCor enhancement #2859

wants to merge 0 commits into from

Conversation

rciric
Copy link
Contributor

@rciric rciric commented Jan 19, 2019

Summary

As a step toward implementing nipreps/fmriprep#1458, this PR updates the CompCor algorithm to allow greater flexibility.

Given that I would like to incorporate features enabled by these changes into fmriprep, does it make sense to propose this separately as a patch in niworkflows?

List of changes proposed in this PR (pull-request)

  • As before, num_components encodes the number of components to return if it takes an integer value greater than 1.
  • If num_components is all, then CompCor returns all component time series.
  • Alternatively, a mutually exclusive interface option, variance_threshold, allows for automatic selection of components on the basis of their capacity to explain variance in the data. variance_threshold is mutually exclusive with num_components and takes values between 0 and 1. If variance_threshold is set, CompCor returns the minimum number of components necessary to explain the indicated variance in the data, computed as the cumulative sum of (singular_value^2)/sum(singular_values^2).
  • For backward compatibility, if neither num_components nor variance_threshold is defined, then CompCor executes as though num_components were set to 6 (the previous default behaviour).
  • CompCor optionally writes out a metadata table, which includes for each component the source mask, the singular value, the explained variance, and the cumulative explained variance.

Acknowledgment

  • (Mandatory) I acknowledge that this contribution will be available under the Apache 2 license.

@codecov-io
Copy link

codecov-io commented Jan 20, 2019

Codecov Report

Merging #2859 into master will increase coverage by 0.01%.
The diff coverage is 75.86%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2859      +/-   ##
==========================================
+ Coverage   67.46%   67.48%   +0.01%     
==========================================
  Files         341      341              
  Lines       43392    43713     +321     
  Branches     5383     5508     +125     
==========================================
+ Hits        29276    29498     +222     
- Misses      13419    13499      +80     
- Partials      697      716      +19
Flag Coverage Δ
#smoketests 50.46% <15.51%> (-0.07%) ⬇️
#unittests 64.94% <75.86%> (+0.05%) ⬆️
Impacted Files Coverage Δ
nipype/algorithms/confounds.py 67.13% <75.86%> (+0.84%) ⬆️
nipype/algorithms/modelgen.py 66.01% <0%> (-1.43%) ⬇️
nipype/info.py 94.8% <0%> (+0.6%) ⬆️
nipype/interfaces/mrtrix3/tracking.py 90.9% <0%> (+1.43%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ebbd062...addb0e9. Read the comment docs.

@effigies
Copy link
Member

I think it's reasonable to put this here. I have a couple overall comments about the approach, though:

If num_components is valued between 0 and 1, then CompCor instead returns the minimum number of components necessary to explain the indicated variance in the data, computed as the cumulative sum of (singular_value^2)/sum(singular_values^2).

I would rather not overload this, as it's not an intuitive interpretation of "num_components". I would rather have a separate input like variance_threshold or some other descriptive name to enable this behavior. You can add xor=['num_components'] to the trait to ensure that people don't specify both.

If num_components is -1, then CompCor returns all component time series.

What about num_components = 'all'? You can make the trait:

num_components = traits.Either('all', traits.Int, ...)

@rciric
Copy link
Contributor Author

rciric commented Jan 20, 2019

Thank you for the review — I’ve now updated the PR. In the update, I automatically instantiate the selection criterion equivalently to num_components = 6 if neither num_components nor variance_threshold is defined (the current default behaviour, for backward compatibility) and raise a warning (perhaps this should be a FutureWarning or DeprecationWarning); I’m not sure what the best way of handling this is.

@@ -613,13 +663,19 @@ def _list_outputs(self):
save_pre_filter = os.path.abspath('pre_filter.tsv')
outputs['pre_filter_file'] = save_pre_filter

save_metadata = self.inputs.save_metadata
if save_metadata:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if save_metadata:
if save_metadata is True:

@@ -613,13 +663,19 @@ def _list_outputs(self):
save_pre_filter = os.path.abspath('pre_filter.tsv')
outputs['pre_filter_file'] = save_pre_filter

save_metadata = self.inputs.save_metadata
if save_metadata:
if isinstance(save_metadata, bool):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, looking at it now, I'm not sure that change would work because the conditional branch should be entered if save_metadata is either True or a string (the metadata file path) so as to set the output path.

        if save_metadata:
            if isinstance(save_metadata, bool):
                save_metadata = os.path.abspath('component_metadata.tsv')
            outputs['metadata_file'] = save_metadata

Of course, we could circumvent this entirely with a SimpleInterface as per your earlier suggestion. I'm not sure if there's a good reason for not using SimpleInterface or fname_presuffix here (perhaps this function predates their integration into nipype?), but I mirrored this block structure from the save_pre_filter block above it. I could definitely try and see if I can refactor into a SimpleInterface with fname_presuffix if you think it would be worthwhile.

@rciric
Copy link
Contributor Author

rciric commented Jan 28, 2019

There are a couple of potential changes here that I think could be worthwhile, both in general and downstream in fmriprep:

  • Save all component metadata regardless of the number of components that are retained.
  • Automatically remove zero-variance/zero-SV columns from the decomposition.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants