
When node changed and workflow rerun, child nodes of changed node failed to rerun #2951


Open
mick-d opened this issue Jun 28, 2019 · 7 comments

Comments

@mick-d
Contributor

mick-d commented Jun 28, 2019

Summary

Usually when I am debugging a workflow, I change one of the nodes, and when rerunning the workflow all child nodes of the changed node are rerun, while parent and independent nodes just reuse their previously cached results. However, this does not happen in one of my workflows, and I was wondering what the criteria are for this behavior to kick in.

Actual behavior

1st run: Node 1 (results cached) --> Node 2 (results cached) --> Node 3 (results cached)

2nd run with Node 2 modified: Node 1 (use previous cached results) --> Node 2 (creating new results to be cached) --> Node 3 (keep previous cached results although input changed)

Expected behavior

1st run: Node 1 (results cached) --> Node 2 (results cached) --> Node 3 (results cached)

2nd run with Node 2 modified: Node 1 (use previous cached results) --> Node 2 (creating new results to be cached) --> Node 3 (creating new results to be cached)

How to replicate the behavior

I can put more details here, but first it'd be great to know whether this expectation is correct and what the requirements are for it to work. I believe the issue may come from the node that fails to rerun, fsl.CopyGeom, which creates a local copy of the file onto which it copies the header information. It'd be great to have more information on how the "Node rerun" decision is made.

Script/Workflow details

    rerunissue.connect([
        (Node1, CopyGeom, [("out_file", "in_file")]),
        (Node2, CopyGeom, [("out_file", "dest_file")]),
        (CopyGeom, Node4, [("out_file", "in_file")]),
    ])

When Node2 is modified and the workflow is rerun, CopyGeom is not rerun (and subsequent nodes such as Node4 are not rerun either).
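A minimal sketch (assuming the workflow object is the rerunissue one above) of enabling debug logging so nipype prints more detail about why it reuses or reruns each node:

# Sketch only: turn on debug-level logging before running the workflow so the
# execution log shows more detail behind each "rerun or reuse" decision.
from nipype import config, logging

config.enable_debug_mode()      # switch nipype to verbose debug settings
logging.update_logging(config)  # apply the updated settings to the loggers

rerunissue.run()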

Platform details:

{'commit_hash': '%h',
 'commit_source': 'archive substitution',
 'networkx_version': '2.3',
 'nibabel_version': '2.4.1',
 'nipype_version': '1.2.0',
 'numpy_version': '1.16.4',
 'pkg_path': '/home/<my_username>/pyutils/miniconda3/envs/mri36/lib/python3.6/site-packages/nipype',
 'scipy_version': '1.2.1',
 'sys_executable': '/home/<my_username>/pyutils/miniconda3/envs/mri36/bin/python',
 'sys_platform': 'linux',
 'sys_version': '3.6.8 |Anaconda, Inc.| (default, Dec 30 2018, 01:22:34) \n'
                '[GCC 7.3.0]',
 'traits_version': '5.1.1'}

Execution environment


  • My python environment outside container
@satra
Member

satra commented Jun 28, 2019

@mick-d - are you positive that Node 2 is producing new outputs after the change? It may still be producing the same output (content). If you are using a content hash instead of timestamps, that can help.

but more details would help.
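A minimal sketch of that setting (workflow name and base_dir are hypothetical; hashing defaults to timestamps):

# Sketch only: hash node inputs by file content rather than timestamps, so a
# node is considered up to date only when the actual data are unchanged.
from nipype import Workflow

wf = Workflow(name='rerunissue', base_dir='/tmp/rerunissue')  # hypothetical names
wf.config['execution'] = {'hash_method': 'content'}  # default is 'timestamp'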

@mick-d
Contributor Author

mick-d commented Jun 28, 2019

Hi Satra, yes, I am 100% positive: both the output content and the timestamp of Node 2 changed in the workflow base_dir, but Node 3 (fsl.CopyGeom) still would not rerun (the timestamp of Node 3's output was older than the output of Node 2).

@satra
Member

satra commented Jun 30, 2019

@mick-d - in that case would it be possible to create a small example that replicates the issue?

@mick-d
Contributor Author

mick-d commented Jun 30, 2019

@satra Yes, the previous nodes are actually part of a workflow, so I'll create a simple and clear example from scratch to better illustrate it.

@oesteban
Contributor

oesteban commented Aug 7, 2019

Hey @mick-d, I've just merged #2971 which potentially affects this particular problem.

Can you check whether this has been fixed? (Please remember to run both instances of the workflow with use_relative_paths switched on.)
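For reference, a minimal sketch (workflow name and base_dir are hypothetical) of switching that option on; as noted above, both runs of the workflow should use it:

# Sketch only: store paths in cached results relative to the node's working
# directory, so the working directory can move without breaking the cache.
from nipype import Workflow

wf = Workflow(name='rerunissue', base_dir='/tmp/rerunissue')  # hypothetical names
wf.config['execution'] = {'use_relative_paths': True}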

@axiezai
Contributor

axiezai commented May 28, 2020

Hi, just to add onto this:

I created a workflow with the following MapNodes for a BIDS dataset with 1 subject and 2 sessions:

# other FSL nodes...

# FSL ApplyWarp interface
ACPC_warp = MapNode(fsl.preprocess.ApplyWarp(), name='apply_warp', iterfield=["in_file", "premat"])
ACPC_warp.inputs.out_file = 'acpc_t1.nii.gz'
ACPC_warp.inputs.relwarp = True
ACPC_warp.inputs.interp = "spline"
ACPC_warp.inputs.ref_file = MNI_template

# gunzip:
gz2nii = MapNode(gunzip_nii(), name='gunzip', iterfield="in_file")
gz2nii.inputs.out_file = 'acpc_t1.nii'

On the first run, I did not have the gunzip MapNode and I set ACPC_warp.inputs.out_file = 'acpc_t1.nii', which produced a file with the .nii extension. I then changed the input to .nii.gz and re-ran my workflow after deleting the previous output directory.

For my workflow I provided the following:

workflow = Workflow(name='mni_reconall', base_dir = os.path.join(Path(BIDS_DIR).parent, 'derivatives'))
workflow.connect(
        [
            (subject_source, select_files, [("subject_id", "subject_id")]),
            (select_files, reduceFOV, [("anat", "in_file")]),
            (reduceFOV, xfminverse, [("out_transform", "in_file")]),
            (reduceFOV, flirt, [("out_roi", "in_file")]),
            (xfminverse, concatxfm, [("out_file", "in_file")]),
            (flirt, concatxfm, [("out_matrix_file", "in_file2")]),
            (concatxfm, alignxfm, [("out_file", "in_file")]),
            (select_files, ACPC_warp, [("anat", "in_file")]),
            (alignxfm, ACPC_warp, [("out_file", "premat")]),
            (ACPC_warp, gz2nii, [("out_file", "in_file")]),
            (gz2nii, reconall, [("out_file", "T1_files")]),
            (select_files, get_fs_id, [("anat", "anat_files")]),
            (get_fs_id, reconall, [("fs_id_list", "subject_id")])
        ]
    )
workflow.config['execution'] = {'use_relative_paths': 'True', 'hash_method': 'content'}
workflow.run('MultiProc', plugin_args = {'n_procs': 2})

On the second run, where I expect a .nii.gz output, the workflow still uses the old cached results:

200528-10:45:03,70 nipype.workflow INFO:
	 [Node] Cached "_apply_warp0" - collecting precomputed outputs
200528-10:45:03,70 nipype.workflow INFO:
	 [Node] "_apply_warp0" found cached.

And sure enough, the gunzip MapNode doesn't find a .nii.gz file, because the .nii file from the previous run is used:

Standard error:
gzip: acpc_t1.nii.gz: No such file or directory
Return code: 1

Singularity> ls /dwi_preproc/derivatives/mni_reconall/_subject_id_01/apply_warp/mapflow/_apply_warp1/
_0x0573287f3994c86d318ed310ffb09564.json  **acpc_t1.nii**  command.txt  _inputs.pklz  _node.pklz  _report  result__apply_warp1.pklz

Platform details:

200528-08:35:53,637 nipype.utils INFO:
         Running nipype version 1.5.0-rc1 (latest: 1.4.2)
{'commit_hash': '%h',
 'commit_source': 'archive substitution',
 'networkx_version': '2.4',
 'nibabel_version': '3.1.0',
 'nipype_version': '1.5.0-rc1',
 'numpy_version': '1.18.4',
 'pkg_path': '/opt/miniconda-latest/envs/tracts/lib/python3.7/site-packages/nipype',
 'scipy_version': '1.4.1',
 'sys_executable': '/opt/miniconda-latest/envs/tracts/bin/python',
 'sys_platform': 'linux',
 'sys_version': '3.7.3 | packaged by conda-forge | (default, Dec  6 2019, '
                '08:54:18) \n'
                '[GCC 7.3.0]',
 'traits_version': '6.0.0'}

Let me know how else I can help. Thank you :)

@axiezai
Contributor

axiezai commented May 28, 2020

After a more detailed look, specifying the output type on the FSL interface (ACPC_warp.inputs.output_type = 'NIFTI' or 'NIFTI_GZ') solves the problem.

However, in my previous runs where I did not specify the output type, the inputs to the node did change from .nii to .nii.gz and the workflow still used the cached results instead of recomputing, despite use_relative_paths = True. Hopefully this helps...
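A minimal sketch of that fix (same node and filenames as in my earlier snippet): make the FSL output type explicit so it agrees with the extension of out_file:

# Sketch of the workaround: pin the FSL output type so the file ApplyWarp
# produces really carries the .nii.gz extension that out_file (and the
# downstream gunzip node) expects.
from nipype import MapNode
from nipype.interfaces import fsl

ACPC_warp = MapNode(fsl.preprocess.ApplyWarp(), name='apply_warp',
                    iterfield=['in_file', 'premat'])
ACPC_warp.inputs.out_file = 'acpc_t1.nii.gz'
ACPC_warp.inputs.output_type = 'NIFTI_GZ'  # or 'NIFTI' for an uncompressed .nii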
