Support for wait for a resource state condition #995

Closed
csviri opened this issue Mar 4, 2022 · 9 comments

Comments

@csviri
Collaborator

csviri commented Mar 4, 2022

Problem statement

In reconciler logic, it can happen that we need to wait, synchronously or asynchronously, until a resource gets into a certain state. In other words, a condition is evaluated continuously on the resource. The framework could support this explicitly with a wait-for microframework, which could be the basis for implementing automated workflows, as described in issue #850.

Requirements

  • ability to specify conditions to wait for
  • ability to set timeouts
  • ability to specify the behavior on timeout, for example how the UpdateControl should behave when a timeout occurs.
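
To make the three requirements concrete, here is a minimal sketch in plain Java. It is not the framework's API: `WaitFor`, `Outcome`, and `await` are hypothetical names used only for illustration. The condition is a predicate, the timeout bounds the wait, and the returned outcome lets the caller decide how to behave on timeout.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical sketch: WaitFor and Outcome are illustrative names,
// not part of the framework's API.
final class WaitFor<R> {

    enum Outcome { CONDITION_MET, TIMED_OUT, INTERRUPTED }

    private final Predicate<R> condition;  // the state to wait for
    private final Duration timeout;        // upper bound on the wait
    private final Duration pollInterval;   // pause between two checks

    WaitFor(Predicate<R> condition, Duration timeout, Duration pollInterval) {
        this.condition = condition;
        this.timeout = timeout;
        this.pollInterval = pollInterval;
    }

    // Re-reads the resource until the condition holds or the timeout elapses.
    // The caller decides what each outcome means for its control object.
    Outcome await(Supplier<R> resourceReader) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.test(resourceReader.get())) {
                return Outcome.CONDITION_MET;
            }
            long remainingMillis = (deadline - System.nanoTime()) / 1_000_000;
            if (remainingMillis <= 0) {
                return Outcome.TIMED_OUT;
            }
            try {
                Thread.sleep(Math.min(pollInterval.toMillis(), remainingMillis));
            } catch (InterruptedException e) {
                // Keep the interrupt flag set so the caller can react to it.
                Thread.currentThread().interrupt();
                return Outcome.INTERRUPTED;
            }
        }
    }
}
```

A reconciler could map `TIMED_OUT` to whatever it considers appropriate, such as a reschedule or an error status. How an UpdateControl should express that is exactly the open question of the third requirement.
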
@csviri csviri self-assigned this Mar 4, 2022
@csviri csviri linked a pull request Mar 4, 2022 that will close this issue
@metacosm
Collaborator

metacosm commented Mar 4, 2022

The other alternative could also be to be able to re-schedule reconciliation for a later time very easily. It will all depend on how easy it is to define conditions and wait for them because if it makes the code complex to reason about, it might be easier and simpler to just reschedule a reconciliation at a later time. Cost vs. benefit kind of analysis.

@csviri
Collaborator Author

csviri commented Mar 4, 2022

> The other alternative could also be to be able to re-schedule reconciliation for a later time very easily. It will all depend on how easy it is to define conditions and wait for them because if it makes the code complex to reason about, it might be easier and simpler to just reschedule a reconciliation at a later time. Cost vs. benefit kind of analysis.

Yes, the plan is to support that: you specify a condition, but the timeout is optional. If there is no timeout, it will just reschedule / return from reconciliation, so the next reconciliation is triggered by an event source event.

@scrocquesel
Contributor

> > The other alternative could also be to be able to re-schedule reconciliation for a later time very easily. It will all depend on how easy it is to define conditions and wait for them because if it makes the code complex to reason about, it might be easier and simpler to just reschedule a reconciliation at a later time. Cost vs. benefit kind of analysis.

> Yes, the plan is to support that: you specify a condition, but the timeout is optional. If there is no timeout, it will just reschedule / return from reconciliation, so the next reconciliation is triggered by an event source event.

I think the waitFor is designed to be called in a newly spawned thread from a reconcile loop. And I don't see the added value of waiting synchronously when the whole architecture is based on eventing. You can register a reschedule on a deadline, and the reconcile logic should handle this time-based logic. I mean, this is how it is actually done without Dependent. Why should we need another thread/timer to do it?

@csviri
Collaborator Author

csviri commented Mar 7, 2022

> I think the waitFor is designed to be called in a newly spawned thread from a reconcile loop. And I don't see the added value of waiting synchronously when the whole architecture is based on eventing. You can register a reschedule on a deadline, and the reconcile logic should handle this time-based logic. I mean, this is how it is actually done without Dependent. Why should we need another thread/timer to do it?

It's never a new Thread, pls see related PR (but it's pretty much just started ATM).

What I see now are three different scenarios:

  1. (preferred) The waitFor will return from reconciliation instantly if the condition is not met. The assumption is that there is an event source in place that will trigger reconciliation if something changes.
  2. It will return from reconciliation but will register a reschedule to trigger the reconciliation again. This is interesting for cases where no event source is properly registered. It's problematic because it can create an endless loop; to work around that, some state should be stored (maybe as part of the Condition in status).
  3. It will wait synchronously (on the same thread, a.k.a. sleep) for a condition, with a timeout. This is for the case when, for example, an operator just periodically reconciles and manages external resources, and it's not desired to run the whole reconciliation again since it will resource lot's of other API calls.
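
The three scenarios can be sketched side by side in plain Java. Everything here is hypothetical and only illustrates the control flow: `WaitStrategy`, `Control`, and `WaitScenarios` are made-up names, and `Control` is a stand-in for whatever the reconciler returns to the framework. The attempt counter assumes the caller persists it somewhere (for example in status), which is the workaround for the endless loop mentioned in scenario 2.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.BooleanSupplier;

// Hypothetical names throughout; this is not the framework's API.
enum WaitStrategy { RETURN_AND_AWAIT_EVENT, RESCHEDULE, BLOCK }

// Minimal stand-in for what a reconciler hands back to the framework.
final class Control {
    final boolean conditionMet;
    final Optional<Duration> rescheduleAfter;

    private Control(boolean conditionMet, Optional<Duration> rescheduleAfter) {
        this.conditionMet = conditionMet;
        this.rescheduleAfter = rescheduleAfter;
    }

    static Control proceed() { return new Control(true, Optional.empty()); }
    static Control stop() { return new Control(false, Optional.empty()); }
    static Control retryAfter(Duration delay) { return new Control(false, Optional.of(delay)); }
}

final class WaitScenarios {

    static Control waitFor(BooleanSupplier condition, WaitStrategy strategy,
                           Duration delay, int attemptsSoFar, int maxAttempts) {
        if (condition.getAsBoolean()) {
            return Control.proceed();
        }
        switch (strategy) {
            case RETURN_AND_AWAIT_EVENT:
                // Scenario 1: rely on an event source to trigger the next reconciliation.
                return Control.stop();
            case RESCHEDULE:
                // Scenario 2: ask for a later reconciliation, bounded by a stored
                // attempt count so the loop cannot run forever.
                return attemptsSoFar < maxAttempts ? Control.retryAfter(delay) : Control.stop();
            case BLOCK:
                // Scenario 3: sleep on the current thread until the condition or the timeout.
                return blockUntil(condition, delay);
            default:
                throw new IllegalArgumentException("Unknown strategy: " + strategy);
        }
    }

    private static Control blockUntil(BooleanSupplier condition, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (deadline - System.nanoTime() > 0) {
            try {
                Thread.sleep(5); // short poll interval, chosen only for the sketch
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return Control.stop();
            }
            if (condition.getAsBoolean()) {
                return Control.proceed();
            }
        }
        return Control.stop();
    }
}
```

Only the `BLOCK` branch holds the reconciling thread; the other two return immediately and differ only in who triggers the next run.
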

@scrocquesel
Contributor

> 3. It will wait synchronously (on the same thread, a.k.a. sleep) for a condition, with a timeout. This is for the case when, for example, an operator just periodically reconciles and manages external resources, and it's not desired to run the whole reconciliation again since it will resource lot's of other API calls.

I'm not sure I understand this use case. What do you mean by "since it will resource lot's of other API calls"?

@csviri
Collaborator Author

csviri commented Mar 9, 2022

> > 3. It will wait synchronously (on the same thread, a.k.a. sleep) for a condition, with a timeout. This is for the case when, for example, an operator just periodically reconciles and manages external resources, and it's not desired to run the whole reconciliation again since it will resource lot's of other API calls.

> I'm not sure I understand this use case. What do you mean by "since it will resource lot's of other API calls"?

So, this is quite a special case of operators, and it's up for discussion whether we want to support it this way; if it would cause too much confusion, we can remove it.
Let's say an operator calls only external APIs during a reconciliation, say three of them one after the other. For example, it:

  1. checks an S3 bucket,
  2. calls the GitHub API,
  3. and manages an AWS RDS instance.

In this order. There are no event sources for those; the operator just reconciles every 10 minutes. Changes to an RDS instance usually take some time. When a change is made to RDS, we could reschedule a new reconciliation, but that would also call APIs 1 and 2. Instead, we could wait synchronously and just check the state of RDS again in 2 minutes. That would block the thread, but wouldn't call the other APIs. So this is where a sync wait would allow such a tradeoff.
Again, this might be a corner case; I'm not 100% convinced we need this, especially if it causes confusion.
I just wanted to initially cover the scenarios I was able to think of.
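
The tradeoff can be made concrete by counting calls in a small simulation. This is a hypothetical sketch: the three methods only increment counters and stand in for the S3, GitHub, and RDS calls, and `readyAfterChecks` is an assumed parameter saying on which check the RDS change is finished.

```java
// Hypothetical simulation: no real AWS or GitHub client is involved,
// the methods only count how often each external API would be called.
final class ExternalCallTradeoff {
    int s3Calls;
    int githubCalls;
    int rdsCalls;

    private void checkS3Bucket() { s3Calls++; }

    private void callGithubApi() { githubCalls++; }

    // Reports the RDS instance as ready on the readyAfterChecks-th check.
    private boolean rdsReady(int readyAfterChecks) {
        return ++rdsCalls >= readyAfterChecks;
    }

    // Reschedule approach: every retry runs the whole reconciliation again,
    // so APIs 1 and 2 are called once per retry.
    static ExternalCallTradeoff rescheduleWholeReconciliation(int readyAfterChecks) {
        ExternalCallTradeoff calls = new ExternalCallTradeoff();
        boolean ready = false;
        while (!ready) {
            calls.checkS3Bucket();
            calls.callGithubApi();
            ready = calls.rdsReady(readyAfterChecks);
        }
        return calls;
    }

    // Synchronous wait: a single reconciliation that only re-checks RDS.
    static ExternalCallTradeoff waitSynchronouslyOnRds(int readyAfterChecks) {
        ExternalCallTradeoff calls = new ExternalCallTradeoff();
        calls.checkS3Bucket();
        calls.callGithubApi();
        while (!calls.rdsReady(readyAfterChecks)) {
            // A real operator would sleep here (2 minutes in the example above),
            // keeping the reconciling thread blocked for that time.
        }
        return calls;
    }
}
```

If RDS becomes ready on the third check, the reschedule approach makes three calls to each API, while the synchronous wait makes one S3 call, one GitHub call, and three RDS calls. The price of the second variant is a thread that stays blocked between checks.
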

@scrocquesel
Contributor

> > > 3. It will wait synchronously (on the same thread, a.k.a. sleep) for a condition, with a timeout. This is for the case when, for example, an operator just periodically reconciles and manages external resources, and it's not desired to run the whole reconciliation again since it will resource lot's of other API calls.

> > I'm not sure I understand this use case. What do you mean by "since it will resource lot's of other API calls"?

> So, this is quite a special case of operators, and it's up for discussion whether we want to support it this way; if it would cause too much confusion, we can remove it. Let's say an operator calls only external APIs during a reconciliation, say three of them one after the other. For example, it:
>
>   1. checks an S3 bucket,
>   2. calls the GitHub API,
>   3. and manages an AWS RDS instance.
>
> In this order. There are no event sources for those; the operator just reconciles every 10 minutes. Changes to an RDS instance usually take some time. When a change is made to RDS, we could reschedule a new reconciliation, but that would also call APIs 1 and 2. Instead, we could wait synchronously and just check the state of RDS again in 2 minutes. That would block the thread, but wouldn't call the other APIs. So this is where a sync wait would allow such a tradeoff. Again, this might be a corner case; I'm not 100% convinced we need this, especially if it causes confusion. I just wanted to initially cover the scenarios I was able to think of.

I see. IMHO, the reconcile loop shouldn't block on a long-running process, or it should be cancellable. If the RDS change is a long-running process, I think the best practice is to have an Event Source. If we agree on that, I don't think adding helpers that make an anti-pattern easier to implement is a good solution.

@csviri
Collaborator Author

csviri commented Mar 10, 2022

> I see. IMHO, the reconcile loop shouldn't block on a long-running process, or it should be cancellable. If the RDS change is a long-running process, I think the best practice is to have an Event Source. If we agree on that, I don't think adding helpers that make an anti-pattern easier to implement is a good solution.

Yes, I'm starting to reconsider this too. I will come back to this PR later, in the context of managed dependent resources, but will probably remove polling. Thx for the input!

@csviri
Collaborator Author

csviri commented Apr 19, 2022

Updated the design of depends_on; the synchronous wait is now out of scope. Thank you all.

(see #850)

@csviri csviri closed this as completed Apr 19, 2022