Support for wait for a resource state condition #995

csviri · 2022-03-04T12:54:08Z

Problem statement

In reconciler logic, it can happen that we need to wait, synchronously or asynchronously, until a resource gets into a certain state. In other words, a condition on the resource is evaluated continuously. The framework could support this explicitly with a wait-for microframework, which could be the basis for implementing the automated workflows described in issue #850.

Requirements

ability to specify conditions to wait for
ability to set timeouts
ability to specify the behavior on timeout, for example how the UpdateControl should behave in case of a timeout
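
The three requirements above could be captured in a small value type. The following is only a minimal sketch under assumptions: `WaitFor`, `OnTimeout`, `Outcome`, and `until` are hypothetical names invented for illustration and are not part of the framework's API. It pairs a condition with an optional timeout and a declared behavior on timeout.

```java
import java.time.Duration;
import java.util.Objects;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types are part of the framework's API.
final class WaitFor<R> {

    // What the reconciler should do once the timeout has elapsed (illustrative options).
    enum OnTimeout { FAIL, RESCHEDULE, RETURN_WITHOUT_UPDATE }

    // Result of evaluating the condition once.
    enum Outcome { MET, NOT_MET, TIMED_OUT }

    private final Predicate<R> condition;
    private final Duration timeout; // null means "no timeout"
    private final OnTimeout onTimeout;

    private WaitFor(Predicate<R> condition, Duration timeout, OnTimeout onTimeout) {
        this.condition = Objects.requireNonNull(condition);
        this.timeout = timeout;
        this.onTimeout = Objects.requireNonNull(onTimeout);
    }

    // No timeout: the condition is simply re-checked whenever reconciliation runs.
    static <R> WaitFor<R> until(Predicate<R> condition) {
        return new WaitFor<>(condition, null, OnTimeout.RETURN_WITHOUT_UPDATE);
    }

    // With a timeout and an explicit behavior once it elapses.
    static <R> WaitFor<R> until(Predicate<R> condition, Duration timeout, OnTimeout onTimeout) {
        return new WaitFor<>(condition, Objects.requireNonNull(timeout), onTimeout);
    }

    // The condition is checked first, so a met condition wins over an elapsed timeout.
    Outcome evaluate(R resource, Duration elapsed) {
        if (condition.test(resource)) {
            return Outcome.MET;
        }
        if (timeout != null && elapsed.compareTo(timeout) >= 0) {
            return Outcome.TIMED_OUT;
        }
        return Outcome.NOT_MET;
    }

    // Only meaningful when a timeout was set.
    OnTimeout onTimeout() {
        return onTimeout;
    }

    public static void main(String[] args) {
        WaitFor<String> ready = WaitFor.until(
                (String phase) -> "Ready".equals(phase),
                Duration.ofSeconds(30),
                OnTimeout.RESCHEDULE);
        System.out.println(ready.evaluate("Pending", Duration.ofSeconds(5)));  // NOT_MET
        System.out.println(ready.evaluate("Pending", Duration.ofSeconds(30))); // TIMED_OUT
        System.out.println(ready.evaluate("Ready", Duration.ofSeconds(31)));   // MET
    }
}
```

How a `TIMED_OUT` outcome would map onto an UpdateControl is left open here, since that is exactly the design question raised in the requirements.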


metacosm · 2022-03-04T13:58:07Z

The other alternative could also be to be able to re-schedule reconciliation for a later time very easily. It will all depend on how easy it is to define conditions and wait for them because if it makes the code complex to reason about, it might be easier and simpler to just reschedule a reconciliation at a later time. Cost vs. benefit kind of analysis.

csviri · 2022-03-04T14:47:22Z

> The other alternative could also be to be able to re-schedule reconciliation for a later time very easily. It will all depend on how easy it is to define conditions and wait for them because if it makes the code complex to reason about, it might be easier and simpler to just reschedule a reconciliation at a later time. Cost vs. benefit kind of analysis.

Yes, the plan is to support that: you specify a condition, but the timeout is optional. If there is no timeout, it will just reschedule or return from reconciliation, so that it is triggered again by an event source event.

scrocquesel · 2022-03-05T22:53:08Z

> > The other alternative could also be to be able to re-schedule reconciliation for a later time very easily. It will all depend on how easy it is to define conditions and wait for them because if it makes the code complex to reason about, it might be easier and simpler to just reschedule a reconciliation at a later time. Cost vs. benefit kind of analysis.
>
> Yes, the plan is to support that: you specify a condition, but the timeout is optional. If there is no timeout, it will just reschedule or return from reconciliation, so that it is triggered again by an event source event.

I think waitFor is designed to be called in a newly spawned thread from a reconcile loop, and I don't see the added value of waiting synchronously when the whole architecture is based on eventing. You can register a reschedule on a deadline, and the reconcile logic should handle this time-based logic. I mean, this is how it is actually done without Dependent. Why would we need another thread/timer to do it?

csviri · 2022-03-07T08:19:46Z

> I think waitFor is designed to be called in a newly spawned thread from a reconcile loop, and I don't see the added value of waiting synchronously when the whole architecture is based on eventing. You can register a reschedule on a deadline, and the reconcile logic should handle this time-based logic. I mean, this is how it is actually done without Dependent. Why would we need another thread/timer to do it?

It's never a new Thread, pls see related PR (but it's pretty much just started ATM).

What I see now are three different scenarios:

1. (preferred) waitFor returns from reconciliation instantly if the condition is not met. The assumption is that there is an event source in place that will trigger reconciliation if something changes.
2. It returns from reconciliation but registers a reschedule to trigger the reconciliation again. This is interesting for cases where no event source is properly registered. It's problematic because it can create an endless loop; to work around that, some state should be stored (maybe as part of the Condition in status).
3. It waits synchronously (on the same thread, a.k.a. sleep) for a condition, with a timeout. This is for the case when, for example, an operator just reconciles periodically and manages external resources, and it's not desired to run the whole reconciliation again, since that would result in lots of other API calls.
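
To make the difference between the three scenarios concrete, here is a minimal sketch under assumptions. `WaitScenarios`, `Mode`, `Result`, and this `waitFor` signature are hypothetical and are not taken from the related PR or from the framework. The reconciler gets back whether the condition holds and, for the second scenario, a delay after which reconciliation should be triggered again.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of the three scenarios; not the framework's actual API.
final class WaitScenarios {

    enum Mode { RELY_ON_EVENT_SOURCE, RESCHEDULE, BLOCK }

    // What the reconciler learns: whether the condition holds and, if not,
    // whether a re-run should be scheduled (null means "no reschedule").
    static final class Result {
        final boolean conditionMet;
        final Duration rescheduleAfter;

        Result(boolean conditionMet, Duration rescheduleAfter) {
            this.conditionMet = conditionMet;
            this.rescheduleAfter = rescheduleAfter;
        }
    }

    static Result waitFor(Mode mode, BooleanSupplier condition,
                          Duration pollInterval, Duration timeout) {
        if (condition.getAsBoolean()) {
            return new Result(true, null);
        }
        switch (mode) {
            case RELY_ON_EVENT_SOURCE:
                // Scenario 1: return at once; an event source triggers the next run.
                return new Result(false, null);
            case RESCHEDULE:
                // Scenario 2: return, but ask for another reconciliation later.
                return new Result(false, pollInterval);
            case BLOCK:
                // Scenario 3: sleep on the calling thread until met or timed out.
                return new Result(blockUntil(condition, pollInterval, timeout), null);
            default:
                throw new IllegalArgumentException("Unknown mode: " + mode);
        }
    }

    private static boolean blockUntil(BooleanSupplier condition,
                                      Duration pollInterval, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
            if (condition.getAsBoolean()) {
                return true;
            }
        }
        return false;
    }
}
```

Note that the sketch does nothing about the endless-loop risk of the second scenario; as mentioned above, that would need some stored state, for example in the resource status.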

scrocquesel · 2022-03-08T22:15:54Z

> 3. It waits synchronously (on the same thread, a.k.a. sleep) for a condition, with a timeout. This is for the case when, for example, an operator just reconciles periodically and manages external resources, and it's not desired to run the whole reconciliation again, since that would result in lots of other API calls.

I'm not sure I understand this use case. What do you mean by "since that would result in lots of other API calls"?

csviri · 2022-03-09T14:37:05Z

> > It waits synchronously (on the same thread, a.k.a. sleep) for a condition, with a timeout. This is for the case when, for example, an operator just reconciles periodically and manages external resources, and it's not desired to run the whole reconciliation again, since that would result in lots of other API calls.
>
> I'm not sure I understand this use case. What do you mean by "since that would result in lots of other API calls"?

So, this is quite a special case of operators, and it's up for discussion whether we want to support it this way; if it would cause too much confusion, we can remove it.
Let's say an operator calls just external APIs during a reconciliation, say three, one after the other. For example, it:

1. checks an S3 bucket,
2. calls the GitHub API,
3. and manages an AWS RDS instance.

In this order. But there are no event sources for those; the operator just reconciles every 10 minutes. Changes to an RDS instance usually take some time. If a change is made to RDS, we could reschedule a new reconciliation, but that would also call APIs 1 and 2. Instead, we could wait synchronously and just check the state of RDS in 2 minutes. That would block the thread, but it wouldn't call the other APIs. So this is where a sync wait would allow such a trade-off.
Again, this might be a corner case; I'm not 100% convinced we need this, especially if it causes confusion.
I just wanted to cover, initially, the scenarios I was able to think of.
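
The trade-off can be illustrated by counting calls. This is a toy sketch under assumptions: `ApiCallTradeoff` and its methods are hypothetical stand-ins that only increment counters and do not talk to S3, GitHub, or RDS. It compares re-running the whole reconciliation until RDS is ready with waiting synchronously on RDS alone.

```java
// Toy model with hypothetical names; the "API calls" only increment counters.
final class ApiCallTradeoff {

    int s3Calls;
    int gitHubCalls;
    int rdsCalls;

    // RDS reports ready on the n-th status check (simulates a slow change).
    private final int rdsReadyAfterChecks;

    ApiCallTradeoff(int rdsReadyAfterChecks) {
        this.rdsReadyAfterChecks = rdsReadyAfterChecks;
    }

    private void checkS3Bucket() { s3Calls++; }

    private void callGitHubApi() { gitHubCalls++; }

    private boolean isRdsReady() {
        rdsCalls++;
        return rdsCalls >= rdsReadyAfterChecks;
    }

    // Option A: reschedule, so every retry runs the whole reconciliation again.
    void reconcileWithReschedule() {
        boolean ready;
        do {
            checkS3Bucket();
            callGitHubApi();
            ready = isRdsReady();
        } while (!ready);
    }

    // Option B: wait synchronously, so only the RDS state is polled again.
    void reconcileWithSyncWait() {
        checkS3Bucket();
        callGitHubApi();
        while (!isRdsReady()) {
            // A real implementation would sleep here (e.g. 2 minutes) and enforce
            // a timeout; this is the part that blocks the reconciler thread.
        }
    }

    public static void main(String[] args) {
        ApiCallTradeoff a = new ApiCallTradeoff(3);
        a.reconcileWithReschedule();
        // prints "reschedule: s3=3 github=3 rds=3"
        System.out.println("reschedule: s3=" + a.s3Calls + " github=" + a.gitHubCalls + " rds=" + a.rdsCalls);

        ApiCallTradeoff b = new ApiCallTradeoff(3);
        b.reconcileWithSyncWait();
        // prints "sync wait: s3=1 github=1 rds=3"
        System.out.println("sync wait: s3=" + b.s3Calls + " github=" + b.gitHubCalls + " rds=" + b.rdsCalls);
    }
}
```

The sync wait saves the repeated S3 and GitHub calls, at the cost of holding the reconciler thread for as long as RDS takes.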

scrocquesel · 2022-03-10T11:30:13Z

> > > It waits synchronously (on the same thread, a.k.a. sleep) for a condition, with a timeout. This is for the case when, for example, an operator just reconciles periodically and manages external resources, and it's not desired to run the whole reconciliation again, since that would result in lots of other API calls.
> >
> > I'm not sure I understand this use case. What do you mean by "since that would result in lots of other API calls"?
>
> So, this is quite a special case of operators, and it's up for discussion whether we want to support it this way; if it would cause too much confusion, we can remove it. Let's say an operator calls just external APIs during a reconciliation, say three, one after the other. For example, it:
>
> 1. checks an S3 bucket,
> 2. calls the GitHub API,
> 3. and manages an AWS RDS instance.
>
> In this order. But there are no event sources for those; the operator just reconciles every 10 minutes. Changes to an RDS instance usually take some time. If a change is made to RDS, we could reschedule a new reconciliation, but that would also call APIs 1 and 2. Instead, we could wait synchronously and just check the state of RDS in 2 minutes. That would block the thread, but it wouldn't call the other APIs. So this is where a sync wait would allow such a trade-off. Again, this might be a corner case; I'm not 100% convinced we need this, especially if it causes confusion. I just wanted to cover, initially, the scenarios I was able to think of.

I see. IMHO, the reconcile loop shouldn't block for a long-running process, or it should be cancellable. If RDS is a long-running process, I think the best practice is to have an event source. If we agree on that, I don't think adding helpers that make it easier to implement an anti-pattern is a good solution.

csviri · 2022-03-10T13:10:58Z

> I see. IMHO, the reconcile loop shouldn't block for a long-running process, or it should be cancellable. If RDS is a long-running process, I think the best practice is to have an event source. If we agree on that, I don't think adding helpers that make it easier to implement an anti-pattern is a good solution.

Yes, I'm starting to reconsider this too. I will come back to this PR later, in the context of managed dependent resources, but will probably remove polling. Thx for the input!

csviri · 2022-04-19T15:37:09Z

Updated the design of depends_on; the synchronous wait is now out of scope. Thank you all.

(see #850)

csviri added the dependent-resources-epic label Mar 4, 2022

csviri self-assigned this Mar 4, 2022

csviri linked a pull request Mar 4, 2022 that will close this issue

WIP: feat: depends on wait condition design #994

Closed

csviri mentioned this issue Mar 8, 2022

WIP: feat: depends on wait condition design #994

Closed

csviri closed this as completed Apr 19, 2022
