Skip to content

Custom event filter for controllers #457

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Sep 29, 2021

Conversation

lburgazzoli
Copy link
Collaborator

@lburgazzoli lburgazzoli commented Jul 22, 2021

Fixes #404

@lburgazzoli
Copy link
Collaborator Author

@metacosm this PR is incomplete as it lacks some tests, however I'd like to discuss if it goes in the right direction because there are some behavioral changes:

  • the custom resource cache did not perform a copy of the resource in the cacheResource method, which could lead unexpected results (but I don't know it this was expected)
  • the latest generation value was cached as part of the CustomEventSource after a successful invocation of the event handler and I'm not sure about the reasoning behind that logic as I would assume that any retry logic should not be affected by the revision.


@Override
public void eventSourceDeRegisteredForResource(String customResourceUid) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this method removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because the method was only about to remove the generation from the local cache, but since the cache does not exists anymore the method is useless (and the interface provide a default no-op method)

@lburgazzoli
Copy link
Collaborator Author

lburgazzoli commented Aug 18, 2021

@csviri @metacosm what do you think about the PR (in general) ?

@csviri
Copy link
Collaborator

csviri commented Aug 18, 2021

@csviri @metacosm what do you think about the PR (in general) ?

@lburgazzoli will try to take a look soon, like the idea of predicate very much! Will check deeper later.

@lburgazzoli
Copy link
Collaborator Author

@csviri @metacosm what do you think about the PR (in general) ?

@lburgazzoli will try to take a look soon, like the idea of predicate very much! Will check deeper later.

At the moment it is like a POC so, lot of things may be wrong :)

@lburgazzoli lburgazzoli force-pushed the gh-404 branch 2 times, most recently from 097933b to b29157c Compare August 20, 2021 11:29
@lburgazzoli lburgazzoli marked this pull request as ready for review September 6, 2021 08:03
@lburgazzoli lburgazzoli marked this pull request as draft September 6, 2021 10:50
@lburgazzoli lburgazzoli force-pushed the gh-404 branch 3 times, most recently from ebca1db to 326a3ad Compare September 6, 2021 12:00
@lburgazzoli lburgazzoli marked this pull request as ready for review September 6, 2021 12:00
@lburgazzoli
Copy link
Collaborator Author

@csviri @metacosm I need to add some tests but it should be ready for review

@metacosm
Copy link
Collaborator

metacosm commented Sep 9, 2021

How do you propose to configure the predicate?

@lburgazzoli
Copy link
Collaborator Author

How do you propose to configure the predicate?

This is an advanced functionality which I don't think it should be made configurable as it depends on the specific spec/status and behavior of the operator so I think this should not be exposed as a configuration option (IMHO, the "generation aware" configuration option should not be exposed too)

@metacosm
Copy link
Collaborator

metacosm commented Sep 9, 2021

How do you propose to configure the predicate?

This is an advanced functionality which I don't think it should be made configurable as it depends on the specific spec/status and behavior of the operator so I think this should not be exposed as a configuration option (IMHO, the "generation aware" configuration option should not be exposed too)

I'm confused: I thought the whole point of this PR was to be able to configure the behavior with more than just a property or annotation value. Could you please explain the use case in greater details, then?

@csviri
Copy link
Collaborator

csviri commented Sep 9, 2021

hi @lburgazzoli @metacosm sorry busy days, will try to take a look today

@lburgazzoli
Copy link
Collaborator Author

lburgazzoli commented Sep 9, 2021

How do you propose to configure the predicate?

This is an advanced functionality which I don't think it should be made configurable as it depends on the specific spec/status and behavior of the operator so I think this should not be exposed as a configuration option (IMHO, the "generation aware" configuration option should not be exposed too)

I'm confused: I thought the whole point of this PR was to be able to configure the behavior with more than just a property or annotation value. Could you please explain the use case in greater details, then?

Oh sorry I though that with wow do you propose to configure the predicate you meant how to use a property to configure a predicate.

So far what I did is to use ControllerConfiguration upon registration so like

var operator = new Operator();
operator.register(new MyController(), new MyConfiguration() {
    @Override
    public CustomResourcePredicate<MyResource> getPredicate() {
        return ...;
    }
});

@metacosm
Copy link
Collaborator

metacosm commented Sep 9, 2021

OK, so that should part of the configuration overriding mechanism, then…

@lburgazzoli
Copy link
Collaborator Author

OK, so that should part of the configuration overriding mechanism, then…

will have a look later this week or beginning next week

@lburgazzoli
Copy link
Collaborator Author

OK, so that should part of the configuration overriding mechanism, then…

will have a look later this week or beginning next week

I've added support to the conf override, now I wonder if it still make sense to expose the option to configure the generation awarness

Copy link
Collaborator

@metacosm metacosm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase your PR so that all commits follow the conventional commit message format: https://www.conventionalcommits.org/en/v1.0.0/

@metacosm
Copy link
Collaborator

OK, so that should part of the configuration overriding mechanism, then…

will have a look later this week or beginning next week

I've added support to the conf override, now I wonder if it still make sense to expose the option to configure the generation awarness

I'm not sure either but then again we can't break the API before v2. That said, the other question I have is how someone using Quarkus where the controller is automatically registered would be able to specify their predicates and have it taken into account…

@lburgazzoli
Copy link
Collaborator Author

lburgazzoli commented Sep 13, 2021

I'm not sure either but then again we can't break the API before v2. That said, the other question I have is how someone using Quarkus where the controller is automatically registered would be able to specify their predicates and have it taken into account…

What about something like:

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE})
public @interface Controller {
    Class<CustomResource> eventFilter default GenrationAwareFilter.class;
}
@Controller(
      eventFilter = MyCustomResourcePredicate.class
)

As alternative the controller should be able to set some config option in its init method (that would be my preferred option)

@csviri
Copy link
Collaborator

csviri commented Sep 13, 2021

I'm not sure either but then again we can't break the API before v2. That said, the other question I have is how someone using Quarkus where the controller is automatically registered would be able to specify their predicates and have it taken into account…

What about something like:

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE})
public @interface Controller {
    Class<CustomResource> eventFilter default GenrationAwareFilter.class;
}
@Controller(
      eventFilter = MyCustomResourcePredicate.class
)

As alternative the controller should be able to set some config option in its init method (that would be my preferred option)

@lburgazzoli @metacosm just to add my two cents.

  • should not be this a list of predicates in the annotation? So it's easy to compose more predicates.
  • we could consider having the generation awareness as a boolean (as it is now ) and just add the predicate in the background. The only reason to do this is just to have this nice explicit api about generations, since those are quite central concepts of event processing. While the other predicates will usually solve some special cases I guess.

// if the finalizer has not been set, the operator logic will add it so there will be
// events generated by the operator itself that won't lead to a new generation so such
// events should not be filtered
final CustomResourceEventFilter<T> finalizerPredicate =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we checking this with predicate? It would be much simpler and easy to read with just checking directly the field with in a method.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So that the same logic is applied everywhere it's needed instead of having variations of the same code.

@csviri
Copy link
Collaborator

csviri commented Sep 19, 2021

Back to the main topic of this PR:

In general I think if we could solve these cases elegantly with this predicates as now, it would be a nice way to implement things like generation awareness. On the other hand I see currently two issues:

  • The oldObject (thus the previouse version of CR) is used in the filter api, which might not be awailable in the future.
  • I still don't see the clear vision / use cases, if somebody wan't to test a new version of an operator it's typically we can already do in a separate namespace. Remember that if we introduce this API for filters we will have to support it. But without clear vision it could make some confusion. Is there something like this supported in controller runtime?

@lburgazzoli
Copy link
Collaborator Author

Back to the main topic of this PR:

In general I think if we could solve these cases elegantly with this predicates as now, it would be a nice way to implement things like generation awareness. On the other hand I see currently two issues:

* The oldObject (thus the previouse version of CR) is used in the filter api, which might not be awailable in the future.

I'm not sure about this point but it can eventually be done or maybe we can clearly document that the oldObject may be null in case of a cacheless controller. Eventually also the generation generation awarness can be delegated to the user by using the obeservedGeneration patter (implemented by a number of Kubernetes controllers) in which, in essence you have to store the generation the controller is act upon in the status.

As a side note, removing the oldObject would make the java sdk to drift from the design of the golang sdk (https://sdk.operatorframework.io/docs/building-operators/golang/references/event-filtering/)

* I still don't see the clear vision / use cases, if somebody wan't to test a new version of an operator it's typically we can already do in a separate namespace. Remember that if we introduce this API for filters we will have to support it. But without clear vision it could make some confusion. Is there something like this supported in controller runtime?

This issue/PR is not related to testing or canary deployment (that issue was the one about using label selector #453) but it is about letting the user be able to install advanced filter to reduce the reconcile events.

@csviri
Copy link
Collaborator

csviri commented Sep 20, 2021

I'm not sure about this point but it can eventually be done or maybe we can clearly document that the oldObject may be null in case of a cacheless controller. Eventually also the generation generation awarness can be delegated to the user by using the obeservedGeneration patter (implemented by a number of Kubernetes controllers) in which, in essence you have to store the generation the controller is act upon in the status.

If we document that old object might be null, not sure if that would make sense even to put it there - Or we could also replicate the 3 methods in the filter (Create, Update, Delete), (if we are actually going to use informers in the background.)

This interface I guess comes from the fact that there is an informer used in the background and the old object is always cached.
(I mean this https://sdk.operatorframework.io/docs/building-operators/golang/references/event-filtering/ )
So not sure if these filters are such tightly coupled with informers, it could be added to rabric8 client, and we could just make an adaptor for it (can't see it there atm).

The main problems with the oldObject. If the operator is restarted we don't have the old object anymore which is then a little bit smell, although also it's true that on restart we usually want to reconcile.

But on large clusters where there are thousands of custom resources we don't want to cache all the objects since it can happen that there might be just no related events coming for a very long time. In that case it might be more efficient to just read the CR from the k8s API. The plan for the future is that we will provide a configurable cache for this (like Caffein). So in this case the old object won't be available by definition.
We should have actually a discussion about from above, also if we should have multiple options for caches etc, to cache or not cache the objects. So very interested in your opinion @lburgazzoli also. But the design of this predicates also depends on that outcome.

@lburgazzoli
Copy link
Collaborator Author

I'm not sure about this point but it can eventually be done or maybe we can clearly document that the oldObject may be null in case of a cacheless controller. Eventually also the generation generation awarness can be delegated to the user by using the obeservedGeneration patter (implemented by a number of Kubernetes controllers) in which, in essence you have to store the generation the controller is act upon in the status.

If we document that old object might be null, not sure if that would make sense even to put it there - Or we could also replicate the 3 methods in the filter (Create, Update, Delete), (if we are actually going to use informers in the background.)

This interface I guess comes from the fact that there is an informer used in the background and the old object is always cached.
(I mean this sdk.operatorframework.io/docs/building-operators/golang/references/event-filtering )
So not sure if these filters are such tightly coupled with informers, it could be added to rabric8 client, and we could just make an adaptor for it (can't see it there atm).

The main problems with the oldObject. If the operator is restarted we don't have the old object anymore which is then a little bit smell, although also it's true that on restart we usually want to reconcile.

But on large clusters where there are thousands of custom resources we don't want to cache all the objects since it can happen that there might be just no related events coming for a very long time. In that case it might be more efficient to just read the CR from the k8s API. The plan for the future is that we will provide a configurable cache for this (like Caffein). So in this case the old object won't be available by definition.
We should have actually a discussion about from above, also if we should have multiple options for caches etc, to cache or not cache the objects. So very interested in your opinion @lburgazzoli also. But the design of this predicates also depends on that outcome.

Fine with me, maybe we can have an ad-hoc meeting for this discussion ?

@csviri
Copy link
Collaborator

csviri commented Sep 20, 2021

Fine with me, maybe we can have an ad-hoc meeting for this discussion ?

Definitely, that would be the best IMHO.

@lburgazzoli lburgazzoli marked this pull request as draft September 20, 2021 12:50
@lburgazzoli
Copy link
Collaborator Author

Converted to a draft until we get consensus

/**
* Determines whether the change between the old version of the resource and the new one needs to
* be propagated to the controller or not.
*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have filter on Creates(Deletes)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@lburgazzoli lburgazzoli marked this pull request as ready for review September 29, 2021 07:01
@lburgazzoli
Copy link
Collaborator Author

@csviri @metacosm pr is now ready for review

@csviri
Copy link
Collaborator

csviri commented Sep 29, 2021

@lburgazzoli will take a look today, after finished my current issue

try {
lock.lock();
final var uid = getUID(resource);
if (predicate.test(resources.get(uid))) {
if (passthrough != predicate) {
log.trace("Update cache after condition is true: {}", getName(resource));
}
resources.put(uid, resource);
// defensive copy
resources.put(getUID(resource), clone(resource));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are we cloning here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is that as example, if for some reason the process that applies the finalizer fails when updating the object, then the cache would still reporting the object as having the finalizer which is not correct.

it probably won't matter so if it should be removed, I ca do it in a subsequent pr

(configuration, oldResource, newResource) -> {
if (configuration.useFinalizer()) {
final var finalizer = configuration.getFinalizer();
boolean oldFinalizer = oldResource == null || oldResource.hasFinalizer(finalizer);
Copy link
Collaborator

@csviri csviri Sep 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to write unit tests for these methods, since there is non-trivial logic. But it can be a separate PR.

@csviri csviri merged commit 0dc0de1 into operator-framework:master Sep 29, 2021
@lburgazzoli lburgazzoli deleted the gh-404 branch September 29, 2021 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Custom event filter for controllers
3 participants