Executor performance optimization #314

Merged
merged 1 commit into webonyx:master on Nov 21, 2018

Conversation

jakubkulhan
Contributor

@jakubkulhan jakubkulhan commented Jul 26, 2018

Hi. First, thank you for the great library! We use it to power our API at Scuk.cz and it's been a smooth ride.

Recently, however, we've discovered poor performance for some of the queries the frontend service sends to the GraphQL endpoint. Over time our schema has grown pretty big (currently it consists of ~300 types, half of which are object types, and ~1000 resolvable fields on object types). So have the queries (one of the pages in our app queries ~200 fields).

We successfully refactored the GraphQL schema so that types & fields are created lazily. We also started caching the parsed AST. These changes got us nice speed improvements. After that I started to look at the performance of the Executor. It costs us more than 150 ms per request for some queries. This is the call graph of the GraphQL endpoint for such a query (library calls are highlighted):

GraphQL call graph

I've started digging into the code and found some easy-to-improve functions (e.g. Executor::shouldIncludeNode()). But I estimate these would shave off only a couple of milliseconds. So I've started to work on a new executor:

  • separated compilation & execution phases
    • first, the AST is transformed into an intermediate representation (= instructions) that is better suited to manipulation and execution
    • then, the instructions are processed sequentially (I suppose sequential execution should be more machine-friendly than jumping between tens of functions and callbacks, and therefore perform better)
  • the instruction pipeline can be manipulated during execution (new instructions are pushed to the front or back)
  • compiled instructions could be cached instead of an AST
    • in the future, they could be compiled down to PHP code for even better performance (as some templating engines, DI containers, etc. do)
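The compile-then-execute idea can be sketched in a few lines of plain PHP. This is purely illustrative: the class and function names are made up and are not the actual code from this PR.

```php
<?php
// Illustrative sketch (not the PR's actual code): the AST is "compiled"
// into a flat list of instructions, which are then processed
// sequentially. New instructions may be pushed to the front or back.

class Instruction
{
    /** @var string */ public $fieldName;
    /** @var string */ public $resultName;

    public function __construct(string $fieldName, string $resultName)
    {
        $this->fieldName  = $fieldName;
        $this->resultName = $resultName;
    }
}

// "Compilation": turn a parsed selection set into instructions.
function compileSelections(array $selections): array
{
    $pipeline = [];
    foreach ($selections as $selection) {
        $alias      = $selection['alias'] ?? $selection['name'];
        $pipeline[] = new Instruction($selection['name'], $alias);
    }
    return $pipeline;
}

// "Execution": walk the pipeline sequentially instead of recursing
// through the AST, calling a resolver per instruction.
function executePipeline(array $pipeline, callable $resolver): array
{
    $result = [];
    while ($pipeline !== []) {
        $instruction = array_shift($pipeline); // take from the front
        $result[$instruction->resultName] = $resolver($instruction->fieldName);
    }
    return $result;
}

$pipeline = compileSelections([
    ['name' => 'foo'],
    ['name' => 'bar', 'alias' => 'baz'],
]);
$result = executePipeline($pipeline, function (string $field): string {
    return strtoupper($field);
});
// $result === ['foo' => 'FOO', 'baz' => 'BAR']
```

Because the pipeline is just a list, caching it (or dumping it as PHP code) is straightforward in a way that caching a recursive traversal is not.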

This is still a work in progress and will need some time before the new executor passes the test suite. Yet, what do you think about this? :)


Questions & notes:

  1. The ValueNode interface doc comment says that VariableNode is a ValueNode, but VariableNode does not implement it. Is this intentional, or a bug?
  2. This might be quite a lot of changes. Is there a way to check code style? (I've run composer run lint, however, PHP-CS reported issues even with the current files.)
  3. The current executor works with results as arrays. But this is a problem for JSON serialization if an array remains empty (there is a workaround converting an empty array to stdClass). I started the new executor working with stdClass instances instead, because I need pass-by-reference semantics, and it also fixes the JSON-serialization issue from the start. But this would be a big breaking change. Should I continue with stdClasses, or use arrays?
  4. I started typehinting all parameters of all new methods. But then I realized the library still supports older PHP versions. Is there a plan to drop support for old versions?
  5. ResolveInfo is tightly coupled to the AST-traversal execution model. After compilation, the AST is no longer relevant for the new executor. It could include the needed AST nodes in instructions, however, that defies the point of the compilation step.
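The empty-array serialization problem from note 3 is easy to see with plain PHP (this is standard json_encode behavior, not library code):

```php
<?php
// An empty PHP array serializes as a JSON list, not a JSON object,
// which is wrong for a GraphQL object that happens to have no fields
// in the result. stdClass does not have this problem.
var_dump(json_encode([]));             // string(2) "[]"
var_dump(json_encode(new stdClass())); // string(2) "{}"

// The array workaround: convert empty results to stdClass before encoding.
$result  = [];
$encoded = json_encode($result === [] ? new stdClass() : $result);
var_dump($encoded); // string(2) "{}"
```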

@vladar
Member

vladar commented Jul 26, 2018

Hey, this is very interesting! Curious to see your final benchmarks. But keep in mind that GraphQL execution is nuanced and has edge cases, so you can't be sure that you've got significant performance benefits until the full test suite passes.

Also, one thing you should be aware of is that this project is a direct port of graphql-js, so the executor mirrors theirs for about 95%. This was a pragmatic choice to make maintenance of this library realistic, and we try to stay as close as possible to the reference implementation exactly because of the cost of maintenance.

So for this PR to be merged into the lib, we must be sure that you are ready to maintain it for the foreseeable future (meaning bug fixes and keeping it up to date with future versions of the spec). Currently it's a matter of porting changes from the reference implementation, but if we switch to the new executor the cost of changes will be much higher.

But anyway, if you succeed with major performance improvements we can actually back-port them to the graphql-js and suggest for merging (since they have a similar issue - graphql/graphql-js#723). So I am really curious to see your final results!

Now back to your questions:

ValueNode interface doc comment says that VariableNode is a ValueNode. But VariableNode does not implement it. Is it intentional, or a bug?

This is probably a bug.

This might be quite a lot of changes. Is there a way to check code style? (I've run composer run lint, however, PHP-CS reported issues even with current files.)

Yeah, we've just started integrating code style checks into the code base (#284). At the moment only the Error and Server namespaces have been cleared; others are yet to come. You can use phpcs against your code directly for now.

But this would be big breaking change. Should I continue with stdClasses, or use arrays?

We'd like to avoid severe breaking changes. I guess it shouldn't be that hard to pass an array by reference instead of a stdClass, or am I missing something?

I started typehinting all new methods. But then I realized the library still supports older PHP versions. Is there a plan to drop old versions support?

Yes, there is. Basically, the next major version will require at least PHP 7.1.

@jakubkulhan jakubkulhan changed the title [WIP] Executor performance optimization Executor performance optimization Jul 28, 2018
@jakubkulhan jakubkulhan force-pushed the compiler branch 2 times, most recently from f108893 to 0845fb0 Compare July 28, 2018 20:18
@jakubkulhan
Contributor Author

But keep in mind that GraphQL execution is nuanced and has edge cases.

I get now what you meant. I had to give up the idea of a separate compilation phase, because the CollectFields() algorithm depends on the variables in the request, so the execution pipeline can't be determined ahead of time, e.g.:

query Q($condition: Boolean!) {
  foo @include(if: $condition)
  foo: bar
}

This query might call the resolver for either foo or bar, depending on the value of $condition.
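The dependence on runtime variables can be made concrete with a toy field-collection routine. This is a deliberately simplified sketch, not the library's actual CollectFields() or shouldIncludeNode():

```php
<?php
// Illustrative only: which fields get collected (and therefore which
// resolvers run) is only known once the request's variables are known.
function collectFields(array $selections, array $variables): array
{
    $fields = [];
    foreach ($selections as $selection) {
        $include = $selection['includeIf'] ?? null; // variable name, if any
        if ($include !== null && !$variables[$include]) {
            continue; // @include(if: $...) evaluated to false
        }
        $key = $selection['alias'] ?? $selection['name'];
        if (!isset($fields[$key])) {
            $fields[$key] = $selection['name']; // first field wins the key
        }
    }
    return $fields;
}

// The query from above: `foo @include(if: $condition)` and `foo: bar`.
$selections = [
    ['name' => 'foo', 'includeIf' => 'condition'],
    ['name' => 'bar', 'alias' => 'foo'],
];

$whenTrue  = collectFields($selections, ['condition' => true]);  // ['foo' => 'foo']
$whenFalse = collectFields($selections, ['condition' => false]); // ['foo' => 'bar']
```

The same response key `foo` maps to a different resolver per request, so no fixed pipeline can be compiled ahead of time.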

Curious to see your final benchmarks.

However, I kept working with the simplified execution flow, and it got me to an implementation that is, for our "big query", at least twice as fast.

The new executor uses coroutines implemented with generators instead of promises and callbacks. For every object field resolution the Executor creates a new strand (also called a fiber / green thread / lightweight thread / goroutine) and starts a new coroutine via Executor::spawn(). Then it runs (Executor::run()) the queued strands to completion.

Promises are handled by the Executor::$pending counter: when execution is postponed by a promise, the pending counter is incremented. After the promise is fulfilled/rejected, pending is decremented, the strand is pushed back to the queue, and execution resumes. After all strands have completed, the main promise is resolved.
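A stripped-down sketch of this scheduling idea, using plain generators and no real promise library (everything here is illustrative; the PR's actual logic lives in Executor::spawn()/run()):

```php
<?php
// Sketch only: each "strand" is a generator run from a queue. Yielding
// stands in for "postponed by a promise": the pending counter goes up,
// and the strand is resumed later with the resolved value.

function fieldStrand(string $name, array &$results): Generator
{
    $value = yield $name;     // execution postponed here
    $results[$name] = $value; // resumed with the "promise" result
}

$queue    = new SplQueue();
$pending  = 0;
$deferred = []; // stand-in for promises completing later
$results  = [];

$queue->enqueue(fieldStrand('foo', $results));
$queue->enqueue(fieldStrand('bar', $results));

// Run loop: drain queued strands; postponed strands are parked.
while (!$queue->isEmpty()) {
    /** @var Generator $strand */
    $strand = $queue->dequeue();
    if (!$strand->valid()) {
        continue; // strand already finished
    }
    $pending++; // postponed by a "promise"
    $deferred[] = [$strand, strtoupper($strand->current())];
}

// The "promises" complete: decrement pending and resume each strand.
// A real scheduler would re-enqueue the strand and loop again.
foreach ($deferred as [$strand, $value]) {
    $pending--;
    $strand->send($value); // resumes the generator past its yield
}
// $results === ['foo' => 'FOO', 'bar' => 'BAR'], $pending === 0
```

When $pending drops to zero and the queue is empty, the main promise would be resolved with the accumulated results.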

Call graph is much simpler than before:

New executor's call graph

A lot of the performance improvement was gained by memoization, see ExecutionContextShared. Most importantly, if there is an array of objects, sub-field collection and other computations are run only once.
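The list-item memoization amounts to something like the following sketch (the function and cache names are made up; the PR's actual sharing happens inside ExecutionContextShared):

```php
<?php
// Illustrative: for a list of objects of the same type, compute the
// sub-field set once and share it across all items.

function collectSubFields(string $typeName, array $selections, array &$cache): array
{
    if (isset($cache[$typeName])) {
        return $cache[$typeName]; // memoized: the expensive work ran once
    }
    // Stand-in for the expensive per-type field collection.
    $cache[$typeName] = array_column($selections, 'name');
    return $cache[$typeName];
}

$subFieldCache = [];
$selections    = [['name' => 'id'], ['name' => 'title']];

// Only the first iteration pays the collection cost; the other 999
// list items reuse the cached result.
foreach (array_fill(0, 1000, null) as $item) {
    $fields = collectSubFields('Foo', $selections, $subFieldCache);
}
// $fields === ['id', 'title'], and the cache holds one entry for 'Foo'
```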

Another 30% improvement could be gained by removing the schema sanity checks, see jakubkulhan/graphql-php@compiler...faster. However, that would mean some breaking changes (11 failing tests).

You can use phpcs against your code directly for now.

OK, changes formatted with phpcbf.

I guess it shouldn't be that hard to pass an array by reference instead of just stdClass or am I missing something?

Because of coroutine scheduling, there is no single point at which I could check whether the resulting array must be converted to stdClass. Instead, I've added Executor::resultToArray(), which converts stdClasses to arrays at the end of execution. It adds negligible overhead.
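The post-processing step is essentially a recursive conversion like the sketch below (the PR's actual Executor::resultToArray() may differ in details):

```php
<?php
// Sketch: recursively convert stdClass results back to plain arrays at
// the very end of execution, after all strands have completed.
function resultToArray($value)
{
    if ($value instanceof stdClass) {
        $array = [];
        foreach (get_object_vars($value) as $key => $item) {
            $array[$key] = resultToArray($item);
        }
        return $array;
    }
    if (is_array($value)) {
        return array_map('resultToArray', $value);
    }
    return $value; // scalars and null pass through unchanged
}

$result            = new stdClass();
$result->foo       = new stdClass();
$result->foo->id   = 1;
$result->items     = [new stdClass()];
$converted = resultToArray($result);
// $converted === ['foo' => ['id' => 1], 'items' => [[]]]
```

Doing the conversion once at the end means the scheduler never has to decide mid-flight whether a partial result is "empty".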

So for this PR to be merged into the lib, we must be sure that you will be ready to maintain it for some foreseeable future (meaning bug-fixes and keeping it up-to-date with the future versions of the spec).

As we use this library in production, I intend to do this :)

@vladar
Member

vladar commented Jul 29, 2018

Sounds promising! Will look into it closer when I have a chance.

One major issue for me, though, is that you got rid of PromiseAdapter. There are already other projects which depend on this feature, like dataloader-php, plus I know people who use it with ReactPromiseAdapter in production.

Also, I am curious what would happen if we applied memoization to collectSubFields for list items. This optimization is actually already merged into graphql-js (here) but we haven't ported it yet.

It would be great to compare. By the way, do you have a benchmarking project? I'd like to play with it too, I just need some common ground to check against.

@vladar
Member

vladar commented Jul 29, 2018

Oh, ignore the promise adapter part, I see it is still there. I just misinterpreted your notes %)

@vladar
Member

vladar commented Jul 29, 2018

I did a quick benchmark, and it looks like the improvement suggested in the reference implementation gets it very close to your implementation (with a slightly better memory footprint).

+----------------+---------------------------------+--------+--------+------+-----+------------+----------+----------+----------+----------+---------+--------+-------+
| benchmark      | subject                         | groups | params | revs | its | mem_peak   | best     | mean     | mode     | worst    | stdev   | rstdev | diff  |
+----------------+---------------------------------+--------+--------+------+-----+------------+----------+----------+----------+----------+---------+--------+-------+
| ExecutionBench | benchNewExecutor                |        | []     | 10   | 3   | 5,872,848b | 7.554ms  | 7.602ms  | 7.572ms  | 7.679ms  | 0.055ms | 0.72%  | 1.00x |
| ExecutionBench | benchOldExecutor                |        | []     | 10   | 3   | 3,272,136b | 15.500ms | 15.765ms | 15.873ms | 15.909ms | 0.187ms | 1.19%  | 2.07x |
| ExecutionBench | benchOldExecutorWithMemoization |        | []     | 10   | 3   | 2,959,416b | 8.673ms  | 8.754ms  | 8.786ms  | 8.809ms  | 0.059ms | 0.67%  | 1.15x |
+----------------+---------------------------------+--------+--------+------+-----+------------+----------+----------+----------+----------+---------+--------+-------+

I just pushed this memoized version to master. Can you try it against your complex query?

@jakubkulhan
Contributor Author

By the way, do you have any benchmarking project?

I created a benchmark for our complex query (it does 1499 resolve calls, of which 449 are for __typename; the query is generated by Apollo Client) with static data pre-generated from production and a resolver like Executor::defaultFieldResolver instead of the individual resolve methods. But I don't think I'll be able to release this.

Results are similar to yours:

+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+
| benchmark         | subject                         | groups | params | revs | its | mem_peak    | best     | mean     | mode     | worst    | stdev   | rstdev | diff  |
+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+
| ExecutorBenchmark | benchNewExecutor                |        | []     | 100  | 3   | 33,452,600b | 6.356ms  | 6.463ms  | 6.404ms  | 6.620ms  | 0.113ms | 1.75%  | 1.05x |
| ExecutorBenchmark | benchNewExecutorNoSchemaChecks  |        | []     | 100  | 3   | 33,346,304b | 6.020ms  | 6.146ms  | 6.055ms  | 6.369ms  | 0.158ms | 2.58%  | 1.00x |
| ExecutorBenchmark | benchOldExecutor                |        | []     | 100  | 3   | 9,369,216b  | 14.010ms | 14.105ms | 14.126ms | 14.192ms | 0.075ms | 0.53%  | 2.30x |
| ExecutorBenchmark | benchOldExecutorWithMemoization |        | []     | 100  | 3   | 9,487,376b  | 9.394ms  | 9.635ms  | 9.672ms  | 9.860ms  | 0.190ms | 1.98%  | 1.57x |
+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+

The reason the old executor with memoization does worse than in your benchmark is probably the new executor's short-circuit execution for the __typename field. I think it could be added to the old executor as well.

We run queries with SyncPromiseAdapter; however, some fields return Deferreds to leverage batch loading from the DB. So I re-ran the benchmark with 15% of the resolve calls returning Deferred (about the same ratio of resolve calls that our real schema would return as async):

+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+
| benchmark         | subject                         | groups | params | revs | its | mem_peak    | best     | mean     | mode     | worst    | stdev   | rstdev | diff  |
+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+
| ExecutorBenchmark | benchNewExecutor                |        | []     | 100  | 3   | 54,489,368b | 8.319ms  | 8.354ms  | 8.361ms  | 8.386ms  | 0.028ms | 0.33%  | 1.08x |
| ExecutorBenchmark | benchNewExecutorNoSchemaChecks  |        | []     | 100  | 3   | 53,846,080b | 7.597ms  | 7.728ms  | 7.638ms  | 7.950ms  | 0.158ms | 2.04%  | 1.00x |
| ExecutorBenchmark | benchOldExecutor                |        | []     | 100  | 3   | 13,377,192b | 21.281ms | 21.487ms | 21.549ms | 21.658ms | 0.156ms | 0.72%  | 2.78x |
| ExecutorBenchmark | benchOldExecutorWithMemoization |        | []     | 100  | 3   | 14,034,973b | 15.836ms | 16.142ms | 16.241ms | 16.387ms | 0.229ms | 1.42%  | 2.09x |
+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+

Also, if all calls to resolve methods return `Deferred`:

+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+
| benchmark         | subject                         | groups | params | revs | its | mem_peak    | best     | mean     | mode     | worst    | stdev   | rstdev | diff  |
+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+
| ExecutorBenchmark | benchNewExecutor                |        | []     | 100  | 3   | 57,660,896b | 14.826ms | 15.154ms | 14.911ms | 15.749ms | 0.421ms | 2.78%  | 1.06x |
| ExecutorBenchmark | benchNewExecutorNoSchemaChecks  |        | []     | 100  | 3   | 56,003,240b | 14.058ms | 14.308ms | 14.380ms | 14.520ms | 0.190ms | 1.33%  | 1.00x |
| ExecutorBenchmark | benchOldExecutor                |        | []     | 100  | 3   | 17,371,088b | 36.865ms | 37.332ms | 37.500ms | 37.681ms | 0.344ms | 0.92%  | 2.61x |
| ExecutorBenchmark | benchOldExecutorWithMemoization |        | []     | 100  | 3   | 17,139,288b | 30.933ms | 31.350ms | 31.212ms | 31.840ms | 0.374ms | 1.19%  | 2.19x |
+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+

The old executor seems to perform worse when promises are involved.


Btw, the benchmark results still show the old executor's overhead at only about 20 ms. I ran the benchmarks on my local machine; I'll have to look into why the overhead is so much bigger on the server.

@vladar
Member

vladar commented Jul 30, 2018

Apart from performance, also keep an eye on the memory footprint. It looks like the new executor is much heavier in this regard.

@jakubkulhan
Contributor Author

Fixed, reference cycles needed to be broken.

+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+
| benchmark         | subject                         | groups | params | revs | its | mem_peak    | best     | mean     | mode     | worst    | stdev   | rstdev | diff  |
+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+
| ExecutorBenchmark | benchNewExecutor                |        | []     | 100  | 3   | 10,168,368b | 8.193ms  | 8.243ms  | 8.211ms  | 8.325ms  | 0.058ms | 0.71%  | 1.06x |
| ExecutorBenchmark | benchNewExecutorNoSchemaChecks  |        | []     | 100  | 3   | 9,523,752b  | 7.542ms  | 7.751ms  | 7.830ms  | 7.899ms  | 0.152ms | 1.96%  | 1.00x |
| ExecutorBenchmark | benchOldExecutor                |        | []     | 100  | 3   | 10,926,120b | 22.744ms | 23.053ms | 22.853ms | 23.554ms | 0.357ms | 1.55%  | 2.97x |
| ExecutorBenchmark | benchOldExecutorWithMemoization |        | []     | 100  | 3   | 11,391,309b | 17.129ms | 17.346ms | 17.197ms | 17.713ms | 0.261ms | 1.51%  | 2.24x |
+-------------------+---------------------------------+--------+--------+------+-----+-------------+----------+----------+----------+----------+---------+--------+-------+

@vladar
Member

vladar commented Jul 30, 2018

The numbers are great! I still hesitate, though, over whether a 10 ms difference for your 150 ms query is worth the complete rewrite (and the increased maintenance costs in the future).

But if we continue with the new executor, we will still give users an option to switch between the old and the new one (at least for 0.13.x). So it makes sense to put it alongside the old one (i.e. in the ExperimentalExecutor namespace) and add a static method to switch implementations on demand.

The reason for this is that while the test suite should cover most of the edge cases, there are still unknown unknowns. I'm pretty sure we'll get unexpected issues, and it would be nice if users could report them and switch back to the old implementation without major effort until we fix everything.

After the adoption period, we'll remove the switch (probably for 0.14.x).

Please let me know if it works for you.

@Smolevich
Contributor

@vladar, when do you plan to release version 0.13.x? Our team also uses webonyx in preproduction and we are interested in optimizing the execution stage.

@jakubkulhan
Contributor Author

@vladar Yes, that sounds reasonable.

@vladar
Member

vladar commented Jul 31, 2018

@Smolevich likely somewhere in September. But I plan to release the memoization improvement of the Executor in the 0.12.x branch in the next couple of days (since it's a cheap and fully backward-compatible change).

@jakubkulhan
Contributor Author

jakubkulhan commented Aug 1, 2018

Rebased onto master; the new executor implementation has moved to the GraphQL\Experimental\Executor namespace, and GraphQL\Executor\Executor can switch between the implementations.

@Smolevich
Contributor

@jakubkulhan, can you show the code of the benchmarks above?

@jakubkulhan
Contributor Author

@Smolevich Sorry, I can't; it uses our schema code. However, the query looks like this: https://gist.github.com/jakubkulhan/e938b66d7e498d549d1d8727cddd3659

@vladar
Member

vladar commented Aug 20, 2018

FYI, I will get back to this PR after porting all of the changes for 0.13, so that we only need to resolve conflicts once before the merge. I'll post another comment here when it's ready for merge.

@jakubkulhan
Contributor Author

@vladar Ok, thank you. Let me know and I'll rebase the branch onto master.

@vladar
Member

vladar commented Nov 3, 2018

After some thinking, I decided to keep the reference executor as the default for the upcoming version.

But we will encourage users to try the new Executor in UPGRADE docs. Given their feedback and how things go, we may change the default implementation in the next version for all users.

The reason is that it is hard for me to read and understand the new code, so maintenance of the executor will be mostly on your shoulders. We must make sure this cooperation works smoothly before making it the default.

So if it works for you, can you rebase it onto master? I am ready to release 0.13.x

@jakubkulhan
Contributor Author

@vladar I'll rebase the new executor code in the next few days. Is there something I can improve to make the code easier to understand?

@jakubkulhan jakubkulhan force-pushed the compiler branch 5 times, most recently from 17aa1ee to 4bc956c Compare November 6, 2018 22:29
@jakubkulhan
Contributor Author

OK, rebase is complete.

@vladar vladar merged commit 21e0c83 into webonyx:master Nov 21, 2018
@vladar
Member

vladar commented Nov 21, 2018

Merged, thanks!

I did some further research, and the interesting thing about the new executor is that it traverses fields in a different order. ReferenceExecutor is depth-first, while the new Executor does a breadth-first traversal (scoped to a parent).

Imagine the following query:

{
  a1 {
    b1 {
      c1
    }
    d1
  }
  a2 {
    b2 {
      c2
    }
    d2
  }
}

The old executor would traverse it as a1 b1 c1 d1 a2 b2 c2 d2
The new executor would traverse it as a1 a2 b1 d1 c1 b2 d2 c2
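The two orders can be reproduced with a small plain-PHP sketch (just to make the difference concrete; this is not either executor's actual code):

```php
<?php
// The query above as a nested array: field name => sub-fields.
$tree = [
    'a1' => ['b1' => ['c1' => []], 'd1' => []],
    'a2' => ['b2' => ['c2' => []], 'd2' => []],
];

// Old executor: visit a field, then fully descend into its sub-fields.
function depthFirst(array $children, array &$order): void
{
    foreach ($children as $name => $sub) {
        $order[] = $name;
        depthFirst($sub, $order);
    }
}

// New executor: visit all siblings first, then descend per parent
// ("breadth-first scoped to a parent").
function scopedBreadthFirst(array $children, array &$order): void
{
    foreach ($children as $name => $sub) {
        $order[] = $name;
    }
    foreach ($children as $sub) {
        scopedBreadthFirst($sub, $order);
    }
}

$old = [];
depthFirst($tree, $old);
echo implode(' ', $old), "\n"; // a1 b1 c1 d1 a2 b2 c2 d2

$new = [];
scopedBreadthFirst($tree, $new);
echo implode(' ', $new), "\n"; // a1 a2 b1 d1 c1 b2 d2 c2
```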

I am curious whether the actual performance boost occurs because of this. I'll do some research in my spare time.

Anyway, this should be totally OK in theory, but we need to collect some real-world feedback on the implications. Do you use it in production already?

@vladar
Member

vladar commented Nov 21, 2018

@jakubkulhan @simPod Code coverage dropped to 84% because it won't run tests against the new Executor. Any ideas on whether we can make code coverage run the tests twice (with different environment variables)?

@jakubkulhan
Contributor Author

@vladar Thank you for merging!

About the order of traversal: I've encountered this in some test cases (e.g. DeferredFieldsTest). I hope nobody's code depends on the specific order :)

Also, the new executor may call resolvers for fields that won't be included in the result due to errors on sibling fields or siblings' sub-graphs. For example, with this schema:

type RootQuery {
  foo: Foo!
}
type Foo {
  id: ID!
  foo1: Foo!
  foo2: Foo!
  foo3: Foo!
}

For query:

query Q {
  foo {
    id
    foo1 { id }
    foo2 { id }
    foo3 { id }
  }
}

AFAIK the old executor won't call the resolver for foo3 if foo2 fails. However, the new executor will, and it will call resolvers for the whole foo3 sub-graph. This could possibly be eliminated by tracking dependencies between execution strands; however, that would unnecessarily slow down average-case execution, i.e. when no resolver fails.

For coverage, it should be possible to run PHPUnit with the env variable EXECUTOR=coroutine and without it, and then merge the generated coverage reports - https://stackoverflow.com/a/30898560/149230
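One possible shape for this (the flags are standard PHPUnit/phpcov options, but the paths and the EXECUTOR variable handling are assumptions from this thread, not this repo's actual Travis config):

```shell
# Sketch only: run the suite twice with different executors, write raw
# coverage per run, then merge everything into a single clover report.
mkdir -p coverage
vendor/bin/phpunit --coverage-php coverage/reference.cov
EXECUTOR=coroutine vendor/bin/phpunit --coverage-php coverage/coroutine.cov
# phpcov merges all *.cov files found in the directory.
vendor/bin/phpcov merge --clover coverage/clover.xml coverage
```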

We've been using the new executor in production at Scuk.cz without any issues since the end of July, when I committed the implementation that passed all tests.

Although I've made one addition to the code we run in production: result caching. It can bypass calling resolvers for a whole sub-graph and read the result from the cache instead. Now that this PR is merged, if you'd be interested, I might send a PR with this change. However, it would be only for the new executor; I don't see how it could be ported to the old executor.

@simPod
Collaborator

simPod commented Nov 21, 2018

It will, however, roughly double the CI test runtime; not a big deal, I guess. But merging the clover reports cannot be avoided.

We could eliminate the doubled runtime by specifying e.g. @group executor for tests that cover the executor and running only those against the new executor.

@vladar
Member

vladar commented Nov 22, 2018

Anyone willing to send a PR for coverage fix in Travis config?

@simPod I think I can live with doubling the test runtime (sorry, Travis).

@vladar
Member

vladar commented Nov 22, 2018

Although I've made one addition to the code we run in production: result caching. It can bypass calling resolvers for a whole sub-graph and read the result from the cache instead.

Very interesting. How do you invalidate it (for the whole sub-graph)?

Now that this PR is merged, if you'd be interested, I might send a PR with this change. However, it would be only for the new executor; I don't see how it could be ported to the old executor.

I think we can add it in the next version. For now, we should just collect some feedback on the new executor. But it sounds pretty exciting.

One thing I am interested in at the moment is benchmarks for the old/new executor (I am still curious to test my hypothesis about the effect of the traversal path on performance). If you have something already, I would be very interested to see it under benchmarks (should be easy now that they are both located alongside each other).

@yaquawa

yaquawa commented Dec 21, 2018

@jakubkulhan

Wow! Very interesting!!
We hit the same poor-performance issue when executing against a large schema, and it's really a big pain for us.

We successfully refactored GraphQL schema so that types & fields are created lazily. Also we started caching parsed AST. This got us nice speed improvements.

compiled instructions could be cached instead of an AST

Would you give me more details on how you optimized execution performance?
Perhaps create a new topic about this in the docs?

@simPod simPod mentioned this pull request Dec 30, 2018
@adri
Contributor

adri commented Mar 8, 2019

@jakubkulhan

Also we started caching parsed AST.

Sounds interesting. Do you maybe have an example of how you did that?

@jakubkulhan
Contributor Author

@adri GraphQL::executeQuery() accepts both a string and an instance of DocumentNode (see https://github.com/webonyx/graphql-php/blob/master/src/GraphQL.php#L128-L132). We use two-level caching (in-process using APCu and out-of-process using Redis) for the expression Parser::parse($query, ["noLocation" => true]) (noLocation is needed, otherwise the parsed AST is really big), which gets passed instead of the raw query string.
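A minimal sketch of the in-process layer of this approach, assuming webonyx/graphql-php and the APCu extension are installed. The function name, cache-key format, and the $schema/$query/$rootValue variables are illustrative, not the actual Scuk.cz code; the out-of-process Redis level is omitted.

```php
<?php
use GraphQL\GraphQL;
use GraphQL\Language\Parser;
use PackageVersions\Versions;

function parseCached(string $query)
{
    // Including the library version in the key avoids unserialize()
    // problems after upgrades (see the follow-up comments below).
    $key = 'gql-ast:' . Versions::getVersion('webonyx/graphql-php') . ':' . sha1($query);

    $ast = apcu_fetch($key, $hit);
    if (!$hit) {
        // noLocation keeps the cached AST small.
        $ast = Parser::parse($query, ['noLocation' => true]);
        apcu_store($key, $ast);
    }
    return $ast;
}

// executeQuery() accepts the DocumentNode in place of the raw string.
$result = GraphQL::executeQuery($schema, parseCached($query), $rootValue);
```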

@adri
Contributor

adri commented Mar 11, 2019

@jakubkulhan thanks! Do you store the DocumentNode serialized? If so, did you have issues with library updates changing properties and making unserialize fail?

@jakubkulhan
Contributor Author

@adri Yes, it's stored serialized with PHP's native serialize(). To prevent issues with version upgrades, the cache key contains graphql-php's package version, obtained using ocramius/package-versions.

@TerraSkye

@jakubkulhan you should use igbinary for serialization; it has a performance increase over the default:

Native PHP:
PHP serialized in 2.91 seconds
PHP unserialized in 6.43 seconds
serialized "string" size: 20769

Igbinary:
WIN igbinary serialized in 1.60 seconds
WIN igbinary unserialized in 4.77 seconds
WIN serialized "string" size: 4467

@wutsch0

wutsch0 commented Sep 19, 2019

@jakubkulhan

Although I've made one addition to the code we run in production - result caching. It can bypass calling resolvers for whole sub-graph and read result from the cache instead. Now that this PR is merged, if you'd be interested, I might send the PR with this change. However, it would be only for the new executor, I don't see how it would be possible to port this to the old executor.

Have you sent a PR for that, or can you tell me how you implemented it? It would be very nice to have sub-graph result caching in the lib.
