Update authorization and pagination docs (#1814)

mandiwise · benjie · web-flow · commit 4065ffca5268 · 2024-12-12T14:05:49.000Z
Co-authored-by: Benjie <benjie@jemjie.com>
diff --git a/src/pages/learn/authorization.mdx b/src/pages/learn/authorization.mdx
@@ -1,52 +1,92 @@
 # Authorization
 
-> Delegate authorization logic to the business logic layer
+<p className="learn-subtitle">Delegate authorization logic to the business logic layer</p>
+
+Most APIs will need to secure access to certain types of data depending on who requested it, and GraphQL is no different. GraphQL execution should begin after [authentication](/graphql-js/authentication-and-express-middleware/) middleware confirms the user's identity and passes that information to the GraphQL layer. But after that, you still need to determine if the authenticated user is allowed to view the data provided by the specific fields that were included in the request. On this page, we'll explore how a GraphQL schema can support authorization.
+
+## Type and field authorization
 
 Authorization is a type of business logic that describes whether a given user/session/context has permission to perform an action or see a piece of data. For example:
 
 _"Only authors can see their drafts"_
 
-Enforcing this kind of behavior should happen in the [business logic layer](/learn/thinking-in-graphs/#business-logic-layer). It is tempting to place authorization logic in the GraphQL layer like so:
+Enforcing this behavior should happen in the [business logic layer](/learn/thinking-in-graphs/#business-logic-layer). Let's consider the following `Post` type defined in a schema:
+ 
+```graphql
+type Post {
+  authorId: ID!
+  body: String
+}
+```
+
+In this example, we can imagine that when a request initially reaches the server, authentication middleware will first check the user's credentials and add information about their identity to the `context` object of the GraphQL request so that this data is available in every field resolver for the duration of its execution.
+
+If a post's body should only be visible to the user who authored it, then we will need to check that the authenticated user's ID matches the post's `authorId` value. It may be tempting to place authorization logic in the resolver for the post's `body` field like so:
 
 ```js
-const postType = new GraphQLObjectType({
-  name: 'Post',
-  fields: {
-    body: {
-      type: GraphQLString,
-      resolve(post, args, context, { rootValue }) {
-        // return the post body only if the user is the post's author
-        if (context.user && (context.user.id === post.authorId)) {
-          return post.body
-        }
-        return null
-      }
-    }
+function Post_body(obj, args, context, info) {
+  // return the post body only if the user is the post's author
+  if (context.user && (context.user.id === obj.authorId)) {
+    return obj.body
   }
-})
+  return null
+}
 ```
 
-Notice that we define "author owns a post" by checking whether the post's `authorId` field equals the current user’s `id`. Can you spot the problem? We would need to duplicate this code for each entry point into the service. Then if the authorization logic is not kept perfectly in sync, users could see different data depending on which API they use. Yikes! We can avoid that by having a [single source of truth](/learn/thinking-in-graphs/#business-logic-layer) for authorization.
+Notice that we define "author owns a post" by checking whether the post's `authorId` field equals the current user’s `id`. Can you spot the problem? We would need to duplicate this code for each entry point into the service. Then if the authorization logic is not kept perfectly in sync, users could see different data depending on which API they use. Yikes! We can avoid that by having a [single source of truth](/learn/thinking-in-graphs/#business-logic-layer) for authorization, instead of putting it in the GraphQL layer.
 
-Defining authorization logic inside the resolver is fine when learning GraphQL or prototyping. However, for a production codebase, delegate authorization logic to the business logic layer. Here’s an example:
+Defining authorization logic inside the resolver is fine when learning GraphQL or prototyping. However, for a production codebase, delegate authorization logic to the business logic layer. Here’s an example of how authorization of the `Post` type's fields could be implemented separately:
 
 ```js
-// Authorization logic lives inside postRepository
-const postRepository = require('postRepository');
-
-const postType = new GraphQLObjectType({
-  name: 'Post',
-  fields: {
-    body: {
-      type: GraphQLString,
-      resolve(post, args, context, { rootValue }) {
-        return postRepository.getBody(context.user, post)
-      }
+// authorization logic lives inside `postRepository`
+export const postRepository = {
+  getBody({ user, post }) {
+    if (user?.id && (user.id === post.authorId)) {
+      return post.body
     }
+    return null
   }
-})
+}
+```
+
+The resolver function for the post's `body` field would then call a `postRepository` method instead of implementing the authorization logic directly:
+
+```js
+import { postRepository } from 'postRepository'
+
+function Post_body(obj, args, context, info) {
+  // return the post body only if the user is the post's author
+  return postRepository.getBody({ user: context.user, post: obj })
+}
 ```
 
-In the example above, we see that the business logic layer requires the caller to provide a user object. If you are using GraphQL.js, the User object should be populated on the `context` argument or `rootValue` in the fourth argument of the resolver.
+In the example above, we see that the business logic layer requires the caller to provide a user object, which is available in the `context` object for the GraphQL request. We recommend passing a fully-hydrated user object instead of an opaque token or API key to your business logic layer. This way, we can handle the distinct concerns of [authentication](/graphql-js/authentication-and-express-middleware/) and authorization in different stages of the request processing pipeline.
+
+## Using type system directives
+
+In the example above, we saw how authorization logic can be delegated to the business logic layer through a function that is called in a field resolver. In general, it is recommended to perform all authorization logic in that layer, but if you decide to implement authorization in the GraphQL layer instead then one approach is to use [type system directives](/learn/schema/#directives).
+
+For example, a directive such as `@auth` could be defined in the schema with arguments that indicate what roles or permissions a user must have to access the data provided by the types and fields where the directive is applied:
+
+```graphql
+directive @auth(rule: Rule) on FIELD_DEFINITION
+
+enum Rule {
+  IS_AUTHOR
+}
+
+type Post {
+  authorId: ID!
+  body: String @auth(rule: IS_AUTHOR)
+}
+```
+
+It would be up to the GraphQL implementation to determine how an `@auth` directive affects execution when a client makes a request that includes the `body` field for the `Post` type. However, the authorization logic should remain delegated to the business logic layer.
+
+## Recap
+
+To recap these recommendations for authorization in GraphQL:
 
-We recommend passing a fully-hydrated User object instead of an opaque token or API key to your business logic layer. This way, we can handle the distinct concerns of [authentication](/graphql-js/authentication-and-express-middleware/) and authorization in different stages of the request processing pipeline.
+- Authorization logic should be delegated to the business logic layer, not the GraphQL layer
+- After execution begins, a GraphQL server should make decisions about whether the client that made the request is authorized to access data for the included fields
+- Type system directives may be defined and added to the types and fields in a schema to apply generalized authorization rules
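
Note on the authorization changes above: the "single source of truth" argument can be illustrated with a minimal sketch in which two entry points share one rule. `postRepository` and `Post_body` follow the examples added in this diff; `getPostBodyForRest` is a hypothetical second entry point invented here for illustration and is not part of the documentation.

```js
// Business logic layer: the only place where "author owns a post" is defined.
const postRepository = {
  getBody({ user, post }) {
    if (user?.id && user.id === post.authorId) {
      return post.body
    }
    return null
  }
}

// GraphQL entry point: the resolver only forwards the context user and parent object.
function Post_body(obj, args, context, info) {
  return postRepository.getBody({ user: context.user, post: obj })
}

// Hypothetical non-GraphQL entry point (for example a REST handler) reusing the same rule.
function getPostBodyForRest(user, post) {
  return postRepository.getBody({ user, post })
}

// Sample data, assumed for illustration only.
const post = { authorId: 'u1', body: 'Draft text' }
const author = { id: 'u1' }
const stranger = { id: 'u2' }
```

Because both entry points call `postRepository.getBody`, a change to the ownership rule only has to be made in one place.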
diff --git a/src/pages/learn/pagination.mdx b/src/pages/learn/pagination.mdx
@@ -1,16 +1,16 @@
 # Pagination
 
-> Different pagination models enable different client capabilities
+<p className="learn-subtitle">Traverse lists of objects with a consistent field pagination model</p>
 
-A common use case in GraphQL is traversing the relationship between sets of objects. There are a number of different ways that these relationships can be exposed in GraphQL, giving a varying set of capabilities to the client developer.
+A common use case in GraphQL is traversing the relationship between sets of objects. There are different ways that these relationships can be exposed in GraphQL, giving a varying set of capabilities to the client developer. On this page, we'll explore how fields may be paginated using a cursor-based connection model.
 
 ## Plurals
 
-The simplest way to expose a connection between objects is with a field that returns a plural type. For example, if we wanted to get a list of R2-D2's friends, we could just ask for all of them:
+The simplest way to expose a connection between objects is with a field that returns a plural [List type](/learn/schema/#list). For example, if we wanted to get a list of R2-D2's friends, we could just ask for all of them:
 
 ```graphql
 # { "graphiql": true }
-{
+query {
   hero {
     name
     friends {
@@ -22,10 +22,10 @@ The simplest way to expose a connection between objects is with a field that ret
 
 ## Slicing
 
-Quickly, though, we realize that there are additional behaviors a client might want. A client might want to be able to specify how many friends they want to fetch; maybe they only want the first two. So we'd want to expose something like:
+Quickly, though, we realize that there are additional behaviors a client might want. A client might want to be able to specify how many friends they want to fetch—maybe they only want the first two. So we'd want to expose something like this:
 
 ```graphql
-{
+query {
   hero {
     name
     friends(first: 2) {
@@ -37,20 +37,22 @@ Quickly, though, we realize that there are additional behaviors a client might w
 
 But if we just fetched the first two, we might want to paginate through the list as well; once the client fetches the first two friends, they might want to send a second request to ask for the next two friends. How can we enable that behavior?
 
-## Pagination and Edges
+## Pagination and edges
 
-There are a number of ways we could do pagination:
+There are several ways we could do pagination:
 
 - We could do something like `friends(first:2 offset:2)` to ask for the next two in the list.
 - We could do something like `friends(first:2 after:$friendId)`, to ask for the next two after the last friend we fetched.
 - We could do something like `friends(first:2 after:$friendCursor)`, where we get a cursor from the last item and use that to paginate.
 
-In general, we've found that **cursor-based pagination** is the most powerful of those designed. Especially if the cursors are opaque, either offset or ID-based pagination can be implemented using cursor-based pagination (by making the cursor the offset or the ID), and using cursors gives additional flexibility if the pagination model changes in the future. As a reminder that the cursors are opaque and that their format should not be relied upon, we suggest base64 encoding them.
+The approach described in the first bullet is classic _offset-based pagination_. However, this style of pagination can have performance and security downsides, especially for larger data sets. Additionally, if new records are added to the database after the user has made a request for a page of results, then offset calculations for subsequent pages may become ambiguous.
 
-That leads us to a problem; though; how do we get the cursor from the object? We wouldn't want cursor to live on the `User` type; it's a property of the connection, not of the object. So we might want to introduce a new layer of indirection; our `friends` field should give us a list of edges, and an edge has both a cursor and the underlying node:
+In general, we've found that _cursor-based pagination_ is the most powerful of those designed. Especially if the cursors are opaque, either offset or ID-based pagination can be implemented using cursor-based pagination (by making the cursor the offset or the ID), and using cursors gives additional flexibility if the pagination model changes in the future. As a reminder that the cursors are opaque and their format should not be relied upon, we suggest base64 encoding them.
+
+But that leads us to a problem—how do we get the cursor from the object? We wouldn't want the cursor to live on the `User` type; it's a property of the connection, not of the object. So we might want to introduce a new layer of indirection; our `friends` field should give us a list of edges, and an edge has both a cursor and the underlying node:
 
 ```graphql
-{
+query {
   hero {
     name
     friends(first: 2) {
@@ -67,14 +69,14 @@ That leads us to a problem; though; how do we get the cursor from the object? We
 
 The concept of an edge also proves useful if there is information that is specific to the edge, rather than to one of the objects. For example, if we wanted to expose "friendship time" in the API, having it live on the edge is a natural place to put it.
 
-## End-of-list, counts, and Connections
+## End-of-list, counts, and connections
 
-Now we have the ability to paginate through the connection using cursors, but how do we know when we reach the end of the connection? We have to keep querying until we get an empty list back, but we'd really like for the connection to tell us when we've reached the end so we don't need that additional request. Similarly, what if we want to know additional information about the connection itself; for example, how many total friends does R2-D2 have?
+Now we can paginate through the connection using cursors, but how do we know when we reach the end of the connection? We have to keep querying until we get an empty list back, but we'd like for the connection to tell us when we've reached the end so we don't need that additional request. Similarly, what if we want additional information about the connection itself, for example, how many friends does R2-D2 have in total?
 
-To solve both of these problems, our `friends` field can return a connection object. The connection object will then have a field for the edges, as well as other information (like total count and information about whether a next page exists). So our final query might look more like:
+To solve both of these problems, our `friends` field can return a connection object. The connection object will be an Object type that has a field for the edges, as well as other information (like total count and information about whether a next page exists). So our final query might look more like this:
 
 ```graphql
-{
+query {
   hero {
     name
     friends(first: 2) {
@@ -96,20 +98,50 @@ To solve both of these problems, our `friends` field can return a connection obj
 
 Note that we also might include `endCursor` and `startCursor` in this `PageInfo` object. This way, if we don't need any of the additional information that the edge contains, we don't need to query for the edges at all, since we got the cursors needed for pagination from `pageInfo`. This leads to a potential usability improvement for connections; instead of just exposing the `edges` list, we could also expose a dedicated list of just the nodes, to avoid a layer of indirection.
 
-## Complete Connection Model
+## Complete connection model
 
-Clearly, this is more complex than our original design of just having a plural! But by adopting this design, we've unlocked a number of capabilities for the client:
+Clearly, this is more complex than our original design of just having a plural! But by adopting this design, we've unlocked several capabilities for the client:
 
 - The ability to paginate through the list.
 - The ability to ask for information about the connection itself, like `totalCount` or `pageInfo`.
 - The ability to ask for information about the edge itself, like `cursor` or `friendshipTime`.
 - The ability to change how our backend does pagination, since the user just uses opaque cursors.
 
-To see this in action, there's an additional field in the example schema, called `friendsConnection`, that exposes all of these concepts. You can check it out in the example query. Try removing the `after` parameter to `friendsConnection` to see how the pagination will be affected. Also, try replacing the `edges` field with the helper `friends` field on the connection, which lets you get directly to the list of friends without the additional edge layer of indirection, when that's appropriate for clients.
+To see this in action, there's an additional field in the example schema, called `friendsConnection`, that exposes all of these concepts:
+
+```graphql
+interface Character {
+  id: ID!
+  name: String!
+  friends: [Character]
+  friendsConnection(first: Int, after: ID): FriendsConnection!
+  appearsIn: [Episode]!
+}
+
+type FriendsConnection {
+  totalCount: Int
+  edges: [FriendsEdge]
+  friends: [Character]
+  pageInfo: PageInfo!
+}
+
+type FriendsEdge {
+  cursor: ID!
+  node: Character
+}
+
+type PageInfo {
+  startCursor: ID
+  endCursor: ID
+  hasNextPage: Boolean!
+}
+```
+
+You can try it out in the example query. Try removing the `after` argument for the `friendsConnection` field to see how the pagination will be affected. Also, try replacing the `edges` field with the helper `friends` field on the connection, which lets you get directly to the list of friends without the additional edge layer of indirection, when appropriate for clients:
 
 ```graphql
 # { "graphiql": true }
-{
+query {
   hero {
     name
     friendsConnection(first: 2, after: "Y3Vyc29yMQ==") {
@@ -129,6 +161,14 @@ To see this in action, there's an additional field in the example schema, called
 }
 ```
 
-## Connection Specification
+## Connection specification
+
+To ensure a consistent implementation of this pattern, the Relay project has a formal [specification](https://relay.dev/graphql/connections.htm) you can follow for building GraphQL APIs that use a cursor-based connection pattern, whether or not you use Relay.
+
+## Recap
+
+To recap these recommendations for paginating fields in a GraphQL schema:
 
-To ensure a consistent implementation of this pattern, the Relay project has a formal [specification](https://facebook.github.io/relay/graphql/connections.htm) you can follow for building GraphQL APIs which use a cursor based connection pattern.
+- List fields that may return a lot of data should be paginated
+- Cursor-based pagination provides a stable pagination model for fields in a GraphQL schema
+- The cursor connection specification from the Relay project provides a consistent pattern for paginating the fields in a GraphQL schema
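
Note on the pagination changes above: the connection shape described in this diff (`totalCount`, `edges`, `pageInfo`) can be sketched over an in-memory list. The `paginate` helper and the cursor format are assumptions made here for illustration: cursors are taken to be the string `cursor<N>` (with `N` the 1-based position) encoded as base64, which happens to match the `"Y3Vyc29yMQ=="` value used in the example query. The sketch assumes Node.js for `Buffer`, and a real server may encode cursors differently.

```js
// Assumed cursor format: "cursor<N>" (N = 1-based position), base64-encoded.
const encodeCursor = (position) =>
  Buffer.from(`cursor${position}`).toString('base64')
const decodeCursor = (cursor) =>
  Number(Buffer.from(cursor, 'base64').toString('utf8').replace('cursor', ''))

// Hypothetical helper: build a connection object from an in-memory list.
function paginate(items, { first, after } = {}) {
  // Start right after the item the cursor points at, or at the beginning.
  const start = after ? decodeCursor(after) : 0
  const end = first == null ? items.length : start + first
  const edges = items.slice(start, end).map((node, i) => ({
    cursor: encodeCursor(start + i + 1),
    node
  }))
  return {
    totalCount: items.length,
    edges,
    pageInfo: {
      startCursor: edges.length ? edges[0].cursor : null,
      endCursor: edges.length ? edges[edges.length - 1].cursor : null,
      hasNextPage: end < items.length
    }
  }
}

// Sample data, assumed for illustration only.
const friends = ['Luke', 'Han', 'Leia']
const firstPage = paginate(friends, { first: 2 })
const secondPage = paginate(friends, { first: 2, after: firstPage.pageInfo.endCursor })
```

Clients only pass `endCursor` back as `after`, so the backend could later switch the cursor contents (for example to an ID) without changing the query shape.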