Subscribe to PHP Freaks RSS

Notes on GraphQL

syndicated from planet-php.net on July 18, 2018

The last week has been my first foray into GraphQL, using the GitHub GraphQL API endpoints. I now have OpinionsTM.

The promise is fantastic: query for everything you need, but nothing more. Get it all in one go.

But the reality is somewhat... different.

What I found was that you end up with a lot of garbage data structures that you then, on the client side, need to decipher and massage, unpacking edges, nodes, and whatnot. I ended up having to do almost a dozen array_column, array_map, and array_reduce operations on the returned data to get a structure I can actually use.

The final data I needed looked like this:

[
  {
    "name": "zendframework/zend-expressive",
    "tags": [
      {
        "name": "3.0.2",
        "date": "2018-04-10"
      }
    ]
  }
]

To fetch it, I needed a query like the following:

query showOrganizationInfo(
  $organization:String!
  $cursor:String!
) {
  organization(login:$organization) {
    repositories(first: 100, after: $cursor) {
      pageInfo {
        startCursor
        hasNextPage
        endCursor
      }
      nodes {
        nameWithOwner
        tags:refs(refPrefix: "refs/tags/", first: 100, orderBy:{field:TAG_COMMIT_DATE, direction:DESC}) {
          edges {
            tag: node {
              name
              target {
                ... on Commit {
                  pushedDate
                }
                ... on Tag {
                  tagger {
                    date
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

Which gave me data like the following:

{
  "data": {
    "organization": {
      "repositories: {
        "pageInfo": {
          "startCursor": "...",
          "hasNextPage": true,
          "endCursor": "..."
        },
        "nodes": [
          {
            "nameWithOwner": "zendframework/zend-expressive",
            "tags": {
              "edges": [
                "tag": {
                  "name": "3.0.2",
                  "target": {
                    "tagger": {
                      "date": "2018-04-10"
                    }
                  }
                }
              ]
            }
          }
        ]
      }
    }
  }
}

How did I discover how to create the query? I'd like to say it was by reading the docs. I really would. But these gave me almost zero useful examples, particularly when it came to pagination, ordering results sets, or what those various "nodes" and "edges" bits were, or why they were necessary. (I eventually found the information, but it's still rather opaque as an end-user.)

Additionally, see that pageInfo bit? This brings me to my next point: pagination sucks, particularly if it's not at the top-level. You can only fetch 100 items at a time from any given node in the GitHub GraphQL API, which means pagination. And I have yet to find a client that will detect pagination data in results and auto-follow them. Additionally, the "after" property had to be something valid... but there were no examples of what a valid value would be. I had to resort to StackOverflow to find an example, and I still don't understand why it works.

I get why clients cannot unfurl pagination, as pagination data could appear anywhere in the query. However, it hit me hard, as I thought I had a complete set of data, only to discover around half of it was missing once I finally got the processing correct.

If any items further down the tree also require pagination, you're in for some real headaches, as you then have to fetch paginated sets depth-first.

So, while GraphQL promises fewer round trips and exactly the data you need, my experience so far is:

  • I end up having to be very careful about structuring my queries, paying huge attention to pagination potential, and often sending multiple queries ANYWAYS. A well-documented REST API is often far easier to understand and work with immediately.

  • I end up doing MORE work client-side to make the data I receive back USEFUL. This is because the payload structure is based on the query structure and the various permutations you need in order to get at the data you need. Again, a REST API usually has a single, well-documented payload, making consumption far easier.

I'm sure I'm

Truncated by Planet PHP, read more at the original (another 1995 bytes)