GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. GraphQL provides a complete and understandable description of the data in your API, gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools. To understand GraphQL, you need to understand the following terms.
Schema
A schema defines a GraphQL API's type system. It describes the complete set of possible data (objects, fields, relationships, everything) that a client can access. Calls from the client are validated and executed against the schema.
Connection
Connections let you query related objects as part of the same call. With connections, you can use a single GraphQL call where you would have to use multiple calls to a REST API.
Node
A node is a generic term for an object. You can look up a node directly, or you can access related nodes via a connection.
Edge
Edges represent connections between nodes. When you query a connection, you traverse its edges to get to its nodes. Every edges field has a node field, and the node ultimately holds the data you are looking for.
It is useful to picture the above as a graph: dots connected by lines. The dots are nodes, the lines are edges. A connection defines a relationship between nodes.
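To make the terms concrete, here is a minimal Python sketch of how a connection tends to look once it comes back as data. The repository names and the nodes_of helper are made up for illustration; only the edges/node shape comes from the description above.

```python
# A hypothetical connection, shaped the way the terms above describe it:
# a list of edges, each edge leading to exactly one node.
repositories_connection = {
    "totalCount": 2,
    "edges": [
        {"node": {"name": "repo-a"}},
        {"node": {"name": "repo-b"}},
    ],
}


def nodes_of(connection):
    """Traverse a connection's edges and return the node behind each edge."""
    return [edge["node"] for edge in connection["edges"]]


repo_names = [node["name"] for node in nodes_of(repositories_connection)]
print(repo_names)
```

The dots in the picture are the dictionaries under "node", and each entry in "edges" is one of the lines.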
Check out the docs on the GraphQL website for more info and a more detailed description.
GitHub GraphQL API
GitHub chose GraphQL for their v4 API because it offers significantly more flexibility for integrators. The ability to define precisely the data you want, and only the data you want, is a powerful advantage over the REST API endpoints. GraphQL lets you replace multiple REST requests with a single call to fetch the data you specify.
GitHub provides an online tool for testing and exploring the API, the GitHub Explorer. Simply log in with your GitHub credentials and start querying.
This is old news
None of this is new, so you may be wondering why I'm writing about it now: GraphQL was officially released in 2015 and GitHub announced its new GraphQL API in September 2016. Well, the answer is that I struggled to find many examples beyond the most basic hello-world. So after a couple of hours of fiddling around I had created some pretty neat queries exposing some stats around a large GitHub organisation, and I thought I'd share one of them with you.
A Use Case
I wanted to know the number of repositories in our GitHub organisation and, for each of those repositories, the number of branches. This would be fed into some other data we had collected and used to generate some reports for the team. In a normal REST world, you could make the following individual calls:
- GET /orgs/:org/teams
then for each team
- GET /teams/:team_id/repos
then for each repo
- GET /repos/:owner/:repo/branches
Count the elements in the returned array
OK, the above is not rocket science, but it does require some effort. How would the above work against the GraphQL API?
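The REST steps above can be sketched roughly as follows. This is only an illustration: get is a hypothetical helper that takes a path and returns the decoded JSON array, the field names (id, name, owner.login) are assumed to match the REST responses, and pagination is ignored. The fake data lets it run without touching the network.

```python
from typing import Callable, Dict, List

# `get` is a hypothetical helper: REST path in, decoded JSON array out.
Getter = Callable[[str], List[dict]]


def count_branches_rest(org: str, get: Getter) -> Dict[str, int]:
    """Walk teams -> repos -> branches and count the branches per repository."""
    branch_counts: Dict[str, int] = {}
    for team in get(f"/orgs/{org}/teams"):
        for repo in get(f"/teams/{team['id']}/repos"):
            full_name = f"{repo['owner']['login']}/{repo['name']}"
            if full_name in branch_counts:
                continue  # a repo can belong to more than one team
            # Count the elements in the returned array
            branch_counts[full_name] = len(get(f"/repos/{full_name}/branches"))
    return branch_counts


# Made-up data standing in for the API, keyed by request path.
fake_api = {
    "/orgs/acme/teams": [{"id": 1}, {"id": 2}],
    "/teams/1/repos": [{"name": "site", "owner": {"login": "acme"}}],
    "/teams/2/repos": [
        {"name": "site", "owner": {"login": "acme"}},
        {"name": "tools", "owner": {"login": "acme"}},
    ],
    "/repos/acme/site/branches": [{"name": "main"}, {"name": "dev"}],
    "/repos/acme/tools/branches": [{"name": "main"}],
}
calls: List[str] = []


def fake_get(path: str) -> List[dict]:
    calls.append(path)  # record every round trip
    return fake_api[path]


counts = count_branches_rest("acme", fake_get)
print(counts, "requests made:", len(calls))
```

Even with two teams and two repositories the request count adds up quickly, which is the effort the single GraphQL call avoids.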
GraphQL Query
The query will search the github organization and count the number of repositories, then for each repository return the name, url, description and branch information, including the total number of branches and the name of each.
{
  organization(login: "github") {
    name
    url
    repositories(first: 100) {
      numberOfRepositories: totalCount
      details: nodes {
        name
        url
        description
        branches: refs(refPrefix: "refs/heads/", first: 100) {
          numberOfBranches: totalCount
          details: edges {
            branch: node {
              name
            }
          }
        }
      }
    }
  }
}
Copy the above into the GitHub API explorer and see what you get. This query will return a JSON document that we can use to create our report.
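Outside the explorer, the same kind of query can be sent from a script. The sketch below only builds the request, using Python's standard library and a trimmed-down version of the query to keep it short. The endpoint and the bearer-token header follow GitHub's documented conventions as I understand them, and YOUR_TOKEN_HERE is a placeholder you would need to replace with a real personal access token.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# A shortened version of the query above, to keep the example small.
QUERY = """
{
  organization(login: "github") {
    name
    url
    repositories(first: 100) {
      numberOfRepositories: totalCount
    }
  }
}
"""


def build_graphql_request(query: str, token: str) -> urllib.request.Request:
    """Wrap a GraphQL query in the JSON body of an HTTP POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_graphql_request(QUERY, token="YOUR_TOKEN_HERE")
print(request.get_method(), request.full_url)

# Sending it needs a real token and network access, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

The whole query travels as a single string under the "query" key of the JSON body, so swapping in the full query is just a matter of changing QUERY.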
The response (well, a snippet of it) is shown below:
{
  "data": {
    "organization": {
      "name": "GitHub",
      "url": "https://github.com/github",
      "repositories": {
        "numberOfRepositories": 296,
        "details": [
          ...
          {
            "name": "learn.github.com",
            "url": "https://github.com/github/learn.github.com",
            "description": "The discovery gate for all things educational for Git and GitHub.",
            "branches": {
              "numberOfBranches": 5,
              "details": [
                {
                  "branch": {
                    "name": "bootstrap-and-readme"
                  }
                },
                {
                  "branch": {
                    "name": "gh-pages"
                  }
                },
                {
                  "branch": {
                    "name": "gitcasts-videos"
                  }
                },
                {
                  "branch": {
                    "name": "proper-rendering-with-codeblocks"
                  }
                },
                {
                  "branch": {
                    "name": "remove-generation-rake-task"
                  }
                }
              ]
            }
          },
          ...
        ]
      }
    }
  }
}
We return only the data we want and can shape it into an easy-to-consume format. The query is simply an HTTP POST, so once the query is set you can perform the POST and process the dataset however you wish.
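As one possible way of processing the dataset, the sketch below flattens the aliased response into one row per repository. The summarise function and the row keys are my own naming, and sample is a cut-down copy of the response snippet above rather than a live result.

```python
def summarise(response: dict) -> list:
    """Flatten the aliased response into one row per repository."""
    repositories = response["data"]["organization"]["repositories"]
    rows = []
    for repo in repositories["details"]:
        branches = repo["branches"]
        rows.append({
            "repository": repo["name"],
            "branch_count": branches["numberOfBranches"],
            "branch_names": [edge["branch"]["name"] for edge in branches["details"]],
        })
    return rows


# A cut-down copy of the response snippet, holding a single repository.
sample = {
    "data": {
        "organization": {
            "repositories": {
                "numberOfRepositories": 296,
                "details": [
                    {
                        "name": "learn.github.com",
                        "branches": {
                            "numberOfBranches": 5,
                            "details": [
                                {"branch": {"name": "bootstrap-and-readme"}},
                                {"branch": {"name": "gh-pages"}},
                                {"branch": {"name": "gitcasts-videos"}},
                                {"branch": {"name": "proper-rendering-with-codeblocks"}},
                                {"branch": {"name": "remove-generation-rake-task"}},
                            ],
                        },
                    }
                ],
            }
        }
    }
}

for row in summarise(sample):
    print(row["repository"], row["branch_count"], ", ".join(row["branch_names"]))
```

Note that branch_count comes from totalCount, while branch_names only holds the branches actually returned, so the two can differ for a repository with more than 100 branches.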
How does it work
Inspecting the query above, we use the organization node to retrieve the name and URL:
organization(login: "github") {
name
url
We can then follow the edge to the repositories node, using the connection between the nodes.
+-----+      +------+
| ORG +------+ REPO |
+-----+      +------+
This gives us the data on the repositories node: the 'numberOfRepositories' and the list of each individual repository. Using this same pattern, we can follow another edge and get branch information.
+-----+      +------+      +--------+
| ORG +------+ REPO +------+ BRANCH |
+-----+      +------+      +--------+
Extras
GraphQL offers some really powerful manipulation tools too. For example, the totalCount in the branch section of the output is not very descriptive, so we can simply ask for the result to be given an alias:
numberOfBranches: totalCount
You will see I have used several aliases in the example query to make reading and processing the result data more intuitive.
Conclusion
This is just the start of your GraphQL adventure. Check out Mutations, Unions and Input Objects on the GitHub documentation site, where you can also read more about GraphQL and GitHub's implementation of it.
I'd love to hear some of your use cases for the GitHub GraphQL API. Comment below and we can chat...
Until next time,
-Alex.