What is GraphQL?

December 19, 2019

From graphql.org:

GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. [GraphQL also] gives clients the power to ask for exactly what they need and nothing more, makes it easier to evolve APIs over time, and enables powerful developer tools.

So, what does this mean in practice? When am I actually supposed to use GraphQL, and why?

To solidify the above concepts and understand why GraphQL can be useful, let’s compare a REST implementation vs. a GraphQL implementation of an API that supports a seasonally appropriate(?) use case.

It’s holiday time, so a lot of us have travel coming up. And unfortunately for us, a bunch of this travel will be on planes. Cramped and pricey? Not a good look.

So let’s say that based on my history of being burned by bad flight prices, I now want to make an app that will chart out the trends in flight ticket prices over time. I’ve got my hands on the raw data, but now what? How do we go about makin’ the API that’ll power this fancy app?

REST Implementation

Suppose I want to fly from Boston to Los Angeles on February 1st, and I want to know how much the prices have changed over the last six weeks to see if now might be a good time to buy. To support this use case, we might end up with a REST endpoint that looks something like this:

# app/api/flights/api.rb
# This example is using Ruby Grape: https://github.com/ruby-grape/grape
# but you could implement it any way you like.

resource :flights do
  params do
    requires :departure_airport, type: String
    requires :arrival_airport, type: String
    requires :departs_at, type: DateTime
    requires :start_time, type: DateTime
    requires :end_time, type: DateTime
  end

  get do
    Flight.where(
      airport_departure: params[:departure_airport],
      airport_arrival: params[:arrival_airport],
      recorded_at: params[:start_time].beginning_of_day..params[:end_time].end_of_day,
      departs_at: params[:departs_at]
    )
  end
end

After we’ve put the endpoint together, our first big decision point comes up: what properties about Flights should we send back to the client? Suppose we have access to a database with records of flight prices on certain airlines at certain times. The base record might look something like this:

{
   "id": "1298",
   "airline": "Graphical Airlines",
   "flight_number": "334",
   "recorded_at": "2019-11-24T10:38:21.008Z",
   "departs_at": "2019-02-01T11:30:00.000Z",
   "arrives_at": "2019-02-01T18:25:00.000Z",
   "is_nonstop": true,
   "airport_departure": "BOS",
   "airport_arrival": "LAX",
   "distance": 2601,
   "distance_unit": "miles",
   "duration": 415,
   "total_price": 292.60,
   "baggage_policy": [
     "Each passenger is limited to one carry-on bag and a personal item.",
     "The fee for the first checked bag is $50, each additional bag costs $60.",
     "The US federal government restricts hazardous materials in carry-on and checked baggage."
   ],
   "refundable": false,
   "base_fare": 245.58,
   "fees": 47.02,
   "flight_class": "saver",
   "ticket_changes_allowed": false,
   "can_select_seat": false,
   "boarding": "standard",
   "first_bag_fee": 50,
   "additional_fee_per_bag": 60,
   "stops": 0,
   "layover_duration": 0,
   "total_travel_duration": 415
}

Yikes! That’s a whole lot of info about a single flight, especially when all we care about here is the ticket price. Remember, our initial use case was just to show a price history over the last six weeks. If we snapshotted this flight’s status once every 3 hours over six weeks, we’d end up with:

24 * 7 * 6 = 1008 hours, and 1008 / 3 = 336 snapshots! Sure, it’s not the end of the world, but it’s still way more data than we need. And just imagine how that number explodes if we wanted to extend the time range or query for multiple flights at once!
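To see how quickly that adds up, here’s a quick back-of-the-envelope sketch. The method names and the 1,200-byte record size are my own assumptions for illustration, not measurements from a real payload.

```ruby
# Rough payload math for the price-history use case.
HOURS_PER_DAY = 24
DAYS_PER_WEEK = 7

# Number of snapshots taken over a period, at one snapshot per interval.
def snapshot_count(weeks:, interval_hours:)
  total_hours = HOURS_PER_DAY * DAYS_PER_WEEK * weeks
  total_hours / interval_hours
end

# Approximate response size, assuming every record is the same size.
def payload_kilobytes(snapshots:, bytes_per_record:)
  (snapshots * bytes_per_record / 1024.0).round(1)
end

six_weeks = snapshot_count(weeks: 6, interval_hours: 3)
puts six_weeks # 336

# 1_200 bytes per full record is an assumed, illustrative figure.
puts payload_kilobytes(snapshots: six_weeks, bytes_per_record: 1_200)
```

Doubling the time range or adding a second flight doubles both numbers, and that’s before anyone asks for more fields.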

So let’s start small. Let’s only serialize the basics—ID, the departure time, the airports, and total price of the ticket.

class FlightSerializer < ActiveModel::Serializer
  attributes :id, :departs_at, :total_price,
             :airport_departure, :airport_arrival
end

Now our returned output will look like this instead:

{
   "id": "1298",
   "departs_at": "2019-02-01T11:30:00.000Z",
   "airport_departure": "BOS",
   "airport_arrival": "LAX",
   "total_price": 292.60
}

Phew. Much more manageable.

But wait, a new use case appears! Let’s say our app is really ✈️ taking off ✈️. People love querying for price history, go figure. But now, people want to query not only for price history, but also the airline and total travel time. Well… we can make a PR to update the serializer, and now our payload will look like this.

{
  "id": "1298",
  "departs_at": "2019-02-01T11:30:00.000Z",
  "airline": "Graphical Airlines",
  "airport_departure": "BOS",
  "airport_arrival": "LAX",
  "total_price": 292.60,
  "total_travel_duration": 415
}

Well, making that one-off PR wasn’t too costly I guess, and the payload is still smaller than the original one. But now when we’re querying for only price, we’re still getting back travel time and airline anyway, even though we’re not going to use it. Though… the doc is still so small that maybe we’ll be fine…

…until we get requests to query for flight number. And then the number of additional stops. And then the arrival time. And then layover duration, total taxes, whether it’s refundable, price of first checked bag, baggage policies... OK, what? We’re back to just sending back the whole document with every query possible! For really large queries, this will be far too much data for the client to handle. And not to mention, if we only wanted to query one property, why do we need to deal with all this other crap??

Should we bite the bullet? Should we just make new endpoints for each possible combination of properties that we’d want to query, each with their own serializers? Should we add some complicated logic that conditionally sends over information somehow? Should we… use GraphQL?
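For a sense of what the “complicated logic” option might involve, here’s a rough sketch of a hand-rolled field filter in plain Ruby. Everything in it is an assumption for illustration: the comma-separated fields parameter, the allow-list, and the method names are mine, not part of Grape or ActiveModel::Serializer.

```ruby
require "json"

# Hypothetical allow-list of attributes a client may ask for.
ALLOWED_FIELDS = %w[
  id airline departs_at airport_departure airport_arrival
  total_price total_travel_duration
].freeze
DEFAULT_FIELDS = %w[id total_price].freeze

# Parses an assumed `fields` parameter (e.g. ?fields=id,total_price)
# and drops anything that is not on the allow-list.
def requested_fields(fields_param)
  return DEFAULT_FIELDS if fields_param.nil? || fields_param.strip.empty?

  fields_param.split(",").map(&:strip) & ALLOWED_FIELDS
end

# Returns only the requested keys from a flight record (a plain Hash here).
def serialize_flight(flight, fields_param)
  flight.slice(*requested_fields(fields_param))
end

flight = {
  "id" => "1298",
  "airline" => "Graphical Airlines",
  "total_price" => 292.60,
  "refundable" => false
}

# "refundable" is not on the allow-list, so it is silently dropped.
puts JSON.generate(serialize_flight(flight, "id,total_price,refundable"))
```

It works for flat records, but it’s one more homegrown convention to document, validate, and maintain, and it gets hairier as soon as nested objects show up.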

GraphQL Implementation

So, reminder on our initial use case: we want to query for the prices of a flight ticket over the last six weeks.

A key difference between GraphQL API implementations and REST implementations is that in GraphQL, the server operates on a single endpoint, and all requests should be made to this endpoint.

So for example, let’s say we had to query for Flights, Aircrafts, and Pilots. In a REST API world, we’d probably have endpoints like:

  • /api/flights,
  • /api/aircrafts, and
  • /api/pilots.

In the GraphQL world, all requests are made to /graphql. Then the server is responsible for cracking open the query, finding the data, and only returning what we asked for. But hOw dOeS iT kNow??

There are two key elements to a GraphQL server implementation: a schema and a root. The schema defines the object types and what we can query on those objects, and the root is a set of resolver functions that define how to find those objects.

Here’s an example of one of the simplest schemas you could build (using the GraphQL schema language).

type Query {
  pilotOfTheDay: String
}

Apparently the only thing we can query for in this schema is the pilot of the day. And this is how we’d query for it:

{
  pilotOfTheDay
}

Shocking! That said, the definition of how to resolve pilotOfTheDay would live in the root. An example implementation:

MY_FAKE_DB = {
  pilot_of_the_day: "Certainly not pat"
}.freeze

# A constant (unlike a local variable) is visible inside the method body.
def pilot_of_the_day
  MY_FAKE_DB[:pilot_of_the_day]
end
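To make the schema/root split a bit more concrete, here’s a toy sketch in plain Ruby of how a server could connect the two. To be clear, this is not how graphql-ruby (or any real GraphQL server) is implemented: SCHEMA, Root, and execute are hypothetical names, and the “query” is just a pre-parsed list of field names.

```ruby
# A toy stand-in for a schema: each queryable field name, mapped to the
# resolver method that knows how to find it.
SCHEMA = { "pilotOfTheDay" => :pilot_of_the_day }.freeze

# A toy root: one resolver method per queryable field.
class Root
  def pilot_of_the_day
    "Certainly not pat"
  end
end

# "Executes" a query by calling the resolver for each requested field.
# A real server would parse the query text and validate types first.
def execute(requested_fields, root)
  requested_fields.each_with_object({}) do |field, data|
    resolver = SCHEMA.fetch(field) { raise ArgumentError, "Unknown field: #{field}" }
    data[field] = root.public_send(resolver)
  end
end

p execute(["pilotOfTheDay"], Root.new)
```

The response only ever contains the fields that were asked for, and asking for a field the schema doesn’t know about fails loudly instead of quietly returning nothing.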

We could also add more interesting types into the mix or add parameters to our queries.

If you want to learn more about these concepts, the official GraphQL getting started guide is great. The examples in it use JavaScript, though the rest of this doc will be using Ruby.

So how can we take schemas and resolvers and apply them to our Rails apps? Lucky for us we’ve got a gem called graphql-ruby on our side. Let’s dive into an example implementation.

Adding the gem to our app and running its install generator (rails generate graphql:install) configures the /graphql and /graphiql endpoints for us (the latter can be used as a sandbox for testing!), so we just need to think about our types and queries. With the set-up complete, here is what our Flight type could look like:

module Types
  class FlightType < Types::BaseObject
    # required fields
    field :id, ID, null: false
    field :airline, String, null: false
    field :flight_number, String, null: false
    field :recorded_at, GraphQL::Types::ISO8601DateTime, null: false
    field :departs_at, GraphQL::Types::ISO8601DateTime, null: false
    field :arrives_at, GraphQL::Types::ISO8601DateTime, null: false
    field :total_price, Float, null: false

    # if a basic type like String or Boolean won't cut it,
    # we can make our own supporting types.
    field :airport_departure, Types::Enums::AirportCodeType, null: false
    field :airport_arrival, Types::Enums::AirportCodeType, null: false

    # optional fields
    field :flight_class, Types::Enums::FlightClassType, null: true
    field :is_nonstop, Boolean, null: true
    field :distance, Int, null: true
    field :duration, Int, null: true
    field :baggage_policy, [String], null: true
    field :refundable, Boolean, null: true
    field :base_fare, Float, null: true
    field :fees, Float, null: true
    field :ticket_changes_allowed, Boolean, null: true
    field :can_select_seat, Boolean, null: true
    field :boarding, String, null: true
    field :first_bag_fee, Float, null: true
    field :additional_fee_per_bag, Float, null: true
    field :stops, Int, null: true
    field :layover_duration, Int, null: true
    field :total_travel_duration, Int, null: true
  end
end

This might look like a lot, but all we’re doing is defining what fields are queryable on a Flight object and the type of each of those fields. By default these fields will resolve by either calling an underlying method by the same name (flight.total_price), or if the object is a hash, it’ll do a lookup (flight[:total_price]).
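Here’s a small plain-Ruby approximation of that default lookup, just to show the idea. The resolve_field helper and FlightRecord struct are hypothetical, and the order of the checks is my simplification; graphql-ruby’s real resolution logic handles more cases than this.

```ruby
# Approximates default field resolution: hashes get a key lookup
# (symbol first, then string), everything else gets a method call.
def resolve_field(object, field_name)
  if object.is_a?(Hash)
    object.fetch(field_name) { object[field_name.to_s] }
  elsif object.respond_to?(field_name)
    object.public_send(field_name)
  else
    raise ArgumentError, "Cannot resolve #{field_name} on #{object.class}"
  end
end

# Hypothetical stand-in for an ActiveRecord model.
FlightRecord = Struct.new(:id, :total_price)

flight_object = FlightRecord.new("1298", 292.60)
flight_hash   = { total_price: 292.60 }

puts resolve_field(flight_object, :total_price) # 292.6
puts resolve_field(flight_hash, :total_price)   # 292.6
```

Either way, the type definition stays the same; only the shape of the underlying object changes.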

Now let’s define how we can query for these flights by introducing the query root of this schema.

module Types
  class QueryType < GraphQL::Schema::Object
    field :flight, Types::FlightType, null: true do
      argument :id, ID, required: true
    end

    def flight(id:)
      # Our resolver function for finding info about a
      # single flight.
      #
      # This is an ActiveRecord example, but we could
      # replace this logic with anything we want.
      # We could use Elasticsearch, make a call to a
      # different API, whatever!
      Flight.find_by(id: id)
    end

    field :flights, [Types::FlightType], null: true do
      argument :airport_departure, Types::Enums::AirportCodeType, required: true
      argument :airport_arrival, Types::Enums::AirportCodeType, required: true
      argument :departs_at, GraphQL::Types::ISO8601DateTime, required: true
      argument :start_date, GraphQL::Types::ISO8601DateTime, required: false
      argument :end_date, GraphQL::Types::ISO8601DateTime, required: false
    end

    def flights(**query_options)
      # More complex queries could be defined in
      # a separate class.
      FlightQueryBuilder.build(**query_options)
    end
  end
end

The above class now defines two new queryable fields: one field for single flights (flight), and one field for multiple flights (flights). After each field, we write the resolver function that defines how to find the data for those fields.
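FlightQueryBuilder is left undefined above, so here’s one hypothetical shape it could take. This sketch filters an in-memory array so that it runs without a database; the extra records: argument, the hash-based records, and the sample prices are all made up for illustration, and a real version would more likely build an ActiveRecord or Elasticsearch query.

```ruby
# Hypothetical in-memory version of FlightQueryBuilder.
class FlightQueryBuilder
  # Returns the price snapshots that match the route and departure time,
  # optionally limited to those recorded within a date range.
  def self.build(records:, airport_departure:, airport_arrival:, departs_at:,
                 start_date: nil, end_date: nil)
    records.select do |record|
      record[:airport_departure] == airport_departure &&
        record[:airport_arrival] == airport_arrival &&
        record[:departs_at] == departs_at &&
        (start_date.nil? || record[:recorded_at] >= start_date) &&
        (end_date.nil? || record[:recorded_at] <= end_date)
    end
  end
end

# Made-up sample data.
departure = Time.utc(2020, 2, 1, 11, 30)
snapshots = [
  { airport_departure: "BOS", airport_arrival: "LAX", departs_at: departure,
    recorded_at: Time.utc(2019, 11, 24), total_price: 292.60 },
  { airport_departure: "BOS", airport_arrival: "LAX", departs_at: departure,
    recorded_at: Time.utc(2019, 12, 15), total_price: 310.25 },
  { airport_departure: "BOS", airport_arrival: "SFO", departs_at: departure,
    recorded_at: Time.utc(2019, 12, 15), total_price: 280.00 }
]

recent = FlightQueryBuilder.build(
  records: snapshots,
  airport_departure: "BOS",
  airport_arrival: "LAX",
  departs_at: departure,
  start_date: Time.utc(2019, 12, 1)
)
puts recent.map { |record| record[:total_price] }.inspect
```

Because start_date and end_date default to nil, leaving them out returns every snapshot for the route, which lines up with those arguments being optional on the flights field.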

After this set-up, we can now query for the price of flights from the last n weeks! Here’s one possible query (graphql-ruby camelizes field and argument names by default, so total_price is queried as totalPrice):

{
  flights(
    airportDeparture: BOS,
    airportArrival: LAX,
    departsAt: "2019-02-01T11:30:00.000Z"
  ) {
    id
    totalPrice
  }
}

And the client can get exactly what it expects—no more, and no less.
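From the client’s side, sending a query is just an HTTP POST to that single endpoint. Here’s a minimal sketch using Ruby’s standard library. The localhost host and port and the build_graphql_request helper are assumptions on my part, and the field is written as totalPrice on the assumption that graphql-ruby’s default camelization is in effect.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical host and port; only the /graphql path comes from the set-up above.
GRAPHQL_URI = URI("http://localhost:3000/graphql")

QUERY = <<~GRAPHQL
  {
    flight(id: "1298") {
      id
      totalPrice
    }
  }
GRAPHQL

# Builds (but does not send) the POST request a client would make.
# The JSON body with "query" and "variables" keys is the common convention
# for GraphQL over HTTP.
def build_graphql_request(uri, query, variables = {})
  request = Net::HTTP::Post.new(uri, "Content-Type" => "application/json")
  request.body = JSON.generate(query: query, variables: variables)
  request
end

request = build_graphql_request(GRAPHQL_URI, QUERY)
puts request.path
puts request.body

# To send it against a running server:
# response = Net::HTTP.start(GRAPHQL_URI.hostname, GRAPHQL_URI.port) do |http|
#   http.request(request)
# end
# puts JSON.parse(response.body)
```

Asking for different fields means changing the QUERY string; the URL and the request shape stay exactly the same.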

If we wanted to query for more fields in the future, we could just update the above query with more properties, or create a brand new query for just that use case. No API-side changes necessary! And if we wanted to start querying for Aircrafts or Pilots, we could simply implement additional Types and their respective resolvers, then be set to query away—and all the client needs to know about is our /graphql endpoint. Sweet!

That about sums up the crash course re: the server-side of things. If you’d like more info, here are some good resources:

Thanks for reading, and happy graphing!


This post is also cross-posted on the Salsify Engineering Blog.


Pattra is an artist, writer, & engineer based out of the Boston area.