Guides, Web

A Complete Guide to Hacking GraphQL

Introduction

I decided to make this guide due to the lack of material on this topic and my own struggles with GraphQL. Its purpose is to provide pentesters with the necessary tools to perform tests against GraphQL implementations. I encourage you to do further research and practice on your own with the references provided at the end.

What is GraphQL?

GraphQL is a language used for data query and manipulation, primarily used with APIs. It is used to handle data from a server to a client through the use of various types of operations.

It is currently open-source and there are many frameworks that use it or integrate with it. It is becoming more and more common in web applications nowadays, especially ones that use APIs.

Although it is technically an API, all queries through GraphQL are performed against a single endpoint, for example /graphql. Think of it as middleware between a web/mobile application and an API or other solution.

In terms of functionality, it is quite similar to an SQL database, as it allows to perform operations that
can retrieve or alter data.

GraphQL allows for One-to-one, One-to-many, Many-to-one, and Many-to-many relationships,
as well as combining multiple relationships, and relational mutations:

GraphQL has the following benefits:

  • Fast – Thanks to GraphQL’s structed data approach, it can retrieve only the data required for a certain task, and all within one operation, avoiding over-fetching and improving performance.
  • Easy to develop – Having only one endpoint makes developing with GraphQL much easier, as all developers have to do is write the required operations that can then be used throughout the application.
  • Scalable – GraphQL’s hierarchical relationship and single endpoint design makes it a very scalable solution. This also makes it easier for developers to create documentation for it.

GraphQL Basic Components

  • Schema – Used in GraphQL to define the shape of the data and available queries/mutations. Think of it as tables and fields within a database and the various relationships that allow them to talk to each other.
  • Queries – Used to fetch data, created in advance with specific structure and designed to return a specific number of fields. Think of them as a SELECT statement in SQL databases.
  • Mutations – As opposed to queries, they are used to modify data, and can also fetch results afterwards. Think of mutations as UPDATE statements in SQL databases.

Schema

GraphQL uses a human-readable schema definition language (or SDL) that defines the schema and stores it as a string. The GraphQL schema is a description of the data that can be requested from a
GraphQL endpoint.

It defines available queries, mutations, fragments, fields, and supported types:

Queries

Queries are used to fetch specific data from a GraphLQ instance. They are interactive, meaning they can be changed to shape the field objects they return upon execution.

The fields that are allowed to be retrieved when using a query can be limited so that unauthorised users cannot access sensitive information:

Mutations

Mutations are used to modify data within a GraphQL instance. Just like in queries, mutations can return the value of the newly mutated fields. Mutations can also contain multiple fields.

While query fields are executed in parallel, mutation fields run in series, meaning if we send two AddMoney mutations in one request, the first will finish before the second begins, preventing race conditions.

Operation Structure Explained

The structure used by GraphQL to compose operations can be observed below:

Variables

Although operations in GraphQL can be executed by inserting arguments inside the query string, this is not feasible when they need to be dynamic. GraphQL has a way to factor dynamic values out of the query and pass them as a separate dictionary through variables.

This is achieved by replacing static values in the query with $variable, declaring it as a variable accepted by the query, and passing it in the variables dictionary:

Fragments

Fragments are reusable units that let you construct sets of fields and include them in queries where needed. This avoids writing very repetitive queries. The concept of fragments is often used to
split complicated application datasets.

It is possible for fragments to access variables declared in the query or mutation:

Directives

Directives are attached to fields to affect queries. They are useful where you otherwise would need to
do string manipulation to change a query. The core GraphQL specification includes two directives:

  • @include (if: Boolean) Include this field if true.
  • @skip (if: Boolean) Skip this field if true.

A common use of directives is to implement permissions:

Authentication in GraphQL

There are many ways to perform authentication in GraphQL, however the most common one is
JSON Web Tokens.

JWT is a standard in which information can be securely transmitted between two entities through a compact JSON object. It is generally managed by an authorisation server; companies often use third-party services such as Auth0 to handle JWT tokens.

JSON Web Tokens

JWT is a standard used to safely transmit information (often user identity) between parties. It is a small and simple token that is used by protocols such as OpenID and OAuth 2.0 to represent identity to an application or access token for API authorization.

It is a format that can be signed and/or encrypted. When signed it uses JSON Web Signature (JWS), when
encrypted it uses JSON Web Encryption (JWE). When encrypted, the body cannot be viewed without the encryption key.

JSON web tokens are made of three base64-encoded, dot-separated components: the header, the payload and the signature.

  • The header defines the type of the token (JWT) and the encryption algorithm used.
  • The payload contains a set of fields such as iss (issuer), exp (expiration) and sub (subject) as well as extra fields that can identify the user like user ID, role or company.
  • The signature is used to sign the token and prevent its tampering. It is normally either a private key or a secret.

Enumerating GraphQL

The single endpoint approach reduces the effort required to enumerate existing operations. It contains default functionality that is designed to help developers and it is often not disabled in production.

Additionally, it is meant to be extremely easy to work with, therefore it will advise when running an invalid operation and it will help you build the right query structure.

Identifying a GraphQL Endpoint

Look in your Burp Suite HTTP history for any of the GraphQL keywords such as query, mutation etc. Perform a directory bruteforce attack against the web application, some common GraphQL endpoints are /graphql, /graphiql, /gql.

The Nmap GraphQL Introspection NSE script can also be used for this task, and it contains a comprehensive list of potential GraphQL endpoints.

Fingerprinting GraphQL

graphw00f sends a mix of benign and malformed queries to determine the GraphQL engine in use. It provides insights into the security defences each technology uses and whether are on by default, to have an idea of how the instance can be attacked.

This is possible due to how GraphQL responds to specially crafted requests:

Introspection

Introspection allows to query a GraphQL server for information about the schema in use. It can enumerate the available types, fields, queries, mutations, fragments etc. It can often be used unauthenticated.

It is normally enabled by default, although new frameworks such as Apollo are disabling it in production.

What if Introspection is disabled?

If introspection is disabled, there may be another way to enumerate the schema in use.

Error-Based Discovery

Thanks to GraphQL’s validation capability as well as verbose errors by default, information about the
schema can be easily enumerated by providing incorrect values.

When providing an invalid operation, variable, or value, GraphQL will suggest operations that match a certain portion of the one provided. This helps construct valid operations to further enumerate GraphQL.

Tools such as Clairvoyance or ShapeShifter can be used to automate this type of attack.

Enumerate Schema through UI

Use the application through its UI and observe the operations being made within Burp’s HTTP history to get an idea of the underlying schema. This can be very tedious, though there may be no other options.

If introspection and verbose errors are disabled, it will only be possible to enumerate operations that are accessible through the UI. Luckily, thanks to how GraphQL works, filtering by the endpoint in Burp Suite narrows down to all GraphQL operations.

By now you hopefully have a lot of information about the target schema. These can be your next steps:

  • Go through it to get an idea of how the various operations and sets of data are used.
  • Test all of the operations and endpoints in a “normal” scenario to understand how they work.
  • Think of ways these may have been poorly implemented and could therefore be abused.

Attacking GraphQL

Pretty much all of the REST API vulnerabilities found in the OWASP Security Top 10 are also applicable to GraphQL, in particular:

  • Broken object-level authorization (IDOR)
  • Broken authentication
  • Excessive data exposure
  • Lack of resources and rate limiting
  • Broken function-level authorization
  • Mass assignment
  • Injection Attacks

Information Disclosure

Information disclosure is one of the most common vulnerabilities in GraphQL. It can arise from:

  • Improper or missing access controls.
  • Observable response discrepancy.
  • Use of hard-coded credentials or tokens.
  • Unnecessary exposure of sensitive data.
  • Verbose errors or stack trace.

Insecure Direct Object Reference

If access controls are poorly implemented, it could allow unauthorised data access.

  • Look for operations that use identifiers as one of the variables and try to change them.
  • If incremental numbers have been used as identifiers this makes exploitation much easier.
  • Automated GraphQL tools or Burp intruder can be used to fuzz these values

Injection Attacks

As a middleware application, GraphQL could be used to ingest malicious data. This can introduce injection attacks into the applications sitting on the other end.

This type of attack could lead to XSS, SQL Injection, or command injection:

Denial of Service

Although GraphQL is very well optimized and designed to return only specific sets of data, it won’t necessarily stop attackers from abusing a badly configured implementation.

For example, attacks leveraging nested queries to loop through the same data over and over again could cause GraphQL to hang, or potentially run out of resources and eventually crash.

DOS in GraphQL can manifest itself in several forms:

  • Batch Query Attack – when batched requests are processed one after the other.
  • Deep Recursion Query Attack – when types reference each other, potentially resulting in an infinite recursive query.
  • Resource Intensive Query Attack – when computationally expensive queries can be replayed multiple times in a row.
  • Field Duplication Attack – when the same field can be requested in the same query, increasing the load on the server.
  • Aliases based Attack – when building a query with multiple aliases that call the same query or mutation.

Batch Query Attack Example:

Authentication/Authorisation Issues

Improper or missing authorization checks could allow users to perform unauthorised actions. While testing, ensure tokens/keys/cookies are being validated on each request and that permissions are implemented appropriately.

If OAuth is being used, there could be issues with the way the authorization server performs authentication.

JWT Vulnerabilities

  • If a weak key/passphrase has been used for the JWT, it can potentially be cracked, allowing to modification of its contents.
  • They are sometimes susceptible to attacks that could allow tampering of their contents by modifying the encryption algorithm used or providing a blank password/key.
  • Developers often forget to add expiry validation to ensure tokens are being invalidated.

JWT Tool can be used to automate JWT testing.

Mass Assignment

Mass assignment is a vulnerability where requests are abused to access or modify data that the user should not have access to. This can be abused by adding extra, unintended fields to the operation.

Knowledge of the underlying schema is often required for this attack to work.

Cross-Site Request Forgery

GraphQL is generally safe from CSRF as long as it uses proper JWT or other token/key-based authentication methods. There are still circumstances in which CSRF could be exploited:

  • Cookie-based authentication is in use.
  • No built-in CSRF protection.
  • No CSRF tokens or double submit cookies in use.
  • No origin verification is in place (CORS).
  • No user interaction-based protection (Re-Authentication, CAPTCHA, OTP etc.).
  • GET requests or queries used for state-changing operations.

Server-Side Request Forgery

The implementation may have functionality for fetching or pushing data to an external or internal service by passing its URL within a parameter. If appropriate controls (i.e. whitelisting/sanitisation) have not been implemented, an attacker may be able to temper with the URL.

This may allow to interact with services that are not directly exposed on the internet. Additionally, attacks performed this way will originate from the vulnerable GraphQL application.

Miscellaneous Issues

Other issues affecting GraphQL can be:

  • Lack of rate-limiting.
  • Lack of account lockout and weak password complexity requirements
  • Arbitrary file write/deletion.
  • File path traversal.
  • Information disclosure through stack trace errors.

Useful GraphQL Tools

  • Playground is one of the most common GraphQL IDEs. It includes context-aware autocompletion, error highlighting, and the ability to configure multiple projects and endpoints.
  • Altair is a feature-rich GraphQL IDE that doesn’t require running a web server. It supports:
    • Advanced schema docs search and management
    • Autocompletion and autofill of queries
    • Prerequest scripts
    • Importing and exporting schemas
    • Queries and a plugin system.
  • InQL can run an Introspection query to obtain the endpoint’s schema. It can also inspect the introspection results and generate documentation in different formats and templates for all known basic data types. It can be used as a stand-alone script or as a Burp Suite extension.
  • GraphQL Raider is a Burp Suite Extension for testing endpoints implementing GraphQL. Operations and variables are extracted from the unreadable JSON body and can then be used by Burp’s active scanner to insert payloads into and detect vulnerabilities.
  • GraphQL Voyager allows to access a GraphQL endpoint and visually explore the available types, queries, mutation etc. through an interactive graph by simply pasting the introspection results.
  • GraphQL Path-Enum is a tool that lists the different ways of reaching a given type. It takes the introspection results as an input. As most schemas have loops and have an infinite number of paths, it doesn’t list them all, but it does an exhaustive listing.
  • GraphQLMap is a scripting engine to interact with GraphQL. It has the following features:
    • Dumping a GraphQL schema.
    • Executing GraphQL operations.
    • Autocomplete operations.
    • GraphQL field fuzzing.
    • SQL/NoSQL injection inside a field.
  • Clairvoyance allows to obtain a GraphQL endpoint’s full schema even when introspection is disabled. It does so by exploiting verbose errors and fuzzing values through a wordlist or valid values from HTTP traffic. It produces a JSON schema, suitable for import into other tools like GraphQL Voyager.

Conclusion

GraphQL Recommendations

  • Ensure authorization tokens/keys are generated and handled securely.
  • Disable introspection and verbose errors in prod.
  • Require MFA/CAPTCHA for critical operations.
  • Review access controls to ensure users have appropriate access.
  • Sanitise all input before it is sent by GraphQL.
  • Implement DOS protections such as query ratelimiting, size limit, disabling batched requests, max_depth, query cost analysis, caching, field deduplication, query middleware.

GraphQL CTF Challenges

Real-life Example GraphLQ Vulnerabilities:

References