Lessons learned from handling JWT on mobile

Overview

Modern mobile apps are more complicated than they used to be back in the early days and developers have to face a variety of interesting problems. While we’ve put in our two cents on some of them in previous articles, this one is about authorization and what we have learned by handling JWT on mobile at Just Eat.
When it comes to authorization, it’s standard practice to rely on OAuth 2.0 and the companion JWT (JSON Web Token). We found this important topic was rarely discussed online while much attention was given to new proposed implementations of network stacks, maybe using recent language features or frameworks such as Combine.
We’ll illustrate the problems we faced at Just Eat with JWT parsing, usage, and (most importantly) refreshing. You should be able to learn a few things about making your app more stable by reducing the chance of unauthorized requests, allowing your users to virtually always stay logged in.

What is JWT

JWT stands for JSON Web Token and is an open industry standard used to represent claims transferred between two parties. A signed JWT is known as a JWS (JSON Web Signature); in fact, a JWT has to be either a JWS or a JWE (JSON Web Encryption). RFC 7515, RFC 7516, and RFC 7519 describe the various fields and claims in detail. What is relevant for mobile developers is the following:

  • A JWT is composed of 3 dot-separated parts: Header, Payload, Signature.
  • The Payload is the only part relevant to the app. The Header identifies which algorithm is used to generate the signature, and since there are reasons for not verifying the signature client-side, the Signature part is irrelevant too.
  • A JWT has an expiration date. Expired tokens should be renewed/refreshed.
  • A JWT can contain any amount of extra information specific to your service.
  • It’s common practice to store JWTs in the app keychain.

Here is a valid and very short token example, courtesy of jwt.io, which we recommend using to easily decode tokens for debugging purposes. It shows 3 fragments (base64url encoded) joined by dots.

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyLCJleHAiOjE1Nzc3NTA0MDB9.7hgBhNK_ZpiteB3GtLh07KJ486Vfe3WAdS-XoDksJCQ

The only field relevant to this document is exp (Expiration Time), part of the Payload (the second fragment). This claim identifies the time after which the JWT must not be accepted: to accept a JWT, the current date/time must be before the expiration time listed in the `exp` claim. It’s accepted practice for implementers to allow for some small leeway, usually no more than a few minutes, to account for clock skew.
N.B. Some API calls might require the user to be logged in (user-authenticated calls), and others don’t (non-user-authenticated calls). JWT can be used in both cases, marking a distinction between Client JWT and User JWT that we will refer to later on.
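
As an illustration of the expiry check, here is a minimal sketch of how a client might decide that a token needs refreshing. It assumes the `exp` value has already been decoded from the Payload. The `isExpired` helper and its 60-second default are hypothetical, not taken from the Just Eat codebase. Note that on the client the leeway is applied in the cautious direction: the token is given up slightly early, so a device clock that runs behind the server does not keep using a stale token.

```swift
import Foundation

/// Returns true when a token with the given `exp` claim (seconds since 1970)
/// should be treated as expired on the client.
/// `leeway` is an illustrative safety margin against clock skew.
func isExpired(exp: TimeInterval, now: Date = Date(), leeway: TimeInterval = 60) -> Bool {
    let expiryDate = Date(timeIntervalSince1970: exp)
    return now.addingTimeInterval(leeway) >= expiryDate
}

// `exp` of the example token above.
let exp: TimeInterval = 1_577_750_400
let tenMinutesBefore = Date(timeIntervalSince1970: exp - 600)
let thirtySecondsBefore = Date(timeIntervalSince1970: exp - 30)

print(isExpired(exp: exp, now: tenMinutesBefore))    // false
print(isExpired(exp: exp, now: thirtySecondsBefore)) // true: inside the 60-second leeway
```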

The token refresh problem

By far the most significant problem we had in the past was the renewal of the token. This seems to be something taken for granted by the mobile community, but in reality, we found it to be quite a fragile part of the authentication flow. If not done right, it can easily cause your customers to end up being logged out, with the consequent frustration we all have experienced as app users.
The Just Eat app makes multiple API calls at startup: it fetches the order history to check for in-flight orders, fetches the most up-to-date consumer details, etc. If the token is expired when the user runs the app, a nasty race condition could cause the same refresh token to be used twice, causing the server to respond with a 401 and subsequently logging the user out on the app. This can also happen during normal execution when multiple API calls are performed very close to each other and the token expires prior to those.
It gets trickier if the client and server clocks are significantly out of sync: the client might believe it holds a valid token when, from the server’s point of view, it has already expired.
The following diagram should clarify the scenario.


[Diagram: concurrent requests sequence diagram]

Common misbehavior

I couldn’t find a company (regardless of size) or indie developer who had implemented a reasonable token refresh mechanism. The common approach seems to be to refresh the token whenever an API call fails with 401 Unauthorized. This not only causes an extra call that could be avoided by locally checking whether the token has expired, but also opens the door to the race condition illustrated above.
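
To make the race concrete, here is a toy simulation of it. It assumes a server that treats each refresh token as single-use (as described above, reusing one yields a 401). `FakeAuthServer` and the token strings are made up for the example.

```swift
import Foundation

/// Toy stand-in for an OAuth server where each refresh token
/// can be exchanged exactly once.
final class FakeAuthServer {
    private var validRefreshTokens: Set<String> = ["refresh-1"]
    private var counter = 1

    /// Returns a new (access, refresh) pair, or nil to represent a 401.
    func refresh(using refreshToken: String) -> (access: String, refresh: String)? {
        guard validRefreshTokens.remove(refreshToken) != nil else { return nil }
        counter += 1
        let next = "refresh-\(counter)"
        validRefreshTokens.insert(next)
        return (access: "access-\(counter)", refresh: next)
    }
}

let server = FakeAuthServer()
// Two API calls fail with 401 at nearly the same time; both read the same
// stored refresh token before either has persisted the new one.
let storedRefreshToken = "refresh-1"
let firstCall = server.refresh(using: storedRefreshToken)
let secondCall = server.refresh(using: storedRefreshToken)

print(firstCall != nil)  // true: the first refresh succeeds
print(secondCall != nil) // false: the token was already consumed, i.e. a 401
```

In a real app the second failure is what ends up logging the user out, even though the first call obtained perfectly good credentials.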

Avoid race conditions when refreshing the token 🚦

We’ll explain the solution with some technical details and code snippets but what’s more important is that the reader understands the root problem we are solving and why it should be given the proper attention.
The more we thought about it, the more we convinced ourselves that the best way to shield ourselves from race conditions was by using threading primitives when scheduling async requests to fetch a valid token. This means that all the calls would be regulated via a filter that holds subsequent calls back until a valid token is retrieved, either from local storage or, if a refresh is needed, from the remote OAuth server.
We’ll show examples for iOS, so we’ve chosen dispatch queues and semaphores (using GCD); fancier and more abstract ways of implementing the solution might exist – in particular by leveraging modern FRP techniques – but ultimately the same primitives are used.
For simplicity, let’s assume that only user-authenticated API requests need to provide a JWT, commonly put in the Authorization header:

Authorization: Bearer <jwt-token>
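
Attaching that header is the easy part. As a small sketch (the `authorized` helper and the URL are placeholders, not part of our network client), it could look like this:

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking // URLRequest lives here on non-Apple platforms
#endif

/// Returns a copy of the request carrying the JWT as a Bearer token.
func authorized(_ request: URLRequest, with jwt: String) -> URLRequest {
    var request = request
    request.setValue("Bearer \(jwt)", forHTTPHeaderField: "Authorization")
    return request
}

// Placeholder URL, not a real endpoint.
let url = URL(string: "https://api.example.com/orders")!
let request = authorized(URLRequest(url: url), with: "<jwt-token>")
print(request.value(forHTTPHeaderField: "Authorization") ?? "missing") // Bearer <jwt-token>
```

The hard part, obtaining a JWT that is guaranteed to be valid at the time of the call, is what the rest of this section deals with.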

The code below implements the “Get valid JWT” box from the following flowchart. The logic within this box must run in mutual exclusion, which our solution achieves by combining a serial queue and a semaphore.


[Flowchart: perform request flow]

Here is just the minimum amount of code (Swift) needed to explain the solution.

import Foundation

typealias Token = String
typealias AuthorizationValue = String

struct UserAuthenticationInfo {
    let bearerToken: Token // the JWT
    let refreshToken: Token
    let expiryDate: Date // computed on creation from 'exp' claim

    var isValid: Bool {
        return expiryDate > Date()
    }
}

protocol TokenRefreshing {
    func refreshAccessToken(_ refreshToken: Token, completion: @escaping (Result<UserAuthenticationInfo, Error>) -> Void)
}

protocol AuthenticationInfoStorage {
    var userAuthenticationInfo: UserAuthenticationInfo? { get }
    func persistUserAuthenticationInfo(_ authenticationInfo: UserAuthenticationInfo?)
    func wipeUserAuthenticationInfo()
}

class AuthorizationValueProvider {
    private let authenticationInfoStore: AuthenticationInfoStorage
    private let tokenRefreshAPI: TokenRefreshing

    // Serial queue: requests for a token are dequeued one at a time.
    // The label is a placeholder; use any unique string.
    private let queue = DispatchQueue(label: "authorization-value-provider", qos: .userInteractive)
    // Guards the critical section across the async refresh call.
    private let semaphore = DispatchSemaphore(value: 1)

    init(tokenRefreshAPI: TokenRefreshing, authenticationInfoStore: AuthenticationInfoStorage) {
        self.tokenRefreshAPI = tokenRefreshAPI
        self.authenticationInfoStore = authenticationInfoStore
    }

    func getValidUserAuthorization(completion: @escaping (Result<AuthorizationValue, Error>) -> Void) {
        queue.async {
            self.getValidUserAuthorizationInMutualExclusion(completion: completion)
        }
    }
}

Before performing any user-authenticated request, the network client asks an AuthorizationValueProvider instance to provide a valid user Authorization value (the JWT). It does so via the async method getValidUserAuthorization, which uses a serial queue to handle the requests. The bulk of the work happens in getValidUserAuthorizationInMutualExclusion.

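// Hypothetical error for the 'missing authorization' case; substitute your app's own error type.
enum AuthorizationError: Error {
    case missingAuthorization
}

// Declared in the same file as the class, so its private members are visible here.
extension AuthorizationValueProvider {
    // `isClientError` is assumed to be a helper your network layer provides on its
    // errors, returning true for HTTP 4xx responses; it is not part of Foundation.
    private func getValidUserAuthorizationInMutualExclusion(completion: @escaping (Result<AuthorizationValue, Error>) -> Void) {
        semaphore.wait() // enter the critical section
        guard let authenticationInfo = authenticationInfoStore.userAuthenticationInfo else {
            semaphore.signal()
            completion(.failure(AuthorizationError.missingAuthorization))
            return
        }
        if authenticationInfo.isValid {
            semaphore.signal()
            completion(.success(authenticationInfo.bearerToken))
            return
        }
        // The token has expired: refresh it and signal only once the outcome is known.
        tokenRefreshAPI.refreshAccessToken(authenticationInfo.refreshToken) { result in
            switch result {
            case .success(let authenticationInfo):
                self.authenticationInfoStore.persistUserAuthenticationInfo(authenticationInfo)
                self.semaphore.signal()
                completion(.success(authenticationInfo.bearerToken))
            case .failure(let error) where error.isClientError:
                // The server rejected the refresh token: the user has to log in again.
                self.authenticationInfoStore.wipeUserAuthenticationInfo()
                self.semaphore.signal()
                completion(.failure(error))
            case .failure(let error):
                // Server or connectivity error: keep the stored tokens.
                self.semaphore.signal()
                completion(.failure(error))
            }
        }
    }
}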

The method could fire off an async call to refresh the token, and this makes the usage of the semaphore crucial. Without it, the next request to AuthorizationValueProvider would be popped from the queue and executed before the remote refresh completes.
The semaphore is initialised with a value of 1, meaning that only one thread can access the critical section at a given time. We make sure to call wait at the beginning of the execution and to call signal only when we have a result and are therefore ready to leave the critical section.
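
The effect of the queue-plus-semaphore combination can be observed with a stripped-down, self-contained sketch. `TokenGate` is a hypothetical stand-in for the provider: it has no storage protocol and no error handling, and the remote refresh is simulated with a delay.

```swift
import Foundation

/// Simplified stand-in for the provider: one serial queue plus one semaphore.
final class TokenGate {
    private let queue = DispatchQueue(label: "example.token-gate") // serial by default
    private let semaphore = DispatchSemaphore(value: 1)
    private var storedToken: String?      // only touched inside the critical section
    private(set) var refreshCount = 0

    func validToken(completion: @escaping (String) -> Void) {
        queue.async {
            self.semaphore.wait()         // enter the critical section
            if let token = self.storedToken {
                self.semaphore.signal()
                completion(token)
                return
            }
            // Simulated remote refresh that completes later on another queue.
            DispatchQueue.global().asyncAfter(deadline: .now() + 0.1) {
                self.refreshCount += 1
                let fresh = "token-\(self.refreshCount)"
                self.storedToken = fresh
                self.semaphore.signal()   // leave only once the new token is stored
                completion(fresh)
            }
        }
    }
}

let gate = TokenGate()
let group = DispatchGroup()
let lock = NSLock()
var received: [String] = []

for _ in 0..<5 {                          // five "API calls" starting together
    group.enter()
    gate.validToken { token in
        lock.lock(); received.append(token); lock.unlock()
        group.leave()
    }
}
group.wait()
print(gate.refreshCount)                  // 1: a single refresh served all five callers
print(received)                           // five copies of "token-1"
```

Removing the semaphore from this sketch lets the serial queue move on to the next caller while the refresh is still in flight, which results in several refreshes instead of one.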
If the token found in the local store is still valid, we simply return it; otherwise, it’s time to request a new one. In the latter case, if all goes well, we persist the token locally and allow the next request to access the method. In the case of an error, we should be careful and wipe the token only if the error is a legit client error (4xx range). This also includes the usage of a refresh token that is no longer valid, which could happen, for instance, if the user resets the password on another platform/device.
It’s critical not to delete the token from the local store in the case of any other error, such as a 5xx or Foundation’s common NSURLErrorNotConnectedToInternet (-1009), or else the user would unexpectedly be logged out.
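
One possible way to encode that rule is sketched below. `RefreshError` and `shouldWipeStoredTokens` are hypothetical names; the actual error type depends on your network layer.

```swift
import Foundation

/// Hypothetical error type for the refresh call.
enum RefreshError: Error {
    case http(statusCode: Int)       // the server replied with a non-2xx status
    case transport(code: Int)        // the request never got a reply, e.g. -1009

    /// True only for 4xx replies, i.e. the server rejected the refresh token itself.
    var isClientError: Bool {
        if case .http(let statusCode) = self {
            return (400..<500).contains(statusCode)
        }
        return false
    }
}

/// Decides whether a failed refresh should log the user out.
func shouldWipeStoredTokens(after error: RefreshError) -> Bool {
    return error.isClientError
}

print(shouldWipeStoredTokens(after: .http(statusCode: 401)))  // true: refresh token rejected
print(shouldWipeStoredTokens(after: .http(statusCode: 503)))  // false: server trouble, keep the tokens
print(shouldWipeStoredTokens(after: .transport(code: -1009))) // false: the device is offline
```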
It’s also important to note that the same AuthorizationValueProvider instance must be used by all the calls: using different ones would mean using different queues making the entire solution ineffective.
It seemed clear that the network client we developed in-house had to embrace JWT refresh logic at its core so that all the API calls, even new ones added in the future, would make use of the same authentication flow.

General recommendations

Here are a couple more (minor) suggestions we think are worth sharing since they might save you implementation time or influence the design of your solution.

Correctly parse the Payload

Another problem, quite trivial but seemingly not discussed much, is that parsing the JWT can fail in some cases. In our case, this was related to the base64 decoding function and to “adjusting” the base64url payload so it could be parsed correctly. In some implementations of base64, the padding character is not needed for decoding, since the amount of missing padding can be calculated from the length, but in Foundation’s implementation it is mandatory. This caused us some head-scratching and this StackOverflow answer helped us.
The solution is – more officially – stated in RFC 7515 – Appendix C and here is the corresponding Swift code:

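import Foundation

// Converts a base64url fragment (as found in a JWT) into padded base64
// that Foundation's Data(base64Encoded:) can decode.
func base64String(_ input: String) -> String {
    var base64 = input
        .replacingOccurrences(of: "-", with: "+")
        .replacingOccurrences(of: "_", with: "/")
    switch base64.count % 4 {
    case 2:
        base64 = base64.appending("==")
    case 3:
        base64 = base64.appending("=")
    default:
        break
    }
    return base64
}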

Most developers rely on external libraries to ease the parsing of the token but, as we often do, we implemented our solution from scratch, without relying on a third-party library. Nonetheless, we feel JSONWebToken by Kyle Fuller is a very good one and it seems to implement JWT faithfully to the RFC, clearly including the necessary base64 decode function.
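
Putting the pieces together, here is a minimal sketch of decoding the Payload of the example token from the “What is JWT” section and reading its `exp` claim. The `Claims` struct and `decodeClaims` function are illustrative names, and the signature is deliberately not verified.

```swift
import Foundation

/// Only the claim this article cares about; other claims are ignored by the decoder.
struct Claims: Decodable {
    let exp: TimeInterval
}

/// Decodes the Payload fragment of a JWT without verifying the signature.
func decodeClaims(fromJWT token: String) -> Claims? {
    let fragments = token.split(separator: ".", omittingEmptySubsequences: false)
    guard fragments.count == 3 else { return nil }

    // base64url -> base64, then restore the padding Foundation insists on.
    var base64 = String(fragments[1])
        .replacingOccurrences(of: "-", with: "+")
        .replacingOccurrences(of: "_", with: "/")
    let remainder = base64.count % 4
    if remainder > 0 {
        base64 += String(repeating: "=", count: 4 - remainder)
    }

    guard let data = Data(base64Encoded: base64) else { return nil }
    return try? JSONDecoder().decode(Claims.self, from: data)
}

let token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyLCJleHAiOjE1Nzc3NTA0MDB9.7hgBhNK_ZpiteB3GtLh07KJ486Vfe3WAdS-XoDksJCQ"
if let claims = decodeClaims(fromJWT: token) {
    print(Date(timeIntervalSince1970: claims.exp)) // 2019-12-31 00:00:00 +0000
}
```

Computing the expiry date once, when the token is received, and storing it next to the token avoids repeating this decoding before every request.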

Handle multiple JWT for multiple app states

As previously stated, when JWT is also used as the authentication method for non-user-authenticated calls, we need to cater for at least 3 states, shown in the following enum:

enum AuthenticationStatus {
     case notAuthenticated
     case clientAuthenticated
     case userAuthenticated
}

On a fresh install, we can expect to be in the .notAuthenticated state, but as soon as the first API call is ready to be performed, a valid Client JWT has to be fetched and stored locally (at this stage, other authentication mechanisms are used, most likely Basic Auth), moving to the .clientAuthenticated state. Once the user completes the login or signup procedure, a User JWT is retrieved and stored locally (but separately from the Client JWT), entering the .userAuthenticated state, so that in the case of a logout we are left with a (hopefully still valid) Client JWT.
In this scenario, almost all transitions are possible:


[Diagram: authorization states]


A couple of recommendations here:

  • if the user is logged in, it is important to use the User JWT also for the non-user-authenticated calls, as the server may personalise the response (e.g. the list of restaurants in the Just Eat app)
  • store both Client and User JWT, so that if the user logs out, the app is left with the Client JWT ready to be used to perform non-user-authenticated requests, saving an unnecessary call to fetch a new token
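
Both recommendations can be captured in a small model. The sketch below is hypothetical: `TokenStore` keeps the tokens in memory, whereas a real app would back it with the keychain.

```swift
enum AuthenticationStatus {
    case notAuthenticated
    case clientAuthenticated
    case userAuthenticated
}

/// Hypothetical in-memory store holding the two tokens separately.
struct TokenStore {
    var clientJWT: String?
    var userJWT: String?

    /// The state is derived from which tokens are present.
    var status: AuthenticationStatus {
        if userJWT != nil { return .userAuthenticated }
        if clientJWT != nil { return .clientAuthenticated }
        return .notAuthenticated
    }

    /// Prefers the User JWT whenever there is one, even for calls that do not
    /// require a logged-in user; returns nil if no suitable token is stored.
    func jwt(forUserAuthenticatedCall requiresUser: Bool) -> String? {
        if let userJWT = userJWT { return userJWT }
        return requiresUser ? nil : clientJWT
    }

    /// Logging out drops only the User JWT, leaving the Client JWT in place.
    mutating func logOut() {
        userJWT = nil
    }
}

var store = TokenStore()
print(store.status)                                         // notAuthenticated
store.clientJWT = "client-jwt"
store.userJWT = "user-jwt"
print(store.jwt(forUserAuthenticatedCall: false) ?? "none") // user-jwt
store.logOut()
print(store.status)                                         // clientAuthenticated
```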

Conclusion

In this article, we’ve shared some learnings from handling JWT on mobile that are not commonly discussed within the community.
As a good practice, it’s always best to hide complexity and implementation details. Baking the refresh logic described above into your API client is a great way to spare developers from dealing with complex logic to provide authorization, and it makes all the API calls go through the same authentication mechanism. Consumers of an API client should not be able to access the JWT, as it’s not their concern to use it or to fiddle with it.
We hope this article helps to raise awareness on how to better handle the usage of JWT on mobile applications, in particular making sure we always do our best to avoid accidental logouts to provide a better user experience.

  1. Hi, nice article. As you mentioned in the beginning of this post, this topic is barely discussed online. To be honest, this is the most complete post I found so far on this topic. I was working on the exact same problem last week for two projects. One for a brand new app for my work and a personal project for which I also develop the API. For the work project the backend engineers used access/refresh tokens (non JWT) and for my personal project I used access/refresh JWT tokens. I ended up with a similar solution to yours using semaphores and it seems to work fine so far. I tested it extensively with 10 concurrent requests. I only performed this as a test. The apps will never perform so many requests at once. Reading your post, I have a question about your solution. How do you handle revocation of the access token? From what I understood reading your post, you seem to rely entirely on the expiration date of the JWT token. You are correct to assume that a JWT token will always be invalid after the expiration date but the auth server could also revoke a JWT token before it expires. This could happen for a lot of reasons. In that scenario, if you perform 2 or more requests at once, all of them will get a 401. Also, for that case, there is no way to retry the requests that received 401. Am I missing something in your solution? I would really enjoy discussing this topic further with you.

    Best regards,
    Spyros

    1. Hi Spyros, thanks for your comment.
      There are more details to our solution and we haven’t exposed them all here.

      In the case of receiving a 401, which includes the server having revoked the auth token (for whatever reason), we do the following:

      1. if it’s a client-authenticated call, we request a new client JWT, then (if successful) retry the original call
      2. if it’s a user-authenticated call, we force a token refresh (using the refresh token we have) as a last resort, then (if successful) retry the original call

      In the second case, if the token refresh fails the user has to log back in again by design (see the diagram).
      You are right there, we have no critical section for the “force token refresh”, meaning that multiple calls would re-use the same refresh token eventually logging the user out.
      This is a problem that I believe would mainly occur when the server revokes the token, and for user-authenticated calls the user would have to log back in anyway, so we think it’s acceptable.

      Also, our retry logic goes as follows:
      – once if the error is `unauthorized`
      – up to 3 times (or remotely overridden using JustTweak) if it’s a server error (5xx)
      – 0 times otherwise

      1. Hi Alberto, thank you very much for your reply and for answering my questions. Yes, the scenario I described will rarely happen in your case, if ever. I decided to go with a different logic in my case. I send requests to the APIClient and I keep them in a private queue. I decided to not rely on the expiration date of the JWT. If one request receives a 401, then I lock the critical section with a semaphore like you did. Inside that critical section I do the following things.
        1) I change the internal status of the APIClient to indicate that it’s refreshing the token. It’s just a simple enum with 2 cases.
        2) I cancel every request that already started running that requires authentication and I add it in a special list.
        3) I send the request to refresh the token.

        Every new request that tries to run during the time that the APIClient refreshes the token and requires authentication is added straight to the special list I mentioned above. The status of the APIClient is checked using the same semaphore I mentioned above so there is no problem with the critical sections. When the refresh request returns, I signal the semaphore using a defer statement. If the refresh was successful, I toggle the status of the APIClient and restart every pending request in that special list. If the refresh failed, I stop everything and I log out the user.

        I didn’t mention every last detail of my code but I hope you get the idea.

        Thank you for providing your retry logic. I wasn’t sure how many times to retry a request if it’s a server error.

  2. Thanks for sharing your process! Just something that I bumped into a lot and that you could improve: use defer { semaphore.signal() } instead of signaling at every exit point of a scope. Other than that, great write-up! Thank you 🙂
