I recently investigated and fixed a performance issue with user authentication in our Spring Boot-based API service.

Our users started complaining about random 504s from our service. We checked our thread pool config, worker threads, database connection pool size and so on, and everything looked normal.

The slow transactions in NewRelic gave us a clue:

[Image: NewRelic transaction breakdown table for a slow transaction]

It showed that the request had spent 20 seconds in org.springframework.security.web.authentication.www.BasicAuthenticationFilter.doFilter.

This is unusual, and it doesn’t point to the application class responsible. To make NewRelic collect in-depth traces, we followed this guide and added @Trace to any suspicious methods in our application, such as the ones used in our authentication filters and providers. Once this was done, we redeployed the app.

Once the application was redeployed, we got additional tracing info from NewRelic pinpointing where the application was actually stuck. It was in one of our provider classes that implements org.springframework.security.authentication.AuthenticationProvider.

This class does nothing but use bcrypt to compare the supplied API token against the ones in our database. We load all of the user’s tokens from the DB and compare each one against the given token until we find a match.

bcrypt returns a different hash each time because it incorporates a different random value into the hash. This is known as a “salt”. It prevents people from attacking your hashed passwords with a “rainbow table”, a pre-generated table mapping password hashes back to their passwords. 

https://stackoverflow.com/a/52121472/184184

Since bcrypt returns a different value each time we generate the hash, we cannot really do the match in the database directly. It has to be done in the application.
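To make that concrete, here is a minimal sketch of a salted hash and the linear scan it forces on the application. It is not our actual provider code: the class and method names are hypothetical, and because the JDK ships no bcrypt implementation, PBKDF2 is used here purely as a stand-in for bcrypt. The iteration count is an arbitrary illustrative value.

```java
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.List;

// Hypothetical helper, not the application's real code. PBKDF2 stands in for
// bcrypt because the JDK has no built-in bcrypt implementation.
final class SaltedTokenHasher {
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final int ITERATIONS = 10_000; // illustrative work factor (assumption)
    private static final int KEY_BITS = 256;

    // Returns "salt:hash" with a fresh random salt, so two calls with the
    // same token produce different strings.
    static String hash(String token) {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        return encode(salt) + ":" + encode(derive(token, salt));
    }

    // Re-derives the hash using the salt stored next to it, then compares
    // the two byte arrays.
    static boolean matches(String token, String stored) {
        String[] parts = stored.split(":");
        byte[] salt = Base64.getDecoder().decode(parts[0]);
        byte[] expected = Base64.getDecoder().decode(parts[1]);
        return MessageDigest.isEqual(expected, derive(token, salt));
    }

    // Linear scan: one expensive derivation per stored hash until a match.
    static boolean matchesAny(String token, List<String> storedHashes) {
        for (String stored : storedHashes) {
            if (matches(token, stored)) {
                return true;
            }
        }
        return false;
    }

    private static byte[] derive(String token, byte[] salt) {
        try {
            PBEKeySpec spec = new PBEKeySpec(token.toCharArray(), salt, ITERATIONS, KEY_BITS);
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256")
                    .generateSecret(spec)
                    .getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }
}
```

Because hash("x") never returns the same string twice, a query such as WHERE token_hash = ? cannot find the row. The only option is matchesAny, whose cost grows with the number of stored hashes.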

On Linux /dev/random is a stream of random bytes. As you read from the stream you deplete the available entropy. When it reaches a certain point reads from /dev/random block.

https://stackoverflow.com/a/36686640/184184

If we are comparing thousands of tokens using bcrypt on every request, this will eventually slow down and block all requests.

The solution was to switch from bcrypt to a different hashing algorithm, one that returns the same hash every time, so we can compare the hash in the database directly. Of course this is less secure than bcrypt, but API tokens are not the same as account passwords. Passwords are compared only once per login, e.g. when submitting a form. But API tokens need to be hashed for every request.
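As a rough sketch of the idea, the example below hashes tokens with SHA-256 and looks them up by equality. The choice of SHA-256 is an assumption made for illustration, not a statement of what we deployed, and the class name is hypothetical. The in-memory map stands in for an indexed token hash column in the database.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: the class name, the map, and the use of SHA-256 are
// assumptions made for illustration.
final class DeterministicTokenStore {
    // Stand-in for an indexed token hash column in the database.
    private final Map<String, String> userIdByTokenHash = new HashMap<>();

    // Unsalted digest: the same token always yields the same hex string.
    static String sha256Hex(String token) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(token.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stores only the hash of the token, never the token itself.
    void register(String userId, String token) {
        userIdByTokenHash.put(sha256Hex(token), userId);
    }

    // One hash computation and one equality lookup per request,
    // regardless of how many tokens are stored.
    Optional<String> authenticate(String token) {
        return Optional.ofNullable(userIdByTokenHash.get(sha256Hex(token)));
    }
}
```

With this shape, each request costs one hash plus one indexed lookup instead of a scan over every stored token. The trade-off only seems reasonable when tokens are long, randomly generated values rather than user-chosen secrets.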


November 20, 2022