The Case for Standardized Error Handling in Your Web Application’s APIs
Because bad error messages (or worse, errors invisible to users) do no one any favors.
Introduction
Bad error messages are a sin we’re all guilty of as programmers. On the server side, on the client side, they are rampant basically everywhere in web development.
I write them. You write them. My whole team writes them. We all agree they’re bad: the error messages don’t tell us or our users what the real problem is and only through digging deep into the codebase and tracing through application logs can we locate the actual culprit — most of the time.
This is not a new or unusual phenomenon. It’s standard, and it makes everyone’s lives far more painful than they need to be. But every time we talk about making the error messaging better (more verbose, more pinpointed, less nebulous about what actually went wrong), the effort gets deprioritized by whoever has the power to direct the dev team, and we move on to more straightforward, end-user-benefitting features.
And the vicious cycle continues. A user encounters an error on the client side of the application. The development team is contacted after the error can’t be resolved by turning the system off and on again. The devs look through tons of logs and try to reproduce the error locally (which is sometimes an impossible feat). Then the team either reproduces the error, figures out what caused it, and fixes it, or they can’t reproduce it and close the bug with no fix, until it rears its ugly head again in the future.
Does this sound like an optimal system to you? Me neither. Which is why I’m here, writing to you today.
I’m making the case for standardizing error handling from APIs so we can all stop wasting so much time tracking down bugs, and get back to the coding we all want to do — building cool things that make people’s lives better.
Standardized errors: the need & the ground rules
Here is the part that matters from one such error message: `Failed to execute ImportKey action in set service. Deleting created cart item. : 500 Internal Server Error`.
Yes, this is an actual error message from one of my team’s actual backend web applications. No, it doesn’t make any sort of sense; I totally agree with you.
In web development, we place a lot of emphasis on the “happy path”. The path that we want and expect our users to take when they’re using our tool or application — it’s the path we code for, the path we optimize for, the path we place most of our focus on.
The “unhappy path” is the path a user takes when they don’t do the right thing, don’t click the right button, or generally figure out how to use the system in some way other than its intended purpose. This is the path where things go wrong, and with bad error messaging (like the above), we, as both users and developers, have a really hard time figuring out what exactly went wrong. And in general, the unhappy paths (and the errors they produce) get a lot less thought about how to handle them effectively.
That needs to change. Here’s what I propose in the name of better error handling, all of which is based on the OData v4 JSON specification.
What is OData?
Before I give my recommendations, let me give a little background on OData.
OData stands for Open Data Protocol, and it is an open protocol which allows the creation and consumption of queryable and interoperable RESTful APIs in a simple and standard way. Microsoft initiated OData in 2007. — Wikipedia, OData
In essence, OData defines a set of best practices for building and consuming RESTful APIs (application programming interfaces). These practices help developers focus on business logic while building RESTful APIs, without having to worry about the various approaches to defining things like request and response headers, status codes, HTTP methods, URL conventions, media types, payload formats, query options and more.
It proposes standards to make web development a little less haphazard and a little more predictable, regardless of programming language or development approach and team.
Now that we know about OData, its legitimacy, and its reason for being in the world, let’s get down to the rules of handling errors.
8 error rules to follow
Below are the standards my team and wider company are working to enact with our applications — they’re generic enough recommendations to apply broadly to APIs in general, but specific enough to remove the questions around how to go about implementing the recommendations.
Rule 1: Code one reusable error handler
- For nonsuccess conditions, developers should be able to write one piece of code that handles errors consistently across different REST API methods.
- This allows for the building of simple and reliable infrastructure to handle exceptions as a separate flow from successful responses.
- Keep in mind, this error handler is very generic and does not require specific OData constructs. APIs should use this format even if they are not using other OData constructs.
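As a rough illustration of Rule 1, here is a minimal, framework-free sketch of one error handler shared by every endpoint. The names `ApiError`, `to_error_response`, `handle_errors`, and `get_cart_item` are hypothetical and exist only for this example, as is the generic `InternalError` code used for unexpected failures.

```python
import json
from typing import Any, Callable, Dict, Optional, Tuple


class ApiError(Exception):
    """Hypothetical exception type carrying the fields of the error object."""

    def __init__(self, status: int, code: str, message: str,
                 target: Optional[str] = None) -> None:
        super().__init__(message)
        self.status = status
        self.code = code
        self.message = message
        self.target = target


def to_error_response(exc: Exception) -> Tuple[int, str]:
    """Turn any exception into an (HTTP status, JSON body) pair."""
    if isinstance(exc, ApiError):
        error: Dict[str, Any] = {"code": exc.code, "message": exc.message}
        if exc.target is not None:
            error["target"] = exc.target
        return exc.status, json.dumps({"error": error})
    # Unknown failures get a generic body so internals are not leaked.
    generic = {"code": "InternalError", "message": "An unexpected error occurred"}
    return 500, json.dumps({"error": generic})


def handle_errors(
    endpoint: Callable[..., Tuple[int, str]]
) -> Callable[..., Tuple[int, str]]:
    """Decorator that routes every failure through to_error_response."""
    def wrapper(*args: Any, **kwargs: Any) -> Tuple[int, str]:
        try:
            return endpoint(*args, **kwargs)
        except Exception as exc:
            return to_error_response(exc)
    return wrapper


@handle_errors
def get_cart_item(item_id: str) -> Tuple[int, str]:
    # Hypothetical endpoint used only to exercise the handler.
    if not item_id:
        raise ApiError(400, "BadArgument", "item_id must not be empty",
                       target="item_id")
    return 200, json.dumps({"id": item_id})
```

Calling `get_cart_item("")` goes through the same handler as any other endpoint would, so the failure flow stays separate from the success flow. In a real service the decorator would likely be replaced by whatever error hook your web framework provides.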
Rule 2: The JSON error handler must have an `error` object
- The error response must be a single JSON object. This object must have a name/value pair named `error`. The value must be a JSON object.
Rule 3: `error` must contain `code` and `message`, and may contain other properties
- This object must contain name/value pairs with the names `code` and `message`, and it may contain name/value pairs with the names `target`, `details`, and `innererror`.
- The value for the `code` name/value pair is a language-independent string. Its value is a service-defined error code that should be human-readable.
- This code serves as a more specific indicator of the error than the HTTP error code specified in the response.
- API methods should have a relatively small number (about 20) of possible values for `code`, and all clients must be capable of handling all of them.
- Most services will require a much larger number of more specific error codes, which are not interesting to all clients. These error codes should be exposed in the `innererror` name/value pair as described below.
- Introducing a new value for `code` that is visible to existing clients is a breaking change and requires a version increase. API methods can avoid breaking changes by adding new error codes to `innererror` instead.
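To make the shape described by Rules 2 and 3 concrete, here is a small sketch of a checker that a test suite might run against error responses. The function name `validate_error_body` and the wording of its messages are assumptions made for this example. It checks only the structural requirements and not the service-specific values of `code`.

```python
from typing import Any, List


def validate_error_body(body: Any) -> List[str]:
    """Return a list of problems; an empty list means the shape is valid."""
    if not isinstance(body, dict):
        return ["response must be a single JSON object"]
    error = body.get("error")
    if not isinstance(error, dict):
        return ["'error' must be present and must be a JSON object"]

    problems: List[str] = []
    # 'code' and 'message' are required.
    for name in ("code", "message"):
        if not isinstance(error.get(name), str):
            problems.append(f"'error.{name}' must be a string")
    # 'details' and 'innererror' are optional, but typed when present.
    if "details" in error and not isinstance(error["details"], list):
        problems.append("'error.details' must be an array")
    if "innererror" in error and not isinstance(error["innererror"], dict):
        problems.append("'error.innererror' must be an object")
    return problems
```

For instance, `validate_error_body({"error": {"code": "BadArgument", "message": "x"}})` returns an empty list, while a body whose `error` value is a bare string is rejected.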
Rule 4: The `message` should help humans debug the error
- The value for the `message` name/value pair must be a human-readable representation of the error. It is intended as an aid to developers and is not suitable for exposure to end users.
- API methods wanting to expose a suitable message for end users must do so through an annotation or custom property.
- API methods should not localize `message` for the end user, because doing so may make the value unreadable to the app developer who may be logging the value, as well as make the value less searchable on the Internet.
Rule 5: `target` is the name of the property in error
- The value for the `target` name/value pair is the target of the particular error (e.g., the name of the property in error).
Rule 6: `details` is an array of objects with `code` and `message`
- The value for the `details` name/value pair must be an array of JSON objects that must contain name/value pairs for `code` and `message`, and may contain a name/value pair for `target` as described above. The objects in the `details` array usually represent distinct, related errors that occurred during the request. See example below.
Example of the `details` array:
{
  "error": {
    "code": "BadArgument",
    "message": "Multiple errors in ContactInfo data",
    "target": "ContactInfo",
    "details": [
      {
        "code": "NullValue",
        "target": "PhoneNumber",
        "message": "Phone number must not be null"
      },
      {
        "code": "NullValue",
        "target": "LastName",
        "message": "Last name must not be null"
      },
      {
        "code": "MalformedValue",
        "target": "Address",
        "message": "Address is not valid"
      }
    ]
  }
}
In this example there were multiple problems with the request, with each of the individual errors listed in `details`.
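One way a service might assemble such a response is to collect every validation failure before replying, rather than stopping at the first one. The sketch below assumes a hypothetical `validate_contact_info` function and an invented rule that `PhoneNumber` and `LastName` are required. Only the field names and error codes are borrowed from the example above.

```python
from typing import Any, Dict, List, Optional


def validate_contact_info(contact: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return an error body listing every problem found, or None if valid."""
    details: List[Dict[str, str]] = []
    # Hypothetical rule: these two fields are required.
    required = (("PhoneNumber", "Phone number"), ("LastName", "Last name"))
    for field, label in required:
        if contact.get(field) is None:
            details.append({
                "code": "NullValue",
                "target": field,
                "message": f"{label} must not be null",
            })
    if not details:
        return None
    return {
        "error": {
            "code": "BadArgument",
            "message": f"{len(details)} error(s) in ContactInfo data",
            "target": "ContactInfo",
            "details": details,
        }
    }
```

The top-level `code` stays generic so that every client can handle it, and each entry in `details` points at the individual property that failed.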
Rule 7: `innererror` is an object with service-defined contents
- The value for the `innererror` name/value pair must be an object.
- The contents of this object are service-defined. API methods wanting to return more specific errors than the root-level code must do so by including a name/value pair for `code` and a nested `innererror`.
- Each nested `innererror` object represents a higher level of detail than its parent.
- When evaluating errors, clients must traverse through all of the nested `innererrors` and choose the deepest one that they understand. This scheme allows services to introduce new error codes anywhere in the hierarchy without breaking backwards compatibility, so long as old error codes still appear.
- The service may return different levels of depth and detail to different callers. For example, in development environments, the deepest `innererror` may contain internal information that can help debug the service.
- To guard against potential security concerns around information disclosure, services should take care not to expose too much detail unintentionally.
- Error objects may also include custom server-defined name/value pairs that may be specific to the code. Error types with custom server-defined properties should be declared in the service’s metadata document. See example below.
- Error responses may contain annotations in any of their JSON objects.
Example of `innererror` object:
{
  "error": {
    "code": "BadArgument",
    "message": "Previous passwords may not be reused",
    "target": "password",
    "innererror": {
      "code": "PasswordError",
      "innererror": {
        "code": "PasswordDoesNotMeetPolicy",
        "minLength": "6",
        "maxLength": "64",
        "characterTypes": ["lowerCase","upperCase","number","symbol"],
        "minDistinctCharacterTypes": "2",
        "innererror": {
          "code": "PasswordReuseNotAllowed"
        }
      }
    }
  }
}
In the example above, the most basic error code is `BadArgument`, but for clients that are interested, there are more specific error codes in `innererror`.
The `PasswordReuseNotAllowed` code may have been added by the service at a later date, having previously only returned `PasswordDoesNotMeetPolicy`.
Existing clients do not break when the new error code is added, but new clients may take advantage of it. The `PasswordDoesNotMeetPolicy` error also includes additional name/value pairs that allow the client to determine the server’s configuration, validate the user’s input programmatically, or present the server’s constraints to the user within the client’s own localized messaging.
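On the client side, the traversal that Rule 7 asks for can be sketched in a few lines. The function name `deepest_known_code` and the `SAMPLE_ERROR` constant are assumptions for illustration. The sample is a trimmed copy of the password example with the custom policy properties left out.

```python
from typing import Any, Dict, Optional, Set

# Trimmed copy of the password example, without the custom policy properties.
SAMPLE_ERROR: Dict[str, Any] = {
    "code": "BadArgument",
    "message": "Previous passwords may not be reused",
    "target": "password",
    "innererror": {
        "code": "PasswordError",
        "innererror": {
            "code": "PasswordDoesNotMeetPolicy",
            "innererror": {"code": "PasswordReuseNotAllowed"},
        },
    },
}


def deepest_known_code(error: Dict[str, Any],
                       known_codes: Set[str]) -> Optional[str]:
    """Walk the innererror chain and return the deepest code in known_codes."""
    best: Optional[str] = None
    node: Any = error
    while isinstance(node, dict):
        code = node.get("code")
        if isinstance(code, str) and code in known_codes:
            best = code  # A deeper match replaces a shallower one.
        node = node.get("innererror")
    return best
```

An older client that only knows `BadArgument` and `PasswordDoesNotMeetPolicy` would settle on `PasswordDoesNotMeetPolicy`, while a newer client that also knows `PasswordReuseNotAllowed` would pick that deeper code from the same response.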
Rule 8: Code for failure with retries
- It is recommended that for any transient errors that may be retried, services should include a `Retry-After` HTTP header indicating the minimum number of seconds that clients should wait before attempting the operation again.
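A client can honor that header with a small retry loop. The sketch below is transport-agnostic: `send` is a hypothetical callable returning a `(status, headers, body)` tuple, and `retry_after_seconds` and `call_with_retries` are names invented for this example. It handles only the seconds form of `Retry-After` described in the rule and treats anything else as not retryable.

```python
import time
from typing import Callable, Mapping, Optional, Tuple

Response = Tuple[int, Mapping[str, str], str]  # (status, headers, body)


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[int]:
    """Return the Retry-After delay in seconds, or None if absent or invalid."""
    for name, value in headers.items():
        if name.lower() != "retry-after":
            continue
        try:
            seconds = int(value.strip())
        except ValueError:
            return None  # Not the seconds form; this sketch does not retry.
        return seconds if seconds >= 0 else None
    return None


def call_with_retries(send: Callable[[], Response],
                      max_attempts: int = 3,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send(), retrying only while the service supplies Retry-After."""
    response = send()
    for _ in range(max_attempts - 1):
        status, headers, _body = response
        delay = retry_after_seconds(headers)
        if status < 400 or delay is None:
            break  # Success, or the service did not mark this as retryable.
        sleep(delay)
        response = send()
    return response
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting. A production client would probably also cap the total wait time.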
And this concludes the rules for a standardized API error handling strategy. It’s not overly complicated, but it offers good guidelines to adhere to when you encounter the unhappy path.
Conclusion
Error messaging isn’t the most fun thing to implement, but it’s even less fun to try and debug a painfully vague (and oftentimes misleading) server message. Standardized error handling is a must — especially if you’re part of a larger team of developers, or providing an API service to external clients.
Standards like OData have helped establish best practices around writing and using APIs, which helps developers focus on the business logic of their applications instead. By following these best practices, we’ve established a set of easy-to-understand, repeatable rules for error handling that make the whole process of dealing with the inevitable unhappy path simpler.
Check back in a few weeks. I’ll be writing about React or something else related to web development, so please follow me so you don’t miss out.
Thanks for reading. I hope this helps you handle errors more effectively in a standardized format so you can better catch and fix bugs.
References & Further Resources
- OData V4 JSON Specifications
- Open Data Protocol, Wikipedia
- OData docs