Let’s get diversified! Our web API is currently returning a JSON document labeled with the media type for HTML. We need to fix this ASAP.
How do we do that?
Well, we have two things to do: fix our application and use a couple of headers to start doing some content negotiation.
First, let’s define application/json as the default media type. To do this, we need to add a before block in our API. As the name implies, anything inside this block will be run before the code in our routes. To set a default content type, we use the content_type method and pass it the media type we wish to use as an argument.
# webapi.rb
require 'sinatra'
require 'json'
users = {
  thibault: { first_name: 'Thibault', last_name: 'Denizet', age: 25 },
  simon: { first_name: 'Simon', last_name: 'Random', age: 26 },
  john: { first_name: 'John', last_name: 'Smith', age: 28 }
}
# WE ADDED THIS!
before do
content_type 'application/json'
end
get '/' do
'Master Ruby Web APIs - Chapter 2'
end
get '/users' do
users.map { |name, data| data.merge(id: name) }.to_json
end
What we just changed will ensure that any response sent back by our application will have the Content-Type header set to application/json. While it solves our current problem, we still have another one. More on that soon. For now, let’s run some tests with curl.
Let’s stop our application with CTRL-C and start it again.
ruby webapi.rb
Then we can send a request:
curl -i http://localhost:4567/users
The request now returns a response with the correct Content-Type.
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 203
[HIDDEN HEADERS]
[HIDDEN JSON DOCUMENT]
Back to our issue: what if the client wants a result in XML?
While JSON is usually the preferred format for web APIs, we want to make our API easy to use for any developer. If a developer decides to use XML, we should let them do that. While we will keep JSON as the default media type when no specific content type is requested by the client, we also want to allow a client to retrieve XML.
JSON and XML
To do this, we could follow the Rails way and append the format we want at the end of the resource. We would end up with /users.json and /users.xml. Let’s give it a try. We will soon encounter another problem with this, but for now it seems to be the simplest solution.
We need a library to serialize the list of users to XML, and we are going to use gyoku for that purpose.
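Run the following command in your terminal to install the latest version of gyoku. The --no-document flag skips generating the documentation (it replaces the --no-ri --no-rdoc flags, which were removed in RubyGems 3.0).
gem install gyoku --no-document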
Then, we need to make a few changes to our application source code. We still want /users to return a JSON document by default, but we also want to allow users to use /users.json and /users.xml if they want to.
Here is the list of changes we will make:
Require the gyoku gem.
Update the /users route to also work with /users.json.
Add a new route, /users.xml, that uses the Gyoku gem to serialize our list of users. We also need to specify that the Content-Type sent back should be application/xml in this route, overriding the default application/json set before the route was called.
It all comes together as seen below:
# webapi.rb
require 'sinatra'
require 'json'
# We require gyoku here in order to use it in the
# XML route.
require 'gyoku'
users = {
thibault: { first_name: 'Thibault', last_name: 'Denizet', age: 25 },
simon: { first_name: 'Simon', last_name: 'Random', age: 26 },
john: { first_name: 'John', last_name: 'Smith', age: 28 }
}
before do
content_type 'application/json'
end
get '/' do
'Master Ruby Web APIs - Chapter 2'
end
# Looping through the list of routes for which
# we want to return a JSON representation.
['/users', '/users.json'].each do |path|
get path do
users.map { |name, data| data.merge(id: name) }.to_json
end
end
# Defining /users.xml with the application/xml media type
# and simply calling Gyoku on our users hash with a root
# element 'users'.
get '/users.xml' do
content_type 'application/xml'
Gyoku.xml(users: users)
end
If we run two curl requests (one for JSON and one for XML), we should clearly get two different results. You can also just paste the URLs in your browser to see the result there. That’s the advantage of using this approach with the format appended to the resource name.
Request the list of users as a JSON document.
curl -i http://localhost:4567/users.json
We get everything back correctly.
HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 203
X-Content-Type-Options: nosniff
Connection: keep-alive
Server: thin
[
{"first_name":"Thibault", "last_name":"Denizet", "age":25, "id":"thibault"},
{"first_name":"Simon", "last_name":"Random", "age":26, "id":"simon"},
{"first_name":"John", "last_name":"Smith", "age":28, "id":"john"}
]
Let’s try to get some XML now.
curl -i http://localhost:4567/users.xml
Once again, everything looks fine.
HTTP/1.1 200 OK
Content-Type: application/xml;charset=utf-8
Content-Length: 270
X-Content-Type-Options: nosniff
Connection: keep-alive
Server: thin
<users>
<thibault>
<firstName>Thibault</firstName>
<lastName>Denizet</lastName>
<age>25</age>
</thibault>
<simon>
<firstName>Simon</firstName>
<lastName>Random</lastName>
<age>26</age>
</simon>
<john>
<firstName>John</firstName>
<lastName>Smith</lastName>
<age>28</age>
</john>
</users>
Well, this should be good to go, right? Unfortunately, no. Let’s find out why.
While this approach works, it does not follow the principles of the HTTP and URI RFCs. It’s not bad in itself, but it breaks some of the rules that have made the web so successful. Indeed, we went from having one resource, available at the URI /users, to three different ones: /users, /users.json and /users.xml.
We end up with three different resources representing the same concept, each one having one representation.
That’s going against HTTP fundamentals and the Uniform Interface constraint from REST. A better way of presenting the users concept would be to have only one URI, /users, that could respond with different representations using HTTP header fields.
The HTTP header we need for this to work is called Accept. This header field allows the client to specify which media types it would like to get. It is defined as a string listing all media types that the client can understand, with an optional priority parameter named q that can be added to each media type in the list.
It’s then up to the server to honor the request or simply return whatever it wants. That’s why designing web APIs that come with advanced HTTP features is a must. We could also set our server to respond to a client with a 406 Not Acceptable status code if the server cannot offer what the client is asking for. (415 Unsupported Media Type covers the opposite direction: the server refuses a request body whose format it does not support.)
Let’s change our API from using three different resources to using only one.
The correct way for a client to ask the server for a specific media type is with the Accept header field. However, HTTP requests that include a body with data, such as POST ones, can contain the Content-Type header to tell the server in what format the data is sent. This Content-Type header should not be used by the server to decide how to format the response. For this purpose, only Accept should be used.
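To make the distinction concrete, here is a minimal sketch using Ruby’s standard Net::HTTP library. It only builds a POST request object and never sends it; the JSON payload is a made-up example, not something our API accepts yet.

```ruby
require 'net/http'
require 'json'

# Build (but do not send) a POST request to the /users resource.
request = Net::HTTP::Post.new('/users')

# Content-Type describes the format of the body the client is sending.
# The payload below is hypothetical and only used for illustration.
request['Content-Type'] = 'application/json'
request.body = { first_name: 'Jane', last_name: 'Doe', age: 30 }.to_json

# Accept describes the format the client would like to get back.
request['Accept'] = 'application/xml'

puts "Content-Type: #{request['Content-Type']}"
puts "Accept: #{request['Accept']}"
```

Here the client sends JSON but asks for XML in return, which is why a server should only look at Accept when choosing the format of its response.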
We still want JSON to be the default for our API. Sinatra comes with a way to give you the list of accepted media types, sorted by priority.
The request.accept array contains a list of AcceptEntry objects that we can use to see what the client would like to receive.
Before going further with our implementation, we need to understand more about media types and how the prioritization system in the Accept header works.
What are media types? What’s the difference from a content type? What about MIME types?
Alright, alright. Let’s go through all of that.
A media type is a two-part identifier for file formats and format contents transmitted over the Internet.
—Wikipedia
Simply put, a media type defines how some data is formatted and how it can be read by a machine. It allows a computer to differentiate between a JSON document and an XML document.
Sadly or fortunately, machines are not as good as we are at recognizing things. It’s quite easy for us humans to recognize that a document is a JSON document or an HTML one, but for a machine it’s all just random text without the key to decode it.
Let’s take human languages as an example. If someone were to give you a piece of paper with writing on it that you cannot understand, there wouldn’t be much you could do. But if you are also given the name of the language, you can just grab a dictionary and start translating. Media types do the same thing for machines.
There are as many media types as there are data formats; here are some examples:
application/json
application/xml
multipart/form-data
text/html
As you can see, a media type is composed of two parts separated by a slash. The first part is the type and the second part is the subtype. There can also be some optional parameters like charset=UTF-8 that specify which charset is used.
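As a rough illustration of that structure, the sketch below splits a media type string into its type, subtype and parameters using plain Ruby. The method name parse_media_type is made up for this example, and the parsing is deliberately simplified (it does not handle quoted parameter values, for instance).

```ruby
# Split a media type string into its type, subtype and parameters.
# Simplified on purpose: quoted parameter values are not supported.
def parse_media_type(string)
  essence, *raw_params = string.split(';').map(&:strip).reject(&:empty?)
  type, subtype = essence.downcase.split('/', 2)

  # Each parameter looks like "name=value"; names are case-insensitive.
  params = raw_params.to_h do |param|
    name, value = param.split('=', 2)
    [name.downcase, value]
  end

  { type: type, subtype: subtype, params: params }
end

p parse_media_type('text/html; charset=UTF-8')
p parse_media_type('application/json')
```

The first call returns text as the type, html as the subtype and a charset parameter set to UTF-8; the second one has no parameters at all.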
The first part contains one of the top-level registered type names, which are listed below.
application
audio
example
image
message
model
multipart
text
video
We will mostly be using application throughout this book.
Media types should be registered with the IANA (Internet Assigned Numbers Authority). There are different “trees” of registration for different uses. All the media types above are registered in the standards tree and don’t have a prefix before their subtype.
However, you will come across more custom media types, like the ones belonging to the vendor tree (vnd), the personal tree (prs) and the unregistered tree (x).
For example, the media type for the JSON API specification is application/vnd.api+json, while the media types for HAL are application/hal+json and application/hal+xml.
GitHub’s media type is application/vnd.github.v3+json.
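To see how these names can be read, here is a small sketch that extracts the registration tree and the part after the + sign (known as the structured syntax suffix, for example +json) from a media type. The names TREES and describe_subtype are invented for this illustration, and the logic only covers the three prefixes mentioned above.

```ruby
# Prefixes of the non-standard registration trees mentioned above.
TREES = { 'vnd' => :vendor, 'prs' => :personal, 'x' => :unregistered }.freeze

# Return the registration tree and the structured syntax suffix
# (the part after "+", if any) of a media type.
def describe_subtype(media_type)
  subtype = media_type.split('/', 2).last
  name, suffix = subtype.split('+', 2)

  # A tree prefix is the first dot-separated facet of the subtype name.
  prefix = name.include?('.') ? name.split('.').first : nil

  { tree: TREES.fetch(prefix, :standards), suffix: suffix }
end

p describe_subtype('application/vnd.github.v3+json')
p describe_subtype('application/hal+json')
p describe_subtype('application/json')
```

GitHub’s media type lands in the vendor tree with a json suffix, while the HAL media type has the same suffix but no tree prefix.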
Anyone can register a media type to allow other developers to reuse it in their applications.
Media types were originally named MIME types, which stands for Multipurpose Internet Mail Extensions. They were renamed later on as media types, but are basically the same thing. You will see both in different web technologies and you should know they refer to the same thing.
Content-Type is the name of the header field defined in the HTTP RFC that contains the media type describing the data sent in the request/response.
Accept Header
As we said earlier, the Accept header is used by a client to tell the server what media type it wants and can use. This header is not limited to only one value and multiple media types can be chained, separated by commas.
For example, an Accept header could look like this:
Accept: application/xml;q=0.5, text/html;q=0.4, application/json, text/plain;q=0.1
The q parameter defines something called the quality factor. Using this, a client can indicate which media types it would prefer, in prioritized order. The quality factor value can range from 0 to 1 and the default value (when q is not present) is 1. 1 is the highest ‘quality’, while 0 means the media type is not acceptable at all.
In the example above, the priorities should be interpreted as application/json first (default q=1), then application/xml (q=0.5) followed by text/html (q=0.4) and text/plain (q=0.1).
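The ordering described above can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not how Sinatra implements it; the method name sorted_media_types is hypothetical, and the parsing ignores any parameter other than q.

```ruby
# Parse an Accept header value into media types ordered by preference.
def sorted_media_types(accept_header)
  entries = accept_header.split(',').each_with_index.map do |entry, index|
    media_type, *params = entry.split(';').map(&:strip)
    q_param = params.find { |param| param.start_with?('q=') }
    # A missing q parameter means the default quality of 1.
    quality = q_param ? q_param.delete_prefix('q=').to_f : 1.0
    [media_type, quality, index]
  end

  # Highest quality first; ties keep the order chosen by the client.
  entries.sort_by { |_, quality, index| [-quality, index] }.map(&:first)
end

header = 'application/xml;q=0.5, text/html;q=0.4, application/json, text/plain;q=0.1'
puts sorted_media_types(header)
```

Keeping the client’s order for equal qualities is a design choice made in this sketch; a server is free to break ties differently.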
For now, you know enough and we can go back to our application source code to offer two representations of the same resource.
We need to find a way to return the format that the client wants the most (with the highest quality as defined in the Accept header). To do this, we are going to create a method named accepted_media_type which will return either json or xml.
Our API only supports those two formats, so let’s not bother with anything else. The code is not very extensible, but we don’t want to prematurely optimize code that already does what we need.
Below you can see the code for the method we just talked about. Let’s go through it.
def accepted_media_type
  return 'json' unless request.accept.any?

  request.accept.each do |mt|
    return 'json' if %w(application/json application/* */*).include?(mt.to_s)
    return 'xml' if mt.to_s == 'application/xml'
  end

  halt 406, 'Not Acceptable'
end
The first thing we do here is check if the client has specified anything in the Accept header. If it didn’t, we should suppose that the client can understand anything and return the default format, JSON.
return 'json' unless request.accept.any?
I love Guard Clauses. You will probably see a lot of them in this book, just like the one we just wrote to check if request.accept contained any element. A Guard Clause is a small piece of code at the top of a method or block that ‘guards’ the following code. It usually checks for some important condition which, if not met, should stop the execution of the current method. For example, it can be used to check some of the parameters that were passed to a method and return if they don’t match the expectations.
It saves us from writing large if...else blocks like this:
if request.accept.any?
  # Process and return some value
else
  'json'
end
Instead we can do the following, which, in my humble opinion, is much more elegant.
return 'json' unless request.accept.any?
# Process and return some value
The second thing we added is a loop. We go through the list of media types received by the server and stored in request.accept by Sinatra. The first thing we check in each of these media types (which are given sorted by quality by Sinatra) is if the media type is either application/json, application/* or */*. Since the * represents a wildcard, anything can fit in and, thus, we decide that our web API will default to JSON in such cases. If we get a match here, we stop the execution of the loop and return from the method accepted_media_type with the value json. This value will be used in our endpoint. Otherwise, we can continue and check if the media type is equal to application/xml. If that’s the case, we return from the method with the value xml.
request.accept.each do |mt|
  return 'json' if %w(application/json application/* */*).include?(mt.to_s)
  return 'xml' if mt.to_s == 'application/xml'
end
If the media type doesn’t match any of the ones we defined, we go to the next iteration of the loop and check again.
By the end of the loop, if we haven’t returned anything, it means there is no media type present in the Accept header that we support. In that case, we will return to the client the HTTP status 406 Not Acceptable to let it know that, unfortunately, an exchange of data is impossible between us if the client cannot compromise.
halt 406, 'Not Acceptable'
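If you want to experiment with this logic outside of Sinatra, the same decision can be sketched as a plain Ruby method. This is an illustrative stand-in, not part of the application: the names JSON_TYPES and pick_format are made up, the input is assumed to be an array of media type strings already sorted by preference, and nil is returned where the real helper calls halt 406.

```ruby
# Media types that our API answers with JSON (wildcards included).
JSON_TYPES = %w[application/json application/* */*].freeze

# Return 'json', 'xml', or nil when nothing acceptable was found.
# In the Sinatra helper, the nil case is where `halt 406` is called.
def pick_format(media_types)
  # Guard clause: nothing requested means we use the default format.
  return 'json' if media_types.empty?

  media_types.each do |media_type|
    return 'json' if JSON_TYPES.include?(media_type)
    return 'xml' if media_type == 'application/xml'
  end

  nil
end

p pick_format([])
p pick_format(%w[text/html application/xml])
p pick_format(%w[text/html text/plain])
```

The second call skips text/html, which we do not support, and settles on XML; the third finds nothing usable, which corresponds to the 406 response.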
It all comes together in the webapi.rb file, as presented below. Note that I extracted the media types check for JSON and XML into their own methods, respectively json_or_default? and xml?, to clean up the accepted_media_type method. Also, take a close look at the /users route. Thanks to our helper methods, we know which media type we have that suits the client the most. We can simply send back the right representation, JSON or XML, and the associated Content-Type header.
# webapi.rb
require 'sinatra'
require 'json'
require 'gyoku'
users = {
thibault: { first_name: 'Thibault', last_name: 'Denizet', age: 25 },
simon: { first_name: 'Simon', last_name: 'Random', age: 26 },
john: { first_name: 'John', last_name: 'Smith', age: 28 }
}
helpers do
  def json_or_default?(type)
    ['application/json', 'application/*', '*/*'].include?(type.to_s)
  end

  def xml?(type)
    type.to_s == 'application/xml'
  end

  def accepted_media_type
    return 'json' unless request.accept.any?

    request.accept.each do |mt|
      return 'json' if json_or_default?(mt)
      return 'xml' if xml?(mt)
    end

    halt 406, 'Not Acceptable'
  end
end

get '/' do
  'Master Ruby Web APIs - Chapter 2'
end

get '/users' do
  type = accepted_media_type

  if type == 'json'
    content_type 'application/json'
    users.map { |name, data| data.merge(id: name) }.to_json
  elsif type == 'xml'
    content_type 'application/xml'
    Gyoku.xml(users: users)
  end
end
Now it’s time to run a few tests.
Since we didn’t follow the first rule of programming and write some automated tests, we have to test manually. Urgh. I can’t wait for the automated tests coming in the next module! Anyway, this will allow me to show you more about the q (quality) option for media types in the Accept header.
Before running the curl requests below, be sure to have the latest version of the code and restart your server with CTRL-C and ruby webapi.rb to get the changes live.
First test! We set q=0.5 for XML. We didn’t set anything for JSON, which means it will use the default value q=1. We should receive back JSON in this case. Let’s try.
curl -i http://localhost:4567/users \
-H "Accept: application/xml;q=0.5, application/json"
Output
HTTP/1.1 200 OK
Content-Type: application/json
... Hidden Headers ...
[Hidden JSON Data]
Yes! It’s working fine. What if we don’t set any quality (q=1 by default), with application/xml first?
curl -i http://localhost:4567/users \
-H "Accept: application/xml, application/json"
Output
HTTP/1.1 200 OK
Content-Type: application/xml;charset=utf-8
... Hidden Headers ...
<Hidden XML Data>
XML received, awesome! Since the client doesn’t specify which media type it wants the most, the server can pick any of them. In this case, however, Sinatra kept the same order and picked the first one, application/xml.
I’m not going to paste more HTTP responses here, but feel free to run the following curl requests. Try to guess what you should receive!
curl -i http://localhost:4567/users \
-H "Accept: application/xml;q=0.9, application/json;q=0.8"
curl -i http://localhost:4567/users -H "Accept: application/*"
curl -i http://localhost:4567/users -H "Accept: */*"
curl -i http://localhost:4567/users
Did you guess right? Did you cheat by running the queries? Whatever, here are the results anyway!
XML! With a q set to 0.9, XML is the winner here.
JSON! The client asked for any media type that belongs to the application type. The server sent the default, JSON, which belongs to this type (application/json). Note that another server could decide to send XML instead!
JSON! Since our application defaults to JSON and the client said anything was fine with the double wildcard */*, the server sent back JSON.
JSON! No media type was specified so the server sent back the default, JSON. (When no Accept header is given, curl sends Accept: */* on its own, so this request ends up being handled like the previous one.)
Our web API can now send two representations of the same resource, one in the JSON format and another one in the XML format. What we just learned, known as content negotiation, is a very important part of how web APIs are supposed to work.
Unfortunately, many people are doing it wrong. I’ve done it wrong because nobody showed me all these powerful techniques and that’s exactly what I’m hoping to fix with this book.
In this chapter, we learned the essentials of media types and we had our first experience with content negotiation.
We are not done with our growing Sinatra API yet. We still have only two routes defined and a lot more to learn about the HTTP protocol.