Chapter 3

Understanding HTTP Requests with curl

Now that we have a functional, minimalistic API, it’s time to start using it. To do so, we will use curl, a command-line tool that transfers data over various protocols including, but not limited to, HTTP.

This tool is great to have in your toolbox as a software engineer, especially if you build web APIs. A web browser can be used to test some parts of a web API, but it is too limited to be really useful: from the address bar, we can only send GET requests, and only when no HTTP header needs to be changed.

curl lets us define requests exactly the way we want them, and is thus a great tool to include in web API documentation. Since curl requests can easily be generated and copied around, they are a great way to let developers test your API and check that everything is working before they start writing any code.

Google Chrome comes with a “Copy as cURL” feature integrated in the developer tools.

Box 3.1. What is curl?

Here is a description of the tool coming from the curl manual.

curl is a tool to transfer data from or to a server, using one of the supported protocols (DICT, FILE, FTP, FTPS, GOPHER, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, POP3, POP3S, RTMP, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET and TFTP). The command is designed to work without user interaction.

curl offers a busload of useful tricks like proxy support, user authentication, FTP upload, HTTP post, SSL connections, cookies, file transfer resume, Metalink, and more. As you will see below, the number of features will make your head spin!

3.1. Installing curl

Some Linux distributions come with curl pre-installed. To know if you already have it, simply open a terminal and run curl. If it says that the program curl was not found, you need to install it. If you already have it, you can skip this section.

This book assumes that you are either running a Linux/Unix distribution (with VirtualBox and Ubuntu if you are on Windows) or Mac OS X.

I will use Ubuntu as an example for Linux. If you are using any other distribution, I’m confident you know what package manager is installed on your machine and can adapt the following commands.

3.1.1. Installing on Mac OS X

Installing curl on Mac OS X can be made very easy if you have Homebrew installed. If you don’t have it, run the following command (taken from the official website):

/bin/bash -c \
"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

Then install curl using the brew command:

brew install curl

If everything went well, you should now have curl installed. Give it a try and enter curl in your terminal.

$ curl
curl: try 'curl --help' or 'curl --manual' for more information

3.1.2. Installing on Linux

You should already have a package manager, so simply use it to install the curl package.

Ubuntu/Debian:

sudo apt-get install curl

Now that we have curl, let’s learn more about how it works. We will be using it throughout this book to simulate clients.

3.2. curl Crash Course

A curl request is composed of the curl word, the URL you want to hit, and a set of options that allow you to modify anything you’d like in the request that will be sent.

Here are a few options we need to know to write our first requests:

  • -H: Shorthand for Header, this option lets us add or replace HTTP Header Fields. Example: -H "Content-Type: application/json"
  • -d: Shorthand for data, this is the option we will use when we need to send data to the server. Example with a JSON payload: -d '{"name":"John Smith"}'
  • -i, --include: When using this option, curl will not only display the body of the response sent back, but also the headers.
  • -I, --head: This option tells curl to make a HEAD request which will only get the header of a document and not its body.
  • -X, --request: This option specifies what kind of HTTP method we want to use in our request. The default is GET but we can use this option to send POST, PUT, PATCH or DELETE requests, for example.
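To make the role of these options more concrete, here is a minimal Ruby sketch that builds (without sending) a request object with the standard Net::HTTP library. The helper name build_request is hypothetical, and the POST to /users is only an assumption for illustration; our API does not handle such a request yet.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds (but does not send) a POST request object.
def build_request(url, headers, data)
  uri = URI(url)
  request = Net::HTTP::Post.new(uri)                     # plays the role of -X POST
  headers.each { |name, value| request[name] = value }   # plays the role of -H
  request.body = data                                    # plays the role of -d
  request
end

request = build_request(
  'http://localhost:4567/users',
  { 'Content-Type' => 'application/json' },
  '{"name":"John Smith"}'
)

puts request.method            # the HTTP method
puts request.path              # the identifier of the resource
puts request['Content-Type']   # header lookup is case-insensitive
puts request.body              # the data that would be sent
```

Nothing is sent over the network here; the sketch only shows which part of a request each option controls.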

Let’s use our new knowledge to make our first curl request on our little Sinatra API.

3.3. Our First curl Request

First, stop and restart your Sinatra application if it’s still running. Sinatra does not auto-reload when modifications are made, so to ensure we have the latest version of our code running, we need to restart it. Just use CTRL-C to stop the current process:

...
^CStopping ...
Stopping ...
== Sinatra has ended his set (crowd applauds)

Then rerun the starting command we used earlier.

ruby webapi.rb

Now we are ready to make a GET request to the route we created: /users. This should give us a list of users. Run the following curl command from a terminal:

curl -i http://localhost:4567/users

And here is the result:

HTTP/1.1 200 OK
Content-Type: text/html;charset=utf-8
Content-Length: 203
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
Connection: keep-alive
Server: thin

[
  {"first_name":"Thibault", "last_name":"Denizet", "age":25, "id":"thibault"},
  {"first_name":"Simon", "last_name":"Random", "age":26, "id":"simon"},
  {"first_name":"John", "last_name":"Smith", "age":28, "id":"john"}
]

I’ve manually “prettified” the JSON in order to fit it on one page. As you can see in your terminal, it actually came without any spaces.

Great, we’ve got something back. Let’s dive into it and understand how HTTP responses are made. There are some incorrect values in the headers but we will come back to that later.

3.4. HTTP Response

What curl displayed for us is an HTTP response. This response can be divided into four areas:

  • The Start-Line (mandatory)
  • A list of Header Fields (can be 0 or more)
  • An Empty Line (mandatory)
  • A Message-Body (optional)

Although every HTTP response will be different, they will all follow the same pattern and have at least the Start-Line and the Empty Line.
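The four areas can be separated with a few lines of Ruby. This is only a sketch: the helper name split_response is hypothetical, and the raw response is a shortened version of the one above. On the wire, each line ends with CRLF (\r\n), which is why the empty line shows up as two consecutive CRLFs.

```ruby
# Hypothetical helper: splits a raw HTTP response into its areas.
def split_response(raw)
  # The first empty line (CRLF CRLF) separates the head from the body.
  head, body = raw.split("\r\n\r\n", 2)
  # The first line of the head is the Start-Line; the rest are header fields.
  start_line, *header_lines = head.split("\r\n")
  { start_line: start_line, headers: header_lines, body: body }
end

# Shortened version of the response we received (assumption for illustration).
raw = "HTTP/1.1 200 OK\r\n" \
      "Content-Type: text/html;charset=utf-8\r\n" \
      "Server: thin\r\n" \
      "\r\n" \
      '[{"id":"john"}]'

parts = split_response(raw)
puts parts[:start_line]
puts parts[:headers].inspect
puts parts[:body]
```

A real HTTP client does much more than this (for example, it uses Content-Length to know where the body ends), but the overall layout is the same.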

3.4.1. The Start-Line

The Start-Line, as defined in RFC 2616, is either a Request-Line (in a request) or a Status-Line (in a response). Since we are looking at a response, ours is a Status-Line.

HTTP/1.1 200 OK

In the example above, the Status-Line begins with the version of HTTP being used (HTTP/1.1), followed by the status code (200) and its reason phrase (OK), which together indicate how the request went. We will learn more about the Status-Line when we talk about HTTP status codes.
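As a small illustration, the start-line of a response can be broken into its HTTP version, status code and reason phrase by splitting on spaces. The helper name parse_status_line is hypothetical; this is a sketch, not a validating parser.

```ruby
# Hypothetical helper: breaks a response start-line into its three parts.
def parse_status_line(line)
  # Limit the split to 3 so reason phrases with spaces ("Not Found") stay whole.
  version, code, reason = line.split(' ', 3)
  { version: version, code: Integer(code), reason: reason }
end

ok = parse_status_line('HTTP/1.1 200 OK')
puts ok[:version]
puts ok[:code]
puts ok[:reason]

missing = parse_status_line('HTTP/1.1 404 Not Found')
puts missing[:reason]
```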

3.4.2. The Header Fields

The header fields represent the metadata of the HTTP requests and responses. They contain information about how the transfer of data should be handled.

In this response, we’ve got 7 header fields:

  • Content-Type
  • Content-Length
  • X-XSS-Protection
  • X-Content-Type-Options
  • X-Frame-Options
  • Connection
  • Server

Content-Type is a very important header since it defines how the representation is serialized. It contains the media type of the representation. In our case, you can see how incorrect it is. It defines the media type as being text/html when we actually received a JSON document (media type application/json). There is obviously a bug in our server and we will need to fix it.

Content-Type: text/html;charset=utf-8
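To see what this header carries, here is a rough Ruby sketch that separates the media type from its parameters. The helper name parse_content_type is hypothetical, and the parsing is deliberately simplistic: it assumes every parameter has the form name=value and does not handle quoted values.

```ruby
# Hypothetical helper: separates the media type from its parameters.
def parse_content_type(value)
  media_type, *params = value.split(';').map(&:strip)
  # Each parameter is assumed to look like "name=value".
  parameters = params.to_h { |param| param.split('=', 2) }
  { media_type: media_type.downcase, parameters: parameters }
end

result = parse_content_type('text/html;charset=utf-8')
puts result[:media_type]             # what the server claims it sent
puts result[:parameters]['charset']  # the character encoding parameter
```

A client relying on this header would treat our JSON document as HTML, which is why the wrong value matters.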

The Content-Length header indicates the size of the body sent in the response, in octets.

Content-Length: 203
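We can check this value ourselves. The sketch below assumes that the body is exactly the compact JSON serialization of the three users listed in the response; the USERS constant and the content_length helper are hypothetical names used only for this illustration.

```ruby
require 'json'

# Data copied from the response body shown earlier in this chapter.
USERS = [
  { first_name: 'Thibault', last_name: 'Denizet', age: 25, id: 'thibault' },
  { first_name: 'Simon', last_name: 'Random', age: 26, id: 'simon' },
  { first_name: 'John', last_name: 'Smith', age: 28, id: 'john' }
].freeze

# Content-Length counts octets (bytes), not characters.
def content_length(body)
  body.bytesize
end

# to_json emits compact JSON (no spaces), like what the terminal showed.
body = USERS.to_json
puts content_length(body)

# For ASCII-only text both counts match; they differ for multi-byte characters.
accented = "\u00e9" # the letter e with an acute accent
puts accented.length
puts accented.bytesize
```

With the three users above, the first line printed is 203, the value of the Content-Length header we received.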

Non-standard headers start, by convention, with X-. They are used to add metadata to HTTP requests and responses to deal with specific problems. Anyone can add new HTTP headers to their requests very easily since HTTP places no restriction on this.

However, it’s not a recommended practice and should generally be avoided. HTTP provides a large number of headers for various situations, and you can usually find what you need among them.

If one day you really need to add a new header, do not use the X- prefix. It has been deprecated (see RFC 6648), so you should just use MY-SUPER-HEADER instead of X-MY-SUPER-HEADER.

All the headers starting with X- in the response we got deal with security and web browser behavior. They are outside of the scope of this book and you won’t need them to build web APIs. They are usually added by web servers automatically. They play a role in protecting from clickjacking, preventing XSS attacks and preventing browsers from MIME-sniffing a response away from the declared content-type.

X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN

We don’t really have to worry about the Connection header. It contains control options for the current connection and in what way it should be handled.

Connection: keep-alive

The Server response-header contains information about the software that handled the request. In our case, that’s the Ruby web server thin.

Server: thin

There are almost 50 header fields defined in the HTTP RFC document and we will learn about many others in the rest of this book.

3.4.3. The Empty Line

The empty line is here just to define the end of the list of headers and the beginning of the body (if there is one).

*Empty, nothing to see.*

3.4.4. The Message-Body

The body, or Message-Body as defined in the RFC, contains the data the server is sending back based on the request we made.

In this case, it’s a JSON string that we generated in the Sinatra API.

[
  {"first_name":"Thibault", "last_name":"Denizet", "age":25, "id":"thibault"},
  {"first_name":"Simon", "last_name":"Random", "age":26, "id":"simon"},
  {"first_name":"John", "last_name":"Smith", "age":28, "id":"john"}
]

We just reviewed how an HTTP response is structured. It wasn’t the most enjoyable thing to do, but we are done with it. Just kidding - now we are going to see how HTTP requests are formatted! Don’t worry, it’s pretty similar and shouldn’t take long.

3.5. HTTP Request

We just reviewed what the server sent back to us.

But what did we send?

Let’s use the option -v to make curl more verbose.

curl -v -i http://localhost:4567/users

Here’s the output with the response hidden:

*   Trying ::1...
* connect to ::1 port 4567 failed: Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 4567 (#0)
> GET /users HTTP/1.1
> Host: localhost:4567
> User-Agent: curl/7.43.0
> Accept: */*
>
... Response ...

Here we can see what’s happening. Do you notice something familiar in there? Yes, there is an HTTP request that looks very similar to the HTTP response we dissected earlier.

GET /users HTTP/1.1
Host: localhost:4567
User-Agent: curl/7.43.0
Accept: */*

HTTP requests are pretty close to HTTP responses. We can see a Start-Line that looks similar to the one in our response, a list of Header Fields and an Empty Line. Since we made a GET request, there is no body for us to see under that empty line.

The main difference here is what the Start-Line looks like. Since it’s supposed to tell a server the fundamental information about this request, it contains more data.
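To make this layout tangible, here is a minimal Ruby sketch that assembles the text of a request without a body. The helper name build_raw_request is hypothetical; the values are the ones curl displayed above.

```ruby
# Hypothetical helper: assembles the text of a body-less HTTP/1.1 request.
def build_raw_request(http_method, target, headers)
  start_line = "#{http_method} #{target} HTTP/1.1"
  header_lines = headers.map { |name, value| "#{name}: #{value}" }
  # Every line ends with CRLF; the final empty line closes the header list.
  ([start_line] + header_lines + ['', '']).join("\r\n")
end

raw = build_raw_request(
  'GET',
  '/users',
  { 'Host' => 'localhost:4567', 'User-Agent' => 'curl/7.43.0', 'Accept' => '*/*' }
)

puts raw
```

The printed text should match the lines that curl prefixed with > in its verbose output.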

3.5.1. The Start-Line

The Start-Line for a request contains the HTTP method to use (GET), the identifier of the resource (/users) and the protocol version in use (HTTP/1.1).
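Going the other way, the three parts can be extracted from the line with a simple split. The helper name parse_request_line is hypothetical, and the sketch assumes a well-formed line.

```ruby
# Hypothetical helper: breaks a request start-line into its three parts.
def parse_request_line(line)
  http_method, target, version = line.split(' ', 3)
  { method: http_method, target: target, version: version }
end

request_line = parse_request_line('GET /users HTTP/1.1')
puts request_line[:method]   # the HTTP method
puts request_line[:target]   # the identifier of the resource
puts request_line[:version]  # the protocol version
```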

3.5.2. The Header Fields

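The list of headers is more minimalist and contains only three header fields: Host, User-Agent and Accept. Host is a mandatory header in HTTP/1.1 requests since it contains the host (domain name or IP address) and port where the specified resource can be found. Accept tells the server which media types the client is willing to receive; */* means that any media type is fine.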
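Here is a short Ruby sketch that turns the header lines of the request shown above into a Hash. The helper name parse_headers is hypothetical, and the sketch assumes each line has the form Name: value. Header names are case-insensitive, so the keys are lowercased to make lookups predictable.

```ruby
# Hypothetical helper: turns header lines into a Hash keyed by lowercase name.
def parse_headers(lines)
  lines.to_h do |line|
    # Split on the first colon only, since values may contain colons (host:port).
    name, value = line.split(':', 2)
    [name.strip.downcase, value.strip]
  end
end

headers = parse_headers([
  'Host: localhost:4567',
  'User-Agent: curl/7.43.0',
  'Accept: */*'
])

puts headers['host']
puts headers['user-agent']
puts headers['accept']
```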

I’m sure you already know about the User-Agent; it’s a pretty well known header that contains information about the emitter of the request. In our case, it was made from curl so it includes the name of the software and its version. In the end, it looks like curl/7.43.0.

To give you another example, here is the user agent for requests sent by Google Chrome (version 41.0.2228.0):

Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko)
  Chrome/41.0.2228.0 Safari/537.36

3.5.3. The Empty Line

The goal here is the same as for HTTP responses: to separate the headers from the body.

*Empty, nothing to see.*

3.5.4. The Message-Body

Once we start working with other HTTP methods, like POST, we will see how a body is integrated inside an HTTP request.

3.6. Wrap Up

In this chapter, we learned how HTTP requests and responses are formatted thanks to curl. It allowed us to perform actual requests to our Sinatra API. Even if curl is the main tool we will use in this book, it’s not the only one that exists. HTTPie is another CLI tool that can be used to send HTTP requests, for example.

Next up, we are diving into media types!