topics: http/1.1 (post)

Talking HTTP/1.1

I was recently reminded of the insights into how the web works that I gained in my younger years by talking to a web server by hand. At the time (early 2000s), HTTP/1.1 was the de facto standard, embraced as an improvement over HTTP/1.0. HTTP versions before 2 are plain-text protocols, which makes them simple to use when interacting with a web server manually.

Making a Request

With a tool like Telnet or Netcat, we can open a connection to a web server and start speaking HTTP/1.1.

I will use Netcat (nc).

$ nc google.com 80 

With a connection established, we can start by making a GET request for the root URL using HTTP/1.1.

GET / HTTP/1.1

Next we need to specify the Host that the web request is for. A single web server may handle requests for multiple hosts; using the Host header we specify which host we are making a request to.

Host: google.com

Finally, the default behaviour in HTTP/1.1 is to keep the connection open after the request is finished. To instruct the web server to close the connection we can use the Connection header. To signal that we have no further headers to add, we add an empty line.

Connection: close
    

After having added a blank line to the request, we expect the web server to reply.

$ nc google.com 80
GET / HTTP/1.1
Host: google.com
Connection: close

HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Connection: close

<HTML><HEAD><meta http-equiv= .....
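The same exchange can be scripted. Below is a minimal sketch in Python, assuming only the standard library socket module; the function names build_request and fetch are my own and not part of any library. Note that the HTTP/1.1 specification asks for CRLF line endings, which the sketch uses, whereas text typed into nc is normally sent with a bare LF that many servers tolerate.

```python
import socket


def build_request(host: str, path: str = "/") -> bytes:
    """Assemble the three-line request typed above as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, target, version
        f"Host: {host}",         # which site on this server we want
        "Connection: close",     # ask the server to hang up when done
        "",                      # empty line: end of the headers
        "",
    ]
    # Joining with CRLF leaves the request ending in "\r\n\r\n".
    return "\r\n".join(lines).encode("ascii")


def fetch(host: str, port: int = 80, path: str = "/") -> bytes:
    """Send the request and read until the server closes the connection."""
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_request(host, path))
        received = []
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the server closed the connection
                break
            received.append(data)
    return b"".join(received)
```

Calling fetch("google.com") should return the raw bytes of a response like the one shown above, headers and body together. Reading until the connection closes only works because the request asks for Connection: close.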

We can see the response has a 301 status code (Moved Permanently), which means the resource lives at a different URL. The Location header tells us where: http://www.google.com/. The Content-Type header describes how the client should interpret the response body. The Content-Length header gives the length of the response body in bytes.
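To make the structure of a response concrete, here is a rough sketch of splitting one into its status line, headers, and body. The function name parse_response is hypothetical, and the parsing is deliberately naive: it stores headers in a plain dict, so a header that appears more than once (such as Set-Cookie) keeps only its last value.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.x response into (status, reason, headers, body)."""
    # The first empty line separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")

    # Status line looks like: HTTP/1.1 301 Moved Permanently
    _version, status, reason = status_line.split(" ", 2)

    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise to lower case.
        headers[name.strip().lower()] = value.strip()

    return int(status), reason, headers, body
```

For a response like the 301 above, the returned status would be 301 and headers["location"] would hold the URL to retry against.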

Let's go ahead and follow the suggestion by making the request to www.google.com instead of google.com:

$ nc www.google.com 80
GET / HTTP/1.1
Host: www.google.com
Connection: close

HTTP/1.1 200 OK
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
X-XSS-Protection: 0
X-Frame-Options: SAMEORIGIN
Set-Cookie: NID=215=UePS...
Accept-Ranges: none
Vary: Accept-Encoding
Connection: close
Transfer-Encoding: chunked

5206
<!doctype html><html itemscope="" itemtype="http://schema.org/WebPage" ....

Here we can see a number of new headers. There is plenty of documentation on the web explaining what each header is for and what it does. MDN Web Docs is one such resource.

Transfer-Encoding: chunked indicates that the response body will be sent in ‘chunks’ or batches. Each chunk is preceded by a line giving its length in hexadecimal: 5206 in this case, which is 20,998 bytes. A chunk of length 0 marks the end of the body. This mechanism is useful for large responses or for cases where the length of the response may not be known up front.
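A small sketch of how a client might reassemble a chunked body follows. The function name decode_chunked is my own, and the sketch assumes the whole body is already in memory; it ignores optional trailer headers after the final chunk. Chunk sizes are written in hexadecimal.

```python
def decode_chunked(body: bytes) -> bytes:
    """Reassemble a body sent with Transfer-Encoding: chunked."""
    decoded = bytearray()
    pos = 0
    while True:
        # Each chunk starts with a line holding its size in hexadecimal.
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].split(b";", 1)[0]  # drop extensions
        size = int(size_field.decode("ascii").strip(), 16)
        if size == 0:  # a zero-length chunk marks the end of the body
            break
        start = line_end + 2
        decoded += body[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data
    return bytes(decoded)
```

Given b"5\r\nhello\r\n6\r\n world\r\n0\r\n\r\n", this returns b"hello world".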

Set-Cookie: NID=.... is potentially the most interesting header in terms of web application development. When a web server response includes a Set-Cookie header, a browser will typically send the cookie back in a Cookie header on subsequent requests to the same domain. This is how being ‘logged in’ typically works. The client submits a username and password in a POST request. The web server responds with a cookie. When the client makes requests with the cookie, the web server can cross-reference the cookie value against its list of sessions and treat the request as coming from a logged-in user. A cookie used this way is typically referred to as a session cookie. There are lots of other uses of cookies, from tracking to recording user preferences.
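The session lookup described here can be sketched on the server side roughly as follows. Everything in it is illustrative: the cookie name session, the in-memory SESSIONS dict, and the functions log_in and user_for_request are assumptions of mine, and the password check is left out entirely. Only secrets and http.cookies from the Python standard library are used.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session store: session id -> username.
SESSIONS = {}


def log_in(username: str) -> str:
    """Create a session and return the Set-Cookie header line for it.

    Assumes the username and password were already verified elsewhere.
    """
    session_id = secrets.token_hex(16)  # random, hard-to-guess identifier
    SESSIONS[session_id] = username
    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["httponly"] = True  # hide the cookie from page scripts
    return cookie.output(header="Set-Cookie:")


def user_for_request(cookie_header: str):
    """Map the value of a request's Cookie header to a logged-in user."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session")
    if morsel is None:
        return None  # no session cookie was sent
    return SESSIONS.get(morsel.value)  # None if the session id is unknown
```

A request carrying the cookie issued by log_in("alice") would resolve to "alice", while a request with no cookie or an unknown session id resolves to None and is treated as anonymous.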

And then

Check out the HTTP/1.1 specification, RFC 2616 (since superseded by RFC 9110 and RFC 9112), and experiment!