This is the first part of a two-part series on HTTP security and HTTP basics. In this first part, we give you an overview of the HTTP protocol.
HTTP is a ubiquitous protocol and is one of the cornerstones of the web. If you are a newcomer to web application security, a sound knowledge of the HTTP protocol will make your life easier when interpreting findings by automated security tools, and it’s a necessity if you want to take such findings further with manual testing. What follows is a security-focused introduction to the HTTP protocol to help you get started.
HTTP is a message-based (request, response), stateless protocol whose messages are made up of headers (key-value pairs) and an optional body. Three versions of HTTP have been released so far: HTTP/1.0 (released in 1996, rarely used), HTTP/1.1 (released in 1997, widely used) and HTTP/2 (released in 2015, increasingly used).
The HTTP protocol works over the Transmission Control Protocol (TCP). TCP is one of the core protocols of the Internet protocol suite and provides reliable, ordered, and error-checked delivery of a stream of data, making it ideal for HTTP. The default port for HTTP is 80, or 443 if you’re using HTTPS (HTTP over TLS).
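As a small illustration of the default ports, the sketch below works out which TCP port a client would connect to for a given URL. The helper name `effective_port` and the `DEFAULT_PORTS` table are hypothetical names chosen for this example; only `urllib.parse.urlsplit` comes from the Python standard library.

```python
from urllib.parse import urlsplit

# Default ports mentioned above: 80 for HTTP, 443 for HTTPS.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> int:
    """Return the explicit port in the URL, or the scheme's default port."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://www.example.com/"))       # 80
print(effective_port("https://www.example.com/"))      # 443
print(effective_port("http://www.example.com:8080/"))  # 8080
```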
HTTP is a line-based protocol: each header sits on its own line, each line ends with a Carriage Return Line Feed (CRLF), and a blank line separates the head from the optional body of the request or response.
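To make the line-based layout concrete, here is a minimal sketch that splits a raw HTTP/1.x message at the first blank line and then breaks the head into its CRLF-terminated lines. The function name `split_head_body` is an assumption made for this example, and the sketch ignores details such as chunked bodies.

```python
CRLF = b"\r\n"


def split_head_body(raw: bytes):
    """Split a raw HTTP/1.x message into its head lines and its body."""
    # The first blank line (CRLF followed by CRLF) marks the end of the head.
    head, _, body = raw.partition(CRLF + CRLF)
    # The head is the start line followed by one header per line.
    lines = head.split(CRLF)
    return lines, body


raw = b"GET /about.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
lines, body = split_head_body(raw)
print(lines)  # [b'GET /about.html HTTP/1.1', b'Host: www.example.com']
print(body)   # b''
```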
Up to HTTP/1.1, HTTP was a text-based protocol. This changed with HTTP/2, which, unlike its predecessors, is a binary protocol, with most implementations requiring TLS encryption. It’s worth noting that in the vast majority of cases (and certainly for this article) interacting with the HTTP/2 protocol won’t be any different. It’s also worth mentioning that HTTP/1.1 isn’t going away anytime soon, and it’s still early days for HTTP/2 (as such, HTTP/1.1 will be referenced throughout this article).
In order to initiate an HTTP request, a client first establishes a TCP connection to a specified server on a specified port (80 or 443 by default).
The request starts with an initial line known as the request line, which contains a method (GET in the following example, more on this later), a URL (/, indicating the “root” of the host in the example below) and the HTTP version (HTTP/1.1 in the example below). We must also include a Host header to tell the server which host the request is intended for.
GET / HTTP/1.1
Host: www.example.com
The above is essentially what a browser sends when you type http://www.example.com into its URL bar (a real browser adds further headers). If we wanted to get the contents of http://www.example.com/about.html, we would send the following request instead.
GET /about.html HTTP/1.1
Host: www.example.com
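The requests shown above can be assembled and sent by hand. The following is a minimal sketch, assuming plain HTTP on port 80; `build_request` and `send_request` are hypothetical helper names, and the `Connection: close` header is an addition not shown in the examples above, included so that the server closes the connection once the response is complete.

```python
import socket


def build_request(method: str, path: str, host: str) -> bytes:
    """Build a minimal HTTP/1.1 request: request line, headers, blank line."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, URL, version
        f"Host: {host}",              # required in HTTP/1.1
        "Connection: close",          # assumption: ask the server to close afterwards
    ]
    # Every line ends in CRLF, and an extra CRLF terminates the head.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def send_request(host: str, request: bytes, port: int = 80) -> bytes:
    """Open a TCP connection, send the request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(request)
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


print(build_request("GET", "/about.html", "www.example.com").decode("ascii"))
```

Calling `send_request("www.example.com", build_request("GET", "/", "www.example.com"))` would perform the actual network exchange; it is left out of the snippet so that the example runs without network access.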
HTTP Request Methods
The HTTP protocol defines a number of HTTP request methods (sometimes also referred to as verbs), which are used within HTTP requests to indicate to the server a desired action for a particular resource.
|Method||Description|
|GET||The GET method is used to retrieve a resource from a server.|
|POST||The POST method is used to submit data to a resource.|
|TRACE||The TRACE method is used to echo back anything sent by the client. This HTTP method has historically been abused in Cross-Site Tracing (XST) attacks, typically in combination with Cross-site Scripting (XSS).|
|PATCH||The PATCH method is used to apply partial updates to a resource.|
|PUT||The PUT method is used to replace a resource.|
|HEAD||The HEAD method is used to retrieve a resource identical to that of a GET request, but without the response body.|
|DELETE||The DELETE method is used to delete the specified resource.|
|OPTIONS||The OPTIONS method is used to describe the supported HTTP methods for a resource.|
|CONNECT||The CONNECT method is used to establish a tunnel to the server specified by the target resource (used by HTTP proxies and HTTPS).|
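To show how a server might act on the request method, here is a toy dispatcher. It is a sketch under stated assumptions, not how any particular web server works: the `ALLOWED` routing table, the `/comments` path, and the `handle` function are all hypothetical.

```python
# Hypothetical routing table: which methods each resource accepts.
ALLOWED = {
    "/about.html": {"GET", "HEAD", "OPTIONS"},
    "/comments": {"GET", "HEAD", "POST", "OPTIONS"},
}


def handle(method: str, path: str):
    """Return (status_code, headers, body) for a toy, in-memory server."""
    if path not in ALLOWED:
        return 404, {}, "Not Found"
    allowed = ALLOWED[path]
    allow_header = {"Allow": ", ".join(sorted(allowed))}
    if method not in allowed:
        # Known method, but not permitted for this resource.
        return 405, allow_header, "Method Not Allowed"
    if method == "OPTIONS":
        # Describe the supported methods for the resource.
        return 200, allow_header, ""
    if method == "HEAD":
        # Same as GET, but without the response body.
        return 200, {}, ""
    return 200, {}, f"{method} handled for {path}"


print(handle("OPTIONS", "/about.html"))  # (200, {'Allow': 'GET, HEAD, OPTIONS'}, '')
print(handle("DELETE", "/about.html"))   # (405, {'Allow': 'GET, HEAD, OPTIONS'}, 'Method Not Allowed')
```

From a testing perspective, comparing the methods a resource advertises with the methods it actually accepts is a quick way to spot verbs that were left enabled by accident.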
On the server side, an HTTP server listening on port 80 sends an HTTP response back to the client for the resource it requested.
The HTTP response contains a status line as its first line, followed by the response headers. The status line indicates the version of the protocol, the status code (200 in the example below), and, usually, a description of that status code.
Additionally, the server’s HTTP response will typically include response headers (Content-Type in the example below) as well as an optional body (with a blank line separating the head of the response from the body).
HTTP/1.1 200 OK
Content-Type: text/html

<html> ... </html>
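The structure of the response above can be pulled apart with a few lines of code. This is a minimal sketch, assuming a well-formed HTTP/1.x response; `parse_response` is a hypothetical helper, and header names are lowercased because they are case-insensitive.

```python
def parse_response(raw: bytes) -> dict:
    """Parse a raw HTTP/1.x response into status line parts, headers, and body."""
    # A blank line separates the head from the optional body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: protocol version, status code, optional description.
    version, code, *reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "version": version,
        "status": int(code),
        "reason": reason[0] if reason else "",
        "headers": headers,
        "body": body,
    }


raw = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html> ... </html>"
response = parse_response(raw)
print(response["status"], response["reason"])  # 200 OK
print(response["headers"]["content-type"])     # text/html
```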
Response status codes
HTTP response status codes are issued by the server within an HTTP response to let the client know what the status of the request is. Status codes are organized in the following categories.
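|Status code group||Description|
|1xx||Informational: the request was received and processing continues.|
|2xx||Success: the request was received, understood, and accepted.|
|3xx||Redirection: further action is needed to complete the request.|
|4xx||Client error: the request contains bad syntax or cannot be fulfilled.|
|5xx||Server error: the server failed to fulfil an apparently valid request.|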
The following are some of the most relevant HTTP status codes for web application security testing. This is not a complete list of status codes.
|Status code||Description|
|200 OK||Indicates that the request has succeeded.|
|301 Moved Permanently||Indicates that the resource requested has been permanently moved to the URL within the Location response header.|
|302 Found (Temporary Redirect)||Indicates that the resource requested has been temporarily moved to the URL within the Location response header.|
|400 Bad Request||Indicates that the server could not understand the request by the client, usually due to invalid syntax.|
|401 Unauthorized||Indicates the request could not be served due to insufficient authentication.|
|403 Forbidden||Indicates that the server understood the request but refuses to authorize it.|
|404 Not Found||Indicates that the server cannot find the requested resource.|
|405 Method Not Allowed||Indicates that the request method is known by the server, but it is not allowed to be used with this resource.|
|500 Internal Server Error||Indicates that the server encountered an unexpected condition that prevented it from fulfilling the request.|
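When reading tool output, it can help to map a numeric status code back to its group and standard description. The sketch below does this with Python's built-in `http.HTTPStatus` enum; the `GROUPS` table and the `describe` function are hypothetical names introduced for this example, and the group is derived from the first digit of the code.

```python
from http import HTTPStatus

# The first digit of a status code identifies its group.
GROUPS = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code: int) -> str:
    """Return the code, its standard phrase, and its group as one string."""
    group = GROUPS.get(code // 100, "Unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # Code not known to the standard library.
        phrase = "Unregistered"
    return f"{code} {phrase} ({group})"


for code in (200, 302, 404, 500):
    print(describe(code))
```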