Authentication and Authorization
Learning Objectives
- You know the terms authentication and authorization.
- You know of the three main ways of tracking users: cookies, sessions, and tokens.
- You know of regulations regarding tracking and privacy.
Authentication
Authentication refers to the process of identifying a user. Users can be identified through multiple means, including asking about something they know, asking for something they have, or checking for something they are.
Passwords are an instance of something the users know. Asking for something they have could involve a user’s mobile phone being sent a message or a code, checking that the mobile phone is at the user’s disposal. Similarly, checking for something they are could involve biometric authentication, such as using face, iris, or fingerprint recognition, or, for example, checking the identity of the user through keystroke dynamics.
The term two-factor authentication (2FA) refers to using two different kinds of means to identify the user; using two or more is more generally called multi-factor authentication. This could involve, e.g., checking for a password (something they know) and verifying possession of the user’s phone (something they have).
In this course, we’ll work with just password-based authentication.
Authorization
Authorization refers to the process of determining what a user is allowed to do. For example, a user might be allowed to view a page, but not to edit it, or they might be allowed to view and edit a page, but not to delete it. Similarly, a user might be allowed to view a page, but only if they are authenticated (i.e., logged in to the system).
It could also be that the user’s access is restricted to a specific subset of data in the application, such as only their own data or the data that is specific to, for example, a specific group of users. As an example, in a bank application, a user should not be allowed to view (or modify) the accounts of all users, but only their own account. On the other hand, a user could be given access to a group of accounts, such as the accounts of all users in a group (e.g., their family).
Authorization is implemented on the server-side, where the server has logic that checks whether the user is allowed to perform the actions they are trying to perform. This can involve checking the routes that the user is trying to access, the methods that they are trying to use, and the data that they are trying to work with.
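As a minimal sketch of such server-side logic, the example below checks whether a user may view or modify an account, following the bank example above. The names User, Account, can_view, and can_modify are hypothetical and chosen only for illustration; a real application would load this information from its own data store.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    # Hypothetical user record: an id and the groups the user belongs to.
    id: int
    group_ids: frozenset = frozenset()


@dataclass(frozen=True)
class Account:
    # Hypothetical account record: owned by one user, optionally shared with a group.
    id: int
    owner_id: int
    group_id: Optional[int] = None


def can_view(user: Optional[User], account: Account) -> bool:
    """Return True if the user is allowed to view the account."""
    if user is None:
        # Not authenticated: the server does not know who is asking.
        return False
    if account.owner_id == user.id:
        return True
    # Access through a shared group (e.g., a family).
    return account.group_id is not None and account.group_id in user.group_ids


def can_modify(user: Optional[User], account: Account) -> bool:
    """Return True only for the authenticated owner of the account."""
    return user is not None and account.owner_id == user.id
```

In this sketch, viewing is allowed for the owner and for members of the account's group, while modifying is restricted to the owner. The server would call such checks in each route handler before touching the data.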
Authorization builds on authentication, as the server needs to know who the user is before it can determine what they are allowed to do.
Keeping track of the user
For authentication and authorization to work, there is a need to keep track of the user.
HTTP is a stateless protocol: every request made to the server is independent of other requests and is not linked to any previous or future request. When we make two GET requests to a server using HTTP, there is nothing in the requests themselves that would allow the server to determine whether they were made by the same client.
GET / HTTP/1.1
Host: myserver.net
GET / HTTP/1.1
Host: myserver.net
As there is a need to keep track of the user, there are also solutions to the problem. Currently, there are three main ways of keeping track of the user: using cookies, using sessions, and using token-based authentication.
Cookies
Cookies, initially proposed in the mid-1990s, are a mechanism for storing small amounts of data on the client, which the client then sends to the server on every request. This way, on every request, the server can check whether the request contains a cookie and, if it does, read the cookie’s value.
Cookies are implemented using HTTP headers. When a client makes a request to the server, the server can add a Set-Cookie header to the response. A Set-Cookie header could, for example, look like the following one, which creates a cookie with the name visits that has the value 0.
Set-Cookie: visits=0
Now, when a client (typically a browser) receives a response that contains the cookie, it automatically stores the cookie. Then, on every request to the same application, the browser adds the cookie to the request in a header called Cookie.
Cookie: visits=0
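The exchange above can be sketched with Python's standard http.cookies module. The helper names build_set_cookie, read_cookie, and next_visits_header are hypothetical; the sketch assumes the server wants to start the visits counter at 0 and increase it by one on each later request.

```python
from http.cookies import SimpleCookie
from typing import Optional


def build_set_cookie(name: str, value: str) -> str:
    """Build the value of a Set-Cookie header for a single cookie."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie[name].OutputString()


def read_cookie(cookie_header: str, name: str) -> Optional[str]:
    """Read one cookie value from the Cookie header sent by the client."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None


def next_visits_header(cookie_header: Optional[str]) -> str:
    """Return the Set-Cookie line the server would add to its response."""
    current = read_cookie(cookie_header or "", "visits")
    if current is not None and current.isdigit():
        count = int(current) + 1
    else:
        # No cookie yet (or an unusable value): start counting from 0.
        count = 0
    return "Set-Cookie: " + build_set_cookie("visits", str(count))
```

Because the client controls the Cookie header, the sketch treats a missing or non-numeric value the same way and simply restarts the counter.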
The basic flow of using cookies is shown in Figure 1 below.
Additional information can be attached to a cookie. This information includes the maximum age of the cookie in seconds (Max-Age), a path or a part of a path where the cookie is valid (Path), the domain or part of a domain where the cookie is valid (Domain), a version of the cookie (Version, which comes from older cookie specifications and is not part of the current one, RFC 6265), information on whether the cookie should be sent only over a secure connection, i.e., only over HTTPS (Secure), and whether the cookie should be inaccessible to JavaScript in the browser (HttpOnly).
This information is added to the Set-Cookie header after the key-value pair, with each additional attribute separated by a semicolon.
For example, the Set-Cookie header below would ask the client to store a cookie with the name name and value anonymous. The cookie is valid for 3600 seconds and is used for the domain aalto.fi.
Set-Cookie: name=anonymous; Max-Age=3600; Domain=aalto.fi
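A header with attributes can be built in the same way with Python's http.cookies module. This is a sketch only: build_cookie_header is a hypothetical helper, and the Secure and HttpOnly flags are added here as an assumption about what a server would commonly want, not because the example header above includes them.

```python
from http.cookies import SimpleCookie


def build_cookie_header(name: str, value: str, max_age: int, domain: str,
                        secure: bool = True, http_only: bool = True) -> str:
    """Build a Set-Cookie header value with common attributes."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["max-age"] = max_age      # lifetime in seconds
    morsel["domain"] = domain        # domain where the cookie is valid
    morsel["secure"] = secure        # send only over HTTPS
    morsel["httponly"] = http_only   # hide from JavaScript in the browser
    return morsel.OutputString()


header = build_cookie_header("name", "anonymous", 3600, "aalto.fi")
print("Set-Cookie: " + header)
```

The attributes appear after the key-value pair, separated by semicolons. The order in which the module emits them may differ from the order in the example above, which does not matter to the browser.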
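Cookies are stored in a register in the browser, from where they are retrieved whenever a request is made. Cookies that have an expiry time set (e.g., with Max-Age) are written to a file, which means that they persist even if the browser or the computer is restarted; cookies without an expiry time are typically discarded when the browser is closed.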
Cookies are sent as plain text in HTTP headers and can be read and modified by anyone who has access to the client. This means that values from cookies must not be trusted as such. For example, a cookie should not hold a bare user identifier that the server relies on, as such a plain-text cookie could be modified to impersonate another user.
Sessions
Sessions are a mechanism that builds on cookies. When using sessions, the value of the cookie is a long random string that is passed back and forth between the server and the client. On the server, the cookie value is resolved to an object stored on the server, which contains data related to the particular cookie and client.
Using a long random string as the value of a cookie makes it difficult to guess the value of a cookie to impersonate another user.
In essence, sessions are a mechanism that allows storing information about the client on the server. The client is still identified with a cookie, but the cookie contains a hard-to-guess reference to the data stored on the server. Such a cookie can be used to track the user across requests, while keeping sensitive data stored on the server.
Sessions are a way to store data on the server, while using cookies to identify the client.
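The idea can be sketched with an in-memory store. SessionStore is a hypothetical class written for this illustration; it assumes a single server process and keeps the data in a dictionary, whereas a real application would typically use a database or a session library.

```python
import secrets


class SessionStore:
    """Maps hard-to-guess session identifiers to data kept on the server."""

    def __init__(self):
        # session id (str) -> session data (dict)
        self._sessions = {}

    def create(self, data):
        """Store the data and return the identifier to place in a cookie."""
        session_id = secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
        self._sessions[session_id] = dict(data)
        return session_id

    def get(self, session_id):
        """Resolve the cookie value to session data, or None if unknown."""
        if session_id is None:
            return None
        return self._sessions.get(session_id)

    def destroy(self, session_id):
        """Remove the session, e.g., when the user logs out."""
        self._sessions.pop(session_id, None)


store = SessionStore()
sid = store.create({"user_id": 42})
print("Set-Cookie: session=" + sid + "; HttpOnly")
```

Only the random identifier travels in the cookie; the user identifier stays in the server-side dictionary, so changing the cookie value leads to an unknown session rather than to another user's data.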
The flow of using sessions is shown in Figure 2 below.
One of the downsides of sessions is that the server needs to store the session data. If the application has multiple servers, the user always needs to be directed to the same server, as the session data is stored on that server. Alternatively, the session data needs to be shared between the servers, e.g., by using a database.
Token-based authentication
Users can also be identified using a token. Tokens are created by the server and sent to the client, which then sends the token back to the server on every request. The server can then use the token to identify the user. In principle, cookies and sessions are a form of token-based authentication. The term token-based authentication is typically, however, used to refer to a specific type of token, such as JSON Web Tokens (JWTs).
JWTs consist of three parts: a header, a payload, and a signature, which are joined with dots. The header and payload are JSON objects that are base64url encoded; they are encoded, not encrypted, so anyone holding the token can read them. The payload contains the claims, which are statements about the entity (typically, the user) and possibly additional data. The signature is computed over the encoded header and payload using a secret key that is known only to the server, for example with HMAC-SHA256. When a token is sent to the server, the server recomputes the signature from the header and payload and compares the result with the signature in the token to verify that the token is valid and has not been modified.
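To make the structure concrete, the sketch below creates and verifies a token using only Python's standard library. It assumes the HS256 algorithm (HMAC with SHA-256) and a secret shared by nobody but the server; create_token and verify_token are hypothetical helper names. It is meant for illustration only, and in practice a maintained JWT library would usually be used instead.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data: bytes) -> str:
    """Base64url encode without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text, restoring the stripped padding first."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(signing_input: str, secret: bytes) -> str:
    """Compute the HMAC-SHA256 signature over 'header.payload'."""
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return b64url_encode(digest)


def create_token(payload: dict, secret: bytes) -> str:
    """Return a token of the form header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    encoded = [
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(encoded)
    return signing_input + "." + sign(signing_input, secret)


def verify_token(token: str, secret: bytes):
    """Return the payload if the signature matches, otherwise None."""
    try:
        header_b64, payload_b64, signature = token.split(".")
    except ValueError:
        return None  # not three dot-separated parts
    expected = sign(header_b64 + "." + payload_b64, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(b64url_decode(payload_b64))
```

If someone changes the payload, for example to claim another user's identity, the recomputed signature no longer matches and verify_token returns None. The sketch always verifies with HS256 and does not look at the algorithm named in the header.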
JWTs can be stored, e.g., in cookies or in the localStorage of the browser. When using cookies, the browser automatically adds the cookie to the request. On the other hand, if JWTs are stored in localStorage, they need to be explicitly added to the requests as a header, commonly the Authorization header.
The flow of using JWTs is shown in Figure 3 below.
When using JWTs, the server does not explicitly need to store data about the user (unlike, e.g., in the case of sessions). Instead, the server can verify the token by recomputing the signature over the header and payload. If the token is valid, the server can directly use the information from the payload to identify the user, without needing to look up the user in a database.
On the other hand, managing tokens adds complexity, including handling token expiration, token revocation, and token renewal. These are problems that are present with sessions as well, but with JWTs, the server does not store the tokens, which means that the server needs to have a way to manage the tokens without storing them.
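Expiration is commonly handled by placing an expiry time in the payload itself, so the server can reject old tokens without storing anything. The sketch below assumes the registered JWT claims iat (issued at) and exp (expiration time), both given in seconds since the Unix epoch; with_expiry and is_expired are hypothetical helper names, and the check would be run only after the token's signature has been verified.

```python
import time


def with_expiry(claims: dict, ttl_seconds: int, now=None) -> dict:
    """Return a copy of the claims with 'iat' and 'exp' added."""
    issued_at = int(time.time()) if now is None else int(now)
    return {**claims, "iat": issued_at, "exp": issued_at + ttl_seconds}


def is_expired(claims: dict, now=None) -> bool:
    """Return True if the token must no longer be accepted."""
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)):
        # A missing or malformed expiry is treated as expired in this sketch.
        return True
    return current >= exp
```

Revocation is harder: because the server keeps no record of issued tokens, withdrawing a token before it expires requires some server-side state after all, which is one reason short expiry times combined with renewal are often used.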
It is also possible to store the JWT in a cookie, in which case the browser sends it automatically with each request.
Summary
To summarize, the three main ways of keeping track of the user are cookies, sessions, and token-based authentication. The table below provides a comparison of the three methods.
Mechanism | Use Case | Advantages | Limitations |
---|---|---|---|
Cookies | Basic user tracking | Simple, widely supported | Susceptible to tampering |
Sessions | Server-stored user data | Secure data handling on the server | Requires server storage |
Tokens (JWTs) | Stateless authentication | Lightweight, no server storage required | Token management complexity |
Tracking and privacy regulations
When discussing tracking users, it is also important to mention privacy regulations.
The complete guide to General Data Protection Regulation (GDPR) compliance provides materials for individuals and businesses on the legislation regarding data and privacy. Regarding tracking, the ePrivacy Directive states that consent must be asked for and received before using any cookies or similar tracking mechanisms, except for strictly necessary ones. In addition, users should be told what the tracking is used for, and access to the service should still be allowed even if users do not consent to particular uses of tracking. Users should also be given the means to withdraw consent.
The regulations differ between countries. For countries in the EU, tracking is regulated by the EU, while in the US, there is no comprehensive country-wide regulation. Instead, in the US, the use of tracking mechanisms such as cookies is regulated by the states; as an example, California has its own data protection regulation, defined in the California Consumer Privacy Act (CCPA). The CCPA notes that users have the right to know what information is collected, the right to delete (most of) the information, the right to opt out of sharing their information, the right to correct information collected about them, and the right to limit the use of sensitive data collected about them.
In practice, websites that use tracking mechanisms inform the users of the collected data and the purpose for collecting it. Often, this information is displayed as a pop-up when visiting a site for the first time.