SaINT Project

Project Overview

Broadly defined as the Internet of Things (IoT), the growth of commodity devices that integrate physical processes with digital connectivity has had profound effects on society—smart homes, personal monitoring devices, enhanced manufacturing and other IoT applications have changed the way we live, play and work.

Yet extant IoT platforms provide few means of evaluating the use (and potential misuse) of sensitive information. Thus, consumers have little information to assess the security and privacy risks these devices present.

In this project, we present SaINT, a tool for analyzing sensitive data leakage in IoT implementations. SaINT operates in three phases: (a) translating platform-specific source code into an intermediate representation (IR) that models sensor-computation-actuator structures, (b) identifying sensitive sources and sinks, and (c) performing static analysis to identify sensitive data leakage. We also introduce IoTBench, an IoT-specific test suite and open repository for evaluating information leakage in IoT apps.

Project Description

IoT Application Structure

From our study of three IoT platforms, we found that their apps share a common structure and common types of taint sources and sinks.

Taint Sources

We classify taint sources into five groups based on information types:

Device States: The attributes of a device. An IoT app can acquire a variety of privacy-sensitive information through device state interfaces.
Device Information: IoT apps are granted access to IoT devices at install time. Our investigation reveals that the platforms often define interfaces to access device information such as the manufacturer name, ID, and model. This allows a developer to write device-specific apps.
Location: In the IoT domain, location information refers to a user’s geolocation or geographical location.
User Inputs: IoT apps often require user inputs either to manage app logic or to control devices.
State Variables: IoT apps do not store data about their previous executions. To retrieve data across executions, platforms allow apps to persist data to proprietary external storage and read it back in later executions.

Taint Propagation

An IoT app invokes actions to control its devices when a particular event occurs. Actions are invoked in event handlers and may change the state of the devices. Event handlers are not limited to device actions: apps often call other functions to implement the app logic, send messages, and log device events to an external database. It is therefore necessary to track how sensitive information propagates through an app's logic while its event handlers execute.
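
To make the idea of propagation concrete, the following minimal sketch models the body of an event handler as a list of straight-line assignments and pushes taint labels forward through them. It is an illustration only, not SaINT's implementation; the variable names, the label string, and the `(target, operands)` encoding are all assumptions made for this example.

```python
# Illustrative only: a toy event handler reduced to (target, operands) assignments.
# Every name and label below is hypothetical.

def propagate_taint(statements, initial_labels):
    """Propagate taint labels forward through straight-line assignments."""
    labels = {name: set(tags) for name, tags in initial_labels.items()}
    for target, operands in statements:
        # The target carries the union of the labels of everything it is computed from.
        merged = set()
        for operand in operands:
            merged |= labels.get(operand, set())
        labels[target] = merged
    return labels


handler_body = [
    ("lock_state", ["door_lock"]),         # read a device state
    ("text", ["lock_state", "greeting"]),  # build a message from it
    ("log_line", ["greeting"]),            # unrelated to the device state
]
labels = propagate_taint(handler_body, {"door_lock": {"device-state"}})
print(labels["text"], labels["log_line"])
```

Here `text` inherits the label of the device state it was built from, while `log_line` stays clean. Branches, loops, and calls between functions are deliberately left out of this sketch.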

Taint Sinks

Our initial analysis considers two taint sinks:

Internet: IoT apps may send sensitive data to external services or may act as web services through which external entities acquire sensitive information.
Messaging Services: IoT apps use messaging APIs to deliver push notifications to mobile-app users and to send SMS messages to designated recipients when specific events occur.

SaINT’s sources and sinks categorization in IoT apps.
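
The categorization above can be written down as a small lookup table. The sketch below encodes the five source groups and two sink groups as enums; the identifiers in the tables are placeholders invented for this example and do not correspond to any platform's actual API or to SaINT's internal lists.

```python
from enum import Enum


class SourceKind(Enum):
    DEVICE_STATE = "device state"
    DEVICE_INFO = "device information"
    LOCATION = "location"
    USER_INPUT = "user input"
    STATE_VARIABLE = "state variable"


class SinkKind(Enum):
    INTERNET = "internet"
    MESSAGING = "messaging service"


# Hypothetical identifiers, chosen only to show one entry per category.
SOURCE_TABLE = {
    "read_device_state": SourceKind.DEVICE_STATE,
    "get_manufacturer": SourceKind.DEVICE_INFO,
    "get_location": SourceKind.LOCATION,
    "user_phone_number": SourceKind.USER_INPUT,
    "persisted_counter": SourceKind.STATE_VARIABLE,
}
SINK_TABLE = {
    "http_post": SinkKind.INTERNET,
    "send_sms": SinkKind.MESSAGING,
    "send_push": SinkKind.MESSAGING,
}


def classify(identifier):
    """Return the source or sink category of an identifier, or None if it is neither."""
    if identifier in SOURCE_TABLE:
        return SOURCE_TABLE[identifier]
    return SINK_TABLE.get(identifier)


print(classify("get_location"), classify("send_sms"), classify("format_text"))
```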

SaINT Architecture

We implement the SaINT analyzer, which extracts an intermediate representation (IR) from the source code of an IoT app. The IR is used to construct the app's entry points, event handlers, and call graphs. Using these, SaINT models the lifecycle of the app and performs static taint analysis. Finally, it reports the sensitive data flows from sources to sinks found by that analysis; for each data flow, it reports the type of the sensitive information and information about the sinks.
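
As a rough picture of what such an IR might hold, the sketch below stores event subscriptions (the entry points) and a call graph, and computes which functions are reachable from one event handler. The class name, field names, and the example app are assumptions for illustration and are not taken from the SaINT code base.

```python
from dataclasses import dataclass, field


@dataclass
class AppIR:
    # event name -> handler function subscribed to it (the app's entry points)
    subscriptions: dict = field(default_factory=dict)
    # caller -> list of callees
    call_graph: dict = field(default_factory=dict)

    def reachable_from(self, handler):
        """Return every function reachable from an event handler via the call graph."""
        seen, stack = set(), [handler]
        while stack:
            fn = stack.pop()
            if fn in seen:
                continue
            seen.add(fn)
            stack.extend(self.call_graph.get(fn, []))
        return seen


# Hypothetical app: one motion event, one handler, two helpers, one dead function.
ir = AppIR(
    subscriptions={"motion.active": "on_motion"},
    call_graph={
        "on_motion": ["notify"],
        "notify": ["format_msg"],
        "unused": ["notify"],
    },
)
for event, handler in ir.subscriptions.items():
    print(event, sorted(ir.reachable_from(handler)))
```

Restricting the analysis to functions reachable from some entry point is one plausible use of this structure: `unused` is never visited because no event leads to it.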

Components of the Intermediate Representation (IR)

Working on the inter-procedural control flow graph (ICFG) of an app, SaINT's backward taint tracking proceeds in two steps: (1) it performs taint tracking backward from taint sinks to construct possible data-leak paths from sources to sinks; (2) using path- and context-sensitivity, it prunes infeasible paths to obtain the set of feasible paths, which is the output of SaINT's static taint tracking.

In the first step, SaINT starts at the sinks of the ICFG and propagates taint backward. SaINT uses the backward approach to reduce processing overhead: it starts from a few sinks instead of from a large number of sensitive sources. The ratio of sinks to sources in the analyzed IoT apps confirms this choice. Then, with the dependence relation computed and information about taint sources, SaINT constructs a set of possible data-leak paths from sources to sinks. In the second step, SaINT prunes infeasible data-leak paths using path- and context-sensitivity. For each path, it collects the evaluation results of the predicates at conditional branches and checks whether the conjunction of those predicates (i.e., the path condition) is always false; if so, the path is infeasible and is discarded.
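
The two steps can be sketched on a toy dependence graph as follows. This is a simplified illustration under stated assumptions, not SaINT's algorithm: the graph encoding, the node names, and the feasibility check are all hypothetical. In particular, the check below only detects a predicate that is required to be both true and false on the same path, which is far weaker than evaluating real path conditions.

```python
def backward_paths(deps, sink, sources):
    """Walk backward from a sink and collect dependence paths that start at a source.

    deps maps each node to the nodes its value depends on.
    """
    paths = []

    def walk(node, suffix):
        path = [node] + suffix
        if node in sources:
            paths.append(path)
            return
        for dep in deps.get(node, []):
            if dep not in path:  # do not loop forever on cyclic dependences
                walk(dep, path)

    walk(sink, [])
    return paths


def is_feasible(path_condition):
    """path_condition is a list of (predicate, outcome) pairs collected along a path.

    The path is treated as infeasible when one predicate is required to be
    both True and False, which makes the conjunction always false.
    """
    required = {}
    for predicate, outcome in path_condition:
        if required.setdefault(predicate, outcome) != outcome:
            return False
    return True


# Hypothetical app fragment: a message built from a lock state is sent by SMS.
deps = {
    "send_sms": ["msg"],
    "msg": ["lock_state", "mode"],
    "lock_state": ["door_lock"],
    "mode": [],
}
candidates = backward_paths(deps, "send_sms", sources={"door_lock"})
print(candidates)
print(is_feasible([("x_is_zero", True), ("x_is_zero", False)]))
```

Starting at the single sink means only the nodes that can actually influence it are visited; `mode` is explored but dropped because it never reaches a source.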

SaINT’s architecture

Modes of Operation of SaINT

SaINT has two modes of operation: online and offline. The online mode allows users and developers to use SaINT remotely through a web application developed as part of this project. The offline mode allows users and IoT application developers to run SaINT locally on their own machines. This demo shows the online mode of SaINT.

Output of SaINT

The following figure shows a screenshot of SaINT's analysis result (online mode) for a sample app. A warning report by SaINT contains the following information: (1) the full data flow paths between taint sources and sinks, (2) the taint labels of the sensitive data, and (3) taint sink information, including the hostname or URL and contact information. The SaINT web application is available at https://iot-saint.appspot.com/ (hosting SaINT v1.0; the web app is currently being updated to version 2.0).

SaINT web application

IoTBench

We introduce IoTBench, an IoT-specific test suite and open repository for evaluating information leakage in IoT apps. We modeled our test suite on those designed for mobile systems and the smart grid, which have been widely adopted by the security community. IoTBench currently includes 19 hand-crafted malicious SmartThings apps that contain data leaks. Sixteen apps have a single data leak and three have multiple data leaks, for a total of 27 data leaks via either Internet or messaging service sinks. We carefully crafted the IoTBench apps based on official and third-party apps. They include data leaks whose accurate identification through program analysis requires handling multiple entry points, state variables, call by reflection, and field sensitivity. Each app in IoTBench also comes with ground truth describing the data leaks it contains; this is provided as comment blocks in the app's source code. IoTBench can be used to evaluate both static and dynamic taint analysis tools designed for SmartThings apps; it enables assessing a tool's accuracy and effectiveness against the ground truth included in the suite. IoTBench apps can be accessed from our GitHub repository.
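
One way a tool's output could be scored against such ground truth is sketched below. It assumes the ground truth has already been extracted from the comment blocks into `(app, source label, sink kind)` tuples; that tuple format, the app names, and the choice of precision and recall as metrics are assumptions for this example, not part of IoTBench.

```python
def score(reported, ground_truth):
    """Return (precision, recall) of reported leaks against ground-truth leaks."""
    reported, ground_truth = set(reported), set(ground_truth)
    true_positives = reported & ground_truth
    # By convention in this sketch, an empty set scores 1.0 rather than dividing by zero.
    precision = len(true_positives) / len(reported) if reported else 1.0
    recall = len(true_positives) / len(ground_truth) if ground_truth else 1.0
    return precision, recall


# Hypothetical data: three known leaks, of which the tool finds two plus one false alarm.
ground_truth = {
    ("app1", "device-state", "internet"),
    ("app1", "location", "messaging"),
    ("app2", "user-input", "internet"),
}
reported = {
    ("app1", "device-state", "internet"),
    ("app2", "user-input", "internet"),
    ("app2", "location", "internet"),
}
precision, recall = score(reported, ground_truth)
print(round(precision, 2), round(recall, 2))
```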

Groups

This work has been possible thanks to the research collaboration between:

Systems and Internet Infrastructure Security Lab (SIIS), Pennsylvania State University.
Cyber-Physical Systems Security Lab (CSL), Florida International University.