The growth of commodity devices that integrate physical processes with digital connectivity, broadly defined as the Internet of Things (IoT), has had profound effects on society: smart homes, personal monitoring devices, enhanced manufacturing, and other IoT applications have changed the way we live, play, and work.
Yet extant IoT platforms provide few means of evaluating the use (and potential misuse) of sensitive information. Thus, consumers have little information to assess the security and privacy risks these devices present.
In this project, we present SaINT, a tool for analyzing sensitive data leakage in IoT implementations. SaINT operates in three phases: (a) translating platform-specific source code into an intermediate representation (IR) that models sensor-computation-actuator structures, (b) identifying sensitive sources and sinks, and (c) performing static analysis to identify sensitive data leakage. We also introduce IoTBench, an IoT-specific test suite and open repository for evaluating information leakage in IoT apps.
IoT Application Structure
From our study of three IoT platforms, we found that their apps share a common structure and common types of taint sources and sinks.
We classify taint sources into five groups based on information types:
- Device States: Device states are the attributes of a device. An IoT app can acquire a variety of privacy-sensitive information through device state interfaces.
- Device Information: IoT apps are granted access to IoT devices at install time. Our investigations reveal that the platforms often define interfaces to access device information such as the manufacturer name, ID, and model. This allows a developer to write device-specific apps.
- Location: In the IoT domain, location information refers to a user’s geolocation or geographical location.
- User Inputs: IoT apps often require user inputs either to manage app logic or to control devices.
- State Variables: IoT apps do not store data about their previous executions. To retrieve data across executions, platforms allow apps to persist data to proprietary external storage and retrieve it in later executions.
An IoT app invokes actions to control its devices when a particular event occurs. Actions are invoked in event handlers and may change the state of the devices. Event handlers are not limited to implementing device actions: apps often call other functions to implement app logic, send messages, and log device events to an external database. It is therefore necessary to track how sensitive information propagates through an app’s logic during the execution of event handlers.
Our initial analysis also uses two taint sinks:
- Internet: IoT apps may send sensitive data to external services or may act as web services through which external entities acquire sensitive information.
- Messaging Services: IoT apps use messaging APIs to deliver push notifications to mobile-app users and to send SMS messages to designated recipients when specific events occur.
SaINT’s sources and sinks categorization in IoT apps.
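The categorization above can be written down as a small lookup table. The following Python sketch is only an illustration: the enum names mirror the five source groups and two sink groups listed above, while every API name in `API_LABELS` is invented for this example and does not correspond to any particular platform or to SaINT's own implementation.

```python
from enum import Enum


class TaintSource(Enum):
    """The five source groups listed above."""
    DEVICE_STATE = "device state"
    DEVICE_INFO = "device information"
    LOCATION = "location"
    USER_INPUT = "user input"
    STATE_VARIABLE = "state variable"


class TaintSink(Enum):
    """The two sink groups listed above."""
    INTERNET = "internet"
    MESSAGING = "messaging service"


# Hypothetical API names, invented for this sketch; real platforms differ.
API_LABELS = {
    "read_device_state": TaintSource.DEVICE_STATE,
    "get_manufacturer_name": TaintSource.DEVICE_INFO,
    "get_geolocation": TaintSource.LOCATION,
    "read_user_input": TaintSource.USER_INPUT,
    "load_persisted_value": TaintSource.STATE_VARIABLE,
    "http_post": TaintSink.INTERNET,
    "send_sms": TaintSink.MESSAGING,
    "send_push": TaintSink.MESSAGING,
}


def classify_call(name):
    """Return the source or sink label for a call, or None if it is neither."""
    return API_LABELS.get(name)


print(classify_call("get_geolocation"))
print(classify_call("send_sms"))
print(classify_call("compute_average"))
```

A table of this kind keeps the platform-specific API names separate from the platform-independent categories, which is one plausible way to support several platforms with a single analysis.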
We implement the SaINT analyzer, which extracts an intermediate representation (IR) from the source code of an IoT app. The IR is used to construct an app’s entry points, event handlers, and call graphs. Using these, SaINT models the lifecycle of an app and performs static taint analysis. Finally, based on the static taint analysis, it reports sensitive data flows from sources to sinks; for each data flow, it reports the type of the sensitive information as well as information about the sink.
Components of the Intermediate Representation (IR)
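To make the role of entry points, event handlers, and call graphs concrete, here is a minimal Python sketch of an IR-like structure. It is an assumption-laden illustration, not SaINT's actual IR: the `Subscription` and `AppIR` classes, their fields, and all device, event, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subscription:
    """A device event bound to the handler that runs when it fires."""
    device: str
    event: str
    handler: str


@dataclass
class AppIR:
    subscriptions: list   # list of Subscription
    call_graph: dict      # function name -> list of callee names

    def entry_points(self):
        """Event handlers are the places where execution of the app can start."""
        return sorted({s.handler for s in self.subscriptions})

    def reachable_from(self, entry):
        """Collect every function reachable from one entry point (iterative DFS)."""
        seen, stack = set(), [entry]
        while stack:
            fn = stack.pop()
            if fn in seen:
                continue
            seen.add(fn)
            stack.extend(self.call_graph.get(fn, []))
        return seen


# Hypothetical app: two subscriptions, each handler calling one helper.
ir = AppIR(
    subscriptions=[
        Subscription("front_door_lock", "unlocked", "on_unlock"),
        Subscription("motion_sensor", "active", "on_motion"),
    ],
    call_graph={
        "on_unlock": ["notify_owner"],
        "on_motion": ["log_event"],
        "notify_owner": ["send_sms"],
        "log_event": ["http_post"],
    },
)

for handler in ir.entry_points():
    print(handler, sorted(ir.reachable_from(handler)))
```

Because an IoT app has no single main function, treating every subscribed handler as an entry point is what lets an analysis cover all the ways the app can be triggered.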
Starting from the inter-procedural control flow graph (ICFG) of an app, SaINT’s backward taint tracking proceeds in two steps: (1) it tracks taint backward from taint sinks to construct possible data-leak paths from sources to sinks; (2) using path- and context-sensitivity, it then prunes infeasible paths to construct a set of feasible paths, which are the output of SaINT’s static taint tracking.
In the first step, SaINT starts at the sinks of the ICFG and propagates taint backward. SaINT uses the backward approach to reduce processing overhead: it starts from a few sinks rather than from a large number of sensitive sources. The ratio of sinks to sources in the analyzed IoT apps supports this choice. Then, with the dependence relation computed and information about taint sources, SaINT constructs a set of possible data-leak paths from sources to sinks. In the second step, SaINT prunes infeasible data-leak paths using path- and context-sensitivity. For each path, it collects the evaluation results of the predicates at conditional branches and checks whether the conjunction of those predicates (i.e., the path condition) is always false; if so, the path is infeasible and is discarded.
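The two steps can be illustrated with a deliberately simplified Python sketch. It assumes each candidate path is already given as a straight-line list of assignments plus the branch predicates taken along it, and it replaces real path-condition checking with a purely syntactic contradiction test, which is much weaker than what a full analysis would need. All variable names, predicates, and the `paths` data are hypothetical.

```python
# Variables treated as taint sources, mapped to their label (hypothetical names).
SOURCES = {"lock_state": "device state", "home_location": "location"}


def backward_taint(statements, sink_args):
    """Walk statements in reverse, starting from the sink arguments.

    Each statement is (target, [operands]). Returns the sorted source labels
    that the sink arguments depend on.
    """
    wanted = set(sink_args)
    for target, operands in reversed(statements):
        if target in wanted:
            wanted.discard(target)    # this definition explains the variable
            wanted.update(operands)   # so track what it was computed from
    return sorted({SOURCES[v] for v in wanted if v in SOURCES})


def is_feasible(path_condition):
    """Reject a path that requires one predicate to be both true and false.

    path_condition is a list of (predicate_text, required_value) pairs.
    """
    required = {}
    for predicate, value in path_condition:
        if required.setdefault(predicate, value) != value:
            return False
    return True


def find_leaks(paths):
    """Step 1: backward tracking per path. Step 2: drop infeasible paths."""
    leaks = []
    for path in paths:
        labels = backward_taint(path["statements"], path["sink_args"])
        if labels and is_feasible(path["condition"]):
            leaks.append((path["sink"], labels))
    return leaks


paths = [
    {   # feasible: the lock state flows into an SMS body
        "condition": [("mode == 'away'", True)],
        "statements": [("status", ["lock_state"]), ("msg", ["status"])],
        "sink": "send_sms",
        "sink_args": ["msg"],
    },
    {   # infeasible: the path needs `debug` to be both true and false
        "condition": [("debug", True), ("debug", False)],
        "statements": [("payload", ["lock_state"])],
        "sink": "http_post",
        "sink_args": ["payload"],
    },
]

leaks = find_leaks(paths)
print(leaks)
```

Starting from `sink_args` means the walk only ever follows variables that can reach a sink, which mirrors the motivation given above for tracking backward from a few sinks rather than forward from many sources.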
Modes of Operation of SaINT
SaINT has two modes of operation: online and offline. The online mode allows users and developers to use SaINT remotely through a web application developed as part of this project. The offline mode allows users and IoT application developers to run SaINT locally on their own machines. In this demo, we show the online mode of SaINT.
Output of SaINT
The following figure shows a screenshot of the result of a SaINT analysis (online mode) on a sample app. A warning report by SaINT contains the following information: (1) full data flow paths between taint sources and sinks, (2) the taint labels of sensitive data, and (3) taint sink information, including the hostname or URL and contact information. The SaINT web application is available at http://saint-project.appspot.com/.
SaINT web application
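As a rough illustration of the three pieces of information in a warning report, the Python sketch below models one warning as a record. The class name, field names, and rendered text layout are assumptions made for this example; they are not SaINT's actual output format.

```python
from dataclasses import dataclass


@dataclass
class LeakWarning:
    flow_path: list      # (1) ordered steps from taint source to sink
    taint_labels: list   # (2) kinds of sensitive data carried on the path
    sink: str            # (3) sink category
    sink_target: str     # (3) hostname, URL, or contact information

    def render(self):
        """Format the warning as a single human-readable line."""
        return (
            f"[{', '.join(self.taint_labels)}] "
            f"{' -> '.join(self.flow_path)} "
            f"=> {self.sink} ({self.sink_target})"
        )


# Hypothetical warning; the placeholder stands in for a real recipient.
warning = LeakWarning(
    flow_path=["lock_state", "status", "msg"],
    taint_labels=["device state"],
    sink="messaging service",
    sink_target="<recipient phone number>",
)
print(warning.render())
```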
Project Sponsor: National Science Foundation
Project Duration: 09/01/13-08/31/17