IEEE Symposium on Security and Privacy (2023)
Scaphy: Detecting Modern ICS Attacks by Correlating Behaviors in SCADA and PHYsical.
Authors
- Moses Ike, Georgia Institute of Technology; Sandia National Laboratories
- Kandy Phan, Sandia National Laboratories
- Keaton Sadoski, Sandia National Laboratories
- Romuald Valme, Sandia National Laboratories
- Wenke Lee, Georgia Institute of Technology
Abstract
Modern Industrial Control Systems (ICS) attacks evade existing tools by using knowledge of ICS processes to blend their activities with benign Supervisory Control and Data Acquisition (SCADA) operation, causing physical-world damage. We present Scaphy, which detects ICS attacks in SCADA by leveraging SCADA's unique execution phases to identify the limited set of legitimate behaviors used to control the physical world in each phase, distinguishing them from an attacker's activities. For example, it is typical for SCADA to set up ICS device objects during initialization, but anomalous to do so during process control. To extract the unique behaviors of SCADA's execution phases, Scaphy first leverages open ICS conventions to generate a novel physical process dependency and impact graph (PDIG), which identifies disruptive physical states. Scaphy then uses the PDIG to inform a physical-process-aware dynamic analysis, in which code paths of SCADA's process-control execution are induced to reveal API call behaviors unique to legitimate process-control phases. Using this established behavior, Scaphy selectively monitors an attacker's physical-world-targeted activities that violate legitimate process-control behavior. We evaluated Scaphy at a U.S. national laboratory ICS testbed. Across diverse ICS deployment scenarios and attacks spanning four ICS industries, Scaphy achieved 95% accuracy and 3.5% false positives (FP), compared to 47.5% accuracy and 25% FP for existing work. We also analyze Scaphy's resilience to future attacks in which the attacker knows our approach.
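The phase-aware monitoring idea can be illustrated with a minimal sketch (not Scaphy's actual implementation): API calls are checked against per-phase allowlists assumed to have been learned from benign SCADA runs. The phase names and API call names below are hypothetical.

```python
# Illustrative sketch of phase-aware behavior checking. The per-phase
# allowlists here are hypothetical stand-ins for behaviors learned from
# benign SCADA execution.
LEGITIMATE_CALLS = {
    "initialization": {"CreateDeviceObject", "OpenSession", "ReadConfig"},
    "process_control": {"ReadTag", "WriteTag", "OpenSession"},
}

def check_call(phase: str, api_call: str) -> bool:
    """Return True if the call is consistent with the current execution phase."""
    return api_call in LEGITIMATE_CALLS.get(phase, set())

# Setting up device objects is normal during initialization...
assert check_call("initialization", "CreateDeviceObject")
# ...but anomalous during process control, as in the abstract's example.
assert not check_call("process_control", "CreateDeviceObject")
```

The same call can thus be benign or suspicious depending on when it occurs, which is what lets a detector ignore legitimate operator activity.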
Shedding Light on Inconsistencies in Grid Cybersecurity: Disconnects and Recommendations.
Authors
- Brian Singer, Carnegie Mellon University
- Amritanshu Pandey, Carnegie Mellon University
- Shimiao Li, Carnegie Mellon University
- Lujo Bauer, Carnegie Mellon University
- Craig Miller, Carnegie Mellon University
- Lawrence Pileggi, Carnegie Mellon University
- Vyas Sekar, Carnegie Mellon University
Abstract
The operational, academic, and policy communities disagree on which threats against the power grid are likely and what damage would ensue. For instance, the feasibility and impact of MadIoT-style attacks are being actively debated. By surveying grid experts (N=18), we find that disagreements are not unique to MadIoT attacks but occur across multiple well-studied grid threats. Based on prior work and our survey, we hypothesize that the disagreements stem from inconsistencies in how grid threats are modeled. We identify five likely causes of modeling inconsistencies: 1) using unrealistic grid topologies, 2) assuming unrealistic capabilities for attackers, 3) exploring too few grid scenarios, 4) using incomplete simulators that omit relevant grid processes, and 5) using simulators that incorrectly model key grid processes. To check these hypotheses, we create a modeling framework and examine how these factors change our understanding of the feasibility and impact of grid threats. We use four diverse grid threats as case studies: MadIoT, False Data Injection Attacks, Substation Circuit Breaker Takeover, and Power Plant Takeover. We find that each of our hypothesized causes of modeling inconsistencies has a significant effect on modeling the outcomes of attacks. For example, we find that MadIoT attacks are much less feasible and require significantly more high-wattage IoT devices on realistic topologies than on topologies previously used to model them. In contrast, we find that Substation Circuit Breaker Takeover attacks are much more feasible in emergency scenarios and may require significantly fewer substations for failure than previous modeling suggested. We conclude with actionable recommendations for accurately assessing the impact of threats against the grid.
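The feasibility question for MadIoT-style attacks ultimately reduces to load arithmetic: how many compromised high-wattage devices are needed to add a given amount of synchronized demand? A minimal sketch with purely hypothetical numbers follows (the paper's point is precisely that realistic topologies and scenarios change such estimates substantially).

```python
# Back-of-the-envelope arithmetic for a MadIoT-style demand attack.
# All numbers below are illustrative, not drawn from the paper.
def devices_needed(target_increase_mw: float, device_watts: float) -> int:
    """Compromised devices needed to add target_increase_mw of demand,
    assuming each device draws device_watts when switched on together."""
    return int(target_increase_mw * 1e6 / device_watts)

# e.g. adding 300 MW of demand with 3 kW devices (water heaters, ovens):
assert devices_needed(300, 3000) == 100_000
```

Whether 100,000 synchronized devices actually destabilize a grid depends on topology, reserves, and protection schemes, which is where modeling assumptions dominate.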
Red Team vs. Blue Team: A Real-World Hardware Trojan Detection Case Study Across Four Modern CMOS Technology Generations.
Authors
- Endres Puschner, Max Planck Institute for Security and Privacy, Germany
- Thorben Moos, Université Catholique de Louvain, Belgium
- Steffen Becker, Max Planck Institute for Security and Privacy, Germany; Ruhr University Bochum, Germany
- Christian Kison, Bundeskriminalamt, Germany
- Amir Moradi, Université Catholique de Louvain, Belgium
- Christof Paar, Max Planck Institute for Security and Privacy, Germany
Abstract
Verifying the absence of maliciously inserted Trojans in Integrated Circuits (ICs) is a crucial task – especially for security-enabled products. Depending on the concrete threat model, different techniques can be applied for this purpose. Assuming that the original IC layout is benign and free of backdoors, the primary security threats are usually identified as the outsourced manufacturing and transportation. To ensure the absence of Trojans in commissioned chips, one straightforward solution is to compare the received semiconductor devices to the design files that were initially submitted to the foundry. Clearly, conducting such a comparison requires advanced laboratory equipment and qualified experts. Nevertheless, the fundamental techniques to detect Trojans that require evident changes to the silicon layout are by now well understood. Despite this, there is a glaring lack of public case studies describing the process in its entirety while making the underlying datasets publicly available. In this work, we aim to improve upon this state of the art by presenting a public and open hardware Trojan detection case study based on four different digital ICs using a Red Team vs. Blue Team approach. Here, the Red Team creates small changes acting as surrogates for inserted Trojans in the layouts of 90 nm, 65 nm, 40 nm, and 28 nm ICs. The quest of the Blue Team is to detect all differences between digital layout and manufactured device by means of a GDSII–vs–SEM-image comparison. Can the Blue Team perform this task efficiently? Our results spark optimism for the Trojan seekers and answer common questions about the efficiency of such techniques for relevant IC sizes. Further, they allow us to draw conclusions about the impact of technology scaling on the detection performance.
SoK: Distributed Randomness Beacons.
Authors
- Kevin Choi, New York University
- Aathira Manoj, New York University
- Joseph Bonneau, New York University; a16z Crypto Research
Abstract
Motivated and inspired by the emergence of blockchains, many new protocols have recently been proposed for generating publicly verifiable randomness in a distributed yet secure fashion. These protocols work under different setups and assumptions, use various cryptographic tools, and entail unique trade-offs and characteristics. In this paper, we systematize the design of distributed randomness beacons (DRBs) as well as the cryptographic building blocks they rely on. We evaluate protocols on two key security properties, unbiasability and unpredictability, and discuss common attack vectors for predicting or biasing the beacon output and the countermeasures employed by protocols. We also compare protocols by communication and computational efficiency. Finally, we provide insights on the applicability of different protocols in various deployment scenarios and highlight possible directions for further research.
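One of the simplest DRB designs in the space this SoK covers is commit-reveal, sketched below. It also makes the standard bias vector concrete: the last party to reveal can compute the pending output and withhold its share, which is exactly the kind of attack and countermeasure trade-off the paper compares across protocols.

```python
import hashlib
import secrets

# Minimal commit-reveal randomness beacon sketch (an illustration, not any
# specific deployed protocol).
def commit(share: bytes) -> bytes:
    # Hash commitment binds a party to its share before the reveal phase.
    return hashlib.sha256(share).digest()

def beacon_output(shares: list) -> bytes:
    # XOR of all revealed shares: unpredictable if any one share is random,
    # but unbiasable only if every committer is forced to reveal.
    out = bytes(len(shares[0]))
    for s in shares:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

shares = [secrets.token_bytes(32) for _ in range(3)]
commitments = [commit(s) for s in shares]
# Reveal phase: each opened share is checked against its commitment.
assert all(commit(s) == c for s, c in zip(shares, commitments))
assert len(beacon_output(shares)) == 32
```

Countermeasures surveyed in the paper (e.g., verifiable secret sharing or verifiable delay functions) exist largely to remove this withholding advantage.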
WeRLman: To Tackle Whale (Transactions), Go Deep (RL).
Authors
- Roi Bar-Zur, Technion, IC3
- Ameer Abu-Hanna, Technion
- Ittay Eyal, Technion, IC3
- Aviv Tamar, Technion
Abstract
The security of proof-of-work blockchain protocols critically relies on incentives. Their operators, called miners, receive rewards for creating blocks containing user-generated transactions. Each block rewards its creator with newly minted tokens and with transaction fees paid by the users. The protocol's stability is violated if any miner surpasses a threshold ratio of the computational power; she is then motivated to deviate with selfish mining and increase her rewards. Previous analyses of selfish mining strategies assumed constant rewards. But with statistics from operational systems, we show that there are occasional whales – blocks with exceptional rewards. Modeling this behavior implies a state space that grows exponentially with the parameters, becoming prohibitively large for existing analysis tools. We present the WeRLman framework to analyze such models. WeRLman uses deep Reinforcement Learning (RL), inspired by the state-of-the-art AlphaGo Zero algorithm. Directly extending AlphaGo Zero to a stochastic model leads to high sampling noise, which is detrimental to the learning process. Therefore, WeRLman employs novel variance-reduction techniques that exploit the recurrent nature of the system and prior knowledge of transition probabilities. Evaluating WeRLman against models we can solve accurately demonstrates that it achieves unprecedented accuracy in deep RL for blockchain. We use WeRLman to analyze the incentives of a rational miner in various settings and to upper-bound the security threshold of Bitcoin-like blockchains. We show, for the first time, a negative relationship between fee variability and the security threshold. The previously known bound, with constant rewards, stands at 0.25 [2]. We show that considering whale transactions reduces this threshold considerably. In particular, with Bitcoin's historical fees and its future minting policy, its threshold for deviation will drop to 0.2 in 10 years, 0.17 in 20 years, and 0.12 in 30 years. With recent fees from the Ethereum smart-contract platform, the threshold drops to 0.17. These are below the common sizes of large miners [3].
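The whale phenomenon the abstract describes can be sketched as a simple reward-sampling model: most blocks carry a base reward, while occasional whale blocks carry exceptional fees. The parameters below are illustrative, not fitted to the paper's Bitcoin statistics.

```python
import random

# Toy model of the reward process WeRLman analyzes. Parameters are
# hypothetical; the paper fits the real distribution from chain data.
def sample_block_reward(base=1.0, whale_bonus=10.0, whale_prob=0.05, rng=None):
    """Base reward, plus a rare 'whale' fee bonus with probability whale_prob."""
    rng = rng or random
    return base + (whale_bonus if rng.random() < whale_prob else 0.0)

rng = random.Random(0)
rewards = [sample_block_reward(rng=rng) for _ in range(10_000)]
mean = sum(rewards) / len(rewards)
# With constant rewards the mean would be exactly 1.0; whales raise both
# the mean and (crucially for the security threshold) the variance.
assert mean > 1.0
```

It is this reward variance, absent from constant-reward analyses, that enlarges the state space and motivates the deep-RL approach.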
Three Birds with One Stone: Efficient Partitioning Attacks on Interdependent Cryptocurrency Networks.
Authors
- Muhammad Saad, PayPal
- David Mohaisen, PayPal
Abstract
The biased distribution of cryptocurrency nodes across Autonomous Systems (ASes) increases the risk of spatial partitioning attacks, allowing an adversary to isolate nodes by hijacking AS prefixes. Prior works on spatial partitioning attacks have mainly focused on the Bitcoin network, showing that the prominent cryptocurrency network can be paralyzed by disrupting the physical topology through BGP hijacks. Despite the persisting threat of BGP hijacks, Bitcoin and other cryptocurrencies have not been frequently targeted, likely due to their shielded overlay topology, which limits the exposure of physical network anomalies. In this paper, we present a new perspective by examining the security of cryptocurrency networks, considering shared network resources (network interdependence). We conduct measurements extending beyond the Bitcoin network and analyze commonalities in Bitcoin, Ethereum, and Ripple node hosting patterns. We observe that all three networks are highly centralized, predominantly sharing common ASes. We also note that among the three cryptocurrencies, Ripple does not shield its overlay topology, which can be exploited to learn about physical network anomalies. The observed network anomalies present practical attack strategies that can be launched to target all three cryptocurrencies simultaneously. We supplement our analysis by surveying recent BGP attacks on high-profile ASes and recognizing a need for application-level countermeasures. We propose attack countermeasures that reduce the risk of spatial partitioning, notwithstanding the increasing centralization of nodes and network interdependence.
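The centralization measurement underlying the abstract can be sketched as a simple top-k concentration metric: what fraction of a network's nodes sit in its k largest hosting ASes? The node-to-AS sample below is made up for illustration (the ASNs are real cloud/hosting providers, but the counts are invented).

```python
from collections import Counter

# Sketch of an AS-concentration metric; input data is hypothetical.
def top_k_share(node_asns: list, k: int) -> float:
    """Fraction of nodes hosted in the k most popular ASes."""
    counts = Counter(node_asns)
    top = sum(c for _, c in counts.most_common(k))
    return top / len(node_asns)

# Hypothetical sample: 100 nodes, heavily concentrated in a few ASes.
nodes = [16509] * 40 + [14618] * 30 + [24940] * 20 + [64512] * 10
assert top_k_share(nodes, 3) == 0.9   # 90% of nodes in just three ASes
```

When the same few ASes dominate several cryptocurrencies at once, a single set of prefix hijacks can partition all of them, which is the "three birds with one stone" observation.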
Bitcoin-Enhanced Proof-of-Stake Security: Possibilities and Impossibilities.
Authors
- Ertem Nusret Tas, Stanford University
- David Tse, Stanford University
- Fangyu Gai, BabylonChain
- Sreeram Kannan, University of Washington, Seattle
- Mohammad Ali Maddah-Ali, University of Minnesota
- Fisher Yu, BabylonChain
Abstract
Bitcoin is the most secure blockchain in the world, supported by the immense hash power of its Proof-of-Work miners. Proof-of-Stake chains are energy-efficient and have fast finality, but face several security issues: susceptibility to non-slashable long-range safety attacks, low liveness resilience, and difficulty bootstrapping from a low token valuation. We show that these security issues are inherent in any PoS chain without an external trusted source, and we propose a new protocol, Babylon, in which an off-the-shelf PoS protocol checkpoints onto Bitcoin to resolve these issues. An impossibility result justifies the optimality of Babylon. One use case of Babylon is reducing the stake-withdrawal delay: our experimental results show that this delay can be reduced from weeks in existing PoS chains to less than 5 hours using Babylon, at a transaction cost of less than 10K USD per annum for posting the checkpoints onto Bitcoin.
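The checkpointing mechanism can be sketched in a heavily simplified form: the PoS chain periodically posts a hash of its latest finalized block into a Bitcoin transaction, and clients reject any later long-range fork that conflicts with a posted checkpoint. This omits Babylon's actual signature and timestamping details.

```python
import hashlib

# Simplified Babylon-style checkpointing sketch (block contents and the
# Bitcoin-posting step are abstracted away).
def checkpoint(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()

canonical_chain = [b"genesis", b"block1", b"block2"]
posted = checkpoint(canonical_chain[-1])   # imagine this lands on Bitcoin

# A long-range attacker rewrites history using old (non-slashable) keys...
forked_chain = [b"genesis", b"block1'", b"block2'"]
# ...but clients comparing against the Bitcoin checkpoint detect the fork.
assert checkpoint(forked_chain[-1]) != posted
assert checkpoint(canonical_chain[-1]) == posted
```

Because rewriting the checkpoint itself would require rewriting Bitcoin, the PoS chain inherits Bitcoin's safety against long-range attacks.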
MEGA: Malleable Encryption Goes Awry.
Authors
- Matilda Backendal, ETH Zurich, Zurich, Switzerland
- Miro Haller, ETH Zurich, Zurich, Switzerland
- Kenneth G. Paterson, ETH Zurich, Zurich, Switzerland
Abstract
MEGA is a leading cloud storage platform with more than 250 million users and 1000 petabytes of stored data. MEGA claims to offer user-controlled, end-to-end security. This is achieved by having all data encryption and decryption operations done on MEGA clients, under the control of keys that are only available to those clients. This is intended to protect MEGA users from attacks by MEGA itself, or by adversaries who have taken control of MEGA's infrastructure. We provide a detailed analysis of MEGA's use of cryptography in such a malicious server setting. We present five distinct attacks against MEGA, which together allow for a full compromise of the confidentiality of user files. Additionally, the integrity of user data is damaged to the extent that an attacker can insert malicious files of their choice which pass all authenticity checks of the client. We built proof-of-concept versions of all the attacks. Four of the five attacks are eminently practical. They have all been responsibly disclosed to MEGA and remediation is underway. Taken together, our attacks highlight significant shortcomings in MEGA's cryptographic architecture. We present immediately deployable countermeasures, as well as longer-term recommendations. We also provide a broader discussion of the challenges of cryptographic deployment at massive scale under strong threat models.
DBREACH: Stealing from Databases Using Compression Side Channels.
Authors
- Mathew Hogan, Stanford University
- Yan Michalevsky, Anjuna Security, Inc. and Cryptosat, Inc.
- Saba Eskandarian, UNC Chapel Hill
Abstract
We introduce new compression side-channel attacks against database storage engines that simultaneously support compression of database pages and encryption at rest. Given only limited, indirect access to an encrypted and compressed database table, our attacks extract arbitrary plaintext with high accuracy. We demonstrate accurate and performant attacks on the InnoDB storage engine variants found in MariaDB and MySQL as well as the WiredTiger storage engine for MongoDB. Our attacks overcome obstacles unique to the database setting that render previous techniques developed to attack TLS ineffective. Unlike the web setting, where the exact length of a compressed and encrypted message can be observed, we make use of only approximate ciphertext size information gleaned from file sizes on disk. We amplify this noisy signal and combine it with new attack heuristics tailored to the database setting to extract secret plaintext. Our attacks can detect whether a random string appears in a table with > 90% accuracy and extract 10-character random strings from encrypted tables with > 95% success.
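The core leak can be shown with a toy CRIME-style compression oracle, far simpler than DBREACH itself (which must work from noisy on-disk file sizes rather than exact lengths): attacker-controlled text that repeats the secret compresses better. The secret value and field name below are made up.

```python
import zlib

# Toy compress-then-encrypt side channel. In the real attack the observable
# would be an approximate encrypted-page file size, not an exact length.
SECRET_ROW = "user_token=7f3a9"   # hypothetical secret stored in the table

def compressed_size(attacker_guess: str) -> int:
    """Size of the compressed page containing the secret plus a guess."""
    page = (SECRET_ROW + "|" + attacker_guess).encode()
    return len(zlib.compress(page, 9))

# A correct guess compresses strictly better than a wrong one, because
# DEFLATE replaces the repeated secret with a short back-reference.
assert compressed_size("user_token=7f3a9") < compressed_size("user_token=qqqqq")
```

Iterating this comparison character by character recovers the secret; DBREACH's contribution is making that signal usable despite page-level noise.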
Weak Fiat-Shamir Attacks on Modern Proof Systems.
Authors
- Quang Dao, Carnegie Mellon University
- Jim Miller, Trail of Bits
- Opal Wright, Trail of Bits
- Paul Grubbs, University of Michigan
Abstract
A flurry of excitement amongst researchers and practitioners has produced modern proof systems built using novel technical ideas and seeing rapid deployment, especially in cryptocurrencies. Most of these modern proof systems use the Fiat-Shamir (F-S) transformation, a seminal method of removing interaction from a protocol with a public-coin verifier. Some prior work has shown that incorrectly applying F-S (i.e., using the so-called "weak" F-S transformation) can lead to breaks of classic protocols like Schnorr's discrete log proof; however, little is known about the risks of applying F-S incorrectly for modern proof systems seeing deployment today. In this paper, we fill this knowledge gap via a broad theoretical and practical study of F-S in implementations of modern proof systems. We perform a survey of open-source implementations and find 30 weak F-S implementations affecting 12 different proof systems. For four of these—Bulletproofs, Plonk, Spartan, and Wesolowski's VDF—we develop novel knowledge soundness attacks accompanied by rigorous proofs of their efficacy. We perform case studies of applications that use vulnerable implementations, and demonstrate that a weak F-S vulnerability could have led to the creation of unlimited currency in a private smart contract platform. Finally, we discuss possible mitigations and takeaways for academics and practitioners.
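The classic weak F-S break on Schnorr that the abstract mentions can be demonstrated in a toy group. With "weak" F-S the challenge hashes only the prover's commitment R, not the statement (the public key X), so a cheater can choose X after seeing the challenge. This is a small-parameter sketch of that textbook attack, not one of the paper's new attacks on modern systems.

```python
import hashlib

# Toy Schnorr group: g generates the order-11 subgroup of Z_23*.
p, q, g = 23, 11, 2

def weak_challenge(R: int) -> int:
    # BUG being demonstrated: the statement X is missing from the hash.
    return int.from_bytes(hashlib.sha256(str(R).encode()).digest(), "big") % q

def verify(X: int, R: int, z: int) -> bool:
    c = weak_challenge(R)
    return pow(g, z, p) == (R * pow(X, c, p)) % p

# Forgery: fix response z and commitment R first, then solve for a public
# key X whose discrete log the attacker never learns.
z = 7
for r in range(1, q):            # pick an R whose challenge is invertible mod q
    R = pow(g, r, p)
    c = weak_challenge(R)
    if c % q != 0:
        break
X = pow(pow(g, z, p) * pow(R, -1, p) % p, pow(c, -1, q), p)
assert verify(X, R, z)           # the "proof" (R, z) verifies for arbitrary X
```

Strong F-S repairs this by hashing the statement along with the transcript, so X can no longer be chosen after the challenge is fixed.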
Attitudes towards Client-Side Scanning for CSAM, Terrorism, Drug Trafficking, Drug Use and Tax Evasion in Germany.
Authors
- Lisa Geierhaas, University of Bonn
- Fabian Otto, OmniQuest
- Maximilian Häring, University of Bonn
- Matthew Smith, Fraunhofer FKIE, University of Bonn
Abstract
In recent years, there have been a rising number of legislative efforts and proposed technical measures to weaken privacy-preserving technology, with the stated goal of countering serious crimes like child abuse. One of these proposed measures is Client-Side Scanning (CSS). CSS has been hotly debated both in the context of Apple stating their intention to deploy it in 2021 as well as EU legislation being proposed in 2022. Both sides of the argument state that they are working in the best interests of the people. To shed some light on this, we conducted a survey with a representative sample of German citizens. We investigated the general acceptance of CSS vs cloud-based scanning for different types of crimes and analyzed how trust in the German government and companies such as Google and Apple influenced our participants’ views. We found that, by and large, the majority of participants were willing to accept CSS measures to combat serious crimes such as child abuse or terrorism, but support dropped significantly for other illegal activities. However, the majority of participants who supported CSS were also worried about potential abuse, with only 20% stating that they were not concerned. These results suggest that many of our participants would be willing to have their devices scanned and accept some risks in the hope of aiding law enforcement. In our analysis, we argue that there are good reasons to not see this as a carte blanche for the introduction of CSS but as a call to action for the S&P community. More research is needed into how a population’s desire to prevent serious crime online can be achieved while mitigating the risks to privacy and society.
Deep perceptual hashing algorithms with hidden dual purpose: when client-side scanning does facial recognition.
Authors
- Shubham Jain, Imperial College London
- Ana-Maria Creţu, Imperial College London
- Antoine Cully, Imperial College London
- Yves-Alexandre de Montjoye, Imperial College London
Abstract
End-to-end encryption (E2EE) provides strong technical protections to individuals from interference. Governments and law enforcement agencies around the world have however raised concerns that E2EE also allows illegal content to be shared undetected. Client-side scanning (CSS), using perceptual hashing (PH) to detect known illegal content before it is shared, is seen as a promising solution to prevent the diffusion of illegal content while preserving encryption. While these proposals raise strong privacy concerns, proponents of the solutions have argued that the risk is limited as the technology has a limited scope: detecting known illegal content. In this paper, we show that modern perceptual hashing algorithms are actually fairly flexible pieces of technology and that this flexibility could be used by an adversary to add a secondary hidden feature to a client-side scanning system. More specifically, we show that an adversary providing the PH algorithm can "hide" a secondary purpose of face recognition of a target individual alongside its primary purpose of image copy detection. We first propose a procedure to train a dual-purpose deep perceptual hashing model by jointly optimizing for both the image copy detection and the targeted facial recognition task. Second, we extensively evaluate our dual-purpose model and show it to be able to reliably identify a target individual 67% of the time while not impacting its performance at detecting illegal content. We also show that our model is neither a general face detection nor a facial recognition model, allowing its secondary purpose to be hidden. Finally, we show that the secondary purpose can be enabled by adding a single illegal-looking image to the database. Taken together, our results raise concerns that a deep perceptual hashing-based CSS system could turn billions of user devices into tools to locate targeted individuals.
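The match-by-similarity mechanism a CSS system builds on can be illustrated with a classical average hash (aHash) on tiny grayscale "images" — far simpler than the deep PH models the paper studies, but it shows why a perceptual hash matches near-duplicates rather than exact bytes, which is the flexibility the paper exploits.

```python
# Minimal average-hash sketch on toy 4x4 grayscale images (values 0-255).
def average_hash(pixels: list) -> int:
    """One bit per pixel: 1 if the pixel is at least the image mean."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v >= mean else 0)
    return bits

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

img = [[10, 10, 200, 200]] * 4
slightly_edited = [[12, 9, 198, 205]] * 4     # small perturbation
different = [[200, 200, 10, 10]] * 4          # visually distinct content

h = average_hash(img)
assert hamming(h, average_hash(slightly_edited)) <= 2   # near-duplicate matches
assert hamming(h, average_hash(different)) > 2          # unrelated image doesn't
```

A deep PH model replaces this fixed transform with a learned one; the paper shows that the learned transform can secretly be trained to also cluster a target person's face near a "hit" hash.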
Public Verification for Private Hash Matching.
Authors
- Sarah Scheffler, Princeton University
- Anunay Kulshrestha, Princeton University
- Jonathan Mayer, Princeton University
Abstract
End-to-end encryption (E2EE) prevents online services from accessing user content. This important security property is also an obstacle for content moderation methods that involve content analysis. The tension between E2EE and efforts to combat child sexual abuse material (CSAM) has become a global flashpoint in encryption policy, because the predominant method of detecting harmful content—server-side perceptual hash matching on plaintext images—is unavailable. Recent applied cryptography advances enable private hash matching (PHM), where a service can match user content against a set of known CSAM images without revealing the hash set to users or nonmatching content to the service. These designs, especially a 2021 proposal for identifying CSAM in Apple's iCloud Photos service, have attracted widespread criticism for creating risks to security, privacy, and free expression. In this work, we aim to advance scholarship and dialogue about PHM by contributing new cryptographic methods for system verification by the general public. We begin with motivation, describing the rationale for PHM to detect CSAM and the serious societal and technical issues with its deployment. Verification could partially address shortcomings of PHM, and we systematize critiques into two areas for auditing: trust in the hash set and trust in the implementation. We explain how, while these two issues cannot be fully resolved by technology alone, there are possible cryptographic trust improvements. The central contributions of this paper are novel cryptographic protocols that enable three types of public verification for PHM systems: (1) certification that external groups approve the hash set, (2) proof that particular lawful content is not in the hash set, and (3) eventual notification to users of false positive matches. The protocols that we describe are practical, efficient, and compatible with existing PHM constructions.
Is Cryptographic Deniability Sufficient? Non-Expert Perceptions of Deniability in Secure Messaging.
Authors
- Nathan Reitinger, University of Maryland
- Nathan Malkin, University of Maryland
- Omer Akgul, University of Maryland
- Michelle L. Mazurek, University of Maryland
- Ian Miers, University of Maryland
Abstract
Cryptographers have long been concerned with secure messaging protocols threatening deniability. Many messaging protocols—including, surprisingly, modern email—contain digital signatures which definitively tie the author to their message. If stolen or leaked, these signatures make it impossible to deny authorship. As illustrated by events surrounding leaks from Hillary Clinton's 2016 U.S. presidential campaign, this concern has proven well founded. Deniable protocols are meant to avoid this very outcome, letting politicians and dissidents alike safely disavow authorship. Despite being deployed on billions of devices in Signal and WhatsApp, the effectiveness of such protocols in convincing people remains unstudied. While the absence of cryptographic evidence is clearly necessary for an effective denial, is it sufficient? We conduct a survey study (n = 1,200) to understand how people perceive evidence of deniability related to encrypted messaging protocols. Surprisingly, in a world of "fake news" and Photoshop, we find that simple denials of message authorship, when presented in a courtroom setting without supporting evidence, are not effective. In contrast, participants who were given access to a screenshot forgery tool or even told one exists were much more likely to believe a denial. Similarly, but to a lesser degree, we find an expert cryptographer's assertion that there is no evidence is also effective.
On the Evolution of (Hateful) Memes by Means of Multimodal Contrastive Learning.
Authors
- Yiting Qu, CISPA Helmholtz Center for Information Security
- Xinlei He, CISPA Helmholtz Center for Information Security
- Shannon Pierson, London School of Economics and Political Science
- Michael Backes, CISPA Helmholtz Center for Information Security
- Yang Zhang, CISPA Helmholtz Center for Information Security
- Savvas Zannettou, Delft University of Technology
Abstract
The dissemination of hateful memes online has adverse effects on social media platforms and the real world. Detecting hateful memes is challenging, one of the reasons being the evolutionary nature of memes; new hateful memes can emerge by fusing hateful connotations with other cultural ideas or symbols. In this paper, we propose a framework that leverages multimodal contrastive learning models, in particular OpenAI's CLIP, to identify targets of hateful content and systematically investigate the evolution of hateful memes. We find that semantic regularities exist in CLIP-generated embeddings that describe semantic relationships within the same modality (images) or across modalities (images and text). Leveraging this property, we study how hateful memes are created by combining visual elements from multiple images or fusing textual information with a hateful image. We demonstrate the capabilities of our framework for analyzing the evolution of hateful memes by focusing on antisemitic memes, particularly the Happy Merchant meme. Using our framework on a dataset extracted from 4chan, we find 3.3K variants of the Happy Merchant meme, with some linked to specific countries, persons, or organizations. We envision that our framework can be used to aid human moderators by flagging new variants of hateful memes so that moderators can manually verify them and mitigate the problem of hateful content online.
Lambretta: Learning to Rank for Twitter Soft Moderation.
Authors
- Pujan Paudel, Boston University
- Jeremy Blackburn, Binghamton University
- Emiliano De Cristofaro, University College London
- Savvas Zannettou, Delft University of Technology
- Gianluca Stringhini, Boston University
Abstract
To curb the problem of false information, social media platforms like Twitter started adding warning labels to content discussing debunked narratives, with the goal of providing more context to their audiences. Unfortunately, these labels are not applied uniformly and leave large amounts of false content unmoderated. This paper presents LAMBRETTA, a system that automatically identifies tweets that are candidates for soft moderation using Learning To Rank (LTR). We run LAMBRETTA on Twitter data to moderate false claims related to the 2020 US Election and find that it flags over 20 times more tweets than Twitter, with only 3.93% false positives and 18.81% false negatives, outperforming alternative state-of-the-art methods based on keyword extraction and semantic search. Overall, LAMBRETTA assists human moderators in identifying and flagging false information on social media.
SoK: Let the Privacy Games Begin! A Unified Treatment of Data Inference Privacy in Machine Learning.
Authors
- Ahmed Salem, Microsoft
- Giovanni Cherubin, Microsoft
- David Evans, University of Virginia
- Boris Köpf, Microsoft
- Andrew Paverd, Microsoft
- Anshuman Suri, University of Virginia
- Shruti Tople, Microsoft
- Santiago Zanella-Béguelin, Microsoft
Abstract
Deploying machine learning models in production may allow adversaries to infer sensitive information about training data. There is a vast literature analyzing different types of inference risks, ranging from membership inference to reconstruction attacks. Inspired by the success of games (i.e., probabilistic experiments) in studying security properties in cryptography, some authors describe privacy inference risks in machine learning using a similar game-based style. However, adversary capabilities and goals are often stated in subtly different ways from one presentation to another, which makes it hard to relate and compose results. In this paper, we present a game-based framework to systematize the body of knowledge on privacy inference risks in machine learning. We use this framework to (1) provide a unifying structure for definitions of inference risks, (2) formally establish known relations among definitions, and (3) uncover hitherto unknown relations that would have been difficult to spot otherwise.
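The game-based style the SoK systematizes can be made concrete with a toy membership inference experiment: a challenger flips a bit b, trains with or without a target record, and the adversary guesses b from the model's output. The "model" below (a plain mean) and the outlier target are deliberately simplistic assumptions chosen to make the leakage obvious.

```python
import random

# Toy membership inference game. Advantage = 2*Pr[win] - 1; advantage > 0
# means the model leaks membership information.
def train(data):
    return sum(data) / len(data)          # a deliberately leaky "model"

def adversary(model_out, population_mean, target):
    # Guess "member" iff the model's output moved toward the target record.
    return 1 if abs(model_out - target) < abs(population_mean - target) else 0

def play_game(rng):
    population = [rng.gauss(0, 1) for _ in range(20)]
    target = 10.0                         # an outlier record, easy to detect
    b = rng.randrange(2)                  # challenger's secret bit
    data = population + [target] if b else population
    return adversary(train(data), train(population), target) == b

rng = random.Random(1)
wins = sum(play_game(rng) for _ in range(1000))
advantage = 2 * wins / 1000 - 1
assert advantage > 0.3                    # far better than random guessing
```

Varying what the adversary sees and when (the whole model, one query, the training distribution) is exactly how the paper's framework distinguishes membership inference from attribute inference and reconstruction.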
Analyzing Leakage of Personally Identifiable Information in Language Models.
Authors
- Nils Lukas, University of Waterloo
- Ahmed Salem, Microsoft Research
- Robert Sim, Microsoft Research
- Shruti Tople, Microsoft Research
- Lukas Wutschitz, Microsoft Research
- Santiago Zanella-Béguelin, Microsoft Research
Abstract
Language Models (LMs) have been shown to leak information about training data through sentence-level membership inference and reconstruction attacks. Understanding the risk of LMs leaking Personally Identifiable Information (PII) has received less attention, which can be attributed to the false assumption that dataset curation techniques such as scrubbing are sufficient to prevent PII leakage. Scrubbing techniques reduce but do not prevent the risk of PII leakage: in practice scrubbing is imperfect and must balance the trade-off between minimizing disclosure and preserving the utility of the dataset. On the other hand, it is unclear to what extent algorithmic defenses such as differential privacy, designed to guarantee sentence- or user-level privacy, prevent PII disclosure. In this work, we introduce rigorous game-based definitions for three types of PII leakage via black-box extraction, inference, and reconstruction attacks with only API access to an LM. We empirically evaluate the attacks against GPT-2 models fine-tuned with and without defenses in three domains: case law, health care, and e-mails. Our main contributions are (i) novel attacks that can extract up to 10× more PII sequences than existing attacks, (ii) showing that sentence-level differential privacy reduces the risk of PII disclosure but still leaks about 3% of PII sequences, and (iii) a subtle connection between record-level membership inference and PII reconstruction. Code to reproduce all experiments in the paper is available at https://github.com/microsoft/analysing_pii_leakage.
Accuracy-Privacy Trade-off in Deep Ensemble: A Membership Inference Perspective.
Authors
- Shahbaz Rezaei, University of California, Davis, CA, USA
- Zubair Shafiq, University of California, Davis, CA, USA
- Xin Liu, University of California, Davis, CA, USA
Abstract
Deep ensemble learning has been shown to improve accuracy by training multiple neural networks and averaging their outputs. Ensemble learning has also been suggested to defend against membership inference attacks that undermine privacy. In this paper, we empirically demonstrate a trade-off between these two goals, namely accuracy and privacy (in terms of membership inference attacks), in deep ensembles. Using a wide range of datasets and model architectures, we show that the effectiveness of membership inference attacks increases when ensembling improves accuracy. We analyze the impact of various factors in deep ensembles and demonstrate the root cause of the trade-off. Then, we evaluate common defenses against membership inference attacks based on regularization and differential privacy. We show that while these defenses can mitigate the effectiveness of membership inference attacks, they simultaneously degrade ensemble accuracy. We illustrate a similar trade-off in more advanced, state-of-the-art ensembling techniques, such as snapshot ensembles and diversified ensemble networks. Finally, we propose a simple yet effective defense for deep ensembles to break the trade-off and, consequently, improve accuracy and privacy simultaneously.
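The mechanism behind this trade-off can be illustrated with a toy confidence-thresholding membership inference attack (a standard baseline, not the paper's exact setup; all values below are synthetic): averaging confidences across ensemble members reduces variance, which sharpens the separation between member and non-member confidence distributions and so helps the attacker.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-model prediction confidences (hypothetical):
# member samples get slightly higher confidence due to overfitting.
n_models, n_samples = 5, 2000
member_conf = rng.beta(8, 2, size=(n_models, n_samples))     # mean ~0.8
nonmember_conf = rng.beta(6, 4, size=(n_models, n_samples))  # mean ~0.6

def attack_accuracy(conf_in, conf_out):
    """Best balanced accuracy of a threshold attack: guess 'member' iff conf >= t."""
    best = 0.0
    for t in np.linspace(0.0, 1.0, 201):
        acc = 0.5 * ((conf_in >= t).mean() + (conf_out < t).mean())
        best = max(best, acc)
    return best

single = attack_accuracy(member_conf[0], nonmember_conf[0])
ensemble = attack_accuracy(member_conf.mean(axis=0), nonmember_conf.mean(axis=0))
print(f"single-model MIA accuracy: {single:.3f}")
print(f"ensemble MIA accuracy:     {ensemble:.3f}")  # higher: averaging widens the gap
```

On this synthetic data the ensemble's averaged confidences are easier to threshold than any single model's, mirroring the paper's observation that accuracy gains from ensembling come with increased membership leakage.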
Links
D-DAE: Defense-Penetrating Model Extraction Attacks.
Authors
- Yanjiao Chen, School of Computer Science, Wuhan University, China
- Rui Guan, School of Mathematics and Statistics, Wuhan University, China
- Xueluan Gong, School of Computer Science, Wuhan University, China
- Jianshuo Dong, School of Cyber Science and Engineering, Wuhan University, China
- Meng Xue, School of Computer Science, Wuhan University, China
Abstract
Recent studies show that machine learning models are vulnerable to model extraction attacks, where the adversary builds a substitute model that achieves almost the same performance as a black-box victim model simply by querying it. To defend against such attacks, a series of methods have been proposed to disrupt the query results before returning them to potential attackers, greatly degrading the performance of existing model extraction attacks. In this paper, we make the first attempt to develop a defense-penetrating model extraction attack framework, named D-DAE, which aims to break disruption-based defenses. The linchpins of D-DAE are the design of two modules, i.e., disruption detection and disruption recovery, which can be integrated with generic model extraction attacks. More specifically, after obtaining query results from the victim model, the disruption detection module infers the defense mechanism adopted by the defender. We design a meta-learning-based disruption detection algorithm for learning the fundamental differences between the distributions of disrupted and undisrupted query results. The algorithm features a good generalization property even if we have no access to the original training dataset of the victim model. Given the detected defense mechanism, the disruption recovery module tries to restore a clean query result from the disrupted query result with well-designed generative models. Our extensive evaluations on the MNIST, FashionMNIST, CIFAR-10, GTSRB, and ImageNette datasets demonstrate that D-DAE can enhance the substitute model accuracy of existing model extraction attacks by as much as 82.24% in the face of 4 state-of-the-art defenses and combinations of multiple defenses. We also verify the effectiveness of D-DAE in penetrating unknown defenses in real-world APIs hosted by Microsoft Azure and Face++.
Links
SNAP: Efficient Extraction of Private Properties with Poisoning.
Authors
- Harsh Chaudhari, Northeastern University
- John Abascal, Northeastern University
- Alina Oprea, Northeastern University
- Matthew Jagielski, Google Research
- Florian Tramèr, ETH Zurich
- Jonathan Ullman, Northeastern University
Abstract
Property inference attacks allow an adversary to extract global properties of the training dataset from a machine learning model. Such attacks have privacy implications for data owners sharing their datasets to train machine learning models. Several existing approaches for property inference attacks against deep neural networks have been proposed [1]–[3], but they all rely on the attacker training a large number of shadow models, which induces a large computational overhead. In this paper, we consider the setting of property inference attacks in which the attacker can poison a subset of the training dataset and query the trained target model. Motivated by our theoretical analysis of model confidences under poisoning, we design an efficient property inference attack, SNAP, which obtains higher attack success and requires lower amounts of poisoning than the state-of-the-art poisoning-based property inference attack by Mahloujifar et al. [3]. For example, on the Census dataset, SNAP achieves a 34% higher success rate than [3] while being 56.5× faster. We also extend our attack to infer whether a certain property was present at all during training and to efficiently estimate the exact proportion of a property of interest. We evaluate our attack on several properties of varying proportions from four datasets and demonstrate SNAP’s generality and effectiveness.
Links
On the (In)security of Peer-to-Peer Decentralized Machine Learning.
Authors
- Dario Pasquini, SPRING Lab, EPFL, Switzerland
- Mathilde Raynal, SPRING Lab, EPFL, Switzerland
- Carmela Troncoso, SPRING Lab, EPFL, Switzerland
Abstract
In this work, we carry out the first, in-depth, privacy analysis of Decentralized Learning—a collaborative machine learning framework aimed at addressing the main limitations of federated learning. We introduce a suite of novel attacks for both passive and active decentralized adversaries. We demonstrate that, contrary to what is claimed by decentralized learning proponents, decentralized learning does not offer any security advantage over federated learning. Rather, it increases the attack surface, enabling any user in the system to perform privacy attacks such as gradient inversion, and even gain full control over honest users’ local models. We also show that, given the state of the art in protections, privacy-preserving configurations of decentralized learning require fully connected networks, losing any practical advantage over the federated setup and therefore completely defeating the objective of the decentralized approach.
Links
Vectorized Batch Private Information Retrieval.
Authors
- Muhammad Haris Mughees, University of Illinois at Urbana-Champaign
- Ling Ren, University of Illinois at Urbana-Champaign
Abstract
This paper studies Batch Private Information Retrieval (BatchPIR), a variant of private information retrieval (PIR) where the client wants to retrieve multiple entries from the server in one batch. BatchPIR matches the use case of many practical applications and holds the potential for substantial efficiency improvements over PIR in terms of amortized cost per query. Existing BatchPIR schemes have achieved decent computation efficiency but have not been able to improve communication efficiency at all. Using vectorized homomorphic encryption, we present the first BatchPIR protocol that is efficient in both computation and communication for a variety of database configurations. Specifically, to retrieve a batch of 256 entries from a database with one million entries of 256 bytes each, the communication cost of our scheme is 7.5x to 98.5x better than state-of-the-art solutions.
Links
RoFL: Robustness of Secure Federated Learning.
Authors
- Hidde Lycklama, ETH Zurich
- Lukas Burkhalter, ETH Zurich
- Alexander Viand, ETH Zurich
- Nicolas Küchler, ETH Zurich
- Anwar Hithnawi, ETH Zurich
Abstract
Even though recent years have seen many attacks exposing severe vulnerabilities in Federated Learning (FL), a holistic understanding of what enables these attacks and how they can be mitigated effectively is still lacking. In this work, we demystify the inner workings of existing (targeted) attacks. We provide new insights into why these attacks are possible and why a definitive solution to FL robustness is challenging. We show that the need for ML algorithms to memorize tail data has significant implications for FL integrity. This phenomenon has largely been studied in the context of privacy; our analysis sheds light on its implications for ML integrity. We show that certain classes of severe attacks can be mitigated effectively by enforcing constraints such as norm bounds on clients’ updates. We investigate how to efficiently incorporate these constraints into secure FL protocols in the single-server setting. Based on this, we propose RoFL, a new secure FL system that extends secure aggregation with privacy-preserving input validation. Specifically, RoFL can enforce constraints such as L2 and L∞ bounds on high-dimensional encrypted model updates.
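The norm-bound defense can be sketched in plaintext as follows (RoFL itself enforces these checks on *encrypted* updates via privacy-preserving input validation; the bound values and vector sizes here are illustrative, not taken from the paper):

```python
import numpy as np

def within_bounds(update, l2_bound, linf_bound):
    """Accept a client update only if both norm constraints hold."""
    return (np.linalg.norm(update) <= l2_bound
            and np.max(np.abs(update)) <= linf_bound)

def robust_aggregate(updates, l2_bound=1.0, linf_bound=0.1):
    """Average only the updates that pass the norm checks."""
    accepted = [u for u in updates if within_bounds(u, l2_bound, linf_bound)]
    return np.mean(accepted, axis=0), len(accepted)

honest = [np.full(100, 0.01), np.full(100, -0.01)]  # small, in-bounds updates
boosted = [np.full(100, 5.0)]                       # scaled-up malicious update
agg, n_ok = robust_aggregate(honest + boosted)
print(n_ok)  # 2: the boosted update exceeds both bounds and is filtered out
```

The honest updates have an L2 norm of 0.1 and an L∞ norm of 0.01, so they pass; the boosted update (L2 norm 50) is rejected, which is the class of "severe attacks" the norm constraints mitigate.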
Links
Flamingo: Multi-Round Single-Server Secure Aggregation with Applications to Private Federated Learning.
Authors
- Yiping Ma, University of Pennsylvania
- Jess Woods, University of Pennsylvania
- Sebastian Angel, University of Pennsylvania; Microsoft Research
- Antigoni Polychroniadou, J.P. Morgan AI Research & AlgoCRYPT CoE
- Tal Rabin, University of Pennsylvania
Abstract
This paper introduces Flamingo, a system for secure aggregation of data across a large set of clients. In secure aggregation, a server sums up the private inputs of clients and obtains the result without learning anything about the individual inputs beyond what is implied by the final sum. Flamingo focuses on the multi-round setting found in federated learning in which many consecutive summations (averages) of model weights are performed to derive a good model. Previous protocols, such as Bell et al. (CCS ’20), have been designed for a single round and are adapted to the federated learning setting by repeating the protocol multiple times. Flamingo eliminates the need for the per-round setup of previous protocols, and has a new lightweight dropout resilience protocol to ensure that if clients leave in the middle of a sum the server can still obtain a meaningful result. Furthermore, Flamingo introduces a new way to locally choose the so-called client neighborhood introduced by Bell et al. These techniques help Flamingo reduce the number of interactions between clients and the server, resulting in a significant reduction in the end-to-end runtime for a full training session over prior work. We implement and evaluate Flamingo and show that it can securely train a neural network on the (Extended) MNIST and CIFAR-100 datasets, and the model converges without a loss in accuracy, compared to a non-private federated learning system.
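The pairwise-masking idea at the core of Bell et al.-style secure aggregation (which Flamingo builds on and streamlines) can be sketched in a few lines: each pair of clients shares a seed, one adds the pseudorandom mask derived from it and the other subtracts it, so all masks cancel in the server's sum. The seeds, modulus, and vector sizes below are illustrative; real protocols derive seeds via key agreement and handle dropouts.

```python
import numpy as np

Q = 2**32  # arithmetic modulus

def prg(seed, dim):
    """Expand a shared seed into a pseudorandom mask vector."""
    return np.random.default_rng(seed).integers(0, Q, size=dim, dtype=np.int64)

def mask_input(i, x, pair_seeds, dim):
    """Client i masks its input; each pairwise mask cancels in the server's sum."""
    masked = x.astype(np.int64) % Q
    for j, seed in pair_seeds[i].items():
        if j > i:
            masked = (masked + prg(seed, dim)) % Q  # lower-index client adds
        else:
            masked = (masked - prg(seed, dim)) % Q  # higher-index client subtracts
    return masked

dim, n = 4, 3
inputs = [np.array([1, 2, 3, 4]), np.array([10, 20, 30, 40]), np.array([5, 5, 5, 5])]
# Symmetric per-pair seeds (in practice derived via Diffie-Hellman key agreement).
seeds = {(0, 1): 11, (0, 2): 22, (1, 2): 33}
pair_seeds = {i: {j: seeds[tuple(sorted((i, j)))] for j in range(n) if j != i}
              for i in range(n)}

masked_vectors = [mask_input(i, inputs[i], pair_seeds, dim) for i in range(n)]
server_sum = sum(masked_vectors) % Q
print(server_sum)  # [16 27 38 49]: masks cancel, only the true sum remains
```

No individual masked vector reveals its input, yet the server recovers the exact sum; Flamingo's contribution is making this pattern efficient across many rounds with a single lightweight setup.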
Links
SoK: Cryptographic Neural-Network Computation.
Authors
- Lucien K. L. Ng, Georgia Institute of Technology
- Sherman S. M. Chow, The Chinese University of Hong Kong
Abstract
We studied 53 privacy-preserving neural-network papers in 2016-2022 based on cryptography (without trusted processors or differential privacy), 16 of which only use homomorphic encryption, 19 use secure computation for inference, and 18 use non-colluding servers (among which 12 support training), solving a wide variety of research problems. We dissect their cryptographic techniques and "love-hate relationships" with machine learning alongside a genealogy highlighting noteworthy developments. We also re-evaluate the state of the art under WAN. We hope this can serve as a go-to guide connecting different experts in related fields.
Links
FLUTE: Fast and Secure Lookup Table Evaluations.
Authors
- Andreas Brüggemann, Technical University of Darmstadt, Germany
- Robin Hundt, Technical University of Darmstadt, Germany
- Thomas Schneider, Technical University of Darmstadt, Germany
- Ajith Suresh, Technical University of Darmstadt, Germany
- Hossein Yalame, Technical University of Darmstadt, Germany
Abstract
The concept of using Lookup Tables (LUTs) instead of Boolean circuits is well known and has been widely applied in a variety of applications, including FPGAs, image processing, and database management systems. In cryptography, using such LUTs instead of conventional gates like AND and XOR results in more compact circuits and has been shown to substantially improve online performance when evaluated with secure multi-party computation. Several recent works on secure floating-point computations and privacy-preserving machine learning inference rely heavily on existing LUT techniques. However, they suffer from either large overhead in the setup phase or subpar online performance. We propose FLUTE, a novel protocol for secure LUT evaluation with good setup and online performance. In a two-party setting, we show that FLUTE matches or even outperforms the online performance of all prior approaches, while being competitive in terms of overall performance with the best prior LUT protocols. In addition, we provide an open-source implementation of FLUTE written in the Rust programming language, and implementations of the Boolean secure two-party computation protocols of ABY2.0 and silent OT. We find that FLUTE outperforms the state of the art by two orders of magnitude in the online phase while retaining similar overall communication.
Links
Bicoptor: Two-round Secure Three-party Non-linear Computation without Preprocessing for Privacy-preserving Machine Learning.
Authors
- Lijing Zhou, Huawei Technology, Shanghai, China
- Ziyu Wang, Huawei Technology, Shanghai, China
- Hongrui Cui, Shanghai Jiao Tong University, Shanghai, China
- Qingrui Song, Huawei Technology, Shanghai, China
- Yu Yu, Shanghai Jiao Tong University, Shanghai, China
Abstract
The overhead of non-linear functions dominates the performance of secure multiparty computation (MPC)-based privacy-preserving machine learning (PPML). This work introduces a family of novel secure three-party computation (3PC) protocols, Bicoptor, which improve the efficiency of evaluating non-linear functions. The basis of Bicoptor is a new sign determination protocol, which relies on a clever use of the truncation protocol proposed in SecureML (S&P 2017). Our 3PC sign determination protocol only requires two communication rounds and does not involve any preprocessing. Such a sign determination protocol is well-suited for computing non-linear functions in PPML, e.g. the activation function ReLU, Maxpool, and their variants. We develop suitable protocols for these non-linear functions, which form a family of GPU-friendly protocols, Bicoptor. All Bicoptor protocols only require two communication rounds without preprocessing. We evaluate Bicoptor under a 3-party LAN network over a public cloud, and achieve more than 370,000 DReLU/ReLU or 41,000 Maxpool (find the maximum value of nine inputs) operations per second. Under the same settings and environment, our ReLU protocol improves over the state-of-the-art works Falcon (PETS 2021) and Edabits (CRYPTO 2020) by one and even two orders of magnitude, respectively, without batch processing.
Links
Investigating the Password Policy Practices of Website Administrators.
Authors
- Sena Sahin, Georgia Institute of Technology
- Suood Al Roomi, Georgia Institute of Technology; Kuwait University
- Tara Poteat, Georgia Institute of Technology
- Frank Li, Georgia Institute of Technology
Abstract
Passwords are the de facto standard for online authentication today, and will likely remain so for the foreseeable future. As a consequence, the security community has extensively explored how users behave with passwords, producing recommendations for password policies that promote password security and usability for users. However, it is the website administrators who must adopt such recommendations to enact improvements to online authentication in practice. To date, there has been limited investigation of how web administrators manage password policies for their sites. To improve online authentication at scale, we must understand the factors behind this specific population’s behaviors and decisions, and how to help administrators deploy more secure password policies. In this paper, we explore how web administrators determine the password policies that they employ, what considerations impact a policy’s evolution, and what challenges administrators encounter when managing a site’s policy. To do so, we conduct an online survey and in-depth semi-structured interviews with 11 US-based web administrators with direct experience managing website password policies. Through our qualitative study, we identify a small set of key factors driving the majority of password policy decisions, and barriers that inhibit administrators from enacting policies that are more aligned with modern guidelines. Moving forward, we propose directions for future research and community action that may help administrators manage password policies more effectively.
Links
"In Eighty Percent of the Cases, I Select the Password for Them": Security and Privacy Challenges, Advice, and Opportunities at Cybercafes in Kenya.
Authors
- Collins W. Munyendo, The George Washington University
- Yasemin Acar, The George Washington University; Paderborn University
- Adam J. Aviv, The George Washington University
Abstract
Cybercafes remain a popular way to access the Internet in the developing world as many users still lack access to personal computers. Coupled with the recent digitization of government services, e.g. in Kenya, many users have turned to cybercafes to access essential services. Many of these users may have never used a computer, and face significant security and privacy issues at cybercafes. Yet, these challenges as well as the advice offered remain largely unexplored. We investigate these challenges along with the security advice and support provided by the operators at cybercafes in Kenya through n = 36 semi-structured interviews (n = 14 with cybercafe managers and n = 22 with customers). We find that cybercafes serve a crucial role in Kenya by enabling access to printing and government services. However, most customers face challenges with computer usage as well as security and usability challenges with account creation and password management. As a workaround, customers often rely on the support and advice of cybercafe managers who mostly direct them to use passwords that are memorable, e.g. simply using their national ID numbers or names. Some managers directly manage passwords for their customers, with one even using the same password for all their customers. These results suggest the need for more awareness about phone-based password managers, as well as a need for computer training and security awareness among these users. There is also a need to explore security and privacy advice beyond Western peripheries to support broader populations.
Links
Towards a Rigorous Statistical Analysis of Empirical Password Datasets.
Authors
- Jeremiah Blocki, Purdue University
- Peiyuan Liu, Purdue University
Abstract
A central challenge in password security is to characterize the attacker's guessing curve, i.e., what is the probability that the attacker will crack a random user's password within the first G guesses. A key challenge is that the guessing curve depends on the attacker's guessing strategy and the distribution of user passwords, both of which are unknown to us. In this work we aim to follow Kerckhoffs's principle and analyze the performance of an optimal attacker who knows the password distribution. Let λG denote the probability that such an attacker can crack a random user's password within G guesses. We develop several statistically rigorous techniques to upper and lower bound λG given N independent samples from the unknown password distribution ${\mathcal{P}}$. We show that our upper/lower bounds on λG hold with high confidence and we apply our techniques to analyze eight large password datasets. Our empirical analysis shows that even state-of-the-art password cracking models are often significantly less guess efficient than an attacker who can optimize its attack based on its (partial) knowledge of the password distribution. We also apply our statistical tools to re-examine different models of the password distribution, i.e., the empirical password distribution and Zipf's Law. We find that the empirical distribution closely matches our upper/lower bounds on λG when the guessing number G is not too large, i.e., G ≪ N. However, for larger values of G our empirical analysis rigorously demonstrates that the empirical distribution (resp. Zipf's Law) overestimates the attacker's success rate. We apply our statistical techniques to upper/lower bound the effectiveness of password throttling mechanisms (key-stretching) which are used to reduce the number of attacker guesses G. Finally, if we are willing to make an additional assumption about the way users respond to password restrictions, we can use our statistical techniques to evaluate the effectiveness of various password composition policies which restrict the passwords that users may select.
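The naive plug-in estimate of λG, which the paper's bounds are measured against, simply sums the top-G empirical frequencies of the sample; as the abstract notes, this tends to overestimate the attacker's success once G approaches the sample size N. A minimal sketch on toy data (the password list is hypothetical):

```python
from collections import Counter

def plugin_lambda(samples, g):
    """Plug-in estimate of λ_G: fraction of sample mass on the G most frequent values."""
    counts = Counter(samples)
    top_g = counts.most_common(g)
    return sum(c for _, c in top_g) / len(samples)

# Toy sample of N=100 passwords drawn from an unknown distribution.
samples = ["123456"] * 50 + ["password"] * 30 + ["qwerty"] * 15 + ["hunter2"] * 5
print(plugin_lambda(samples, 1))  # 0.5: top guess "123456" covers half the sample
print(plugin_lambda(samples, 2))  # 0.8
```

For small G this estimate tracks the true λG well, but for G comparable to N it credits the attacker with every sampled password, which is exactly the bias the paper's rigorous upper/lower bounds correct for.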
Links
Confident Monte Carlo: Rigorous Analysis of Guessing Curves for Probabilistic Password Models.
Authors
- Peiyuan Liu, Computer Science Department, Purdue University, West Lafayette, IN, USA
- Jeremiah Blocki, Computer Science Department, Purdue University, West Lafayette, IN, USA
- Wenjie Bai, Computer Science Department, Purdue University, West Lafayette, IN, USA
Abstract
In password security a defender would like to identify and warn users with weak passwords. Similarly, the defender may also want to predict what fraction of passwords would be cracked within B guesses as the attacker’s guessing budget B varies from small (online attacker) to large (offline attacker). Towards each of these goals the defender would like to quickly estimate the guessing number for each user password pwd assuming that the attacker uses a password cracking model M, i.e., how many password guesses will the attacker check before s/he cracks each user password pwd. Since naïve brute-force enumeration can be prohibitively expensive when the guessing number is very large, Dell’Amico and Filippone [1] developed an efficient Monte Carlo algorithm to estimate the guessing number of a given password pwd. While Dell’Amico and Filippone proved that their estimator is unbiased, there is no guarantee that the Monte Carlo estimates are accurate, nor does the method provide confidence ranges on the estimated guessing number or even indicate if/when there is a higher degree of uncertainty. Our contributions are as follows: First, we identify theoretical examples where, with high probability, Monte Carlo strength estimation produces highly inaccurate estimates of individual guessing numbers as well as the entire guessing curve. Second, we introduce Confident Monte Carlo Strength Estimation as an extension of Dell’Amico and Filippone [1]. Given a password, our estimator generates an upper and lower bound with the guarantee that, except with probability δ, the true guessing number lies within the given confidence range. Our techniques can also be used to characterize the attacker’s guessing curve. In particular, given a probabilistic password cracking model M we can generate high confidence upper and lower bounds on the fraction of passwords that the attacker will crack as the guessing budget B varies.
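The Dell'Amico–Filippone estimator that the paper extends with confidence bounds can be sketched in a few lines: draw n passwords from the model, then estimate the guessing number of pwd as the importance-weighted count of samples the model ranks strictly above it. Variable names and the toy four-password model below are ours, for illustration only.

```python
import numpy as np

def mc_guess_number(p_pwd, sample_probs):
    """Monte Carlo estimate of how many passwords the model ranks above pwd.

    sample_probs: model probabilities of n passwords drawn i.i.d. from the model M.
    Each sample x with p(x) > p_pwd contributes 1/(n * p(x)); in expectation this
    sums to the number of passwords more probable than pwd (the guess index is
    one more than this count)."""
    probs = np.asarray(sample_probs, dtype=float)
    n = len(probs)
    return float(np.sum((probs > p_pwd) / (n * probs)))

# Toy model: four passwords with probabilities 0.5, 0.25, 0.125, 0.125.
rng = np.random.default_rng(1)
support = np.array([0.5, 0.25, 0.125, 0.125])
samples = rng.choice(support, size=50_000, p=support)
est = mc_guess_number(0.125, samples)
print(est)  # ≈ 2.0: exactly two passwords are strictly more probable
```

The estimate concentrates near the true rank here, but as the paper shows, for some distributions a single Monte Carlo run can be far off with high probability, which is what motivates attaching rigorous confidence ranges to each estimate.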
Links
Not Yet Another Digital ID: Privacy-Preserving Humanitarian Aid Distribution.
Authors
- Boya Wang, SPRING Lab, EPFL, Lausanne, Switzerland
- Wouter Lueks, CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
- Justinas Sukaitis, International Committee of the Red Cross, Geneva, Switzerland
- Vincent Graf Narbel, International Committee of the Red Cross, Geneva, Switzerland
- Carmela Troncoso, SPRING Lab, EPFL, Lausanne, Switzerland
Abstract
Humanitarian aid-distribution programs help bring physical goods to people in need. Traditional paper-based solutions to support aid distribution do not scale to large populations and are hard to secure. Existing digital solutions solve these issues, at the cost of collecting large amounts of personal information. This lack of privacy can endanger recipients’ safety and harm their dignity. In collaboration with the International Committee of the Red Cross, we build a safe digital aid-distribution system. We first systematize the requirements such a system should satisfy. We then propose a decentralized solution based on the use of tokens that fulfills the needs of humanitarian organizations. It provides scalability and strong accountability, and, by design, guarantees the recipients’ privacy. We provide two instantiations of our design, on a smart card and on a smartphone. We formally prove the security and privacy properties of these solutions, and empirically show that they can operate at scale.
Links
Disguising Attacks with Explanation-Aware Backdoors.
Authors
- Maximilian Noppel, KASTEL Security Research Labs, Karlsruhe Institute of Technology, Germany
- Lukas Peter, KASTEL Security Research Labs, Karlsruhe Institute of Technology, Germany
- Christian Wressnegger, KASTEL Security Research Labs, Karlsruhe Institute of Technology, Germany
Abstract
Explainable machine learning holds great potential for analyzing and understanding learning-based systems. These methods can, however, be manipulated to present unfaithful explanations, giving rise to powerful and stealthy adversaries. In this paper, we demonstrate how to fully disguise the adversarial operation of a machine learning model. Similar to neural backdoors, we change the model’s prediction upon trigger presence but simultaneously fool an explanation method that is applied post-hoc for analysis. This enables an adversary to hide the presence of the trigger or point the explanation to entirely different portions of the input, throwing a red herring. We analyze different manifestations of these explanation-aware backdoors for gradient- and propagation-based explanation methods in the image domain, before conducting a red-herring attack against malware classification.
Links
AI-Guardian: Defeating Adversarial Attacks using Backdoors.
Authors
- Hong Zhu, SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China
- Shengzhi Zhang, Metropolitan College, Boston University, USA
- Kai Chen, SKLOIS, Institute of Information Engineering, Chinese Academy of Sciences, China; School of Cyber Security, University of Chinese Academy of Sciences, China
Abstract
Deep neural networks (DNNs) have been widely used in many fields due to their increasingly high accuracy. However, they are also vulnerable to adversarial attacks, posing a serious threat to security-critical applications such as autonomous driving, remote diagnosis, etc. Existing solutions are limited in detecting/preventing such attacks, and also impact performance on the original tasks. In this paper, we present AI-Guardian, a novel approach to defeating adversarial attacks that leverages intentionally embedded backdoors to fail the adversarial perturbations and maintain the performance of the original main task. We extensively evaluate AI-Guardian using five popular adversarial example generation approaches, and experimental results demonstrate its efficacy in defeating adversarial attacks. Specifically, AI-Guardian reduces the attack success rate from 97.3% to 3.2%, which outperforms the state-of-the-art works by 30.9%, with only a 0.9% decline in the clean data accuracy. Furthermore, AI-Guardian introduces only 0.36% overhead to the model prediction time, almost negligible in most cases.
Links
Jigsaw Puzzle: Selective Backdoor Attack to Subvert Malware Classifiers.
Authors
- Limin Yang, University of Illinois at Urbana-Champaign
- Zhi Chen, University of Illinois at Urbana-Champaign
- Jacopo Cortellazzi, King’s College London; University College London
- Feargus Pendlebury, University College London
- Kevin Tu, University of Illinois at Urbana-Champaign
- Fabio Pierazzi, King’s College London
- Lorenzo Cavallaro, University College London
- Gang Wang, University of Illinois at Urbana-Champaign
Abstract
Malware classifiers are subject to training-time exploitation due to the need to regularly retrain using samples collected from the wild. Recent work has demonstrated the feasibility of backdoor attacks against malware classifiers, and yet the stealthiness of such attacks is not well understood. In this paper, we focus on Android malware classifiers and investigate backdoor attacks under the clean-label setting (i.e., attackers do not have complete control over the training process or the labeling of poisoned data). Empirically, we show that existing backdoor attacks against malware classifiers are still detectable by recent defenses such as MNTD. To improve stealthiness, we propose a new attack, Jigsaw Puzzle (JP), based on the key observation that malware authors have little to no incentive to protect any other authors’ malware but their own. As such, Jigsaw Puzzle learns a trigger to complement the latent patterns of the malware author’s samples, and activates the backdoor only when the trigger and the latent pattern are pieced together in a sample. We further focus on realizable triggers in the problem space (e.g., software code) using bytecode gadgets broadly harvested from benign software. Our evaluation confirms that Jigsaw Puzzle is effective as a backdoor, remains stealthy against state-of-the-art defenses, and is a threat in realistic settings that depart from reasoning about feature-space-only attacks. We conclude by exploring promising approaches to improve backdoor defenses.
Links
BayBFed: Bayesian Backdoor Defense for Federated Learning.
Authors
- Kavita Kumari, Technical University of Darmstadt; The University of Texas at San Antonio
- Phillip Rieger, Technical University of Darmstadt
- Hossein Fereidooni, Technical University of Darmstadt
- Murtuza Jadliwala, The University of Texas at San Antonio
- Ahmad-Reza Sadeghi, Technical University of Darmstadt
Abstract
Federated learning (FL) is an emerging technology that allows participants to jointly train a machine learning model without sharing their private data with others. However, FL is vulnerable to poisoning attacks such as backdoor attacks. Consequently, a variety of defenses have recently been proposed, which have primarily utilized intermediary states of the global model (i.e., logits) or the distance of the local models (i.e., L2-norm) with respect to the global model to detect malicious backdoors in FL. However, as these approaches directly operate on client updates (or weights), their effectiveness depends on factors such as clients’ data distribution or the adversary’s attack strategies. In this paper, we introduce a novel and more generic backdoor defense framework, called BayBFed, which proposes to utilize probability distributions over client updates to detect malicious updates in FL: BayBFed computes a probabilistic measure over the clients’ updates to keep track of any adjustments made in the updates, and uses a novel detection algorithm that can leverage this probabilistic measure to efficiently detect and filter out malicious updates. Thus, it overcomes the shortcomings of previous approaches that arise due to the direct usage of client updates; nevertheless, our probabilistic measure will include all aspects of the local client training strategies. BayBFed utilizes two Bayesian NonParametric (BNP) extensions: (i) a Hierarchical Beta-Bernoulli process to draw a probabilistic measure given the clients’ updates, and (ii) an adaptation of the Chinese Restaurant Process (CRP), referred to by us as CRP-Jensen, which leverages this probabilistic measure to detect and filter out malicious updates. We extensively evaluate our defense approach on five benchmark datasets: CIFAR10, Reddit, IoT intrusion detection, MNIST, and FMNIST, and show that it can effectively detect and eliminate malicious updates in FL without deteriorating the benign performance of the global model.
Links
Redeem Myself: Purifying Backdoors in Deep Learning Models using Self Attention Distillation.
Authors
- Xueluan Gong, School of Computer Science, Wuhan University, China
- Yanjiao Chen, College of Electrical Engineering, Zhejiang University, China
- Wang Yang, School of Cyber Science and Engineering, Wuhan University, China
- Qian Wang, School of Cyber Science and Engineering, Wuhan University, China
- Yuzhe Gu, School of Cyber Science and Engineering, Wuhan University, China
- Huayang Huang, School of Cyber Science and Engineering, Wuhan University, China
- Chao Shen, School of Cyber Science and Engineering, Xi’an Jiaotong University, China
Abstract
Recent works have revealed the vulnerability of deep neural networks to backdoor attacks, where a backdoored model orchestrates targeted or untargeted misclassification when activated by a trigger. A line of purification methods (e.g., fine-pruning, neural attention transfer, MCR [69]) has been proposed to remove the backdoor from a model. However, they either fail to reduce the attack success rate of more advanced backdoor attacks or largely degrade the prediction capacity of the model on clean samples. In this paper, we put forward a new purification defense framework, dubbed SAGE, which utilizes self-attention distillation to purge models of backdoors. Unlike traditional attention transfer mechanisms that require a teacher model to supervise the distillation process, SAGE can realize self-purification with a small number of clean samples. To enhance the defense performance, we further propose a dynamic learning rate adjustment strategy that carefully tracks the prediction accuracy of clean samples to guide the learning rate adjustment. We compare the defense performance of SAGE with 6 state-of-the-art defense approaches against 8 backdoor attacks on 4 datasets. It is shown that SAGE can reduce the attack success rate by as much as 90% with less than a 3% decrease in prediction accuracy for clean samples. We will open-source our code upon publication.
Links
Threshold BBS+ Signatures for Distributed Anonymous Credential Issuance.
Authors
- Jack Doerner, Technion
- Yashvanth Kondi, Aarhus University
- Eysa Lee, Northeastern University
- Abhi Shelat, Northeastern University
- LaKyah Tyner, Northeastern University
Abstract
We propose a secure multiparty signing protocol for the BBS+ signature scheme; in other words, an anonymous credential scheme with threshold issuance. We prove that due to the structure of the BBS+ signature, simply verifying the signature produced by an otherwise semi-honest protocol is sufficient to achieve composable security against a malicious adversary. Consequently, our protocol is extremely simple and efficient: it involves a single request from the client (who requires a signature) to the signing parties, two exchanges of messages among the signing parties, and finally a response to the client; in some deployment scenarios the concrete cost bottleneck may be the client’s local verification of the signature that it receives. Furthermore, our protocol can be extended to support the strongest form of blind signing and to serve as a distributed evaluation protocol for the Dodis-Yampolskiy Oblivious VRF. We validate our efficiency claims by implementing and benchmarking our protocol.
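The single-round request/response shape of such a threshold scheme can be sketched with toy n-out-of-n additive secret sharing over a prime field (this is not BBS+, which relies on pairings; all names and parameters here are illustrative):

```python
import secrets

Q = 2**61 - 1  # a Mersenne prime standing in for the group order

def share_key(x, n):
    """Split the signing key x into n additive shares mod Q."""
    shares = [secrets.randbelow(Q) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % Q)
    return shares

def partial_respond(share, hashed_msg):
    """One signer's local contribution to the client's request."""
    return (share * hashed_msg) % Q

def combine(partials):
    """The client combines the signers' responses."""
    return sum(partials) % Q

x = 123456789                      # never reconstructed by any party
shares = share_key(x, n=3)
sig = combine(partial_respond(s, 42) for s in shares)
assert sig == (x * 42) % Q         # matches the sole-key-holder result
```

As in the paper's protocol, no single signer ever holds the full key; the client only sees the combined result, which it can then verify locally.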
Links
zk-creds: Flexible Anonymous Credentials from zkSNARKs and Existing Identity Infrastructure.
Authors
- Michael Rosenberg, University of Maryland
- Jacob White, Purdue University
- Christina Garman, Purdue University
- Ian Miers, University of Maryland
Abstract
Frequently, users on the web need to show that they are, for example, not a robot, old enough to access an age-restricted video, or eligible to download an ebook from their local public library without being tracked. Anonymous credentials were developed to address these concerns. However, existing schemes do not handle the realities of deployment or the complexities of real-world identity. Instead, they implicitly make assumptions such as there being an issuing authority for anonymous credentials that, for real applications, requires the local department of motor vehicles to issue sophisticated cryptographic tokens to show users are over 18. In reality, there are multiple trust sources for a given identity attribute, their credentials have distinctively different formats, and many, if not all, issuers are unwilling to adopt new protocols. We present and build zk-creds, a protocol that uses general-purpose zero-knowledge proofs to 1) remove the need for credential issuers to hold signing keys: credentials can be issued to a bulletin board instantiated as a transparency log, Byzantine system, or even a blockchain; 2) convert existing identity documents into anonymous credentials without modifying documents or coordinating with their issuing authority; 3) allow for flexible, composable, and complex identity statements over multiple credentials. Concretely, identity assertions using zk-creds take less than 150ms in a real-world scenario of using a passport to anonymously access age-restricted videos.
Links
Private Access Control for Function Secret Sharing.
Authors
- Sacha Servan-Schreiber, MIT CSAIL
- Simon Beyzerov, MIT PRIMES
- Eli Yablon, MIT PRIMES
- Hyojae Park, MIT PRIMES
Abstract
Function Secret Sharing (FSS; Eurocrypt 2015) allows a dealer to share a function f with two or more evaluators. Given secret shares of a function f, the evaluators can locally compute secret shares of f(x) for any input x, without learning information about f in the process. In this paper, we initiate the study of access control for FSS. Given the shares of f, the evaluators can ensure that the dealer is authorized to share the provided function. For a function family $\mathcal{F}$ and an access control list defined over the family, the evaluators receiving the shares of $f \in \mathcal{F}$ can efficiently check that the dealer knows the access key for f. This model enables new applications of FSS, such as: (1) anonymous authentication in a multi-party setting, (2) access control in private databases, and (3) authentication and spam prevention in anonymous communication systems. Our definitions and constructions abstract and improve the concrete efficiency of several recent systems that implement ad-hoc mechanisms for access control over FSS. The main building block behind our efficiency improvement is a discrete-logarithm zero-knowledge proof-of-knowledge over secret-shared elements, which may be of independent interest. We evaluate our constructions and show a 50–70× reduction in computational overhead compared to existing access control techniques used in anonymous communication. In other applications, such as private databases, the processing cost of introducing access control is only 1.5–3×, when amortized over databases with 500,000 or more items.
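The correctness property of FSS, shares that individually look random but sum to f(x), can be demonstrated with a deliberately inefficient truth-table construction (a toy sketch, not any scheme from the paper; real FSS keys are succinct):

```python
import secrets

MOD = 2**32

def deal(alpha, beta, domain_size):
    """Dealer: additively share the truth table of
    f(x) = beta if x == alpha else 0 between two evaluators."""
    table0 = [secrets.randbelow(MOD) for _ in range(domain_size)]
    table1 = [(-t) % MOD for t in table0]          # shares of 0 everywhere...
    table1[alpha] = (beta - table0[alpha]) % MOD   # ...except at alpha
    return table0, table1

def evaluate(table, x):
    # Each evaluator sees only uniformly random values,
    # learning nothing about alpha or beta on its own.
    return table[x]

k0, k1 = deal(alpha=5, beta=7, domain_size=16)
for x in range(16):
    assert (evaluate(k0, x) + evaluate(k1, x)) % MOD == (7 if x == 5 else 0)
```

The access-control question the paper studies is orthogonal to this correctness property: the evaluators must additionally verify that the dealer was authorized to share this particular f, without learning f itself.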
Links
MPCAuth: Multi-factor Authentication for Distributed-trust Systems.
Authors
- Sijun Tan, University of California, Berkeley
- Weikeng Chen, University of California, Berkeley
- Ryan Deng, University of California, Berkeley
- Raluca Ada Popa, University of California, Berkeley
Abstract
Systems with distributed trust have attracted growing research attention and seen increasing industry adoption. In these systems, critical secrets are distributed across N servers, and computations are performed privately using secure multi-party computation (SMPC). Authentication for these distributed-trust systems faces two challenges. The first challenge is ease-of-use. Namely, how can an authentication protocol maintain its user experience without sacrificing security? To avoid a central point of attack, a client needs to authenticate to each server separately. However, this would require the client to authenticate N times for each authentication factor, which greatly hampers usability. The second challenge is privacy, as the client’s sensitive profiles are now exposed to all N servers under different trust domains, which creates N times the attack surface for the profile data. We present MPCAuth, a multi-factor authentication system for distributed-trust applications that addresses both challenges. Our system enables a client to authenticate to N servers independently with the work of only one authentication. In addition, our system is profile hiding, meaning that the client’s authentication profiles such as her email username, phone number, passwords, and biometric features are not revealed unless all servers are compromised. We propose secure and practical protocols for an array of widely adopted authentication factors, including email passcodes, SMS messages, U2F, security questions/passwords, and biometrics. Our system finds practical applications in the space of cryptocurrency custody and collaborative machine learning, and benefits future adoption of distributed-trust applications.
Links
Silph: A Framework for Scalable and Accurate Generation of Hybrid MPC Protocols.
Authors
- Edward Chen, Carnegie Mellon University
- Jinhao Zhu, Carnegie Mellon University
- Alex Ozdemir, Stanford University
- Riad S. Wahby, Carnegie Mellon University
- Fraser Brown, Carnegie Mellon University
- Wenting Zheng, Carnegie Mellon University
Abstract
Many applications in finance and healthcare need access to data from multiple organizations. While these organizations can benefit from computing on their joint datasets, they often cannot share data with each other due to regulatory constraints and business competition. One way mutually distrusting parties can collaborate without sharing their data in the clear is to use secure multiparty computation (MPC). However, MPC’s performance presents a serious obstacle for adoption as it is difficult for users who lack expertise in advanced cryptography to optimize. In this paper, we present Silph, a framework that can automatically compile a program written in a high-level language to an optimized, hybrid MPC protocol that mixes multiple MPC primitives securely and efficiently. Compared to prior works, our compilation speed is improved by up to 30000×. On various database analytics and machine learning workloads, the MPC protocols generated by Silph match or outperform prior work by up to 3.6×.
Links
SoK: Anti-Facial Recognition Technology.
Authors
- Emily Wenger, University of Chicago
- Shawn Shan, University of Chicago
- Haitao Zheng, University of Chicago
- Ben Y. Zhao, University of Chicago
Abstract
The rapid adoption of facial recognition (FR) technology by both government and commercial entities in recent years has raised concerns about civil liberties and privacy. In response, a broad suite of so-called "anti-facial recognition" (AFR) tools has been developed to help users avoid unwanted facial recognition. The set of AFR tools proposed in the last few years is wide-ranging and rapidly evolving, necessitating a step back to consider the broader design space of AFR systems and long-term challenges. This paper aims to fill that gap and provides the first comprehensive analysis of the AFR research landscape. Using the operational stages of FR systems as a starting point, we create a systematic framework for analyzing the benefits and tradeoffs of different AFR approaches. We then consider both technical and social challenges facing AFR tools and propose directions for future research in this field.
Links
Spoofing Real-world Face Authentication Systems through Optical Synthesis.
Authors
- Yueli Yan, School of Information Science and Technology, ShanghaiTech University, China
- Zhice Yang, School of Information Science and Technology, ShanghaiTech University, China
Abstract
Facial information has been used for authentication purposes. Recent face authentication systems leverage multi-modal cameras to defeat spoofing attacks. Multi-modal cameras are able to simultaneously observe the targeted person from multiple physical aspects, such as the visible, infrared, and depth domains. Known spoofing attacks are not effective in evading detection since they cannot simulate multiple modalities at the same time. This paper presents a new class of spoofing attacks on multi-modal face authentication systems. Its main idea is to forge each and every modality and then combine them together to present to the camera. The attack is realized with a special display device called the Hua-pi display. It costs less than $500 and incorporates dedicated scene generators to optically reproduce multi-modal scenes of an authorized user, and then synthesizes the scenes together at the camera’s viewpoint through optical combiners to fool face authentication systems. We evaluate the risks of this attack by systematically testing it against the latest commercial face authentication products from major vendors in the field. The results not only demonstrate a successful bypass rate of 80% but also characterize the influencing factors and their feasible regions, revealing a new and realistic threat in the field.
Links
ImU: Physical Impersonating Attack for Face Recognition System with Natural Style Changes.
Authors
- Shengwei An, Purdue University
- Yuan Yao, Nanjing University
- Qiuling Xu, Purdue University
- Shiqing Ma, Rutgers University
- Guanhong Tao, Purdue University
- Siyuan Cheng, Purdue University
- Kaiyuan Zhang, Purdue University
- Yingqi Liu, Purdue University
- Guangyu Shen, Purdue University
- Ian Kelk, Clarifai Inc
- Xiangyu Zhang, Purdue University
Abstract
This paper presents a novel physical impersonating attack against face recognition systems. It aims at generating consistent style changes across multiple pictures of the attacker under different conditions and poses. Additionally, the style changes are required to be physically realizable by make-up and can induce the intended misclassification. To achieve the goal, we develop novel techniques to embed multiple pictures of the same physical person to vectors in the StyleGAN’s latent space, such that the embedded latent vectors have some implicit correlations to make the search for consistent style changes feasible. Our digital and physical evaluation results show our approach can allow an outsider attacker to successfully impersonate the insiders with consistent and natural changes.
Links
DepthFake: Spoofing 3D Face Authentication with a 2D Photo.
Authors
- Zhihao Wu, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Yushi Cheng, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University
- Jiahui Yang, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Xiaoyu Ji, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Wenyuan Xu, Ubiquitous System Security Lab (USSLAB), Zhejiang University
Abstract
Face authentication has been widely used in access control, and the latest 3D face authentication systems employ 3D liveness detection techniques to cope with photo replay attacks, whereby an attacker uses a 2D photo to bypass the authentication. In this paper, we analyze the security of 3D liveness detection systems that utilize structured light depth cameras and discover a new attack surface against 3D face authentication systems. We propose DepthFake attacks that can spoof 3D face authentication using only a single 2D photo. To achieve this goal, DepthFake first estimates the 3D depth information of a target victim’s face from his 2D photo. Then, DepthFake projects carefully-crafted scatter patterns embedded with the face depth information, in order to empower the 2D photo with 3D authentication properties. We overcome a collection of practical challenges, e.g., depth estimation errors from 2D photos, depth image forgery based on structured light, and the alignment of the RGB and depth images of a face, and implement DepthFake in laboratory setups. We validated DepthFake on 3 commercial face authentication systems (i.e., Tencent Cloud, Baidu Cloud, and 3DiVi) and one commercial access control device. The results over 50 users demonstrate that DepthFake achieves an overall depth attack success rate of 79.4% and an RGB-D attack success rate of 59.4% in the real world.
Links
Understanding the (In)Security of Cross-side Face Verification Systems in Mobile Apps: A System Perspective.
Authors
- Xiaohan Zhang, Fudan University, Shanghai, China
- Haoqi Ye, Fudan University, Shanghai, China
- Ziqi Huang, Fudan University, Shanghai, China
- Xiao Ye, Fudan University, Shanghai, China
- Yinzhi Cao, Johns Hopkins University, Baltimore, USA
- Yuan Zhang, Fudan University, Shanghai, China
- Min Yang, Fudan University, Shanghai, China
Abstract
Face Verification Systems (FVSes) are more and more deployed by real-world mobile applications (apps) to verify a human’s claimed identity. One popular type of FVS is called cross-side FVS (XFVS), which splits the FVS functionality into two sides: one at a mobile phone to take pictures or videos and the other at a trusted server for verification. Prior works have studied the security of XFVSes from the machine learning perspective, i.e., whether the learning models used by XFVSes are robust to adversarial attacks. However, the security of other parts of XFVSes, especially the design and implementation of the verification procedure used by XFVSes, is not well understood. In this paper, we conduct the first measurement study on the security of real-world XFVSes used by popular mobile apps from a system perspective. More specifically, we design and implement a semi-automated system, called XFVSChecker, to detect XFVSes in mobile apps and then inspect their compliance with four security properties. Our evaluation reveals that most existing XFVS apps, including those with billions of downloads, are vulnerable to at least one of four types of attacks. These attacks require only easily available attack prerequisites, such as one photo of the victim, to pose significant security risks, including complete account takeover, identity fraud, and financial loss. Our findings result in 14 Chinese National Vulnerability Database (CNVD) IDs, and one of them, CNVD-2021-86899, was awarded the most valuable vulnerability of 2021 among all the vulnerabilities reported to CNVD.
Links
Breaking Security-Critical Voice Authentication.
Authors
- Andre Kassis, Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
- Urs Hengartner, Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
Abstract
Voice authentication (VA) has recently become an integral part in numerous security-critical operations, such as bank transactions and call center conversations. The vulnerability of automatic speaker verification systems (ASVs) to spoofing attacks instigated the development of countermeasures (CMs), whose task is to differentiate between bonafide and spoofed speech. Together, ASVs and CMs form today’s VA systems and are being advertised as an impregnable access control mechanism. We develop the first practical attack on spoofing countermeasures, and demonstrate how a malicious actor may efficiently craft audio samples against these defenses. Previous adversarial attacks against VA have been mainly designed for the whitebox scenario, which assumes knowledge of the system’s internals, or require large query and time budgets to launch target-specific attacks. When attacking a security-critical system, these assumptions do not hold. Our attack, on the other hand, targets common points of failure that all spoofing countermeasures share, making it real-time, model-agnostic, and completely blackbox without the need to interact with the target to craft the attack samples. The key message from our work is that CMs mistakenly learn to distinguish between spoofed and bonafide audio based on cues that are easily identifiable and forgeable. The effects of our attack are subtle enough to guarantee that these adversarial samples can still bypass the ASV as well and preserve their original textual contents. These properties combined make for a powerful attack that can bypass security-critical VA in its strictest form, yielding success rates of up to 99% with only 6 attempts. Finally, we perform the first targeted, over-telephony-network attack on CMs, bypassing several known challenges and enabling a variety of potential threats, given the increased use of voice biometrics in call centers. Our results call into question the security of modern VA systems and urge users to rethink their trust in them, in light of the real threat of attackers bypassing these measures to gain access to their most valuable resources.
Links
SoK: A Critical Evaluation of Efficient Website Fingerprinting Defenses.
Authors
- Nate Mathews, Rochester Institute of Technology
- James K Holland, University of Minnesota
- Se Eun Oh, Ewha Womans University
- Mohammad Saidur Rahman, Rochester Institute of Technology
- Nicholas Hopper, University of Minnesota
- Matthew Wright, Rochester Institute of Technology
Abstract
Recent website fingerprinting attacks have been shown to achieve very high performance against traffic through Tor. These attacks allow an adversary to deduce the website a Tor user has visited by simply eavesdropping on the encrypted communication. This has consequently motivated the development of many defense strategies that obfuscate traffic through the addition of dummy packets and/or delays. The efficacy and practicality of many of these recent proposals have yet to be scrutinized in detail. In this study, we re-evaluate nine recent defense proposals that claim to provide adequate security with low overheads using the latest Deep Learning-based attacks. Furthermore, we assess the feasibility of implementing these defenses within the current confines of Tor. To this end, we additionally provide the first on-network implementation of the DynaFlow defense to better assess its real-world utility.
Links
Fashion Faux Pas: Implicit Stylistic Fingerprints for Bypassing Browsers' Anti-Fingerprinting Defenses.
Authors
- Xu Lin, University of Illinois, Chicago
- Frederico Araujo, IBM Research
- Teryl Taylor, IBM Research
- Jiyong Jang, IBM Research
- Jason Polakis, University of Illinois, Chicago
Abstract
Browser fingerprinting remains a topic of particular interest for both the research community and the browser ecosystem, and various anti-fingerprinting countermeasures have been proposed by prior work or deployed by browsers. While preventing fingerprinting presents a challenging task, modern fingerprinting techniques heavily rely on JavaScript APIs, which creates a choke point that can be targeted by countermeasures. In this paper, we explore how browser fingerprints can be generated without using any JavaScript APIs. To that end we develop StylisticFP, a novel fingerprinting system that relies exclusively on CSS features and implicitly infers system characteristics, including advanced fingerprinting attributes like the list of supported fonts, through carefully constructed and arranged HTML elements. We empirically demonstrate our system's effectiveness against privacy-focused browsers (e.g., Safari, Firefox, Brave, Tor) and popular privacy-preserving extensions. We also conduct a pilot study in a research organization and find that our system is comparable to a state-of-the-art JavaScript-based fingerprinting library at distinguishing devices, while outperforming it against browsers with anti-fingerprinting defenses. Our work highlights an additional dimension of the significant challenge posed by browser fingerprinting, and reaffirms the need for more robust detection systems and countermeasures.
Links
Robust Multi-tab Website Fingerprinting Attacks in the Wild.
Authors
- Xinhao Deng, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China
- Qilei Yin, Zhongguancun Laboratory, Beijing, China
- Zhuotao Liu, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China; Zhongguancun Laboratory, Beijing, China
- Xiyuan Zhao, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China
- Qi Li, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China; Zhongguancun Laboratory, Beijing, China
- Mingwei Xu, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China; Zhongguancun Laboratory, Beijing, China
- Ke Xu, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China; Zhongguancun Laboratory, Beijing, China
- Jianping Wu, Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China; Zhongguancun Laboratory, Beijing, China
Abstract
Website fingerprinting enables an eavesdropper to determine which websites a user is visiting over an encrypted connection. State-of-the-art website fingerprinting (WF) attacks have demonstrated effectiveness even against Tor-protected network traffic. However, existing WF attacks have critical limitations on accurately identifying websites in multi-tab browsing sessions, where the holistic pattern of individual websites is no longer preserved, and the number of tabs opened by a client is unknown a priori. In this paper, we propose ARES, a novel WF framework natively designed for multi-tab WF attacks. ARES formulates the multi-tab attack as a multi-label classification problem and solves it using a multi-classifier framework. Each classifier, designed based on a novel transformer model, identifies a specific website using its local patterns extracted from multiple traffic segments. We implement a prototype of ARES and extensively evaluate its effectiveness using our large-scale dataset collected over multiple months (by far the largest multi-tab WF dataset studied in academic papers). The experimental results illustrate that ARES effectively achieves the multi-tab WF attack with a best F1-score of 0.907. Further, ARES remains robust even against various WF defenses.
Links
Only Pay for What You Leak: Leveraging Sandboxes for a Minimally Invasive Browser Fingerprinting Defense.
Authors
- Ryan Torok, Department of Computer Science, Princeton University, Princeton, New Jersey, USA
- Amit Levy, Department of Computer Science, Princeton University, Princeton, New Jersey, USA
Abstract
We present Sandcastle, an entropy-based browser fingerprinting defense that aims to minimize its interference with legitimate web applications. Sandcastle allows developers to partition code that operates on identifiable information into sandboxes to prove to the browser that the information cannot be sent in any network request. Meanwhile, sandboxes may make full use of identifiable information on the client side, including writing to dedicated regions of the Document Object Model. For applications where this policy is too strict, Sandcastle provides an expressive cashier that allows precise control over the granularity of data that is leaked to the network. These features allow Sandcastle to eliminate most or all of the noise added to the outputs of identifiable APIs by Chrome’s Privacy Budget framework, the current state of the art in entropy-based fingerprinting defenses. Enabling unlimited client-side use of identifiable information allows a much more comprehensive set of web applications to run under a fingerprinting defense, such as 3D games and video streaming, and provides a mechanism to expand the space of APIs that can be introduced to the web ecosystem without sacrificing privacy.
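The entropy-accounting idea behind budget-based defenses such as the Privacy Budget framework can be sketched as follows (the attribute names, fractions, and 8-bit budget are illustrative assumptions, not Sandcastle's or Chrome's actual accounting):

```python
import math

def leaked_bits(observed_fraction):
    """Surprisal of an attribute value shared by this fraction of users."""
    return math.log2(1.0 / observed_fraction)

def allow_leak(attributes, budget_bits=8.0):
    """attributes: list of (name, fraction_of_users_with_this_value).
    Permit the leak only if the total identifying entropy fits the budget."""
    total = sum(leaked_bits(p) for _, p in attributes)
    return total <= budget_bits, total

ok, bits = allow_leak([("screen_width", 0.25),    # 2 bits
                       ("timezone", 0.125),       # 3 bits
                       ("font_list", 0.25)])      # 2 bits
print(ok, bits)  # → True 7.0
```

Sandcastle's sandboxes sidestep this accounting for client-side-only uses: code that provably cannot reach the network need not consume any budget at all.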
Links
It's (DOM) Clobbering Time: Attack Techniques, Prevalence, and Defenses.
Authors
- Soheil Khodayari, CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
- Giancarlo Pellegrino, CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
Abstract
DOM Clobbering is a type of code-less injection attack where attackers insert a piece of non-script, seemingly benign HTML markup into a webpage and transform it into executable code by exploiting unforeseen interactions between JavaScript code and the runtime environment. The attack techniques, browser behaviours, and vulnerable code patterns that enable DOM Clobbering have not been studied yet, and in this paper, we undertake one of the first evaluations of the state of DOM Clobbering on the Web platform. Starting with a comprehensive survey of existing literature and dynamic analysis of 19 different mobile and desktop browsers, we systematize DOM Clobbering attacks, uncovering 31.4K distinct markups that use five different techniques to unexpectedly overwrite JavaScript variables in at least one browser. Then, we use our systematization to identify and characterize program instructions that can be overwritten by DOM Clobbering, and use it to present TheThing, an automated system that detects clobberable data flows to security-sensitive instructions. We instantiate TheThing on the Tranco top 5K sites, quantifying the prevalence and impact of DOM Clobbering in the wild. Our evaluation uncovers that DOM Clobbering vulnerabilities are ubiquitous, with a total of 9,467 vulnerable data flows across 491 affected sites, making it possible to mount arbitrary code execution, open redirection, or client-side request forgery attacks even against popular websites such as Fandom, Trello, Vimeo, TripAdvisor, WikiBooks, and GitHub, that were not exploitable through traditional attack vectors. Finally, we also evaluate the robustness of existing countermeasures, such as HTML sanitizers and Content Security Policy, against DOM Clobbering.
Links
Scaling JavaScript Abstract Interpretation to Detect and Exploit Node.js Taint-style Vulnerability.
Authors
- Mingqing Kang, Johns Hopkins University
- Yichao Xu, Johns Hopkins University
- Song Li, Zhejiang University
- Rigel Gjomemo, University of Illinois Chicago
- Jianwei Hou, Renmin University of China
- V. N. Venkatakrishnan, University of Illinois Chicago
- Yinzhi Cao, Johns Hopkins University
Abstract
Taint-style vulnerabilities, such as OS command injection and path traversal, are common and severe software weaknesses. There exists an inherent trade-off between analysis scalability and accuracy in detecting such vulnerabilities. On one hand, existing syntax-directed approaches often compromise analysis accuracy on dynamic features like bracket syntax. On the other hand, existing abstract interpretation often faces the issue of state explosion in the abstract domain, thus leading to a scalability problem. In this paper, we present a novel approach, called FAST, to scale the vulnerability discovery of JavaScript packages via a novel abstract interpretation approach that relies on two new techniques, called bottom-up and top-down abstract interpretation. The former abstractly interprets functions based on scopes instead of call sequences to construct dynamic call edges. Then, the latter follows specific control-flow paths and prunes the program to skip statements unrelated to the sink. If an end-to-end data-flow path is found, FAST queries the satisfiability of constraints along the path and verifies the exploitability to reduce human effort. We implement a prototype of FAST and evaluate it against real-world Node.js packages. We show that FAST is able to find 242 zero-day vulnerabilities in NPM, with 21 CVE identifiers assigned. Our evaluation also shows that FAST can scale to real-world applications such as NodeBB and popular frameworks such as total.js and strapi, finding legacy vulnerabilities that no prior work can.
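The kind of data-flow question FAST answers, whether a source-derived value reaches a sink, can be illustrated with a minimal taint tracker over straight-line assignments (an illustrative toy, not FAST's abstract interpretation; all names are assumptions):

```python
def analyze(program, sources):
    """program: list of ('SINK', name, arg_vars) calls or
    (lhs_var, rhs_vars) assignments, in execution order.
    Returns the names of sinks reached by source-derived data."""
    tainted = set(sources)
    flagged = []
    for stmt in program:
        if stmt[0] == "SINK":
            _, name, args = stmt
            if tainted & set(args):
                flagged.append(name)
        else:
            lhs, rhs = stmt
            if tainted & set(rhs):
                tainted.add(lhs)
            else:
                tainted.discard(lhs)  # strong update: overwritten with clean data
    return flagged

# User-controlled `req_query` flows through `cmd` into an exec-like sink:
prog = [("cmd", ["req_query"]),
        ("msg", ["greeting"]),
        ("SINK", "exec", ["cmd"]),
        ("SINK", "log", ["msg"])]
print(analyze(prog, sources={"req_query"}))  # → ['exec']
```

Real JavaScript analysis must additionally handle branching, higher-order functions, and dynamic property access, which is where the paper's bottom-up/top-down split and path pruning come in.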
Links
Sound Verification of Security Protocols: From Design to Interoperable Implementations.
Authors
- Linard Arquint, Department of Computer Science, ETH Zurich, Switzerland
- Felix A. Wolf, Department of Computer Science, ETH Zurich, Switzerland
- Joseph Lallemand, Univ Rennes, CNRS, IRISA, France
- Ralf Sasse, Department of Computer Science, ETH Zurich, Switzerland
- Christoph Sprenger, Department of Computer Science, ETH Zurich, Switzerland
- Sven N. Wiesner, Department of Computer Science, ETH Zurich, Switzerland
- David Basin, Department of Computer Science, ETH Zurich, Switzerland
- Peter Müller, Department of Computer Science, ETH Zurich, Switzerland
Abstract
We provide a framework consisting of tools and metatheorems for the end-to-end verification of security protocols, which bridges the gap between automated protocol verification and code-level proofs. We automatically translate a Tamarin protocol model into a set of I/O specifications expressed in separation logic. Each such specification describes a protocol role’s intended I/O behavior against which the role’s implementation is then verified. Our soundness result guarantees that the verified implementation inherits all security (trace) properties proved for the Tamarin model. Our framework thus enables us to leverage the substantial body of prior verification work in Tamarin to verify new and existing implementations. The ability to use any separation logic code verifier provides flexibility regarding the target language. To validate our approach and show that it scales to real-world protocols, we verify a substantial part of the official Go implementation of the WireGuard VPN key exchange protocol.
Links
Typing High-Speed Cryptography against Spectre v1.
Authors
- Basavesh Ammanaghatta Shivakumar, MPI-SP, Bochum, Germany
- Gilles Barthe, MPI-SP, Bochum, Germany; IMDEA Software Institute, Madrid, Spain
- Benjamin Grégoire, Inria, Sophia Antipolis, France
- Vincent Laporte, CNRS, Inria, LORIA, Université de Lorraine, Nancy, France
- Tiago Oliveira, MPI-SP, Bochum, Germany
- Swarn Priya, Inria, Sophia Antipolis, France
- Peter Schwabe, MPI-SP, Bochum, Germany; Radboud University, Nijmegen, The Netherlands
- Lucas Tabary-Maujean, ENS Paris-Saclay, Gif-sur-Yvette, France
Abstract
The current gold standard of cryptographic software is to write efficient libraries with systematic protections against timing attacks. In order to meet this goal, cryptographic engineers increasingly use high-assurance cryptography tools. These tools guide programmers and provide rigorous guarantees that can be verified independently by library users. However, high-assurance tools reason about overly simple execution models that elide transient execution leakage. Thus, implementations validated by high-assurance cryptography tools remain potentially vulnerable to transient execution attacks such as Spectre or Meltdown. Moreover, proposed countermeasures are not used in practice due to performance overhead. We propose, analyze, implement and evaluate an approach for writing efficient cryptographic implementations that are protected against Spectre v1 attacks. Our approach ensures speculative constant-time, an information flow property which guarantees that programs are protected against Spectre v1. Speculative constant-time is enforced by means of a (value-dependent) information flow type system. The type system tracks security levels depending on whether execution is misspeculating. We implement our approach in the Jasmin framework for high-assurance cryptography, and use it for protecting all implementations of an experimental cryptographic library that includes highly optimized implementations of symmetric primitives, of elliptic-curve cryptography, and of Kyber, a lattice-based KEM recently selected by NIST for standardization. The performance impact of our protections is very low; for example, less than 1% for Kyber and essentially zero for X25519.
Links
Less is more: refinement proofs for probabilistic proofs.
Authors
- Kunming Jiang, NYU Department of Computer Science, Courant Institute
- Devora Chait-Roth, NYU Department of Computer Science, Courant Institute
- Zachary DeStefano, NYU Department of Computer Science, Courant Institute
- Michael Walfish, NYU Department of Computer Science, Courant Institute
- Thomas Wies, NYU Department of Computer Science, Courant Institute
Abstract
There has been intense interest over the last decade in implementations of probabilistic proofs (IPs, SNARKs, PCPs, and so on): protocols in which an untrusted party proves to a verifier that a given computation was executed properly, possibly in zero knowledge. Nevertheless, implementations still do not scale beyond small computations. A central source of overhead is the front-end: translating from the abstract computation to a set of equivalent arithmetic constraints. This paper introduces a general-purpose framework, called Distiller, in which a user translates to constraints not the original computation but an abstracted specification of it. Distiller is the first in this area to perform such transformations in a way that is provably safe. Furthermore, by taking the idea of "encode a check in the constraints" to its literal logical extreme, Distiller exposes many new opportunities for constraint reduction, resulting in cost reductions for benchmark computations of 1.3–50×, and in some cases, better asymptotics.
Links
Owl: Compositional Verification of Security Protocols via an Information-Flow Type System.
Authors
- Joshua Gancher, Carnegie Mellon University
- Sydney Gibson, Carnegie Mellon University
- Pratap Singh, Carnegie Mellon University
- Samvid Dharanikota, Carnegie Mellon University
- Bryan Parno, Carnegie Mellon University
Abstract
Computationally sound protocol verification tools promise to deliver full-strength cryptographic proofs for security protocols. Unfortunately, current tools lack either modularity or automation. We propose a new approach based on a novel use of information flow and refinement types for sound cryptographic proofs. Our framework, Owl, allows type-based modular descriptions of security protocols, wherein disjoint subprotocols can be programmed and automatically proved secure separately. We give a formal security proof for Owl via a core language which supports symmetric and asymmetric primitives, Diffie-Hellman operations, and hashing via random oracles. We also implement a type checker for Owl and a prototype extraction mechanism to Rust, and evaluate both on 14 case studies, including (simplified forms of) SSH key exchange and Kerberos.
Links
AUC: Accountable Universal Composability.
Authors
- Mike Graf, University of Stuttgart, Stuttgart, Germany
- Ralf Küsters, University of Stuttgart, Stuttgart, Germany
- Daniel Rausch, University of Stuttgart, Stuttgart, Germany
Abstract
Accountability is a well-established and widely used security concept that allows for obtaining undeniable cryptographic proof of misbehavior, thereby incentivizing honest behavior. There already exist several general-purpose accountability frameworks for formal game-based security analyses. Unfortunately, such game-based frameworks do not support modular security analyses, which are an important tool for handling the complexity of modern protocols. Universal composability (UC) models provide native support for modular analyses, including re-use and composition of security results. So far, accountability has mainly been modeled and analyzed in UC models for the special case of MPC protocols, with a general-purpose accountability framework for UC still missing; that is, a framework that, among other things, supports arbitrary protocols, a wide range of accountability properties, handling and mixing of accountable and non-accountable security properties, and modular analysis of accountable protocols. To close this gap, we propose AUC, the first general-purpose accountability framework for UC models, which supports all of the above, based on several new concepts. We exemplify AUC in three case studies not covered by existing works. In particular, AUC unifies existing UC accountability approaches within a single framework.
Links
High-Order Masking of Lattice Signatures in Quasilinear Time.
Authors
- Rafaël del Pino, PQShield SAS, France
- Thomas Prest, PQShield SAS, France
- Mélissa Rossi, ANSSI, France
- Markku-Juhani O. Saarinen, PQShield LTD, UK
Abstract
In recent years, lattice-based signature schemes have emerged as the most prominent post-quantum solutions, as illustrated by NIST’s selection of Falcon and Dilithium for standardization. Both schemes enjoy good performance characteristics. However, their efficiency dwindles in the presence of side-channel protections, particularly masking – perhaps the strongest generic side-channel countermeasure. Masking at order d-1 requires randomizing all sensitive intermediate variables into d shares. With existing schemes, signature generation complexity grows quadratically with the number of shares, making high-order masking prohibitively slow. In this paper, we turn the problem upside-down: we design a lattice-based signature scheme specifically for side-channel resistance and optimize the masked efficiency as a function of the number of shares. Our design avoids costly operations such as conversions between arithmetic and Boolean encodings (A2B/B2A) and masked rejection sampling, and requires neither a masked SHAKE implementation nor other symmetric primitives. The resulting scheme is called Raccoon and belongs to the family of Fiat-Shamir with aborts lattice-based signatures. Raccoon is the first lattice-based signature whose key generation and signing running time has only an O(d log(d)) overhead, with d being the number of shares. Our reference C implementation confirms that Raccoon’s performance is comparable to other state-of-the-art signature schemes, except that increasing the number of shares has a near-linear effect on its latency. We also present an FPGA implementation and perform a physical leakage assessment to verify its basic security properties.
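As a reminder of what masking into d shares means, here is a minimal additive-sharing sketch in Python (the modulus and values are toy choices for illustration, not Raccoon’s actual parameters): any d-1 shares are uniformly random on their own, yet all d recombine to the secret.

```python
import random

Q = 2**16 + 1  # toy modulus, chosen only for illustration

def share(secret: int, d: int, rng: random.Random) -> list[int]:
    # Split `secret` into d additive shares mod Q: the first d-1 are
    # uniformly random, and the last is chosen so the sum of all d
    # shares equals the secret.
    shares = [rng.randrange(Q) for _ in range(d - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

rng = random.Random(1)
shares = share(1234, 8, rng)
assert len(shares) == 8
assert sum(shares) % Q == 1234
```

Every masked operation in a scheme must then work on such share vectors without ever recombining them, which is where the quadratic cost of existing schemes (and Raccoon’s O(d log d) improvement) comes in.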
Links
Practical Timing Side-Channel Attacks on Memory Compression.
Authors
- Martin Schwarzl, Graz University of Technology
- Pietro Borrello, Sapienza University of Rome
- Gururaj Saileshwar, NVIDIA Research
- Hanna Müller, Graz University of Technology
- Michael Schwarz, CISPA Helmholtz Center for Information Security
- Daniel Gruss, Graz University of Technology
Abstract
Compression algorithms have side channels due to their data-dependent operations. So far, only the compression-ratio side channel has been exploited, i.e., the compressed data size. In this paper, we present Decomp+Time, the first memory-compression attack exploiting a timing side channel in compression algorithms. Decomp+Time affects a much broader set of applications than prior work. A key challenge is precisely crafting attacker-controlled compression payloads to enable the attack with sufficient resolution. Our evolutionary fuzzer, Comprezzor, finds effective Decomp+Time payloads that optimize latency differences such that decompression timing can even be exploited in remote attacks. Decomp+Time has a capacity of 9.73 kB/s locally, and 10.72 bit/min across the internet (14 hops). Using Comprezzor, we develop attacks that leak data bytewise in four different case studies: First, we leak 1.50 bit/min from Memcached via a remote PHP script. Second, we leak database records from PostgreSQL, at 2.69 bit/min, in a Python-Flask application over the internet. Third, we leak secrets at 49.14 bit/min locally from ZRAM-compressed pages on Linux. Fourth, we leak internal heap pointers from the V8 engine within the Google Chrome browser on a system using ZRAM. Thus, it is important to re-evaluate the use of compression on sensitive data even if the application is only reachable via a remote interface.
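The compression-ratio channel that prior work exploited (and that Decomp+Time generalizes to a timing channel) can be sketched in a few lines of Python; the secret value and the oracle below are purely hypothetical.

```python
import zlib

SECRET = b"sessionid=7f3a9c1b"  # hypothetical secret in the compression context

def size_oracle(guess: bytes) -> int:
    # Attacker input and secret are compressed together; a correct guess
    # is deduplicated by LZ77, so the output shrinks. Decomp+Time instead
    # measures how long (de)compression takes, which leaks even when the
    # compressed size is not observable.
    return len(zlib.compress(guess + SECRET, 9))

# A correct prefix guess compresses better than an unrelated one.
assert size_oracle(b"sessionid=7f") < size_oracle(b"qwer1234asdf")
```

Iterating such guesses byte by byte is what lets the case studies above leak data at bit-per-minute rates over the network.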
Links
TEEzz: Fuzzing Trusted Applications on COTS Android Devices.
Authors
- Marcel Busch, EPFL
- Aravind Machiry, Purdue University
- Chad Spensky, Allthenticate
- Giovanni Vigna, UC Santa Barbara
- Christopher Kruegel, UC Santa Barbara
- Mathias Payer, EPFL
Abstract
Security and privacy-sensitive smartphone applications use trusted execution environments (TEEs) to protect sensitive operations from malicious code. By design, TEEs have privileged access to the entire system but expose little to no insight into their inner workings. Moreover, real-world TEEs enforce strict format and protocol interactions when communicating with trusted applications (TAs), which prohibits effective automated testing. TEEzz is the first TEE-aware fuzzing framework capable of effectively fuzzing TAs in situ on production smartphones, i.e., the TA runs in the encrypted and protected TEE and the fuzzer may only observe interactions with the TA but has no control over the TA’s code or data. Unlike traditional fuzzing techniques, which monitor the execution of the program being fuzzed and view its memory after a crash, TEEzz only requires a limited view of the target. TEEzz overcomes key limitations of TEE fuzzing (e.g., lack of visibility into the executed TAs, proprietary exchange formats, and value dependencies of interactions) by automatically attempting to infer the field types and message dependencies of the TA API through its interactions, designing state- and type-aware fuzzing mutators, and creating an in situ, on-device fuzzer. Due to the limited availability of systematic fuzzing research for TAs on commercial-off-the-shelf (COTS) Android devices, we extensively examine existing solutions, explore their limitations, and demonstrate how TEEzz improves the state of the art. First, we show that general-purpose kernel driver fuzzers are ineffective for fuzzing TAs. Then, we establish a baseline for fuzzing TAs using a ground-truth experiment. We show that TEEzz outperforms other blackbox fuzzers, can improve greybox approaches (if TA source code is available), and even outperforms greybox approaches for stateful targets.
We found 13 previously unknown bugs in the latest versions of OPTEE TAs in total, out of which TEEzz is the only fuzzer to trigger three. We also ran TEEzz on popular phones and found 40 unique bugs for which one CVE was assigned so far.
Links
Half&Half: Demystifying Intel's Directional Branch Predictors for Fast, Secure Partitioned Execution.
Authors
- Hosein Yavarzadeh, University of California San Diego
- Mohammadkazem Taram, Purdue University
- Shravan Narayan, University of California San Diego; University of Texas at Austin
- Deian Stefan, University of California San Diego
- Dean Tullsen, University of California San Diego
Abstract
This paper presents Half&Half, a novel software defense against branch-based side-channel attacks. Half&Half isolates the effects of different protection domains on the conditional branch predictors (CBPs) in modern Intel processors. This work presents the first exhaustive analysis of modern conditional branch prediction structures, and reveals for the first time an unknown opportunity to physically partition all CBP structures and completely prevent leakage between two domains using the shared predictor. Half&Half is a software-only solution to branch predictor isolation that requires no changes to the hardware or ISA, and only requires minor modifications to be supported in existing compilers. We implement Half&Half in the LLVM and WebAssembly compilers and show that it incurs an order of magnitude lower overhead compared to the current state-of-the-art branch-based side-channel defenses.
Links
Improving Developers' Understanding of Regex Denial of Service Tools through Anti-Patterns and Fix Strategies.
Authors
- Sk Adnan Hassan, Virginia Tech, Blacksburg, VA, USA
- Zainab Aamir, Stony Brook University, Stony Brook, NY, USA
- Dongyoon Lee, Stony Brook University, Stony Brook, NY, USA
- James C. Davis, Purdue University, West Lafayette, IN, USA
- Francisco Servant, University of Málaga, Málaga, Spain
Abstract
Regular expressions are used for diverse purposes, including input validation and firewalls. Unfortunately, they can also lead to a security vulnerability called ReDoS (Regular Expression Denial of Service), caused by a super-linear worst-case execution time during regex matching. Due to the severity and prevalence of ReDoS, past work proposed automatic tools to detect and fix vulnerable regexes. Although these tools were evaluated in automatic experiments, their usability has not yet been studied. Our insight is that the usability of existing tools to detect and fix regexes will improve if we complement them with anti-patterns and fix strategies for vulnerable regexes. We developed novel anti-patterns for vulnerable regexes, and a collection of fix strategies to fix them. We derived our anti-patterns and fix strategies from a novel theory of regex infinite ambiguity — a necessary condition for regexes vulnerable to ReDoS. We proved the soundness and completeness of our theory. We evaluated the effectiveness of our anti-patterns, both in an automatic experiment and when applied manually. Then, we evaluated how much our anti-patterns and fix strategies improve developers’ understanding of the outcome of detection and fixing tools. Our evaluation found that our anti-patterns were effective over a large dataset of regexes (N=209,188): 100% precision and 99% recall, improving on the state of the art’s 50% precision and 87% recall. Our anti-patterns were also more effective than the state of the art when applied manually (N=20): 100% of developers applied them effectively vs. 50% for the state of the art. Finally, our anti-patterns and fix strategies increased developers’ understanding when using automatic tools (N=9): from median "Very weakly" to median "Strongly" when detecting vulnerabilities, and from median "Very weakly" to median "Very strongly" when fixing them.
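For illustration only (this is the textbook nested-quantifier case, not one of the paper’s anti-patterns verbatim): infinite ambiguity means a run of input can be split across quantifiers in exponentially many ways, and the fix strategy is to rewrite the regex so each character can be consumed in only one way.

```python
import re

# Anti-pattern: the nested quantifier in (a+)+ lets a run of 'a's be
# split across the inner and outer loops in exponentially many ways,
# so a failing match backtracks catastrophically.
vulnerable = re.compile(r"^(a+)+$")

# Fix strategy: collapse the nesting; each 'a' now matches one way.
fixed = re.compile(r"^a+$")

# Both accept the same language on matching inputs...
for s in ("a", "aaaa"):
    assert vulnerable.fullmatch(s) and fixed.fullmatch(s)

# ...but only the fixed regex rejects adversarial input quickly;
# running `vulnerable` on long versions of this string would hang
# a backtracking engine like Python's `re`.
assert fixed.fullmatch("a" * 64 + "b") is None
```

The anti-pattern makes the vulnerability recognizable at a glance; the fix strategy preserves the matched language while removing the ambiguity.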
Links
Practical Program Modularization with Type-Based Dependence Analysis.
Authors
- Kangjie Lu, University of Minnesota
Abstract
Today's software programs are bloated and have become extremely complex. As there is typically no internal isolation among modules in a program, a vulnerability can be exploited to corrupt the memory and take control of the whole program. Program modularization is thus a promising security mechanism that splits a complex program into smaller modules, so that memory-access instructions can be constrained from corrupting irrelevant modules. A general approach to realizing program modularization is dependence analysis, which determines if an instruction is independent of specific code or data; if so, it can be modularized. Unfortunately, dependence analysis in complex programs is generally considered infeasible, due to problems in data-flow analysis, such as unknown indirect-call targets, pointer aliasing, and path explosion. As a result, we have not seen practical automated program modularization built on dependence analysis. This paper presents a breakthrough: Type-based dependence analysis for Program Modularization (TyPM). Its goal is to determine which modules in a program can never pass a type of object (including references) to a memory-access instruction; therefore, objects of this type that are created by these modules can never be valid targets of the instruction. The idea is to employ a type-based analysis to first determine which types of data flows can take place between two modules, and then transitively resolve all dependent modules of a memory-access instruction, with respect to the specific type. Such an approach avoids the data-flow analysis and can be practical. We develop two important security applications based on TyPM: refining indirect-call targets and protecting critical data structures. We extensively evaluate TyPM with various system software, including an OS kernel, a hypervisor, UEFI firmware, and a browser. Results show that on average TyPM additionally refines indirect-call targets produced by the state of the art by 31%-91%.
TyPM can also remove 99.9% of modules for memory-write instructions to prevent them from corrupting critical data structures in the Linux kernel.
Links
WarpAttack: Bypassing CFI through Compiler-Introduced Double-Fetches.
Authors
- Jianhao Xu, State Key Laboratory for Novel Software Technology, Nanjing University; EPFL
- Luca Di Bartolomeo, EPFL
- Flavio Toffalini, EPFL
- Bing Mao, State Key Laboratory for Novel Software Technology, Nanjing University
- Mathias Payer, EPFL
Abstract
Code-reuse attacks are dangerous threats that have attracted the attention of the security community for years. These attacks aim to corrupt important control-flow transfers in order to take control of a process without injecting code. Nowadays, the combination of multiple mitigations (e.g., ASLR, DEP, and CFI) has drastically reduced this attack surface, making code-reuse exploits more challenging to run. Unfortunately, security mitigations are combined with compiler optimizations, which do not distinguish between security-related and application code. Blindly deploying code optimizations over code-reuse mitigations may undermine their security guarantees. For instance, compilers may introduce double-fetch vulnerabilities that lead to concurrency issues such as Time-Of-Check to Time-Of-Use (TOCTTOU) attacks. In this work, we propose a new attack vector, called WarpAttack, that exploits compiler-introduced double-fetch optimizations to mount TOCTTOU attacks and bypass code-reuse mitigations. We study the mechanism underlying this attack and present a practical proof-of-concept exploit against the latest version of Firefox. Additionally, we propose a lightweight analysis to locate vulnerable double-fetch code (with 3% false positives) and study six popular applications, five operating systems, and four architectures (32- and 64-bit) to gauge the diffusion of this threat. Moreover, we study the implications of our attack against six CFI implementations. Finally, we investigate possible research lines for addressing this threat and propose practical solutions to be deployed in existing projects.
Links
SoK: Certified Robustness for Deep Neural Networks.
Authors
- Linyi Li, University of Illinois Urbana-Champaign
- Tao Xie, Key Laboratory of High Confidence Software Technologies, MoE (Peking University)
- Bo Li, University of Illinois Urbana-Champaign
Abstract
Great advances in deep neural networks (DNNs) have led to state-of-the-art performance on a wide range of tasks. However, recent studies have shown that DNNs are vulnerable to adversarial attacks, which raises great concern when deploying these models to safety-critical applications such as autonomous driving. Different defense approaches have been proposed against adversarial attacks, including: a) empirical defenses, which can usually be adaptively attacked again without providing robustness certification; and b) certifiably robust approaches, which consist of robustness verification providing the lower bound of robust accuracy against any attacks under certain conditions and corresponding robust training approaches. In this paper, we systematize certifiably robust approaches and related practical and theoretical implications and findings. We also provide the first comprehensive benchmark on existing robustness verification and training approaches on different datasets. In particular, we 1) provide a taxonomy for the robustness verification and training approaches, as well as summarize the methodologies for representative algorithms, 2) reveal the characteristics, strengths, limitations, and fundamental connections among these approaches, 3) discuss current research progress, theoretical barriers, main challenges, and future directions for certifiably robust approaches for DNNs, and 4) provide an open-sourced unified platform to evaluate 20+ representative certifiably robust approaches.
Links
RAB: Provable Robustness Against Backdoor Attacks.
Authors
- Maurice Weber, ETH Zurich, Switzerland
- Xiaojun Xu, University of Illinois at Urbana-Champaign, USA
- Bojan Karlaš, ETH Zurich, Switzerland
- Ce Zhang, ETH Zurich, Switzerland
- Bo Li, University of Illinois at Urbana-Champaign, USA
Abstract
Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial attacks, including evasion and backdoor (poisoning) attacks. On the defense side, there have been intensive efforts to improve both empirical and provable robustness against evasion attacks; however, provable robustness against backdoor attacks remains largely unexplored. In this paper, we focus on certifying the machine learning model robustness against general threat models, especially backdoor attacks. We first provide a unified framework via randomized smoothing techniques and show how it can be instantiated to certify the robustness against both evasion and backdoor attacks. We then propose the first robust training process, RAB, to smooth the trained model and certify its robustness against backdoor attacks. We theoretically prove the robustness bound for machine learning models trained with RAB and prove that our robustness bound is tight. In addition, we theoretically show that it is possible to train the robust smoothed models efficiently for simple models such as K-nearest neighbor classifiers, and we propose an exact smooth-training algorithm that eliminates the need to sample from a noise distribution for such models. Empirically, we conduct comprehensive experiments for different machine learning (ML) models such as DNNs, support vector machines, and K-NN models on MNIST, CIFAR-10, and ImageNette datasets and provide the first benchmark for certified robustness against backdoor attacks. In addition, we evaluate K-NN models on a spambase tabular dataset to demonstrate the advantages of the proposed exact algorithm. Both the theoretical analysis and the comprehensive evaluation on diverse ML models and datasets shed light on further robust learning strategies against general training time attacks.
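The majority-vote smoothing underlying such certificates can be illustrated with a one-dimensional toy (note: RAB smooths over the training data to certify against poisoning, whereas this hypothetical sketch only shows the noise-and-vote mechanism itself):

```python
import random

def base_clf(x: float) -> int:
    # Stand-in for a backdoored model: a narrow trigger band is
    # misclassified as 0, everything else follows the sign of x.
    if 0.40 < x < 0.45:
        return 0
    return 1 if x > 0 else 0

def smoothed_clf(x: float, sigma: float = 1.0, n: int = 2000) -> int:
    # Majority vote of the base classifier under Gaussian input noise:
    # the narrow trigger band carries too little probability mass to
    # swing the vote, so the smoothed prediction ignores it.
    rng = random.Random(0)
    votes = sum(base_clf(x + rng.gauss(0, sigma)) for _ in range(n))
    return 1 if 2 * votes > n else 0

assert base_clf(0.42) == 0       # the trigger fools the base model...
assert smoothed_clf(0.42) == 1   # ...but not the smoothed majority vote
```

Certification then amounts to bounding how much an adversary within the threat model can shift the vote, which is where the tight robustness bounds in the abstract come from.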
Links
ObjectSeeker: Certifiably Robust Object Detection against Patch Hiding Attacks via Patch-agnostic Masking.
Authors
- Chong Xiang, Princeton University
- Alexander Valtchanov, Princeton University
- Saeed Mahloujifar, Princeton University
- Prateek Mittal, Princeton University
Abstract
Object detectors, which are widely deployed in security-critical systems such as autonomous vehicles, have been found vulnerable to patch hiding attacks. An attacker can use a single physically-realizable adversarial patch to make the object detector miss the detection of victim objects and undermine the functionality of object detection applications. In this paper, we propose ObjectSeeker for certifiably robust object detection against patch hiding attacks. The key insight in ObjectSeeker is patch-agnostic masking: we aim to mask out the entire adversarial patch without knowing the shape, size, and location of the patch. This masking operation neutralizes the adversarial effect and allows any vanilla object detector to safely detect objects on the masked images. Remarkably, we can evaluate ObjectSeeker’s robustness in a certifiable manner: we develop a certification procedure to formally determine if ObjectSeeker can detect certain objects against any white-box adaptive attack within the threat model, achieving certifiable robustness. Our experiments demonstrate a significant (~10%-40% absolute and ~2-6× relative) improvement in certifiable robustness over the prior work, as well as high clean performance (~1% drop compared with undefended models).
Links
PublicCheck: Public Integrity Verification for Services of Run-time Deep Models.
Authors
- Shuo Wang, CSIRO’s Data61, Australia; Cybersecurity CRC, Australia
- Sharif Abuadbba, CSIRO’s Data61, Australia; Cybersecurity CRC, Australia
- Sidharth Agarwal, Indian Institute of Technology, Delhi, India
- Kristen Moore, CSIRO’s Data61, Australia; Cybersecurity CRC, Australia
- Ruoxi Sun, CSIRO’s Data61, Australia
- Minhui Xue, CSIRO’s Data61, Australia; Cybersecurity CRC, Australia
- Surya Nepal, CSIRO’s Data61, Australia; Cybersecurity CRC, Australia
- Seyit Camtepe, CSIRO’s Data61, Australia; Cybersecurity CRC, Australia
- Salil Kanhere, University of New South Wales, Australia
Abstract
Existing integrity verification approaches for deep models are designed for private verification (i.e., assuming the service provider is honest, with white-box access to model parameters). However, private verification approaches do not allow model users to verify the model at run-time. Instead, they must trust the service provider, who may tamper with the verification results. In contrast, a public verification approach that considers the possibility of dishonest service providers can benefit a wider range of users. In this paper, we propose PublicCheck, a practical public integrity verification solution for services of run-time deep models. PublicCheck considers dishonest service providers, and overcomes public verification challenges of being lightweight, providing anti-counterfeiting protection, and having fingerprinting samples that appear smooth. To capture and fingerprint the inherent prediction behaviors of a run-time model, PublicCheck generates smoothly transformed and augmented encysted samples that are enclosed around the model's decision boundary while ensuring that the verification queries are indistinguishable from normal queries. PublicCheck is also applicable when knowledge of the target model is limited (e.g., with no knowledge of gradients or model parameters). A thorough evaluation of PublicCheck demonstrates the strong capability for model integrity breach detection (100% detection accuracy with less than 10 black-box API queries) against various model integrity attacks and model compression attacks. PublicCheck also demonstrates the smooth appearance, feasibility, and efficiency of generating a plethora of encysted samples for fingerprinting.
Links
FedRecover: Recovering from Poisoning Attacks in Federated Learning using Historical Information.
Authors
- Xiaoyu Cao, Duke University
- Jinyuan Jia, Duke University
- Zaixi Zhang, University of Science and Technology of China
- Neil Zhenqiang Gong, Duke University
Abstract
Federated learning is vulnerable to poisoning attacks in which malicious clients poison the global model via sending malicious model updates to the server. Existing defenses focus on preventing a small number of malicious clients from poisoning the global model via robust federated learning methods and detecting malicious clients when there are a large number of them. However, it is still an open challenge how to recover the global model from poisoning attacks after the malicious clients are detected. A naive solution is to remove the detected malicious clients and train a new global model from scratch using the remaining clients. However, such a train-from-scratch recovery method incurs a large computation and communication cost, which may be intolerable for resource-constrained clients such as smartphones and IoT devices. In this work, we propose FedRecover, a method that can recover an accurate global model from poisoning attacks with a small computation and communication cost for the clients. Our key idea is that the server estimates the clients’ model updates instead of asking the clients to compute and communicate them during the recovery process. In particular, the server stores the historical information, including the global models and clients’ model updates in each round, when training the poisoned global model before the malicious clients are detected. During the recovery process, the server estimates a client’s model update in each round using its stored historical information. Moreover, we further optimize FedRecover to recover a more accurate global model using warm-up, periodic correction, abnormality fixing, and final tuning strategies, in which the server asks the clients to compute and communicate their exact model updates. Theoretically, we show that the global model recovered by FedRecover is close to or the same as that recovered by train-from-scratch under some assumptions.
Empirically, our evaluation on four datasets, three federated learning methods, as well as untargeted and targeted poisoning attacks (e.g., backdoor attacks) shows that FedRecover is both accurate and efficient.
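FedRecover's recovery loop can be sketched in miniature: the server replays training without the detected malicious clients, substituting stored historical updates for fresh client computation, and periodically asks for exact updates to correct drift. The scalar "model", the client gradient functions, and the naive reuse-the-stored-update estimator below are all illustrative assumptions, not the paper's actual estimator.

```python
def train_round(model, clients, lr=0.1):
    """One round of federated averaging on a toy scalar model."""
    updates = [c(model) for c in clients]          # exact client updates
    return model - lr * sum(updates) / len(updates), updates

def recover(history, benign_ids, clients, lr=0.1, correction_every=3):
    """Replay training without the malicious clients.

    history[r] holds the stored per-client updates of round r; every
    `correction_every` rounds the server asks clients for exact updates
    instead of reusing stored estimates (periodic correction)."""
    model = 0.0
    for r, stored in enumerate(history):
        if r % correction_every == 0:
            updates = [clients[i](model) for i in benign_ids]   # exact
        else:
            updates = [stored[i] for i in benign_ids]           # estimated
        model = model - lr * sum(updates) / len(updates)
    return model
```

With clients modeled as gradients of simple quadratic losses, the recovered model converges toward what the benign clients alone would have trained, without querying them every round.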
Links
On The Empirical Effectiveness of Unrealistic Adversarial Hardening Against Realistic Adversarial Attacks.
Authors
- Salijona Dyrmishi, University of Luxembourg
- Salah Ghamizi, University of Luxembourg
- Thibault Simonetto, University of Luxembourg
- Yves Le Traon, University of Luxembourg
- Maxime Cordy, University of Luxembourg
Abstract
While the literature on security attacks and defenses of Machine Learning (ML) systems mostly focuses on unrealistic adversarial examples, recent research has raised concern about the under-explored field of realistic adversarial attacks and their implications on the robustness of real-world systems. Our paper paves the way for a better understanding of adversarial robustness against realistic attacks and makes two major contributions. First, we conduct a study on three real-world use cases (text classification, botnet detection, malware detection) and seven datasets in order to evaluate whether unrealistic adversarial examples can be used to protect models against realistic examples. Our results reveal discrepancies across the use cases, where unrealistic examples can either be as effective as the realistic ones or may offer only limited improvement. Second, to explain these results, we analyze the latent representation of the adversarial examples generated with realistic and unrealistic attacks. We shed light on the patterns that discriminate which unrealistic examples can be used for effective hardening. We release our code, datasets and models to support future research in exploring how to reduce the gap between unrealistic and realistic adversarial attacks.
Links
Rethinking Searchable Symmetric Encryption.
Authors
- Zichen Gui, ETH Zürich
- Kenneth G. Paterson, ETH Zürich
- Sikhar Patranabis, IBM Research India
Abstract
Symmetric Searchable Encryption (SSE) schemes enable keyword searches over encrypted documents. To obtain efficiency, SSE schemes incur a certain amount of leakage. The vast majority of the literature on SSE considers only leakage from one component of the overall SSE system, the encrypted search index. This component is used to identify which documents to return in response to a keyword query. The actual fetching of the documents is left to another component, usually left unspecified in the literature, but generally envisioned as a simple storage system matching document identifiers to encrypted documents. This raises the question: do SSE schemes actually protect the security of data and queries when considered from a system-wide viewpoint? We answer this question in the negative. We do this by introducing a new inference attack that achieves practically efficient, highly scalable, accurate query reconstruction against end-to-end SSE systems. In particular, our attack works even when the SSE schemes are built in the natural way using the state-of-the-art techniques (namely, volume-hiding encrypted multi-maps) designed to suppress leakage and protect against previous generations of attack. A second question is whether the state-of-the-art leakage suppression techniques can instead be applied on a system-wide basis, to protect both the encrypted search index and the encrypted document store, to produce efficient SSE systems. We also answer this question in the negative. To do so, we implement SSE systems using those state-of-the-art leakage suppression methods, and evaluate their performance. We show that storage overheads range from 100× to 800× while bandwidth overheads range from 20× to 100×, as compared to a naïve baseline system. Our results motivate the design of new SSE systems that are designed with system-wide security in mind from the outset. In this regard, we show that one such SSE system due to Chen et al. (IEEE INFOCOM 2018), with provable security guarantees based on differential privacy, is also vulnerable to our new attack. In totality, our results force a re-evaluation of how to build end-to-end SSE systems that offer both security and efficiency.
Links
Private Collaborative Data Cleaning via Non-Equi PSI.
Authors
- Erik-Oliver Blass, Airbus, Munich, Germany
- Florian Kerschbaum, University of Waterloo, Waterloo, Canada
Abstract
We introduce and investigate the privacy-preserving version of collaborative data cleaning. With collaborative data cleaning, two parties want to reconcile their data sets to filter out badly classified, misclassified data items. In the privacy-preserving (private) version of data cleaning, the additional security goal is that parties should only learn their misclassified data items, but nothing else about the other party’s data set. The problem of private data cleaning is essentially a variation of private set intersection (PSI), and one could employ recent circuit-PSI techniques to compute misclassifications with privacy. However, we design, analyze, and implement three new protocols tailored to the specifics of private data cleaning that outperform a circuit-PSI-based approach. With the first protocol, we exploit the idea that a small additional leakage (the differentially private size of the intersection of data items) allows for a reduction in complexity over circuit-PSI. The other two protocols convert the problem of finding a mismatch in data classifications into finding a match, and then follow the standard technique of using oblivious pseudorandom functions (OPRF) for computing PSI. Depending on the number of data classes, this leads to a concrete runtime improvement over circuit-PSI.
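The mismatch-to-match conversion used by the second and third protocols can be illustrated with keyed hashing standing in for the OPRF (a real OPRF is evaluated obliviously, so the key holder never sees the other party's inputs; the shared key and the encoding format below are assumptions for illustration only):

```python
import hashlib
import hmac

def prf(key, data):
    # Stand-in for an OPRF evaluation: in a real protocol this is
    # computed obliviously, without revealing `data` to the key holder.
    return hmac.new(key, data.encode(), hashlib.sha256).digest()

def mismatch_psi(key, a_items, b_items, classes):
    """Find items both parties hold but classify differently.

    a_items/b_items map item -> class label. Party A encodes each item
    paired with every class OTHER than its own; B encodes (item, its
    class). A PSI match then witnesses a classification mismatch."""
    a_encoded = {prf(key, f"{x}|{c}"): x
                 for x, cls in a_items.items()
                 for c in classes if c != cls}
    b_encoded = {prf(key, f"{x}|{cls}") for x, cls in b_items.items()}
    return {x for tag, x in a_encoded.items() if tag in b_encoded}
```

Note how the encoding multiplies A's set by the number of classes, which is why the concrete runtime improvement over circuit-PSI depends on the class count.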
Links
SPHINCS+C: Compressing SPHINCS+ With (Almost) No Cost.
Authors
- Andreas Hülsing, TU Eindhoven
- Mikhail Kudinov, TU Eindhoven
- Eyal Ronen, Tel Aviv University
- Eylon Yogev, Bar-Ilan University
Abstract
SPHINCS+ [CCS ’19] is one of the selected post-quantum digital signature schemes of NIST’s post-quantum standardization process. The scheme is a hash-based signature and is considered one of the most secure and robust proposals. The proposal includes a fast (but larger) variant and a small (but slower) variant for each security level. The main problem that might hinder its adoption is its large signature size. Although SPHINCS+ supports a trade-off between signature size and the computational cost of signing, further reducing the signature size (below the small variants) results in a prohibitively high computational cost for the signer. This paper presents several novel methods for further compressing the signature size while requiring negligible added computational costs for the signer and further reducing verification time. Moreover, our approach enables a much more efficient trade-off curve between signature size and the computational costs of the signer. In many parameter settings, we achieve small signatures and faster running times simultaneously. For example, for 128-bit (classical) security, the small signature variant of SPHINCS+ is 7856 bytes long, while our variant is only 6304 bytes long: a compression of approximately 20% while still reducing the signer’s running time. However, other trade-offs that focus, e.g., on verification speed, are possible. The main insight behind our scheme is that there are predefined specific subsets of messages for which the WOTS+ and FORS signatures (that SPHINCS+ uses) can be compressed, and generation can be made faster while maintaining the same security guarantees. Although most messages will not come from these subsets, we can search for suitable hashed values to sign. We sign a hash of the message concatenated with a counter that was chosen such that the hashed value is in the subset. The resulting signature is both smaller and faster to sign and verify. Our schemes are simple to describe and implement.
We provide an implementation, a theoretical analysis of speed and security, as well as benchmark results.
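The counter-grinding idea can be sketched generically: hash the message together with an incrementing counter until the digest falls into the special subset. The subset predicate used below (a leading zero byte) is a toy stand-in; SPHINCS+C's actual subsets are the ones for which the WOTS+/FORS signatures compress.

```python
import hashlib

def grind(message: bytes, in_subset, max_tries=1 << 20):
    """Search for a counter such that H(message || counter) lands in a
    compressible subset. The signer then signs the digest and includes
    the counter; the verifier recomputes the hash and checks membership."""
    for ctr in range(max_tries):
        digest = hashlib.sha256(message + ctr.to_bytes(8, "big")).digest()
        if in_subset(digest):
            return ctr, digest
    raise RuntimeError("no suitable counter found")
```

With a subset of density 1/256, the expected cost is about 256 extra hashes per signature, which is why the added signing cost stays negligible relative to the hash-heavy signing itself.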
Links
Threshold Signatures in the Multiverse.
Authors
- Leemon Baird, Swirlds Labs
- Sanjam Garg, UC Berkeley; NTT Research
- Abhishek Jain, Johns Hopkins University
- Pratyay Mukherjee, Supra Oracles
- Rohit Sinha, Meta
- Mingyuan Wang, UC Berkeley
- Yinuo Zhang, UC Berkeley
Abstract
We introduce a new notion of multiverse threshold signatures (MTS). In an MTS scheme, multiple universes – each defined by a set of (possibly overlapping) signers, their weights, and a specific security threshold – can co-exist. A universe can be (adaptively) created via a non-interactive asynchronous setup. Crucially, each party in the multiverse holds constant-sized keys and releases compact signatures with size and computation time both independent of the number of universes. Given sufficient partial signatures over a message from the members of a specific universe, an aggregator can produce a short aggregate signature relative to that universe. We construct an MTS scheme building on BLS signatures. Our scheme is practical, and can be used to reduce bandwidth complexity and computational costs in decentralized oracle networks. As an example data point, consider a multiverse containing 2000 nodes and 100 universes (parameters inspired by Chainlink’s use in the wild), each of which contains arbitrarily large subsets of nodes and arbitrary thresholds. Each node computes and outputs 1 group element as its partial signature; the aggregator performs under 0.7 seconds of work for each aggregate signature, and the final signature of size 192 bytes takes 6.4 ms (or 198K EVM gas units) to verify. For this setting, prior approaches, when used to construct MTS, yield schemes that have one of the following drawbacks: (i) partial signatures that are 48× larger, (ii) have aggregation times 311× worse, or (iii) have signature size 39× and verification gas costs 3.38× larger. We also provide an open-source implementation and a detailed evaluation.
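The arithmetic underlying threshold aggregation in BLS-style schemes is Lagrange interpolation over Shamir shares, applied "in the exponent". A minimal field-arithmetic sketch, omitting the pairing-based signing itself and using an illustrative modulus:

```python
P = 2**61 - 1  # a Mersenne prime, illustrative field modulus

def share(secret, coeffs, n):
    """Shamir shares: evaluate secret + c1*x + c2*x^2 + ... at x = 1..n.
    Any (len(coeffs) + 1) shares suffice to reconstruct the secret."""
    poly = [secret] + list(coeffs)
    return [(x, sum(c * pow(x, i, P) for i, c in enumerate(poly)) % P)
            for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the shared secret."""
    total = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P  # den^-1 mod P
    return total
```

In the signature setting, each signer's partial signature plays the role of a share, and the aggregator applies the same Lagrange coefficients to combine any threshold-sized subset into one signature.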
Links
FIDO2, CTAP 2.1, and WebAuthn 2: Provable Security and Post-Quantum Instantiation.
Authors
- Nina Bindel, SandboxAQ, Palo Alto, USA
- Cas Cremers, CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
- Mang Zhao, CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
Abstract
The FIDO2 protocol is a globally used standard for passwordless authentication, building on an alliance between major players in the online authentication space. While already widely deployed, the standard is still under active development. Since version 2.1 of its CTAP sub-protocol, FIDO2 can potentially be instantiated with post-quantum secure primitives. We provide the first formal security analysis of FIDO2 with the CTAP 2.1 and WebAuthn 2 sub-protocols. Our security models build on work by Barbosa et al. for their analysis of FIDO2 with CTAP 2.0 and WebAuthn 1, which we extend in several ways. First, we provide a more fine-grained security model that allows us to prove more relevant protocol properties, such as guarantees about token binding agreement, the None attestation mode, and user verification. Second, we can prove post-quantum security for FIDO2 under certain conditions and minor protocol extensions. Finally, we show that for some threat models, the downgrade resilience of FIDO2 can be improved, and show how to achieve this with a simple modification.
Links
Token meets Wallet: Formalizing Privacy and Revocation for FIDO2.
Authors
- Lucjan Hanzlik, CISPA Helmholtz Center for Information Security
- Julian Loss, CISPA Helmholtz Center for Information Security
- Benedikt Wagner, CISPA Helmholtz Center for Information Security, Saarland University
Abstract
The FIDO2 standard is a widely-used class of challenge-response type protocols that allows users to authenticate to an online service using a hardware token. Barbosa et al. (CRYPTO ‘21) provided the first formal security model and analysis for the FIDO2 standard. However, their model has two shortcomings: (1) It does not include privacy, one of the key features claimed by FIDO2. (2) It only covers tokens that store all secret keys locally. In contrast, due to limited memory, most existing FIDO2 tokens either derive all secret keys from a common seed or store keys on the server (the latter approach is also known as key wrapping). In this paper, we revisit the security of the WebAuthn component of FIDO2 as implemented in practice. Our contributions are as follows. (1) We adapt the model of Barbosa et al. so as to capture authentication tokens using key derivation or key wrapping. (2) We provide the first formal definition of privacy for the WebAuthn component of FIDO2. We then prove the privacy of this component in common FIDO2 token implementations if the underlying building blocks are chosen appropriately. (3) We address the unsolved problem of global key revocation in FIDO2. To this end, we introduce and analyze a simple revocation procedure that builds on the popular BIP32 standard used in cryptocurrency wallets and can efficiently be implemented with existing FIDO2 servers.
Links
SoK: Taxonomy of Attacks on Open-Source Software Supply Chains.
Authors
- Piergiorgio Ladisa, SAP Security Research; Université de Rennes 1, Inria, IRISA
- Henrik Plate, SAP Security Research
- Matias Martinez, Université Polytechnique Hauts-de-France
- Olivier Barais, Université de Rennes 1, Inria, IRISA
Abstract
The widespread dependency on open-source software makes it a fruitful target for malicious actors, as demonstrated by recurring attacks. The complexity of today’s open-source supply chains results in a significant attack surface, giving attackers numerous opportunities to reach the goal of injecting malicious code into open-source artifacts that is then downloaded and executed by victims. This work proposes a general taxonomy for attacks on open-source supply chains, independent of specific programming languages or ecosystems, and covering all supply chain stages from code contributions to package distribution. Taking the form of an attack tree, it covers 107 unique vectors, linked to 94 real-world incidents, and mapped to 33 mitigating safeguards. User surveys conducted with 17 domain experts and 134 software developers positively validated the correctness, comprehensiveness and comprehensibility of the taxonomy, as well as its suitability for various use-cases. Survey participants also assessed the utility and costs of the identified safeguards, and whether they are used.
Links
It's like flossing your teeth: On the Importance and Challenges of Reproducible Builds for Software Supply Chain Security.
Authors
- Marcel Fourné, Max Planck Institute for Security and Privacy, Bochum, Germany
- Dominik Wermke, CISPA Helmholtz Center for Information Security, Germany
- William Enck, North Carolina State University, Raleigh, North Carolina, USA
- Sascha Fahl, CISPA Helmholtz Center for Information Security, Germany
- Yasemin Acar, Paderborn University, Germany; George Washington University, USA
Abstract
The 2020 SolarWinds attack was a tipping point that caused a heightened awareness about the security of the software supply chain and in particular the large amount of trust placed in build systems. Reproducible Builds (R-Bs) provide a strong foundation to build defenses for arbitrary attacks against build systems by ensuring that given the same source code, build environment, and build instructions, bitwise-identical artifacts are created. Unfortunately, much of the software industry believes R-Bs are too far out of reach for most projects. The goal of this paper is to help identify a path for R-Bs to become a commonplace property. To this end, we conducted a series of 24 semi-structured expert interviews with participants from the Reproducible-Builds.org project, finding that self-effective work by highly motivated developers and collaborative communication with upstream projects are key contributors to R-Bs. We identified a range of motivations that can encourage open source developers to strive for R-Bs, including indicators of quality, security benefits, and more efficient caching of artifacts. We also identify experiences that help and hinder adoption, which often revolve around communication with upstream projects. We conclude with recommendations on how to better integrate R-Bs with the efforts of the open source and free software community.
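The R-B property itself is easy to check mechanically: build the same source twice in independent environments and compare every artifact bit for bit. A minimal sketch of such a comparison (function names are illustrative):

```python
import hashlib
from pathlib import Path

def digest_tree(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
            for p in sorted(root.rglob("*")) if p.is_file()}

def is_reproducible(build_a, build_b):
    """Two independent builds are reproducible iff they produced the same
    file set with bitwise-identical contents."""
    return digest_tree(build_a) == digest_tree(build_b)
```

In practice the hard part is not this check but eliminating the sources of nondeterminism (timestamps, file ordering, build paths) that make it fail.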
Links
"Always Contribute Back": A Qualitative Study on Security Challenges of the Open Source Supply Chain.
Authors
- Dominik Wermke, CISPA Helmholtz Center for Information Security, Germany
- Jan H. Klemmer, Leibniz University Hannover, Germany
- Noah Wöhler, CISPA Helmholtz Center for Information Security, Germany
- Juliane Schmüser, CISPA Helmholtz Center for Information Security, Germany
- Harshini Sri Ramulu, Paderborn University, Germany
- Yasemin Acar, Paderborn University, Germany; George Washington University, United States
- Sascha Fahl, CISPA Helmholtz Center for Information Security, Germany; Leibniz University Hannover, Germany
Abstract
Open source components are ubiquitous in companies’ setups, processes, and software. Utilizing these external components as building blocks enables companies to leverage the benefits of open source software, allowing them to focus their efforts on features and faster delivery instead of writing their own components. But by introducing these components into their software stack, companies inherit unique security challenges and attack surfaces: including code from potentially unvetted contributors and obligations to assess and mitigate the impact of vulnerabilities in external components. In 25 in-depth, semi-structured interviews with software developers, architects, and engineers from industry projects, we investigate their projects’ processes, decisions, and considerations in the context of external open source code. We find that open source components play an important role in many of our participants’ projects, that most projects have some form of company policy or at least best practice for including external code, and that many developers wish for more developer-hours, dedicated teams, or tools to better audit included components. Based on our findings, we discuss implications for company stakeholders and the open source software ecosystem. Overall, we appeal to companies to not treat the open source ecosystem as a free (software) supply chain and instead to contribute towards the health and security of the overall software ecosystem they benefit from and are part of.
Links
Continuous Intrusion: Characterizing the Security of Continuous Integration Services.
Authors
- Yacong Gu, QI-ANXIN Technology Research Institute
- Lingyun Ying, QI-ANXIN Technology Research Institute
- Huajun Chai, QI-ANXIN Technology Research Institute
- Chu Qiao, University of Delaware
- Haixin Duan, Tsinghua University; Tsinghua University-QI-ANXIN Group JCNS
- Xing Gao, University of Delaware
Abstract
Continuous Integration (CI) is a widely-adopted software development practice for automated code integration. A typical CI workflow involves multiple independent stakeholders, including code hosting platforms (CHPs), CI platforms (CPs), and third party services. While CI can significantly improve development efficiency, unfortunately, it also exposes new attack surfaces. As the code executed by a CI task may come from a less-trusted user, improperly configured CI with weak isolation mechanisms might enable attackers to inject malicious code into victim software by triggering a CI task. Also, one insecure stakeholder can potentially affect the whole process. In this paper, we systematically study potential security threats in CI workflows with multiple stakeholders and major CP components considered. We design and develop an analysis tool, CInspector, to investigate potential vulnerabilities in seven popular CPs, when integrated with three mainstream CHPs. We find that all CPs have the risk of token leakage caused by improper resource sharing and isolation, and many of them utilize over-privileged tokens with improper validity periods. We further reveal four novel attack vectors that allow attackers to escalate their privileges and stealthily inject malicious code by executing a piece of code in a CI task. To understand the potential impact, we conduct a large-scale measurement on the three mainstream CHPs, scrutinizing over 1.69 million repositories. Our quantitative analysis demonstrates that some very popular repositories and large organizations are affected by these attacks. We have duly reported the identified vulnerabilities to CPs and received positive responses.
Links
Investigating Package Related Security Threats in Software Registries.
Authors
- Yacong Gu, QI-ANXIN Technology Research Institute
- Lingyun Ying, QI-ANXIN Technology Research Institute
- Yingyuan Pu, QI-ANXIN Technology Research Institute; Ocean University of China
- Xiao Hu, QI-ANXIN Technology Research Institute
- Huajun Chai, QI-ANXIN Technology Research Institute
- Ruimin Wang, QI-ANXIN Technology Research Institute; Southeast University
- Xing Gao, University of Delaware
- Haixin Duan, Tsinghua University; Tsinghua University-QI-ANXIN Group JCNS
Abstract
Package registries host reusable code assets, allowing developers to share and reuse packages easily, thus accelerating the software development process. Current software registry ecosystems involve multiple independent stakeholders for package management. Unfortunately, abnormal behavior and information inconsistency inevitably exist, enabling adversaries to covertly conduct malicious activities with minimal effort. In this paper, we investigate potential security vulnerabilities in six popular software registry ecosystems. Through a systematic analysis of the official registries, corresponding registry mirrors and registry clients, we identify twelve potential attack vectors, with six of them disclosed for the first time, that can be exploited to distribute malicious code stealthily. Based on these security issues, we build an analysis framework, RScouter, to continuously monitor and uncover vulnerabilities in registry ecosystems. We then utilize RScouter to conduct a measurement study spanning one year over six registries and seventeen popular mirrors, scrutinizing over 4 million packages across 53 million package versions. Our quantitative analysis demonstrates that multiple threats exist in every ecosystem, and some have been exploited by attackers. We have duly reported the identified vulnerabilities to related stakeholders and received positive responses.
Links
ShadowNet: A Secure and Efficient On-device Model Inference System for Convolutional Neural Networks.
Authors
- Zhichuang Sun, Google
- Ruimin Sun, Florida International University
- Changming Liu, Northeastern University
- Amrita Roy Chowdhury, University of California, San Diego
- Long Lu, Northeastern University
- Somesh Jha, University of Wisconsin-Madison
Abstract
With the increased usage of AI accelerators on mobile and edge devices, on-device machine learning (ML) is gaining popularity. Thousands of proprietary ML models are being deployed today on billions of untrusted devices. This raises serious security concerns about model privacy. However, protecting model privacy without losing access to the untrusted AI accelerators is a challenging problem. In this paper, we present a novel on-device model inference system, ShadowNet. ShadowNet protects the model privacy with Trusted Execution Environment (TEE) while securely outsourcing the heavy linear layers of the model to the untrusted hardware accelerators. ShadowNet achieves this by transforming the weights of the linear layers before outsourcing them and restoring the results inside the TEE. The non-linear layers are also kept secure inside the TEE. ShadowNet’s design ensures efficient transformation of the weights and the subsequent restoration of the results. We build a ShadowNet prototype based on TensorFlow Lite and evaluate it on five popular CNNs, namely, MobileNet, ResNet-44, MiniVGG, ResNet-404, and YOLOv4-tiny. Our evaluation shows that ShadowNet achieves strong security guarantees with reasonable performance, offering a practical solution for secure on-device model inference.
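The outsource-then-restore idea can be illustrated for a single linear layer y = Wx: hand the untrusted accelerator a permuted and scaled copy of W, then undo the transformation on the result inside the TEE. The specific transformation below (output-row permutation plus per-row scaling) is an illustrative assumption, not ShadowNet's actual scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

def transform(W):
    """Obfuscate a linear layer's weights before outsourcing: permute
    the output rows and scale each by a random nonzero factor. The
    (perm, scale) pair stays secret inside the TEE."""
    perm = rng.permutation(W.shape[0])
    scale = rng.uniform(0.5, 2.0, size=W.shape[0])
    return W[perm] * scale[:, None], (perm, scale)

def restore(y_outsourced, secret):
    """Undo the transformation on the layer's output inside the TEE:
    unscale each entry, then invert the row permutation."""
    perm, scale = secret
    y = np.empty_like(y_outsourced)
    y[perm] = y_outsourced / scale
    return y
```

The accelerator computes with the transformed weights only, so the plaintext W never leaves the TEE, while the linear layer's result is recovered exactly.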
Links
Deepfake Text Detection: Limitations and Opportunities.
Authors
- Jiameng Pu, Virginia Tech
- Zain Sarwar, University of Chicago
- Sifat Muhammad Abdullah, Virginia Tech
- Abdullah Rehman, Virginia Tech
- Yoonjin Kim, Virginia Tech
- Parantapa Bhattacharya, University of Virginia
- Mobin Javed, LUMS Pakistan
- Bimal Viswanath, Virginia Tech
Abstract
Recent advances in generative models for language have enabled the creation of convincing synthetic text or deepfake text. Prior work has demonstrated the potential for misuse of deepfake text to mislead content consumers. Therefore, deepfake text detection, the task of discriminating between human and machine-generated text, is becoming increasingly critical. Several defenses have been proposed for deepfake text detection. However, we lack a thorough understanding of their real-world applicability. In this paper, we collect deepfake text from 4 online services powered by Transformer-based tools to evaluate the generalization ability of the defenses on content in the wild. We develop several low-cost adversarial attacks, and investigate the robustness of existing defenses against an adaptive attacker. We find that many defenses show significant degradation in performance under our evaluation scenarios compared to their original claimed performance. Our evaluation shows that tapping into the semantic information in the text content is a promising approach for improving the robustness and generalization performance of deepfake text detection schemes.
Links
StyleFool: Fooling Video Classification Systems via Style Transfer.
Authors
- Yuxin Cao, Shenzhen International Graduate School, Tsinghua University, China
- Xi Xiao, Shenzhen International Graduate School, Tsinghua University, China
- Ruoxi Sun, CSIRO’s Data61, Australia
- Derui Wang, CSIRO’s Data61, Australia
- Minhui Xue, CSIRO’s Data61, Australia
- Sheng Wen, Swinburne University of Technology, Australia
Abstract
Video classification systems are vulnerable to adversarial attacks, which can create severe security problems in video verification. Current black-box attacks need a large number of queries to succeed, resulting in high computational overhead in the process of attack. On the other hand, attacks with restricted perturbations are ineffective against defenses such as denoising or adversarial training. In this paper, we focus on unrestricted perturbations and propose StyleFool, a black-box video adversarial attack via style transfer to fool the video classification system. StyleFool first utilizes color theme proximity to select the best style image, which helps avoid unnatural details in the stylized videos. Meanwhile, the target class confidence is additionally considered in targeted attacks to influence the output distribution of the classifier by moving the stylized video closer to or even across the decision boundary. A gradient-free method is then employed to further optimize the adversarial perturbations. We carry out extensive experiments to evaluate StyleFool on two standard datasets, UCF-101 and HMDB-51. The experimental results demonstrate that StyleFool outperforms the state-of-the-art adversarial attacks in terms of both the number of queries and the robustness against existing defenses. Moreover, 50% of the stylized videos in untargeted attacks do not need any query since they can already fool the video classification model. Furthermore, we evaluate the indistinguishability through a user study to show that the adversarial samples of StyleFool look imperceptible to human eyes, despite unrestricted perturbations.
Links
GeeSolver: A Generic, Efficient, and Effortless Solver with Self-Supervised Learning for Breaking Text Captchas.
Authors
- Ruijie Zhao, Shanghai Jiao Tong University
- Xianwen Deng, Shanghai Jiao Tong University
- Yanhao Wang, QI-ANXIN
- Zhicong Yan, Shanghai Jiao Tong University
- Zhengguang Han, QI-AN Pangu (Shanghai) InfoTech Co., Ltd.
- Libo Chen, Shanghai Jiao Tong University
- Zhi Xue, Shanghai Jiao Tong University
- Yijun Wang, Shanghai Jiao Tong University
Abstract
Although text-based captcha, which is used to differentiate between human users and bots, has faced many attack methods, it remains a widely used security mechanism and is employed by some websites. Some deep learning-based text captcha solvers have shown excellent results, but the labor-intensive and time-consuming labeling process severely limits their viability. Previous works attempted to create easy-to-use solvers using a limited collection of labeled data. However, they are hampered by inefficient preprocessing procedures and an inability to recognize captchas with complicated security features. In this paper, we propose GeeSolver, a generic, efficient, and effortless solver for breaking text-based captchas based on self-supervised learning. Our insight is that numerous difficult-to-attack captcha schemes that "damage" the standard font of characters are similar to image masks. We can thus leverage masked autoencoders (MAE) to improve the captcha solver to learn the latent representation from the "unmasked" part of the captcha images. Specifically, our model consists of a ViT encoder as latent representation extractor and a well-designed decoder for captcha recognition. We apply the MAE paradigm to train our encoder, which enables the encoder to extract latent representation from local information (i.e., the part without masking) that can infer the corresponding character. Further, we freeze the parameters of the encoder and leverage a few labeled captchas and many unlabeled captchas to train our captcha decoder with semi-supervised learning. Our experiments with real-world captcha schemes demonstrate that GeeSolver outperforms the state-of-the-art methods by a large margin using a few labeled captchas. We also show that GeeSolver is highly efficient as it can solve a captcha within 25 ms using a desktop CPU and 9 ms using a desktop GPU.
Besides, thanks to latent representation extraction, we successfully break the hard-to-attack captcha schemes, proving the generality of our solver. We hope that our work will help security experts to revisit the design and availability of text-based captchas. The code is available at https://github.com/NSSL-SJTU/GeeSolver.
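The MAE pretraining step that this line of work builds on masks most of an image and trains the encoder to explain it from what remains. The patch-masking itself can be sketched as follows (patch size, mask ratio, and function name are illustrative, and GeeSolver operates on real captcha images rather than this toy array):

```python
import numpy as np

def mask_patches(img, patch=4, mask_ratio=0.75, seed=0):
    """Split a grayscale image into non-overlapping patches and zero out
    a random fraction of them, as in MAE pretraining: the encoder sees
    only the kept patches and must learn representations that account
    for the masked-out content."""
    h, w = img.shape
    ph, pw = h // patch, w // patch
    rng = np.random.default_rng(seed)
    n_masked = int(ph * pw * mask_ratio)
    masked_ids = rng.choice(ph * pw, size=n_masked, replace=False)
    out = img.copy()
    for idx in masked_ids:
        r, c = divmod(idx, pw)
        out[r*patch:(r+1)*patch, c*patch:(c+1)*patch] = 0
    return out, set(masked_ids.tolist())
```

The analogy in the abstract is that a captcha's distortions act like such a mask, so an encoder pretrained this way can infer characters from the undistorted local regions.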
Links
TrojanModel: A Practical Trojan Attack against Automatic Speech Recognition Systems.
Authors
- Wei Zong, Institute of Cybersecurity and Cryptology (iC²), University of Wollongong, Australia
- Yang-Wai Chow, Institute of Cybersecurity and Cryptology (iC²), University of Wollongong, Australia
- Willy Susilo, Institute of Cybersecurity and Cryptology (iC²), University of Wollongong, Australia
- Kien Do, Applied Artificial Intelligence Institute (A²I²), Deakin University, Australia
- Svetha Venkatesh, Applied Artificial Intelligence Institute (A²I²), Deakin University, Australia
Abstract
While deep learning techniques have achieved great success in modern digital products, researchers have shown that deep learning models are susceptible to Trojan attacks. In a Trojan attack, an adversary stealthily modifies a deep learning model such that the model will output a predefined label whenever a trigger is present in the input. In this paper, we present TrojanModel, a practical Trojan attack against Automatic Speech Recognition (ASR) systems. ASR systems aim to transcribe voice input into text, which is easier for subsequent downstream applications to process. We consider a practical attack scenario in which an adversary inserts a Trojan into the acoustic model of a target ASR system. Unlike existing work that uses noise-like triggers that will easily arouse user suspicion, the work in this paper focuses on the use of unsuspicious sounds as a trigger, e.g., a piece of music playing in the background. In addition, TrojanModel does not require the retraining of a target model. Experimental results show that TrojanModel can achieve high attack success rates with negligible effect on the target model’s performance. We also demonstrate that the attack is effective in an over-the-air attack scenario, where audio is played over a physical speaker and received by a microphone.
Links
REGA: Scalable Rowhammer Mitigation with Refresh-Generating Activations.
Authors
- Michele Marazzi, Computer Security Group, ETH Zürich
- Flavien Solt, Computer Security Group, ETH Zürich
- Patrick Jattke, Computer Security Group, ETH Zürich
- Kubo Takashi, Zentel Japan
- Kaveh Razavi, Computer Security Group, ETH Zürich
Abstract
Mitigating Rowhammer requires performing additional refresh operations to recharge DRAM rows before bits start to flip. These refreshes are scarce and can only happen periodically, impeding the design of effective mitigations as newer DRAM substrates become more vulnerable to Rowhammer and more "victim" rows are affected by a single "aggressor" row. We introduce REGA, the first in-DRAM mechanism that can generate extra refresh operations each time a row is activated. Since row activations are the sole cause of Rowhammer, these extra refreshes become available as soon as the DRAM device faces Rowhammer-inducing activations. Refresh operations are traditionally performed using sense amplifiers. Sense amplifiers, however, are also in charge of handling the read and write operations. Consequently, the sense amplifiers cannot be used for refreshing rows during data transfers. To enable refresh operations in parallel to data transfers, REGA uses additional low-overhead buffering sense amplifiers for the sole purpose of data transfers. REGA can then use the original sense amplifiers for parallel refresh operations of other rows during row activations. The refreshes generated by REGA enable the design of simple and scalable in-DRAM mitigations with strong security guarantees. As an example, we build REGAM, the first deterministic in-DRAM mitigation that scales to small Rowhammer thresholds while remaining agnostic to the number of victims per aggressor. REGAM has a constant 2.1% area overhead, and can protect DDR5 devices with Rowhammer thresholds as small as 261, 517, and 1029 with 23.9%, 11.5%, and 4.7% more power, and 3.7%, 0.8%, and 0% performance overhead.
Links
CSI:Rowhammer - Cryptographic Security and Integrity against Rowhammer.
Authors
- Jonas Juffinger, Lamarr Security Research; Graz University of Technology
- Lukas Lamster, Graz University of Technology
- Andreas Kogler, Graz University of Technology
- Maria Eichlseder, Graz University of Technology
- Moritz Lipp, Amazon Web Services
- Daniel Gruss, Lamarr Security Research; Graz University of Technology
Abstract
In this paper, we present CSI:Rowhammer, a principled hardware-software co-design Rowhammer mitigation with cryptographic security and integrity guarantees, that does not focus on any specific properties of Rowhammer. We design a new memory error detection mechanism based on a low-latency cryptographic MAC and an exception mechanism initiating a software-level correction routine. The exception handler uses a novel instruction-set extension for the error correction and resumes execution afterward. In contrast to regular ECC-DRAM that remains exploitable if more than 2 bits are flipped, CSI:Rowhammer maintains the security level of the cryptographic MAC. We evaluate CSI:Rowhammer in a gem5 proof-of-concept implementation. Under normal conditions, we see latency overheads below 0.75% and no memory overhead compared to off-the-shelf ECC-DRAM. While the average latency to correct a single bitflip is below 20 ns (compared to a range from a few nanoseconds to several milliseconds for state-of-the-art ECC memory), CSI:Rowhammer can detect any number of bitflips with overwhelming probability and correct at least 8 bitflips in practical time constraints.
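The detection principle (a keyed MAC over each memory granule, checked on access) can be sketched in a few lines. HMAC-SHA-256 here is only a stand-in for the paper's low-latency cryptographic MAC, and the 64-byte granule is an assumption for illustration:

```python
import hmac, hashlib, os

KEY = os.urandom(16)  # device-secret MAC key

def tag(line: bytes) -> bytes:
    # Toy stand-in for the low-latency MAC: tag a 64-byte "cache line".
    return hmac.new(KEY, line, hashlib.sha256).digest()[:8]

line = bytearray(os.urandom(64))
t = tag(bytes(line))          # tag stored alongside the data

line[10] ^= 0x04              # simulate a Rowhammer bitflip in DRAM

# On access, a tag mismatch raises the correction exception.
detected = not hmac.compare_digest(tag(bytes(line)), t)
print(detected)  # True: any number of flips is detected w.h.p.
```

Unlike ECC codes, the mismatch carries no error position, which is why CSI:Rowhammer pairs detection with a software correction routine that searches for and reverts the flipped bits.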
Links
Jolt: Recovering TLS Signing Keys via Rowhammer Faults.
Authors
- Koksal Mus, Worcester Polytechnic Institute, Worcester, MA, USA
- Yarkın Doröz, Worcester Polytechnic Institute, Worcester, MA, USA
- M. Caner Tol, Worcester Polytechnic Institute, Worcester, MA, USA
- Kristi Rahman, Worcester Polytechnic Institute, Worcester, MA, USA
- Berk Sunar, Worcester Polytechnic Institute, Worcester, MA, USA
Abstract
Digital signature schemes such as DSA, ECDSA, and RSA are widely deployed to protect the integrity of security protocols such as TLS, SSH, and IPSec. In TLS, for instance, RSA and (EC)DSA are used to sign the state of the agreed-upon protocol parameters during the handshake phase. Naturally, RSA and (EC)DSA implementations have become the target of numerous attacks, including powerful side-channel attacks. Hence, cryptographic libraries have been patched repeatedly over the years. Here we introduce Jolt, a novel attack targeting signature scheme implementations. Our attack exploits faulty signatures gained by injecting faults during signature generation. By using the signature verification primitive, we correct faulty signatures and, in the process, deduce bits of the secret signing key. Compared to recent attacks that exploit single-bit biases in the nonce and require 2^45 signatures, our attack requires less than a thousand faulty signatures for a 256-bit (EC)DSA. The performance improvement is due to the fact that our attack targets the secret signing key, which does not change across signing sessions. We show that the proposed attack also works on Schnorr and RSA signatures with minor modifications. We demonstrate the viability of Jolt by running experiments targeting TLS handshakes in common cryptographic libraries such as WolfSSL, OpenSSL, Microsoft SymCrypt, LibreSSL, and Amazon s2n. On our target platform, the online phase takes less than 2 hours to recover 192 bits of a 256-bit ECDSA key, which is sufficient for full key recovery. We note that while RSA signatures are protected in popular cryptographic libraries, OpenSSL remains vulnerable to double fault injection. We have also reviewed their Federal Information Processing Standard (FIPS) hardened versions, which are slightly less efficient but still vulnerable to our attack. We found that (EC)DSA signatures remain largely unprotected against software-only faults, posing a threat to real-life deployments such as TLS, and potentially other security protocols such as SSH and IPSec. This highlights the need for a thorough review and implementation of fault checking in security protocol implementations.
Links
Hide and Seek with Spectres: Efficient discovery of speculative information leaks with random testing.
Authors
- Oleksii Oleksenko, Microsoft Research
- Marco Guarnieri, IMDEA Software Institute
- Boris Köpf, Microsoft Research
- Mark Silberstein, Technion
Abstract
Attacks like Spectre abuse speculative execution, one of the key performance optimizations of modern CPUs. Recently, several testing tools have emerged to automatically detect speculative leaks in commercial (black-box) CPUs. However, the testing process is still slow, which has hindered in-depth testing campaigns and so far prevented the discovery of new classes of leakage. In this paper, we identify the root causes of the performance limitations in existing approaches and propose techniques to overcome these limitations. With these techniques, we improve the testing speed over the state-of-the-art by up to two orders of magnitude. These improvements enable us to run a testing campaign of unprecedented depth on Intel and AMD CPUs. As a highlight, we discover two types of previously unknown speculative leaks (affecting string comparison and division) that have escaped previous manual and automatic analyses.
Links
Spectre Declassified: Reading from the Right Place at the Wrong Time.
Authors
- Basavesh Ammanaghatta Shivakumar, MPI-SP, Bochum, Germany
- Jack Barnes, The University of Adelaide, Adelaide, Australia
- Gilles Barthe, IMDEA Software Institute, Madrid, Spain; MPI-SP, Bochum, Germany
- Sunjay Cauligi, MPI-SP, Bochum, Germany
- Chitchanok Chuengsatiansup, The University of Adelaide, Adelaide, Australia
- Daniel Genkin, Georgia Institute of Technology, Atlanta, United States
- Sioli O’Connell, The University of Adelaide, Adelaide, Australia
- Peter Schwabe, MPI-SP, Bochum, Germany; Radboud University, Nijmegen, The Netherlands
- Rui Qi Sim, The University of Adelaide, Adelaide, Australia
- Yuval Yarom, The University of Adelaide, Adelaide, Australia
Abstract
Practical information-flow programming languages commonly allow controlled leakage via a declassify construct—programmers can use this construct to declare intentional leakage. For instance, cryptographic signatures and ciphertexts, which are computed from private keys, are viewed as secret by information-flow analyses. Cryptographic libraries can use declassify to make this data public, as it is no longer sensitive. In this paper, we study the interaction between speculative execution and declassification. We show that speculative execution leads to unintended leakage from declassification sites. Concretely, we present a PoC that recovers keys from AES implementations. Our PoC is an instance of a Spectre attack, and remains effective even when programs are compiled with speculative load hardening (SLH), a widespread compiler-based countermeasure against Spectre. We develop formal countermeasures against these attacks, including a significant improvement to SLH we term selective speculative load hardening (selSLH). These countermeasures soundly enforce relative non-interference (RNI): Informally, the speculative leakage of a protected program is limited to the existing sequential leakage of the original program. We implement our simplest countermeasure in the FaCT language and compiler—which is designed specifically for high-assurance cryptography—and we see performance overheads of at most 10%. Finally, although we do not directly implement selSLH, our preliminary evaluation suggests a significant reduction in performance cost for cryptographic functions as compared to traditional SLH.
Links
Volttack: Control IoT Devices by Manipulating Power Supply Voltage.
Authors
- Kai Wang, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Shilin Xiao, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Xiaoyu Ji, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Chen Yan, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Chaohao Li, Hangzhou Hikvision Digital Technology Co., Ltd
- Wenyuan Xu, Ubiquitous System Security Lab (USSLAB), Zhejiang University
Abstract
This paper analyzes the security of Internet of Things (IoT) devices from the perspective of sensing and actuation. In particular, we discover a vulnerability in power supply modules and propose the Volttack attack. To launch a Volttack attack, attackers may compromise the power source and inject malicious signals through the power supply module, which is indispensable in most devices. Eventually, Volttack attacks may render the sensor measurement irrelevant to reality or maneuver the actuator in a way that disregards the desired command. To understand Volttack, we systematically analyze the underlying principle of how power supply signals affect electronic components, which are the building blocks of sensor and actuator modules. Building on these findings, we implement and validate Volttack on off-the-shelf products: 6 sensors and 3 actuators, used in applications ranging from automobile braking systems and industrial process control to robotic arms. The consequences of manipulating the sensor measurement or actuation include a doubled car braking distance and a natural gas leak. The root cause of the vulnerability stems from the common belief that noise on the power line is unintentional; our work aims to call attention to enhancing the security of power supply modules and adding countermeasures to mitigate the attacks.
Links
Inducing Wireless Chargers to Voice Out for Inaudible Command Attacks.
Authors
- Donghui Dai, Department of Computing, The Hong Kong Polytechnic University
- Zhenlin An, Department of Computing, The Hong Kong Polytechnic University
- Lei Yang, Department of Computing, The Hong Kong Polytechnic University
Abstract
Recent works demonstrated that speech recognition systems or voice assistants can be manipulated by malicious voice commands, which are injected through various inaudible media, such as ultrasound, laser, and electromagnetic interference (EMI). In this work, we explore a new kind of inaudible voice attack through the magnetic interference induced by a wireless charger. Essentially, we show that the microphone components of smart devices suffer from severe magnetic interference while being wirelessly charged, due to the absence of effective protection against EMI at low frequencies (100 kHz or below). By taking advantage of this vulnerability, we design two inaudible voice attacks, HeartwormAttack and ParasiteAttack, both of which aim to inject malicious voice commands into smart devices being wirelessly charged. They use a compromised wireless charger or an accessory device (called a parasite), respectively, to inject the voice commands. We conduct extensive experiments with 17 victim devices (iPhone, Huawei, Samsung, etc.) and 6 types of voice assistants (Siri, Google STT, Bixby, etc.). Evaluation results demonstrate the feasibility of the two proposed attacks with commercial charging settings.
Links
mmSpoof: Resilient Spoofing of Automotive Millimeter-wave Radars using Reflect Array.
Authors
- Rohith Reddy Vennam, University of California San Diego, La Jolla, CA
- Ish Kumar Jain, University of California San Diego, La Jolla, CA
- Kshitiz Bansal, University of California San Diego, La Jolla, CA
- Joshua Orozco, University of California San Diego, La Jolla, CA
- Puja Shukla, University of California San Diego, La Jolla, CA
- Aanjhan Ranganathan, Northeastern University, Boston, MA
- Dinesh Bharadia, University of California San Diego, La Jolla, CA
Abstract
FMCW radars are integral to automotive driving for robust and weather-resistant sensing of surrounding objects. However, these radars are vulnerable to spoofing attacks that can cause sensor malfunction and potentially lead to accidents. Previous attempts at spoofing FMCW radars using an attacker device have not been very effective due to the need for synchronization between the attacker and the victim. We present a novel spoofing mechanism called mmSpoof that does not require synchronization and is resilient to various security features and countermeasures of the victim radar. Our spoofing mechanism uses a "reflect array" based attacker device that reflects the radar signal with appropriate modulation to spoof the victim’s radar. We provide insights and mechanisms to flexibly spoof any distance and velocity on the victim’s radar using a unique frequency shift at the mmSpoof’s reflect array. We design a novel algorithm to estimate this frequency shift without assuming prior information about the victim’s radar. We show the effectiveness of our spoofing using a compact and mobile setup with commercial-off-the-shelf components in realistic automotive driving scenarios with commercial radars.
Links
PLA-LiDAR: Physical Laser Attacks against LiDAR-based 3D Object Detection in Autonomous Vehicle.
Authors
- Zizhi Jin, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Xiaoyu Ji, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Yushi Cheng, Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University
- Bo Yang, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Chen Yan, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Wenyuan Xu, Ubiquitous System Security Lab (USSLAB), Zhejiang University
Abstract
Autonomous vehicles and robots increasingly exploit LiDAR-based 3D object detection systems to detect obstacles in the environment. Correct detection and classification are important to ensure safe driving. Though existing work has demonstrated the feasibility of manipulating point clouds to spoof 3D object detectors, most of the attempts have been conducted digitally. In this paper, we investigate the possibility of physically fooling LiDAR-based 3D object detection by injecting adversarial point clouds using lasers. First, we develop a laser transceiver that can inject up to 4200 points, 20 times more than prior work, and can measure the scanning cycle of victim LiDARs to schedule the spoofing laser signals. By designing a control-signal method that converts point cloud coordinates into control signals, together with an adversarial point cloud optimization method that respects the physical constraints of LiDARs and attack capabilities, we manage to physically inject spoofed point clouds with desired shapes into the victim LiDAR. We can launch four types of attacks, i.e., naive hiding, record-based creating, optimization-based hiding, and optimization-based creating. Extensive experiments demonstrate the effectiveness of our attacks against two commercial LiDARs and three detectors. We also discuss defense strategies at the sensor and AV system levels.
Links
mmEcho: A mmWave-based Acoustic Eavesdropping Method.
Authors
- Pengfei Hu, School of Computer Science and Technology, Shandong University, Qingdao, China
- Wenhao Li, School of Computer Science and Technology, Shandong University, Qingdao, China
- Riccardo Spolaor, School of Computer Science and Technology, Shandong University, Qingdao, China
- Xiuzhen Cheng, School of Computer Science and Technology, Shandong University, Qingdao, China
Abstract
Acoustic eavesdropping targeting private or confidential spaces is one of the most severe privacy threats. Soundproof rooms may reduce such risks, but they cannot prevent sophisticated eavesdropping, which has been an emerging research trend in recent years. Researchers have investigated such acoustic eavesdropping attacks via sensor-enabled side channels. However, such attacks either make unrealistic assumptions or have considerable constraints. This paper introduces mmEcho, an acoustic eavesdropping system that uses a millimeter-wave radio signal to accurately measure the micrometer-level vibration of an object induced by sound waves. Compared with previous works, our eavesdropping method is highly accurate and requires no prior knowledge about the victim. We evaluate the performance of mmEcho under extensive real-world settings and scenarios. Our results show that mmEcho can accurately reconstruct audio from moving sources across various distances, orientations, reverberating objects, sound insulators, spoken languages, and sound levels.
Links
Side Eye: Characterizing the Limits of POV Acoustic Eavesdropping from Smartphone Cameras with Rolling Shutters and Movable Lenses.
Authors
- Yan Long, Electrical Engineering and Computer Science, University of Michigan
- Pirouz Naghavi, Computer and Information Science and Engineering, University of Florida
- Blas Kojusner, Computer and Information Science and Engineering, University of Florida
- Kevin Butler, Computer and Information Science and Engineering, University of Florida
- Sara Rampazzi, Computer and Information Science and Engineering, University of Florida
- Kevin Fu, Electrical Engineering and Computer Science, University of Michigan
Abstract
Our research discovers how the rolling shutter and movable lens structures widely found in smartphone cameras modulate structure-borne sounds onto camera images, creating a point-of-view (POV) optical-acoustic side channel for acoustic eavesdropping. The movement of smartphone camera hardware leaks acoustic information because the images unwittingly encode ambient sound as imperceptible distortions. Our experiments find that the side channel is further amplified by intrinsic behaviors of Complementary Metal-Oxide-Semiconductor (CMOS) rolling shutters and movable lenses such as in Optical Image Stabilization (OIS) and Auto Focus (AF). Our paper characterizes the limits of acoustic information leakage caused by structure-borne sound that perturbs the POV of smartphone cameras. In contrast with traditional optical-acoustic eavesdropping on vibrating objects, this side channel requires no line of sight and no object within the camera's field of view (images of a ceiling suffice). Our experiments test the limits of this side channel with a novel signal processing pipeline that extracts and recognizes the leaked acoustic information. Our evaluation with 10 smartphones on a spoken digit dataset reports 80.66%, 91.28%, and 99.67% accuracies on recognizing 10 spoken digits, 20 speakers, and 2 genders, respectively. We further systematically discuss possible defense strategies and implementations. By modeling, measuring, and demonstrating the limits of acoustic eavesdropping from smartphone camera image streams, our contributions explain the physics-based causality and possible ways to reduce the threat on current and future devices.
Links
3DFed: Adaptive and Extensible Framework for Covert Backdoor Attack in Federated Learning.
Authors
- Haoyang Li, The Hong Kong Polytechnic University
- Qingqing Ye, The Hong Kong Polytechnic University
- Haibo Hu, The Hong Kong Polytechnic University
- Jin Li, Guangzhou University
- Leixia Wang, Renmin University of China
- Chengfang Fang, Huawei International, Singapore
- Jie Shi, Huawei International, Singapore
Abstract
Federated Learning (FL), the de-facto distributed machine learning paradigm that locally trains datasets at individual devices, is vulnerable to backdoor model poisoning attacks. By compromising or impersonating those devices, an attacker can upload crafted malicious model updates to manipulate the global model with backdoor behavior upon attacker-specified triggers. However, existing backdoor attacks require more information on the victim FL system than a practical black-box setting provides. Furthermore, they are often specialized to optimize for a single objective, which becomes ineffective as modern FL systems tend to adopt defense-in-depth that detects backdoor models from different perspectives. Motivated by these concerns, in this paper we propose 3DFed, an adaptive, extensible, and multi-layered framework to launch covert FL backdoor attacks in a black-box setting. 3DFed features three evasion modules that camouflage backdoor models: backdoor training with constrained loss, noise mask, and decoy model. By implanting indicators into a backdoor model, 3DFed can obtain the attack feedback of the previous epoch from the global model and dynamically adjust the hyper-parameters of these backdoor evasion modules. Through extensive experimental results, we show that when all its components work together, 3DFed can evade the detection of all state-of-the-art FL backdoor defenses, including Deepsight, Foolsgold, FLAME, FL-Detector, and RFLBAT. As an extensible framework, 3DFed can also incorporate new evasion modules in the future.
Links
Scalable and Privacy-Preserving Federated Principal Component Analysis.
Authors
- David Froelicher, EPFL; Tune Insight SA
- Hyunghoon Cho, Tune Insight SA
- Manaswitha Edupalli, Tune Insight SA
- Joao Sa Sousa, MIT
- Jean-Philippe Bossuat, Broad Institute of MIT and Harvard
- Apostolos Pyrgelis, MIT
- Juan R. Troncoso-Pastoriza, Broad Institute of MIT and Harvard
- Bonnie Berger, EPFL
- Jean-Pierre Hubaux, MIT; Broad Institute of MIT and Harvard
Abstract
Principal component analysis (PCA) is an essential algorithm for dimensionality reduction in many data science domains. We address the problem of performing a federated PCA on private data distributed among multiple data providers while ensuring data confidentiality. Our solution, SF-PCA, is an end-to-end secure system that preserves the confidentiality of both the original data and all intermediate results in a passive-adversary model with up to all-but-one colluding parties. SF-PCA jointly leverages multiparty homomorphic encryption, interactive protocols, and edge computing to efficiently interleave computations on local cleartext data with operations on collectively encrypted data. SF-PCA obtains results as accurate as non-secure centralized solutions, independently of the data distribution among the parties. It scales linearly or better with the dataset dimensions and with the number of data providers. SF-PCA is more precise than existing approaches that approximate the solution by combining local analysis results, and between 3x and 250x faster than privacy-preserving alternatives based solely on secure multiparty computation or homomorphic encryption. Our work demonstrates the practical applicability of secure and federated PCA on private distributed datasets.
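For intuition, the cleartext baseline that SF-PCA computes securely can be sketched as follows: each party contributes only aggregate statistics (its local Gram matrix and column sums), and pooling them reproduces centralized PCA exactly. This is an illustrative numpy sketch without any encryption; in SF-PCA these aggregates would never appear in cleartext:

```python
import numpy as np

def federated_pca(parties, k):
    """Cleartext federated PCA: aggregate per-party Gram matrices and
    column sums into the pooled covariance, then eigendecompose."""
    n = sum(p.shape[0] for p in parties)
    gram = sum(p.T @ p for p in parties)           # sum of X_i^T X_i
    mean = sum(p.sum(axis=0) for p in parties) / n
    cov = gram / n - np.outer(mean, mean)          # pooled covariance
    vals, vecs = np.linalg.eigh(cov)               # ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]             # top-k components
    return vecs[:, order]

rng = np.random.default_rng(0)
X = rng.normal(size=(90, 5))
parts = np.array_split(X, 3)          # three data providers
U = federated_pca(parts, k=2)

# Sanity check against centralized PCA (equal up to sign).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
print(np.allclose(np.abs(U.T @ Vt[:2].T), np.eye(2), atol=1e-6))
```

Unlike approaches that merge locally computed principal components, this aggregation is lossless, which mirrors SF-PCA's claim of accuracy independent of how data is distributed among parties.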
Links
Private, Efficient, and Accurate: Protecting Models Trained by Multi-party Learning with Differential Privacy.
Authors
- Wenqiang Ruan, Laboratory of Data Analytics and Security, Fudan University
- Mingxin Xu, Laboratory of Data Analytics and Security, Fudan University
- Wenjing Fang, Ant Group
- Li Wang, Ant Group
- Lei Wang, Ant Group
- Weili Han, Laboratory of Data Analytics and Security, Fudan University
Abstract
Secure multi-party computation-based machine learning, referred to as multi-party learning (MPL for short), has become an important technology for utilizing data from multiple parties with privacy preservation. While MPL provides rigorous security guarantees for the computation process, the models trained by MPL are still vulnerable to attacks that solely depend on access to the models. Differential privacy could help to defend against such attacks. However, the accuracy loss brought by differential privacy and the huge communication overhead of secure multi-party computation protocols make it highly challenging to balance the three-way trade-off between privacy, efficiency, and accuracy. In this paper, we resolve the above issue by proposing a solution, referred to as PEA (Private, Efficient, Accurate), which consists of a secure differentially private stochastic gradient descent (DPSGD for short) protocol and two optimization methods. First, we propose a secure DPSGD protocol to enforce DPSGD, a popular differentially private machine learning algorithm, in secret sharing-based MPL frameworks. Second, to reduce the accuracy loss caused by differential privacy noise and the huge communication overhead of MPL, we propose two optimization methods for the training process of MPL: (1) a data-independent feature extraction method, which aims to simplify the trained model structure; and (2) a local data-based global model initialization method, which aims to speed up the convergence of model training. We implement PEA in two open-source MPL frameworks: TF-Encrypted and Queqiao. The experimental results on various datasets demonstrate the efficiency and effectiveness of PEA. For example, when ε = 2, we can train a differentially private classification model with an accuracy of 88% for CIFAR-10 within 7 minutes under the LAN setting. This result significantly outperforms that of CryptGPU, a state-of-the-art MPL framework, which costs more than 16 hours to train a non-private deep neural network model on CIFAR-10 with the same accuracy.
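The DPSGD primitive that PEA enforces inside secret sharing can be sketched in cleartext: clip each per-example gradient to a fixed L2 norm, sum, and add Gaussian noise calibrated to the clip bound (an illustrative numpy sketch; the parameter values are arbitrary, and the real protocol performs these steps on secret-shared values):

```python
import numpy as np

def dpsgd_step(per_example_grads, clip=1.0, sigma=1.0, rng=None):
    """One DPSGD aggregation step: clip each example's gradient to L2
    norm `clip`, sum, then add Gaussian noise scaled by sigma * clip."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip / norm))  # scale down if too large
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, sigma * clip, size=total.shape)
    return (total + noise) / len(per_example_grads)

grads = [np.array([3.0, 4.0]),   # norm 5  -> clipped to norm 1
         np.array([0.3, 0.4])]   # norm 0.5 -> kept as-is
g = dpsgd_step(grads, clip=1.0, sigma=0.0)  # sigma=0 to see clipping alone
print(np.round(g, 2))  # averaged clipped gradients: [0.45, 0.6]
```

Clipping bounds each example's influence on the update, which is what lets the Gaussian noise scale deliver a formal (ε, δ) guarantee.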
Links
Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering.
Authors
- Ce Feng, Lehigh University
- Nuo Xu, Lehigh University
- Wujie Wen, Lehigh University
- Parv Venkitasubramaniam, Lehigh University
- Caiwen Ding, University of Connecticut
Abstract
Differential privacy is a widely accepted measure of privacy in the context of deep learning algorithms, and achieving it relies on a noisy training approach known as differentially private stochastic gradient descent (DP-SGD). Because DP-SGD requires adding noise directly to every gradient in a dense neural network, privacy is achieved at a significant utility cost. In this work, we present Spectral-DP, a new differentially private learning approach which combines gradient perturbation in the spectral domain with spectral filtering to achieve a desired privacy guarantee with a lower noise scale and thus better utility. We develop differentially private deep learning methods based on Spectral-DP for architectures that contain both convolution and fully connected layers. In particular, for fully connected layers, we combine a block-circulant based spatial restructuring with Spectral-DP to achieve better utility. Through comprehensive experiments, we study and provide guidelines to implement Spectral-DP deep learning on benchmark datasets. In comparison with state-of-the-art DP-SGD based approaches, Spectral-DP is shown to have uniformly better utility performance in both training from scratch and transfer learning settings.
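The core idea of spectral perturbation and filtering can be sketched as follows (an illustrative numpy sketch, not the paper's implementation; the 1-D rFFT, noise scale, and keep fraction are arbitrary simplifications):

```python
import numpy as np

def spectral_dp(grad, sigma=0.1, keep=0.5, rng=None):
    """Perturb a gradient in the frequency domain, then filter (zero out)
    high-frequency bins so part of the added noise is discarded with them."""
    rng = rng or np.random.default_rng(0)
    spec = np.fft.rfft(grad)
    spec = spec + (rng.normal(0, sigma, spec.shape)
                   + 1j * rng.normal(0, sigma, spec.shape))
    cutoff = int(len(spec) * keep)
    spec[cutoff:] = 0                    # spectral filtering step
    return np.fft.irfft(spec, n=len(grad))

g = np.sin(np.linspace(0, 2 * np.pi, 32))  # smooth, low-frequency gradient
noisy = spectral_dp(g, sigma=0.05)
print(noisy.shape)
```

Because the filtering removes a fixed fraction of the noisy spectrum, less noise survives into the final gradient than with direct per-coordinate perturbation, which is the intuition behind the utility gain.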
Links
ELSA: Secure Aggregation for Federated Learning with Malicious Actors.
Authors
- Mayank Rathee, University of California, Berkeley
- Conghao Shen, University of California, Berkeley; Stanford University
- Sameer Wagh, University of California, Berkeley; Devron Corporation
- Raluca Ada Popa, University of California, Berkeley
Abstract
Federated learning (FL) is an increasingly popular approach for machine learning (ML) in cases where the training dataset is highly distributed. Clients perform local training on their datasets, and the updates are then aggregated into the global model. Existing protocols for aggregation are either inefficient or do not consider the case of malicious actors in the system. This is a major barrier to making FL an ideal solution for privacy-sensitive ML applications. We present Elsa, a secure aggregation protocol for FL that breaks this barrier: it is efficient and addresses the existence of malicious actors at the core of its design. Similar to prior work on Prio and Prio+, Elsa provides a novel secure aggregation protocol built on distributed trust across two servers that keeps individual client updates private as long as one server is honest, defends against malicious clients, and is efficient end-to-end. Compared to prior works, the distinguishing theme in Elsa is that instead of the servers generating cryptographic correlations interactively, the clients act as untrusted dealers of these correlations without compromising the protocol's security. This leads to a much faster protocol that also achieves stronger security at that efficiency than prior work. We introduce new techniques that retain privacy even when a server is malicious, at a small added cost of 7-25% in runtime and a negligible increase in communication over the semi-honest-server case. Our work improves end-to-end runtime over prior work with similar security guarantees by large margins: single-aggregator RoFL by up to 305x (for the models we consider), and distributed-trust Prio by up to 8x.
Links
No One Drinks From the Firehose: How Organizations Filter and Prioritize Vulnerability Information.
Authors
- Stephanie de Smale, National Cyber Security Centre, The Netherlands; Delft University of Technology
- Rik van Dijk, National Cyber Security Centre, The Netherlands
- Xander Bouwman, Delft University of Technology
- Jeroen van der Ham, National Cyber Security Centre, The Netherlands; University of Twente
- Michel van Eeten, Delft University of Technology
Abstract
The number of published software vulnerabilities is increasing every year. How do organizations stay in control of their attack surface despite their limited staff resources? Prior work has analyzed the overall software vulnerability ecosystem as well as patching processes within organizations, but not how these two are connected. We investigate this missing link through semi-structured interviews with 22 organizations in critical infrastructure and government services. We analyze where in these organizations the responsibility is allocated to collect and triage information about software vulnerabilities, and find that none of our respondents is acquiring such information comprehensively, not even in a reduced and aggregated form like the National Vulnerability Database (NVD). This means that information on known vulnerabilities will be missed, even in critical infrastructure organizations. We observe that organizations apply implicit and explicit coping mechanisms to reduce their intake of vulnerability information, and identify three trade-offs in these strategies: independence, pro-activeness and formalization. Although our respondents’ behavior is in conflict with the widely accepted security advice to collect comprehensive vulnerability information about active systems, no respondents recall having experienced a security incident that was associated with missing information on a known software vulnerability. This suggests that, given scarce resources, reducing the intake of vulnerability information by up to 95% can be considered a rational strategy. Our findings raise questions about the allocation of responsibility and accountability for finding vulnerable systems, as well as suggest changing expectations around collecting vulnerability information.
Links
Vulnerability Discovery for All: Experiences of Marginalization in Vulnerability Discovery.
Authors
- Kelsey R. Fulton, University of Maryland
- Samantha Katcher, Tufts University
- Kevin Song, University of Chicago
- Marshini Chetty, University of Chicago
- Michelle L. Mazurek, University of Maryland
- Chloé Messdaghi, Impactive Consulting
- Daniel Votipka, Tufts University
Abstract
Vulnerability discovery is an essential aspect of software security. Currently, the demand for security experts significantly exceeds the available vulnerability discovery workforce. Further, the existing vulnerability discovery workforce is highly homogeneous, dominated by white and Asian men. As such, one promising avenue for increasing the capacity of the vulnerability discovery community is through recruitment and retention from a broader population. Although significant prior research has explored the challenges of equity and inclusion in computing broadly, the competitive and frequently self-taught nature of vulnerability discovery work may create new variations on these challenges. This paper reports on a semi-structured interview study (N = 16) investigating how people from marginalized populations come to participate in vulnerability discovery, whether they feel welcomed by the vulnerability discovery community, and what challenges they face when joining the vulnerability discovery community. We find that members of marginalized populations face some unique challenges, while other challenges common in vulnerability discovery are exacerbated by marginalization.
Links
"We are a startup to the core": A qualitative interview study on the security and privacy development practices in Turkish software startups.
Authors
- Dilara Keküllüoğlu, University of Edinburgh, UK
- Yasemin Acar, The George Washington University, USA; Paderborn University, Germany
Abstract
Security and privacy are often neglected in software development, and rarely a priority for developers. This insight is commonly based on research conducted by researchers and on developer populations living and working in the United States, Europe, and the United Kingdom. However, the production of software is global, and crucial populations in important technology hubs are not adequately studied. The software startup scene in Turkey is impactful, and comprehension, knowledge, and mitigations related to software security and privacy remain understudied. To close this research gap, we conducted a semi-structured interview study with 16 developers working in Turkish software startups. The goal of the interview study was to analyze if and how developers ensure that their software is secure and preserves user privacy. Our main finding is that developers rarely prioritize security and privacy, due to a lack of awareness, skills, and resources. We find that regulations can make a positive impact on security and privacy. Based on the study, we issue recommendations for industry, individual developers, research, educators, and regulators. Our recommendations can inform a more globalized approach to security and privacy in software development.
Links
"How technical do you get? I'm an English teacher": Teaching and Learning Cybersecurity and AI Ethics in High School.
Authors
- Zachary Kilhoffer, University of Illinois at Urbana-Champaign
- Zhixuan Zhou, University of Illinois at Urbana-Champaign
- Firmiana Wang, University of Illinois Laboratory High School
- Fahad Tamton, University of Illinois at Urbana-Champaign
- Yun Huang, University of Illinois at Urbana-Champaign
- Pilyoung Kim, University of Denver
- Tom Yeh, University of Colorado Boulder
- Yang Wang, University of Illinois at Urbana-Champaign
Abstract
Today’s cybersecurity and AI technologies are often fraught with ethical challenges. One promising direction is to teach cybersecurity and AI ethics to today’s youth. However, we know little about how these subjects are taught before college. Drawing from interviews of US high school teachers (n=16) and students (n=11), we find that cybersecurity and AI ethics are often taught in non-technical classes such as social studies and language arts. We also identify relevant topics, of which epistemic norms, privacy, and digital citizenship appeared most often. While teachers leverage traditional and novel teaching strategies including discussions (treating current events as case studies), gamified activities, and content creation, many challenges remain. For example, teachers hesitate to discuss current events out of concern for appearing partisan and angering parents; cyber hygiene instruction appears very ineffective at educating youth and promoting safer online behavior; and generational differences make it difficult for teachers to connect with students. Based on the study results, we offer practical suggestions for educators, school administrators, and cybersecurity practitioners to improve youth education on cybersecurity and AI ethics.
Links
Skilled or Gullible? Gender Stereotypes Related to Computer Security and Privacy.
Authors
- Miranda Wei, Paul G. Allen School of Computer Science & Engineering, University of Washington
- Pardis Emami-Naeini, Department of Computer Science, Duke University
- Franziska Roesner, Paul G. Allen School of Computer Science & Engineering, University of Washington
- Tadayoshi Kohno, Paul G. Allen School of Computer Science & Engineering, University of Washington
Abstract
Gender stereotypes remain common in U.S. society and harm people of all genders. Focusing on binary genders (women and men) as a first investigation, we empirically study gender stereotypes related to computer security and privacy. We used Prolific to conduct two surveys with U.S. participants that aimed to: (1) surface potential gender stereotypes related to security and privacy (N = 202), and (2) assess belief in gender stereotypes about security and privacy engagement, personal characteristics, and behaviors (N = 190). We find that stereotype beliefs are significantly correlated with participants’ gender as well as level of sexism, and we delve into the justifications our participants offered for their beliefs. Beyond scientifically studying the existence and prevalence of such stereotypes, we describe potential implications, including biasing crowdworker-facilitated user research. Further, our work lays a foundation for deeper investigations of the impacts of stereotypes in computer security and privacy, as well as stereotypes across the whole gender and identity spectrum.
Links
Everybody's Got ML, Tell Me What Else You Have: Practitioners' Perception of ML-Based Security Tools and Explanations.
Authors
- Jaron Mink, University of Illinois at Urbana-Champaign
- Hadjer Benkraouda, University of Illinois at Urbana-Champaign
- Limin Yang, University of Illinois at Urbana-Champaign
- Arridhana Ciptadi, Truera
- Ali Ahmadzadeh, Blue Hexagon
- Daniel Votipka, Tufts University
- Gang Wang, University of Illinois at Urbana-Champaign
Abstract
Significant efforts have been invested in developing machine learning (ML) based tools to support security operations. However, they still face key challenges in practice. A generally perceived weakness of machine learning is the lack of explanation, which has motivated researchers to develop machine learning explanation techniques. However, it is not yet well understood how security practitioners perceive the benefits and pain points of machine learning and the corresponding explanation methods in the context of security operations. To fill this gap and understand "what is needed", we conducted semi-structured interviews with 18 security practitioners with diverse roles, duties, and expertise. We find practitioners generally believe that ML tools should be used in conjunction with (instead of replacing) traditional rule-based methods. While ML’s output is perceived as difficult to reason about, surprisingly, rule-based methods are not strictly easier to interpret. We also find that only a few practitioners considered security (robustness to adversarial attacks) as a key factor in the choice of tools. Regarding ML explanations, while recognizing their value in model verification and understanding security events, practitioners also identify gaps between existing explanation methods and the needs of their downstream tasks. We collect and synthesize practitioners’ suggestions regarding explanation scheme designs, and discuss how future work can help address these needs.
Links
Precise Detection of Kernel Data Races with Probabilistic Lockset Analysis.
Authors
- Gabriel Ryan, Columbia University
- Abhishek Shah, Columbia University
- Dongdong She, Columbia University
- Suman Jana, Columbia University
Abstract
Finding data races is critical for ensuring security in modern kernel development. However, finding data races in the kernel is challenging because it requires jointly searching over possible combinations of system calls and concurrent execution schedules. Kernel race testing systems typically perform this search by executing groups of fuzzer seeds from a corpus and applying a combination of schedule fuzzing and dynamic race prediction on traces. However, predicting which combinations of seeds can expose races in the kernel is difficult, as fuzzer seeds will usually follow different execution paths when executed concurrently due to inter-thread communications and synchronization. To address this challenge, we introduce Probabilistic Lockset Analysis (PLA), a new analysis that addresses the challenges posed by race prediction for the kernel. PLA leverages the observation that system calls almost always make certain accesses to shared memory in order to perform their function. PLA uses randomized concurrent trace sampling to identify memory accesses that are performed consistently and estimates the probability of races between them subject to kernel lock synchronization. By prioritizing high-probability races, PLA is able to make accurate predictions. We evaluate PLA against comparable kernel race testing methods and show it finds races at a 3× higher rate over 24 hours. We use PLA to find 183 races in Linux kernel v5.18-rc5, including 102 harmful races. PLA is able to find races that have severe security impact in heavily tested core kernel modules, including a use-after-free in memory management, an OOB write in network cryptography, and leaking of kernel heap memory information. Some of these vulnerabilities have been overlooked by existing systems for years: one of the races found by PLA, involving an OOB write, has been present in the kernel since 2013 (version v3.14-rc1) and has been designated a high-severity CVE.
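PLA's probabilistic estimation is more involved than an abstract can convey, but the classic lockset check it builds on, where two accesses to the same address by different threads race if at least one is a write and they hold no common lock, can be sketched together with a toy sampling-based probability estimate in the spirit of the abstract (all names and structures are our illustration, not the paper's implementation):

```python
from collections import namedtuple

# A memory access observed in a trace: thread id, address, write flag, and
# the set of locks held at the time of the access.
Access = namedtuple("Access", ["thread", "addr", "is_write", "locks"])

def may_race(a, b):
    """Classic lockset check: same address, different threads,
    at least one write, and no common lock protecting both accesses."""
    return (a.addr == b.addr
            and a.thread != b.thread
            and (a.is_write or b.is_write)
            and not (a.locks & b.locks))

def race_probability(trace_samples, a_key, b_key):
    """Toy PLA-style estimate: across randomized trace samples, count how
    often a pair of access sites appears with a racy (disjoint) lockset."""
    hits = 0
    for sample in trace_samples:
        a, b = sample.get(a_key), sample.get(b_key)
        if a and b and may_race(a, b):
            hits += 1
    return hits / len(trace_samples)

# Two sampled traces: in one the write is protected by lock "L", in the other not.
s1 = {"site1": Access(1, 0x10, True, {"L"}), "site2": Access(2, 0x10, False, {"L"})}
s2 = {"site1": Access(1, 0x10, True, set()), "site2": Access(2, 0x10, False, {"L"})}
print(race_probability([s1, s2], "site1", "site2"))  # 0.5
```

High-probability pairs would then be prioritized for schedule exploration, which is the prioritization idea the abstract describes.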
Links
SegFuzz: Segmentizing Thread Interleaving to Discover Kernel Concurrency Bugs through Fuzzing.
Authors
- Dae R. Jeong, School of Computing, KAIST
- Byoungyoung Lee, Department of Electrical and Computer Engineering, Seoul National University
- Insik Shin, School of Computing, KAIST
- Youngjin Kwon, School of Computing, KAIST
Abstract
Discovering kernel concurrency bugs through fuzzing is challenging. Identifying kernel concurrency bugs, as opposed to non-concurrency bugs, necessitates an analysis of possible interleavings between two or more threads. However, because the search space of thread interleavings is vast, it is impractical to investigate all conceivable thread interleavings. To explore this vast search space, most previous approaches perform random or simple heuristic searches without any coverage metric for thread interleaving, or with an insufficient form of coverage. As a result, they either conduct wasteful searches with redundant executions or overlook concurrency bugs that their coverage cannot address. To overcome such limitations, we propose SegFuzz, a fuzzing framework for kernel concurrency bugs. When exploring the search space of thread interleavings, SegFuzz decomposes an entire thread interleaving into a set of segments, each of which represents an interleaving of a small number of instructions, and utilizes individual segments as interleaving coverage, called interleaving segment coverage. When searching for thread interleavings, SegFuzz mutates interleavings in explored interleaving segments to construct new thread interleavings that have not yet been explored. With SegFuzz, we discover 21 new concurrency bugs in Linux kernels, and demonstrate the efficiency of SegFuzz by showing that it can identify known bugs on average 4.1 times more quickly than state-of-the-art approaches.
Links
AEM: Facilitating Cross-Version Exploitability Assessment of Linux Kernel Vulnerabilities.
Authors
- Zheyue Jiang, Fudan University
- Yuan Zhang, Fudan University
- Jun Xu, University of Utah
- Xinqian Sun, Fudan University
- Zhuang Liu, Fudan University
- Min Yang, Fudan University
Abstract
This paper studies the problem of cross-version exploitability assessment for Linux kernels. Specifically, given an exploit demonstrating the exploitability of a vulnerability on a specific kernel version, we aim to understand the exploitability of the same vulnerability on other kernel versions. To tackle cross-version exploitability assessment, automated exploit generation (AEG), a recently popular topic, is the only existing, applicable solution. However, AEG is not well-suited due to its template-driven nature and ignorance of the capabilities offered by the available exploit. In this work, we introduce a new method, automated exploit migration (AEM), to facilitate cross-version exploitability assessment for Linux kernels. The key insight of AEM is the observation that the strategy adopted by the exploit is often applicable to other exploitable kernel versions. Technically, we consider the kernel version where the exploit works as a reference and adjust the exploit to force the other kernel versions to align with the reference. This way, we can reproduce the exploiting behaviors on the other versions. To reduce the cost and increase the feasibility, we strategically identify execution points that truly affect the exploitation and only enforce alignment at those points. We have designed and implemented a prototype of AEM. In our evaluation of 67 cases where exploit migration is needed, our prototype successfully migrates the exploit for 56 cases, producing a success rate of 83.5%.
Links
When Top-down Meets Bottom-up: Detecting and Exploiting Use-After-Cleanup Bugs in Linux Kernel.
Authors
- Lin Ma, Zhejiang University
- Duoming Zhou, Zhejiang University
- Hanjie Wu, Carnegie Mellon University
- Yajin Zhou, Zhejiang University
- Rui Chang, Zhejiang University
- Hao Xiong, Zhejiang University
- Lei Wu, Zhejiang University
- Kui Ren, Zhejiang University
Abstract
When a device is detached from the system, Use-After-Cleanup (UAC) bugs can occur because a running kernel thread may be unaware of the device detachment and attempt to use an object that has been released by the cleanup thread. Our investigation suggests that an attacker can exploit UAC bugs to obtain the capability of arbitrary code execution and privilege escalation, which has received little attention from the community. While existing tools mainly focus on well-known concurrency bugs like data races, few target UAC bugs. In this paper, we propose a tool named UACatcher to systematically detect UAC bugs. UACatcher consists of three main phases. It first scans the entire kernel to find target layers. Next, it adopts context- and flow-sensitive inter-procedural analysis and points-to analysis to locate possible free (deallocation) sites in the bottom-up cleanup thread and use (dereference) sites in the top-down kernel thread that can cause UAC bugs. Then, UACatcher uses a routine switch point algorithm, which relies on synchronizations and path constraints, to detect UAC bugs among these sites and estimate which ones are exploitable. For exploitable bugs, we leverage a pseudoterminal-based device emulation technique to develop practical exploits. We have implemented a prototype of UACatcher and evaluated it on the 5.11 Linux kernel. As a result, our tool successfully detected 346 UAC bugs, which were reported to the community (277 have been confirmed and fixed, and 15 CVEs have been assigned). Additionally, 13 bugs are exploitable and can be used to develop working exploits that gain the arbitrary code execution primitive in kernel space and achieve privilege escalation. Finally, we discuss UACatcher’s limitations and propose possible solutions to fix and prevent UAC bugs.
Links
RSFuzzer: Discovering Deep SMI Handler Vulnerabilities in UEFI Firmware with Hybrid Fuzzing.
Authors
- Jiawei Yin, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology
- Menghao Li, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology
- Yuekang Li, Nanyang Technological University
- Yong Yu, Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology
- Boru Lin, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology
- Yanyan Zou, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology
- Yang Liu, Nanyang Technological University
- Wei Huo, Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, Beijing; Key Laboratory of Network Assessment Technology, Chinese Academy of Sciences; Beijing Key Laboratory of Network Security and Protection Technology
- Jingling Xue, UNSW, Sydney
Abstract
System Management Mode (SMM) is a secure operation mode for x86 processors supported by Unified Extensible Firmware Interface (UEFI) firmware. SMM is designed to provide a secure execution environment to access highly privileged data or control low-level hardware (such as power management). The programs running in SMM are called SMM drivers, and System Management Interrupt (SMI) handlers are the most important components of SMM drivers since they are the only components that receive and handle data from outside the SMM execution environment. Although SMM can serve as an extra layer of protection when the operating system is compromised, vulnerabilities in SMM drivers, especially SMI handlers, can invalidate this protection and cause severe damage to the device. Thus, early detection of SMI handler vulnerabilities is important for UEFI firmware security. To this end, researchers have proposed to use hybrid fuzzing techniques for detecting SMI handler vulnerabilities. In particular, Intel has developed a hybrid fuzzer called Excite and uses it to secure Intel products. Although existing hybrid fuzzing techniques can detect vulnerabilities in SMI handlers, their effectiveness is limited due to two major pitfalls: 1) they can only feed input through the most common input interface to SMI handlers, lacking the ability to utilize other input interfaces; 2) they have no awareness of variables shared by multiple SMI handlers, lacking the ability to explore code segments related to such variables. To address these challenges, we propose RSFuzzer, a hybrid greybox fuzzing technique that can learn input interface and format information and detect deeply hidden vulnerabilities triggered by invoking multiple SMI handlers. We implemented RSFuzzer and evaluated it on 16 UEFI firmware images provided by six vendors.
The experiment results show that RSFuzzer can cover 617% more basic blocks and detect 828% more vulnerabilities on average than the state-of-the-art hybrid fuzzing technique. Moreover, we found and reported 65 0-day vulnerabilities in the evaluated UEFI firmware images, and 14 CVE IDs have been assigned. Notably, 6 of the 0-day vulnerabilities were found in commercial-off-the-shelf (COTS) products from Intel, which might have been tested by Excite before release.
Links
A Theory to Instruct Differentially-Private Learning via Clipping Bias Reduction.
Authors
- Hanshen Xiao, MIT
- Zihang Xiang, KAUST
- Di Wang, KAUST
- Srinivas Devadas, MIT
Abstract
We study the bias introduced in Differentially-Private Stochastic Gradient Descent (DP-SGD) with clipped or normalized per-sample gradients. As one of the most popular but artificial operations to ensure bounded sensitivity, gradient clipping enables composite privacy analysis of many iterative optimization methods without additional assumptions on either learning models or input data. Despite its wide applicability, gradient clipping also presents theoretical challenges in systematically instructing improvement of privacy or utility. In general, without an assumption of a globally-bounded gradient, classic convergence analyses do not apply to clipped gradient descent. Further, given the limited understanding of the utility loss, many existing improvements to DP-SGD are heuristic, especially in the applications of private deep learning. In this paper, we provide a meaningful theoretical analysis of DP-SGD, validated by thorough empirical results. We point out that the bias caused by gradient clipping is underestimated in previous works. For generic non-convex optimization via DP-SGD, we show that one key factor contributing to the bias is the sampling noise of the stochastic gradient to be clipped. Accordingly, we use the developed theory to build a series of improvements for sampling-noise reduction from various perspectives. From an optimization angle, we study variance reduction techniques and propose inner-outer momentum. At the learning model (neural network) level, we propose several tricks to enhance network internal normalization, and BatchClipping to carefully clip the gradient of a batch of samples. For data preprocessing, we provide theoretical justification of recently proposed improvements via data normalization and (self-)augmentation. Putting these systematic improvements together, private deep learning via DP-SGD can be significantly strengthened in many tasks.
For example, in computer vision applications, with an (ϵ = 8, δ = 10⁻⁵) DP guarantee, we successfully train ResNet20 on CIFAR10 and SVHN with test accuracy 76.0% and 90.1%, respectively; for natural language processing, with (ϵ = 4, δ = 10⁻⁵), we successfully train a recurrent neural network on IMDb data with test accuracy 77.5%.
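For readers unfamiliar with the clipping operation under discussion, a standard DP-SGD aggregation step (not the paper's improved variants such as inner-outer momentum or BatchClipping; all parameter values are illustrative) can be sketched in a few lines of NumPy:

```python
import numpy as np

def dp_sgd_step(per_sample_grads, clip_norm=1.0, noise_multiplier=1.0, rng=None):
    """One DP-SGD aggregation step: clip each per-sample gradient to
    L2 norm <= clip_norm, average, then add calibrated Gaussian noise."""
    rng = rng or np.random.default_rng(0)
    clipped = []
    for g in per_sample_grads:
        norm = np.linalg.norm(g)
        # Scale down any gradient whose norm exceeds the clipping threshold;
        # this bounds per-sample sensitivity but biases the average.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm / len(per_sample_grads),
                       size=avg.shape)
    return avg + noise

grads = [np.array([3.0, 4.0]), np.array([0.3, 0.4])]  # norms 5.0 and 0.5
noisy = dp_sgd_step(grads, clip_norm=1.0)
# The first gradient is scaled by 1/5; the second is left untouched.
```

The bias the paper analyzes comes precisely from the `min(1.0, clip_norm / norm)` scaling, whose effect depends on the sampling noise of the stochastic gradients being clipped.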
Links
Continual Observation under User-level Differential Privacy.
Authors
- Wei Dong, Hong Kong University of Science and Technology, Hong Kong SAR, China
- Qiyao Luo, Hong Kong University of Science and Technology, Hong Kong SAR, China
- Ke Yi, Hong Kong University of Science and Technology, Hong Kong SAR, China
Abstract
In the foundational work of Dwork et al. [15] on continual observation under differential privacy (DP), two privacy models have been proposed: event-level DP and user-level DP. The latter provides a much stronger notion of privacy, as it allows a user to contribute an arbitrary number of items. Under event-level DP, their mechanisms match the optimal utility bounds in the static setting up to polylogarithmic factors for all union-preserving functions. Unfortunately, in contrast to this strong result for event-level DP, their user-level DP mechanisms have weak utility guarantees and many restrictions on the data. In this paper, we take an instance-specific approach, designing continual observation mechanisms for a number of fundamental functions under user-level DP. Our mechanisms do not need any a priori restrictions on the data, while providing utility guarantees that degrade gracefully as the hardness of the data increases. For the count and sum function, our mechanisms are down-neighborhood optimal, matching the static setting up to polylogarithmic factors. For other functions, they do not match the static case, but we prove that this is inevitable, which is the first separation result for continual observation under differential privacy.
Links
Locally Differentially Private Frequency Estimation Based on Convolution Framework.
Authors
- Huiyu Fang, School of Cyber Science and Engineering, Southeast University, Nanjing, China
- Liquan Chen, School of Cyber Science and Engineering, Southeast University, Nanjing, China
- Yali Liu, School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, China
- Yuan Gao, School of Cyber Science and Engineering, Southeast University, Nanjing, China
Abstract
Local differential privacy (LDP) collects user data while protecting user privacy and eliminating the need for a trusted data collector. Several LDP protocols have been proposed and deployed in real-world applications. Frequency estimation is a fundamental task in LDP protocols, which enables more advanced tasks in data analytics. However, the existing LDP protocols amplify the added noise in estimating the frequencies and therefore do not achieve optimal accuracy. This paper introduces a convolution framework to analyze and optimize the estimated frequencies of LDP protocols. The convolution framework can equivalently transform the original frequency estimation problem into a deconvolution problem with noise. We thus add Wiener filter-based deconvolution algorithms to LDP protocols to estimate the frequency while suppressing the added noise. Experimental results on different real-world datasets demonstrate that our proposed algorithms can improve the accuracy of state-of-the-art LDP protocols by orders of magnitude on smooth datasets. These algorithms also work on non-smooth datasets, though only to a limited extent. Our code is available at https://github.com/SEUNICK/LDP.
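The framework's core move, treating the observed frequency estimate as a noisy convolution and inverting it with a Wiener filter, can be illustrated on a toy circular-convolution example (a generic Wiener deconvolution sketch, not the paper's exact algorithm; the kernel and noise-to-signal ratio are our illustrative choices):

```python
import numpy as np

def wiener_deconvolve(observed, kernel, nsr=0.1):
    """Recover a frequency vector from its (noisy) circular convolution with
    `kernel`, using a Wiener filter in the Fourier domain. `nsr` is an assumed
    noise-to-signal power ratio (illustrative constant)."""
    H = np.fft.fft(kernel)
    Y = np.fft.fft(observed)
    # Wiener filter: conj(H) / (|H|^2 + NSR); degrades gracefully where H ~ 0.
    G = np.conj(H) / (np.abs(H) ** 2 + nsr)
    return np.real(np.fft.ifft(G * Y))

# A peaked "true" frequency histogram, blurred by a circular smoothing kernel.
true_freq = np.array([0.0, 0.1, 0.6, 0.2, 0.1, 0.0])
kernel = np.array([0.5, 0.25, 0.0, 0.0, 0.0, 0.25])
observed = np.real(np.fft.ifft(np.fft.fft(true_freq) * np.fft.fft(kernel)))
recovered = wiener_deconvolve(observed, kernel, nsr=1e-3)
# `recovered` is closer to `true_freq` than `observed` is.
```

In the paper's setting the "kernel" role is played by the noise structure a given LDP protocol induces on the estimated frequencies, rather than a hand-picked blur.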
Links
Telepath: A Minecraft-based Covert Communication System.
Authors
- Zhen Sun, Cornell Tech
- Vitaly Shmatikov, Cornell Tech
Abstract
Covert, censorship-resistant communication in the presence of nation-state adversaries requires unobservable channels whose operation is difficult to detect via network-traffic analysis. Traffic substitution, i.e., replacing data transmitted by a "cover" application with covert content, takes advantage of already-existing encrypted channels to produce traffic that is statistically indistinguishable from the traffic of the cover application and thus difficult to censor. Online games are a promising platform for building circumvention channels due to their popularity in many censored regions. We show, however, that previously proposed traffic substitution methods cannot be directly applied to games. Their traces, even if statistically similar to game traces, may violate game-specific invariants and are thus easy to detect because they could not have been generated by actual gameplay. We explain how to identify non-disruptive content whose substitution does not result in client-server inconsistencies and use these ideas to design and implement Telepath, a covert communication system that uses Minecraft as the platform. Telepath takes advantage of (1) Minecraft’s encrypted client-server channel, (2) a decentralized architecture that enables individual users to run their own servers, and (3) the popularity of "mods" that add functionality to Minecraft clients and servers. Telepath runs a Minecraft game but substitutes non-disruptive in-game messages with covert content, without changing the game’s interaction with the network manager. We measure the performance of Telepath for Web browsing and audio streaming, and show that network traffic generated by Telepath resists statistical traffic analysis that aims to distinguish it from popular Minecraft bots.
Links
Discop: Provably Secure Steganography in Practice Based on "Distribution Copies".
Authors
- Jinyang Ding, University of Science and Technology of China
- Kejiang Chen, University of Science and Technology of China
- Yaofei Wang, Hefei University of Technology
- Na Zhao, University of Science and Technology of China
- Weiming Zhang, University of Science and Technology of China
- Nenghai Yu, University of Science and Technology of China
Abstract
Steganography is the act of disguising the transmission of secret information as seemingly innocent communication. Although provably secure steganography has been studied for decades, it has not become mainstream because its strict requirements (such as a perfect sampler and an explicit data distribution) are challenging to satisfy in traditional data environments. The growing popularity of deep generative models provides an excellent opportunity to solve this problem. Several methods attempting to achieve provably secure steganography based on deep generative models have been proposed in recent years. However, they cannot achieve the expected security in practice due to unrealistic conditions, such as the balanced grouping of discrete elements and a perfect match between the message and channel distributions. In this paper, we propose a new provably secure steganography method in practice named Discop, which constructs several "distribution copies" during the generation process. At each time step of generation, the message determines from which "distribution copy" to sample. As long as the receiver agrees on some shared information with the sender, they can extract the message without error. To further improve the embedding rate, we recursively construct more "distribution copies" by creating Huffman trees. We prove that Discop strictly maintains the original distribution, so that the adversary cannot do better than random guessing. Moreover, we conduct experiments on multiple generation tasks for diverse digital media, and the results show that Discop’s security and efficiency outperform those of previous methods.
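A minimal sketch of the "distribution copies" idea with just two copies, i.e. one bit per step (our own reading: the inverse-CDF partition of the unit interval is rotated by 1/2; the Huffman-tree recursion and the real system's shared-randomness machinery are omitted). Because the shared value u is uniform, each copy induces exactly the original token distribution, which is the distribution-preservation property the paper proves.

```python
import bisect

def cdf(probs):
    out, acc = [], 0.0
    for p in probs:
        acc += p
        out.append(acc)
    return out

def token_for(c, v):
    # Inverse-CDF lookup on the unit interval (v is wrapped mod 1).
    return bisect.bisect_right(c, v % 1.0)

def embed_bit(probs, u, bit):
    """Embed one bit by sampling from one of two 'distribution copies'
    (the interval partition rotated by 1/2). u is shared randomness in
    [0, 1). Returns (token, carried); no bit is carried when both
    copies yield the same token."""
    c = cdf(probs)
    t0, t1 = token_for(c, u), token_for(c, u + 0.5)
    if t0 == t1:
        return t0, False
    return (t1 if bit else t0), True

def extract_bit(probs, u, token):
    """Receiver recomputes both candidate tokens from the shared u."""
    c = cdf(probs)
    t0, t1 = token_for(c, u), token_for(c, u + 0.5)
    if t0 == t1:
        return None
    return 1 if token == t1 else 0
```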
Links
SQUIP: Exploiting the Scheduler Queue Contention Side Channel.
Authors
- Stefan Gast, Lamarr Security Research; Graz University of Technology
- Jonas Juffinger, Lamarr Security Research; Graz University of Technology
- Martin Schwarzl, Graz University of Technology
- Gururaj Saileshwar, Georgia Institute of Technology
- Andreas Kogler, Graz University of Technology
- Simone Franza, Graz University of Technology
- Markus Köstl, Graz University of Technology
- Daniel Gruss, Graz University of Technology
Abstract
Modern superscalar CPUs have multiple execution units that independently execute operations from the instruction stream. Previous work has shown that numerous side channels exist around these out-of-order execution pipelines, particularly for an attacker running on an SMT core. In this paper, we present the SQUIP attack, the first side-channel attack on scheduler queues, which are critical for deciding the schedule of instructions to be executed in superscalar CPUs. Scheduler queues have not been explored as a side channel so far, as Intel CPUs only have a single scheduler queue, and contention thereof would be virtually the same as contention of the reorder buffer. However, the Apple M1, AMD Zen 2, and Zen 3 microarchitectures have separate scheduler queues per execution unit. We first reverse-engineer the behavior of the scheduler queues on these CPUs and show that they can be primed and probed. The SQUIP attack observes the occupancy level from within the same hardware core and across SMT threads. We evaluate the performance of the SQUIP attack in a covert channel, exfiltrating 0.89 Mbit/s from a co-located virtual machine and 2.70 Mbit/s from a co-located process, each at an error rate below 0.8%. We then demonstrate the side channel on an mbedTLS RSA signature process in a co-located process and in a co-located virtual machine. Our attack recovers full RSA-4096 keys with only 50,500 traces and fewer than 5 to 18 bit errors on average. Finally, we discuss the mitigations necessary, especially for Zen 2 and Zen 3 systems, to prevent our attacks.
Links
Scatter and Split Securely: Defeating Cache Contention and Occupancy Attacks.
Authors
- Lukas Giner, Graz University of Technology
- Stefan Steinegger, Graz University of Technology
- Antoon Purnal, Imec-COSIC, KU Leuven
- Maria Eichlseder, Graz University of Technology
- Thomas Unterluggauer, Intel Corporation
- Stefan Mangard, Graz University of Technology
- Daniel Gruss, Graz University of Technology
Abstract
In this paper, we propose SassCache, a secure skewed associative cache with keyed index mapping. For this purpose, we design a new two-layered, low-latency cryptographic construction with configurable output coverage based on state-of-the-art cryptographic primitives. Based on this construction, SassCache is the first secure randomized cache with secure spacing. Victim cache lines automatically hide in locations the attacker cannot reach after less than one access on average. Consequently, attackers cannot evict the cache line, no matter which and how many memory accesses they perform. Our security analysis shows that all existing techniques for eviction-set construction fail, and state-of-the-art attacks apply to only 1 in 3 million addresses, where SassCache is still as secure as ScatterCache. Compared to standard caches, SassCache has a single-threaded performance penalty of 1.75% on the last-level cache hit rate in the SPEC2017 benchmark, and an average decrease of 11.7 p.p. in hit rate for MiBench, GAP, and SciMark under our high-security settings.
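The keyed, per-way (skewed) index mapping can be sketched as follows. This is a stand-in, not the paper's construction: HMAC-SHA256 plays the role of the low-latency two-layer PRF, and the way/set parameters are illustrative.

```python
import hmac, hashlib

def way_indices(key: bytes, addr: int, ways: int = 8, sets: int = 1024):
    """Keyed, skewed index-mapping sketch: each way gets its own
    pseudorandom set index derived from a PRF of (key, address, way).
    HMAC-SHA256 stands in for the paper's low-latency construction."""
    out = []
    for w in range(ways):
        tag = hmac.new(key, addr.to_bytes(8, "big") + bytes([w]),
                       hashlib.sha256).digest()
        out.append(int.from_bytes(tag[:4], "big") % sets)
    return out
```

The security-relevant property illustrated here is that without the key an attacker cannot predict which sets an address maps to, and rekeying remaps every address at once.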
Links
DevIOus: Device-Driven Side-Channel Attacks on the IOMMU.
Authors
- Taehun Kim, Korea University
- Hyeongjin Park, Korea University
- Seokmin Lee, Korea University
- Seunghee Shin, The State University of New York at Binghamton
- Junbeom Hur, Korea University
- Youngjoo Shin, Korea University
Abstract
Modern computer systems take advantage of the Input/Output Memory Management Unit (IOMMU) to protect memory from DMA attacks and to achieve strong isolation in virtualization. Despite its promising benefits, the IOMMU can be a new source of security threats. Like the MMU, the IOMMU has a Translation Lookaside Buffer (TLB), named the IOTLB, an address translation cache that keeps recent translations. Accordingly, the IOTLB can be the target of a timing side-channel attack that reveals a victim’s secrets. In this paper, we present DevIOus, a novel device-driven side-channel attack exploiting the IOTLB. DevIOus employs DMA-capable PCIe devices, such as GPUs and RDMA-enabled NICs (RNICs), to deliver the attack. Thus, our attack has no influence on CPU caches or the TLB in a victim’s machine. Implementing DevIOus is not trivial, as the microarchitectural internals of the IOTLB in Intel processors are hidden. We overcome this by reverse-engineering the IOTLB and disclosing its hidden architectural properties. Based on this, we construct two IOTLB-based timing attack primitives using a GPU and an RNIC. Then, we demonstrate practical attacks that target co-located VMs under hardware-assisted isolation, as well as remote machines connected over an RDMA network. We also discuss possible mitigations against the proposed side-channel attack.
Links
DVFS Frequently Leaks Secrets: Hertzbleed Attacks Beyond SIKE, Cryptography, and CPU-Only Data.
Authors
- Yingchen Wang, University of Texas at Austin
- Riccardo Paccagnella, University of Illinois Urbana-Champaign
- Alan Wandke, University of Illinois Urbana-Champaign
- Zhao Gang, University of Texas at Austin
- Grant Garrett-Grossman, University of Illinois Urbana-Champaign
- Christopher W. Fletcher, University of Illinois Urbana-Champaign
- David Kohlbrenner, University of Washington
- Hovav Shacham, University of Texas at Austin
Abstract
The recent Hertzbleed disclosure demonstrates how remote-timing analysis can reveal secret information previously only accessible to local power analysis. At worst, this constitutes a fundamental break of constant-time programming principles and the many deployed programs that rely on them. But all hope is not lost. Hertzbleed relies on a coarse-grained, noisy channel that is difficult to exploit. Indeed, the Hertzbleed paper required bespoke cryptanalysis to attack a specific cryptosystem (SIKE). Thus, it remains unclear whether Hertzbleed represents a threat to the broader security ecosystem. In this paper, we demonstrate that Hertzbleed’s effects are wide-ranging, affecting not only cryptosystems beyond SIKE, but also programs beyond cryptography, and even computations occurring outside the CPU cores. First, we demonstrate how latent gadgets in other cryptosystem implementations, specifically "constant-time" ECDSA and Classic McEliece, can be combined with existing cryptanalysis to bootstrap Hertzbleed attacks on those cryptosystems. Second, we demonstrate how power consumption on the integrated GPU influences frequency on the CPU, and how this can be used to perform the first cross-origin pixel-stealing attacks leveraging "constant-time" SVG filters on Google Chrome.
Links
A Security RISC: Microarchitectural Attacks on Hardware RISC-V CPUs.
Authors
- Lukas Gerlach, CISPA Helmholtz Center for Information Security
- Daniel Weber, CISPA Helmholtz Center for Information Security
- Ruiyi Zhang, CISPA Helmholtz Center for Information Security
- Michael Schwarz, CISPA Helmholtz Center for Information Security
Abstract
Microarchitectural attacks threaten the security of computer systems even in the absence of software vulnerabilities. Such attacks are well explored on x86 and ARM CPUs, with a wide range of proposed but not-yet-deployed hardware countermeasures. With the standardization of the RISC-V instruction set architecture and the announcement of support for the architecture by major processor vendors, RISC-V CPUs are on the verge of becoming ubiquitous. However, the microarchitectural attack surface of the first commercially available RISC-V hardware CPUs remains unexplored. This paper analyzes the two commercially available off-the-shelf 64-bit RISC-V (hardware) CPUs used in most RISC-V systems running a full-fledged commodity Linux system. We evaluate the microarchitectural attack surface and introduce 3 new microarchitectural attack techniques: Cache+Time, a novel cache-line-granular cache attack without shared memory; Flush+Fault, which exploits the Harvard cache architecture for Flush+Reload; and CycleDrift, which exploits unprivileged access to instruction-retirement information. We also show that many known attacks apply to these RISC-V CPUs, mainly due to missing hardware countermeasures and instruction-set subtleties that do not consider the microarchitectural attack surface. We demonstrate our attacks in 6 case studies, including the first RISC-V-specific microarchitectural KASLR break and a CycleDrift-based method for detecting kernel activity. Based on our analysis, we stress the need to consider the microarchitectural attack surface during every step of a CPU design, including custom ISA extensions.
Links
Examining Zero-Shot Vulnerability Repair with Large Language Models.
Authors
- Hammond Pearce, New York University
- Benjamin Tan, University of Calgary
- Baleegh Ahmad, New York University
- Ramesh Karri, New York University
- Brendan Dolan-Gavitt, New York University
Abstract
Human developers can produce code with cybersecurity bugs. Can emerging ‘smart’ code completion tools help repair those bugs? In this work, we examine the use of large language models (LLMs) for code (such as OpenAI’s Codex and AI21’s Jurassic J-1) for zero-shot vulnerability repair. We investigate challenges in the design of prompts that coax LLMs into generating repaired versions of insecure code. This is difficult due to the numerous ways to phrase key information, both semantically and syntactically, in natural language. We perform a large-scale study of five commercially available, black-box, "off-the-shelf" LLMs, as well as an open-source model and our own locally trained model, on a mix of synthetic, hand-crafted, and real-world security bug scenarios. Our experiments demonstrate that while the approach has promise (the LLMs could collectively repair 100% of our synthetically generated and hand-crafted scenarios), a qualitative evaluation of the models’ performance over a corpus of historical real-world examples highlights challenges in generating functionally correct code.
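One way such a repair prompt can be assembled is sketched below. The template, function name, and comment conventions are our own hypothetical choices, not the prompts evaluated in the paper.

```python
def build_repair_prompt(code: str, bug_hint: str, language: str = "c") -> str:
    """Assemble a hypothetical zero-shot repair prompt: the insecure
    code, a natural-language hint about the bug, and a cue for the
    model to regenerate a fixed version (illustrative template only)."""
    comment = "//" if language in ("c", "cpp", "java") else "#"
    return (
        f"{comment} The following code contains a security bug: {bug_hint}\n"
        f"{code}\n"
        f"{comment} Fixed version of the code:\n"
    )
```

The paper's point is precisely that small variations in such templates (wording of the hint, placement of the cue) can change repair success, which is why prompt design is studied systematically.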
Links
Callee: Recovering Call Graphs for Binaries with Transfer and Contrastive Learning.
Authors
- Wenyu Zhu, Tsinghua University, Beijing, China; BNRist
- Zhiyao Feng, Tsinghua University, Beijing, China; BNRist
- Zihan Zhang, Tsinghua University, Beijing, China; BNRist
- Jianjun Chen, Tsinghua University, Beijing, China; Zhongguancun Laboratory
- Zhijian Ou, Tsinghua University, Beijing, China
- Min Yang, Fudan University, Shanghai, China
- Chao Zhang, Tsinghua University, Beijing, China; BNRist; Zhongguancun Laboratory
Abstract
Recovering binary programs’ call graphs is crucial for inter-procedural analysis tasks and the applications based on them. One of the core challenges is recognizing the targets of indirect calls (i.e., indirect callees). Existing solutions all have high false positives and negatives, making call graphs inaccurate. In this paper, we propose a new solution, Callee, combining transfer learning and contrastive learning. The key insight is that deep neural networks (DNNs) can automatically identify patterns concerning indirect calls. Inspired by advances in question-answering applications, we utilize contrastive learning to answer the callsite-callee question. However, one of the toughest challenges is that DNNs need large datasets to achieve high performance, while collecting large-scale indirect-call ground truth can be computationally expensive. Therefore, we leverage transfer learning to pre-train DNNs with easy-to-collect direct calls and further fine-tune them for indirect calls. We evaluate Callee on several groups of targets, and the results show that our solution matches callsites to callees with an F1-measure of 94.6%, much better than state-of-the-art solutions. Further, we apply Callee to two applications, binary code similarity detection and hybrid fuzzing, and find that it greatly improves their performance.
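The callsite-callee contrastive objective can be sketched as an InfoNCE-style loss over a batch of embeddings, where row i of the callsite batch should match row i of the callee batch and all other rows act as negatives. This is a generic formulation of contrastive learning, not Callee's actual model.

```python
import numpy as np

def info_nce_loss(callsite_emb, callee_emb, temperature=0.07):
    """Contrastive (InfoNCE-style) loss sketch: cosine similarities are
    turned into logits, and each callsite must assign high probability
    to its own callee against in-batch negatives."""
    a = callsite_emb / np.linalg.norm(callsite_emb, axis=1, keepdims=True)
    b = callee_emb / np.linalg.norm(callee_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits -= logits.max(axis=1, keepdims=True)       # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Pre-training on plentiful direct-call pairs and fine-tuning on scarce indirect-call pairs, as the abstract describes, changes only where the (callsite, callee) pairs come from, not this objective.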
Links
XFL: Naming Functions in Binaries with Extreme Multi-label Learning.
Authors
- James Patrick-Evans, Research Institute CODE, Bundeswehr University, Munich, Germany; Information Security Group, Royal Holloway, University of London, United Kingdom
- Moritz Dannehl, Research Institute CODE, Bundeswehr University, Munich, Germany
- Johannes Kinder, Research Institute CODE, Bundeswehr University, Munich, Germany
Abstract
Reverse engineers benefit from the presence of identifiers such as function names in a binary, but usually these are removed for release. Training a machine learning model to predict function names automatically is promising but fundamentally hard: unlike words in natural language, most function names occur only once. In this paper, we address this problem by introducing eXtreme Function Labeling (XFL), an extreme multi-label learning approach to selecting appropriate labels for binary functions. XFL splits function names into tokens, treating each as an informative label akin to the problem of tagging texts in natural language. We relate the semantics of binary code to labels through Dexter, a novel function embedding that combines static analysis-based features with local context from the call graph and global context from the entire binary. We demonstrate that XFL/Dexter outperforms the state of the art in function labeling on a dataset of 10,047 binaries from the Debian project, achieving a precision of 83.5%. We also study combinations of XFL with alternative binary embeddings from the literature and show that Dexter consistently performs best for this task. As a result, we demonstrate that binary function labeling can be effectively phrased in terms of multi-label learning, and that binary function embeddings benefit from including explicit semantic features.
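The label-extraction step, splitting function names into reusable tokens, can be sketched as follows (the tokenization rules here, splitting on underscores and camelCase, are our own approximation of the general idea).

```python
import re

def name_to_labels(fn_name: str) -> set:
    """Split a function name into multi-label tokens (snake_case and
    camelCase aware), the kind of label space an XFL-style multi-label
    learner predicts over."""
    parts = re.split(r"[_\W]+", fn_name)
    tokens = []
    for p in parts:
        # Acronyms (HTTP), capitalized words (Parse), lowercase runs, digits.
        tokens += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", p)
    return {t.lower() for t in tokens if t}
```

Treating each token as an independent label is what turns the "most names occur only once" problem into a tractable multi-label task: rare names share common tokens.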
Links
D-ARM: Disassembling ARM Binaries by Lightweight Superset Instruction Interpretation and Graph Modeling.
Authors
- Yapeng Ye, Purdue University
- Zhuo Zhang, Purdue University
- Qingkai Shi, Purdue University
- Yousra Aafer, University of Waterloo
- Xiangyu Zhang, Purdue University
Abstract
ARM binary analysis has a wide range of applications in ARM system security. A fundamental challenge is ARM disassembly. ARM, particularly AArch32, has a number of unique features that make its disassembly distinct from x86 disassembly, such as the mixing of ARM and Thumb instruction modes, implicit mode switching within an application, and more prevalent use of inlined data. Existing techniques cannot achieve high accuracy when binaries become complex or have undergone obfuscation. We propose a novel ARM binary disassembly technique that is particularly designed to address challenges in legacy code for 32-bit ARM binaries. It features a lightweight superset instruction interpretation method to derive rich semantic information and a graph-theory-based method that aggregates such information to produce final results. Our comparative evaluation against a number of state-of-the-art disassemblers, including Ghidra, IDA, P-Disasm, XDA, D-Disasm, and Spedi, on thousands of binaries generated from SPEC2000 and SPEC2006 with various settings, as well as real-world applications collected online, shows that our technique D-ARM substantially outperforms the baselines.
Links
GraphSPD: Graph-Based Security Patch Detection with Enriched Code Semantics.
Authors
- Shu Wang, George Mason University
- Xinda Wang, George Mason University
- Kun Sun, George Mason University
- Sushil Jajodia, George Mason University
- Haining Wang, Virginia Tech
- Qi Li, Tsinghua University
Abstract
With the increasing popularity of open-source software, embedded vulnerabilities have widely propagated to downstream software. Due to different maintenance policies, software vendors may silently release security patches without providing sufficient advisories (e.g., CVEs). This leaves users unaware of security patches and gives attackers ample opportunity to exploit unpatched vulnerabilities. Thus, detecting those silent security patches becomes imperative for secure software maintenance. In this paper, we propose a graph neural network based security patch detection system named GraphSPD, which represents patches as graphs with richer semantics and utilizes a patch-tailored graph model for detection. We first develop a novel graph structure called PatchCPG to represent software patches by merging two code property graphs (CPGs) for the pre-patch and post-patch source code, retaining the context, deleted, and added components of the patch. By applying a slicing technique, we retain the most relevant context and reduce the size of PatchCPG. Then, we develop the first end-to-end deep learning model, called PatchGNN, to determine whether a patch is security-related directly from its graph-structured PatchCPG. PatchGNN includes a new embedding process to convert PatchCPG into a numeric format and a new multi-attributed graph convolution mechanism to adapt to the diverse relationships in PatchCPG. The experimental results show that GraphSPD significantly outperforms the state-of-the-art approaches on security patch detection.
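The core merge behind the PatchCPG idea can be sketched with plain edge sets: edges present in both the pre-patch and post-patch graphs are context, edges only in the pre-patch graph are deleted, and edges only in the post-patch graph are added. This simplification ignores CPG node attributes and the slicing step.

```python
def merge_cpgs(pre_edges, post_edges):
    """Merge pre-/post-patch graphs into a single patch graph whose
    edges are tagged context / deleted / added (a set-based sketch of
    the PatchCPG merge, not the paper's full construction)."""
    pre, post = set(pre_edges), set(post_edges)
    return (
        [(u, v, "context") for u, v in sorted(pre & post)] +
        [(u, v, "deleted") for u, v in sorted(pre - post)] +
        [(u, v, "added") for u, v in sorted(post - pre)]
    )
```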
Links
Effective ReDoS Detection by Principled Vulnerability Modeling and Exploit Generation.
Authors
- Xinyi Wang, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, CAS; School of Cyber Security, University of Chinese Academy of Sciences
- Cen Zhang, Nanyang Technological University
- Yeting Li, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, CAS; School of Cyber Security, University of Chinese Academy of Sciences
- Zhiwu Xu, Shenzhen University
- Shuailin Huang, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, CAS; School of Cyber Security, University of Chinese Academy of Sciences
- Yi Liu, Nanyang Technological University
- Yican Yao, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, CAS; School of Cyber Security, University of Chinese Academy of Sciences
- Yang Xiao, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, CAS; School of Cyber Security, University of Chinese Academy of Sciences
- Yanyan Zou, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, CAS; School of Cyber Security, University of Chinese Academy of Sciences
- Yang Liu, Nanyang Technological University
- Wei Huo, {CAS-KLONAT, BKLONSPT}, Institute of Information Engineering, CAS; School of Cyber Security, University of Chinese Academy of Sciences
Abstract
Regular expression Denial-of-Service (ReDoS) is a kind of algorithmic complexity attack. For a vulnerable regex, attackers can craft strings that trigger super-linear worst-case matching time, causing denial of service in regex engines. Various ReDoS detection approaches have been proposed recently. Among them, hybrid approaches, which absorb the advantages of both static and dynamic approaches, have shown superior performance. However, two key challenges still hinder detection effectiveness: 1) existing modelings summarize localized vulnerability patterns based on partial features of the vulnerable regex; 2) existing attack-string generation strategies are ineffective because they neglect the fact that non-vulnerable parts of the regex may unexpectedly invalidate the attack string (we call this kind of invalidation disturbance). Rengar is our hybrid ReDoS detector with new vulnerability modeling and a disturbance-free attack-string generator. It has the following key features: 1) By summarizing patterns from the full features of the vulnerable regex, its modeling more precisely captures the root cause of ReDoS vulnerabilities; the modeling is more descriptive and precise than the union of existing modelings while remaining concise. 2) For each vulnerable regex, its generator automatically checks all potential disturbances and composes generation constraints to avoid them. Compared with nine state-of-the-art tools, Rengar detects not only all vulnerable regexes they found but also 3 to 197 times more vulnerable regexes. Besides, it saves 57.41% to 99.83% of average detection time compared with tools containing a dynamic validation process. Using Rengar, we have identified 69 zero-day vulnerabilities (21 CVEs) affecting popular projects with weekly download counts in the tens of millions.
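The super-linear blow-up can be made concrete by counting the attempts a naive backtracking matcher makes on the classic vulnerable pattern (a+)+b against an input of n 'a's with no trailing 'b' (a toy model of backtracking, not Rengar's analysis): every way of splitting the a-run into non-empty (a+) groups is tried before the overall match fails.

```python
def backtrack_steps(n: int) -> int:
    """Count failed attempts of a naive backtracking matcher for
    (a+)+b on 'a' * n. Each composition of the run into non-empty
    groups reaches the end of input, still needs 'b', and fails."""
    steps = 0

    def match(i):
        nonlocal steps
        if i == n:                       # pattern still expects 'b': fail
            steps += 1
            return
        for j in range(i + 1, n + 1):    # next (a+) group consumes a[i:j]
            match(j)

    match(0)
    return steps
```

The count is the number of compositions of n, i.e. 2^(n-1), so each extra 'a' doubles the work, which is exactly the behavior a crafted attack string exploits.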
Links
SoK: Decentralized Finance (DeFi) Attacks.
Authors
- Liyi Zhou, Imperial College London; Berkeley Center for Responsible, Decentralized Intelligence (RDI)
- Xihan Xiong, Imperial College London
- Jens Ernstberger, Technical University of Munich; Berkeley Center for Responsible, Decentralized Intelligence (RDI)
- Stefanos Chaliasos, Imperial College London
- Zhipeng Wang, Imperial College London
- Ye Wang, University of Macau
- Kaihua Qin, Imperial College London; Berkeley Center for Responsible, Decentralized Intelligence (RDI)
- Roger Wattenhofer, ETH Zurich
- Dawn Song, University of California, Berkeley; Berkeley Center for Responsible, Decentralized Intelligence (RDI)
- Arthur Gervais, University College London; Berkeley Center for Responsible, Decentralized Intelligence (RDI)
Abstract
Within just four years, the blockchain-based Decentralized Finance (DeFi) ecosystem has accumulated a peak total value locked (TVL) of more than 253 billion USD. This surge in DeFi’s popularity has, unfortunately, been accompanied by many impactful incidents. According to our data, users, liquidity providers, speculators, and protocol operators suffered a total loss of at least 3.24 billion USD from Apr 30, 2018 to Apr 30, 2022. Given the blockchain’s transparency and the increasing incident frequency, two questions arise: How can we systematically measure, evaluate, and compare DeFi incidents? How can we learn from past attacks to strengthen DeFi security? In this paper, we introduce a common reference frame to systematically evaluate and compare DeFi incidents, including both attacks and accidents. We investigate 77 academic papers, 30 audit reports, and 181 real-world incidents. Our data reveals several gaps between academia and the practitioners’ community. For example, few academic papers address "price oracle attacks" and "permissionless interactions", while our data suggests that they are the two most frequent incident types (15% and 10.5%, respectively). We also investigate potential defenses, and find that: (i) 103 (56%) of the attacks are not executed atomically, granting defenders a rescue time frame; (ii) bytecode similarity analysis can detect at least 31 vulnerable and 23 adversarial contracts; and (iii) 33 (15.3%) of the adversaries leak potentially identifiable information by interacting with centralized exchanges.
Links
BlindHub: Bitcoin-Compatible Privacy-Preserving Payment Channel Hubs Supporting Variable Amounts.
Authors
- Xianrui Qin, The University of Hong Kong
- Shimin Pan, The University of Hong Kong
- Arash Mirzaei, Monash University
- Zhimei Sui, Monash University
- Oğuzhan Ersoy, Radboud University, Delft University of Technology
- Amin Sakzad, Monash University
- Muhammed F. Esgin, Monash University; CSIRO’s Data61
- Joseph K. Liu, Monash University
- Jiangshan Yu, Monash University
- Tsz Hon Yuen, The University of Hong Kong
Abstract
Payment Channel Hub (PCH) is a promising solution to the scalability issue of first-generation blockchains or cryptocurrencies such as Bitcoin. It supports off-chain payments between a sender and a receiver through an intermediary (called the tumbler). Relationship anonymity and value privacy are desirable features of privacy-preserving PCHs, which prevent the tumbler from identifying sender-receiver pairs as well as payment amounts. To our knowledge, all existing Bitcoin-compatible PCH constructions that guarantee relationship anonymity allow only a (predefined) fixed payment amount. Thus, to achieve payments with different amounts, they would require either multiple PCH systems or running one PCH system multiple times. Neither of these solutions would be deemed practical. In this paper, we propose the first Bitcoin-compatible PCH that achieves relationship anonymity and supports variable payment amounts. To achieve this, we present several layers of technical constructions, each of which could be of independent interest to the community. First, we propose BlindChannel, a novel bi-directional payment channel protocol for privacy-preserving payments, where one of the channel parties is unable to see the channel balances. Then, we further propose BlindHub, a three-party (sender, tumbler, receiver) protocol for private conditional payments, where the tumbler pays the receiver only if the sender pays the tumbler. The appealing additional feature of BlindHub is that the tumbler cannot link the sender and the receiver while supporting a variable payment amount. To construct BlindHub, we also introduce two new cryptographic primitives as building blocks, namely Blind Adaptor Signature (BAS) and Flexible Blind Conditional Signature (FBCS). BAS is an adaptor signature protocol built on top of a blind signature scheme. FBCS is a new cryptographic notion enabling us to provide an atomic and privacy-preserving PCH. Lastly, we instantiate both the BlindChannel and BlindHub protocols and present implementation results to show their practicality.
Links
Optimistic Fast Confirmation While Tolerating Malicious Majority in Blockchains.
Authors
- Ruomu Hou, National University of Singapore
- Haifeng Yu, National University of Singapore
Abstract
The robustness of a blockchain against the adversary is often characterized by the maximum fraction (${f_{\max }}$) of adversarial power that it can tolerate. While most existing blockchains can only tolerate ${f_{\max }} < \frac{1}{2}$ or lower, some blockchain systems are able to tolerate a malicious majority, namely ${f_{\max }} \geq \frac{1}{2}$. A key price paid by such blockchains, however, is their large confirmation latency. This work aims to significantly reduce the confirmation latency in such blockchains under the common case where the actual fraction $f$ of adversarial power is relatively small. To this end, we propose a novel blockchain called Flint. Flint tolerates ${f_{\max }} \geq \frac{1}{2}$ and gives optimistic execution (i.e., fast confirmation) whenever $f$ is relatively small. Our experiments show that fast confirmation in Flint takes only a few minutes, compared to several hours of confirmation latency in prior works.
Links
Clockwork Finance: Automated Analysis of Economic Security in Smart Contracts.
Authors
- Kushal Babel, Cornell Tech
- Philip Daian, Cornell Tech
- Mahimna Kelkar, Cornell Tech
- Ari Juels, Cornell Tech
Abstract
We introduce the Clockwork Finance Framework (CFF), a general-purpose formal verification framework for mechanized reasoning about the economic security properties of composed decentralized finance (DeFi) smart contracts. CFF features three key properties. It is contract complete, meaning that it can model any smart contract platform and all its contracts, Turing complete or otherwise. It does so with asymptotically constant model overhead. It is also attack-exhaustive by construction, meaning that it can automatically and mechanically extract all possible economic attacks on users’ cryptocurrency across modeled contracts. Thanks to these properties, CFF can support multiple goals: economic security analysis of contracts by developers, analysis of DeFi trading risks by users, fees UX, and optimization of arbitrage opportunities by bots or miners. Because CFF offers composability, it can support these goals with reasoning over any desired set of potentially interacting smart contract models. We instantiate CFF as an executable model for Ethereum contracts that incorporates a state-of-the-art deductive verifier. Building on previous work, we introduce extractable value (EV), a new formal notion of economic security in composed DeFi contracts that is both a basis for CFF and of general interest. We construct modular, human-readable, composable CFF models of four popular, deployed DeFi protocols on Ethereum: Uniswap, Uniswap V2, SushiSwap, and MakerDAO, representing a combined 24 billion USD in value as of March 2022. We use these models, along with other common models such as flash loans, airdrops, and voting, to show experimentally that CFF is practical and can derive useful, data-based EV insights from real-world transaction activity. Without any explicitly programmed attack strategies, CFF uncovers an average expected $56 million of EV per month in the recent past.
Links
Tyr: Finding Consensus Failure Bugs in Blockchain System with Behaviour Divergent Model.
Authors
- Yuanliang Chen, School of Software, Tsinghua University, KLISS, BNRist, Beijing, China
- Fuchen Ma, School of Software, Tsinghua University, KLISS, BNRist, Beijing, China
- Yuanhang Zhou, School of Software, Tsinghua University, KLISS, BNRist, Beijing, China
- Yu Jiang, School of Software, Tsinghua University, KLISS, BNRist, Beijing, China
- Ting Chen, University of Electronic Science and Technology of China, Chengdu, China
- Jiaguang Sun, School of Software, Tsinghua University, KLISS, BNRist, Beijing, China
Abstract
Blockchain is a decentralized distributed system on which a large number of financial applications have been deployed. The consensus process plays an important role in it, guaranteeing that legal transactions on the chain are executed and recorded fairly and consistently. However, because of Consensus Failure Bugs (CFBs), many blockchain systems do not provide even this basic guarantee. The validity and consistency of blockchain systems rely on the soundness of a complex consensus logic implementation, and any bug that causes blockchain consensus failure can be crucial. In this work, we introduce Tyr, an open-source tool for detecting CFBs in blockchain systems by exercising a large number of abnormal, divergent consensus behaviors. First, we design four oracle detectors to monitor the behaviors of nodes and analyze violations of consensus properties. To trigger these oracles effectively, Tyr harnesses a behavior divergent model to constantly generate consensus messages and make nodes behave as differently as possible. We implemented and evaluated Tyr on six widely used commercial blockchain consensus systems, including IBM Fabric, WeBank FISCO-BCOS, ConsenSys Quorum, Facebook Diem, Go-Ethereum, and EOS. Compared with the state-of-the-art tools Peach, Fluffy, and Twins, Tyr covers 27.3%, 228.2%, and 297.1% more branches, respectively. Furthermore, Tyr has detected 20 serious previously unknown vulnerabilities, all of which have been repaired by the corresponding maintainers.
Links
Leaking Arbitrarily Many Secrets: Any-out-of-Many Proofs and Applications to RingCT Protocols.
Authors
- Tianyu Zheng, Department of Computing, The Hong Kong Polytechnic University
- Shang Gao, Department of Computing, The Hong Kong Polytechnic University
- Yubo Song, School of Cyber Science and Engineering, Southeast University
- Bin Xiao, Department of Computing, The Hong Kong Polytechnic University
Abstract
Ring Confidential Transaction (RingCT) protocols are an effective cryptographic component for preserving the privacy of cryptocurrencies. However, existing RingCT protocols are instantiated from one-out-of-many proofs with only one secret, leading to low efficiency and weak anonymity when handling transactions with multiple inputs. Additionally, current partial knowledge proofs with multiple secrets are neither secure nor efficient enough to be applied in a RingCT protocol. In this paper, we propose a novel any-out-of-many proof, a logarithmic-sized zero-knowledge proof scheme for showing the knowledge of arbitrarily many secrets out of a public list. Unlike other partial knowledge proofs that have to reveal the number of secrets [ACF21], our approach proves the knowledge of multiple secrets without leaking their exact number. Furthermore, we improve the efficiency of our method with a generic inner-product transformation that adopts the Bulletproofs compression [BBB+18], which reduces the proof size to 2⌈log₂(N)⌉+9. Based on our proposed proof scheme, we further construct a compact RingCT protocol for privacy-preserving cryptocurrencies, which provides logarithmic-sized communication complexity for transactions with multiple inputs. More importantly, as the only known RingCT protocol instantiated from partial knowledge proofs, our protocol achieves the highest anonymity level compared with other approaches such as Omniring [LRR+19]. With some modifications, our protocol can also be applied to other settings such as multiple ring signatures. We believe our techniques are also applicable in other privacy-preserving scenarios, such as multiple ring signatures and coin-mixing in the blockchain.
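For concreteness, the quoted proof size grows only logarithmically in the ring size N. A quick check of the formula 2⌈log₂(N)⌉+9 (the helper name below is ours, not from the paper):

```python
import math

def proof_size(ring_size: int) -> int:
    """Proof size, in group elements, after the Bulletproofs-style
    inner-product compression: 2 * ceil(log2(N)) + 9."""
    return 2 * math.ceil(math.log2(ring_size)) + 9

for n in (64, 1024, 65536):
    print(n, proof_size(n))  # 64 -> 21, 1024 -> 29, 65536 -> 41
```

Even a 65,536-member ring needs only 41 group elements, which is what makes large anonymity sets practical.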
Links
Could you clean up the Internet with a Pit of Tar? Investigating tarpit feasibility on Internet worms.
Authors
- Harm Griffioen, Hasso Plattner Institute, University of Potsdam, Germany
- Christian Doerr, Hasso Plattner Institute, University of Potsdam, Germany
Abstract
Botnets often spread through massive Internet-wide scanning, identifying and infecting vulnerable Internet-facing devices to grow their networks. Taking down these networks is often hard for law enforcement, and tarpits have been proposed as a defensive method because they neither require seizing infrastructure nor rely on device owners to keep their devices well-configured and protected. These tarpits are network services that aim to keep a malware-infected device busy and to slow down or eradicate the malicious behavior. This paper identifies a network-based tarpit vulnerability in stateless-scanning malware and develops a tarpitting exploit. We apply this technique against malware based on the Mirai scanning routine to determine whether tarpitting at scale is effective in containing the spread of self-propagating malware. We demonstrate that even a single tarpit can effectively trap thousands of devices and that this significantly slows down botnet spreading across the Internet. We also provide a framework to simulate malware spreading under various network conditions, so the effect of tarpits on a particular malware can be evaluated a priori. We show that self-propagating malware could be contained with the help of a few thousand tarpits without any measurable adverse impact on compromised routers or Internet Service Providers, and we release our tarpitting solution as an open platform to the community.
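The intuition behind simulating tarpit deployments can be sketched with a toy discrete-time model (all parameters and the `simulate` helper are hypothetical illustrations, not the paper's actual framework):

```python
import random

def simulate(vulnerable=2000, address_space=100_000, tarpits=0,
             scans_per_tick=5, stuck_ticks=50, ticks=100, seed=7):
    """Toy model of a stateless-scanning worm. Addresses [0, tarpits)
    are tarpits; the next `vulnerable` addresses are vulnerable hosts.
    A bot that probes a tarpit stalls for `stuck_ticks` ticks.
    Returns the final infection count. (Approximate: every hit in the
    vulnerable range counts as a fresh infection, which is optimistic
    for the worm.)"""
    rng = random.Random(seed)
    bots = [0]  # per-bot stall counters; 0 means actively scanning
    infected = 1
    for _ in range(ticks):
        for i in range(len(bots)):  # snapshot: new bots scan next tick
            if bots[i] > 0:
                bots[i] -= 1
                continue
            for _ in range(scans_per_tick):
                addr = rng.randrange(address_space)
                if addr < tarpits:
                    bots[i] = stuck_ticks  # connection held open by a tarpit
                    break
                if addr < tarpits + vulnerable and infected < vulnerable:
                    infected += 1
                    bots.append(0)
    return infected

print(simulate(tarpits=0), simulate(tarpits=5000))
```

Even in this crude model, dedicating a few percent of the scanned address space to tarpits collapses the worm's effective scan rate, mirroring the paper's core observation.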
Links
Beyond Phish: Toward Detecting Fraudulent e-Commerce Websites at Scale.
Authors
- Marzieh Bitaab, Arizona State University
- Haehyun Cho, Soongsil University
- Adam Oest, PayPal, Inc.
- Zhuoer Lyu, Arizona State University
- Wei Wang, Palo Alto Networks
- Jorij Abraham, Scam Adviser
- Ruoyu Wang, Arizona State University
- Tiffany Bao, Arizona State University
- Yan Shoshitaishvili, Arizona State University
- Adam Doupé, Arizona State University
Abstract
Despite recent advancements in malicious website detection and phishing mitigation, the security ecosystem has paid little attention to Fraudulent e-Commerce Websites (FCWs), such as fraudulent shopping websites, fake charities, and cryptocurrency scam websites. Even worse, there are no active large-scale mitigation systems or publicly available datasets for FCWs. In this paper, we first propose an efficient and automated approach to gather FCWs through crowdsourcing. We identify eight different types of non-phishing FCWs and derive their key defining characteristics. We then find that anti-phishing mitigation systems, such as Google Safe Browsing, have a detection rate of just 0.46% on our dataset. We create a classifier, BEYOND PHISH, to identify FCWs using manually defined features based on our analysis. Validating BEYOND PHISH on never-before-seen data (neither trained nor tested on) through a user study indicates that our system achieves a high detection rate of 98.34% and a low false positive rate of 1.34%. Lastly, we collaborated with a major Internet security company, Palo Alto Networks, as well as a major financial services provider, to evaluate our classifier on manually labeled real-world data. The model achieves a 94.88% detection rate and a 2.46% false positive rate, showing potential for real-world defense against FCWs.
Links
Limits of I/O Based Ransomware Detection: An Imitation Based Attack.
Authors
- Chijin Zhou, BNRist, School of Software, Tsinghua University, Beijing, China
- Lihua Guo, BNRist, School of Software, Tsinghua University, Beijing, China
- Yiwei Hou, BNRist, School of Software, Tsinghua University, Beijing, China
- Zhenya Ma, BNRist, School of Software, Tsinghua University, Beijing, China
- Quan Zhang, BNRist, School of Software, Tsinghua University, Beijing, China
- Mingzhe Wang, BNRist, School of Software, Tsinghua University, Beijing, China
- Zhe Liu, NUAA, Computer Science and Technology, Nanjing, China
- Yu Jiang, BNRist, School of Software, Tsinghua University, Beijing, China
Abstract
By encrypting the data of infected hosts, cryptographic ransomware has caused billions of dollars in financial losses to a wide range of victims. Many detection techniques have been proposed to counter ransomware threats over the past decade. Their common approach is to monitor I/O behaviors from user space and apply custom heuristics to discriminate ransomware. These techniques implicitly assume that ransomware behaves very differently from benign programs in terms of these heuristics. However, when we investigated the behavior of benign and ransomware programs, we found that the boundary between their behaviors is blurred. A ransomware program can still achieve its goal even while following the behavior patterns of benign programs. In this paper, we aim to explore the limits of ransomware detection techniques that are based on I/O behaviors. To this end, we present Animagus, an imitation-based ransomware attack that imitates the behaviors of benign programs to disguise its encryption tasks. It first learns behavior patterns from a benign program, and then spawns and orchestrates child processes that perform encryption tasks while behaving the same as the benign program. We evaluate its effectiveness against six state-of-the-art detection techniques, and the results show that it can successfully evade these defenses. We investigate in detail why they are ineffective and how Animagus differs from existing ransomware samples. Finally, we discuss potential countermeasures and the benefits that detection tools can gain from our work.
Links
From Grim Reality to Practical Solution: Malware Classification in Real-World Noise.
Authors
- Xian Wu, Northwestern University
- Wenbo Guo, UC Berkeley
- Jia Yan, Penn State
- Baris Coskun, AWS
- Xinyu Xing, Northwestern University
Abstract
Malware datasets inevitably contain incorrect labels due to the shortage of expertise and experience needed for sample labeling. Previous research demonstrated that a training dataset with incorrectly labeled samples results in inaccurate model learning. To address this problem, researchers have proposed various noise learning methods to offset the impact of incorrectly labeled samples, and these methods have demonstrated great success in image recognition and text mining applications. In this work, we apply both representative and state-of-the-art noise learning methods to real-world malware classification tasks. We surprisingly observe that none of the existing methods could minimize the impact of incorrect labels. Through a carefully designed experiment, we discover that this inefficacy mainly results from extreme data imbalance and the high percentage of incorrectly labeled data samples. As such, we further propose a new noise learning method, named MORSE. Unlike existing methods, MORSE customizes and extends a state-of-the-art semi-supervised learning technique. It treats possibly incorrectly labeled data as unlabeled data and thus avoids their potential negative impact on model learning. In MORSE, we also integrate a sample re-weighting method that balances training data usage during model learning and thus handles the data imbalance challenge. We evaluate MORSE on both synthesized and real-world datasets. We show that MORSE significantly outperforms existing noise learning methods and minimizes the impact of incorrectly labeled data.
Links
SoK: History is a Vast Early Warning System: Auditing the Provenance of System Intrusions.
Authors
- Muhammad Adil Inam, University of Illinois at Urbana-Champaign
- Yinfang Chen, University of Illinois at Urbana-Champaign
- Akul Goyal, University of Illinois at Urbana-Champaign
- Jason Liu, University of Illinois at Urbana-Champaign
- Jaron Mink, University of Illinois at Urbana-Champaign
- Noor Michael, University of Illinois at Urbana-Champaign
- Sneha Gaur, University of Illinois at Urbana-Champaign
- Adam Bates, University of Illinois at Urbana-Champaign
- Wajih Ul Hassan, University of Virginia
Abstract
Auditing, a central pillar of operating system security, has only recently come into its own as an active area of public research. This resurgent interest is due in large part to the notion of data provenance, a technique that iteratively parses audit log entries into a dependency graph explaining the history of system execution. Provenance facilitates precise threat detection and investigation through causal analysis of sophisticated intrusion behaviors. However, the absence of a foundational audit literature, combined with the rapid publication of recent findings, makes it difficult to gain a holistic picture of advancements and open challenges in the area. In this work, we survey and categorize the provenance-based system auditing literature, distilling contributions into a layered taxonomy based on the audit log capture and analysis pipeline. Recognizing that the Reduction Layer remains a key obstacle to the further proliferation of causal analysis technologies, we delve further into this issue by conducting an ambitious independent evaluation of 8 exemplar reduction techniques against the recently released DARPA Transparent Computing datasets. Our experiments uncover that past approaches frequently prune an overlapping set of activities from audit logs, reducing the synergistic benefits of applying them in tandem; further, we observe an inverse relation between storage efficiency and anomaly detection performance. However, we also observe that log reduction techniques can synergize effectively with data compression, potentially reducing log retention costs by multiple orders of magnitude. We conclude by discussing promising future directions for the field.
Links
Collaborative Ad Transparency: Promises and Limitations.
Authors
- Eleni Gkiouzepi, Technical University of Berlin
- Athanasios Andreou, Algorithmic Transparency Institute
- Oana Goga, CNRS, Inria, Institut Polytechnique de Paris
- Patrick Loiseau, Inria, FairPlay Team
Abstract
Several targeted advertising platforms offer transparency mechanisms, but researchers and civil society organizations have repeatedly shown that these have major limitations. In this paper, we propose a collaborative ad transparency method to infer, without the cooperation of ad platforms, the targeting parameters used by advertisers to target their ads. Our idea is to ask users to donate data about their attributes and the ads they receive, and to use this data to infer the targeting attributes of an ad campaign. We propose a Maximum Likelihood Estimator based on a simplified Bernoulli ad delivery model. We first test our inference method through controlled ad experiments on Facebook. Then, to further investigate the potential and limitations of collaborative ad transparency, we propose a simulation framework that allows varying key parameters. We validate that our framework gives accuracies consistent with real-world observations, so that the insights from our simulations are transferable to the real world. We then perform an extensive simulation study for ad campaigns that target a combination of two attributes. Our results show that we can obtain good accuracy whenever at least ten monitored users receive an ad. This usually requires a few thousand monitored users, regardless of population size. Our simulation framework is based on a new method to generate a synthetic population with statistical properties resembling the actual population, which may be of independent interest.
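The inference idea can be sketched as a maximum-likelihood search over candidate attribute sets under a simplified Bernoulli delivery model (the attribute names, probabilities, and data below are hypothetical illustrations, not the paper's actual estimator):

```python
import math
from itertools import combinations

def log_likelihood(target, users, seen, p=0.3, eps=1e-3):
    """Bernoulli model: a user matching every attribute in `target`
    receives the ad with probability p; any other user only with a
    small noise probability eps."""
    ll = 0.0
    for attrs, got_ad in zip(users, seen):
        q = p if target <= attrs else eps
        ll += math.log(q if got_ad else 1.0 - q)
    return ll

def infer_targeting(attributes, users, seen, max_size=2):
    """Return the attribute combination that maximizes the likelihood
    of the donated observations."""
    candidates = [frozenset(c) for k in range(1, max_size + 1)
                  for c in combinations(sorted(attributes), k)]
    return max(candidates, key=lambda t: log_likelihood(t, users, seen))

# Donated data: each user's attribute set and whether they saw the ad.
users = [{"age<30", "male"}, {"age<30", "male", "US"},
         {"age<30"}, {"male"}, {"US"}]
seen = [True, True, False, False, False]
print(infer_targeting({"age<30", "male", "US"}, users, seen))
```

Here users who saw the ad are exactly those matching both "age<30" and "male", so the two-attribute candidate dominates the likelihood even though single-attribute candidates also cover the observed impressions.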
Links
Toss a Fault to Your Witcher: Applying Grey-box Coverage-Guided Mutational Fuzzing to Detect SQL and Command Injection Vulnerabilities.
Authors
- Erik Trickel, Arizona State University
- Fabio Pagani, University of California, Santa Barbara
- Chang Zhu, Arizona State University
- Lukas Dresel, University of California, Santa Barbara
- Giovanni Vigna, University of California, Santa Barbara
- Christopher Kruegel, University of California, Santa Barbara
- Ruoyu Wang, Arizona State University
- Tiffany Bao, Arizona State University
- Yan Shoshitaishvili, Arizona State University
- Adam Doupé, Arizona State University
Abstract
Black-box web application vulnerability scanners attempt to automatically identify vulnerabilities in web applications without access to the source code. However, they do so by using a manually curated list of vulnerability-inducing inputs, which significantly reduces a black-box scanner's ability to explore the web application's input space and can cause false negatives. In addition, black-box scanners must attempt to infer that a vulnerability was triggered, which causes false positives. To overcome these limitations, we propose Witcher, a novel web vulnerability discovery framework inspired by grey-box coverage-guided fuzzing. Witcher implements the concept of fault escalation to detect both SQL and command injection vulnerabilities. Additionally, Witcher captures coverage information and creates output-derived input guidance to focus the input generation and, therefore, to increase the state-space exploration of the web application. On a dataset of 18 web applications written in PHP, Python, Node.js, Java, Ruby, and C, 13 of which had known vulnerabilities, Witcher found 23 of the 36 known vulnerabilities (64%) and additionally discovered 67 previously unknown vulnerabilities, 4 of which received CVE numbers. In our experiments, Witcher outperformed state-of-the-art scanners both in the number of vulnerabilities found and in coverage of the web applications.
Links
UTopia: Automatic Generation of Fuzz Driver using Unit Tests.
Authors
- Bokdeuk Jeong, Samsung Research, Republic of Korea
- Joonun Jang, Samsung Research, Republic of Korea
- Hayoon Yi, Samsung Research, Republic of Korea
- Jiin Moon, Samsung Research, Republic of Korea
- Junsik Kim, Samsung Research, Republic of Korea
- Intae Jeon, Samsung Research, Republic of Korea
- Taesoo Kim, Samsung Research, Republic of Korea; Georgia Institute of Technology, USA
- WooChul Shim, Samsung Research, Republic of Korea
- Yong Ho Hwang, Samsung Research, Republic of Korea
Abstract
Fuzzing is arguably the most practical approach for detecting security bugs in software, but non-trivial effort is required for its adoption. To be effective, high-quality fuzz drivers must first be formulated with a proper sequence of API calls that can exhaustively explore the program states. To alleviate this burden, existing solutions attempt to generate fuzz drivers either by inferring valid sequences of APIs from consumer code (i.e., actual uses of the APIs) or by directly extracting them from sample executions. Unfortunately, all existing approaches suffer from a common problem: the observed API sequences, whether statically inferred or dynamically monitored, are intermingled with custom application logic. However, we observed that unit tests are carefully crafted by the actual designers of the APIs to validate their proper usage, and, importantly, writing unit tests during development is common practice (e.g., in over 70% of popular GitHub projects). In this paper, we propose UTopia, an open-source tool and analysis algorithm that can automatically synthesize effective fuzz drivers from existing unit tests with near-zero human involvement. To demonstrate its effectiveness, we applied UTopia to 55 open-source project libraries, including Tizen and Node.js, and automatically generated 5K fuzz drivers from 8K eligible unit tests. In addition, we executed the generated fuzzers for approximately 5 million per-core hours and discovered 123 bugs. More importantly, 2.4K of the generated fuzz drivers were adopted into the continuous integration process of the Tizen project, attesting to the quality of the synthesized fuzz drivers. The proposed tool and results are publicly available and maintained for broader adoption among both researchers and practitioners.
Links
SelectFuzz: Efficient Directed Fuzzing with Selective Path Exploration.
Authors
- Changhua Luo, Chinese University of Hong Kong, Hong Kong SAR, China
- Wei Meng, Chinese University of Hong Kong, Hong Kong SAR, China
- Penghui Li, Chinese University of Hong Kong, Hong Kong SAR, China
Abstract
Directed grey-box fuzzers specialize in testing specific target code. They have been applied to many security applications, such as reproducing known crashes and detecting vulnerabilities caused by incomplete patches. However, existing directed fuzzers favor inputs that discover new code, regardless of whether the newly uncovered code is relevant to the target code. As a result, these fuzzers extensively explore irrelevant code and suffer from low efficiency. In this paper, we distinguish relevant code in the target program from irrelevant code that does not help trigger the vulnerabilities in the target code. We present SelectFuzz, a new directed fuzzer that selectively explores relevant program paths for efficient crash reproduction and vulnerability detection. It identifies two types of relevant code, path-divergent code and data-dependent code, which respectively capture control and data dependencies on the target code. It then selectively instruments and explores only the relevant code blocks. We also propose a new distance metric that accurately measures the reaching probability of different program paths and inputs. We evaluated SelectFuzz with real-world vulnerabilities in sets of diverse programs. SelectFuzz significantly outperformed a baseline directed fuzzer by up to 46.31×, and performed the best in the Google Fuzzer Test Suite. Our experiments also demonstrated that SelectFuzz and existing techniques such as path pruning are complementary. Finally, with SelectFuzz, we detected 14 previously unknown vulnerabilities (including 6 new CVE IDs) in well-tested real-world software. Our report has led to the fix of 11 vulnerabilities.
Links
Finding Specification Blind Spots via Fuzz Testing.
Authors
- Ru Ji, University of Waterloo
- Meng Xu, University of Waterloo
Abstract
A formally verified program is only as correct as its specifications (SPEC). But how can we ensure that the SPEC is complete and free of loopholes? This paper presents Fast, short for Fuzzing-Assisted Specification Testing, as a potential answer. The key insight is to exploit and synergize the "redundancy" and "diversity" in formally verified programs for cross-checking. Specifically, within the same codebase, the SPEC, the implementation (CODE), and the test suites are all derived from the same set of business requirements. Therefore, if some intention is captured in CODE and a test case but not in SPEC, this is a strong indication of a blind spot in the SPEC. Fast examines the SPEC for incompleteness issues in an automated way: it first locates SPEC gaps via mutation testing, i.e., by checking whether a CODE variant conforms to the original SPEC. If so, Fast further leverages the test suites to infer whether the gap was introduced by intention or by mistake. Depending on the codebase size, Fast may generate CODE variants in either an enumerative or an evolutionary way. Fast is applied to two open-source codebases that feature formal verification and helps confirm 13 and 21 blind spots in their respective SPECs. This highlights the prevalence of SPEC incompleteness in real-world applications.
Links
ODDFuzz: Discovering Java Deserialization Vulnerabilities via Structure-Aware Directed Greybox Fuzzing.
Authors
- Sicong Cao, Yangzhou University
- Biao He, Ant Group
- Xiaobing Sun, Yangzhou University
- Yu Ouyang, Ant Group
- Chao Zhang, Tsinghua University
- Xiaoxue Wu, Yangzhou University
- Ting Su, East China Normal University
- Lili Bo, Yangzhou University
- Bin Li, Yangzhou University
- Chuanlei Ma, Ant Group
- Jiajia Li, Ant Group
- Tao Wei, Ant Group
Abstract
Java deserialization vulnerabilities are a severe threat in practice. Researchers have proposed static analysis solutions to locate candidate vulnerabilities and fuzzing solutions to generate proof-of-concept (PoC) serialized objects to trigger them. However, existing solutions have limited effectiveness and efficiency. In this paper, we propose ODDFuzz, a novel hybrid solution to efficiently discover Java deserialization vulnerabilities. First, ODDFuzz performs lightweight static taint analysis to identify candidate gadget chains that may cause deserialization vulnerabilities. In this step, ODDFuzz tries to locate all candidates and avoid false negatives. Then, ODDFuzz performs directed greybox fuzzing (DGF) to explore those candidates and generate PoC testcases to mitigate false positives. Specifically, ODDFuzz applies a structure-aware seed generation method to guarantee the validity of the testcases, and adopts a novel hybrid feedback and a step-forward strategy to guide the directed fuzzing. We implemented a prototype of ODDFuzz and evaluated it on the popular Java deserialization repository ysoserial. Results show that ODDFuzz could discover 16 out of 34 known gadget chains, while two state-of-the-art baselines identified only three of them. In addition, we evaluated ODDFuzz on real-world applications including Oracle WebLogic Server, Apache Dubbo, Sonatype Nexus, and protostuff, and found six previously unreported exploitable gadget chains, with five CVEs assigned.
Links
The Leaky Web: Automated Discovery of Cross-Site Information Leaks in Browsers and the Web.
Authors
- Jannis Rautenstrauch, CISPA Helmholtz Center for Information Security
- Giancarlo Pellegrino, CISPA Helmholtz Center for Information Security
- Ben Stock, CISPA Helmholtz Center for Information Security
Abstract
When browsing the web, none of us want sites to infer which other sites we may have visited before or are logged in to. However, attacker-controlled sites may infer this state through browser side-channels dubbed Cross-Site Leaks (XS-Leaks). Although these issues have been known since the 2000s, prior reports mostly found individual instances of issues rather than systematically studying the problem space. Further, the actual impact in the wild often remained opaque. To address these open problems, we develop the first automated framework to systematically discover observation channels in browsers. In doing so, we detect and characterize 280 observation channels that leak information cross-site in the engines of Chromium, Firefox, and Safari, including many variations of supposedly fixed leaks. Atop this framework, we create an automatic pipeline to find XS-Leaks in real-world websites. With this pipeline, we conduct the largest study to date on XS-Leak prevalence in the wild by performing a visit inference attack and a newly proposed variant, a cookie acceptance inference attack, on the Tranco Top 10K. In addition, we test 100 websites for the classic XS-Leak attack vector of login detection. Our results show that XS-Leaks pose a significant threat to the web ecosystem, as at least 15%, 34%, and 77% of all tested sites are vulnerable to the three attacks, respectively. We also find substantial implementation differences between the browsers, resulting in differing attack surfaces that matter in the wild. To ensure that browser vendors and web developers alike can check their applications for XS-Leaks, we open-source our framework and include an extensive discussion of countermeasures to eliminate XS-Leaks in the near future and to ensure that new browser features do not introduce new ones.
Links
WebSpec: Towards Machine-Checked Analysis of Browser Security Mechanisms.
Authors
- Lorenzo Veronese, TU Wien
- Benjamin Farinier, Univ Rennes, Inria, CNRS, IRISA
- Pedro Bernardo, TU Wien
- Mauro Tempesta, TU Wien
- Marco Squarcina, TU Wien
- Matteo Maffei, TU Wien
Abstract
The complexity of browsers has steadily increased over the years, driven by the continuous introduction and update of Web platform components, such as novel Web APIs and security mechanisms. Their specifications are manually reviewed by experts to identify potential security issues. However, this process has proved to be error-prone due to the extensiveness of modern browser specifications and the interplay between new and existing Web platform components. To tackle this problem, we developed WebSpec, the first formal security framework for the analysis of browser security mechanisms, which enables both the automatic discovery of logical flaws and the development of machine-checked security proofs. WebSpec, in particular, includes a comprehensive semantic model of the browser in the Coq proof assistant, a formalization in this model of ten Web security invariants, and a toolchain turning the Coq model and the Web invariants into SMT-lib formulas to enable model checking with the Z3 theorem prover. If a violation is found, the toolchain automatically generates executable tests corresponding to the discovered attack trace, which is validated across major browsers.We showcase the effectiveness of WebSpec by discovering two new logical flaws caused by the interaction of different browser mechanisms and by identifying three previously discovered logical flaws in the current Web platform, as well as five in old versions. Finally, we show how WebSpec can aid the verification of our proposed changes to amend the reported inconsistencies affecting the current Web platform.
Links
Detection of Inconsistencies in Privacy Practices of Browser Extensions.
Authors
- Duc Bui, The University of Michigan, Ann Arbor, MI, U.S.A.
- Brian Tang, The University of Michigan, Ann Arbor, MI, U.S.A.
- Kang G. Shin, The University of Michigan, Ann Arbor, MI, U.S.A.
Abstract
All major web browsers support extensions that provide additional functionality and enhance users' browsing experience, while these extensions can access and collect users' data during web browsing. Although web extensions inform users of their data practices via multiple forms of notices, prior work has overlooked the critical gap between the actual data practices and the published privacy notices of browser extensions. To fill this gap, we propose ExtPrivA, which automatically detects inconsistencies between browser extensions' data collection and their privacy disclosures. From the privacy policies and Dashboard disclosures, ExtPrivA extracts privacy statements to obtain a clear interpretation of an extension's privacy practices. It emulates user interactions to trigger the extension's functionality and analyzes the initiators of network requests to accurately extract the users' data transferred by the extension from the browser to external servers. Our end-to-end evaluation shows that ExtPrivA detects inconsistencies between privacy disclosures and data-collection behavior with 85% precision. In a large-scale study of 47.2k extensions on the Chrome Web Store, we found 820 extensions with 1,290 flows that are inconsistent with their privacy statements. Even worse, we found 525 pairs of contradictory privacy statements in the Dashboard disclosures and privacy policies of 360 extensions. These discrepancies between the privacy disclosures and the actual data-collection behavior are deemed serious violations of the Store's policies. Our findings highlight critical issues in the privacy disclosures of browser extensions that potentially mislead, and even pose high privacy risks to, end users.
Links
TeSec: Accurate Server-side Attack Investigation for Web Applications.
Authors
- Ruihua Wang, KLISS, TNList, School of Software, Tsinghua University
- Yihao Peng, KLISS, TNList, School of Software, Tsinghua University
- Yilun Sun, KLISS, TNList, School of Software, Tsinghua University
- Xuancheng Zhang, KLISS, TNList, School of Software, Tsinghua University
- Hai Wan, KLISS, TNList, School of Software, Tsinghua University
- Xibin Zhao, KLISS, TNList, School of Software, Tsinghua University
Abstract
The user interface (UI) of web applications is usually the entry point of web attacks against enterprises and organizations. Finding the UI elements utilized by intruders is of great importance both for attack interception and for fixing the web application. Current attack investigation methods targeting web UI either provide rough analysis results or perform poorly in high-concurrency scenarios, which leads to heavy manual analysis work. In this paper, we propose TeSec, an accurate attack investigation method for web UI applications. TeSec makes use of two kinds of correlations. The first, built from annotated audit logs partitioned by PID/TID and delimiter-logs, captures the correspondence between audit log entries and web requests. The second, modeled by an Aho-Corasick automaton built during the system testing period, captures the correspondence between requests and UI elements/events. Leveraging these two correlations, TeSec can accurately and automatically locate the UI elements/events that are the root cause of an alarm, even in high-concurrency scenarios. Furthermore, TeSec only needs to be deployed on the server and does not need to collect logs from client-side browsers. We evaluate TeSec on 12 web applications. The experimental results show that the matching accuracy between UI events/elements and alarms is above 99.6%, and security analysts need to check no more than 2 UI elements on average for each forensic analysis. The overheads are low: at most 4.3% in average response time and 4.6% in audit log space.
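An Aho-Corasick automaton matches many patterns against a stream in one pass, which is what makes the request-to-UI correlation cheap at runtime. The generic sketch below shows the algorithm itself; the request-signature patterns are invented for illustration, and TeSec's actual automaton is built from system-test traces, not hand-listed strings:

```python
# Generic Aho-Corasick multi-pattern matcher (textbook form, not TeSec's code).
from collections import deque

class AhoCorasick:
    def __init__(self, patterns):
        self.goto = [{}]      # trie transitions per state
        self.fail = [0]       # failure links
        self.out = [set()]    # patterns recognized at each state
        for pat in patterns:
            self._insert(pat)
        self._build_failure_links()

    def _insert(self, pat):
        state = 0
        for ch in pat:
            if ch not in self.goto[state]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append(set())
                self.goto[state][ch] = len(self.goto) - 1
            state = self.goto[state][ch]
        self.out[state].add(pat)

    def _build_failure_links(self):
        q = deque(self.goto[0].values())   # depth-1 states fail to the root
        while q:
            s = q.popleft()
            for ch, t in self.goto[s].items():
                q.append(t)
                f = self.fail[s]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[t] = self.goto[f].get(ch, 0)
                self.out[t] |= self.out[self.fail[t]]   # inherit suffix matches

    def search(self, text):
        state, hits = 0, []
        for ch in text:
            while state and ch not in self.goto[state]:
                state = self.fail[state]
            state = self.goto[state].get(ch, 0)
            hits.extend(self.out[state])
        return hits

# Hypothetical request signatures standing in for learned patterns:
ac = AhoCorasick(["/api/login", "/api/cart/add"])
hits = ac.search("POST /api/cart/add?id=7")   # → ["/api/cart/add"]
```

Each matched signature would then index into a table mapping requests to the UI elements/events that can emit them.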
Links
RuleKeeper: GDPR-Aware Personal Data Compliance for Web Frameworks.
Authors
- Mafalda Ferreira, INESC-ID / Instituto Superior Técnico, Universidade de Lisboa
- Tiago Brito, INESC-ID / Instituto Superior Técnico, Universidade de Lisboa
- José Fragoso Santos, INESC-ID / Instituto Superior Técnico, Universidade de Lisboa
- Nuno Santos, INESC-ID / Instituto Superior Técnico, Universidade de Lisboa
Abstract
Pressured by existing regulations such as the EU GDPR, online services must advertise a personal data protection policy declaring the types and purposes of collected personal data, which must then be strictly enforced as per the consent decisions made by the users. However, due to the lack of system-level support, obtaining strong guarantees of policy enforcement is hard, leaving the door open for software bugs and vulnerabilities to cause GDPR-compliance violations. We present RuleKeeper, a GDPR-aware personal data policy compliance system for web development frameworks. Currently ported to the MERN framework, RuleKeeper allows web developers to specify a GDPR manifest from which the data protection policy of the web application is automatically generated and transparently enforced through static code analysis and runtime access control mechanisms. GDPR compliance is checked in a cross-cutting manner, requiring few changes to the application code. We used our prototype implementation to evaluate RuleKeeper with four real-world applications. Our system can model realistic GDPR data protection requirements, adds modest performance overheads to the web application, and can detect GDPR violation bugs.
Links
Characterizing Everyday Misuse of Smart Home Devices.
Authors
- Phoebe Moh, University of Maryland
- Pubali Datta, University of Illinois Urbana-Champaign
- Noel Warford, University of Maryland
- Adam Bates, University of Illinois Urbana-Champaign
- Nathan Malkin, University of Maryland
- Michelle L. Mazurek, University of Maryland
Abstract
Exploration of Internet of Things (IoT) security often focuses on threats posed by external and technically-skilled attackers. While it is important to understand these most extreme cases, it is equally important to understand the most likely risks of harm posed by smart device ownership. In this paper, we explore how smart devices are misused — used without permission in a manner that causes harm — by device owners’ everyday associates such as friends, family, and romantic partners. In a preliminary characterization survey (n = 100), we broadly capture the kinds of unauthorized use and misuse incidents participants have experienced or engaged in. Then, in a prevalence survey (n = 483), we assess the prevalence of these incidents in a demographically-representative population. Our findings show that unauthorized use of smart devices is widespread (experienced by 43% of participants), and that misuse is also common (experienced by at least 19% of participants). However, highly individual factors determine whether these unauthorized use events constitute misuse. Through a focus on everyday abuses, this work sheds light on the most prevalent security and privacy threats faced by smart-home owners today.
Links
"It's up to the Consumer to be Smart": Understanding the Security and Privacy Attitudes of Smart Home Users on Reddit.
Authors
- Jingjie Li, University of Wisconsin–Madison
- Kaiwen Sun, University of Michigan
- Brittany Skye Huff, University of Wisconsin–Madison
- Anna Marie Bierley, University of Wisconsin–Madison
- Younghyun Kim, University of Wisconsin–Madison
- Florian Schaub, University of Michigan
- Kassem Fawaz, University of Wisconsin–Madison
Abstract
Smart home technologies offer many benefits to users. Yet, they also carry complex security and privacy implications that users often struggle to assess and account for during adoption. To better understand users’ considerations and attitudes regarding smart home security and privacy, in particular how users develop them progressively, we conducted a qualitative content analysis of 4,957 Reddit comments in 180 security- and privacy-related discussion threads from /r/homeautomation, a major Reddit smart home forum. Our analysis reveals that users’ security and privacy attitudes, manifested in the levels of concern and degree to which they incorporate protective strategies, are shaped by multi-dimensional considerations. Users’ attitudes evolve according to changing contextual factors, such as adoption phases, and how they become aware of these factors. Further, we describe how online discourse about security and privacy risks and protections contributes to individual and collective attitude development. Based on our findings, we provide recommendations to improve smart home designs, support users’ attitude development, facilitate information exchange, and guide future research regarding smart home security and privacy.
Links
User Perceptions and Experiences with Smart Home Updates.
Authors
- Julie M. Haney, National Institute of Standards and Technology, Gaithersburg, MD, USA
- Susanne M. Furman, National Institute of Standards and Technology, Gaithersburg, MD, USA
Abstract
Updates may be one of the few tools consumers have to mitigate security and privacy vulnerabilities in smart home devices. However, little research has been undertaken to understand users’ perceptions and experiences with smart home updates. To address this gap, we conducted an online survey of a demographically diverse sample of 412 smart home users in the United States. We found that users overwhelmingly view smart home updates as important and urgent. However, relationships between update perceptions and security and privacy perceptions are less clear. We also identify problematic aspects of updates and gaps between current and preferred update modes. We then suggest ways in which update mechanisms and interfaces can be designed to be more usable and understandable to users.
Links
Design and Evaluation of Inclusive Email Security Indicators for People with Visual Impairments.
Authors
- Yaman Yu, University of Illinois at Urbana-Champaign
- Saidivya Ashok, University of Illinois at Urbana-Champaign
- Smirity Kaushik, University of Illinois at Urbana-Champaign
- Yang Wang, University of Illinois at Urbana-Champaign
- Gang Wang, University of Illinois at Urbana-Champaign
Abstract
Given the challenges of detecting and filtering phishing emails, it is inevitable that some phishing emails still reach a user’s inbox. As a result, email providers such as Gmail have implemented phishing warnings to help users better recognize phishing attempts. Existing research has primarily focused on phishing warnings for sighted users, and it is not well understood how people with visual impairments interact with phishing emails and warnings. In this paper, we worked with a group of users (N=41) with visual impairments to study the effectiveness of existing warnings and explore more inclusive designs (using Gmail warning designs as a baseline for comparison). We took a multipronged approach including an exploratory study (to understand the challenges faced by users), user-in-the-loop design and prototyping, and a main study (to assess the impact of design choices). Our results show that users with visual impairments often miss existing Gmail warnings because the current design (e.g., warning position, HTML tags used) does not match well with screen reader users’ reading habits. The inconsistencies of the warnings (e.g., across the Standard and HTML views) also create obstacles for users. We show that an inclusive design (combining an audio warning, a shortcut key, and a warning page overlay) can effectively increase warning noticeability. Based on our results, we make a number of recommendations to email providers.
Links
When and Why Do People Want Ad Targeting Explanations? Evidence from a Four-Week, Mixed-Methods Field Study.
Authors
- Hao-Ping Hank Lee, Carnegie Mellon University
- Jacob Logas, Georgia Institute of Technology
- Stephanie Yang, Georgia Institute of Technology
- Zhouyu Li, North Carolina State University
- Natã Barbosa, University of Illinois Urbana-Champaign
- Yang Wang, University of Illinois Urbana-Champaign
- Sauvik Das, Carnegie Mellon University
Abstract
Many people are concerned about how their personal data is used for online behavioral advertising (OBA). Ad targeting explanations have been proposed as a way to reduce this concern by improving transparency. However, it is unclear when and why people might want ad targeting explanations. Without this insight, we run the risk of designing explanations that do not address real concerns. To bridge this gap, we conducted a four-week, mixed-methods field study with 60 participants to understand when and why people want targeting explanations for the ads they actually encountered while browsing the web. We found that users wanted explanations for around 30% of the 4,251 ads we asked them about during the study, and that subjective perceptions of how their personal data was collected and shared were highly correlated with when users wanted ad explanations. Often, users wanted these explanations to confirm or deny their own preconceptions about how their data was collected or the motives of advertisers. A key upshot of our work is that one-size-fits-all approaches to ad explanations are likely to fail at addressing people’s lived concerns about ad targeting; instead, more personalized explanations are needed.
Links
SecureCells: A Secure Compartmentalized Architecture.
Authors
- Atri Bhattacharyya, EcoCloud, EPFL
- Florian Hofhammer, EcoCloud, EPFL
- Yuanlong Li, EcoCloud, EPFL
- Siddharth Gupta, EcoCloud, EPFL
- Andres Sanchez, EcoCloud, EPFL
- Babak Falsafi, EcoCloud, EPFL
- Mathias Payer, EcoCloud, EPFL
Abstract
Modern programs are monolithic, combining code of varied provenance without isolation, all the while running on network-connected devices. A vulnerability in any component may compromise code and data of all other components. Compartmentalization separates programs into fault domains with limited policy-defined permissions, following the Principle of Least Privilege, preventing arbitrary interactions between components. Unfortunately, existing compartmentalization mechanisms target weak attacker models, incur high overheads, or overfit to specific use cases, precluding their general adoption. The need of the hour is a secure, performant, and flexible mechanism on which developers can reliably implement an arsenal of compartmentalized software. We present SecureCells, a novel architecture for intra-address space compartmentalization. SecureCells enforces per-Virtual Memory Area (VMA) permissions for secure and scalable access control, and introduces new userspace instructions for secure and fast compartment switching with hardware-enforced call gates and zero-copy permission transfers. SecureCells enables novel software mechanisms for call stack maintenance and register context isolation. In microbenchmarks, SecureCells switches compartments in only 8 cycles on a 5-stage in-order processor, reducing cost by an order of magnitude compared to the state of the art. Consequently, SecureCells helps secure high-performance software such as an in-memory key-value store with negligible overhead of less than 3%.
Links
WaVe: a verifiably secure WebAssembly sandboxing runtime.
Authors
- Evan Johnson, UC San Diego
- Evan Laufer, Stanford
- Zijie Zhao, UIUC
- Dan Gohman, Fastly Labs
- Shravan Narayan, UC San Diego
- Stefan Savage, UC San Diego
- Deian Stefan, UC San Diego
- Fraser Brown, CMU
Abstract
The promise of software sandboxing is flexible, fast and portable isolation: capturing the benefits of hardware-based memory protection without requiring operating system involvement. This promise is reified in WebAssembly (Wasm), a popular portable bytecode whose compilers automatically insert runtime checks to ensure that data and control flow are constrained to a single memory segment. Indeed, modern compiled Wasm implementations have advanced to the point where these checks can themselves be verified, removing the compiler from the trusted computing base. However, the resulting integrity properties are only valid for code executing strictly inside the Wasm sandbox. Any interactions with the runtime system, which manages sandboxes and exposes the WebAssembly System Interface (WASI) used to access operating system resources, operate outside this contract. The resulting conundrum is how to maintain Wasm’s strong isolation properties while still allowing such programs to interact with the outside world (i.e., with the file system, the network, etc.). Our paper presents a solution to this problem, via WaVe, a verified secure runtime system that implements WASI. We mechanically verify that interactions with WaVe (including OS side effects) not only maintain Wasm’s memory safety guarantees, but also maintain access isolation for the host OS’s storage and network resources. Finally, in spite of completely removing the runtime from the trusted computing base, we show that WaVe offers performance competitive with existing industrial (yet unsafe) Wasm runtimes.
Links
μSwitch: Fast Kernel Context Isolation with Implicit Context Switches.
Authors
- Dinglan Peng, Purdue University
- Congyu Liu, Purdue University
- Tapti Palit, Purdue University
- Pedro Fonseca, Purdue University
- Anjo Vahldiek-Oberwagner, Intel Labs
- Mona Vij, Intel Labs
Abstract
Isolating application components is crucial to limit the exposure of sensitive data and code to vulnerabilities in the untrusted components. Process-based isolation is the de facto approach used in practice, e.g., in web browsers. However, it incurs significant performance overhead and is typically infeasible when frequent switches between isolation domains are expected. To address this problem, many intra-process memory isolation techniques have been proposed using novel kernel abstractions, recent CPU extensions (e.g., Intel® MPK), and software-based fault isolation (e.g., WebAssembly). However, these techniques insufficiently isolate kernel resources, such as file descriptors, or do so by incurring high overheads when resources are accessed. Other work virtualizes the kernel context inside a privileged user space domain, but this is ad-hoc, error-prone, and provides only limited kernel functionalities. We propose μSwitch, an efficient kernel context isolation mechanism with memory protection that addresses these limitations. We use a protected structure, shared by the kernel and the user space, for context switching and propose implicit context switching to improve its performance by deferring the kernel resource switch to the next system call. We apply μSwitch to isolate libraries in the Firefox web browser and an HTTP server, and reduce the overhead of isolation by 32.7% to 98.4% compared with other isolation techniques.
Links
Control Flow and Pointer Integrity Enforcement in a Secure Tagged Architecture.
Authors
- Ravi Theja Gollapudi, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- Gokturk Yuksek, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- David Demicco, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- Matthew Cole, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- Gaurav Kothari, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- Rohit Kulkarni, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- Xin Zhang, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- Kanad Ghose, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- Aravind Prakash, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
- Zerksis Umrigar, Department of Computer Science, State University of New York at Binghamton, Binghamton, New York, USA
Abstract
Control flow attacks exploit software vulnerabilities to divert the flow of control into unintended paths to ultimately execute attack code. This paper explores the use of instruction and data tagging as a general means of thwarting such control flow attacks, including attacks that rely on violating pointer integrity. Using specific types of narrow-width data tags along with narrow-width instruction tags embedded within the binary facilitates the security policies required to protect against such attacks, leading to a practically viable solution. Co-locating instruction tags close to their corresponding instructions within cache lines eliminates the need for separate mechanisms for instruction tag accesses. Information gleaned from the analysis phase of a compiler is augmented and used to generate the instruction and data tags. A full-stack implementation that consists of a modified LLVM compiler, modified Linux OS support for tags and a FPGA-implemented CPU hardware prototype for enforcing CFI, data pointer and code pointer integrity is demonstrated. With a modest hardware enhancement, the execution time of benchmark applications on the prototype system is shown to be limited to low, single-digit percentages of a baseline system without tagging.
Links
EC: Embedded Systems Compartmentalization via Intra-Kernel Isolation.
Authors
- Arslan Khan, Purdue University
- Dongyan Xu, Purdue University
- Dave Jing Tian, Purdue University
Abstract
Embedded systems are built on low-power microcontrollers and constitute computing systems ranging from IoT nodes to supercomputers. Unfortunately, due to low-power constraints, the security of these systems is often overlooked, leaving a huge attack surface. For instance, an attacker compromising a user task can access any kernel data structure. Existing work has applied compartmentalization to reduce the attack surface, but these systems either incur a high runtime overhead or require major modifications to existing firmware. In this paper, we present Embedded Compartmentalizer (EC), a comprehensive and automatic compartmentalization toolchain for Real-Time Operating Systems (RTOSs) and bare-metal firmware. EC provides the Embedded Compartmentalizer Compiler (ECC) to automatically partition firmware into different compartments and enforces memory protection among them using the Embedded Compartmentalizer Kernel (ECK), a formally verified microkernel implementing a novel architecture for compartmentalizing firmware via intra-kernel isolation. Our evaluation shows that EC is 1.2x faster than state-of-the-art systems and achieves up to 96.2% ROP gadget reduction in firmware. EC provides a low-cost, practical, and effective compartmentalization solution for embedded systems with memory protection and debug hardware extensions.
Links
Low-Cost Privilege Separation with Compile Time Compartmentalization for Embedded Systems.
Authors
- Arslan Khan, Purdue University
- Dongyan Xu, Purdue University
- Dave Jing Tian, Purdue University
Abstract
Embedded systems are pervasive and find various applications all around us. These systems run on low-power microcontrollers with real-time constraints. Developers often sacrifice security to meet these constraints by running the entire software stack with the same privilege. Existing work has utilized compartmentalization to mitigate the situation but suffers from high overhead due to extensive runtime checking to achieve isolation between different compartments in the system, resulting in rare adoption. In this paper, we present Compartmentalized Real-Time C (CRT-C), a low-cost compile-time compartmentalization mechanism for embedded systems to achieve privilege separation in a linear address space using specialized programming language dialects. Each programming dialect restricts the programming capabilities of a part of a program, formalizing different compartments within the program. CRT-C uses static analysis to identify various compartments in firmware and realizes least privilege in the system by enforcing compartment-specific policies. We design and implement a new compiler for CRT-C that generates compartmentalized firmware ready to run on commodity embedded systems. We evaluate CRT-C with two Real-Time Operating Systems (RTOSs): FreeRTOS and Zephyr. Our evaluation shows that CRT-C can provide compartmentalization to embedded systems to thwart various attacks while incurring an average runtime overhead of 2.63% and memory overhead of 1.75%. CRT-C provides a practical solution to both retrofit legacy applications and secure new ones for embedded systems.
Links
One Key to Rule Them All: Secure Group Pairing for Heterogeneous IoT Devices.
Authors
- Habiba Farrukh, Purdue University
- Muslum Ozgur Ozmen, Purdue University
- Faik Kerem Ors, Purdue University
- Z. Berkay Celik, Purdue University
Abstract
Pairing schemes establish cryptographic keys to secure communication among IoT devices. Existing pairing approaches that rely on trusted central entities, human interaction, or shared homogeneous context are prone to a single point of failure, have limited usability, and require additional sensors. Recent work has explored event timings observed by devices with heterogeneous sensing modalities as proof of co-presence for decentralized pairing. Yet, this approach incurs a high pairing time, cannot pair sensors that sense continuous physical quantities, and does not support group pairing, making it infeasible for many IoT deployments. In this paper, we design and develop IoTCupid, a secure group pairing system for IoT devices with heterogeneous sensing modalities, without requiring active user involvement. IoTCupid operates in three phases: (a) detecting events sensed by both instant and continuous sensors with a novel window-based derivation technique, (b) grouping the events through a fuzzy clustering algorithm to extract inter-event timings, and (c) establishing group keys among devices with identical inter-event timings through a partitioned group password-authenticated key exchange scheme. We evaluate IoTCupid in smart home and office environments with 11 heterogeneous devices and show that it effectively pairs all devices with only 2 group keys and minimal pairing overhead.
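The co-presence evidence in phase (b) is that devices witnessing the same physical events derive matching inter-event timings even when their sensing modalities differ. A toy sketch of that idea follows; the event times, tolerance, and fuzzy matching here are illustrative stand-ins, and the real system's windowed event detection, fuzzy clustering, and group PAKE are omitted:

```python
# Illustrative inter-event-timing comparison (not IoTCupid's algorithm).

def inter_event_timings(timestamps):
    """Gaps between consecutive event times — the shared pairing evidence."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def timings_match(t1, t2, tol=0.5):
    """Fuzzy check: co-present devices should agree within a tolerance."""
    return len(t1) == len(t2) and all(abs(a - b) <= tol for a, b in zip(t1, t2))

# Hypothetical traces: a motion sensor reports events instantly, while a
# power meter (continuous quantity) yields event times via windowing.
motion_events = [3.1, 10.2, 15.0]   # seconds
power_events = [3.4, 10.1, 15.3]
co_present = timings_match(inter_event_timings(motion_events),
                           inter_event_timings(power_events))
```

Note that absolute timestamps differ between the devices; only the inter-event gaps need to agree, which is what makes the evidence clock-offset tolerant.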
Links
Optimistic Access Control for the Smart Home.
Authors
- Nathan Malkin, University of Maryland
- Alan F. Luo, University of Maryland
- Julio Poveda, University of Maryland
- Michelle L. Mazurek, University of Maryland
Abstract
One of the biggest privacy concerns of smart home users is enforcing limits on household members’ access to devices and each other’s data. While people commonly express preferences for intricate access control policies, in practice they often settle for less secure defaults. As an alternative, this paper investigates "optimistic access control" policies that allow users to obtain access and data without pre-approval, subject to oversight from other household members. This solution allows users to leverage the interpersonal trust they already rely on in order to establish privacy boundaries commensurate with more complex access control methods, while retaining the convenience of less secure strategies. To evaluate this concept, we conducted a series of surveys with 604 people total, studying the acceptability and perceptions of this approach. We found that a number of respondents preferred optimistic modes to existing access control methods and that interest in optimistic access varied with device type and household characteristics.
Links
Protected or Porous: A Comparative Analysis of Threat Detection Capability of IoT Safeguards.
Authors
- Anna Maria Mandalari, University College London
- Hamed Haddadi, Imperial College London
- Daniel J. Dubois, Northeastern University
- David Choffnes, Northeastern University
Abstract
Consumer Internet of Things (IoT) devices, from smart speakers to security cameras, are increasingly common in homes. Along with their benefits come potential privacy and security threats. To limit these threats, a number of commercial services (IoT safeguards) have become available. The safeguards claim to provide protection against IoT privacy risks and security threats. However, the effectiveness and the associated privacy risks of these safeguards remain a key open question. In this paper, we investigate the threat detection capabilities of IoT safeguards for the first time. We develop and release an approach for automated safeguard experimentation to reveal their response to common security threats and privacy risks. We perform thousands of automated experiments using popular commercial IoT safeguards deployed in a large IoT testbed. Our results indicate not only that these devices may be ineffective in preventing risks, but also that their cloud interactions and data collection operations may introduce privacy risks for the households that adopt them.
Links
LazyTAP: On-Demand Data Minimization for Trigger-Action Applications.
Authors
- Mohammad M. Ahmadpanah, Chalmers University of Technology
- Daniel Hedin, Chalmers University of Technology; Mälardalen University
- Andrei Sabelfeld, Chalmers University of Technology
Abstract
Trigger-Action Platforms (TAPs) empower applications (apps) for connecting otherwise unconnected devices and services. The current TAPs like IFTTT require trigger services to push excessive amounts of sensitive data to the TAP regardless of whether the data will be used in the app, at odds with the principle of data minimization. Furthermore, the rich features of modern TAPs, including IFTTT queries to support multiple trigger services and nondeterminism of apps, have been out of the reach of previous data minimization approaches like minTAP. This paper proposes LazyTAP, a new paradigm for fine-grained on-demand data minimization. LazyTAP breaks away from the traditional push-all approach of coarse-grained data over-approximation. Instead, LazyTAP pulls input data on-demand, once it is accessed by the app execution. Thanks to the fine granularity, LazyTAP enables tight minimization that naturally generalizes to support multiple trigger services via queries and is robust with respect to nondeterministic behavior of the apps. We achieve seamlessness for third-party app developers by leveraging laziness to defer computation and proxy objects to load necessary remote data behind the scenes as it becomes needed. We formally establish the correctness of LazyTAP and its minimization properties with respect to both IFTTT and minTAP. We implement and evaluate LazyTAP on app benchmarks showing that on average LazyTAP improves minimization by 95% over IFTTT and by 38% over minTAP, while incurring a tolerable performance overhead.
Links
Blue's Clues: Practical Discovery of Non-Discoverable Bluetooth Devices.
Authors
- Tyler Tucker, Florida Institute for Cybersecurity Research (FICS)
- Hunter Searle, Florida Institute for Cybersecurity Research (FICS)
- Kevin Butler, Florida Institute for Cybersecurity Research (FICS)
- Patrick Traynor, Florida Institute for Cybersecurity Research (FICS)
Abstract
Bluetooth is overwhelmingly the protocol of choice for personal area networking, and the Bluetooth Classic standard has been in continuous use for over 20 years. Bluetooth devices make themselves Discoverable to communicate, but best practice to protect privacy is to ensure that devices remain in Non-Discoverable mode. This paper demonstrates the futility of protecting devices by making them Non-Discoverable. We introduce the Blue’s Clues attack, the first direct, non-disruptive approach to fully extracting the permanent, unique Bluetooth MAC identifier from targeted devices in Non-Discoverable mode. We also demonstrate that we can fully characterize device capabilities and retrieve identifiers, which we discover often contain identifying information about the device owner. We demonstrate Blue’s Clues using a software-defined radio, mounting the attack over the air against both our own devices and, with institutional approval, throughout a public building. We find that a wide variety of Bluetooth devices can be uniquely identified in less than 10 seconds on average, with affected devices ranging from smartphones and headphones to gas pump skimmers and nanny-cams, spanning all versions of the Bluetooth Classic standard. While we propose potential mitigations, Blue’s Clues forces a reassessment of over 20 years of best practices for protecting devices against discovery.
Links
DeHiREC: Detecting Hidden Voice Recorders via ADC Electromagnetic Radiation.
Authors
- Ruochen Zhou, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Xiaoyu Ji, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Chen Yan, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Yi-Chao Chen, Shanghai Jiao Tong University; Microsoft Research Asia
- Wenyuan Xu, Ubiquitous System Security Lab (USSLAB), Zhejiang University
- Chaohao Li, Ubiquitous System Security Lab (USSLAB), Zhejiang University
Abstract
Unauthorized covert voice recording poses a remarkable threat in privacy-sensitive scenarios, such as confidential meetings and private conversations. Due to their miniaturization and disguise, hidden voice recorders are difficult to notice in their surroundings. In this paper, we present DeHiREC, the first proof-of-concept system that can detect offline hidden voice recorders from their electromagnetic radiation (EMR). We first characterize the unique patterns of the emanated EMR signals and then locate the EMR source, i.e., the analog-to-digital converter (ADC) module embedded in mixed-signal system-on-chips (MSoCs). Since these unintentional EMR signals can be extremely noisy and weak, accurately detecting them is challenging. To address this challenge, we first design an EMR Catalyzing method to actively stimulate the EMR signals and then employ an adaptive-folding algorithm to improve the signal-to-noise ratio (SNR) of the sensed EMRs. If the sensed EMR variation corresponds to our active stimulation, we can determine that a hidden voice recorder is present. We evaluate the performance of DeHiREC on 13 commercial voice recorders under various conditions, including interference from other devices. Experimental results reveal that DeHiREC is effective in detecting all 13 voice recorders, achieving an overall success rate of 92.17% and a recall of 86.14% at a distance of 0.2 m.
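The folding step can be pictured generically: averaging consecutive segments of a trace with a known period preserves the periodic component while uncorrelated noise averages toward zero, raising SNR roughly with the square root of the segment count. The sketch below is plain (non-adaptive) folding over abstract samples, not DeHiREC's adaptive-folding algorithm or its EMR front end:

```python
# Generic periodic folding (coherent averaging) for SNR improvement —
# an illustrative stand-in for the paper's adaptive-folding step.

def fold(samples, period):
    """Average consecutive length-`period` segments of `samples`."""
    n = len(samples) // period          # number of complete segments
    return [sum(samples[i + k * period] for k in range(n)) / n
            for i in range(period)]

# A clean period-3 signal repeated three times folds back onto itself;
# with additive noise, the noise contributions partially cancel instead.
folded = fold([1, 2, 3, 1, 2, 3, 1, 2, 3], 3)   # → [1.0, 2.0, 3.0]
```

The "adaptive" part of the real algorithm lies in coping with drifting or imprecisely known periods, which this fixed-period sketch ignores.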
Links
IPvSeeYou: Exploiting Leaked Identifiers in IPv6 for Street-Level Geolocation.
Authors
- Erik C. Rye, University of Maryland
- Robert Beverly, CMAND
Abstract
We present IPvSeeYou, a privacy attack that permits a remote and unprivileged adversary to physically geolocate many residential IPv6 hosts and networks with street-level precision. The crux of our method involves: 1) remotely discovering wide area (WAN) hardware MAC addresses from home routers; 2) correlating these MAC addresses with their WiFi BSSID counterparts of known location; and 3) extending coverage by associating devices connected to a common penultimate provider router. We first obtain a large corpus of MACs embedded in IPv6 addresses via high-speed network probing. These MAC addresses are effectively leaked up the protocol stack and largely represent WAN interfaces of residential routers, many of which are all-in-one devices that also provide WiFi. We develop a technique to statistically infer the mapping between a router’s WAN and WiFi MAC addresses across manufacturers and devices, and mount a large-scale data fusion attack that correlates WAN MACs with WiFi BSSIDs available in wardriving (geolocation) databases. Using these correlations, we geolocate the IPv6 prefixes of >12M routers in the wild across 146 countries and territories. Selected validation confirms a median geolocation error of 39 meters. We then exploit technology and deployment constraints to extend the attack to a larger set of IPv6 residential routers by clustering and associating devices with a common penultimate provider router. While we responsibly disclosed our results to several manufacturers and providers, the ossified ecosystem of deployed residential cable and DSL routers suggests that our attack will remain a privacy threat into the foreseeable future.
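The MACs "embedded in IPv6 addresses" here are standard modified EUI-64 interface identifiers (RFC 4291), in which the 48-bit MAC is split around the marker bytes ff:fe and its universal/local bit is flipped. As a hedged illustration (not the paper's actual tooling), a minimal sketch of recovering a MAC from such an interface identifier:

```python
def mac_from_eui64(iid):
    """Recover a 48-bit MAC from a modified EUI-64 IPv6 interface ID.

    iid: the low 64 bits as four hex groups, e.g. "0211:22ff:fe33:4455".
    Returns a colon-separated MAC string, or None if the ff:fe marker
    (the EUI-64 fingerprint) is absent.
    """
    raw = b"".join(int(g, 16).to_bytes(2, "big") for g in iid.split(":"))
    if len(raw) != 8 or raw[3:5] != b"\xff\xfe":  # not a modified EUI-64
        return None
    mac = bytearray(raw[0:3] + raw[5:8])  # drop the inserted ff:fe bytes
    mac[0] ^= 0x02                        # undo the universal/local bit flip
    return ":".join(f"{b:02x}" for b in mac)

print(mac_from_eui64("0211:22ff:fe33:4455"))  # -> 00:11:22:33:44:55
```

Identifiers formed this way make the WAN MAC directly recoverable by any remote party who can elicit a response from the address, which is the leak the attack's data-fusion step builds on.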
Links
From 5G Sniffing to Harvesting Leakages of Privacy-Preserving Messengers.
Authors
- Norbert Ludant, Northeastern University, Boston, USA
- Pieter Robyns, Hasselt University - tUL - EDM Belgian Cyber Command, Hasselt, Belgium
- Guevara Noubir, Northeastern University, Boston, USA
Abstract
We present 5GSniffer, the first open-source tool capable of efficiently sniffing 5G control channels, and demonstrate its potential for attacks on users’ privacy. 5GSniffer builds on our analysis of the 5G RAN control channel, which exposes side-channel leakage. We note that decoding the 5G control channels is significantly more challenging than in LTE, since part of the information necessary for decoding is provided to the UEs over encrypted channels. We devise a set of techniques to achieve real-time control-channel sniffing (over three orders of magnitude faster than brute-forcing). This enables, among other things, retrieving the Radio Network Temporary Identifiers (RNTIs) of all users in a cell and performing traffic analysis. To illustrate the potential of our sniffer, we analyse two privacy-focused messengers, Signal and Telegram. We identify privacy leaks that can be exploited to generate stealthy traffic to a target user. Combined with 5GSniffer, this enables stealthily exposing the presence of a target user in a given location (solely based on their phone number) by linking the phone number to the RNTI. It also enables traffic analysis of the target user. We evaluate the attacks and our sniffer, demonstrating nearly 100% accuracy within 30 seconds of attack initiation.
Links
Man-in-the-Middle Attacks without Rogue AP: When WPAs Meet ICMP Redirects.
Authors
- Xuewei Feng, Department of Computer Science and Technology & BNRist, Tsinghua University
- Qi Li, Institute for Network Sciences and Cyberspace & BNRist, Tsinghua University; Zhongguancun Lab
- Kun Sun, Department of Information Sciences and Technology & CSIS, George Mason University
- Yuxiang Yang, Department of Computer Science and Technology & BNRist, Tsinghua University
- Ke Xu, Department of Computer Science and Technology & BNRist, Tsinghua University; Zhongguancun Lab
Abstract
Modern Wi-Fi networks are commonly protected by security mechanisms such as WPA, WPA2, or WPA3, making it difficult for an attacker (a malicious supplicant) to hijack the traffic of other supplicants as a man-in-the-middle (MITM). In traditional Evil Twin attacks, attackers deploy a bogus wireless access point (AP) to hijack victim supplicants’ traffic (e.g., to steal credentials). In this paper, we uncover a new MITM attack that evades the security mechanisms in Wi-Fi networks: by spoofing the legitimate AP to send a forged ICMP redirect message to a victim supplicant, attackers can stealthily hijack the victim supplicant’s traffic without deploying any bogus AP. The core idea is to misuse a vulnerability in the cross-layer interaction between WPAs and the ICMP protocol, entirely evading the link-layer security mechanisms enforced by WPAs. We resolve two requirements to successfully launch our attack. First, when the attacker spoofs the legitimate AP to craft an ICMP redirect message, the legitimate AP must not recognize and filter out the forged message. We uncover a new vulnerability (CVE-2022-25667) in the Network Processing Units (NPUs) of AP routers that prevents the routers from blocking fake ICMP error messages passing through them. We test 55 popular wireless routers from 10 well-known AP vendors, and none of these routers can block the forged ICMP redirect messages due to this vulnerability. Second, we develop a new method to ensure that the forged ICMP redirect message evades the legitimacy check of the victim supplicant and then poisons its routing table. We conduct an extensive measurement study on 122 real-world Wi-Fi networks, covering all prevalent Wi-Fi security modes. The experimental results show that 109 of the 122 (89%) evaluated Wi-Fi networks are vulnerable to our attack.
Besides reporting the vulnerability to the NPU manufacturers and the AP vendors, we develop two countermeasures to throttle the identified attack.
Links
Mew: Enabling Large-Scale and Dynamic Link-Flooding Defenses on Programmable Switches.
Authors
- Huancheng Zhou, SUCCESS Lab, Texas A&M University
- Sungmin Hong, SUCCESS Lab, Texas A&M University
- Yangyang Liu, The Hong Kong Polytechnic University
- Xiapu Luo, The Hong Kong Polytechnic University
- Weichao Li, Peng Cheng Laboratory
- Guofei Gu, SUCCESS Lab, Texas A&M University
Abstract
Link-flooding attacks (LFAs) can cut off the Internet connection of selected server targets and are hard to mitigate because adversaries use normal-looking, low-rate flows and can dynamically adjust the attack strategy. Traditional centralized defense systems cannot locally and efficiently suppress malicious traffic. Though emerging programmable switches offer an opportunity to bring defense systems closer to the targeted links, their limited resources and lack of support for runtime reconfiguration limit their use for link-flooding defenses. We present Mew, a resource-efficient and runtime-adaptable link-flooding defense system. Mew can counter various LFAs even when a massive number of flows are concentrated on a link, or when the attack strategy changes quickly. We design a distributed storage mechanism and a lossless state-migration mechanism to reduce the storage bottleneck of programmable networks. We develop cooperative defense APIs to support multi-grained co-detection and co-mitigation without excessive overhead. Mew's dynamic defense mechanism can constantly analyze network conditions and activate corresponding defenses without rebooting devices or interrupting other running functions. We develop a prototype of Mew using real-world programmable switches located in five cities. Our experiments show that this real-world prototype can defend against large-scale and dynamic LFAs effectively.
Links
PCSPOOF: Compromising the Safety of Time-Triggered Ethernet.
Authors
- Andrew Loveless, University of Michigan; NASA Johnson Space Center
- Linh Thi Xuan Phan, University of Pennsylvania
- Ronald Dreslinski, University of Michigan
- Baris Kasikci, University of Michigan
Abstract
Designers are increasingly using mixed-criticality networks in embedded systems to reduce size, weight, power, and cost. Perhaps the most successful of these technologies is Time-Triggered Ethernet (TTE), which lets critical time-triggered (TT) traffic and non-critical best-effort (BE) traffic share the same switches and cabling. A key aspect of TTE is that the TT part of the system is isolated from the BE part, and thus BE devices have no way to disrupt the operation of the TTE devices. This isolation allows designers to: (1) use untrusted, but low cost, BE hardware, (2) lower BE security requirements, and (3) ignore BE devices during safety reviews and certification procedures. We present PCSPOOF, the first attack to break TTE’s isolation guarantees. PCSPOOF is based on two key observations. First, it is possible for a BE device to infer private information about the TT part of the network that can be used to craft malicious synchronization messages. Second, by injecting electrical noise into a TTE switch over an Ethernet cable, a BE device can trick the switch into sending these malicious synchronization messages to other TTE devices. Our evaluation shows that successful attacks are possible in seconds, and that each successful attack can cause TTE devices to lose synchronization for up to a second and drop tens of TT messages — both of which can result in the failure of critical systems like aircraft or automobiles. We also show that, in a simulated spaceflight mission, PCSPOOF causes uncontrolled maneuvers that threaten safety and mission success. We disclosed PCSPOOF to aerospace companies using TTE, and several are implementing mitigations from this paper.
Links
BLEDiff: Scalable and Property-Agnostic Noncompliance Checking for BLE Implementations.
Authors
- Imtiaz Karim, Purdue University
- Abdullah Al Ishtiaq, Pennsylvania State University
- Syed Rafiul Hussain, Pennsylvania State University
- Elisa Bertino, Purdue University
Abstract
In this work, we develop an automated, scalable, property-agnostic, and black-box protocol noncompliance checking framework called BLEDiff that can analyze and uncover noncompliant behavior in the Bluetooth Low Energy (BLE) protocol implementations. To overcome the enormous manual effort of extracting BLE protocol reference behavioral abstraction and security properties from a large and complex BLE specification, BLEDiff takes advantage of having access to multiple BLE devices and leverages the concept of differential testing to automatically identify deviant noncompliant behavior. In this regard, BLEDiff first automatically extracts the protocol FSM of a BLE implementation using the active automata learning approach. To improve the scalability of active automata learning for the large and complex BLE protocol, BLEDiff explores the idea of using a divide and conquer approach. BLEDiff essentially divides the BLE protocol into multiple sub-protocols, identifies their dependencies and extracts the FSM of each sub-protocol separately, and finally composes them to create the large protocol FSM. These FSMs are then pair-wise tested to automatically identify diverse deviations. We evaluate BLEDiff with 25 different commercial devices and demonstrate it can uncover 13 different deviant behaviors with 10 exploitable attacks.
Links
ViDeZZo: Dependency-aware Virtual Device Fuzzing.
Authors
- Qiang Liu, Zhejiang University; EPFL
- Flavio Toffalini, EPFL
- Yajin Zhou, Zhejiang University
- Mathias Payer, EPFL
Abstract
A virtual machine interacts with its host environment through virtual devices, driven by virtual device messages, e.g., I/O operations. By issuing crafted messages, an adversary can exploit a vulnerability in a virtual device to escape the virtual machine, gaining host access. Even though hundreds of bugs in virtual devices have been discovered, coverage-based virtual device fuzzers hardly consider intra-message dependencies (a field in a virtual device message may be dependent on another field) and inter-message dependencies (a message may depend on a previously issued message), thus resulting in limited scalability or efficiency. ViDeZZo, our new dependency-aware fuzzing framework for virtual devices, overcomes the limitations of existing virtual device fuzzers by annotating intra-message dependencies with a lightweight grammar, and by self-learning inter-message dependencies with new mutation rules. Specifically, ViDeZZo annotates message dependencies and applies three categories of message mutators. This approach avoids heavy manual effort to analyze specifications and speeds up the slow exploration by satisfying dependencies, resulting in a scalable and efficient fuzzer that boosts bug discovery in virtual devices. In our evaluation, ViDeZZo covers two hypervisors, four architectures, five device categories, and 28 virtual devices, and reaches competitive coverage faster. Moreover, ViDeZZo successfully finds 24 existing and 28 new bugs across diverse bug types. We are actively engaging with the community, with 7 of our submitted patches already accepted.
Links
DevFuzz: Automatic Device Model-Guided Device Driver Fuzzing.
Authors
- Yilun Wu, Stony Brook University
- Tong Zhang, Samsung Electronics
- Changhee Jung, Purdue University
- Dongyoon Lee, Stony Brook University
Abstract
The security of device drivers is critical for the entire operating system’s reliability. Yet, it remains very challenging to validate if a device driver can properly handle potentially malicious input from a hardware device. Unfortunately, existing symbolic execution-based solutions often do not scale, while fuzzing solutions require real devices or manual device models, leaving many device drivers under-tested and insecure. This paper presents DevFuzz, a new model-guided device driver fuzzing framework that does not require a physical device. DevFuzz uses symbolic execution to automatically generate the probe model that can guide a fuzzer to properly initialize a device driver under test. DevFuzz also leverages both static and dynamic program analyses to construct MMIO, PIO, and DMA device models to improve the effectiveness of fuzzing further. DevFuzz successfully tested 191 device drivers of various bus types (PCI, USB, RapidIO, I2C) from different operating systems (Linux, FreeBSD, and Windows) and detected 72 bugs, 41 of which have been patched and merged into the mainstream.
Links
SyzDescribe: Principled, Automated, Static Generation of Syscall Descriptions for Kernel Drivers.
Authors
- Yu Hao, University of California, Riverside
- Guoren Li, University of California, Riverside
- Xiaochen Zou, University of California, Riverside
- Weiteng Chen, University of California, Riverside
- Shitong Zhu, University of California, Riverside
- Zhiyun Qian, University of California, Riverside
- Ardalan Amiri Sani, University of California, Irvine
Abstract
Fuzz testing operating system kernels has been effective overall in recent years. For example, syzkaller has found thousands of bugs in the Linux kernel since 2017. One necessary component of syzkaller is a collection of syscall descriptions that are often provided by human experts. However, to our knowledge, current syscall descriptions are largely written manually, which is both time-consuming and error-prone. It is especially challenging considering that there are many kernel drivers (for new hardware devices and beyond) that are continuously being developed and evolving over time. In this paper, we present a principled solution for generating syscall descriptions for Linux kernel drivers. At its core, we summarize and model the key invariants or programming conventions, extracted from the "contract" between the core kernel and drivers. This allows us to understand programmatically how a kernel driver is initialized and how its associated interfaces are constructed. With this insight, we have developed a solution in a tool called SyzDescribe that has been tested on hundreds of kernel drivers. We show that the syscall descriptions produced by SyzDescribe are competitive with manually-curated ones, and much better than prior work (i.e., DIFUZE and KSG). Finally, we analyze the gap between our descriptions and the ground truth and point to future improvement opportunities.
Links
QueryX: Symbolic Query on Decompiled Code for Finding Bugs in COTS Binaries.
Authors
- HyungSeok Han, Theori Inc.; KAIST
- JeongOh Kyea, KAIST
- Yonghwi Jin, KAIST
- Jinoh Kang, KAIST
- Brian Pak, KAIST
- Insu Yun, Theori Inc.
Abstract
Extensible static checking tools, such as Sys and CodeQL, have successfully discovered bugs in source code. These tools allow analysts to write application-specific rules, referred to as queries. These queries can leverage the domain knowledge of analysts, thereby making the analysis more accurate and scalable. However, the majority of these tools are inapplicable to binary-only analysis. One exception, joern, translates binary code into decompiled code and feeds the decompiled code into an ordinary C code analyzer. However, this approach is not sufficiently precise for symbolic analysis, as it overlooks the unique characteristics of decompiled code. While binary analysis platforms, such as angr, support symbolic analysis, analysts must understand their intermediate representations (IRs) even though they mostly work with decompiled code. In this paper, we propose a precise and scalable symbolic analysis called fearless symbolic analysis that uses intuitive queries for binary code, and implement it in QueryX. To make queries intuitive, QueryX enables analysts to write them on top of decompiled code instead of IRs. In particular, QueryX supports callbacks on decompiled code, with which analysts can control symbolic analysis to discover bugs in the code. For precise analysis, we lift decompiled code into our IR named DNR and perform symbolic analysis on DNR while considering the characteristics of the decompiled code. Notably, DNR is used only internally, so analysts can write queries without any knowledge of DNR. For scalability, QueryX automatically reduces control-flow graphs using the callbacks and the ordering dependencies between callbacks specified in the queries. We applied QueryX to the Windows kernel, the Windows system service, and an automotive binary. As a result, we found 15 unique bugs including 10 CVEs and earned $180,000 from the Microsoft bug bounty program.
Links
Pyfet: Forensically Equivalent Transformation for Python Binary Decompilation.
Authors
- Ali Ahad, Department of Computer Science, University of Virginia, Charlottesville, VA, USA
- Chijung Jung, Department of Computer Science, University of Virginia, Charlottesville, VA, USA
- Ammar Askar, School of Computer Science, Georgia Institute of Technology, Atlanta, GA, USA
- Doowon Kim, Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA
- Taesoo Kim, School of Computer Science, Georgia Institute of Technology, Atlanta, GA, USA
- Yonghwi Kwon, Department of Computer Science, University of Virginia, Charlottesville, VA, USA
Abstract
Decompilation is a crucial capability in forensic analysis, facilitating analysis of unknown binaries. The recent rise of Python malware has brought attention to Python decompilers that aim to obtain source code representation from a Python binary. However, Python decompilers fail to handle various binaries, limiting their capabilities in forensic analysis. This paper proposes a novel solution that transforms a decompilation error-inducing Python binary into a decompilable binary. Our key intuition is that we can resolve the decompilation errors by transforming error-inducing code blocks in the input binary into another form. The core of our approach is the concept of Forensically Equivalent Transformation (FET), which allows non-semantics-preserving transformations in the context of forensic analysis. We carefully define the FETs to minimize their undesirable consequences while fixing various error-inducing instructions that are difficult to solve when preserving the exact semantics. We evaluate the prototype of our approach with 17,117 real-world Python malware samples causing decompilation errors in five popular decompilers. It successfully identifies and fixes 77,022 errors. Our approach also handles anti-analysis techniques, including opcode remapping, and helps migrate Python 3.9 binaries to 3.8 binaries.
Links
Adaptive Risk-Limiting Comparison Audits.
Authors
- Benjamin Fuller, University of Connecticut – Voting Technology Research Center
- Abigail Harrison, University of Connecticut – Voting Technology Research Center
- Alexander Russell, University of Connecticut – Voting Technology Research Center
Abstract
Risk-limiting audits (RLAs) are rigorous statistical procedures meant to detect invalid election results. RLAs examine paper ballots cast during the election to statistically assess the possibility of a disagreement between the winner determined by the ballots and the winner reported by tabulation. The design of an RLA must balance risk against efficiency: "risk" refers to a bound on the chance that the audit fails to detect such a disagreement when one occurs; "efficiency" refers to the total effort to conduct the audit. The most efficient approaches—when measured in terms of the number of ballots that must be inspected—proceed by "ballot comparison." However, ballot comparison requires an (untrusted) declaration of the contents of each cast ballot, rather than a simple tabulation of vote totals. This "cast-vote record table" (CVR) is then spot-checked against ballots for consistency. In many practical settings, the cost of generating a suitable CVR dominates the cost of conducting the audit, which has prevented widespread adoption of these sample-efficient techniques. We introduce a new RLA procedure: an "adaptive ballot comparison" audit. In this audit, a global CVR is never produced; instead, a three-stage procedure is iterated: 1) a batch is selected, 2) a CVR is produced for that batch, and 3) a ballot within the batch is sampled, inspected by auditors, and compared with the CVR. We prove that such an audit can achieve risk commensurate with standard comparison audits while generating a fraction of the CVR. We present three main contributions: (1) a formal adversarial model for RLAs; (2) definition and analysis of an adaptive audit procedure with rigorous risk limits and an associated correctness analysis accounting for the incidental errors arising in typical audits; and (3) an analysis of efficiency.
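The iterated three-stage procedure can be sketched in a few lines. This is a deliberately simplified illustration, not the paper's procedure: the helper callables (make_cvr, inspect_ballot, measure_risk) are hypothetical stand-ins, and the paper's actual risk measurement and sampling design are far more involved.

```python
import random

def adaptive_comparison_audit(batches, risk_limit, make_cvr,
                              inspect_ballot, measure_risk):
    """Simplified sketch of an adaptive ballot-comparison audit loop.

    batches: list of ballot batches (a global CVR is never produced).
    make_cvr, inspect_ballot, measure_risk are hypothetical helpers:
    a per-batch CVR as {ballot: declared vote}, the auditors' reading
    of a paper ballot, and the current risk given comparison outcomes.
    """
    comparisons = []
    risk = 1.0
    while risk > risk_limit:
        batch = random.choice(batches)      # 1) select a batch
        cvr = make_cvr(batch)               # 2) produce a CVR for that batch only
        ballot = random.choice(batch)       # 3) sample, inspect, and compare
        comparisons.append(inspect_ballot(ballot) == cvr[ballot])
        risk = measure_risk(comparisons)    # stop once the risk limit is met
    return comparisons
```

With a toy risk measure that halves the risk on each matching comparison, the loop stops after five matches for a 5% risk limit; the key structural point is that CVRs are generated only for batches the audit actually touches.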
Links
Blue Is the New Black (Market): Privacy Leaks and Re-Victimization from Police-Auctioned Cellphones.
Authors
- Richard Roberts, University of Maryland
- Julio Poveda, University of Maryland
- Raley Roberts, University of Maryland
- Dave Levin, University of Maryland
Abstract
In the United States, items in police possession are often sold at auction if they are not claimed. This includes cellphones that the police obtained through civil asset forfeiture, that were stolen, or that were turned in to lost-and-found. Thousands of US police departments partner with a website, PropertyRoom, to auction their items. Over the course of several months, we purchased 228 cellphones from PropertyRoom to ascertain whether they contained personal information. Our results show that a shocking amount of sensitive, personal information is easily accessible, even to a "low-effort" adversary with no forensics expertise: 21.5% of the phones we purchased were not locked at all, another 4.8% used top-40 most common PINs and patterns, and one phone had a sticky-note from the police with the PIN on it. We analyze the content on the 61 phones we could access, finding sensitive information about not only the phones’ previous owners, but also about their personal contacts, and in some cases, about victims of those persons’ crimes. Additionally, we analyze approximately two years of PropertyRoom cellphone auctions, finding multiple instances of identifying information in photos of the items being auctioned, including sticky-notes with PINs, owners’ names and phone numbers, and evidence stickers that reveal how the phones were obtained and the names of the officers who obtained them. Our work shows that police procedures and phone auctions can be a significant source of personal information leakage and re-victimization. We hope that our work is a call to arms to enforce new policies that either prohibit the selling of computing devices containing user information, or at the very least impose requirements to wipe phones in a manner that the US federal government already employs.
Links
No Privacy in the Electronics Repair Industry.
Authors
- Jason Ceci, University of Guelph
- Jonah Stegman, University of Guelph
- Hassan Khan, University of Guelph
Abstract
Electronics repair and service providers offer a range of services to computing device owners across North America—from software installation to hardware repair. Device owners obtain these services and leave their device along with their access credentials at the mercy of technicians, which leads to privacy concerns for owners’ personal data. We conduct a comprehensive four-part study to measure the state of privacy in the electronics repair industry. First, through a field study with 18 service providers, we uncover that most service providers do not have any privacy policy or controls to safeguard device owners’ personal data from snooping by technicians. Second, we drop rigged devices for repair at 16 service providers and collect data on widespread privacy violations by technicians, including snooping on personal data, copying data off the device, and removing tracks of snooping activities. Third, we conduct an online survey (n=112) to collect data on customers’ experiences when getting devices repaired. Fourth, we invite a subset of survey respondents (n=30) for semi-structured interviews to establish a deeper understanding of their experiences and identify potential solutions to curtail privacy violations by technicians. We apply our findings to discuss possible controls and actions different stakeholders and regulatory agencies should take to improve the state of privacy in the repair industry.
Links
How IoT Re-using Threatens Your Sensitive Data: Exploring the User-Data Disposal in Used IoT Devices.
Authors
- Peiyu Liu, Zhejiang University; NGICS Platform, Zhejiang University
- Shouling Ji, Zhejiang University
- Lirong Fu, Zhejiang University
- Kangjie Lu, University of Minnesota
- Xuhong Zhang, Zhejiang University
- Jingchang Qin, Zhejiang University
- Wenhai Wang, Zhejiang University; NGICS Platform, Zhejiang University
- Wenzhi Chen, Zhejiang University
Abstract
With the rapid technological evolution of the Internet of Things (IoT) and increasing user needs, IoT device re-use is becoming more and more common. For instance, more than 300,000 used IoT devices are for sale on Craigslist. During IoT re-use, sensitive data such as credentials and biometrics residing in these devices may be leaked if a user fails to properly dispose of the data. This raises a critical security concern: do (or can) users properly dispose of the sensitive data in used IoT devices? To the best of our knowledge, this is still an unexplored problem that demands a systematic study. In this paper, we perform the first in-depth investigation of user-data disposal in used IoT devices. Our investigation integrates multiple research methods to explore the status quo and the root causes of user-data leakage from used IoT devices. First, we conduct a user study to investigate user awareness and understanding of data disposal. Then, we conduct a large-scale analysis of 4,749 IoT firmware images to investigate user-data collection. Finally, we conduct a comprehensive empirical evaluation of 33 IoT devices to investigate the effectiveness of existing data disposal methods. Through this systematic investigation, we discover that IoT devices collect more sensitive data than users expect. Specifically, we detect 121,984 sensitive data collections in the tested firmware. Moreover, users usually do not, or even cannot, properly dispose of the sensitive data. Worse, due to the inherent characteristics of storage chips, 13.2% of the investigated firmware images perform "shallow" deletion, which may allow adversaries to obtain sensitive data after disposal. Given the scale of IoT re-use, such leakage would have a broad impact. We have reported our findings to world-leading companies. We hope our findings raise awareness of the failures of user-data disposal in IoT devices and promote the protection of users’ sensitive data.
Links
Privacy Leakage via Unrestricted Motion-Position Sensors in the Age of Virtual Reality: A Study of Snooping Typed Input on Virtual Keyboards.
Authors
- Yi Wu, University of Tennessee, Knoxville, TN, USA
- Cong Shi, New Jersey Institute of Technology, Newark, NJ, USA
- Tianfang Zhang, Rutgers University, New Brunswick, NJ, USA
- Payton Walker, Texas A&M University, College Station, Texas, USA
- Jian Liu, University of Tennessee, Knoxville, TN, USA
- Nitesh Saxena, Texas A&M University, College Station, Texas, USA
- Yingying Chen, Rutgers University, New Brunswick, NJ, USA
Abstract
Virtual Reality (VR) has gained popularity in numerous fields, including gaming, social interactions, shopping, and education. In this paper, we conduct a comprehensive study to assess the trustworthiness of the embedded sensors on VR devices, which capture various forms of sensitive data that may put users’ privacy at risk. We find that accessing most on-board sensors (e.g., motion, position, and button sensors) through VR SDKs/APIs, such as OpenVR, Oculus Platform, and WebXR, requires no security permission, exposing a huge attack surface for an adversary to steal the user’s private information. We validate this vulnerability by developing malware programs and malicious websites, and specifically explore to what extent it exposes the user’s information in the context of keystroke snooping. To examine its actual threat in practice, the adversary in the considered attack model does not possess any labeled data from the user or knowledge about the user’s VR settings. Extensive experiments, involving two mainstream VR systems and four keyboards with different typing mechanisms, demonstrate that our proof-of-concept attack can recognize the user’s virtual typing with over 89.7% accuracy. The attack can recover the user’s passwords with up to 84.9% recognition accuracy if three attempts are allowed, and achieves an average 87.1% word recognition rate for paragraph inference. We hope this study will help the community gain awareness of the vulnerability in the sensor management of current VR systems and provide insights to facilitate the future design of more comprehensive and restrictive sensor access control mechanisms.
Links
Uncovering User Interactions on Smartphones via Contactless Wireless Charging Side Channels.
Authors
- Tao Ni, City University of Hong Kong
- Xiaokuan Zhang, George Mason University
- Chaoshun Zuo, The Ohio State University
- Jianfeng Li, The Hong Kong Polytechnic University
- Zhenyu Yan, The Chinese University of Hong Kong
- Wubing Wang, DBAPPSecurity Co., Ltd
- Weitao Xu, City University of Hong Kong
- Xiapu Luo, The Hong Kong Polytechnic University
- Qingchuan Zhao, City University of Hong Kong
Abstract
Today, an increasing number of smartphones support wireless charging, which leverages electromagnetic induction to transmit power from a wireless charger to the charging smartphone. In this paper, we report a new contactless and context-aware wireless-charging side-channel attack, which captures two physical phenomena (i.e., the coil whine and the magnetic field perturbation) generated during the wireless charging process and further infers the user interactions on the charging smartphone. We design and implement a three-stage attack framework, dubbed WISERS, to demonstrate the practicality of this new side channel. WISERS first captures the coil whine and the magnetic field perturbation emitted by the wireless charger, then infers (i) inter-interface switches (e.g., switching from the home screen to an app interface) and (ii) intra-interface activities (e.g., keyboard inputs inside an app) to build user interaction contexts, and further reveals sensitive information. We extensively evaluate the effectiveness of WISERS with popular smartphones and commercial-off-the-shelf (COTS) wireless chargers. Our evaluation results suggest that WISERS can achieve over 90.4% accuracy in inferring sensitive information, such as screen-unlocking passcodes and app launches. In addition, our study shows that WISERS is resilient to a range of impact factors.
Links
MagBackdoor: Beware of Your Loudspeaker as A Backdoor For Magnetic Injection Attacks.
Authors
- Tiantian Liu, Zhejiang University, Hangzhou, China
- Feng Lin, Zhejiang University, Hangzhou, China
- Zhangsen Wang, Zhejiang University, Hangzhou, China
- Chao Wang, Zhejiang University, Hangzhou, China
- Zhongjie Ba, Zhejiang University, Hangzhou, China
- Li Lu, Zhejiang University, Hangzhou, China
- Wenyao Xu, University at Buffalo, Buffalo, New York, USA
- Kui Ren, Zhejiang University, Hangzhou, China
Abstract
An audio system containing loudspeakers and microphones is the fundamental hardware of voice-enabled devices, enabling voice interaction with mobile applications and smart homes. This paper presents MagBackdoor, the first magnetic field attack that injects malicious commands via a loudspeaker-based backdoor of the audio system, compromising the linked voice interaction system. MagBackdoor focuses on the magnetic threat to loudspeakers and manipulates their sound production stealthily. Consequently, the microphone will inevitably pick up the malicious sound generated by the attacked speaker, due to the closely packed arrangement of internal audio systems. To prove the feasibility of MagBackdoor, we conduct comprehensive simulations and experiments. This study further models the mechanism by which an external magnetic field excites the sound production of loudspeakers, providing theoretical guidance for MagBackdoor. Aiming at stealthy magnetic attacks in real-world scenarios, we self-design a prototype that can emit magnetic fields modulated by voice commands. We implement MagBackdoor and evaluate it across a wide range of smart devices, including 16 smartphones, 4 laptops, 2 tablets, and 3 smart speakers, achieving an average injection success rate of 95% with high-quality injected acoustic signals.
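The phrase “magnetic fields modulated by voice commands” refers to encoding a baseband audio waveform onto a time-varying field. The sketch below shows only the generic amplitude-modulation idea (all parameter values are hypothetical; this is not the paper’s coil-driving hardware):

```python
import math

def am_modulate(command, carrier_hz, sample_rate):
    """Amplitude-modulate a sinusoidal carrier with a baseband 'command'
    waveform (samples in [-1, 1]). Illustrates the generic idea of
    encoding a voice command onto a time-varying field strength."""
    return [(1.0 + s) * math.sin(2 * math.pi * carrier_hz * i / sample_rate)
            for i, s in enumerate(command)]

# A constant "command" level of 0.5 modulated onto a 1 kHz carrier
# sampled at 8 kHz (hypothetical values).
field = am_modulate([0.5] * 8, carrier_hz=1000, sample_rate=8000)
```

The loudspeaker’s voice coil, sitting in this field, would experience a force that tracks the envelope, which is how the attacked speaker is made to reproduce the command.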
Links
Private Eye: On the Limits of Textual Screen Peeking via Eyeglass Reflections in Video Conferencing.
Authors
- Yan Long, Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, USA
- Chen Yan, College of Electrical Engineering, Zhejiang University, Hangzhou, China
- Shilin Xiao, College of Electrical Engineering, Zhejiang University, Hangzhou, China
- Shivan Prasad, Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, USA
- Wenyuan Xu, College of Electrical Engineering, Zhejiang University, Hangzhou, China
- Kevin Fu, Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, USA
Abstract
Personal video conferencing has become a new norm after COVID-19 caused a seismic shift from in-person meetings and phone calls to video conferencing for daily communication and sensitive business. Video leaks participants’ on-screen information because eyeglasses and other reflective objects unwittingly expose partial screen contents. Using mathematical modeling and human subjects experiments, this research explores the extent to which emerging webcams might leak recognizable textual and graphical information gleaned from eyeglass reflections captured by webcams. The primary goal of our work is to measure, compute, and predict the factors, limits, and thresholds of recognizability as webcam technology evolves. Our work explores and characterizes the viable threat models based on optical attacks using multi-frame super-resolution techniques on sequences of video frames. Our models and experimental results in a controlled lab setting show it is possible to reconstruct and recognize, with over 75% accuracy, on-screen texts as small as 10 mm in height with a 720p webcam. We further apply this threat model to web textual content with varying attacker capabilities to find thresholds at which text becomes recognizable. Our user study with 20 participants suggests present-day 720p webcams are sufficient for adversaries to reconstruct textual content on big-font websites. Our models further show that the evolution towards 4K cameras will tip the threshold of text leakage to the reconstruction of most header texts on popular websites. Beyond textual targets, a case study on recognizing a closed-world dataset of the Alexa top 100 websites with 720p webcams shows a maximum recognition accuracy of 94% with 10 participants, even without using machine-learning models. Our research proposes near-term mitigations, including a software prototype that users can apply to blur the eyeglass areas of their video streams. For possible long-term defenses, we advocate an individual reflection testing procedure to assess threats under various settings, and justify the importance of following the principle of least privilege in privacy-sensitive scenarios.
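The simplest building block behind multi-frame enhancement is averaging aligned frames of a static scene, which cancels random sensor noise while preserving the reflected content. A minimal sketch (illustrative only; real super-resolution also registers sub-pixel shifts between frames):

```python
def average_frames(frames):
    """Average pixel values across aligned video frames of a static
    scene -- random sensor noise averages out while the constant
    reflection content is preserved, raising the signal-to-noise ratio."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / n for c in range(w)]
            for r in range(h)]

# Two noisy 1x3 "frames" of the same static reflection (made-up values).
f1 = [[10.0, 20.0, 30.0]]
f2 = [[14.0, 18.0, 34.0]]
print(average_frames([f1, f2]))  # -> [[12.0, 19.0, 32.0]]
```

With k frames, uncorrelated noise shrinks roughly by a factor of sqrt(k), which is why sequences of video frames recover text that a single frame cannot.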
Links
Low-effort VR Headset User Authentication Using Head-reverberated Sounds with Replay Resistance.
Authors
- Ruxin Wang, School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, LA, USA
- Long Huang, School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, LA, USA
- Chen Wang, School of Electrical Engineering and Computer Science, Louisiana State University, Baton Rouge, LA, USA
Abstract
While Virtual Reality (VR) applications are becoming increasingly common, efficiently verifying a VR device user before granting personal access is still a challenge. Existing VR authentication methods require users to enter PINs or draw graphical passwords using controllers. Though the entry is performed in virtual space, it can be observed by others in proximity and is subject to critical security issues. Furthermore, in-air hand movements or handheld controller-based authentications require active user participation and are not time-efficient. This work proposes a low-effort VR device authentication system based on unique skull-reverberated sounds, which can be acquired when the user wears the VR device. Specifically, when the user puts on the VR device or is wearing it to log into an online account, the proposed system actively emits an ultrasonic signal to initiate the authentication session. The signal returning to the VR device’s microphone has been reverberated by the user’s head, which is unique in size, skull shape, and mass. We thus extract head biometric information from the received signal for unobtrusive VR device authentication. Though active acoustic sensing has been broadly used on mobile devices, no prior work has successfully applied such techniques to commodity VR devices. Because VR devices are designed to provide users with immersive virtual reality, the echo sounds used for active sensing are unwanted and severely suppressed, and the raw audio before this processing is not accessible without kernel/hardware modifications. Our work therefore further solves the challenge of active acoustic sensing under echo cancellation, enabling deployment of our system on off-the-shelf VR devices. Additionally, we show that the echo cancellation mechanism naturally helps prevent acoustic replay attacks. The proposed system is built on an autoencoder and a convolutional neural network for biometric data extraction and recognition. Experiments with a standalone VR headset and a mobile-phone-based VR headset show that our system efficiently verifies a user and is also replay-resistant.
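Active acoustic sensing of this kind typically begins by locating the reflected probe in the microphone recording. A minimal sketch (illustrative only, not the paper’s pipeline; the probe and recording samples are made up) finds the echo delay by cross-correlating the emitted signal with the recording:

```python
def echo_delay(probe, recording):
    """Return the sample offset at which the probe signal best aligns
    with the recording (peak of the sliding dot product) -- a crude way
    to locate a head-reverberated echo before feature extraction."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(recording) - len(probe) + 1):
        score = sum(p * recording[lag + i] for i, p in enumerate(probe))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

probe = [1.0, -1.0, 1.0]
recording = [0.0, 0.0, 0.0, 0.0, 0.9, -0.8, 1.1, 0.0]
print(echo_delay(probe, recording))  # -> 4 (echo starts at sample 4)
```

Once the echo is located, its shape (distorted by head size, skull geometry, and mass) would be the input to biometric feature extraction.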