Shang-wei Lin | Nanyang Technological University
Papers by Shang-wei Lin
arXiv (Cornell University), Jul 18, 2022
With the rapidly increasing number of open source software (OSS) projects, the majority of software vulnerabilities in open source components are fixed silently, which leaves the deployed software that integrates them unable to get a timely update. Hence, it is critical to design a security patch identification system to ensure the security of the utilized software. However, most existing works on security patch identification treat the changed code and the commit message of a commit as a flat sequence of tokens and use simple neural networks to learn its semantics, while the structure information is ignored. To address these limitations, in this paper, we propose E-SPI, which extracts the structure information hidden in a commit for effective identification. Specifically, it consists of a code change encoder, which extracts the syntactic structure of the changed code and uses a BiLSTM to learn the code representation, and a message encoder, which constructs the dependency graph of the commit message and uses a graph neural network (GNN) to learn the message representation. We further enhance the code change encoder by embedding contextual information related to the changed code. To demonstrate the effectiveness of our approach, we conduct extensive experiments against six state-of-the-art approaches on an existing dataset and in a real deployment environment. The experimental results confirm that our approach can significantly outperform current state-of-the-art baselines.
CRC Press eBooks, Oct 19, 2012
Currently available application frameworks that target the automatic design of real-time embedded software are poor at integrating functional and non-functional requirements for mobile and ubiquitous systems. In this work, we present the internal architecture and design flow of a newly proposed framework called Verifiable Embedded Real-Time Application Framework (VERTAF), which integrates three techniques, namely software component-based reuse, formal synthesis, and formal verification. Component reuse is based on a formal unified modeling language (UML) real-time embedded object model. Formal synthesis employs quasi-static and quasi-dynamic scheduling with multi-layer portable efficient code generation, which can output either real-time operating system (RTOS)-specific application code or an automatically generated real-time executive with application code. Formal verification integrates a model checker kernel from state graph manipulators (SGM), adapting it for embedded software. The proposed architecture for VERTAF is component-based, which allows plug-and-play for the scheduler and the verifier. The architecture is also easily extensible because reusable hardware and software design components can be added. Application examples developed using VERTAF demonstrate significantly reduced relative design effort compared to design without VERTAF, which also shows how high-level reuse of software components combined with automatic synthesis and verification increases design productivity.
arXiv (Cornell University), Jul 14, 2020
Image denoising can remove natural noise that widely exists in images captured by multimedia devices due to low-quality imaging sensors, unstable image transmission processes, or low light conditions. Recent works also find that image denoising benefits high-level vision tasks, e.g., image classification. In this work, we challenge this common belief and explore a new problem, i.e., whether image denoising can be given the capability of fooling state-of-the-art deep neural networks (DNNs) while enhancing the image quality. To this end, we make the first attempt to study this problem from the perspective of adversarial attack and propose the adversarial denoise attack. More specifically, our main contributions are three-fold: First, we identify a new task that stealthily embeds attacks inside the image denoising module widely deployed in multimedia devices as an image postprocessing operation to simultaneously enhance the visual image quality and fool DNNs. Second, we formulate this new task as a kernel prediction problem for image filtering and propose the adversarial-denoising kernel prediction that can produce adversarial-noiseless kernels for effective denoising and adversarial attacking simultaneously. Third, we implement an adaptive perceptual region localization to identify semantic-related vulnerability regions with which the attack can be more effective while not doing too much harm to the denoising. We name the proposed method Pasadena (Perceptually Aware and Stealthy Adversarial DENoise Attack) and validate our method on the NeurIPS'17 adversarial competition dataset, CVPR2021-AIC-VI: unrestricted adversarial attacks on ImageNet, and the Tiny-ImageNet-C dataset. The comprehensive evaluation and analysis demonstrate that our method not only realizes denoising but also achieves a significantly higher success rate and transferability than state-of-the-art attacks.
Lecture Notes in Computer Science, 2004
Currently available application frameworks that target the automatic design of real-time embedded software are poor at integrating functional and non-functional requirements for mobile and ubiquitous systems. In this work, we present the internal architecture and design flow of a newly proposed framework called Verifiable Embedded Real-Time Application Framework (VERTAF), which integrates three techniques, namely software component-based reuse, formal synthesis, and formal verification. Component reuse is based on a formal unified modeling language (UML) real-time embedded object model. Formal synthesis employs quasi-static and quasi-dynamic scheduling with multi-layer portable efficient code generation, which can output either real-time operating system (RTOS)-specific application code or an automatically generated real-time executive with application code. Formal verification integrates a model checker kernel from state graph manipulators (SGM), adapting it for embedded software. The proposed architecture for VERTAF is component-based, which allows plug-and-play for the scheduler and the verifier. The architecture is also easily extensible because reusable hardware and software design components can be added. Application examples developed using VERTAF demonstrate significantly reduced relative design effort compared to design without VERTAF, which also shows how high-level reuse of software components combined with automatic synthesis and verification increases design productivity.
Lecture Notes in Computer Science, 2024
Boolean satisfiability (SAT) solving is a fundamental problem in computer science. Finding efficient algorithms for SAT solving has broad implications in many areas of computer science and beyond. Quantum SAT solvers based on Grover's algorithm have been proposed in the literature. Although existing quantum SAT solvers can consider all possible inputs at once, they evaluate the clauses in the formula one by one, sequentially, making the time complexity per Grover iteration O(m), linear in the number of clauses m. In this work, we develop a parallel quantum SAT solver, which reduces the time complexity of each iteration to constant time O(1) by utilising extra entangled qubits. To further improve the scalability of our solution for extremely large problems, we develop a distributed version of the proposed parallel SAT solver based on quantum teleportation, such that the total qubits required are shared and distributed among a set of quantum computers (nodes), and the quantum SAT solving is accomplished collaboratively by all the nodes. We prove the correctness of our approaches and evaluate them in simulations and on real quantum computers.
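To make the O(m) versus O(1) per-iteration claim in the abstract above concrete, the following is a rough back-of-the-envelope sketch, not the paper's solver. It uses the standard estimate of about floor(pi/4 * sqrt(2^n / k)) Grover iterations for n variables and k satisfying assignments, and counts clause evaluation in abstract "steps" (m per iteration when sequential, 1 when parallel). The function names and the unit-cost model are assumptions made for illustration only.

```python
import math

def grover_iterations(num_vars: int, num_solutions: int = 1) -> int:
    """Standard estimate of the number of Grover iterations for a search
    space of 2**num_vars assignments with num_solutions marked items."""
    search_space = 2 ** num_vars
    return math.floor(math.pi / 4 * math.sqrt(search_space / num_solutions))

def clause_steps(num_vars: int, num_clauses: int, parallel: bool) -> int:
    """Total clause-evaluation steps under a hypothetical unit-cost model:
    m steps per iteration if clauses are checked one by one,
    1 step per iteration if all clauses are checked at once."""
    per_iteration = 1 if parallel else num_clauses
    return grover_iterations(num_vars) * per_iteration

if __name__ == "__main__":
    n, m = 4, 10  # hypothetical formula: 4 variables, 10 clauses
    print("iterations:", grover_iterations(n))
    print("sequential steps:", clause_steps(n, m, parallel=False))
    print("parallel steps:", clause_steps(n, m, parallel=True))
```

Under this simplified model the number of iterations is unchanged; only the cost of each iteration drops, at the price of the extra qubits the abstract mentions.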
arXiv (Cornell University), Aug 6, 2020
A smart contract is a computer program which allows users to automate their actions on the blockchain platform. Given the significance of smart contracts in supporting important activities across industry sectors including supply chain, finance, legal and medical services, there is a strong demand for verification and validation techniques. Yet, the vast majority of smart contracts lack any kind of formal specification, which is essential for establishing their correctness. In this survey, we investigate formal models and specifications of smart contracts presented in the literature and present a systematic overview in order to understand the common trends. We also discuss the current approaches used in verifying such property specifications and identify gaps, in the hope of recognizing promising directions for future work.
arXiv (Cornell University), Feb 28, 2021
Decentralized finance (DeFi) has become one of the most successful applications of blockchain and smart contracts. The DeFi ecosystem enables a wide range of crypto-financial activities, while the underlying smart contracts often contain bugs, with many vulnerabilities arising from the unforeseen consequences of composing DeFi protocols together. In this paper, we propose a formal process-algebraic technique that models DeFi protocols in a compositional manner to allow for efficient property verification. We also conduct a case study to demonstrate the proposed approach in analyzing the composition of two interacting DeFi protocols, namely, Curve and Compound. Finally, we discuss how the proposed modeling and verification approach can be used to analyze financial and security properties of interest.
arXiv (Cornell University), Sep 14, 2019
Despite the high stakes involved in smart contracts, they are often developed in an undisciplined manner, leaving the security and reliability of blockchain transactions at risk. In this paper, we introduce ContraMaster, an oracle-supported dynamic exploit generation framework for smart contracts. Existing approaches mutate only single transactions; ContraMaster goes beyond these by mutating transaction sequences. ContraMaster uses data-flow, control-flow, and the dynamic contract state to guide its mutations. It then monitors the executions of target contract programs, and validates the results against a general-purpose semantic test oracle to discover vulnerabilities. Being a dynamic technique, it guarantees that each discovered vulnerability is a violation of the test oracle and is able to generate the attack script to exploit this vulnerability. In contrast to rule-based approaches, ContraMaster has not shown any false positives, and it easily generalizes to unknown types of vulnerabilities (e.g., logic errors). We evaluate ContraMaster on 218 vulnerable smart contracts. The experimental results confirm its practical applicability and advantages over the state-of-the-art techniques, and also reveal three new types of attacks.
Rust is an emerging systems programming language that emphasizes memory safety through its Ownership and Borrowing System (OBS). The existing formal semantics for Rust cover only limited subsets of the major language features of Rust. Moreover, they formalize OBS as type systems at the language level, which can only be used to conservatively analyze programs against the OBS invariants at compile time. That is, they are not executable, and thus cannot be used for automated verification of runtime behavior. In this paper, we propose RustSEM, a new executable operational semantics for Rust. RustSEM covers a much larger subset of the major language features than existing semantics. Moreover, RustSEM provides an operational semantics for OBS at the memory level, which can be used to verify the runtime behavior of Rust programs against the OBS invariants. We have implemented RustSEM in the executable semantics modeling tool K-Framework. We have evaluated the semantics correctness of RustSEM wrt....
37th IEEE/ACM International Conference on Automated Software Engineering
Programming errors enable security attacks on smart contracts, which are used to manage large sums of financial assets. Automated program repair (APR) techniques aim to reduce developers' burden of manually fixing bugs by automatically generating patches for a given issue. Existing APR tools for smart contracts focus on mitigating typical smart contract vulnerabilities rather than violations of functional specifications. However, in decentralized financial (DeFi) smart contracts, an inconsistency between intended behavior and implementation translates into a deviation from the underlying financial model, resulting in monetary losses for the application and its users. In this work, we propose DeFinery, a technique for automated repair of a smart contract that does not satisfy a user-defined correctness property. To explore a larger set of diverse patches while providing formal correctness guarantees w.r.t. the intended behavior, we combine search-based patch generation with semantic analysis of the original program for inferring its specification. Our experiments in repairing 9 real-world and benchmark smart contracts show that DeFinery efficiently generates high-quality patches that cannot be found by other existing tools.
Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Most existing smart contract symbolic execution tools perform analysis on bytecode, which loses the high-level semantic information present in source code. This makes interactive analysis tasks, such as visualization and debugging, extremely challenging, and significantly limits tool usability. In this paper, we present SolSEE, a source-level symbolic execution engine for Solidity smart contracts. We describe the design of SolSEE, highlight its key features, and demonstrate its usage through a Web-based user interface. SolSEE demonstrates advantages over other existing source-level analysis tools in the advanced Solidity language features it supports and in analysis flexibility. A demonstration video is available at: https://sites.google.com/view/solsee/.
Frontiers in Artificial Intelligence and Applications
Exploring the underlying structure of a Human-Machine Interface (HMI) product effectively while adhering to the pre-defined test conditions and methodology is critical for validating the quality of the software. We propose a reinforcement-learning-powered Automated Software Structure Exploration Framework for Testing (ASSET), which is capable of interacting with and analyzing the HMI software under testing (SUT). The main challenge is to incorporate human instructions into the ASSET phase by using visual feedback, such as the image sequence downloaded from the HMI, which can be difficult to analyze. Our framework combines computer vision and natural language processing techniques to understand the semantic meaning of the visual feedback. Building on the semantic understanding, we develop a rules-guided software exploration algorithm via reinforcement learning and deterministic finite automaton (DFA). We conducted experiments on HMI software in actual production phase...
IEEE Transactions on Dependable and Secure Computing
Computers & Security
The widespread adoption of smart contracts demands strong security guarantees. Our work is motivated by the problem of statically checking potential information tampering in smart contracts. This paper presents a security type verification framework for smart contracts based on type systems. We introduce a formal calculus for reasoning about smart contract operations and interactions and design a lightweight type system for checking secure information flow in Solidity (a popular high-level programming language for writing smart contracts). The soundness of our type system is proved w.r.t. non-interference. In addition, a type verifier based on our type system is proposed to help users automatically find an optimal secure type assignment for state variables, which makes contracts well-typed. We also prove that finding the optimal secure type assignment is an NP-complete problem. We develop a prototype implementation of the Solidity Type Verifier (STV), including the Solidity Type Checker (STC), based on the K-framework, and demonstrate its effectiveness on real-world smart contracts.
Quantum Technologies 2022
2020 25th International Conference on Engineering of Complex Computer Systems (ICECCS), 2020
Rely-Guarantee is a comprehensive technique that supports compositional reasoning for concurrent programs. However, specifications of the Rely condition (environment interference) and the Guarantee condition (local transformation of thread state) are challenging to establish. Thus the construction of these conditions becomes a bottleneck in automating the technique. To tackle this problem, we propose a verification framework that, based on Rely-Guarantee principles, constructs the correctness proof of a concurrent program by inferring suitable Rely-Guarantee conditions automatically. Our framework first constructs a Hoare-style sequential proof for each thread and then applies abstraction refinement to elevate these proofs into concurrent ones with appropriate Rely-Guarantee relations. Experimental results demonstrate that our approach is efficient in proving the correctness of concurrent programs.
Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, 2017
Coverage-based fuzzing is one of the most effective techniques to find vulnerabilities, bugs or crashes. However, existing techniques have difficulty exercising the paths that are protected by magic bytes comparisons (e.g., string equality comparisons). Several approaches have been proposed that use heavyweight program analysis to break through magic bytes comparisons, and hence are less scalable. In this paper, we propose a program-state based binary fuzzing approach, named Steelix, which improves the penetration power of a fuzzer at the cost of an acceptable slowdown in execution speed. In particular, we use lightweight static analysis and binary instrumentation to provide not only coverage information but also comparison progress information to a fuzzer. Such program state information informs a fuzzer about where the magic bytes are located in the test input and how to perform mutations to match the magic bytes efficiently. We have implemented Steelix and evaluated it on three datasets: the LAVA-M dataset, DARPA CGC sample binaries and five real-life programs. The results show that Steelix has better code coverage and bug detection capability than the state-of-the-art fuzzers. Moreover, we found one CVE and nine new bugs.
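The idea of feeding comparison progress back to the fuzzer, as described in the abstract above, can be illustrated with a toy sketch. This is not Steelix itself: it works on an in-memory byte string rather than an instrumented binary, and the magic value `MAGIC`, the function names, and the mutation strategy are all hypothetical choices made for illustration.

```python
import random

MAGIC = b"MAGIC"  # hypothetical magic bytes guarding a deep path

def comparison_progress(data: bytes, magic: bytes = MAGIC) -> int:
    """Return how many leading bytes of `data` already match `magic`."""
    progress = 0
    for got, want in zip(data, magic):
        if got != want:
            break
        progress += 1
    return progress

def fuzz(seed: bytes, rounds: int = 20000, rng_seed: int = 0) -> bytes:
    """Mutate the input, keeping a mutation only if it matches more magic bytes."""
    rng = random.Random(rng_seed)
    # Pad short seeds so every magic-byte position exists in the input.
    best = bytearray(seed.ljust(len(MAGIC), b"\x00"))
    best_progress = comparison_progress(bytes(best))
    for _ in range(rounds):
        if best_progress == len(MAGIC):
            break  # the whole comparison now succeeds
        candidate = bytearray(best)
        # Progress feedback tells us which byte to mutate next:
        # the first position that does not match yet.
        candidate[best_progress] = rng.randrange(256)
        progress = comparison_progress(bytes(candidate))
        if progress > best_progress:
            best, best_progress = candidate, progress
    return bytes(best)

if __name__ == "__main__":
    print(fuzz(b"\x00" * 5))
```

With progress feedback each byte is solved independently, so the search needs on the order of 256 tries per byte; a coverage-only fuzzer that sees just pass or fail would need to guess all five bytes at once.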
2007 International Conference on Parallel and Distributed Systems, 2007
Embedded systems have pervaded every aspect of our daily lives; however, their design and verification are often accomplished using ad hoc and trial-and-error methods. Courses introducing systematic and more formal methods are required. However, there is currently little consensus on what a standard syllabus for an undergraduate course on embedded software design should cover. This paper proposes a course design that has undergone thorough experimentation and evaluation over the last four years in actual classes. The course starts from the ARM instruction set architecture and concludes with an introduction to Java-based wireless application design. The design of standalone as well as RTOS-based embedded software is introduced. The course has produced embedded software engineers who contribute significantly to the technical industry in Taiwan, spanning from handheld devices to home appliances and from networked systems to personal computer accessories. We hope the proposed curriculum becomes a standard effort at training embedded software engineers in both theory and practice.
IEEE Transactions on Dependable and Secure Computing, 2020
Despite the high stakes involved in smart contracts, they are often developed in an undisciplined manner, leaving the security and reliability of blockchain transactions at risk. In this paper, we introduce ContraMaster, an oracle-supported dynamic exploit generation framework for smart contracts. Existing approaches mutate only single transactions; ContraMaster goes beyond these by mutating transaction sequences. ContraMaster uses data-flow, control-flow, and the dynamic contract state to guide its mutations. It then monitors the executions of target contract programs, and validates the results against a general-purpose semantic test oracle to discover vulnerabilities. Being a dynamic technique, it guarantees that each discovered vulnerability is a violation of the test oracle and is able to generate the attack script to exploit this vulnerability. In contrast to rule-based approaches, ContraMaster has not shown any false positives, and it easily generalizes to unknown types of vulnerabilities (e.g., logic errors). We evaluate ContraMaster on 218 vulnerable smart contracts. The experimental results confirm its practical applicability and advantages over the state-of-the-art techniques, and also reveal three new types of attacks.
arXiv (Cornell University), Jul 18, 2022
With the rapid increasing number of open source software (OSS), the majority of the software vuln... more With the rapid increasing number of open source software (OSS), the majority of the software vulnerabilities in the open source components are fixed silently, which leads to the deployed software that integrated them being unable to get a timely update. Hence, it is critical to design a security patch identification system to ensure the security of the utilized software. However, most of the existing works for security patch identification just consider the changed code and the commit message of a commit as a flat sequence of tokens with simple neural networks to learn its semantics, while the structure information is ignored. To address these limitations, in this paper, we propose our well-designed approach E-SPI, which extracts the structure information hidden in a commit for effective identification. Specifically, it consists of the code change encoder to extract the syntactic of the changed code with the BiLSTM to learn the code representation and the message encoder to construct the dependency graph for the commit message with the graph neural network (GNN) to learn the message representation. We further enhance the code change encoder by embedding contextual information related to the changed code. To demonstrate the effectiveness of our approach, we conduct the extensive experiments against six state-of-the-art approaches on the existing dataset and from the real deployment environment. The experimental results confirm that our approach can significantly outperform current state-of-the-art baselines.
CRC Press eBooks, Oct 19, 2012
Currently available application frameworks that target the automatic design of real-time embedded... more Currently available application frameworks that target the automatic design of real-time embedded software are poor in integrating functional and non-functional requirements for mobile and ubiquitous systems. In this work, we present the internal architecture and design flow of a newly proposed framework called Verifiable Embedded Real-Time Application Framework (VERTAF), which integrates three techniques namely software component-based reuse, formal synthesis, and formal verification. Component reuse is based on a formal unified modeling language (UML) real-time embedded object model. Formal synthesis employs quasi-static and quasi-dynamic scheduling with multi-layer portable efficient code generation, which can output either real-time operating systems (RTOS)-specific application code or automatically generated real-time executive with application code. Formal verification integrates a model checker kernel from state graph manipulators (SGM), by adapting it for embedded software. The proposed architecture for VERTAF is component-based which allows plug-and-play for the scheduler and the verifier. The architecture is also easily extensible because reusable hardware and software design components can be added. Application examples developed using VERTAF demonstrate significantly reduced relative design effort as compared to design without VERTAF, which also shows how high-level reuse of software components combined with automatic synthesis and verification increases design productivity.
arXiv (Cornell University), Jul 14, 2020
Image denoising can remove natural noise that widely exists in images captured by multimedia devi... more Image denoising can remove natural noise that widely exists in images captured by multimedia devices due to low-quality imaging sensors, unstable image transmission processes, or low light conditions. Recent works also find that image denoising benefits the high-level vision tasks, e.g., image classification. In this work, we try to challenge this common sense and explore a totally new problem, i.e., whether the image denoising can be given the capability of fooling the state-of-theart deep neural networks (DNNs) while enhancing the image quality. To this end, we initiate the very first attempt to study this problem from the perspective of adversarial attack and propose the adversarial denoise attack. More specifically, our main contributions are three-fold: First, we identify a new task that stealthily embeds attacks inside the image denoising module widely deployed in multimedia devices as an image postprocessing operation to simultaneously enhance the visual image quality and fool DNNs. Second, we formulate this new task as a kernel prediction problem for image filtering and propose the adversarial-denoising kernel prediction that can produce adversarial-noiseless kernels for effective denoising and adversarial attacking simultaneously. Third, we implement an adaptive perceptual region localization to identify semantic-related vulnerability regions with which the attack can be more effective while not doing too much harm to the denoising. We name the proposed method as Pasadena (Perceptually Aware and Stealthy Adversarial DENoise Attack) and validate our method on the NeurIPS'17 adversarial competition dataset, CVPR2021-AIC-VI: unrestricted adversarial attacks on Ima-geNet, and Tiny-ImageNet-C dataset. 
The comprehensive evaluation and analysis demonstrate that our method not only realizes denoising but also achieves a significantly higher success rate and transferability over state-of-the-art attacks.
Lecture Notes in Computer Science, 2004
Currently available application frameworks that target the automatic design of real-time embedded... more Currently available application frameworks that target the automatic design of real-time embedded software are poor in integrating functional and non-functional requirements for mobile and ubiquitous systems. In this work, we present the internal architecture and design flow of a newly proposed framework called Verifiable Embedded Real-Time Application Framework (VERTAF), which integrates three techniques namely software component-based reuse, formal synthesis, and formal verification. Component reuse is based on a formal unified modeling language (UML) real-time embedded object model. Formal synthesis employs quasi-static and quasi-dynamic scheduling with multi-layer portable efficient code generation, which can output either real-time operating systems (RTOS)-specific application code or automatically generated real-time executive with application code. Formal verification integrates a model checker kernel from state graph manipulators (SGM), by adapting it for embedded software. The proposed architecture for VERTAF is component-based which allows plug-and-play for the scheduler and the verifier. The architecture is also easily extensible because reusable hardware and software design components can be added. Application examples developed using VERTAF demonstrate significantly reduced relative design effort as compared to design without VERTAF, which also shows how high-level reuse of software components combined with automatic synthesis and verification increases design productivity.
Lecture notes in computer science, 2024
Boolean satisfiability (SAT) solving is a fundamental problem in computer science. Finding effici... more Boolean satisfiability (SAT) solving is a fundamental problem in computer science. Finding efficient algorithms for SAT solving has broad implications in many areas of computer science and beyond. Quantum SAT solvers have been proposed in the literature based on Grover's algorithm. Although existing quantum SAT solvers can consider all possible inputs at once, they evaluate each clause in the formula one by one sequentially, making the time complexity O(m), linear to the number of clauses m, per Grover iteration. In this work, we develop a parallel quantum SAT solver, which reduces the time complexity in each iteration to constant time O(1) by utilising extra entangled qubits. To further improve the scalability of our solution in case of extremely large problems, we develop a distributed version of the proposed parallel SAT solver based on quantum teleportation such that the total qubits required are shared and distributed among a set of quantum computers (nodes), and the quantum SAT solving is accomplished collaboratively by all the nodes. We prove the correctness of our approaches and evaluate them in simulations and real quantum computers.
arXiv (Cornell University), Aug 6, 2020
A smart contract is a computer program which allows users to automate their actions on the blockc... more A smart contract is a computer program which allows users to automate their actions on the blockchain platform. Given the significance of smart contracts in supporting important activities across industry sectors including supply chain, finance, legal and medical services, there is a strong demand for verification and validation techniques. Yet, the vast majority of smart contracts lack any kind of formal specification, which is essential for establishing their correctness. In this survey, we investigate formal models and specifications of smart contracts presented in the literature and present a systematic overview in order to understand the common trends. We also discuss the current approaches used in verifying such property specifications and identify gaps with the hope to recognize promising directions for future work.
arXiv (Cornell University), Feb 28, 2021
Decentralized finance (DeFi) has become one of the most successful applications of blockchain and smart contracts. The DeFi ecosystem enables a wide range of crypto-financial activities, while the underlying smart contracts often contain bugs, with many vulnerabilities arising from the unforeseen consequences of composing DeFi protocols together. In this paper, we propose a formal process-algebraic technique that models DeFi protocols in a compositional manner to allow for efficient property verification. We also conduct a case study to demonstrate the proposed approach in analyzing the composition of two interacting DeFi protocols, namely, Curve and Compound. Finally, we discuss how the proposed modeling and verification approach can be used to analyze financial and security properties of interest.
arXiv (Cornell University), Sep 14, 2019
Despite the high stakes involved in smart contracts, they are often developed in an undisciplined manner, leaving the security and reliability of blockchain transactions at risk. In this paper, we introduce ContraMaster, an oracle-supported dynamic exploit generation framework for smart contracts. Existing approaches mutate only single transactions; ContraMaster goes beyond these by mutating transaction sequences. ContraMaster uses data-flow, control-flow, and the dynamic contract state to guide its mutations. It then monitors the executions of target contract programs, and validates the results against a general-purpose semantic test oracle to discover vulnerabilities. Being a dynamic technique, it guarantees that each discovered vulnerability is a violation of the test oracle and is able to generate the attack script to exploit this vulnerability. In contrast to rule-based approaches, ContraMaster has not shown any false positives, and it easily generalizes to unknown types of vulnerabilities (e.g., logic errors). We evaluate ContraMaster on 218 vulnerable smart contracts. The experimental results confirm its practical applicability and advantages over the state-of-the-art techniques, and also reveal three new types of attacks.
Rust is an emergent systems programming language highlighting memory safety by its Ownership and Borrowing System (OBS). The existing formal semantics for Rust cover only limited subsets of the major language features of Rust. Moreover, they formalize OBS as type systems at the language level, which can only be used to conservatively analyze programs against the OBS invariants at compile time. That is, they are not executable, and thus cannot be used for automated verification of runtime behavior. In this paper, we propose RustSEM, a new executable operational semantics for Rust. RustSEM covers a much larger subset of the major language features than existing semantics. Moreover, RustSEM provides an operational semantics for OBS at the memory level, which can be used to verify the runtime behavior of Rust programs against the OBS invariants. We have implemented RustSEM in the executable semantics modeling tool K-Framework. We have evaluated the semantics correctness of RustSEM wrt....
37th IEEE/ACM International Conference on Automated Software Engineering
Programming errors enable security attacks on smart contracts, which are used to manage large sums of financial assets. Automated program repair (APR) techniques aim to reduce developers' burden of manually fixing bugs by automatically generating patches for a given issue. Existing APR tools for smart contracts focus on mitigating typical smart contract vulnerabilities rather than violations of functional specification. However, in decentralized financial (DeFi) smart contracts, the inconsistency between intended behavior and implementation translates into a deviation from the underlying financial model, resulting in monetary losses for the application and its users. In this work, we propose DeFinery, a technique for automated repair of a smart contract that does not satisfy a user-defined correctness property. To explore a larger set of diverse patches while providing formal correctness guarantees w.r.t. the intended behavior, we combine search-based patch generation with semantic analysis of an original program for inferring its specification. Our experiments in repairing 9 real-world and benchmark smart contracts demonstrate that DeFinery efficiently generates high-quality patches that cannot be found by other existing tools.
Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering
Most of the existing smart contract symbolic execution tools perform analysis on bytecode, which loses the high-level semantic information present in the source code. This makes interactive analysis tasks, such as visualization and debugging, extremely challenging, and significantly limits tool usability. In this paper, we present SolSEE, a source-level symbolic execution engine for Solidity smart contracts. We describe the design of SolSEE, highlight its key features, and demonstrate its usage through a Web-based user interface. SolSEE demonstrates advantages over other existing source-level analysis tools in the advanced Solidity language features it supports and in its analysis flexibility. A demonstration video is available at: https://sites.google.com/view/solsee/.
Frontiers in Artificial Intelligence and Applications
Exploring the underlying structure of a Human-Machine Interface (HMI) product effectively while adhering to the pre-defined test conditions and methodology is critical for validating the quality of the software. We propose a reinforcement-learning-powered Automated Software Structure Exploration Framework for Testing (ASSET), which is capable of interacting with and analyzing the HMI software under test (SUT). The main challenge is to incorporate human instructions into the ASSET phase by using visual feedback, such as the image sequence downloaded from the HMI, which can be difficult to analyze. Our framework combines computer vision and natural language processing techniques to understand the semantic meaning of the visual feedback. Building on this semantic understanding, we develop a rules-guided software exploration algorithm via reinforcement learning and a deterministic finite automaton (DFA). We conducted experiments on HMI software in actual production phase...
IEEE Transactions on Dependable and Secure Computing
Computers & Security
The widespread adoption of smart contracts demands strong security guarantees. Our work is motivated by the problem of statically checking potential information tampering in smart contracts. This paper presents a security type verification framework for smart contracts based on type systems. We introduce a formal calculus for reasoning about smart contract operations and interactions and design a lightweight type system for checking secure information flow in Solidity (a popular high-level programming language for writing smart contracts). The soundness of our type system is proved w.r.t. non-interference. In addition, a type verifier based on our type system is proposed to help users automatically find an optimal secure type assignment for state variables, which makes contracts well-typed. We also prove that finding the optimal secure type assignment is an NP-complete problem. We develop a prototype implementation of the Solidity Type Verifier (STV), including the Solidity Type Checker (STC), based on the K-framework, and demonstrate its effectiveness on real-world smart contracts.
Quantum Technologies 2022
2020 25th International Conference on Engineering of Complex Computer Systems (ICECCS), 2020
Rely-Guarantee is a comprehensive technique that supports compositional reasoning for concurrent programs. However, specifications of the Rely condition (environment interference) and the Guarantee condition (local transformation of thread state) are challenging to establish. Thus the construction of these conditions becomes a bottleneck in automating the technique. To tackle this problem, we propose a verification framework that, based on Rely-Guarantee principles, constructs the correctness proof of a concurrent program by inferring suitable Rely-Guarantee conditions automatically. Our framework first constructs a Hoare-style sequential proof for each thread and then applies abstraction refinement to elevate these proofs into concurrent ones with appropriate Rely-Guarantee relations. Experimental results demonstrate that our approach is efficient in proving the correctness of concurrent programs.
Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, 2017
Coverage-based fuzzing is one of the most effective techniques to find vulnerabilities, bugs or crashes. However, existing techniques have difficulty exercising the paths that are protected by magic bytes comparisons (e.g., string equality comparisons). Several approaches use heavyweight program analysis to break through magic bytes comparisons, and are hence less scalable. In this paper, we propose a program-state based binary fuzzing approach, named Steelix, which improves the penetration power of a fuzzer at the cost of an acceptable slowdown of the execution speed. In particular, we use lightweight static analysis and binary instrumentation to provide not only coverage information but also comparison progress information to a fuzzer. Such program state information informs a fuzzer about where the magic bytes are located in the test input and how to perform mutations to match the magic bytes efficiently. We have implemented Steelix and evaluated it on three datasets: the LAVA-M dataset, DARPA CGC sample binaries and five real-life programs. The results show that Steelix has better code coverage and bug detection capability than the state-of-the-art fuzzers. Moreover, we found one CVE and nine new bugs.
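The idea of comparison progress feedback described in this abstract can be illustrated with a toy sketch. It is not Steelix's implementation: the functions `comparison_progress` and `guided_mutation` are hypothetical, the "instrumentation" is simulated by a plain Python function, and the offset of the compared bytes is assumed to be known, whereas the abstract says the tool infers the location from program state.

```python
def comparison_progress(data: bytes, offset: int, magic: bytes) -> int:
    """Simulated instrumentation: how many leading magic bytes match at offset."""
    matched = 0
    for i, expected in enumerate(magic):
        pos = offset + i
        if pos >= len(data) or data[pos] != expected:
            break
        matched += 1
    return matched


def guided_mutation(data: bytes, offset: int, magic: bytes):
    """Mutate only the first unmatched byte, keeping any value that raises progress.

    The loop looks only at the progress count, never at the magic bytes directly.
    Returns the final input and the number of simulated executions.
    """
    current = bytearray(data)
    executions = 0
    progress = comparison_progress(bytes(current), offset, magic)
    while progress < len(magic):
        pos = offset + progress
        if pos >= len(current):  # grow the input if the comparison runs past its end
            current.extend(b"\x00" * (pos - len(current) + 1))
        for candidate in range(256):
            executions += 1
            current[pos] = candidate
            new_progress = comparison_progress(bytes(current), offset, magic)
            if new_progress > progress:
                progress = new_progress
                break
    return bytes(current), executions
```

With progress feedback, this sketch needs at most 256 executions per magic byte, whereas blind random mutation of a 4-byte sequence faces a search space of 256**4 values.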
2007 International Conference on Parallel and Distributed Systems, 2007
Embedded systems have pervaded every aspect of our daily lives; however, their design and verification are often accomplished using ad hoc and trial-and-error methods. Courses introducing systematic and more formal methods are required. However, there is currently little consensus on what a standard syllabus for an undergraduate course on embedded software design should cover. This paper proposes a course design that has undergone thorough experimentation and evaluation over the last four years in actual classes. The course starts from the ARM instruction set architecture and concludes with an introduction to Java-based wireless application design. The design of both standalone and RTOS-based embedded software is introduced. The course has culminated in a generation of embedded software engineers who contribute significantly to the technical industry in Taiwan, spanning from handheld devices to home appliances and from networked systems to personal computer accessories. We hope the proposed curriculum becomes a standard effort at training embedded software engineers in both theory and practice.
IEEE Transactions on Dependable and Secure Computing, 2020
Despite the high stakes involved in smart contracts, they are often developed in an undisciplined manner, leaving the security and reliability of blockchain transactions at risk. In this paper, we introduce ContraMaster, an oracle-supported dynamic exploit generation framework for smart contracts. Existing approaches mutate only single transactions; ContraMaster goes beyond these by mutating transaction sequences. ContraMaster uses data-flow, control-flow, and the dynamic contract state to guide its mutations. It then monitors the executions of target contract programs, and validates the results against a general-purpose semantic test oracle to discover vulnerabilities. Being a dynamic technique, it guarantees that each discovered vulnerability is a violation of the test oracle and is able to generate the attack script to exploit this vulnerability. In contrast to rule-based approaches, ContraMaster has not shown any false positives, and it easily generalizes to unknown types of vulnerabilities (e.g., logic errors). We evaluate ContraMaster on 218 vulnerable smart contracts. The experimental results confirm its practical applicability and advantages over the state-of-the-art techniques, and also reveal three new types of attacks.