Lap Chung Lam - Academia.edu

Papers by Lap Chung Lam

Checking Array Bound Violation Using Segmentation Hardware

The ability to check memory references against their associated array/buffer bounds helps programmers detect programming errors involving address overruns early on and thus avoid many difficult bugs down the line. This paper proposes a novel approach to the array bound checking problem, called Cash, that exploits the segmentation feature in the virtual memory hardware of the X86 architecture. The Cash approach allocates a separate segment to each static array or dynamically allocated buffer, and generates the instructions for array references in such a way that the segment limit check in X86's virtual memory protection mechanism performs the necessary array bound checking for free. In cases where hardware bound checking is not possible, it falls back to software bound checking. As a result, Cash does not need to pay per-reference software checking overhead in most cases. However, the Cash approach incurs a fixed setup overhead for each use of an array, which may involve multiple array references. This overhead requires compiler writers to apply the proposed technique judiciously to minimize the performance cost of array bound checking. This paper presents the detailed design and implementation of the Cash compiler, and a comprehensive evaluation of various performance tradeoffs associated with the proposed array bound checking technique. For the set of complicated network applications we tested, including Apache, Sendmail, Bind, etc., the latency penalty of Cash's bound checking mechanism is between 2.5% and 9.8% when compared with the baseline case that does not perform any bound checking.
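The segment-per-array idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the Cash compiler or real X86 behavior: the `Segment` class, the `translate` function, and the addresses are hypothetical, and the limit is simplified to the array size in bytes.

```python
from dataclasses import dataclass


class SegmentLimitFault(Exception):
    """Stands in for the hardware fault raised on a segment limit violation."""


@dataclass
class Segment:
    base: int   # linear address where the array starts (hypothetical)
    limit: int  # simplified limit: size of the array in bytes


def translate(seg: Segment, offset: int, size: int = 1) -> int:
    """Return the linear address of an access, or fault if it leaves the segment."""
    if offset < 0 or offset + size > seg.limit:
        raise SegmentLimitFault(f"offset {offset} outside limit {seg.limit}")
    return seg.base + offset


# One segment per array: a 16-byte buffer placed at a made-up address.
buf = Segment(base=0x1000, limit=16)
in_bounds = translate(buf, 12, size=4)   # touches the last 4 bytes of the buffer
try:
    translate(buf, 13, size=4)           # runs one byte past the end
    overrun_caught = False
except SegmentLimitFault:
    overrun_caught = True
```

In the model, every access goes through the same limit comparison, which is the work the abstract says the segmentation hardware does without extra per-reference instructions.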

Variorum: a multimedia-based program documentation system

2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532)

Conventional software documentation systems are mostly based on textual descriptions that explain or annotate the program's source code. Typically they also support interactive browsing of high-level control flows, and name-based searching of program primitives such as variable declarations and function definitions. Because these systems rely solely on text, it is difficult for program authors to describe overall algorithm structures and detailed implementation considerations of the programs in an interactive and flexible fashion. Variorum is a novel software documentation system that allows program authors to record the process of "walking through" their own code using multimedia technology, specifically text, audio, and digital pen drawing. This approach greatly improves the interactivity and flexibility of the software documentation process. In addition, to broaden its applicability and to reduce the implementation complexity, Variorum is designed to inter-operate with WWW technology, in that the program source code files and their annotations are stored on web servers and directly accessible via commercial web browsers such as Microsoft's Internet Explorer. This paper describes the design and implementation of Variorum, as well as preliminary usage and performance experiences with the current prototype.

DataPhone: An Intelligent Phone for Data

Data conferencing is one of the more commercially successful examples among conferencing applications, because of its effectiveness and relatively modest resource requirements. Existing data conferencing applications are built almost exclusively on IP-based packet switched networks. While they are very effective in improving office communications, their use in households is still quite limited because data conferencing end-points are mostly personal computers and are thus not particularly friendly to a majority of the population in the world. In contrast, more than 95% of the population in the United States are quite comfortable with conventional telephones. This paper describes the design, implementation, and evaluation of an information appliance, called DataPhone, that can be used both as a regular phone and a data conferencing device. A DataPhone consists of a touch sensitive screen, an SVD modem and a microcontroller-based single board computer. DataPhone allows two parties t...

Dynamic multi-process information flow tracking for web application security

Proceedings of the 2007 ACM/IFIP/USENIX international conference on Middleware companion, 2007

Although there is a large body of research on detection and prevention of such memory corruption attacks as buffer overflow, integer overflow, and format string attacks, the web application security problem receives relatively less attention from the research community by comparison. The majority of web application security problems originate from the fact that web applications fail to perform sanity checks on inputs from the network that are eventually used as operands of security-sensitive operations. Therefore, a promising approach to this problem is to apply proper checks on tainted portions of the operands used in security-sensitive operations, where a byte is tainted if it is data/control dependent on some network packet(s). This paper presents the design, implementation and evaluation of a dynamic checking compiler called WASC, which automatically adds checks into web applications used in three-tier internet services to protect them from the two most common types of web application attacks: SQL- and script-injection attacks. In addition to including a taint analysis infrastructure for multi-process and multi-language applications, WASC features the use of SQL and HTML parsers to defeat evasion techniques that exploit interpretation differences between attack detection engines and target applications. Experiments with a fully operational WASC prototype show that it can indeed stop all SQL/script injection attacks that we have tested. Moreover, the end-to-end latency penalty associated with the checks inserted by WASC is less than 30% for the test web applications used in our performance study.

DataPhone: an intelligent phone for data conferencing

2003 International Symposium on VLSI Technology, Systems and Applications. Proceedings of Technical Papers. (IEEE Cat. No.03TH8672), 2004

Data conferencing is one of the more commercially successful conferencing applications, because of its effectiveness and relatively modest resource requirements. Existing data conferencing applications are built almost exclusively on IP-based packet switched networks. While they are very effective in office communications, their use in the home is still quite limited because data conferencing end-points are mostly personal computers and are thus not particularly people-friendly. In contrast, many more people are quite comfortable with conventional telephones. The paper describes the design, implementation, and evaluation of an information appliance, called DataPhone, that can be used both as a regular phone and a data conferencing device. A DataPhone consists of a touch sensitive screen, an SVD (simultaneous voice and data) modem and a microcontroller-based single board computer. DataPhone allows two parties to have a regular voice communications channel, as well as a shared whiteboard for sketching and document sharing. One of the key design decisions of DataPhone is that it runs directly on PSTN, and does not require any additional infrastructural support, such as ISP subscription, domain name service, routing gateway, etc., thus avoiding their associated configuration headaches. We have successfully built a fully functional DataPhone prototype and evaluated its usefulness through end user testing. Initial responses are both favorable and encouraging.

A feather-weight virtual machine for windows applications

Proceedings of the 2nd international conference on Virtual execution environments, 2006

Many fault-tolerant and intrusion-tolerant systems require the ability to execute unsafe programs in a realistic environment without leaving permanent damage. Virtual machine technology meets this requirement perfectly because it provides an execution environment that is both realistic and isolated. In this paper, we introduce an OS-level virtual machine architecture for Windows applications called Feather-weight Virtual Machine (FVM), under which virtual machines share as many resources of the host machine as possible while still being isolated from one another and from the host machine. The key technique behind FVM is namespace virtualization, which isolates virtual machines by renaming resources at the OS system call interface. Through a copy-on-write scheme, FVM allows multiple virtual machines to physically share resources but logically isolate their resources from each other. A main technical challenge in FVM is how to achieve strong isolation among different virtual machines and the host machine, due to numerous namespaces and interprocess communication mechanisms on Windows. Experimental results demonstrate that FVM is more flexible and scalable, requires fewer system resources, and incurs lower start-up and run-time performance overhead than existing hardware-level virtual machine technologies, and thus makes a compelling building block for security and fault-tolerant applications.
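Namespace virtualization with copy-on-write, as described in this abstract, can be sketched with a toy model. The `VMNamespace` class, the `/vm/<id>` renaming rule, and the file names are assumptions made for illustration; the real FVM works at the Windows system call interface and covers many more namespaces than a single dictionary.

```python
class VMNamespace:
    """Toy model: a VM sees host resources until it writes, then gets a private copy."""

    def __init__(self, vm_id, host):
        self.vm_id = vm_id
        self.host = host      # shared view of host resources, never modified here
        self.private = {}     # renamed copies owned by this VM

    def _rename(self, name):
        # Hypothetical renaming rule: prefix the name with a per-VM root.
        return f"/vm/{self.vm_id}{name}"

    def read(self, name):
        private_name = self._rename(name)
        if private_name in self.private:
            return self.private[private_name]
        return self.host[name]

    def write(self, name, data):
        # Copy-on-write: the write lands on the VM's renamed copy, never the host.
        self.private[self._rename(name)] = data


host = {"/etc/app.conf": "mode=safe"}
vm1, vm2 = VMNamespace("vm1", host), VMNamespace("vm2", host)
vm1.write("/etc/app.conf", "mode=unsafe")
```

After the write, `vm1` reads its own modified copy, while `vm2` and the host still see the original, which is the "physically share, logically isolate" behavior the abstract describes.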

Applications of a feather-weight virtual machine

Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments, 2008

A Feather-weight Virtual Machine (FVM) is an OS-level virtualization technology that enables multiple isolated execution environments to exist on a single Windows kernel. The key design goal of FVM is efficient resource sharing among VMs so as to minimize VM startup/shutdown cost and scale to a larger number of concurrent VM instances. As a result, FVM provides an effective platform for fault-tolerant and intrusion-tolerant applications that require frequent invocation and termination of dispensable VMs. This paper presents three complete applications of the FVM technology: scalable web site testing, shared binary service for application deployment, and distributed Display-Only File Server (DOFS). To identify malicious web sites that exploit browser vulnerabilities, we use a web crawler to access untrusted sites, render their pages in multiple browsers each running in a separate VM, and monitor their execution behaviors. To allow Windows-based end user machines to share binaries that are stored, managed and patched at a central location, we run shared binaries in a special VM on the end user machine whose runtime environment is imported from the central binary server. To protect confidential files in a file server against information theft by insiders, we ensure that file viewing/editing programs run in a VM, which grants file content display but prevents file content from being saved on the host machine. In this paper, we show how to customize the generic FVM framework to accommodate the needs of the three applications, and present experimental results that demonstrate their performance and effectiveness.

Accurate and Automated System Call Policy-Based Intrusion Prevention

International Conference on Dependable Systems and Networks (DSN'06)

A General Dynamic Information Flow Tracking Framework for Security Applications

2006 22nd Annual Computer Security Applications Conference (ACSAC'06), 2006

Many software security solutions require accurate tracking of control/data dependencies among information objects in network applications. This paper presents a general dynamic information flow tracking framework (called GIFT) for C programs that allows an application developer to associate application-specific tags with input data, instruments the application to propagate these tags to all the other data that are control/data-dependent on them, and invokes application-specific processing on output data according to their tag values. To use GIFT, an application developer only needs to implement input and output proxy functions to tag input data and to perform tag-dependent processing on output data, respectively. To demonstrate the usefulness of GIFT, we implement a complete GIFT application called Aussum, which allows selective sandboxing of network client applications based on whether their inputs are "tainted" or not. For a set of computation-intensive test applications, the measured elapsed time overhead of GIFT is less than 35%.
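The tag, propagate, and check flow in this abstract can be sketched as follows. This is an illustration only: GIFT instruments C programs, whereas this sketch uses a hypothetical Python wrapper class (`Tagged`) and hypothetical proxy functions, and it tracks data dependence only, not control dependence.

```python
class Tagged:
    """A value carrying a set of application-specific tags."""

    def __init__(self, value, tags=()):
        self.value = value
        self.tags = frozenset(tags)

    def __add__(self, other):
        # Data dependence: the result inherits the tags of both operands.
        if isinstance(other, Tagged):
            return Tagged(self.value + other.value, self.tags | other.tags)
        return Tagged(self.value + other, self.tags)

    def __radd__(self, other):
        # Handles untagged + Tagged, e.g. a string literal on the left.
        return Tagged(other + self.value, self.tags)


def input_proxy(raw):
    """Hypothetical input proxy: tag everything read from the network."""
    return Tagged(raw, {"network"})


def output_proxy(data):
    """Hypothetical output proxy: choose an action based on the tags."""
    return "sandbox" if "network" in data.tags else "allow"


query = "SELECT * FROM users WHERE name = '" + input_proxy("bob") + "'"
constant = Tagged("SELECT 1")
```

Here `output_proxy(query)` yields `"sandbox"` because the query was built from tagged input, while `output_proxy(constant)` yields `"allow"`.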

How to Automatically and Accurately Sandbox Microsoft IIS

2006 22nd Annual Computer Security Applications Conference (ACSAC'06), 2006

Comparing the system call sequence of a network application against a sandboxing policy is a popular approach to detecting control-hijacking attacks, in which the attacker exploits such software vulnerabilities as buffer overflow to take over the control of a victim application and possibly the underlying machine. The long-standing technical barrier to the acceptance of this system call monitoring approach is how to derive accurate sandboxing policies for Windows applications whose source code is unavailable. In fact, many commercial computer security companies take advantage of this fact and fashion a business model in which their users have to pay a subscription fee to receive periodic updates on the application sandboxing policies, much like anti-virus signatures. This paper describes the design, implementation and evaluation of a sandboxing system called BASS that can automatically extract a highly accurate application-specific sandboxing policy from a Win32/X86 binary, and enforce the extracted policy at run time with low performance overhead. BASS is built on a binary interpretation and analysis infrastructure called BIRD, which can handle application binaries with dynamically linked libraries, exception handlers and multi-threading, and has been shown to work correctly for a large number of commercially distributed Windows-based network applications, including IIS and Apache. The throughput and latency penalty of BASS for all the applications we have tested except one is under 8%.

BIRD: Binary Interpretation using Runtime Disassembly

International Symposium on Code Generation and Optimization (CGO'06)

The majority of security vulnerabilities published in the literature are due to software bugs. Many researchers have developed program transformation and analysis techniques to automatically detect or eliminate such vulnerabilities. So far, most of them cannot be applied to commercially distributed applications on the Windows/x86 platform, because it is almost impossible to disassemble a binary file with 100% accuracy and coverage on that platform. This paper presents the design, implementation, and evaluation of a binary analysis and instrumentation infrastructure for the Windows/x86 platform called BIRD (Binary Interpretation using Runtime Disassembly), which provides two services to developers of security-enhancing program transformation tools: converting binary code into assembly language instructions for further analysis, and inserting instrumentation code at specific places of a given binary without affecting its execution semantics. Instead of requiring a high-fidelity instruction set architectural emulator, BIRD combines static disassembly with an on-demand dynamic disassembly approach to guarantee that each instruction in a binary file is analyzed or transformed before it is executed. It took 12 student-months to develop the first BIRD prototype, which works successfully for all applications in the Microsoft Office suite as well as Internet Explorer and the IIS web server, including all DLLs that they use. Moreover, the additional throughput penalty of the BIRD prototype on production server applications such as Apache, IIS, and BIND is uniformly below 4%.

Automatic Extraction of Accurate Application-Specific Sandboxing Policy

MILCOM 2005 - 2005 IEEE Military Communications Conference

One of the most dangerous cybersecurity threats is control hijacking attacks, which hijack the control of a victim application, and execute arbitrary system calls assuming the identity of the victim program's effective user. System call monitoring has been touted as an effective defense against control hijacking attacks because it could prevent remote attackers from inflicting damage upon a victim system even if they can successfully compromise certain applications running on the system. However, the Achilles' heel of the system call monitoring approach is the construction of an accurate system call behavior model that minimizes false positives and negatives. This paper describes the design, implementation, and evaluation of a Program semantics-Aware Intrusion Detection system called Paid, which automatically derives an application-specific system call behavior model from the application's source code, and checks the application's run-time system call pattern against this model to thwart any control hijacking attacks. The per-application behavior model is in the form of the sites and ordering of system calls made in the application, as well as its partial control flow. Experiments on a fully working Paid prototype show that Paid can indeed stop attacks that exploit nonstandard security holes, such as format string attacks that modify function pointers, and that the run-time latency and throughput penalty of Paid are under 11.66% and 10.44%, respectively, for a set of production-mode network server applications including Apache, Sendmail, Ftp daemon, etc.
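Checking a run-time system call pattern against a behavior model can be sketched as a walk over allowed transitions. The model below is hand-written and hypothetical, and it captures only call ordering; the model described in the abstract is derived from source code and also records call sites and partial control flow.

```python
# Hypothetical model: for each system call, the set of calls allowed to follow it.
MODEL = {
    "START": {"open"},
    "open": {"read", "close"},
    "read": {"read", "write", "close"},
    "write": {"read", "write", "close"},
    "close": {"open", "exit"},
}


def check_trace(trace, model=MODEL):
    """Return the index of the first call that violates the model, or None."""
    state = "START"
    for i, call in enumerate(trace):
        if call not in model.get(state, set()):
            return i
        state = call
    return None


normal = ["open", "read", "write", "close", "exit"]
hijacked = ["open", "read", "execve"]   # execve never follows read in the model
```

A monitor built this way would block the process at the first violating call; in the sketch, `check_trace(hijacked)` points at the `execve` call.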

Checking Array Bound Violation Using Segmentation Hardware

2005 International Conference on Dependable Systems and Networks (DSN'05)

The ability to check memory references against their associated array/buffer bounds helps programmers detect programming errors involving address overruns early on and thus avoid many difficult bugs down the line. This paper proposes a novel approach to the array bound checking problem, called Cash, that exploits the segmentation feature in the virtual memory hardware of the X86 architecture. The Cash approach allocates a separate segment to each static array or dynamically allocated buffer, and generates the instructions for array references in such a way that the segment limit check in X86's virtual memory protection mechanism performs the necessary array bound checking for free. In cases where hardware bound checking is not possible, it falls back to software bound checking. As a result, Cash does not need to pay per-reference software checking overhead in most cases. However, the Cash approach incurs a fixed setup overhead for each use of an array, which may involve multiple array references. This overhead requires compiler writers to apply the proposed technique judiciously to minimize the performance cost of array bound checking. This paper presents the detailed design and implementation of the Cash compiler, and a comprehensive evaluation of various performance tradeoffs associated with the proposed array bound checking technique. For the set of complicated network applications we tested, including Apache, Sendmail, Bind, etc., the latency penalty of Cash's bound checking mechanism is between 2.5% and 9.8% when compared with the baseline case that does not perform any bound checking.

Web Application Attack Prevention for Tiered Internet Services

2008 The Fourth International Conference on Information Assurance and Security, 2008

Because most web application attacks exploit vulnerabilities that result from lack of input validation, a promising approach to thwarting these attacks is to apply validation checks on tainted portions of the operands used in security-sensitive operations, where a byte is tainted if it is data/control dependent on some network packet(s). This paper presents the design, implementation and evaluation of a dynamic checking compiler called WASC, which automatically adds checks into web applications used in three-tier internet services to protect them from the two most common types of web application attacks: SQL- and script-injection attacks. In addition to including a taint analysis infrastructure for multi-process and multi-language applications, WASC features the use of SQL and HTML parsers to defeat evasion techniques that exploit interpretation differences between attack detection engines and target applications. Experiments with a fully operational WASC prototype show that it can indeed stop all SQL/script injection attacks that we have tested. Moreover, the end-to-end latency penalty associated with the checks inserted by WASC is less than 30% for the test web applications used in our performance study.

Foreign Code Detection on the Windows/X86 Platform

2006 22nd Annual Computer Security Applications Conference (ACSAC'06), 2006

As new attacks against Windows-based machines emerge almost on a daily basis, there is an increasing need to "lock down" individual users' desktop machines in corporate computing environments. One particular way to lock down a user computer is to guarantee that only authorized binary programs are allowed to run on that computer. A major advantage of this approach is that binaries downloaded without the user's knowledge, such as spyware, adware, or code entering through buffer overflow attacks, can never run on computers that are locked down this way. This paper presents the design, implementation and evaluation of FOOD, a foreign code detection system specifically for the Windows/X86 platform, where foreign code is defined as any binary program that does not go through an authorized installation procedure. FOOD verifies the legitimacy of binary images involved in process creation and library loading to ensure that only authorized binaries are used in these operations. In addition, FOOD checks the target address of every indirect branch instruction in Windows binaries to prevent illegitimate control transfers to either dynamically injected mobile code or pre-existing library functions that are potentially damaging. Combined, these techniques strictly prevent the execution of any foreign code. Experiments with a fully working FOOD prototype show that it can indeed stop all spyware and buffer overflow attacks we tested, and its worst-case run-time performance overhead associated with foreign code detection is less than 35%.

Application Specific Sandboxing for Win32/Intel Binaries

Comparing the system call sequence of a network application against a sandboxing policy is a popular approach to detecting control-hijacking attacks, in which the attacker exploits such software vulnerabilities as buffer overflow to take over the control of a victim application and possibly the underlying machine. The long-standing technical barrier to the acceptance of this system call monitoring approach is how to derive accurate sandboxing policies for Windows applications whose source code is unavailable. In fact, many commercial computer security companies take advantage of this fact and fashion a business model in which their users have to pay a subscription fee to receive periodic updates on the application sandboxing policies, much like anti-virus signatures. This paper describes the design, implementation and evaluation of a sandboxing system called BASS that can automatically extract a highly accurate application-specific sandboxing policy from a Win32/X...

Program Transformation Techniques for Host-based Intrusion Prevention

... Paid is a system call based intrusion prevention system, which includes a comprehensive program analysis tool that can automatically derive an accurate ... Paid checks the application's run-time system call pattern against this model to prevent control-hijacking attacks ...

Research paper thumbnail of Checking Array Bound Violation Using Segmentation Hardware

The ability to check memory references against their associated array/buffer bounds helps program... more The ability to check memory references against their associated array/buffer bounds helps programmers to detect programming errors involving address overruns early on and thus avoid many difficult bugs down the line. This paper proposes a novel approach called Cash to the array bound checking problem that exploits the segmentation feature in the virtual memory hardware of the X86 architecture. The Cash approach allocates a separate segment to each static array or dynamically allocated buffer, and generates the instructions for array references in such a way that the segment limit check in X86's virtual memory protection mechanism performs the necessary array bound checking for free. In those cases that hardware bound checking is not possible, it falls back to software bound checking. As a result, Cash does not need to pay per-reference software checking overhead in most cases. However, the Cash approach incurs a fixed setup overhead for each use of an array, which may involve multiple array references. The existence of this overhead requires compiler writers to judiciously apply the proposed technique to minimize the performance cost of array bound checking. This paper presents the detailed design and implementation of the Cash compiler, and a comprehensive evaluation of various performance tradeoffs associated with the proposed array bound checking technique. For the set of complicated network applications we tested, including Apache, Sendmail, Bind, etc., the latency penalty of Cash's bound checking mechanism is between 2.5% to 9.8% when compared with the baseline case that does not perform any bound checking.

Research paper thumbnail of Variorum: a multimedia-based program documentation system

2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532)

Conventional software documentation systems are mostly based on textutal descriptions that explai... more Conventional software documentation systems are mostly based on textutal descriptions that explain or annotate the program's source code. Typically they also support interactive browsing of high-level control ows, and name-based searching of program primitives such as variable declarations and function de nitions. Because these systems rely solely on texts, it is di cult for program authors to describe overall algorithm structures and detailed implementation considerations of the programs in an interactive and exible fashion. Variorum is a novel software documentation system that allows program authors to record the process of \walking through" their own code using multimedia technology, speci cally, text, audio, and digital pen drawing. This approach greatly improves the interactivity and exibility in the software documentation process. In addition, to broaden its applicability and to reduce the implementation complexity, Variorum is designed to inter-operate with the WWW technology, in that the program source code les and their annotations are stored on web servers and directly accessible via commercial web browsers such as Microsoft's Internet Explorer. This paper describes the design and implementation of Variorum, as well as preliminary usage and performance experiences with the current prototype.

Research paper thumbnail of DataPhone: An Intelligent Phone for Data

Data conferencing is one of the more commercially successful examples among conferencing applicat... more Data conferencing is one of the more commercially successful examples among conferencing applications, because of its effectiveness and relatively modest resource requirements. Existing data conferencing applications are built almost exclusively on IP-based packet switched networks. While they are very effective in improving office communications, their use in the households is still quite limited because data conferencing end-points are mostly personal computers and are thus not particularly friendly to a majority of the population in the world. In contrast, more than 95% of the population in the United States are quite comfortable with conventional telephones. This paper describes the design, implementation, and evaluation of an information appliance, called DataPhone that can be used both as a regular phone and a data conferencing device. A DataPhone consists of a touch sensitive screen, a SVD modem and a microcontroller-based single board computer. DataPhone allows two parties t...

Research paper thumbnail of Dynamic multi-process information flow tracking for web application security

Proceedings of the 2007 ACM/IFIP/USENIX international conference on Middleware companion, 2007

Although there is a large body of research on detection and prevention of such memory corruption ... more Although there is a large body of research on detection and prevention of such memory corruption attacks as buffer overflow, integer overflow, and format string attacks, the web application security problem receives relatively less attention from the research community by comparison. The majority of web application security problems originate from the fact that web applications fail to perform sanity checks on inputs from the network that are eventually used as operands of securitysensitive operations. Therefore, a promising approach to this problem is to apply proper checks on tainted portions of the operands used in security-sensitive operations, where a byte is tainted if it is data/control dependent on some network packet(s). This paper presents the design, implementation and evaluation of a dynamic checking compiler called WASC, which automatically adds checks into web applications used in three-tier internet services to protect them from the most common two types of web application attacks: SQL-and script-injection attack. In addition to including a taint analysis infrastructure for multi-process and multi-language applications, WASC features the use of SQL and HTML parsers to defeat evasion techniques that exploit interpretation differences between attack detection engines and target applications. Experiments with a fully operational WASC prototype show that it can indeed stop all SQL/script injection attacks that we have tested. Moreover, the end-to-end latency penalty associated with the checks inserted by WASC is less than 30% for the test web applications used in our performance study.

DataPhone: an intelligent phone for data conferencing

2003 International Symposium on VLSI Technology, Systems and Applications. Proceedings of Technical Papers. (IEEE Cat. No.03TH8672), 2004

Data conferencing is one of the more commercially successful conferencing applications, because of its effectiveness and relatively modest resource requirements. Existing data conferencing applications are built almost exclusively on IP-based packet switched networks. While they are very effective in office communications, their use in the home is still quite limited because data conferencing end-points are mostly personal computers and are thus not particularly people friendly. In contrast, many more people are quite comfortable with conventional telephones. The paper describes the design, implementation, and evaluation of an information appliance, called DataPhone, that can be used both as a regular phone and a data conferencing device. A DataPhone consists of a touch sensitive screen, an SVD (simultaneous voice and data) modem and a microcontroller-based single board computer. DataPhone allows two parties to have a regular voice communications channel, as well as a shared whiteboard for sketching and document sharing. One of the key design decisions of DataPhone is that it runs directly on PSTN, and does not require any additional infrastructural support, such as ISP subscription, domain name service, routing gateway, etc., thus avoiding their associated configuration headaches. We have successfully built a fully functional DataPhone prototype and evaluated its usefulness through end user testing. Initial responses are both favorable and encouraging.

A feather-weight virtual machine for windows applications

Proceedings of the 2nd international conference on Virtual execution environments, 2006

Many fault-tolerant and intrusion-tolerant systems require the ability to execute unsafe programs in a realistic environment without leaving permanent damage. Virtual machine technology meets this requirement perfectly because it provides an execution environment that is both realistic and isolated. In this paper, we introduce an OS-level virtual machine architecture for Windows applications called Feather-weight Virtual Machine (FVM), under which virtual machines share as many resources of the host machine as possible while still being isolated from one another and from the host machine. The key technique behind FVM is namespace virtualization, which isolates virtual machines by renaming resources at the OS system call interface. Through a copy-on-write scheme, FVM allows multiple virtual machines to physically share resources but logically isolate their resources from each other. A main technical challenge in FVM is how to achieve strong isolation among different virtual machines and the host machine, due to numerous namespaces and interprocess communication mechanisms on Windows. Experimental results demonstrate that FVM is more flexible and scalable, requires fewer system resources, and incurs lower start-up and run-time performance overhead than existing hardware-level virtual machine technologies, and thus makes a compelling building block for security and fault-tolerant applications.
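The renaming and copy-on-write idea in this abstract can be illustrated with a small in-memory sketch. It is not FVM's implementation: the class names, the `/fvm/<id>` prefix, and the dictionary standing in for host resources are all assumptions made for illustration.

```python
class HostNamespace:
    """Hypothetical stand-in for the host's named resources (files, registry keys, ...)."""

    def __init__(self, resources):
        self.resources = dict(resources)


class FeatherweightVM:
    """Illustrative VM that renames resource names and copies on first write."""

    def __init__(self, vm_id, host):
        self.prefix = f"/fvm/{vm_id}"  # assumed per-VM renaming prefix
        self.host = host
        self.private = {}              # renamed name -> private copy

    def _rename(self, name):
        # Namespace virtualization: each VM sees its own renamed view of a name.
        return self.prefix + name

    def read(self, name):
        renamed = self._rename(name)
        if renamed in self.private:
            return self.private[renamed]   # VM already has its own copy
        return self.host.resources[name]   # otherwise share the host's copy

    def write(self, name, data):
        # Copy-on-write: the change lands in the VM's private namespace only.
        self.private[self._rename(name)] = data


host = HostNamespace({"/etc/app.conf": "v1"})
vm1 = FeatherweightVM(1, host)
vm2 = FeatherweightVM(2, host)
vm1.write("/etc/app.conf", "v2")
```

After the write, `vm1` reads its private copy while `vm2` and the host still see the original, which is the "physically share, logically isolate" behavior the abstract describes.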

Applications of a feather-weight virtual machine

Proceedings of the fourth ACM SIGPLAN/SIGOPS international conference on Virtual execution environments, 2008

A Feather-weight Virtual Machine (FVM) is an OS-level virtualization technology that enables multiple isolated execution environments to exist on a single Windows kernel. The key design goal of FVM is efficient resource sharing among VMs so as to minimize VM startup/shutdown cost and scale to a larger number of concurrent VM instances. As a result, FVM provides an effective platform for fault-tolerant and intrusion-tolerant applications that require frequent invocation and termination of dispensable VMs. This paper presents three complete applications of the FVM technology: scalable web site testing, a shared binary service for application deployment, and a distributed Display-Only File Server (DOFS). To identify malicious web sites that exploit browser vulnerabilities, we use a web crawler to access untrusted sites, render their pages in multiple browsers each running in a separate VM, and monitor their execution behaviors. To allow Windows-based end user machines to share binaries that are stored, managed and patched at a central location, we run shared binaries in a special VM on the end user machine whose runtime environment is imported from the central binary server. To protect confidential files in a file server against information theft by insiders, we ensure that file viewing/editing programs run in a VM, which grants file content display but prevents file content from being saved on the host machine. In this paper, we show how to customize the generic FVM framework to accommodate the needs of the three applications, and present experimental results that demonstrate their performance and effectiveness.

Accurate and Automated System Call Policy-Based Intrusion Prevention

International Conference on Dependable Systems and Networks (DSN'06)

A General Dynamic Information Flow Tracking Framework for Security Applications

2006 22nd Annual Computer Security Applications Conference (ACSAC'06), 2006

Many software security solutions require accurate tracking of control/data dependencies among information objects in network applications. This paper presents a general dynamic information flow tracking framework (called GIFT) for C programs that allows an application developer to associate application-specific tags with input data, instruments the application to propagate these tags to all the other data that are control/data-dependent on them, and invokes application-specific processing on output data according to their tag values. To use GIFT, an application developer only needs to implement input and output proxy functions to tag input data and to perform tag-dependent processing on output data, respectively. To demonstrate the usefulness of GIFT, we implement a complete GIFT application called Aussum, which allows selective sandboxing of network client applications based on whether their inputs are "tainted" or not. For a set of computation-intensive test applications, the measured elapsed time overhead of GIFT is less than 35%.
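The tag-propagation idea in this abstract can be sketched in a few lines. GIFT itself instruments C programs. The Python wrapper below is only an analogy, and `Tagged`, `input_proxy`, `output_proxy`, and the `"network"` tag are hypothetical names. The sketch covers data dependencies through `+` only and does not model control dependencies.

```python
class Tagged:
    """A value carrying a set of application-specific tags (illustrative only)."""

    def __init__(self, value, tags=frozenset()):
        self.value = value
        self.tags = frozenset(tags)

    def __add__(self, other):
        # The result is data-dependent on both operands, so it inherits both tag sets.
        if isinstance(other, Tagged):
            return Tagged(self.value + other.value, self.tags | other.tags)
        return Tagged(self.value + other, self.tags)

    def __radd__(self, other):
        # Handles `untagged + Tagged`; the untagged operand contributes no tags.
        return Tagged(other + self.value, self.tags)


def input_proxy(raw, source):
    """Hypothetical input proxy: tag data with the source it arrived from."""
    return Tagged(raw, {source})


def output_proxy(data):
    """Hypothetical output proxy: choose processing based on the tags present."""
    return "sandboxed" if "network" in data.tags else "direct"


query = "SELECT " + input_proxy("name", "network")
```

Here `query` ends up carrying the `network` tag even though only part of it came from the network, so the output proxy can treat it differently from untagged data.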

How to Automatically and Accurately Sandbox Microsoft IIS

2006 22nd Annual Computer Security Applications Conference (ACSAC'06), 2006

Comparing the system call sequence of a network application against a sandboxing policy is a popular approach to detecting control-hijacking attacks, in which the attacker exploits such software vulnerabilities as buffer overflow to take over the control of a victim application and possibly the underlying machine. The long-standing technical barrier to the acceptance of this system call monitoring approach is how to derive accurate sandboxing policies for Windows applications whose source code is unavailable. In fact, many commercial computer security companies take advantage of this fact and fashion a business model in which their users have to pay a subscription fee to receive periodic updates on the application sandboxing policies, much like anti-virus signatures. This paper describes the design, implementation and evaluation of a sandboxing system called BASS that can automatically extract a highly accurate application-specific sandboxing policy from a Win32/X86 binary, and enforce the extracted policy at run time with low performance overhead. BASS is built on a binary interpretation and analysis infrastructure called BIRD, which can handle application binaries with dynamically linked libraries, exception handlers and multi-threading, and has been shown to work correctly for a large number of commercially distributed Windows-based network applications, including IIS and Apache. The throughput and latency penalty of BASS for all the applications we have tested except one is under 8%.

BIRD: Binary Interpretation using Runtime Disassembly

International Symposium on Code Generation and Optimization (CGO'06)

The majority of security vulnerabilities published in the literature are due to software bugs. Many researchers have developed program transformation and analysis techniques to automatically detect or eliminate such vulnerabilities. So far, most of them cannot be applied to commercially distributed applications on the Windows/x86 platform, because it is almost impossible to disassemble a binary file with 100% accuracy and coverage on that platform. This paper presents the design, implementation, and evaluation of a binary analysis and instrumentation infrastructure for the Windows/x86 platform called BIRD (Binary Interpretation using Runtime Disassembly), which provides two services to developers of security-enhancing program transformation tools: converting binary code into assembly language instructions for further analysis, and inserting instrumentation code at specific places of a given binary without affecting its execution semantics. Instead of requiring a high-fidelity instruction set architectural emulator, BIRD combines static disassembly with an on-demand dynamic disassembly approach to guarantee that each instruction in a binary file is analyzed or transformed before it is executed. It took 12 student months to develop the first BIRD prototype, which can successfully work for all applications in the Microsoft Office suite as well as Internet Explorer and the IIS web server, including all DLLs that they use. Moreover, the additional throughput penalty of the BIRD prototype on production server applications such as Apache, IIS, and BIND is uniformly below 4%.

Automatic Extraction of Accurate Application-Specific Sandboxing Policy

MILCOM 2005 - 2005 IEEE Military Communications Conference

One of the most dangerous cybersecurity threats is control hijacking attacks, which hijack the control of a victim application, and execute arbitrary system calls assuming the identity of the victim program's effective user. System call monitoring has been touted as an effective defense against control hijacking attacks because it could prevent remote attackers from inflicting damage upon a victim system even if they can successfully compromise certain applications running on the system. However, the Achilles' heel of the system call monitoring approach is the construction of an accurate system call behavior model that minimizes false positives and negatives. This paper describes the design, implementation, and evaluation of a Program semantics-Aware Intrusion Detection system called Paid, which automatically derives an application-specific system call behavior model from the application's source code, and checks the application's run-time system call pattern against this model to thwart any control hijacking attacks. The per-application behavior model is in the form of the sites and ordering of system calls made in the application, as well as its partial control flow. Experiments on a fully working Paid prototype show that Paid can indeed stop attacks that exploit nonstandard security holes, such as format string attacks that modify function pointers, and that the run-time latency and throughput penalty of Paid are under 11.66% and 10.44%, respectively, for a set of production-mode network server applications including Apache, Sendmail, Ftp daemon, etc.
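Checking a run-time system call pattern against a model of call sites and ordering can be sketched as a simple transition check. This is not Paid's actual model format: the `MODEL` table, the site labels, and `check_trace` are hypothetical, and the sketch ignores the partial control flow information the abstract mentions.

```python
START = ("<start>", "<start>")

# Hypothetical model: for each (call site, system call) event,
# the set of events allowed to follow it.
MODEL = {
    START: {("site_a", "open")},
    ("site_a", "open"): {("site_b", "read")},
    ("site_b", "read"): {("site_b", "read"), ("site_c", "close")},
    ("site_c", "close"): set(),
}


def check_trace(model, trace):
    """Return the index of the first event that violates the model, or None."""
    state = START
    for index, event in enumerate(trace):
        if event not in model.get(state, set()):
            return index  # unexpected site or ordering: treat as an intrusion
        state = event
    return None


normal = [("site_a", "open"), ("site_b", "read"), ("site_b", "read"), ("site_c", "close")]
hijacked = [("site_a", "open"), ("site_x", "execve")]
```

In this sketch, `check_trace(MODEL, normal)` returns `None`, while the hijacked trace is rejected at index 1 because `execve` from an unknown site is not an allowed successor of the `open` call.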

Checking Array Bound Violation Using Segmentation Hardware

2005 International Conference on Dependable Systems and Networks (DSN'05)

The ability to check memory references against their associated array/buffer bounds helps programmers to detect programming errors involving address overruns early on and thus avoid many difficult bugs down the line. This paper proposes a novel approach called Cash to the array bound checking problem that exploits the segmentation feature in the virtual memory hardware of the X86 architecture. The Cash approach allocates a separate segment to each static array or dynamically allocated buffer, and generates the instructions for array references in such a way that the segment limit check in X86's virtual memory protection mechanism performs the necessary array bound checking for free. In cases where hardware bound checking is not possible, it falls back to software bound checking. As a result, Cash does not need to pay per-reference software checking overhead in most cases. However, the Cash approach incurs a fixed setup overhead for each use of an array, which may involve multiple array references. The existence of this overhead requires compiler writers to judiciously apply the proposed technique to minimize the performance cost of array bound checking. This paper presents the detailed design and implementation of the Cash compiler, and a comprehensive evaluation of various performance tradeoffs associated with the proposed array bound checking technique. For the set of complicated network applications we tested, including Apache, Sendmail, Bind, etc., the latency penalty of Cash's bound checking mechanism is between 2.5% and 9.8% when compared with the baseline case that does not perform any bound checking.

Web Application Attack Prevention for Tiered Internet Services

2008 The Fourth International Conference on Information Assurance and Security, 2008

Because most web application attacks exploit vulnerabilities that result from lack of input validation, a promising approach to thwarting these attacks is to apply validation checks on tainted portions of the operands used in security-sensitive operations, where a byte is tainted if it is data/control dependent on some network packet(s). This paper presents the design, implementation and evaluation of a dynamic checking compiler called WASC, which automatically adds checks into web applications used in three-tier internet services to protect them from the two most common types of web application attacks: SQL- and script-injection attacks. In addition to including a taint analysis infrastructure for multi-process and multi-language applications, WASC features the use of SQL and HTML parsers to defeat evasion techniques that exploit interpretation differences between attack detection engines and target applications. Experiments with a fully operational WASC prototype show that it can indeed stop all SQL/script injection attacks that we have tested. Moreover, the end-to-end latency penalty associated with the checks inserted by WASC is less than 30% for the test web applications used in our performance study.

Foreign Code Detection on the Windows/X86 Platform

2006 22nd Annual Computer Security Applications Conference (ACSAC'06), 2006

As new attacks against Windows-based machines emerge almost on a daily basis, there is an increasing need to "lock down" individual users' desktop machines in corporate computing environments. One particular way to lock down a user computer is to guarantee that only authorized binary programs are allowed to run on that computer. A major advantage of this approach is that binaries downloaded without the user's knowledge, such as spyware, adware, or code entering through buffer overflow attacks, can never run on computers that are locked down this way. This paper presents the design, implementation and evaluation of FOOD, a foreign code detection system specifically for the Windows/X86 platform, where foreign code is defined as any binary program that does not go through an authorized installation procedure. FOOD verifies the legitimacy of binary images involved in process creation and library loading to ensure that only authorized binaries are used in these operations. In addition, FOOD checks the target address of every indirect branch instruction in Windows binaries to prevent illegitimate control transfers to either dynamically injected mobile code or pre-existing library functions that are potentially damaging. Combined, these techniques strictly prevent the execution of any foreign code. Experiments with a fully working FOOD prototype show that it can indeed stop all spyware and buffer overflow attacks we tested, and its worst-case run-time performance overhead associated with foreign code detection is less than 35%.

Application Specific Sandboxing for Win32/Intel Binaries

Comparing the system call sequence of a network application against a sandboxing policy is a popular approach to detecting control-hijacking attack, in which the attacker exploits such software vulnerabilities as buffer overflow to take over the control of a victim application and possibly the underlying machine. The long-standing technical barrier to the acceptance of this system call monitoring approach is how to derive accurate sandboxing policies for Windows applications whose source code is unavailable. In fact, many commercial computer security companies take advantage of this fact and fashion a business model in which their users have to pay a subscription fee to receive periodic updates on the application sandboxing policies, much like anti-virus signatures. This paper describes the design, implementation and evaluation of a sandboxing system called BASS that can automatically extract a highly accurate application-specific sandboxing policy from a Win32/X...

Program Transformation Techniques for Host-based Intrusion Prevention

... Paid is a system call based intrusion prevention system, which includes a comprehensive program analysis tool that can automatically derive an accurate ... Paid checks the application's run-time system call pattern against this model to prevent control-hijacking attacks ...