V. Beltran | Universitat Politecnica de Catalunya
Papers by V. Beltran
The Cell BE processor has proved that heterogeneous multi-core systems can provide huge computational power with high efficiency for a wide range of applications. The simple design of the computational units and the use of small managed local memories are the key to achieving high efficiency and performance at the same time. However, this simple and efficient hardware design comes at the price of higher code complexity. Code written to run on this kind of processor must deal with several issues such as code vectorization, loop unrolling or the explicit management of local memories. Some of these issues, such as vectorization or loop unrolling, can be partially solved by the compiler, but the overlapping of data transfer and computation times must be manually addressed by the programmer with techniques such as double buffering that increase the code complexity. In this paper we present a user-level threading library called CellMT that effectively hides memory latencies. The concurrent execution of several threads inside each SPU naturally overlaps computation and data transfer times without increasing the code complexity. To prove the suitability and feasibility of our multi-threaded library, we perform an exhaustive performance evaluation with a synthetic benchmark and a real application. The experimental results show that the multi-threaded approach can outperform a hand-coded double buffering scheme, with speedups from 0.96x to 3.2x, while maintaining the complexity of a naive buffering scheme.
Virtualized infrastructure providers demand new methods to increase the accuracy of the accounting models used to charge their customers. Future data centers will be composed of many-core systems that will host a large number of virtual machines (VMs) each. While resource utilization accounting can be achieved with existing system tools, energy accounting is a complex task when per-VM granularity is the goal.
2005 International Conference on Parallel Processing (ICPP'05), 2005
As dynamic web content and security capabilities are becoming popular in current web sites, the performance demand on the application servers that host the sites is increasing, sometimes leading these servers to overload. As a result, response times may grow to unacceptable levels and the server may saturate or even crash. In this paper we present a session-based adaptive overload control mechanism based on SSL (Secure Sockets Layer) connection differentiation and admission control. SSL connection differentiation is a key factor because the cost of establishing a new SSL connection is much greater than that of establishing a resumed SSL connection (which reuses an existing SSL session on the server). Considering this big difference, we have implemented an admission control algorithm that prioritizes resumed SSL connections to maximize performance in session-based environments and dynamically limits the number of new SSL connections accepted, depending on the available resources and the current number of connections in the system, to avoid server overload. In order to allow the differentiation of resumed SSL connections from new SSL connections we propose a possible extension of the Java Secure Socket Extension (JSSE) API. Our evaluation on the Tomcat server demonstrates the benefit of our proposal for preventing server overload.
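The admission policy summarized in this abstract (always favor resumed SSL connections, and cap new ones by what the server can still afford) can be illustrated with a minimal sketch. This is not the paper's algorithm: the class name `SslAdmissionControl`, the per-interval budget of abstract "work units", and the two cost parameters are all assumptions made for illustration, and the sketch does not use any real JSSE call to detect resumption.

```java
/**
 * Hypothetical sketch of session-aware admission control.
 * All names and the cost model are illustrative assumptions,
 * not the mechanism evaluated in the paper.
 */
public class SslAdmissionControl {
    private final int capacityUnits;     // assumed work budget per control interval
    private final int newHandshakeCost;  // assumed cost of a full SSL handshake
    private final int resumedCost;       // assumed (much smaller) cost of a resumed handshake
    private int usedUnits = 0;           // work already committed in this interval

    public SslAdmissionControl(int capacityUnits, int newHandshakeCost, int resumedCost) {
        this.capacityUnits = capacityUnits;
        this.newHandshakeCost = newHandshakeCost;
        this.resumedCost = resumedCost;
    }

    /**
     * Resumed connections are always accepted and charged to the budget.
     * New connections are accepted only if their full handshake still fits
     * into the remaining budget, so the limit on new connections shrinks
     * as the system gets busier.
     */
    public synchronized boolean admit(boolean resumed) {
        if (resumed) {
            usedUnits += resumedCost;
            return true;
        }
        if (usedUnits + newHandshakeCost <= capacityUnits) {
            usedUnits += newHandshakeCost;
            return true;
        }
        return false;
    }

    /** Starts a new control interval with a fresh budget. */
    public synchronized void startInterval() {
        usedUnits = 0;
    }
}
```

With a budget of 10 units, a new-handshake cost of 5 and a resumed cost of 1, the sketch accepts one new connection, keeps accepting resumed ones, and rejects a second new connection once the remaining budget drops below 5.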
International Conference on Parallel Processing, 2004. ICPP 2004., 2004
The two major strategies used to construct high-performance web servers are thread pools and event-driven architectures. The Java platform is commonly used in web environments, but until now it did not provide any standard API to implement event-driven architectures efficiently. The new 1.4 release of the J2SE introduces the NIO (New I/O) API to help in the development of event-driven I/O-intensive applications. In this paper we evaluate the scalability that this API provides to the Java platform in the field of web servers, comparing the most widely used commercial server (Apache) with an experimental server developed using the NIO API. We study the scalability of the NIO-based server as well as of its rival in a number of different scenarios, including uniprocessor, multiprocessor, bandwidth-bounded and CPU-bounded environments. The study concludes that the NIO API can be successfully used to create event-driven Java servers that scale as well as the best commercial native-compiled web servers, at a fraction of their complexity and using only one or two worker threads.
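For readers unfamiliar with the event-driven style this abstract refers to, the following is a minimal sketch of a single-threaded echo server built on `java.nio.channels.Selector`. It is not the experimental server from the paper: the class name `NioEchoServer` and its `port()` and `close()` helpers are assumptions for illustration, it echoes bytes rather than speaking HTTP, and it uses `ServerSocketChannel.bind`, which was added in Java 7 (on J2SE 1.4 the equivalent is `server.socket().bind(...)`).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

/** Illustrative single-threaded, selector-driven echo server (hypothetical example). */
public class NioEchoServer implements Runnable {
    private final Selector selector;
    private final ServerSocketChannel server;

    public NioEchoServer() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: the OS picks a free port
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    /** Returns the port the server is actually listening on. */
    public int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        ByteBuffer buffer = ByteBuffer.allocate(4096);
        try {
            while (selector.isOpen()) {
                selector.select();                 // block until some channel is ready
                if (!selector.isOpen()) break;     // close() was called while waiting
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        int n = client.read(buffer);
                        if (n < 0) {               // peer closed the connection
                            client.close();
                            continue;
                        }
                        buffer.flip();
                        // Simplification: a real server would register OP_WRITE
                        // instead of spinning when the socket buffer is full.
                        while (buffer.hasRemaining()) {
                            client.write(buffer);
                        }
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // The sketch simply stops serving on any I/O error or shutdown.
        }
    }

    /** Stops the event loop and releases the listening socket. */
    public void close() throws IOException {
        selector.close();
        server.close();
    }
}
```

One thread multiplexes every connection through the selector, which is the property that lets an event-driven server avoid dedicating a thread to each client.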
The Open Grid Services Architecture (OGSA) defines a new vision of the Grid based on the use of Web Services (Grid Services). The standard interfaces, behaviors and schemes that are consistent with the OGSA specification are defined by the Open Grid Services Infrastructure (OGSI). Grid Services, as an extension of Web Services, run on top of rich execution frameworks that make them accessible and interoperable with other applications. Two examples of these frameworks are Sun's J2EE platform and Microsoft's .NET.
Overload control mechanisms such as admission control and connection differentiation have been proven effective for preventing overload of application servers running secure web applications. However, achieving optimal results in overload prevention is only possible when considering some kind of resource management in addition to these mechanisms.