Seif Haridi | KTH Royal Institute of Technology

Papers by Seif Haridi

A logic programming language based on the Andorra model

New Generation Computing, 1990

The Andorra model is a parallel execution model of logic programs which exploits the dependent and-parallelism and or-parallelism inherent in logic programming. We present a flat subset of a language based on the Andorra model, henceforth called Andorra Prolog, that is intended to subsume both Prolog and the committed choice languages. Flat Andorra, in addition to don't know and don't care nondeterminism, supports control of or-parallel split, synchronisation on variables, and selection of clauses. We show the operational semantics of the language, and its applicability in the domain of committed choice languages. As an example of the expressiveness of the language, we describe a method for communication between objects by time-stamped messages, which is suitable for expressing distributed discrete event simulation applications. This method depends critically on the ability to express don't know nondeterminism and thus cannot easily be expressed in a committed choice language.

MeteorShower: Minimizing Request Latency for Majority Quorum-Based Data Consistency Algorithms in Multiple Data Centers

2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)

With the increasing popularity of serving and storing data in multiple data centers, we investigate the efficiency of majority quorum-based data consistency algorithms under this scenario. Because of the failure-prone nature of distributed storage systems, majority quorum-based data consistency algorithms have become one of the most widely adopted approaches. In this paper, we propose the MeteorShower framework, which provides a fault-tolerant read/write key-value storage service across multiple data centers with sequential consistency guarantees. A major feature is that most read operations are executed locally within a single data center. This lowers read latency from hundreds of milliseconds to tens of milliseconds. The data consistency algorithm in MeteorShower augments majority quorum-based algorithms. Thus, it keeps all the desirable properties of majority quorums, such as fault tolerance and balanced load. An implementation of MeteorShower on top of Cassandra is deployed and evaluated in multiple data centers using the Google Cloud Platform. Evaluations of the MeteorShower framework have shown that it can consistently serve read requests without paying the communication delays among replicas maintained in multiple data centers. As a result, we are able to improve the latency of read requests from hundreds of milliseconds to tens of milliseconds while achieving the same latency on write requests and the same fault tolerance guarantee. Thus, MeteorShower is optimized for read-intensive workloads.
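
As a rough illustration of the majority-quorum idea this abstract builds on (not MeteorShower's own algorithm, which this listing does not describe), the following Python sketch shows why a read from any majority of replicas observes the latest majority write: any two majorities share at least one replica. The `Replica` class, the integer version counter, and all function names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Iterable, List, Optional


@dataclass
class Replica:
    """One copy of a key's value, tagged with a version number (illustrative only)."""
    version: int = 0
    value: Optional[str] = None


def majority(n: int) -> int:
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def quorum_write(replicas: List[Replica], quorum: Iterable[int],
                 version: int, value: str) -> None:
    """Store (version, value) on the replicas whose indices are in `quorum`."""
    members = set(quorum)
    if len(members) < majority(len(replicas)):
        raise ValueError("a write must reach a majority of replicas")
    for i in members:
        # Never overwrite a newer version with an older one.
        if version > replicas[i].version:
            replicas[i] = Replica(version, value)


def quorum_read(replicas: List[Replica], quorum: Iterable[int]) -> Optional[str]:
    """Return the value carrying the highest version among the replicas in `quorum`."""
    members = set(quorum)
    if len(members) < majority(len(replicas)):
        raise ValueError("a read must contact a majority of replicas")
    newest = max((replicas[i] for i in members), key=lambda r: r.version)
    return newest.value


def every_majority_sees_write(n: int) -> bool:
    """Write to one majority, then check that every possible majority read sees it."""
    replicas = [Replica() for _ in range(n)]
    quorum_write(replicas, range(majority(n)), version=1, value="v1")
    return all(
        quorum_read(replicas, q) == "v1"
        for q in combinations(range(n), majority(n))
    )
```

In this plain scheme every read contacts a majority of replicas, which in a multi-data-center deployment means paying cross-data-center delays; the abstract above states that MeteorShower avoids this cost for most reads by executing them locally.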

Errors Classification and Static Detection Techniques for Dual-Programming Model (OpenMP and OpenACC)

IEEE Access

Recently, incorporating more than one programming model into a system designed for high performance computing (HPC) has become a popular solution to implementing parallel systems. Since traditional programming languages, such as C, C++, and Fortran, do not support parallelism at the level of multi-core processors and accelerators, many programmers add one or more programming models to achieve parallelism and accelerate computation efficiently. These models include Open Accelerators (OpenACC) and Open Multi-Processing (OpenMP), which have recently been used with various models, including Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA). Due to the difficulty of predicting the behavior of threads, runtime errors cannot be predicted. The compiler cannot identify runtime errors such as data races, race conditions, deadlocks, or livelocks. Many studies have been conducted on the development of testing tools to detect runtime errors when using programming models, such as the combinations of OpenACC with MPI and OpenMP with MPI. Although more applications use OpenACC and OpenMP together, no testing tools have been developed to test these applications to date. This paper presents a testing tool for detecting runtime errors using a static testing technique. This tool can detect actual and potential runtime errors during the integration of the OpenACC and OpenMP models into systems developed in C++. This tool implements error dependency graphs, which are proposed in this paper. Additionally, a dependency graph of the errors is provided, along with a classification of runtime errors that result from combining the two programming models mentioned earlier.

Hail to the Thief: Protecting data from mobile ransomware with ransomsafedroid

2017 IEEE 16th International Symposium on Network Computing and Applications (NCA)

The growing popularity of Android and the increasing amount of sensitive data stored in mobile devices have led to the dissemination of Android ransomware. Ransomware is a class of malware that makes data inaccessible by blocking access to the device or, more frequently, by encrypting the data; to recover the data, the user has to pay a ransom to the attacker. A solution for this problem is to back up the data. Although backup tools are available for Android, these tools may be compromised or blocked by the ransomware itself. This paper presents the design and implementation of RANSOMSAFEDROID, a TrustZone-based backup service for mobile devices. RANSOMSAFEDROID is protected from malware by leveraging the ARM TrustZone extension and running in the secure world. It periodically backs up files to a secure local persistent partition and pushes these backups to external storage to protect them from ransomware. Initially, RANSOMSAFEDROID does a full backup of the device filesystem, then it does incremental backups that save the changes since the last backup. As a proof-of-concept, we implemented a RANSOMSAFEDROID prototype and provide a performance evaluation using an i.MX53 development board.

DroidPosture: A trusted posture assessment service for mobile devices

2017 IEEE 13th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), 2017

Mobile devices such as smartphones are becoming the majority among computing devices. Currently, millions of people use such devices to store and process personal data. Unfortunately, smartphones running Android are increasingly being targeted by hackers and infected with malware. Antimalware software is being used to address this situation, but it may be subverted by the same malware it aims to detect. We present DROIDPOSTURE, a posture assessment service for Android devices. This service aims to securely evaluate the level of trust we can have in a device (assess its posture) even if the mobile OS is compromised. For that to be possible, DROIDPOSTURE is protected using TrustZone, a security extension for ARM processors. DROIDPOSTURE is configurable with a set of application and kernel analysis mechanisms that enable detecting malicious applications and rootkits. We implemented a DROIDPOSTURE prototype using a hardware board with an ARM processor with TrustZone, and evaluated its performance and security.

Implicit Provenance for Machine Learning Artifacts

Machine learning (ML) presents new challenges for reproducible software engineering, as the artifacts required for repeatably training models are not just versioned code, but also hyperparameters, code dependencies, and the exact version of the training data. Existing systems for tracking the lineage of ML artifacts, such as TensorFlow Extended or MLflow, are invasive, requiring developers to refactor their code so that it is controlled by the external system. In this paper, we present an alternative approach, which we call implicit provenance, where we instrument a distributed file system and APIs to capture changes to ML artifacts that, along with file naming conventions, mean that full lineage can be tracked for TensorFlow/Keras/PyTorch programs without requiring code changes. We address challenges related to adding strongly consistent metadata extensions to the distributed file system, while minimizing provenance overhead, and ensuring transparent eventually consistent replication of ext...

T2Droid: A TrustZone-Based Dynamic Analyser for Android Applications

2017 IEEE Trustcom/BigDataSE/ICESS, 2017

Arc: an IR for batch and stream programming

Proceedings of the 17th ACM SIGPLAN International Symposium on Database Programming Languages, 2019

In big data analytics, there is currently a large number of data programming models and their respective frontends such as relational tables, graphs, tensors, and streams. This has led to a plethora of runtimes that typically focus on the efficient execution of just a single frontend. This fragmentation manifests itself today in highly complex pipelines that bundle multiple runtimes to support the necessary models. Hence, joint optimization and execution of such pipelines across these frontend-bound runtimes is infeasible. We propose Arc as the first unified Intermediate Representation (IR) for data analytics that incorporates stream semantics based on a modern specification of streams, windows and stream aggregation, to combine batch and stream computation models. Arc extends Weld, an IR for batch computation, and adds support for partitioned, out-of-order stream and window operators, which are the most fundamental building blocks in contemporary data streaming. CCS Concepts: • Software and its engineering → Context specific languages; Compilers; • Computing methodologies → Distributed computing methodologies.

Last Lecture Overview

Next week: examination of your solutions; you will be called; remember: bonus points… What is the next assignment: theory assignment (no programming); extra voluntary assignment.

TruApp: A TrustZone-based authenticity detection service for mobile apps

2017 IEEE 13th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), 2017

In less than a decade, mobile apps became an integral part of our lives. In several situations it is important to provide assurance that a mobile app is authentic, i.e., that it is indeed the app produced by a certain company. However, this is challenging, as such apps can be repackaged, the user may be malicious, or the app may be tampered with by an attacker. This paper presents the design of TRUAPP, a software authentication service that provides assurance of the authenticity and integrity of apps running on mobile devices. TRUAPP provides such assurance, even if the operating system is compromised, by leveraging the ARM TrustZone hardware security extension. TRUAPP uses a set of techniques (static watermarking, dynamic watermarking, and cryptographic hashes) to verify the integrity of the apps. The service was implemented on a hardware board that emulates a mobile device, which was used to do a thorough experimental evaluation of the service.

Apache Flink™: Stream and Batch Processing in a Single Engine

IEEE Data Eng. Bull., 2015

Modern enterprise applications are currently undergoing a complete paradigm shift away from traditional transactional processing to combined analytical and transactional processing. This challenge of combining two opposing query types in a single database management system results in additional requirements for transaction management as well. In this paper, we discuss our approach to achieve high throughput for transactional query processing while allowing concurrent analytical queries. We present our approach to distributed snapshot isolation and optimized two-phase commit protocols.

Kompics Scala: narrowing the gap between algorithmic specification and executable code (short paper)

Proceedings of the 8th ACM SIGPLAN International Symposium on Scala, 2017

A history of the Oz multiparadigm language

Proceedings of the ACM on Programming Languages, 2020

Oz is a programming language designed to support multiple programming paradigms in a cleanly factored way that is easy to program despite its broad coverage. It started in 1991 as a collaborative effort by the DFKI (Germany) and SICS (Sweden) and led to an influential system, Mozart, that was released in 1999 and widely used in the 2000s for practical applications and education. We give the history of Oz as it developed from its origins in logic programming, starting with Prolog, followed by concurrent logic programming and constraint logic programming, and leading to its two direct precursors, the concurrent constraint model and the Andorra Kernel Language (AKL). We give the lessons learned from the Oz effort including successes and failures and we explain the principles underlying the Oz design. Oz is defined through a kernel language, which is a formal model similar to a foundational calculus, but that is designed to be directly useful to the programmer. The kernel language is orga...

High-Level Programming Abstractions for Distributed Graph Processing

IEEE Transactions on Knowledge and Data Engineering, 2018

Efficient processing of large-scale graphs in distributed environments has been an increasingly popular topic of research in recent years. Interconnected data that can be modeled as graphs arise in application domains such as machine learning, recommendation, web search, and social network analysis. Writing distributed graph applications is inherently hard and requires programming models that can cover a diverse set of problem domains, including iterative refinement algorithms, graph transformations, graph aggregations, pattern matching, ego-network analysis, and graph traversals. Several high-level programming abstractions have been proposed and adopted by distributed graph processing systems and big data platforms. Even though significant work has been done to experimentally compare distributed graph processing frameworks, no qualitative study and comparison of graph programming abstractions has been conducted yet. In this survey, we review and analyze the most prevalent high-level programming models for distributed graph processing, in terms of their semantics and applicability. We identify the classes of graph applications that can be naturally expressed by each abstraction and we also give examples of applications that are hard or impossible to express. We review 34 distributed graph processing systems with respect to their programming abstractions, execution models, and communication mechanisms. Finally, we discuss trends and open research questions in the area of distributed graph processing.

Static Type Checking for the Kompics Component Model

First Workshop on Programming Models and Languages for Distributed Computing, 2016

Distributed systems are becoming an increasingly important part of systems and applications software and it is widely accepted that writing correct distributed systems is challenging. Message-passing concurrency models are the dominant programming paradigm and, even in statically typed languages, programming frameworks typically only have limited type checking support for messages, channels, and ports or mailboxes. In this paper, we present Kola, a language-level implementation of Kompics, a component model with message-passing concurrency. Kola comes with its own compiler and some special language constructs which extend Java's type system as necessary to enforce static type checking on messages, channels, and ports. We show that Kola improves the readability of Kompics code and removes opportunities to introduce bugs, at the cost of little compile time overhead and no runtime overhead.

SICStus Prolog library manual, version 2.1 #8

This manual corresponds to SICStus Prolog release 2.1 #8. The Prolog library comprises a number of packages which are thought to be useful in a number of applications. Note that the predicates in the Prolog library are not built-in predicates. One has to explicitly load each package to get access to its predicates. To load a library package Package, you will normally enter a query: | ?- use_module(library(Package)). Library packages may be compiled and consulted as well as loaded.

The DSS, a Middleware Library for Efficient and Transparent Distribution of Language Entities

Garbage Collection for Prolog Based on WAM

Network Explanations to Web Economy's Patterns of Growth

From its early days, the World Wide Web space has demonstrated strong agglomeration trends, with a very small number of web sites capturing the larger part of the Internet population. At first glance, agglomeration over the virtual space sounds like a paradox. Web sites are numerous and highly diversified and can be easily reached from anywhere and by anybody, with no particular transportation or search cost. However, Internet users use only a small number of sites for searching for information and products, interacting with others and socializing, thus producing dense concentrations and locational patterns similar to those observed in the physical space, where few cities and industrial clusters host the huge majority of population and the entire industrial activity. Does that depend on the attractiveness of the popular web sites, or are there agglomeration economies providing incentives to users to be in a location which has been visited by other users or pointed to by other sites? Thi...

SmoothCache 2.0

Proceedings of the 6th ACM Multimedia Systems Conference, 2015

In recent years, adaptive HTTP streaming protocols have become the de facto standard in the industry for the distribution of live and video-on-demand content over the Internet. This paper presents SmoothCache 2.0, a distributed cache platform for adaptive HTTP live streaming content based on peer-to-peer (P2P) overlays. The contribution of this work is twofold. From a systems perspective, to the best of our knowledge, it is the only P2P platform which supports recent live streaming protocols based on HTTP as a transport and the concept of adaptive bitrate switching. From an algorithmic perspective, the system describes a novel set of overlay construction and prefetching techniques that realize: i) substantial savings in terms of the bandwidth load on the source of the stream, and ii) CDN-quality user experience in terms of playback latency and the watched bitrate. In order to support our claims, we conduct a methodical evaluation on thousands of real consumer machines.

Research paper thumbnail of A logic programming language based on the Andorra model

New Generation Computing, 1990

The Andorra model is a parallel execution model of logic programs which exploits the dependent an... more The Andorra model is a parallel execution model of logic programs which exploits the dependent and-parallelism and or-parallelism inherent in logic programming. We present a flat subset of a language based on the Andorra model, henceforth called Andorra Prolog, that is intended to subsume both Prolog and the committed choice languages, Flat Andorra, in addition to don't know and don't care nondeterminism, supports control of or-parallel split, synchronisation on variables, and selection of clauses. We show the operational semantics of the language, and its applicability in the domain of committed choice languages. As an example of the expressiveness of the language, we describe a method for communication between objects by time-stamped messages, which is suitable for expressing distributed discrete event simulation applications. This method depends critically on the ability to express don't know nondeterminism and thus cannot easily be expressed in a committed choice language.

Research paper thumbnail of MeteorShower: Minimizing Request Latency for Majority Quorum-Based Data Consistency Algorithms in Multiple Data Centers

2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)

With the increasing popularity of serving and storing data in multiple data centers, we investiga... more With the increasing popularity of serving and storing data in multiple data centers, we investigate the efficiency of majority quorum-based data consistency algorithms under this scenario. Because of the failure-prone nature of distributed storage systems, majority quorum-based data consistency algorithms become one of the most widely adopted approaches. In this paper, we propose the MeteorShower framework, which provides faulttolerant read/write key-value storage service across multiple data centers with sequential consistency guarantees. A major feature is that most read operations are executed locally within a single data center. This results in lowering read latency from hundreds of milliseconds to tens of milliseconds. The data consistency algorithm in MeteorShower augments majority quorum-based algorithms. Thus, it keeps all the desirable properties of majority quorums, such as fault tolerance, balanced load, etc. An implementation of MeteorShower on top of Cassandra is deployed and evaluated in multiple data centers using the Google Cloud Platform. Evaluations of MeteorShower framework have shown that it can consistently serve read requests without paying the communication delays among replicas maintained in multiple data centers. As a result, we are able to improve the latency of read requests from hundreds of milliseconds to tens of milliseconds while achieving the same latency on write requests and the same fault tolerance guarantee. Thus, MeteorShower is optimized for read intensive workloads.

Research paper thumbnail of Errors Classification and Static Detection Techniques for Dual-Programming Model (OpenMP and OpenACC)

IEEE Access

Recently, incorporating more than one programming model into a system designed for high performan... more Recently, incorporating more than one programming model into a system designed for high performance computing (HPC) has become a popular solution to implementing parallel systems. Since traditional programming languages, such as C, C++, and Fortran, do not support parallelism at the level of multi-core processors and accelerators, many programmers add one or more programming models to achieve parallelism and accelerate computation efficiently. These models include Open Accelerators (OpenACC) and Open Multi-Processing (OpenMP), which have recently been used with various models, including Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA). Due to the difficulty of predicting the behavior of threads, runtime errors cannot be predicted. The compiler cannot identify runtime errors such as data races, race conditions, deadlocks, or livelocks. Many studies have been conducted on the development of testing tools to detect runtime errors when using programming models, such as the combinations of OpenACC with MPI models and OpenMP with MPI. Although more applications use OpenACC and OpenMP together, no testing tools have been developed to test these applications to date. This paper presents a testing tool for detecting runtime using a static testing technique. This tool can detect actual and potential runtime errors during the integration of the OpenACC and OpenMP models into systems developed in C++. This tool implement error dependency graphs, which are proposed in this paper. Additionally, a dependency graph of the errors is provided, along with a classification of runtime errors that result from combining the two programming models mentioned earlier.

Research paper thumbnail of Hail to the Thief: Protecting data from mobile ransomware with ransomsafedroid

2017 IEEE 16th International Symposium on Network Computing and Applications (NCA)

The growing popularity of Android and the increasing amount of sensitive data stored in mobile de... more The growing popularity of Android and the increasing amount of sensitive data stored in mobile devices have lead to the dissemination of Android ransomware. Ransomware is a class of malware that makes data inaccessible by blocking access to the device or, more frequently, by encrypting the data; to recover the data, the user has to pay a ransom to the attacker. A solution for this problem is to backup the data. Although backup tools are available for Android, these tools may be compromised or blocked by the ransomware itself. This paper presents the design and implementation of RAN-SOMSAFEDROID, a TrustZone based backup service for mobile devices. RANSOMSAFEDROID is protected from malware by leveraging the ARM TrustZone extension and running in the secure world. It does backup of files periodically to a secure local persistent partition and pushes these backups to external storage to protect them from ransomware. Initially, RANSOMSAFEDROID does a full backup of the device filesystem, then it does incremental backups that save the changes since the last backup. As a proof-of-concept, we implemented a RANSOMSAFEDROID prototype and provide a performance evaluation using an i.MX53 development board.

Research paper thumbnail of DroidPosture: A trusted posture assessment service for mobile devices

2017 IEEE 13th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), 2017

Mobile devices such as smartphones are becoming the majority among computing devices. Currently, ... more Mobile devices such as smartphones are becoming the majority among computing devices. Currently, millions of persons use such devices to store and process personal data. Unfortunately, smartphones running Android are increasingly being targeted by hackers and infected with malware. Antimalware software is being used to address this situation, but it may be subverted by the same malware it aims to detect. We present DROIDPOSTURE, a posture assessment service for Android devices. This service aims to securely evaluate the level of trust we can have on a device (assess its posture) even if the mobile OS is compromised. For that to be possible, DROID-POSTURE is protected using TrustZone, a security extension for ARM processors. DROIDPOSTURE is configurable with a set of application and kernel analysis mechanisms that enable detecting malicious applications and rootkits. We implemented a DROIDPOSTURE prototype using a hardware board with an ARM processor with TrustZone, and evaluated its performance and security.

Research paper thumbnail of Implicit Provenance for Machine Learning Artifacts

Machine learning (ML) presents new challenges for reproducible software engineering, as the artif... more Machine learning (ML) presents new challenges for reproducible software engineering, as the artifacts required for repeatably training models are not just versioned code, but also hyperparameters, code dependencies, and the exact version of the training data. Existing systems for tracking the lineage of ML artifacts, such as TensorFlow Extended or MLFlow, are invasive, requiring developers to refactor their code that now is controlled by the external system. In this paper, we present an alternative approach, we call implicit provenance, where we instrument a distributed file system and APIs to capture changes to ML artifacts, that, along with file naming conventions, mean that full lineage can be tracked for TensorFlow/Keras/Pytorch programs without requiring code changes. We address challenges related to adding strongly consistent metadata extensions to the distributed file system, while minimizing provenance overhead, and ensuring transparent eventual consistent replication of ext...

Research paper thumbnail of T2Droid: A TrustZone-Based Dynamic Analyser for Android Applications

2017 IEEE Trustcom/BigDataSE/ICESS, 2017

Research paper thumbnail of Arc: an IR for batch and stream programming

Proceedings of the 17th ACM SIGPLAN International Symposium on Database Programming Languages, 2019

In big data analytics, there is currently a large number of data programming models and their res... more In big data analytics, there is currently a large number of data programming models and their respective frontends such as relational tables, graphs, tensors, and streams. This has lead to a plethora of runtimes that typically focus on the efficient execution of just a single frontend. This fragmentation manifests itself today by highly complex pipelines that bundle multiple runtimes to support the necessary models. Hence, joint optimization and execution of such pipelines across these frontend-bound runtimes is infeasible. We propose Arc as the first unified Intermediate Representation (IR) for data analytics that incorporates stream semantics based on a modern specification of streams, windows and stream aggregation, to combine batch and stream computation models. Arc extends Weld, an IR for batch computation and adds support for partitioned, out-of-order stream and window operators which are the most fundamental building blocks in contemporary data streaming. CCS Concepts • Software and its engineering → Context specific languages; Compilers; • Computing methodologies → Distributed computing methodologies.

Research paper thumbnail of Last Lecture Overview

�Next week: �examination of your solutions �You will be called �remember: bonus points… �What is ... more �Next week: �examination of your solutions �You will be called �remember: bonus points… �What is the next assignment: �Theory assignment (no programming) �Extra voluntary assignment

Research paper thumbnail of TruApp: A TrustZone-based authenticity detection service for mobile apps

2017 IEEE 13th International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), 2017

In less than a decade, mobile apps have become an integral part of our lives. In several situations it is important to provide assurance that a mobile app is authentic, i.e., that it is indeed the app produced by a certain company. However, this is challenging, as such apps can be repackaged, the user may be malicious, or the app may be tampered with by an attacker. This paper presents the design of TRUAPP, a software authentication service that provides assurance of the authenticity and integrity of apps running on mobile devices. TRUAPP provides such assurance, even if the operating system is compromised, by leveraging the ARM TrustZone hardware security extension. TRUAPP uses a set of techniques (static watermarking, dynamic watermarking, and cryptographic hashes) to verify the integrity of the apps. The service was implemented on a hardware board that emulates a mobile device, which was used for a thorough experimental evaluation of the service.
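Of the techniques listed, the cryptographic-hash check is the simplest to illustrate. The sketch below is not TRUAPP's implementation and involves no TrustZone component. It only shows, under the assumption that a trusted reference digest is available, how a SHA-256 comparison detects modification. The names `sha256_hex` and `is_untampered` and the sample bytes are hypothetical.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()

def is_untampered(app_bytes: bytes, expected_hex: str) -> bool:
    """Compare the digest of app_bytes with a trusted reference digest.

    hmac.compare_digest performs a constant-time comparison.
    """
    return hmac.compare_digest(sha256_hex(app_bytes), expected_hex)

# Hypothetical stand-in for an app package; a real check would read the file.
original = b"example app package contents"
reference = sha256_hex(original)

unchanged_ok = is_untampered(original, reference)
modified_ok = is_untampered(original + b"injected", reference)
```

A hash check of this kind is only as trustworthy as the environment that computes it and the place where the reference digest is stored, which is why the abstract pairs it with a hardware security extension.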

Research paper thumbnail of Apache Flink™: Stream and Batch Processing in a Single Engine

IEEE Data Eng. Bull., 2015

Modern enterprise applications are currently undergoing a complete paradigm shift away from traditional transactional processing to combined analytical and transactional processing. This challenge of combining two opposing query types in a single database management system results in additional requirements for transaction management as well. In this paper, we discuss our approach to achieve high throughput for transactional query processing while allowing concurrent analytical queries. We present our approach to distributed snapshot isolation and optimized two-phase commit protocols.

Research paper thumbnail of Kompics Scala: narrowing the gap between algorithmic specification and executable code (short paper)

Proceedings of the 8th ACM SIGPLAN International Symposium on Scala, 2017

Research paper thumbnail of A history of the Oz multiparadigm language

Proceedings of the ACM on Programming Languages, 2020

Oz is a programming language designed to support multiple programming paradigms in a cleanly factored way that is easy to program despite its broad coverage. It started in 1991 as a collaborative effort by the DFKI (Germany) and SICS (Sweden) and led to an influential system, Mozart, that was released in 1999 and widely used in the 2000s for practical applications and education. We give the history of Oz as it developed from its origins in logic programming, starting with Prolog, followed by concurrent logic programming and constraint logic programming, and leading to its two direct precursors, the concurrent constraint model and the Andorra Kernel Language (AKL). We give the lessons learned from the Oz effort, including successes and failures, and we explain the principles underlying the Oz design. Oz is defined through a kernel language, which is a formal model similar to a foundational calculus, but that is designed to be directly useful to the programmer. The kernel language is orga...

Research paper thumbnail of High-Level Programming Abstractions for Distributed Graph Processing

IEEE Transactions on Knowledge and Data Engineering, 2018

Efficient processing of large-scale graphs in distributed environments has been an increasingly popular topic of research in recent years. Interconnected data that can be modeled as graphs arise in application domains such as machine learning, recommendation, web search, and social network analysis. Writing distributed graph applications is inherently hard and requires programming models that can cover a diverse set of problem domains, including iterative refinement algorithms, graph transformations, graph aggregations, pattern matching, ego-network analysis, and graph traversals. Several high-level programming abstractions have been proposed and adopted by distributed graph processing systems and big data platforms. Even though significant work has been done to experimentally compare distributed graph processing frameworks, no qualitative study and comparison of graph programming abstractions has been conducted yet. In this survey, we review and analyze the most prevalent high-level programming models for distributed graph processing, in terms of their semantics and applicability. We identify the classes of graph applications that can be naturally expressed by each abstraction and we also give examples of applications that are hard or impossible to express. We review 34 distributed graph processing systems with respect to their programming abstractions, execution models, and communication mechanisms. Finally, we discuss trends and open research questions in the area of distributed graph processing.

Research paper thumbnail of Static Type Checking for the Kompics Component Model

First Workshop on Programming Models and Languages for Distributed Computing, 2016

Distributed systems are becoming an increasingly important part of systems and applications software and it is widely accepted that writing correct distributed systems is challenging. Message-passing concurrency models are the dominant programming paradigm and, even in statically typed languages, programming frameworks typically only have limited type checking support for messages, channels, and ports or mailboxes. In this paper, we present Kola, a language-level implementation of Kompics, a component model with message-passing concurrency. Kola comes with its own compiler and some special language constructs which extend Java's type system as necessary to enforce static type checking on messages, channels, and ports. We show that Kola improves the readability of Kompics code and removes opportunities to introduce bugs, at the cost of little compile-time overhead and no runtime overhead.

Research paper thumbnail of SICStus Prolog library manual, version 2.1 #8

This Manual corresponds to SICStus Prolog release 2.1 #8. The Prolog library comprises a number of packages which are thought to be useful in a number of applications. Note that the predicates in the Prolog library are not built-in predicates. One has to explicitly load each package to get access to its predicates. To load a library package Package, you will normally enter a query: | ?- use_module(library(Package)). Library packages may be compiled and consulted as well as loaded.

Research paper thumbnail of The DSS, a Middleware Library for Efficient and Transparent Distribution of Language Entities

Research paper thumbnail of Garbage Collection for Prolog Based on WAM

Research paper thumbnail of Network Explanations to Web Economy's Patterns of Growth

From its early days, the World Wide Web space has demonstrated strong agglomeration trends, with a very small number of web sites capturing the larger part of the Internet population. At first glance, agglomeration over the virtual space sounds like a paradox. Web sites are numerous and highly diversified and can be easily reached from anywhere and by anybody, with no particular transportation or search cost. However, Internet users use only a small number of sites for searching for information and products, interacting with others, and socializing, thus producing dense concentrations and locational patterns similar to those observed in the physical space, where few cities and industrial clusters host the huge majority of the population and the entire industrial activity. Does this depend on the attractiveness of the popular web sites, or are there agglomeration economies providing incentives to users to be in a location which has been visited by other users or pointed to by other sites? Thi...

Research paper thumbnail of SmoothCache 2.0

Proceedings of the 6th ACM Multimedia Systems Conference, 2015

In recent years, adaptive HTTP streaming protocols have become the de facto standard in the industry for the distribution of live and video-on-demand content over the Internet. This paper presents SmoothCache 2.0, a distributed cache platform for adaptive HTTP live streaming content based on peer-to-peer (P2P) overlays. The contribution of this work is twofold. From a systems perspective, to the best of our knowledge, it is the only P2P platform which supports recent live streaming protocols based on HTTP as a transport and the concept of adaptive bitrate switching. From an algorithmic perspective, the system describes a novel set of overlay construction and prefetching techniques that realize: i) substantial savings in terms of the bandwidth load on the source of the stream, and ii) CDN-quality user experience in terms of playback latency and the watched bitrate. In order to support our claims, we conduct a methodical evaluation on thousands of real consumer machines.