General Terms Research Papers - Academia.edu

Efficient implementations of Dijkstra's shortest path algorithm are investigated. A new data structure, called the radix heap, is proposed for use in this algorithm. On a network with n vertices, m edges, and nonnegative integer arc costs bounded by C, a one-level form of radix heap gives a time bound for Dijkstra's algorithm of O(m + n log C). A two-level form of radix heap gives a bound of O(m + n log C / log log C). A combination of a radix heap and a previously known data structure called a Fibonacci heap gives a bound of O(m + n √(log C)). The best previously known bounds are O(m + n log n) using Fibonacci heaps alone and O(m log log C) using the priority queue structure of Van Emde Boas et al. [17].
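To make the bucket idea concrete, here is a minimal Python sketch of a monotone radix heap driving Dijkstra's algorithm. It is not the paper's construction: it assumes the XOR-based bucketing variant (the bucket index is the bit length of `key XOR last`), grows its bucket list on demand, and uses lazy deletion instead of decrease-key, so it illustrates the structure rather than the stated time bounds. The names `RadixHeap` and `dijkstra` are illustrative only.

```python
class RadixHeap:
    """Monotone priority queue for non-negative integer keys (illustrative sketch)."""

    def __init__(self):
        self.buckets = [[]]
        self.last = 0   # key of the most recently extracted minimum
        self.size = 0

    def _bucket(self, key):
        # Bucket index = position of the highest bit in which key differs from last.
        i = (key ^ self.last).bit_length()
        while len(self.buckets) <= i:
            self.buckets.append([])
        return self.buckets[i]

    def push(self, key, item):
        assert key >= self.last, "keys must not drop below the last extracted minimum"
        self._bucket(key).append((key, item))
        self.size += 1

    def pop(self):
        if not self.buckets[0]:
            # Take the first non-empty bucket, make its minimum the new reference
            # key, and redistribute its entries into strictly lower buckets.
            i = next(j for j, b in enumerate(self.buckets) if b)
            entries, self.buckets[i] = self.buckets[i], []
            self.last = min(key for key, _ in entries)
            for key, item in entries:
                self._bucket(key).append((key, item))
        self.size -= 1
        return self.buckets[0].pop()


def dijkstra(graph, source):
    """graph: {u: [(v, cost), ...]} with non-negative integer costs."""
    dist = {source: 0}
    heap = RadixHeap()
    heap.push(0, source)
    while heap.size:
        d, u = heap.pop()
        if d > dist[u]:
            continue  # stale entry left behind by lazy deletion
        for v, cost in graph.get(u, ()):
            nd = d + cost
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heap.push(nd, v)
    return dist


graph = {"a": [("b", 7), ("c", 2)], "c": [("b", 3), ("d", 8)], "b": [("d", 1)], "d": []}
print(dijkstra(graph, "a"))
```

The heap works here because Dijkstra's algorithm never inserts a key smaller than the last extracted minimum, which is the monotonicity this structure relies on.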

Is it possible to reduce the expected response time of every request at a web server, simply by changing the order in which we schedule the requests? That is the question we ask in this paper. This paper proposes a method for improving the performance of web servers servicing static HTTP requests. The idea is to give preference to requests for ...

The goal of this article is to review the state-of-the-art tracking methods, classify them into different categories, and identify new trends. Object tracking, in general, is a challenging problem. Difficulties in tracking objects can arise due to abrupt object motion, changing appearance patterns ...

This paper presents a new model for exception handling, called the replacement model. The replacement model, in contrast to other exception-handling proposals, supports all the handler responses of resumption, termination, retry, and exception propagation, within both statements and expressions, in a modular, simple, and uniform fashion. The model can be embedded in any expression-oriented language and can also be adapted to languages which are not expression oriented with almost all the above advantages. This paper presents the syntactic extensions for embedding the replacement model into Algol 68 and its operational semantics. An axiomatic semantic definition for the model can be found in [27]. Categories and Subject Descriptors: D.3.3 [Programming Languages]: Language Constructs - abstract ...

With the introduction of modularity, increasing student numbers and the continued expansion of university departments, space in Nigerian universities is becoming an increasingly precious commodity. To address this, some institutions have tried to ensure efficient space utilization by employing different proposed solutions to space allocation problems, especially during the examination period. A number of approaches have been explored in the casting of examination timetables for academic institutions. The approach to be discussed here applies a genetic algorithm using a hierarchy of constraints. This hierarchy can incorporate individual requests or organizational requirements by weighing them according to some criteria. In this paper, we present a new real-world examination timetabling dataset at the University of Agriculture, Abeokuta, Nigeria that will hopefully be used as a future benchmark problem. In addition, a new objective function that attempts to spread exams throughout the examinati...

We consider multiprocessing systems where processes make independent, Poisson distributed resource requests with mean arrival time 1. We assume that resources are not released. It is shown that the expected deadlock time is never less than 1, no matter how many processes and resources are in the system. Also, the expected number of processes blocked by deadlock time is one-half more than half the number of initially active processes. We obtain expressions for system statistics such as expected deadlock time, expected total processing time, and system efficiency, in terms of Abel sums. We derive asymptotic expressions for these statistics in the case of systems with many processes and the case of systems with a fixed number of processes. In the latter, generalizations of the Ramanujan Q-function arise. We use singularity analysis to obtain asymptotics of coefficients of generalized Q-functions.

High-speed communications link cores must consume low power, feature low bit-error rates (BER), and address many applications. We present a methodology to design adaptive link architectures, whereby the link’s internal logic complexity, frequency, and supply are simultaneously adapted to application requirements. The requirement space is mapped to the design space using requirements measurement circuits and configurable logic blocks. CMOS results indicate that power savings of 60% versus the worst case are possible, while the area overhead is kept under 5%.

The problem of searching the elements of a set that are close to a given query element under some similarity criterion has a vast number of applications in many branches of computer science, from pattern recognition to textual and multimedia information retrieval. We are interested in the rather general case where the similarity criterion defines a metric space, instead of the more restricted case of a vector space. Many solutions have been proposed in different areas, in many cases without cross-knowledge. Because of this, the same ideas have been reconceived several times, and very different presentations have been given for the same approaches. We present some basic results that explain the intrinsic difficulty of the search problem. This includes a quantitative definition of the elusive concept of "intrinsic dimensionality." We also present a unified view of all the known proposals to organize metric spaces, so as to be able to understand them under a common framework....

The splay tree, a self-adjusting form of binary search tree, is developed and analyzed. The binary search tree is a data structure for representing tables and lists so that accessing, inserting, and deleting items is easy. On an n-node splay tree, all the standard search tree operations have an amortized time bound of O(log n) per operation, where by “amortized time” is meant the time per operation averaged over a worst-case sequence of operations. Thus splay trees are as efficient as balanced trees when total running time is the measure of interest. In addition, for sufficiently long access sequences, splay trees are as efficient, to within a constant factor, as static optimum search trees. The efficiency of splay trees comes not from an explicit structural constraint, as with balanced trees, but from applying a simple restructuring heuristic, called splaying, whenever the tree is accessed. Extensions of splaying give simplified forms of two other data structures: lexicographic...

Micro-task markets such as Amazon's Mechanical Turk represent a new paradigm for accomplishing work, in which employers can tap into a large population of workers around the globe to accomplish tasks in a fraction of the time and money of more traditional methods. However, such markets have been primarily used for simple, independent tasks, such as labeling an image or judging the relevance of a search result. Here we present a general purpose framework for accomplishing complex and interdependent tasks using micro-task markets. We describe our framework, a web-based prototype, and case studies on article writing, decision making, and science journalism that demonstrate the benefits and limitations of the approach. ACM Classification: H5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

An overview of American law on website terms of use, and of some consumer protections under American law. Developed from my talk entitled “Les contrats du commerce électronique soumis au droit américain” at the Journée d’Etude: Les défis du numérique dans les entreprises en Europe, held in Toulouse, France, on 27 February 2015.

Research findings are often transmitted both as written documents and narrated slide presentations. As these two forms of media contain both unique and replicated information, it is useful to combine and align these two views to create a single synchronized medium. We introduce SlideSeer, a digital library that discovers, aligns and presents such presentation and document pairs. We discuss the three major system components of the SlideSeer DL: 1) the resource discovery, 2) the fine-grained alignment and 3) the user interface. For resource discovery, we have bootstrapped collection building using metadata from DBLP and CiteSeer. For alignment, we modify maximum similarity alignment to favor monotonic alignments and incorporate a classifier to handle slides which should not be aligned. For the user interface, we allow the user to seamlessly switch between four carefully motivated views of the resulting synchronized media pairs.

There is a growing need for ad-hoc analysis of extremely large data sets, especially at internet companies where innovation critically depends on being able to analyze terabytes of data collected every day. Parallel database products, e.g., Teradata, offer a solution, but ...

A new model for the One-dimensional Cutting Stock problem using Genetic Algorithms (GA) is developed to reduce waste of construction steel bars. One-dimensional construction stocks (e.g., steel rebars, steel sections, dimensional lumber) are one of the major contributors to the construction waste stream. Construction wastes account for a significant portion of the municipal waste stream. Cutting one-dimensional stocks to suit needed project lengths results in trim losses, which are the main causes of one-dimensional stock wastes. The model developed and the results obtained were compared with real-life case studies from local steel workshops. Cutting schedules produced by our new GA model were tested in the shop against the current cutting schedules. The comparisons show the superiority of this new GA model in terms of waste minimization.

Data created by social bookmarking systems can be described as 3-partite 3-uniform hypergraphs connecting documents, users, and tags (tagging networks), such that the toolbox of complex network analysis can be applied to examine their properties. One of the most basic tools, the analysis of connected components, however, cannot be applied meaningfully: tagging networks tend to be almost entirely connected. We ...

The adequacy of a programming language to a given software project or application domain is often considered a key factor of success in software development and engineering, even though little theoretical or practical information is readily available to help make an informed decision. In this paper, we address a particular version of this issue by comparing the adequacy of general-purpose synchronous programming languages to more domain-specific languages (DSL) in the field of computer music. More precisely, we implemented and tested the same lookup table oscillator example program, one of the most classical algorithms for sound synthesis, using a selection of significant synchronous programming languages, half of which were designed as specific music languages – Csound, Pure Data, SuperCollider, ChucK, Faust – and the other half being general synchronous formalisms – Signal, Lustre, Esterel, Lucid Synchrone and C with the OpenMP Stream Extension (Matlab/Octave is used for the initial ...
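For readers unfamiliar with the benchmark program, a lookup table oscillator can be sketched in a few lines of Python. This is only an illustration of the general technique, not the paper's specification: the table size, the sample rate, the use of linear interpolation, and the names `make_sine_table` and `oscillator` are all assumptions.

```python
import math


def make_sine_table(size=1024):
    """One period of a sine wave, sampled at `size` points."""
    return [math.sin(2 * math.pi * i / size) for i in range(size)]


def oscillator(table, freq_hz, sample_rate=44100, num_samples=64):
    """Read the table at a rate proportional to the frequency, interpolating linearly."""
    size = len(table)
    step = freq_hz * size / sample_rate   # table positions advanced per output sample
    phase = 0.0
    out = []
    for _ in range(num_samples):
        i = int(phase)
        frac = phase - i
        a, b = table[i], table[(i + 1) % size]
        out.append(a + frac * (b - a))
        phase = (phase + step) % size     # wrap around at the end of the table
    return out


# 441 Hz at 44.1 kHz gives a period of exactly 100 samples.
samples = oscillator(make_sine_table(), 441.0, 44100, 100)
```

The per-sample loop with a carried `phase` value is the kind of stateful stream computation that synchronous languages express directly.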

Citation management is an important task in managing digital libraries. Citations provide valuable information, e.g., for evaluating an author's influence or scholarly quality (the impact factor of research journals). But although reliable and effective autonomous citation management is essential, manual citation management can be extremely costly. Automatic citation mining, on the other hand, is a non-trivial task, mainly due to non-conforming citation styles, spelling errors and the difficulty of reliably extracting text from PDF documents. In this paper we propose a novel rule-based autonomous citation mining technique to address this important task. We define a set of common heuristics that together allow us to improve the state of the art in automatic citation mining. Moreover, by first disambiguating citations based on venues, our technique significantly enhances the correct discovery of citations. Our experiments show that the proposed approach is indeed able to overcom...

In this paper we consider the problem of resilient data aggregation, namely, when aggregation has to be performed on a compromised sample. We present a statistical framework that is designed to mitigate the effects of an attacker who is able to alter the values of the measured parameters of the environment around some of the sensor nodes. Our proposed framework takes advantage of the naturally existing correlation between the sample elements, which is very rarely considered in other sensor network related papers. The algorithms presented are to be applied without assumption on the sensor network’s sampling distribution or on the behaviour of the attacker. The effectiveness of the algorithms is formally evaluated.

Automatic clustering of webpages helps a number of information retrieval tasks, such as improving user interfaces, collection clustering, introducing diversity in search results, etc. Typically, webpage clustering algorithms only use features extracted from the page-text. However, the advent of social-bookmarking websites, such as StumbleUpon.com and Delicious.com, has led to a huge amount of user-generated content such as the social tag information that is associated with the webpages. In this paper, we present a subspace based feature extraction approach which leverages the social tag information to complement the page-contents of a webpage for extracting better features, with the goal of improved clustering performance. In our approach, we consider page-text and tags as two separate views of the data, and learn a shared subspace that maximizes the correlation between the two views. Any clustering algorithm can then be applied in this subspace. We then present an extension that all...

This tutorial presents a detailed study of sensor faults that occur in deployed sensor networks and a systematic approach to model these faults. We begin by reviewing the fault detection literature for sensor networks. We draw from current literature, our own experience, and data collected from scientific deployments to develop a set of commonly used features useful in detecting and diagnosing sensor faults. We use this feature set to systematically define commonly observed faults, and provide examples of each of these faults from sensor data collected at recent deployments. Categories and Subject Descriptors: B.8.1 [Reliability, Testing, and Fault-Tolerance]: Fault

Multiplayer online 3D games have become very popular in recent years. However, existing games require the complete game content to be installed prior to game playing. Since the content is usually large in size, it may be difficult to run these games on a PDA or other handheld devices. It also pushes game companies to distribute their games as CD-ROMs/DVD-ROMs rather than as online downloads. On the other hand, due to network latency, players may perceive discrepant status of some dynamic game objects. In this paper, we present a game-on-demand (GameOD) framework to distribute game content progressively in an on-demand manner. It allows critical contents to be available at the players' machines in a timely fashion. We present a simple distributed synchronization method to allow concurrent players to synchronize their perceived game status. Finally, we show some performance results of the proposed framework.

This text offers a personal and very subjective view on the current situation of Music Information Research (MIR). Motivated by the desire to build systems with a somewhat deeper understanding of music than the ones we currently have, I try to sketch a number of challenges for the next decade of MIR research, grouped around six simple truths about music that are probably generally agreed on but often ignored in everyday research.

We present a new method for constructing nearly orthogonal Latin hypercubes that greatly expands their availability to experimenters. Latin hypercube designs have proven useful for exploring complex, high-dimensional computational models, but can be plagued with unacceptable correlations among input variables. To improve upon their effectiveness, many researchers have developed algorithms that generate orthogonal and nearly orthogonal Latin hypercubes. Unfortunately, these methodologies can have strict limitations on the feasible number of experimental runs and variables. To overcome these restrictions, we develop a mixed integer programming algorithm that generates Latin hypercubes with little or no correlation among their columns for most any determinate run-variable combination—including fully saturated designs. Moreover, many designs can be constructed for a specified number of runs and factors—thereby providing experimenters with a choice of several designs. In addition, our al...
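The correlation problem described above can be seen with a small Python sketch. Note the swapped-in technique: this builds a plain random Latin hypercube by shuffling each column, not the mixed integer programming construction from the paper, and then measures the largest absolute pairwise column correlation. The names `random_latin_hypercube`, `pearson`, and `max_abs_correlation` are illustrative.

```python
import math
import random
from itertools import combinations


def random_latin_hypercube(runs, factors, seed=0):
    """Return `factors` columns, each a random permutation of the levels 1..runs."""
    rng = random.Random(seed)
    columns = []
    for _ in range(factors):
        column = list(range(1, runs + 1))
        rng.shuffle(column)
        columns.append(column)
    return columns


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def max_abs_correlation(columns):
    """Largest absolute correlation over all pairs of columns."""
    return max(abs(pearson(a, b)) for a, b in combinations(columns, 2))


design = random_latin_hypercube(runs=17, factors=4)
print(max_abs_correlation(design))

# A tiny orthogonal pair of columns for comparison: correlation is exactly zero.
print(pearson([1, 2, 3, 4], [2, 4, 1, 3]))
```

A random design typically shows nonzero correlation between some pair of columns, which is the defect that orthogonal and nearly orthogonal constructions aim to remove.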

Office-by-Example (OBE) is an integrated office information system that has been under development at IBM Research. OBE, an extension of Query-by-Example, supports various office features such as database tables, word processing, electronic mail, graphics, images, and so forth. These seemingly heterogeneous features are integrated through a language feature called example elements. Applications involving example elements are processed by the database manager, an integrated part of the OBE system. In this paper we describe the facilities and architecture of the OBE system and discuss the techniques for integrating heterogeneous objects.

Object-oriented programming languages provide many software engineering benefits, but these often come at a performance cost. Object-oriented programs make extensive use of method invocations and pointer dereferences, both of which are potentially costly on modern machines. We show how to use types to produce effective, yet simple, techniques that reduce the costs of these features in Modula-3, a statically typed, object-oriented language. Our compiler performs type-based alias analysis to disambiguate memory references. It uses the results of the type-based alias analysis to eliminate redundant memory references and to replace monomorphic method invocation sites with direct calls. Using limit, static, and running time evaluation, we demonstrate that these techniques are effective, and sometimes perfect for a set of Modula-3 benchmarks.

Feature-based model templates have been proposed as a technique for modeling software product lines. We describe a set of tools supporting the technique, namely a feature model editor and feature configurator, and a model-template editor, processor, and verifier.

In this article we propose a standard for role-based access control (RBAC). Although RBAC models have received broad support as a generalized approach to access control, and are well recognized for their many advantages in performing large-scale authorization management, no single authoritative definition of RBAC exists today. This lack of a widely accepted model results in uncertainty and confusion about RBAC's utility and meaning. The standard proposed here seeks to resolve this situation by unifying ideas from a base of frequently referenced RBAC models, commercial products, and research prototypes. It is intended to serve as a foundation for product development, evaluation, and procurement specification. Although RBAC continues to evolve as users, researchers, and vendors gain experience with its application, we feel the features and components proposed in this standard represent a fundamental and stable set of mechanisms that may be enhanced by developers in further meeting...
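The basic idea behind role-based access control can be sketched in Python as follows. This is a minimal illustration only: users acquire permissions solely through the roles assigned to them. The class and method names (`RBAC`, `assign_role`, `grant`, `check`) are hypothetical and are not taken from the proposed standard, which covers considerably more than this sketch.

```python
from collections import defaultdict


class RBAC:
    """Toy model: users map to roles, roles map to (operation, object) permissions."""

    def __init__(self):
        self.user_roles = defaultdict(set)
        self.role_permissions = defaultdict(set)

    def assign_role(self, user, role):
        self.user_roles[user].add(role)

    def grant(self, role, operation, obj):
        self.role_permissions[role].add((operation, obj))

    def check(self, user, operation, obj):
        """A user may act only if at least one assigned role carries the permission."""
        return any(
            (operation, obj) in self.role_permissions.get(role, ())
            for role in self.user_roles.get(user, ())
        )


rbac = RBAC()
rbac.assign_role("alice", "editor")
rbac.grant("editor", "write", "report")
print(rbac.check("alice", "write", "report"))
print(rbac.check("bob", "write", "report"))
```

Because permissions attach to roles rather than to individual users, changing what a job function may do is a single update to the role.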

In this paper, we describe a procedure for memory design and exploration for low power embedded systems. Our procedure tries to reduce the power consumption due to memory traffic by (i) applying memory optimizing transformations such as loop transformations, (ii) storing frequently accessed variables in a register file and an on-chip cache, and (iii) reducing the conflict misses by appropriate choice of cache size and data placement in off chip memory. We then choose a cache configuration (cache size, line size) that satisfies the system requirements of area, number of cycles and energy. We include energy in the performance metrics, since for different cache configurations, the variation in energy is quite different from the variation in the number of cycles. Our memory exploration procedure considers only a selected set of candidate points, thereby reducing the search time significantly. INTRODUCTION In systems that involve multidimensional streams of signals such as images or vide...

Information retrieval systems traditionally rely on textual keywords to index and retrieve documents. Keyword-based retrieval may return inaccurate and incomplete results when different keywords are used to describe the same concept in the documents and in the queries. Furthermore, the relationship between these related keywords may be semantic rather than syntactic, and capturing it thus requires access to comprehensive human world knowledge. Concept-based retrieval methods have attempted to tackle these difficulties by using manually built thesauri, by relying on term cooccurrence data, or by extracting latent word relationships and concepts from a corpus. In this article we introduce a new concept-based retrieval approach based on Explicit Semantic Analysis (ESA), a recently proposed method that augments keyword-based text representation with concept-based features, automatically extracted from massive human knowledge repositories such as Wikipedia. Our approach generates new tex...

Our aim is to produce a focused crawler that, given one or a number of sample pages, will crawl to all similar pages on the web as efficiently as possible. A key problem in achieving this goal is assigning credit to documents along a crawl path, so that the system can learn which documents lead toward goal documents and can thus efficiently search in a best first manner. To address this problem, we construct an artificial economy of autonomous agents. Each agent buys and sells web pages and is compensated when it buys a goal page, when another agent buys the current set of uncrawled web pages, or when future agents buy a goal page. The economy serves to push money up from goal pages, compensating agents that buy useful pages. Inappropriate agents go broke and new agents are created, and the system evolves agents whose bids accurately estimate the utility of adding pages to the search. The system is found to outperform a Bayesian focused crawler in our experiments.

Despite growing awareness of the accessibility issues surrounding touch screen use by blind people, designers still face challenges when creating accessible touch screen interfaces. One major stumbling block is a lack of understanding about how blind people actually use touch screens. We conducted two user studies that compared how blind people and sighted people use touch screen gestures.