W. Verhaegh - Academia.edu
Papers by W. Verhaegh
BMC Bioinformatics, 2009
Background: Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as to biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation, and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored, and hence their exact role is unclear.
Results: We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data from six state-of-the-art preprocessing methods and a compendium of eight different datasets, involving 1131 hybridizations and containing data from both one- and two-color array technology. For a wide range of classifiers, we performed a joint study of performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance, using perturbed expression profiles based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical.
In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical.
Conclusion: Feature variability can have a strong impact on breast cancer signature composition, as well as on the classification of individual patient samples. We therefore strongly recommend that feature variability be considered when analyzing data from microarray breast cancer expression profiling experiments.
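The perturbation-based stability test described in the Results can be sketched as follows. This is a minimal illustration using a nearest-centroid classifier, synthetic data, and a single global noise level standing in for the per-feature uncertainty information that the study derives from the preprocessing methods; all names, data, and numbers here are hypothetical, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def nearest_centroid_fit(X, y):
    # One centroid per class; a minimal stand-in for the classifiers in the study.
    classes = np.unique(y)
    centroids = np.array([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def nearest_centroid_predict(X, classes, centroids):
    # Assign each sample to the class with the nearest centroid.
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[d.argmin(axis=1)]

def concordance_under_noise(X, y, sigma, n_perturb=50):
    """Fraction of samples whose class call survives feature perturbation.

    sigma is a hypothetical per-feature measurement uncertainty; in the study
    it would come from the preprocessing method's error model."""
    classes, centroids = nearest_centroid_fit(X, y)
    base = nearest_centroid_predict(X, classes, centroids)
    agree = np.zeros(len(X))
    for _ in range(n_perturb):
        Xp = X + rng.normal(scale=sigma, size=X.shape)
        agree += nearest_centroid_predict(Xp, classes, centroids) == base
    return (agree / n_perturb).mean()

# Toy data: two well-separated classes, 20 features each.
X = np.vstack([rng.normal(0, 1, (30, 20)), rng.normal(3, 1, (30, 20))])
y = np.array([0] * 30 + [1] * 30)
print(concordance_under_noise(X, y, sigma=0.5))
```

With well-separated classes and small noise the concordance is close to 1; shrinking the class separation or inflating sigma drives it down, which is the qualitative effect the stability analysis measures.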
Lecture Notes in Computer Science, 2004
We discuss the issue of privacy protection in collaborative filtering, focusing on the commonly used memory-based approach. We show that the two main steps in collaborative filtering, namely the determination of similarities and the prediction of ratings, can be performed on encrypted profiles, thereby securing the users’ private data. We list a number of variants of the similarity measures and prediction …
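The two memory-based steps that the paper shows can be carried out on encrypted profiles look as follows in the clear. The ratings, user names, and the choice of cosine similarity are illustrative assumptions, not taken from the paper.

```python
import math

# Hypothetical plaintext rating profiles: user -> {item: rating}.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob":   {"m1": 4, "m2": 2, "m3": 5},
    "carol": {"m1": 1, "m2": 5, "m3": 2},
}

def cosine_sim(u, v):
    # Step 1: similarity between two profiles over their commonly rated items.
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = (math.sqrt(sum(u[i] ** 2 for i in common))
           * math.sqrt(sum(v[i] ** 2 for i in common)))
    return num / den

def predict(user, item):
    # Step 2: similarity-weighted average of the other users' ratings.
    num = den = 0.0
    for other, profile in ratings.items():
        if other == user or item not in profile:
            continue
        s = cosine_sim(ratings[user], profile)
        num += s * profile[item]
        den += abs(s)
    return num / den if den else None

print(predict("alice", "m1"))
```

The point of the paper is that both the inner products in step 1 and the weighted sum in step 2 can be evaluated on encrypted profiles, so neither user reveals raw ratings; this sketch only shows what is being computed, not the encryption.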
Proceedings International Test Conference 1996. Test and Design Validity
This paper addresses the problem of constructing a scan chain such that (1) the area overhead is minimal for latch-based designs, and (2) the number of pipeline scan shifts is minimal. We present an efficient heuristic algorithm to construct near-optimal scan chains. On the theoretical side, we show that part (1) of the problem can be solved in polynomial time and that part (2) is NP-hard, thus precisely pinpointing the source of complexity and justifying our heuristic approach. Experimental results on three industrial asynchronous IC designs show (1) less than 0.1% extra scan latches for level-sensitive scan design, and (2) scan shift reductions of up to 86% over traditional scan schemes.
Real-Time Systems, 2009
Fixed-priority scheduling with deferred preemption (FPDS) has been proposed in the literature as a viable alternative to fixed-priority pre-emptive scheduling (FPPS) that obviates the need for non-trivial resource access protocols and reduces the cost of arbitrary preemptions. This paper shows that the existing worst-case response time analysis of hard real-time tasks under FPDS, arbitrary phasing and relative deadlines at most equal to periods is pessimistic and/or optimistic. The same problem also arises for fixed-priority non-pre-emptive scheduling (FPNS), which is a special case of FPDS. This paper provides a revised analysis that resolves the problems with the existing approaches. The analysis is based on the known concepts of critical instant and busy period for FPPS; to accommodate our scheduling model for FPDS, we slightly modify the existing definitions of these concepts. The analysis assumes a continuous scheduling model, based on a partitioning of the timeline into a set of non-empty, right semi-open intervals. It is shown that the critical instant, longest busy period, and worst-case response time for a task are suprema rather than maxima for all tasks except the lowest-priority task. Hence, that instant, period, and response time cannot be assumed for any task except the lowest-priority task. Moreover, it is shown that the analysis is not uniform for all tasks, i.e. the analysis for the lowest-priority task differs from that for the higher-priority tasks. Excerpts of this document have been published as Bril et al. (2007).
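For context, the classic FPPS worst-case response time recurrence that this analysis builds on (not the paper's revised FPDS analysis) can be sketched as a fixed-point iteration; the task set used here is a made-up example.

```python
import math

def fpps_response_time(C, T, i, limit=1000):
    """Classic FPPS recurrence R = C_i + sum_{j<i} ceil(R / T_j) * C_j.

    Tasks are indexed by priority (0 = highest); C and T hold computation
    times and periods, with relative deadlines equal to periods."""
    R = C[i]
    for _ in range(limit):
        R_next = C[i] + sum(math.ceil(R / T[j]) * C[j] for j in range(i))
        if R_next == R:
            return R          # fixed point reached: worst-case response time
        if R_next > T[i]:
            return None       # deadline (= period) missed: unschedulable
        R = R_next
    return None

# Hypothetical task set: C = (1, 2, 3), T = (4, 8, 16).
C, T = [1, 2, 3], [4, 8, 16]
print([fpps_response_time(C, T, i) for i in range(3)])  # → [1, 3, 7]
```

The paper's point is that for FPDS, where preemption is deferred to subjob boundaries, this style of analysis needs revised definitions of critical instant and busy period, and the worst case becomes a supremum rather than a maximum for all but the lowest-priority task.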
Journal of Scheduling, 2001
… research.philips.com. Copyright © 2001 John Wiley & Sons, Ltd. … J. Aerts, J. Korst and W. Verhaegh. Figure 1: model of a multimedia server. … requests that arrive in one period are serviced in the next one. In the server …
IEEE Transactions on Computers, 2003
Random redundant data storage strategies have proven to be a good choice for efficient data storage in multimedia servers. These strategies lead to a retrieval problem in which it must be decided, for each requested data block, which disk to use for its retrieval. In this paper, we give a complexity classification of retrieval problems for random redundant storage.
Index Terms: random redundant storage, load balancing, video servers, complexity analysis.
1 INTRODUCTION
A multimedia server [13] offers continuous streams of multimedia data to multiple users. In a multimedia server, one can generally distinguish three parts: an array of hard disks to store the data, an internal network, and fast memory used for buffering. The latter is usually implemented in random access memory (RAM). The multimedia data is stored on the hard disks in blocks, such that a data stream is realized by periodically reading a block from disk and storing it in the buffer, from which the stream can be consumed in a continuous way. A block generally contains a couple of hundred milliseconds of video data. The total buffer space is split up into a number of buffers, one for each user. A user consumes, possibly at a variable bit rate, from his/her buffer, and the buffer is repeatedly refilled with blocks from the hard disks. A buffer generates a request for a new block as soon as the amount of data in the buffer drops below a certain threshold. We assume that requests are handled periodically in batches, such that the requests that arrive in one period are serviced in the next one [16]. In the server, we need a cost-efficient storage and retrieval strategy that guarantees, either deterministically or probabilistically, that the buffers do not underflow or overflow.
Load balancing is very important within a multimedia server, as efficient usage of the available bandwidth of the disk array increases the maximum number of users that can be serviced simultaneously, which results in a lower cost per user. Random redundant storage strategies have proven to enable good load balancing performance [1], [3], [15], [23]. In these storage strategies, each data block is stored more than once, on different, randomly chosen disks. This data redundancy gives the freedom to obtain a balanced load with high probability. To exploit this freedom, an algorithm is needed to solve, in each period, a retrieval problem, i.e., we have to decide, for each data block, from which disk(s) to retrieve it in such a way that the load is balanced.
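The per-period retrieval problem described above can be sketched with a simple greedy rule: serve each requested block from the currently least-loaded disk that holds a copy. The disk count, placement, and request set below are hypothetical, and the greedy rule is only an illustration, not the paper's complexity classification or an optimal algorithm.

```python
import random

random.seed(1)

NUM_DISKS = 10
COPIES = 2  # each block stored on two distinct, randomly chosen disks

# Hypothetical random redundant placement: block id -> disks holding a copy.
placement = {b: random.sample(range(NUM_DISKS), COPIES) for b in range(200)}

def retrieve(requests):
    """Greedy sketch of one period's retrieval problem: assign every requested
    block to the least-loaded disk among those holding a copy of it."""
    load = [0] * NUM_DISKS
    assignment = {}
    for b in requests:
        disk = min(placement[b], key=lambda d: load[d])
        assignment[b] = disk
        load[disk] += 1
    return assignment, load

assignment, load = retrieve(random.sample(range(200), 50))
print(max(load) - min(load))  # imbalance is typically small
```

Storing each block twice and always picking the less-loaded copy is the "power of two choices" effect that makes a near-balanced load likely; the exact problem of minimizing the maximum disk load is what the paper classifies.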
… in which there are only two tools is considered, and a polynomial-time algorithm is provided. … The groups are obtained by a hybrid method based on a temporal decomposition method. … The two latter criteria are equivalent to maximizing the total processing time and the weighted …
An important resource management issue in multimedia servers is to balance the load on the available hard disks. Using random redundant storage strategies, we can balance block requests over the disks by exploiting the freedom to service them by different disks. This induces a so-called retrieval problem, in which we have to decide for each block which disk to use for its retrieval. We formulate this problem as an MILP problem, prove that it is NP-complete, and show that the problem can be seen as a multiprocessor scheduling problem. Next, we describe a heuristic with a performance guarantee based on an LP-relaxation.
In this paper, we present QoS control challenges for multimedia consumer terminals, based on an application execution model and a QoS resource management framework. In the context of this framework, we briefly recapitulate earlier work aimed at QoS control for high-quality video processing. By relaxing a number of assumptions in that work, and by considering the work in relation to the framework and the applied video algorithms, new control challenges are identified.