Papers by Daniel Berend
Our starting point is the coupon collector's problem (CCP). In this problem, there are n coupons, drawn uniformly at random with replacement. The question is how many drawings are needed, on average, to collect at least one copy (or some other predetermined number m of copies) of each coupon. The problem may be traced back to the 18th century, having already been mentioned by de Moivre. Numerous questions have been posed based on the problem since its inception, and it turns out to arise naturally in many applications. A naive simulation of the process is trivial to implement, but its runtime makes it impractical for large values of n. We present here an alternative view of the coupon collecting process, for coupons with arbitrary probabilities, that allows us to increase the range of n (and m) for which the simulation may be run. For equiprobable coupons, we present additional improvements, making the simulation possible in a very short time for practically any n. More precisely, we show that the runtime of our algorithm is Θ(max{m, log n}). We present theoretical results concerning some of the quantities relevant to our algorithms and conduct simulations to test the algorithms in practice.
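For concreteness, here is a minimal sketch (in Python, not taken from the paper) of the naive simulation the abstract refers to; for m = 1 it needs on the order of n log n drawings in expectation, which is what makes it impractical for large n.

```python
import random

def naive_ccp(n, m=1):
    """Naive coupon-collector simulation: draw coupons uniformly with
    replacement until each of the n coupons has been seen at least m times.
    Returns the number of drawings."""
    counts = [0] * n
    remaining = n                      # coupons still short of m copies
    drawings = 0
    while remaining > 0:
        c = random.randrange(n)        # one uniform drawing
        drawings += 1
        counts[c] += 1
        if counts[c] == m:
            remaining -= 1
    return drawings

print(naive_ccp(10_000))               # roughly n log n for m = 1
```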
Springer optimization and its applications, 2016
We study maximum likelihood estimators (henceforth MLE) in experiments consisting of two stages, where the first-stage sample is unknown to us, but the second-stage samples are known and depend on the first-stage sample. The setup is similar to that of parametric empirical Bayes models, and arises naturally in numerous applications. However, problems arise when the number of second-level observations is not the same for all first-stage observations. As far as we know, this situation has been discussed in very few cases (see Brandel, Empirical Bayes Methods for Missing Data Analysis, Technical Report 2004:11, Department of Mathematics, Uppsala University, Sweden, 2004, and Carlin and Louis, Bayes and Empirical Bayes Methods for Data Analysis, 2nd edn., Chapman & Hall, Boca Raton, 2000), and no analytic expression for the indirect maximum likelihood estimator was derived there. The novelty of our paper is that it details and exemplifies this point. Specifically, we study two situations in detail: 1. Both levels correspond to normal distributions; here we are able to find an explicit formula for the MLE and show that it is a uniformly minimum-variance unbiased estimator (henceforth UMVUE). 2. Exponential first level and Poisson second level; here the MLE can usually be expressed only implicitly, as a solution of a certain polynomial equation. It seems that the MLE is usually not a UMVUE.
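As a toy illustration of case 1 (a sketch under the simplifying assumption that both variances are known, which is not the paper's full setting): if the hidden first-stage values are X_i ~ N(mu, sigma^2) and we observe Y_ij ~ N(X_i, tau^2) for j = 1, ..., n_i, then each group mean is marginally N(mu, sigma^2 + tau^2/n_i), and the MLE of mu is the precision-weighted average of the group means.

```python
import numpy as np

def mle_mu(group_means, group_sizes, sigma2, tau2):
    """MLE of the top-level mean mu in a two-stage normal model:
    hidden X_i ~ N(mu, sigma2), observed Y_ij ~ N(X_i, tau2), j = 1..n_i.
    Marginally Ybar_i ~ N(mu, sigma2 + tau2 / n_i), so the MLE is a
    precision-weighted mean.  sigma2 and tau2 are assumed known here
    (an illustrative assumption)."""
    ybar = np.asarray(group_means, dtype=float)
    n = np.asarray(group_sizes, dtype=float)
    w = 1.0 / (sigma2 + tau2 / n)      # precision of each group mean
    return float(np.sum(w * ybar) / np.sum(w))
```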
arXiv (Cornell University), Apr 11, 2023
arXiv (Cornell University), Apr 11, 2023
arXiv (Cornell University), Mar 30, 2020
We tackle some fundamental problems in probability theory on corrupted random processes on the integer line. We analyze when a biased random walk is expected to reach its bottommost point and when intervals of integer points can be detected under a natural model of noise. We apply these results to problems in learning thresholds and intervals under a new model for learning under adversarial design.
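For intuition, a small simulation sketch (hypothetical, not the paper's analysis): with an upward bias, the walk's minimum is attained early, and one can estimate the time of the bottommost point empirically.

```python
import random

def time_of_minimum(p_up=0.6, steps=10_000):
    """Simulate a biased random walk on the integers (probability p_up of a
    +1 step, else -1) and return the first time at which the running minimum
    is attained.  A finite horizon approximates the infinite walk."""
    pos, low, t_low = 0, 0, 0
    for t in range(1, steps + 1):
        pos += 1 if random.random() < p_up else -1
        if pos < low:
            low, t_low = pos, t
    return t_low

# Average over runs to estimate the expected time of the bottommost point.
est = sum(time_of_minimum() for _ in range(1000)) / 1000
print(f"estimated E[time of minimum] = {est:.1f}")
```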
Discrete Applied Mathematics, Jul 1, 2017
We consider scenarios where a sequence of tests is to be applied to an object, such that one outcome of a test may be a decision to terminate the sequence (e.g., deciding that the object is faulty) without running additional tests. One seeks an ordering of the tests that minimizes the expected resource consumption. In prior work, we examined conditions under which statistically independent test sequences can be optimized under precedence constraints. This paper examines conditions under which one can efficiently find an optimal ordering of tests with statistical dependencies. We show that with dependencies the optimization problem is NP-hard in the general case, and provide low-order polynomial time algorithms for special cases with non-trivial dependency structures.
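To ground the objective (a sketch of the classical unconstrained independent case, not this paper's dependent setting): the expected cost of an ordering weights each test's cost by the probability of reaching it, and an exchange argument shows that sorting tests by cost over termination probability is optimal.

```python
def expected_cost(order, costs, p_stop):
    """Expected resource consumption of running tests in the given order,
    where p_stop[i] is the probability that test i terminates the sequence
    (tests assumed statistically independent -- an illustrative assumption)."""
    total, p_reach = 0.0, 1.0
    for i in order:
        total += p_reach * costs[i]    # pay for test i only if reached
        p_reach *= 1.0 - p_stop[i]     # probability of continuing past it
    return total

def greedy_order(costs, p_stop):
    """Classical ratio rule for the unconstrained independent case:
    sort by cost / termination probability."""
    return sorted(range(len(costs)), key=lambda i: costs[i] / p_stop[i])

costs, p_stop = [2.0, 1.0, 5.0], [0.5, 0.1, 0.8]
print(expected_cost(greedy_order(costs, p_stop), costs, p_stop))
```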
Neural Networks, May 1, 2016
The holographic conceptual approach to cognitive processes in the human brain suggests that, in some parts of the brain, each part of the memory (a neuron or a group of neurons) contains some information regarding the entire data. In [11, 12] we demonstrated how to encode data in a holographic manner using the Walsh-Hadamard transform. The encoding is performed on randomized information, which is then represented by a set of Walsh-Hadamard coefficients. These coefficients turn out to have holographic properties: any portion of the set of coefficients defines a "blurry image" of the original data. In this work, we describe a built-in error correction technique: enlarging the width of the matrix used in the Walsh-Hadamard transform to produce a rectangular Hadamard matrix. By adding this redundancy, the data can bear more errors, resulting in a system that is not affected by missing coefficients up to a certain threshold. Above this threshold, the loss of data is reflected by a "blurry image" rather than by concentrated damage. We provide a heuristic analysis of the ability of the technique to correct errors, as well as an example of an image saved using the system. Finally, we give an example of a simple implementation of our approach using neural networks as a proof of concept. (Partially supported by the Rita Altura Trust Chair in Computer Sciences, the Lynne and William Frankel Center for Computer Sciences, and the Israel Science Foundation, grant number 428/11.)
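A minimal sketch of the holographic property in the square-matrix setting of [11, 12] (the rectangular error-correcting extension is not shown; the parameters here are illustrative): zeroing out a portion of the Walsh-Hadamard coefficients before inverting degrades the whole signal mildly instead of destroying part of it.

```python
import numpy as np
from scipy.linalg import hadamard

n = 64
H = hadamard(n)                    # n x n +/-1 Hadamard matrix (n a power of 2)
data = np.random.randn(n)          # stand-in for the randomized information
coeffs = H @ data                  # Walsh-Hadamard coefficients

# Lose a fraction of the coefficients and reconstruct anyway.
kept = np.random.rand(n) > 0.25    # ~25% of coefficients missing
blurry = (H.T @ (coeffs * kept)) / n   # inverse transform (H^T H = n I)

err = np.linalg.norm(blurry - data) / np.linalg.norm(data)
print(f"relative error with ~25% of coefficients lost: {err:.2f}")
```

The error is spread evenly over the whole vector, which is precisely the "blurry image" behavior the abstract describes.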
Mathematics of Computation, May 1, 2006
Let N*(m) be the minimal length of a polynomial with ±1 coefficients divisible by (x − 1)^m. Byrnes noted that N*(m) ≤ 2^m for each m, and asked whether in fact N*(m) = 2^m. Boyd showed that N*(m) = 2^m for all m ≤ 5, but N*(6) = 48. He further showed that N*(7) = 96, and that N*(8) is one of the 5 numbers 96, 144, 160, 176, or 192. Here we prove that N*(8) = 144. Similarly, let m*(N) be the maximal power of (x − 1) dividing some polynomial of degree N − 1 with ±1 coefficients. Boyd was able to find m*(N) for N < 88. In this paper we determine m*(N) for N < 168.
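The divisibility condition is elementary to test: (x − 1)^m divides the polynomial with coefficients a_i exactly when the moments over i of a_i i^k vanish for k = 0, ..., m − 1. A brute-force sketch (nowhere near the techniques needed for N*(8), but enough to verify the smallest cases):

```python
from itertools import product

def divisible_by_power(coeffs, m):
    """True iff sum_i a_i x^i is divisible by (x-1)^m, i.e. all 'moments'
    sum_i a_i * i^k vanish for k = 0, ..., m-1."""
    return all(sum(a * (i ** k) for i, a in enumerate(coeffs)) == 0
               for k in range(m))

def n_star(m):
    """Brute-force N*(m): the smallest length of a +/-1 polynomial divisible
    by (x-1)^m.  Exponential in the length -- feasible only for tiny m."""
    length = 1
    while True:
        if any(divisible_by_power(c, m)
               for c in product((1, -1), repeat=length)):
            return length
        length += 1

print(n_star(2))   # expected: 4  (e.g. 1 - x - x^2 + x^3 = (x-1)^2 (x+1))
print(n_star(3))   # expected: 8
```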
Discrete Applied Mathematics, 2016
"Dominosa" is a puzzle game. Before the game begins, the ( n + 2 2 ) domino bones ( 0 ,... more "Dominosa" is a puzzle game. Before the game begins, the ( n + 2 2 ) domino bones ( 0 , 0 ) , ( 0 , 1 ) , ? , ( n , n ) are placed on an ( n + 1 ) × ( n + 2 ) grid (constituting a perfect tiling). The domino bones are then being hidden, leaving only the numbers revealed. The purpose of the game is to recover the original tiling of the domino bones.The game raises many questions, such as how various algorithms perform and how many solutions a random puzzle has. In this paper, we study some of these combinatorial, probabilistic and algorithmic questions.
Journal of Combinatorial Optimization, Sep 30, 2022
Statistics & Probability Letters, Jun 1, 2012
A new field equation is proposed, associated with an S³ × R¹ topology. We introduce a differential involutive mapping A which links any point of space σ to the antipodal region A(σ). According to this equation, the geometry of the manifold depends both on the energy-momentum tensor T and on the antipodal tensor A(T). Considering a time-independent metric with weak fields and small velocities, we derive the associated Poisson equation, which provides cluster-like structures interacting with halo-like antipodal structures. The second structure helps the confinement of the first. It is suggested that this model could explain the missing mass effect and the large-scale structure of the universe.
Discrete Optimization, Feb 1, 2023
arXiv (Cornell University), Feb 23, 2011
The generalized Tower of Hanoi problem with h ≥ 4 pegs is known to require a sub-exponentially fast growing number of moves in order to transfer a pile of n disks from one peg to another. In this paper we study the Path h variant, where the pegs are placed along a line, and disks can be moved from a peg to its nearest neighbor(s) only. Whereas in the simple variant there are h(h − 1)/2 possible bi-directional interconnections among pegs, here there are only h − 1 of them. Despite the significant reduction in the number of interconnections, the number of moves needed to transfer a pile of n disks between any two pegs also grows sub-exponentially as a function of n. We study these graphs, identify sets of mutually recursive tasks, and obtain a relatively tight upper bound for the number of moves, depending on h, n and the source and destination pegs.
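As a baseline for the Path variant, a sketch of the classical 3-peg case (pegs 0-1-2 in a line, adjacent moves only; this is not the paper's h-peg algorithm): moving n disks end to end takes 3^n − 1 moves, via the recursion T(n) = 3T(n − 1) + 2.

```python
def path3_moves(n, src=0, dst=2):
    """Moves to transfer n disks between the two end pegs of a 3-peg path
    (pegs 0-1-2 in a line; only adjacent moves allowed).  Each move is a
    (from_peg, to_peg) pair.  Move count: T(n) = 3 T(n-1) + 2 = 3^n - 1."""
    if n == 0:
        return []
    mid = 1
    return (path3_moves(n - 1, src, dst)      # clear the pile end-to-end
            + [(src, mid)]                    # largest disk one step
            + path3_moves(n - 1, dst, src)    # bring pile back over it
            + [(mid, dst)]                    # largest disk to the far end
            + path3_moves(n - 1, src, dst))   # pile end-to-end once more

assert len(path3_moves(3)) == 3 ** 3 - 1      # 26 moves for 3 disks
```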
Discrete Mathematics, Nov 1, 2018
We provide an upper bound on the number of n² × n² Sudoku squares, and explain intuitively why there is reason to believe that the bound is tight up to a multiplicative factor of a much smaller order of magnitude. A similar bound is established for Sudoku squares with rectangular regions.
Social Science Research Network, 2022
Springer eBooks, 2020
The goal of this paper is to present an efficient, realistic metric for evaluating cache scheduling algorithms in multiuser, multi-cache environments. In previous work, the request sequence was set deliberately by an opponent (offline optimal) algorithm in an extremely unrealistic way, leading to an unbounded competitive ratio and to extremely unreasonable and unrealistic cache management strategies. In this paper, we propose to analyze the performance of cache management in a typical scenario, i.e., we consider all possibilities with their (realistic) distribution. In other words, we analyze the average case, rather than the worst case, of scheduling scenarios. In addition, we present an online heuristic algorithm for cache scheduling that is efficient according to our novel average-case analysis. The algorithm is based on machine-learning concepts; it is flexible and easy to implement.
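A sketch of what average-case evaluation looks like in practice (the workload model and policy here are stand-ins, not the paper's): draw requests from a fixed distribution and measure a policy's hit rate, instead of scoring it against an adversarial sequence.

```python
import random
from collections import OrderedDict

def lru_hit_rate(n_items=1000, cache_size=50, n_requests=100_000, s=1.0):
    """Average-case evaluation sketch: draw requests from a Zipf-like
    distribution (a stand-in for a 'realistic' workload) and measure the
    hit rate of plain LRU."""
    weights = [1.0 / (i + 1) ** s for i in range(n_items)]
    cache, hits = OrderedDict(), 0
    for _ in range(n_requests):
        item = random.choices(range(n_items), weights=weights)[0]
        if item in cache:
            hits += 1
            cache.move_to_end(item)            # mark as most recently used
        else:
            cache[item] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)      # evict least recently used
    return hits / n_requests

print(f"LRU hit rate under a Zipf workload: {lru_hit_rate():.3f}")
```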
The study of the dichotomous choice model was initiated by Condorcet in 1785. A group of experts is required to select one of two alternatives, of which exactly one is regarded as correct. Naturally, the model can be applied to a wide variety of social as well as scientific areas. A fundamental notion in this research is that of a decision rule -- a rule which translates the individual opinions of the members into a group decision. A decision rule is optimal if it maximizes the probability that the group makes a correct choice. The goal of our research is to identify the optimal decision rule under partial information on the decision skills. Specifically, we assume we do not know the correctness probabilities of the experts; however, we know that they are independent random variables selected from some given distribution, and we know their internal ranking. For some distributions, we obtain explicit formulas for the correctness probability of the expert rule and of the simple majority rule, as functions of the group size and the distribution parameter. For the exponential distribution, we obtain the optimality probability of several families of decision rules, ranging from the expert rule to the majority rule.
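A Monte Carlo sketch of the comparison (with an illustrative competence model, not the distributions treated in the paper): draw expert competences at random, let each expert vote correctly with their own probability, and estimate the accuracy of the expert rule versus the simple majority rule.

```python
import random

def rule_accuracy(rule, n_experts=5, trials=50_000):
    """Estimate the probability that a decision rule picks the correct
    alternative.  Competences are drawn i.i.d. from U(1/2, 1) -- an
    illustrative assumption; only their ranking is treated as known, so a
    rule may use the vote order but not the probabilities themselves."""
    correct = 0
    for _ in range(trials):
        ps = sorted((random.uniform(0.5, 1.0) for _ in range(n_experts)),
                    reverse=True)                    # ranked best to worst
        votes = [random.random() < p for p in ps]    # True = voted correctly
        if rule(votes):
            correct += 1
    return correct / trials

expert_rule = lambda votes: votes[0]                 # follow the top-ranked expert
majority_rule = lambda votes: sum(votes) > len(votes) / 2
print(f"expert rule:   {rule_accuracy(expert_rule):.3f}")
print(f"majority rule: {rule_accuracy(majority_rule):.3f}")
```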
arXiv (Cornell University), Oct 11, 2012
A random variable is sampled from a discrete distribution. The missing mass is the probability of the set of points not observed in the sample. We sharpen and simplify McAllester and Ortiz's results (JMLR, 2003) bounding the probability of large deviations of the missing mass. Along the way, we refine and rigorously prove a fundamental inequality of Kearns and Saul (UAI, 1998).
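A quick empirical sketch of the quantity in question (not from the paper): simulate the missing mass repeatedly and observe how tightly it concentrates around its mean, which is exactly what the large-deviation bounds quantify.

```python
import random

def missing_mass(probs, sample_size):
    """Draw sample_size points from a discrete distribution and return the
    true missing mass: the total probability of the points never observed."""
    seen = set(random.choices(range(len(probs)), weights=probs, k=sample_size))
    return sum(p for x, p in enumerate(probs) if x not in seen)

# Concentration check on a geometric-ish distribution over 30 points.
probs = [2.0 ** -(i + 1) for i in range(30)]
total = sum(probs)
probs = [p / total for p in probs]                # normalize
samples = [missing_mass(probs, 100) for _ in range(1000)]
mean = sum(samples) / len(samples)
print(f"mean missing mass: {mean:.4f}, max deviation: "
      f"{max(abs(s - mean) for s in samples):.4f}")
```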
The Tower of Hanoi problem with h ≥ 4 pegs has long been known to require a sub-exponential number of moves in order to transfer a pile of n disks from one peg to another. In this paper we discuss the Path h variant, where the pegs are placed along a line, and disks can be moved from a peg to its nearest neighbor(s) only. Whereas in the simple variant there are h(h − 1)/2 bi-directional interconnections among pegs, here there are only h − 1 of them. Despite the significant reduction in the number of interconnections, the task of moving n disks between any two pegs is still shown to grow sub-exponentially as a function of the number of disks.
Journal of Number Theory, Sep 1, 2023