Carla da Costa Gomes | Universidade do Minho
Papers by Carla da Costa Gomes
The labor market in Portugal has shown a very positive evolution.
The European Central Bank plays a fundamental role in the development of the monetary policy of the European Union.
There has been considerable interest in the identification of structural properties of combinatorial problems that lead to efficient algorithms for solving them. Some of these properties are “easily” identifiable, while others are of interest because they capture key aspects of state-of-the-art constraint solvers. In particular, it was recently shown that the problem of identifying a strong Horn- or 2CNF-backdoor can be solved by exploiting equivalence with deletion backdoors, and is NP-complete. We prove that strong backdoor identification becomes harder than NP (unless NP=coNP) as soon as the inconsequential-sounding feature of empty clause detection (present in all modern SAT solvers) is added. More interestingly, in practice such a feature as well as polynomial-time constraint propagation mechanisms often lead to much smaller backdoor sets. In fact, despite the worst-case complexity results for strong backdoor detection, we show that Satz-Rand is remarkably good at finding small strong backdoors on a range of experimental domains. Our results suggest that structural notions explored for designing efficient algorithms for combinatorial problems should capture both statically and dynamically identifiable properties.
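To make the notion concrete: a set B of variables is a strong backdoor (with respect to a polynomial-time subsolver such as unit propagation) if every assignment to B leaves a simplified formula that the subsolver decides on its own. The sketch below checks that definition by brute force for a candidate set; it is an illustrative reconstruction, not the paper's algorithm or the Satz-Rand procedure, and the clause representation and helper names are my own.

```python
from itertools import product

def simplify(clauses, assignment):
    """Apply a partial assignment {var: bool} to a CNF given as a list of
    integer clauses (DIMACS style: positive literal = var, negative = -var).
    Returns the reduced clause list, or None if some clause is falsified."""
    out = []
    for clause in clauses:
        reduced, satisfied = [], False
        for lit in clause:
            var = abs(lit)
            if var in assignment:
                if (lit > 0) == assignment[var]:
                    satisfied = True
                    break
            else:
                reduced.append(lit)
        if satisfied:
            continue
        if not reduced:                    # empty clause: falsified
            return None
        out.append(reduced)
    return out

def unit_propagate_decides(clauses):
    """Subsolver: True iff unit propagation alone decides the formula,
    i.e. it derives a contradiction or satisfies every clause."""
    clauses = [list(c) for c in clauses]
    while True:
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            return len(clauses) == 0       # decided only if nothing is left open
        lit = units[0]
        clauses = simplify(clauses, {abs(lit): lit > 0})
        if clauses is None:                # contradiction derived: decided (UNSAT)
            return True

def is_strong_backdoor(clauses, backdoor_vars):
    """Candidate set is a strong backdoor iff every full assignment to it
    leaves a formula that the subsolver decides on its own."""
    for bits in product([False, True], repeat=len(backdoor_vars)):
        reduced = simplify(clauses, dict(zip(backdoor_vars, bits)))
        if reduced is None:                # already falsified: decided (UNSAT)
            continue
        if not unit_propagate_decides(reduced):
            return False
    return True

# Tiny example: for F = (x1 v x2) & (-x1 v x3), the set {x1} is a strong
# backdoor under unit propagation.
print(is_strong_backdoor([[1, 2], [-1, 3]], [1]))   # True
```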
This paper complements our earlier work, in which we formally introduced the notion of a backdoor and characterized the complexity of several algorithms designed to take advantage of backdoors, including restart strategies. In this paper, we focus on the connection between heavy tails and backdoors.
We describe research and results centering on the construction and use of Bayesian models that can predict the run time of problem solvers. Our efforts are motivated by observations of high variance in the time required to solve instances for several challenging problems. The methods have application to the decision-theoretic control of hard search and reasoning algorithms. We illustrate the approach with a focus on the task of predicting run time for general and domain-specific solvers on a hard class of structured constraint satisfaction problems. We review the use of learned models to predict the ultimate length of a trial, based on observing the behavior of the search algorithm during an early phase of a problem session. Finally, we discuss how we can employ the models to inform dynamic run-time decisions.
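As a rough illustration of the idea, not the paper's actual Bayesian-network models or feature set: train a simple probabilistic classifier on features observed early in a run and use its prediction to decide whether to keep or abandon the run. The features and training data below are synthetic placeholders.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Each row: features observed during an early phase of a run, e.g.
# (number of backtracks, average depth, propagations per node).
# These particular features and values are illustrative, not the paper's.
X_train = np.array([
    [12,  8.5, 40.0],
    [ 3,  4.0, 15.0],
    [55, 19.0, 90.0],
    [70, 22.5, 95.0],
    [ 5,  5.5, 20.0],
    [48, 17.0, 85.0],
])
# Label: 1 if the run eventually turned out "long" (say, above the median
# runtime), 0 otherwise.
y_train = np.array([0, 0, 1, 1, 0, 1])

model = GaussianNB().fit(X_train, y_train)

# Decision-theoretic use: if a fresh run looks like a long one after the
# observation phase, it may be cheaper in expectation to restart it.
early_features = np.array([[60, 20.0, 92.0]])
p_long = model.predict_proba(early_features)[0, 1]
print("restart" if p_long > 0.5 else "continue", f"(P(long) = {p_long:.2f})")
```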
Recent progress on search and reasoning procedures has been driven by experimentation on computationally hard problem instances. Hard random problem distributions are an important source of such instances. Challenge problems from the area of finite algebra have also ...
Constraints - An International Journal, 2005
Much progress has been made in terms of boosting the effectiveness of backtrack-style search methods. In addition, during the last decade, a much better understanding of problem hardness, typical-case complexity, and backtrack search behavior has been obtained. One example of a recent insight into backtrack search concerns so-called heavy-tailed behavior in randomized versions of backtrack search. Such heavy tails explain the large variance in runtime often observed in practice. However, heavy-tailed behavior certainly does not occur on all instances. This has led to a need for a more precise characterization of when heavy-tailedness does and does not occur in backtrack search. In this paper, we provide such a characterization. We identify different statistical regimes of the tail of the runtime distributions of randomized backtrack search methods and show how they are correlated with the “sophistication” of the search procedure combined with the inherent hardness of the instances. We also show that the runtime distribution regime is highly correlated with the distribution of the depth of inconsistent subtrees discovered during the search. In particular, we show that an exponential distribution of the depth of inconsistent subtrees combined with a search space that grows exponentially with the depth of the inconsistent subtrees implies heavy-tailed behavior.
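The closing implication can be sketched in a few lines (the symbols λ and b are illustrative, not the paper's notation):

```latex
% If the depth D of an inconsistent subtree is (roughly) exponentially
% distributed and the cost C of refuting it grows exponentially in D,
\[
  \Pr[D \ge d] \approx e^{-\lambda d}, \qquad C \approx b^{D} \quad (b > 1),
\]
% then the induced cost distribution has a Pareto-like (heavy) tail:
\[
  \Pr[C \ge c] = \Pr\!\left[D \ge \log_b c\right]
              \approx e^{-\lambda \log_b c}
              = c^{-\lambda/\ln b},
\]
% i.e. a polynomially decaying tail, which is heavy whenever the exponent
% \lambda/\ln b is small.
```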
Unpredictability in the running time of complete search procedures can often be explained by the phenomenon of "heavy-tailed cost distributions", meaning that at any time during the experiment there is a non-negligible probability of hitting a problem that requires exponentially more time to solve than any that has been encountered before. We present a general method for introducing controlled randomization into complete search algorithms. The "boosted" search methods provably eliminate heavy tails to the right of the median. Furthermore, they can take advantage of heavy tails to the left of the median (that is, a non-negligible chance of very short runs) to dramatically shorten the solution time. We demonstrate speedups of several orders of magnitude for state-of-the-art complete search procedures running on hard, real-world problems.
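A minimal sketch of the restart idea behind the "boosted" methods, assuming a hypothetical randomized solver routine; the toy success model below stands in for a real backtrack search:

```python
import random

def randomized_solve(instance, cutoff, rng):
    """Hypothetical stand-in for one randomized run of a complete solver:
    returns a solution if found within `cutoff` steps, else None.
    (Here: a toy search that succeeds with the instance's own probability.)"""
    for step in range(cutoff):
        if rng.random() < instance["p_success_per_step"]:
            return {"solved_at_step": step}
    return None

def solve_with_restarts(instance, cutoff, max_restarts=10_000, seed=0):
    """Rapid randomized restarts: rerun the solver with fresh randomness
    whenever it exceeds the cutoff.  This exploits short runs on the left of
    the runtime distribution and removes the heavy right tail."""
    rng = random.Random(seed)
    for restart in range(max_restarts):
        result = randomized_solve(instance, cutoff, rng)
        if result is not None:
            return restart, result
    return None

print(solve_with_restarts({"p_success_per_step": 0.001}, cutoff=200))
```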
Artificial Intelligence, 2001
We introduce SensorDCSP, a naturally distributed benchmark based on a real-world application that arises in the context of networked distributed systems. In order to study the performance of Distributed CSP (DisCSP) algorithms in a truly distributed setting, we use a discrete-event network simulator, which allows us to model the impact of different network traffic conditions on the performance of the algorithms. We consider two complete DisCSP algorithms: asynchronous backtracking (ABT) and asynchronous weak-commitment search (AWC). In our study of different network traffic distributions, we found that random delays, in some cases combined with a dynamic decentralized restart strategy, can improve the performance of DisCSP algorithms. More interestingly, we also found that the active introduction of message delays by agents can improve performance and robustness while reducing the overall network load. Finally, our work confirms that AWC performs better than ABT on satisfiable instances. On unsatisfiable instances, however, the performance of AWC is considerably worse than ABT.
We introduce a new technique for counting models of Boolean satisfiability problems. Our approach incorporates information obtained from sampling the solution space. Unlike previous approaches, our method does not require uniform or near-uniform samples. It instead converts local search sampling without any guarantees into very good bounds on the model count with guarantees. We give a formal analysis and provide experimental results showing the effectiveness of our approach.
We propose a new technique for sampling the solutions of combinatorial problems in a near-uniform manner. We focus on problems specified as a Boolean formula, i.e., on SAT instances. Sampling for SAT problems has been shown to have interesting connections with probabilistic reasoning, making practical sampling algorithms for SAT highly desirable. The best current approaches are based on Markov Chain Monte Carlo methods, which have some practical limitations. Our approach exploits combinatorial properties of random parity (XOR) constraints to prune away solutions near-uniformly. The final sample is identified amongst the remaining ones using a state-of-the-art SAT solver. The resulting sampling distribution is provably arbitrarily close to uniform. Our experiments show that our technique achieves a significantly better sampling quality than the best alternative.
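A sketch of the constraint-generation half of the approach: each random parity constraint, drawn over a random half of the variables with a random right-hand side, eliminates roughly half of the solutions. The SAT-solver step that extracts a surviving model is only indicated in comments; the variable indexing and helper names are assumptions of this sketch, not the paper's code.

```python
import random

def random_xor_constraints(num_vars, s, rng=None):
    """Generate s random parity constraints over variables 1..num_vars.
    Each variable is included independently with probability 1/2 and the
    required parity (right-hand side) is a fair coin flip, so each non-empty
    constraint is satisfied by exactly half of all assignments."""
    rng = rng or random.Random(0)
    constraints = []
    for _ in range(s):
        vars_in = []
        while not vars_in:                 # redraw until the subset is non-empty
            vars_in = [v for v in range(1, num_vars + 1) if rng.random() < 0.5]
        parity = rng.randint(0, 1)         # XOR of vars_in must equal parity
        constraints.append((vars_in, parity))
    return constraints

def satisfies(assignment, constraints):
    """Check an assignment (dict var -> 0/1) against the XOR constraints."""
    return all(sum(assignment[v] for v in vs) % 2 == p for vs, p in constraints)

# Sketch of the overall scheme (the solver call itself is not shown):
#   1. pick s so that roughly one solution of F survives the s constraints,
#   2. conjoin F with the XORs (encoded in CNF) and ask a SAT solver for a
#      model -- the surviving model is a near-uniform sample of F's solutions.
print(random_xor_constraints(num_vars=6, s=3))
```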
Combinatorial search methods often exhibit a large variability in performance. We study the cost profiles of combinatorial search procedures. Our study reveals some intriguing properties of such cost profiles. The distributions are often characterized by very long tails or “heavy tails”. We will show that these distributions are best characterized by a general class of distributions that have no moments (i.e., an infinite mean, variance, etc.). Such non-standard distributions have recently been observed in areas as diverse as economics, statistical physics, and geophysics. They are closely related to fractal phenomena, whose study was introduced by Mandelbrot. We believe this is the first finding of these distributions in a purely computational setting. We also show how random restarts can effectively eliminate heavy-tailed behavior, thereby dramatically improving the overall performance of a search procedure.
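For reference, the sense in which such distributions "have no moments" (illustrative notation, not the paper's):

```latex
% A Pareto-type tail
\[
  \Pr[X > x] \sim C\,x^{-\alpha}, \qquad x \to \infty,
\]
% has E[X^k] = \int_0^\infty k\,x^{k-1}\,\Pr[X > x]\,dx finite only for
% k < \alpha, so \alpha \le 1 already rules out a finite mean and
% \alpha \le 2 a finite variance -- the regime reported for these
% runtime distributions.
```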
New methods to generate hard random problem instances have driven progress on algorithms for deduction and constraint satisfaction. Recently Achlioptas et al. (AAAI 2000) introduced a new generator based on Latin squares that creates only satisfiable problems, and so can be used to accurately test incomplete (one-sided) solvers. We investigate how this and other generators are biased away from the uniform distribution of satisfiable problems and show how they can be improved by imposing a balance condition. More generally, we show that the generator is one member of a family of related models that generate distributions ranging from ones that are everywhere tractable to ones that exhibit a sharp hardness threshold. We also discuss the critical role of the problem encoding in the performance of both systematic and local search solvers.
We describe theoretical results and an empirical study of context-sensitive restart policies for randomized search procedures. The methods generalize previous results on optimal restart policies by exploiting dynamically updated beliefs about the probability distribution for run time. Rather than assuming complete knowledge or zero knowledge about the run-time distribution, we formulate restart policies that consider real-time observations about properties of instances and the solver's activity. We describe background work on the application of Bayesian methods to build predictive models for run time, introduce an optimal policy for dynamic restarts that considers predictions about run time, and perform a comparative study of traditional fixed versus dynamic restart policies.
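A schematic of how a dynamic policy differs from a fixed cutoff, with the solver and the learned predictor abstracted behind placeholder callables (none of these names come from the paper):

```python
def run_with_dynamic_restarts(start_run, observe, predict_long, step,
                              observation_horizon, max_steps):
    """Sketch of a dynamic restart policy.

    start_run()      -- begin a fresh randomized run, returning its state
    observe(state)   -- summarize the run's behaviour so far (features)
    predict_long(f)  -- learned model: True if this run is predicted to be long
    step(state)      -- advance one step; returns a solution or None

    All four callables are placeholders for whatever solver and model are used.
    """
    state = start_run()
    steps_this_run = 0
    for _ in range(max_steps):
        solution = step(state)
        if solution is not None:
            return solution
        steps_this_run += 1
        # After the observation horizon, consult the model once: restart runs
        # that look long, keep runs that look short.  A fixed-cutoff policy
        # would instead restart unconditionally at a preset step count.
        if steps_this_run == observation_horizon and predict_long(observe(state)):
            state = start_run()
            steps_this_run = 0
    return None
```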
Model counting is the classical problem of computing the number of solutions of a given propositional formula. It vastly generalizes the NP-complete problem of propositional satisfiability, and hence is both highly useful and extremely expensive to solve in practice. We present a new approach to model counting that is based on adding a carefully chosen number of so-called streamlining constraints to the input formula in order to cut down the size of its solution space in a controlled manner. Each of the additional constraints is a randomly chosen XOR or parity constraint on the problem variables, represented either directly or in the standard CNF form. Inspired by a related yet quite different theoretical study of the properties of XOR constraints, we provide a formal proof that, with high probability, the number of XOR constraints added in order to bring the formula to the boundary of being unsatisfiable determines its model count with high precision. Experimentally, we demonstrate that this approach can be used to obtain good bounds on the model counts for formulas that are far beyond the reach of exact counting methods. In fact, we obtain the first non-trivial solution counts for very hard, highly structured combinatorial problem instances. Note that unlike other counting techniques, such as Markov Chain Monte Carlo methods, we are able to provide high-confidence guarantees on the quality of the counts obtained.
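The arithmetic behind the boundary claim, in illustrative notation:

```latex
% Let S be the set of models of the input formula.  Each random parity
% constraint is satisfied by (about) half of all assignments, independently
% of the others, so after adding s of them
\[
  \mathbb{E}\bigl[\#\text{surviving models}\bigr] = \frac{|S|}{2^{s}} .
\]
% The streamlined formula stops being satisfiable roughly when this drops
% below 1, i.e. at s^{*} \approx \log_2 |S|, which is how the boundary value
% of s yields the estimate |S| \approx 2^{s^{*}}.
```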
We study the runtime profiles of complete backtrack-style search methods applied to hard scheduling problems. Such search methods often exhibit a large variability in performance due to the non-standard nature of their underlying cost distributions. The distributions generally ...
In recent years we have seen significant progress in the area of Boolean satisfiability (SAT) solving and its applications. As a new challenge, the community is now moving to investigate whether similar advances can be made in the use of Quantified Boolean Formulas (QBF). QBF provides a natural framework for capturing problem solving and planning in multi-agent settings. However, in contrast to single-agent planning, which can be effectively formulated as SAT, we show that a QBF approach to planning in a multi-agent setting leads to significant unexpected computational difficulties. We identify as a key difficulty of the QBF approach the fact that QBF solvers often end up exploring a much larger search space than the natural search space of the original problem. This is in contrast to the experience with SAT approaches. We also show how one can alleviate these problems by introducing two special QBF formulations and a new QBF solution strategy. We present experiments that show the effectiveness of our approach in terms of a significant improvement in performance compared to earlier work in this area. Our work also provides a general methodology for formulating adversarial scenarios in QBF.