Facundo Bromberg | Universidad Tecnologica Nacional
Papers by Facundo Bromberg
cs.iastate.edu
In this work we introduce the problem of multiagent data mining (MADM). To that end, we first define a (non-social) learning agent, which recasts the definition of an inductive learner [17] in agent jargon. Then, we define a social learning agent, which extends a learning ...
This work focuses on learning the structure of Markov networks. Markov networks are parametric models for compactly representing complex probability distributions. These models are composed of a structure and numerical weights. The structure describes independences that hold in the distribution. Depending on the user's learning goal, structure learning algorithms can be divided into density estimation algorithms, which learn structures for answering inference queries, and knowledge discovery algorithms, which learn structures for describing independences qualitatively. The latter algorithms are limited in how they describe independences because they use a single graph, a coarse-grained representation of the structure. However, many practical distributions exhibit a flexible type of independence called context-specific independence, which cannot be described by a single graph. This work overcomes this limitation by proposing an alternative representation of the structure, which we name the canonical model, and a novel knowledge discovery algorithm called CSPC, which learns canonical models by using the context-specific independences present in the data as constraints. In an extensive empirical evaluation, CSPC learns more accurate structures than state-of-the-art density estimation and knowledge discovery algorithms. Moreover, for answering inference queries, our approach obtains results competitive with density estimation algorithms and significantly outperforms knowledge discovery algorithms.
We address the problem of the reliability of independence-based causal discovery algorithms, which stems from unreliable statistical independence tests. We model the problem as a knowledge base containing a set of independences that are related through Pearl's well-known axioms. Statistical tests on finite data sets may produce erroneous outcomes, and therefore inconsistencies in the knowledge base. Our approach uses an instance of the class of defeasible logics called argumentation, augmented with a preference function, to reason about and possibly correct errors in these tests, thereby resolving the corresponding inconsistencies. This results in a more robust conditional independence test, called the argumentative independence test. We evaluated our approach on data sets sampled from randomly generated causal models as well as on real-world data sets. Our experiments show a clear advantage of argumentative over purely statistical tests, with improvements in accuracy of up to 17%, measure...
Abstract. The Pierre Auger Observatory is dedicated to the detection of ultra-high-energy cosmic rays, subatomic particles that bombard the Earth from space, through the use of several detectors and starting from the cascades of secondary particles ...
Trunk diameter is a variable of agricultural interest, used mainly to predict fruit tree production. It is correlated with the leaf area and biomass of trees, and consequently gives a good estimate of the potential production of the plants. This work presents a low-cost, high-precision method for autonomous measurement of the trunk diameter of fruit trees based on Computer Vision. Autonomous methods based on Computer Vision or other techniques have been introduced in the literature because they considerably simplify the measurement process, requiring little to no human decision making. This offers several advantages for crop management: the method can be operated by untrained personnel, with lower operational costs; it lowers the stress on trained personnel, avoiding the deterioration of measurement quality over time; and it allows the measurement process to be embedded in larger autonomous systems, so that more measurements can be taken at equivalent cost. On a more personal note, the present work is also a successful proof of concept, for our laboratories and regional research institutions, in favor of autonomous measurements based on Computer Vision, opening the door to further investigation of other important agronomic variables measurable by Computer Vision. To date, all existing autonomous methods either have low precision or are prohibitively expensive for mass agricultural adoption, leaving the manual Vernier caliper or tape measure as the only choice in most situations. In this work we present an autonomous solution that is cost-effective for mass adoption and whose precision is competitive with (and slightly better than) that of the caliper method.
Markov random fields provide a compact representation of joint probability distributions by representing their independence properties in an undirected graph. The well-known Hammersley-Clifford theorem uses these conditional independences to factorize a Gibbs distribution into a set of factors. However, an important issue with using a graph to represent independences is that it cannot encode some types of independence relations, such as context-specific independences (CSIs). A CSI is a particular case of conditional independence that holds only for a certain assignment of its conditioning set, in contrast to a conditional independence, which must hold for all assignments. This work presents a method for factorizing a Markov random field according to the CSIs present in a distribution, and formally guarantees that this factorization is correct. This is presented in our main contribution, the context-specific Hammersley-Clifford theorem, which generalizes to CSIs the Hammersley-Clifford theorem that applies to conditional independences.
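To make the distinction between a conditional independence and a CSI concrete, the following minimal sketch checks whether two binary variables X and Y are independent within a single context Z=z. The toy joint distribution and the function names are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import product

def conditional_table(joint, z):
    """Return P(x, y | Z=z) as a dict keyed by (x, y)."""
    pz = sum(p for (_, _, zz), p in joint.items() if zz == z)
    return {(x, y): joint[(x, y, z)] / pz for x, y in product((0, 1), repeat=2)}

def independent_in_context(joint, z, tol=1e-9):
    """True if P(x, y | z) = P(x | z) * P(y | z) for every x, y."""
    cond = conditional_table(joint, z)
    px = {x: cond[(x, 0)] + cond[(x, 1)] for x in (0, 1)}
    py = {y: cond[(0, y)] + cond[(1, y)] for y in (0, 1)}
    return all(
        abs(cond[(x, y)] - px[x] * py[y]) <= tol
        for x, y in product((0, 1), repeat=2)
    )

# Hypothetical joint P(x, y, z) over binary variables, with P(Z=0) = P(Z=1) = 0.5.
joint = {}
for x, y in product((0, 1), repeat=2):
    joint[(x, y, 0)] = 0.5 * 0.25  # context Z=0: X and Y uniform and independent
joint[(0, 0, 1)] = 0.5 * 0.4       # context Z=1: X and Y tend to agree
joint[(1, 1, 1)] = 0.5 * 0.4
joint[(0, 1, 1)] = 0.5 * 0.1
joint[(1, 0, 1)] = 0.5 * 0.1

csi_holds_z0 = independent_in_context(joint, 0)
csi_holds_z1 = independent_in_context(joint, 1)
```

In this toy distribution the independence holds in the context Z=0 but not in Z=1, so X and Y are not conditionally independent given Z, and a single undirected graph would have to keep the X-Y edge.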
Lecture Notes in Computer Science, 2014
Markov networks are models for compactly representing complex probability distributions. They are composed of a structure and a set of numerical weights. The structure qualitatively describes independences in the distribution, which can be exploited to factorize the distribution into a set of compact functions. A key application of learning structures from data is automatic knowledge discovery. In practice, structure learning algorithms focused on "knowledge discovery" have a limitation: they use a coarse-grained representation of the structure. As a result, this representation cannot describe context-specific independences. Very recently, an algorithm called CSPC was designed to overcome this limitation, but it has a high computational complexity. This work tries to mitigate this downside by presenting CSGS, an algorithm that uses the Grow-Shrink strategy to reduce unnecessary computations. In an empirical evaluation, the structures learned by CSGS achieve competitive accuracy at a lower computational cost than those obtained by CSPC.
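As background on the Grow-Shrink strategy mentioned above, here is a minimal sketch of the generic Grow-Shrink Markov blanket search. It is not CSGS itself, whose details the abstract does not give. The independence test is assumed to be supplied by the caller; here it is a hypothetical oracle that answers by graph separation in a known undirected chain.

```python
def grow_shrink(target, variables, independent):
    """Generic Grow-Shrink search for the Markov blanket of `target`.

    `independent(x, y, given)` is a caller-supplied conditional
    independence test returning True or False.
    """
    blanket = []
    # Grow phase: add any variable that is dependent on the target
    # given the current blanket, until nothing more can be added.
    changed = True
    while changed:
        changed = False
        for v in variables:
            if v != target and v not in blanket and not independent(target, v, blanket):
                blanket.append(v)
                changed = True
    # Shrink phase: drop variables that became independent of the
    # target given the rest of the blanket.
    for v in list(blanket):
        rest = [u for u in blanket if u != v]
        if independent(target, v, rest):
            blanket.remove(v)
    return set(blanket)

def make_separation_oracle(edges):
    """Hypothetical test: x and y are independent given a set S
    iff S blocks every path between them in the undirected graph."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    def independent(x, y, given):
        blocked = set(given)
        frontier, seen = [x], {x}
        while frontier:
            node = frontier.pop()
            for nxt in neighbors.get(node, ()):
                if nxt == y:
                    return False
                if nxt not in seen and nxt not in blocked:
                    seen.add(nxt)
                    frontier.append(nxt)
        return True

    return independent

# Chain A - B - C - D; D is visited first, so the grow phase adds it by mistake.
oracle = make_separation_oracle([("A", "B"), ("B", "C"), ("C", "D")])
blanket_of_b = grow_shrink("B", ["D", "A", "C", "B"], oracle)
```

With this visiting order the grow phase first admits D, since nothing yet separates it from B, and the shrink phase later removes it, leaving the neighbors A and C as the blanket.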
International Journal on Artificial Intelligence Tools, 2014
2013 IEEE 25th International Conference on Tools with Artificial Intelligence, 2013
Learning the structure of a Markov network from data is a problem that has received considerable attention in machine learning and in many application fields. This work focuses on a particular approach to this problem called independence-based learning. This approach guarantees that the correct structure is learned efficiently whenever the data is sufficient to represent the underlying distribution. However, an important issue with this approach is that the learned structures are encoded in an undirected graph. The problem with graphs is that they cannot encode some types of independence relations, such as context-specific independences (CSIs). A CSI is a particular case of conditional independence that holds only for a certain assignment of its conditioning set, in contrast to a conditional independence, which must hold for all assignments. In this work we present CSPC, an independence-based algorithm for learning structures that encode context-specific independences, encoding them in a log-linear model instead of a graph. The central idea of CSPC is to combine the theoretical guarantees provided by the independence-based approach with the benefits of representing complex structures by using features in a log-linear model. We present experiments on a synthetic case, showing that CSPC is more accurate than state-of-the-art independence-based algorithms when the underlying distribution contains CSIs.
2013 Fourth Argentine Symposium and Conference on Embedded Systems (SASE/CASE), 2013
One of the most important steps before installing a Wireless Sensor Network (WSN) is a prior study of the connectivity constraints that exist in the area to be covered. This study is critical to the final distribution of the sensors: it has an important impact on the lifetime of the network, by reducing power consumption, and on its robustness, by providing for redundancy of paths and sensors. In this paper, we summarize the most important aspects of a preliminary empirical study of the Link Quality Indicator (LQI) on different landscapes in the glacier area of Patagonia, Argentina. The landscapes covered varied in geographical structure, with different levels of attenuation and extreme environmental conditions. Through the analysis of the Cumulative Distribution Function (CDF) of the measured LQI values, we characterize the behavior of four different scenarios and correlate the combined effects of the environmental structure with the distance from the transmitter. The measurements were designed to characterize the links at the physical layer, with the purpose of defining models to estimate the Packet Error Rate (PER) for the WSN deployment stage.
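The CDF analysis described above can be illustrated with a minimal sketch of an empirical CDF. The LQI readings, the scenario names, and the threshold are hypothetical values chosen for illustration; they are not measurements from the study.

```python
from bisect import bisect_right

def empirical_cdf(samples):
    """Return a function F where F(v) is the fraction of samples <= v."""
    ordered = sorted(samples)
    n = len(ordered)

    def cdf(value):
        # bisect_right counts how many ordered samples are <= value.
        return bisect_right(ordered, value) / n

    return cdf

# Hypothetical LQI readings for two scenarios (not data from the paper).
lqi_near = [105, 107, 102, 108, 106, 104]  # short link, open terrain
lqi_far = [70, 85, 64, 92, 78, 81]         # longer link, obstructed terrain

cdf_near = empirical_cdf(lqi_near)
cdf_far = empirical_cdf(lqi_far)

threshold = 90  # hypothetical LQI level used to compare the scenarios
share_near_below = cdf_near(threshold)
share_far_below = cdf_far(threshold)
```

Comparing the two curves at a common threshold gives the fraction of readings at or below that level in each scenario, which is one simple way to contrast link quality across landscapes and distances.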