Mika Sato-Ilic - Academia.edu
Papers by Mika Sato-Ilic
This paper defines a generalized structural model of similarity between a pair of objects. We have previously discussed an additive fuzzy clustering model. The merits of additive fuzzy clustering models are that (1) identifying the model requires much less computation than for a hard clustering model and (2) a suitable fit is obtained with fewer clusters. This paper proposes a general class of clustering models in which aggregation operators define the degree of simultaneous belongingness of a pair of objects to a cluster. We discuss the conditions required of the aggregation operators; T-norms are concrete examples that satisfy these conditions. Moreover, the validity of the model is shown by investigating a characteristic of the model and through numerical applications.
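As a rough illustration of the idea in the abstract above, and not the paper's exact formulation, the following sketch computes a model similarity between two objects by aggregating their membership degrees in each cluster with a T-norm and then summing over clusters. The function names, the plain sum over clusters, and the example membership matrix are all assumptions made for illustration.

```python
import numpy as np

# Three standard T-norms, applied elementwise to membership degrees in [0, 1].
def t_min(a, b):
    return np.minimum(a, b)

def t_product(a, b):
    return a * b

def t_lukasiewicz(a, b):
    return np.maximum(a + b - 1.0, 0.0)

def model_similarity(U, t_norm):
    """Pairwise model similarity from a membership matrix.

    U has shape (n_objects, n_clusters); each row holds one object's
    degrees of belongingness. Summing the aggregated degrees over
    clusters is an assumption of this sketch.
    """
    # Broadcast to shape (n_objects, n_objects, n_clusters), then sum clusters.
    return t_norm(U[:, None, :], U[None, :, :]).sum(axis=2)

# Hypothetical memberships of three objects in two clusters.
U = np.array([[0.9, 0.1],
              [0.8, 0.2],
              [0.1, 0.9]])

S_product = model_similarity(U, t_product)
S_min = model_similarity(U, t_min)
S_luk = model_similarity(U, t_lukasiewicz)
```

Swapping the `t_norm` argument changes only how simultaneous belongingness is measured; the rest of the computation is unchanged, which is the sense in which the aggregation operator is a pluggable part of the model.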
We have proposed a weighted principal component analysis for interval-valued data, which is a hybrid method of fuzzy clustering and principal component analysis. However, this method requires an assumption about the relationship between the minimum and maximum values of the interval-valued data. If the assumption does not hold, the transformed matrix cannot show the exact situation of the interval-valued data. To avoid a wrong assumption, this paper proposes another weighted principal component analysis that uses the fuzzy clustering solutions of the minimum and maximum data under unique clusters. Because the clusters are unique, we can obtain two comparable sets of principal components for the minimum and maximum data.
Smart innovation, systems and technologies, 2020
Studies in Classification, Data Analysis, and Knowledge Organization, 1998
This paper presents a dynamic clustering model in which clusters are constructed in order to find the features of the dynamical change.
Studies in Computational Intelligence, 2007
This chapter presents an introduction to intelligent machines. It explains intelligent behavior and the difference between intelligent behavior and repetitive natural or programmed behavior. Some learning techniques from the field of Artificial Intelligence used in constructing intelligent machines are then discussed. In addition, applications of intelligent machines in a number of areas, including aerial navigation, ocean and space exploration, and humanoid robots, are presented.
This paper proposes a method to obtain the quantitative difference between fuzzy clustering results and to visualize it in a lower-dimensional space. Since the result of fuzzy clustering is given as the degree of belongingness of objects to exploratorily obtained clusters, the obtained clusters do not have an "explicit order". Therefore, if we obtain multiple results from multiple datasets individually, we cannot compare the results directly, since the "order" of clusters obtained for each dataset may differ. However, there is a need to compare the results. For example, if each dataset consists of objects and variables, such a dataset is observed for multiple subjects (or times), and we obtain a clustering result for each subject (or time), then we usually would like to know how the clustering results differ over the subjects (or times). The proposed method captures this difference based on the mathematical comparability of the clusters obtained over the different datasets corresponding to different subjects (or times), and visualizes the difference of the clustering results over the subjects (or times). Numerical examples are shown for an improved understanding of the proposed method.
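The "explicit order" problem described above can be seen in a small sketch: two membership matrices that differ only in the order of their cluster columns describe the same clustering, yet a naive elementwise comparison reports a large difference. The brute-force column alignment below is a generic workaround chosen for illustration, not the method proposed in the paper; the function name and the example data are assumptions, and the exhaustive search is only practical for a small number of clusters.

```python
from itertools import permutations

import numpy as np

def align_clusters(U_ref, U):
    """Reorder the columns of U to best match U_ref.

    Tries every permutation of cluster columns and keeps the one with
    the smallest Frobenius distance to U_ref. Illustrative only.
    """
    k = U_ref.shape[1]
    best_perm, best_cost = None, np.inf
    for perm in permutations(range(k)):
        cost = np.linalg.norm(U_ref - U[:, list(perm)])
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return U[:, list(best_perm)], best_perm, best_cost

# Hypothetical memberships of two objects in two clusters.
U_a = np.array([[0.9, 0.1],
                [0.2, 0.8]])
# The same clustering with the cluster labels swapped.
U_b = U_a[:, ::-1]

naive_difference = np.linalg.norm(U_a - U_b)
aligned, perm, aligned_difference = align_clusters(U_a, U_b)
```

Here `naive_difference` is large although the two results are equivalent, while `aligned_difference` is zero once the columns are matched.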
SCIS & ISIS 2006, 2006
This paper proposes methods to obtain the difference among subjects by using the degree of reliability of each subject, based on the results of fuzzy clustering and multidimensional scaling (MDS). In addition, new fuzzy clustering and MDS methods that include the weights of reliability scores are proposed to classify subjects. When we observe data consisting of values of objects with respect to variables, and such data are observed over multiple subjects, capturing the difference among subjects is important in many fields. In this paper, the degree of reliability is obtained through the optimality of convex clustering. Based on this idea, a numerical example shows that the same difference over the subjects can be obtained regardless of the difference in the obtained latent structures, which are the result of dynamic fuzzy clustering and the result of MDS. From this, we show the robustness of the proposed reliability with respect to the variety of the obtained latent structures of data.
Springer eBooks, 2021
In this paper, a classification method based on ensemble learning of deep learning and multidimensional scaling is proposed for the problem of discriminating large and complex data. The advantage of the proposed method is that it improves the accuracy of the discrimination results by removing, as noise, the latent structure of the data that has low explanatory power. This is done by transforming the original data into a space spanned by dimensions that explain the latent structure of the data. Using numerical examples, we demonstrate the effectiveness of the proposed method.
Smart innovation, systems and technologies, 2020
The classification of objects based on corresponding classes is an important task in official statistics. In a previous study, an overlapping classifier that assigns classes to an object based on a reliability score was proposed. The reliability score was defined considering both the uncertainty from the data and the uncertainty from the latent classification structure in the data, and was generalized using the idea of the T-norm in statistical metric space. This paper proposes a new procedure for improving the training dataset based on a pattern of reliability scores in order to obtain better classification accuracy. A numerical example shows that the proposed procedure gives a better result than that of our previous study.
Smart innovation, systems and technologies, May 31, 2018
Business demography statistics, which provide data on the numbers of births, deaths and survivals of enterprises and/or establishments in a specific period, serve as important information for policymakers deciding on policies to promote entrepreneurship, which is considered an essential instrument for improving competitiveness and generating economic growth and job opportunities. In many countries, these important statistics can be compiled from statistical business registers. In Japan, the statistical business register is being re-engineered not only by redesigning the system but also by improving the quality of data sources; however, it will take some years before business demography statistics can be compiled directly from the Japanese statistical business register. In this paper, an alternative method of estimating business demography statistics directly from the data of Economic Censuses is proposed. The model used here is based on previous works; however, an enhanced version of the model is proposed so that quantitative attributes of establishments/enterprises, such as the number of persons engaged, can be analyzed. Based on the enhanced model, a numerical example on job creation and destruction is given using the micro data of Economic Censuses, which reveals the importance of fostering entrepreneurship as well as opening establishments as policy measures for economic growth and job opportunities.
Smart innovation, systems and technologies, 2020
Constrained cluster analysis is a semi-supervised approach to clustering in which some additional information about the clusters is incorporated as constraints. For example, we sometimes need to consider a constraint of homogeneity among all obtained clusters. This paper presents an algorithm for constrained cluster analysis with homogeneity of clusters and shows a practical application of the algorithm in formulating survey blocks in official statistics such as the Economic Census, which reveals the effectiveness of the algorithm. In this application, travel distance is utilized considering the property of homogeneity of this clustering.
Advances in intelligent systems and computing, Nov 3, 2015
This paper proposes a fuzzy correlational direction multidimensional scaling based on fuzzy clustering-based correlation. We have previously proposed the fuzzy clustering-based correlation [6] and its application to multidimensional scaling [8] in order to obtain a more accurate result. In that method, we proposed a dissimilarity between a pair of objects weighted by the fuzzy clustering-based correlation and applied it to ordinary multidimensional scaling (MDS). However, that method could not take the direction of the correlation into account in the MDS. To solve this problem, this paper proposes a new dissimilarity that can include the difference in the direction of the correlation, and proposes a new MDS by applying this dissimilarity to ordinary MDS. We call this method fuzzy correlational direction multidimensional scaling. We first show the non-fuzzy version of the correlational direction multidimensional scaling and then the fuzzy version. Several numerical examples show the better performance of the proposed method.
Springer eBooks, Aug 11, 2008
Regression analysis is a well-known and widely used technique in multivariate data analysis, and its efficiency is extensively recognized. Recently, several proposed regression models have exploited the spatial classification structure of data. The purpose of including the spatial classification structure is to turn a heterogeneous data structure into a homogeneous structure in order to adjust the
Procedia Computer Science, 2014
With the advancement of information processing technology in recent years, larger and more complicated data have appeared, and a method to deal with this kind of data is required. Cluster analysis, or clustering, is one solution. Clustering methods handle two types of data: data that consist of objects and attributes, and data that consist of the similarities between objects. The latter, similarity data, is treated in this study. The purpose of clustering similarity data is to obtain a clustering result based on the similarity scaling among objects. However, when the data are complex, the given similarity data do not always have the structure of similarity scaling defined in the clustering method. Therefore, this paper proposes a fuzzy clustering method that enables us to obtain a clear classification for complex data by introducing the similarity data to the obtained clustering result and considering the relative structure for all the clusters. By considering the relative structure of the belongingness to clusters, more specific information about objects can be given, and the belongingness would be improved.
Springer eBooks, 2009
Fuzzy regression methods are proposed that consider the classification structure obtained as a result of fuzzy clustering with respect to each attribute. The fuzzy clustering is based on dissimilarities over objects in a subspace of the objects' space. Exploiting the degree of belongingness of objects to clusters with respect to attributes, we define two fuzzy regression methods in order to estimate the fuzzy cluster loadings and weighted regression coefficients. Numerical examples show the applicability of our proposed methods.
Studies in classification, data analysis, and knowledge organization, 2020