Chung-Chih Li - Academia.edu
Papers by Chung-Chih Li
Journal of Computing Sciences in Colleges, Apr 1, 2005
Theoretically, using any Linear Congruential Generator (LCG) to generate pseudorandom numbers for cryptographic purposes is problematic because of its predictability. On the other hand, because of its simplicity and efficiency, we think that the LCG should not be completely ignored. Since the random numbers generated by the LCG are predictable, it is clear that we cannot use them directly. However, we should not complicate the implementation so much that it compromises the simplicity and efficiency that motivate choosing the LCG in the first place. Thus, we propose an easy encryption method using an LCG for email encryption. To see how practical it is to predict random numbers produced by an LCG, we implement Plumstead's inference algorithm [2] and run it on some numbers generated by the simplest congruence: $X_{n+1} = (aX_n + b) \bmod m$. Based on the result, we confirm the theoretical weakness of the LCG, that is, simply increasing the size of the modulus does not significantly increase the difficulty of breaking the sequence. Our remedy is to break a whole random number into pieces and use them separately (with interference from another source, in our case, English text). We use 16-byte random numbers and embed each byte of the random number as noise in one text character. In this way, we avoid revealing enough numbers for the attacker to predict the sequence.
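The predictability concern can be illustrated with a minimal sketch. It is not Plumstead's inference algorithm, which also infers an unknown modulus; it covers only the much simpler case where the modulus m is known and three consecutive outputs are observed. The parameter values and function names are hypothetical, chosen only for illustration.

```python
def lcg(a, b, m, seed):
    """Yield X_{n+1} = (a * X_n + b) mod m, starting from the value after `seed`."""
    x = seed
    while True:
        x = (a * x + b) % m
        yield x


def recover_parameters(x0, x1, x2, m):
    """Recover (a, b) from three consecutive outputs when m is known.

    Simplifying assumption: (x1 - x0) is invertible modulo m, which always
    holds for a prime m as long as x1 != x0.
    """
    # x2 - x1 = a * (x1 - x0) (mod m), so divide by (x1 - x0) modulo m.
    a = (x2 - x1) * pow((x1 - x0) % m, -1, m) % m
    b = (x1 - a * x0) % m
    return a, b


# Hypothetical parameters, not taken from the paper.
A, B, M, SEED = 1103515245, 12345, 2**31 - 1, 2024

gen = lcg(A, B, M, SEED)
observed = [next(gen) for _ in range(3)]

a, b = recover_parameters(observed[0], observed[1], observed[2], M)
predicted = (a * observed[-1] + b) % M  # attacker's guess for the next output
actual = next(gen)                      # what the generator really produces
```

Once a and b are recovered, every later output follows from the last observed value, which is why the abstract avoids exposing whole random numbers to the attacker.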
J. Inf. Sci. Eng., 2016
This work proposes an improved artificial bee colony (ABC) algorithm, called the rank-based ABC algorithm, which includes a rank-based selection mechanism in the onlooker bees phase and a modified abandonment mechanism in the scout bees phase for solving unconstrained and constrained optimization problems. In the onlooker bees phase, the probability that an onlooker bee selects a food source is determined using a nonlinear selective pressure function, which is based on a ranking of fitness instead of proportional total fitness values. A nectar source with a superior fitness rank has a large probability of being selected by onlooker bees as a new solution and so yields a similar “best solution pool,” which often comprises the best and several good solutions; the exploitation capability of the basic ABC algorithm in searching for good solutions is therefore enhanced. Moreover, the modified abandonment mechanism is used in the scout bees phase to increase the exploration capability for se...
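Rank-based selection in the onlooker phase can be sketched as follows. The abstract does not state the nonlinear selective pressure function, so exponential ranking is used here purely as a stand-in; the function name and the `pressure` parameter are assumptions for illustration.

```python
def rank_based_probabilities(fitness, pressure=0.8):
    """Selection probabilities computed from fitness ranks (higher fitness is better).

    Exponential ranking is an assumed stand-in for the paper's nonlinear
    selective pressure function, which the abstract does not specify.
    """
    # Indices of food sources ordered from best to worst fitness.
    order = sorted(range(len(fitness)), key=lambda i: fitness[i], reverse=True)
    weights = [0.0] * len(fitness)
    for rank, i in enumerate(order):  # rank 0 is the best food source
        weights[i] = pressure ** rank
    total = sum(weights)
    return [w / total for w in weights]


fitness = [0.2, 0.9, 0.5, 0.1]
probs = rank_based_probabilities(fitness)

# Only the ordering matters: rescaled fitness values give the same probabilities.
probs_rescaled = rank_based_probabilities([2, 9, 5, 1])
```

Because the probabilities depend on ranks rather than raw fitness, one food source with a very large fitness value cannot dominate the selection the way it can under fitness-proportional selection.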
In [15] we defined a class of functions called Type-2 Time Bounds (henceforth T2TB) for clocking the Oracle Turing Machine (henceforth OTM) in order to capture the long-missing notion of complexity classes at type-2. In the present paper we adopt the same notion and further advance this type-2 complexity theory along the lines of the classical one. Although the OTM is mostly accepted as a natural computing device for type-2 computation, the complexity theorems based on the OTM are highly sensitive to the conventions of the machine. In particular, the cost model for handling oracle answers turns out to be a crucial factor in the theory. Almost all of the existing literature on machine characterizations of type-2 computation is based on a cost model known as the answer-length cost model. In this paper we present a reasonable alternative called the unit cost model, and examine how this model shapes the outlook of the type-2 complexity theory. We prove two theorems that are opposite to th...
International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06)
We present an approach for improving computer Go programs called CSDTA (Canonical Sequence Directed Tactics Analyzer), which analyzes the local tactics on a quadrant of the standard Go board based on a collection of canonical sequences (Joseki). We collect 1278 canonical sequences and their deviations in our system. Instead of trivially matching the current game against the collected canonical sequences, we define a notion of similar sequences with respect to the current game. This paper also explains how to extract the most suitable next move from the candidate sequences. The simplicity of our method and its positive outcome make our approach suitable for integration into a complete computer Go program.
Journal of Advances in Information Technology, 2015
The notion of type-2 computability occurs naturally in many practical and theoretical settings in computer science. For example, machine learning, programming languages, database queries, complexity-theoretic problem reductions, and so on, are immediate applications of type-2 computation. However, there is no satisfactory type-2 complexity theory to characterize the computational cost of this wide range of applications. Thus, the purpose of this thesis is to give a theoretical framework for analyzing the complexity of type-2 computation. We use the Oracle Turing Machine (OTM) as our standard formalism for type-2 computation. The best way to characterize the computational cost of type-2 computation is to give a robust notion of type-2 complexity classes. In order to do so, we first study the induced topologies determined by type-2 continuous functionals of type $(\mathbb{N} \to \mathbb{N}) \times \mathbb{N} \rightharpoonup \mathbb{N}$. Then, based on the compact sets in the induced topologies, we define a type-2 almost-everywhere relation $\leq^{*}_{2}$ over type-2 continuous functionals. The relation $\leq^{*}_{2}$ provides a type-2 analogue of the asymptotic approach to complexity analysis. We also specify a clocking scheme for OTMs based on a class of computable functions called Type-2 Time Bounds (T2TB). With the tools we developed, each type-2 time bound β ∈ T2TB determines a type-2 complexity class C(β). We also define a type-2 big-O notation, O(β), which would be a useful tool for type-2 algorithm analysis. To justify our notion of type-2 complexity classes, we prove the Union Theorem, the Gap Theorem, the Compression Theorem, and the Speed-up Theorem in type-2 along the lines of classical complexity theory. Most of the theorems we proved are very different from their type-1 counterparts.
We thus learn that the structure of type-2 complexity classes is not as sturdy as the structure in type-1; the classes are very sensitive to the topological constraint. With these complexity results, we have a reasonable outlook for a general type-2 complexity theory.
IEEE Systems Journal, 2014
In this paper, we discuss how to prevent users' passwords from being stolen by adversaries in online environments and automated teller machines. We propose differentiated virtual password mechanisms in which a user has the freedom to choose a virtual password scheme ranging from weak security to strong security, where a virtual password requires a small amount of human computing to secure users' passwords. The tradeoff is that the stronger the scheme, the more complex the scheme may be. Among the schemes, we have a default method (i.e., traditional password scheme), system-recommended functions, user-specified functions, user-specified programs, and so on. A function/program is used to implement the virtual password concept with a tradeoff of security for complexity requiring a small amount of human computing. We further propose several functions to serve as system-recommended functions and provide a security analysis. For user-specified functions, we adopt secret little functions in which security is enhanced by hiding secret functions/algorithms.
2008 IEEE International Conference on Communications, 2008
People enjoy the convenience of online services, but online environments may bring many risks. In this paper, we discuss how to prevent users' passwords from being stolen by adversaries. We propose a virtual password concept involving a small amount of human computing to secure users' passwords in online environments. We adopt user-determined randomized linear generation functions to secure users' passwords, based on the fact that a server has more information than any adversary does. We analyze how the proposed scheme defends against phishing, keylogger, and shoulder-surfing attacks. To the best of our knowledge, our virtual password mechanism is the first that is able to defend against all three attacks together.
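The virtual password idea can be made concrete with a toy sketch. The abstract does not give the actual linear generation function, so the per-digit formula, the names, and the digit-only password below are assumptions for illustration, not the authors' scheme. The point shown is only that the typed value changes with a server-supplied random challenge, while the server, which stores the secrets, can recompute and check it.

```python
import secrets


def virtual_password(password_digits, secret_a, challenge):
    """Toy linear function: each typed digit is (a * p_i + r) mod 10.

    `secret_a` stands for the user-chosen secret and `challenge` for the
    server's random value; both names and the formula are hypothetical.
    """
    return [(secret_a * d + challenge) % 10 for d in password_digits]


def server_verify(stored_digits, stored_a, challenge, response):
    """The server knows the password and the secret, so it recomputes the response."""
    return response == virtual_password(stored_digits, stored_a, challenge)


password = [4, 1, 7, 2]            # hypothetical fixed password digits
a = 3                              # hypothetical user-chosen multiplier
challenge = secrets.randbelow(10)  # fresh random challenge for each login

response = virtual_password(password, a, challenge)
accepted = server_verify(password, a, challenge, response)
```

An observer who captures one `response` sees digits that would be typed differently under another challenge. This sketch makes no security claim beyond that illustration.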
International Journal of Machine Learning and Computing, 2015
This study proposes a non-symmetrical weighted k-means (NSWKM) clustering algorithm to improve the accuracy of clustering results. The similarity distance of the original k-means algorithm is modified by adding weights in a non-symmetrical form to the distance measurement. Namely, different attribute weights are applied to different clusters, so that the contribution of each attribute can be adjusted adaptively during the clustering process. In this work, the weights are obtained through an optimization process using a rank-based artificial bee colony (RABC) algorithm. The proposed NSWKM clustering algorithm combined with the RABC, termed NSWKM-RABC herein, is then applied to medical diagnosis on five disease data sets (breast cancer, cardiac disease, diabetes, liver disease, and hepatitis) to evaluate its performance.
Index Terms: data sets of diseases, medical diagnoses, non-symmetrical weighted k-means clustering algorithm, rank-based artificial bee colony algorithm.
I. INTRODUCTION
Data mining technology is an effective tool for mining useful knowledge from the databases at hand, and it is a critical analysis step in knowledge discovery in databases (KDD). Basically, the KDD process consists of getting data, choosing target data, preprocessing data, transforming data, discovering patterns/rules, and making the optimal decision for action. According to [1], presented in 2008, the top ten data mining algorithms were C4.5, k-means, SVM (Support Vector Machine), Apriori, EM (Expectation-Maximization), PageRank (Google's PageRank), AdaBoost (Adaptive Boosting), kNN (k-Nearest Neighbor), Naïve Bayes, and CART (Classification and Regression Trees). Detailed technologies and references for the ten algorithms were presented in [1].
Among the ten algorithms, the k-means algorithm is an iterative method that divides a given data set into a user-specified number of clusters, k, using a similarity distance measure. Each instance is assigned to the cluster whose centroid is nearest to it. Generally, the distance is measured with the Euclidean or Manhattan distance, with the same weight, equal to 1, given to every attribute. However, the contribution...
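The effect of cluster-specific attribute weights on the assignment step can be sketched as below. This is a minimal illustration assuming a weighted Euclidean distance with one weight vector per cluster. The weights are fixed by hand here, whereas the paper obtains them by RABC optimization, and all names and values are hypothetical.

```python
import math


def weighted_distance(point, centroid, weights):
    """Weighted Euclidean distance using one weight per attribute."""
    return math.sqrt(
        sum(w * (p - c) ** 2 for p, c, w in zip(point, centroid, weights))
    )


def assign(point, centroids, cluster_weights):
    """Index of the nearest cluster, where each cluster has its own weight vector."""
    distances = [
        weighted_distance(point, c, w) for c, w in zip(centroids, cluster_weights)
    ]
    return distances.index(min(distances))


centroids = [(0.0, 0.0), (4.0, 0.0)]
point = (1.5, 0.0)

# Standard k-means: every attribute has weight 1 in every cluster.
uniform = [(1.0, 1.0), (1.0, 1.0)]
# Hypothetical non-symmetrical weights: cluster 0 penalizes attribute 0 heavily.
skewed = [(9.0, 1.0), (1.0, 1.0)]

label_uniform = assign(point, centroids, uniform)  # distances 1.5 and 2.5
label_skewed = assign(point, centroids, skewed)    # distances 4.5 and 2.5
```

With uniform weights the point goes to the geometrically nearer centroid. With the skewed weights the same point moves to the other cluster, which is the kind of adjustment cluster-specific weights make possible.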
CS educators have paid a great deal of attention to Discrete Mathematics over the past several decades. Although there have been many suggestions for improving this course, it seems to us that the real purpose of Discrete Mathematics as a transitional course has long been forgotten. We believe that the most important objective of this course is to familiarize students with the format and structure of rigorous mathematical arguments for their future study in CS. The choice of subjects is also important, because the subjects are the tools with which we achieve that objective. Thus, we argue that complicated subjects should not be used in this transitional course, because they distract students' attention from the underlying structure of the arguments that we want to emphasize. In this paper we start with some prevailing misconceptions in efforts to improve this course, and then we provide our solutions.
There are now a number of things called "higher-type complexity classes." The most prominent of these is the class of basic feasible functionals (CU93, CK90), a fairly conservative higher-type analogue of the (type-1) polynomial-time computable functions. There is, however, currently no satisfactory general notion of what a higher-type complexity class should be. In this paper we propose one such notion
Lecture Notes in Computer Science
A classic result known as the speed-up theorem in machine-independent complexity theory shows that there exist some computable functions that do not have best programs for them [2, 3]. In this paper we lift this result into type-2 computation under the notion of our type-2 complexity theory depicted in [15, 13, 14]. While the speed-up phenomenon is essentially inherited from type-1 computation, we cannot directly apply the original proof to our type-2 speed-up theorem, because the oracle queries can interfere with the speed of the programs and hence the cancellation strategy used in the original proof is no longer correct at type-2. We also argue that a type-2 analog of the operator speed-up theorem [16] does not hold, which suggests that this curious phenomenon disappears in higher-typed computation beyond type-2.
Theorem 1 (The Speed-up Theorem [2, 3]). For any recursive function $r$, there exists a recursive function $f$ such that $(\forall i : \varphi_i = f)\,(\exists j : \varphi_j = f)\,(\overset{\infty}{\forall} x)\; r(\Phi_j(x)) \leq \Phi_i(x)$.
Footnote 1: The original remarks were translated in [7], pages 82-83. More discussion about the relation between the computational speed-up phenomena and Gödel's speed-up results in logic can be found in [21].
Footnote 2: The negation of "for all but finitely many" is "there exist infinitely many," denoted by $\overset{\infty}{\exists}$.
2008 IEEE International Conference on Communications, 2008
In this paper, we discuss how to prevent users' passwords from being stolen by adversaries. We propose differentiated security mechanisms in which a user has the freedom to choose a virtual password scheme ranging from weak security to strong security. The tradeoff is that the stronger the scheme, the more complex the scheme may be. Among the schemes, we have a default method (i.e., traditional password scheme), system-recommended function, user-specified function, user-specified program, etc. A function/program is used to implement the virtual password concept with a tradeoff of security for complexity requiring a small amount of human computing. We further propose a codebook approach to serve as system-recommended functions and provide a security analysis. For user-specified functions, we adopt secret little functions, in which security is enhanced by hiding secret functions/algorithms.
2006 IEEE International Conference on Communications, 2006
In this paper, based on a Linear Congruential Generator (LCG), we propose a new block cipher that is suitable for constructing a lightweight secure protocol for resource-constrained wireless sensor networks. Motivated by Plumstead's inference algorithm, we embed the generated pseudo-random numbers in sensor data messages in order to provide security. Specifically, the security of our proposed cipher is achieved by adding random noise and random permutations to the original data messages. The analysis of our cipher indicates that it can satisfy the security requirements of wireless sensor networks. We demonstrate that secure protocols based on our proposed cipher satisfy the baseline security requirements: data confidentiality, authenticity, and integrity with low overhead. Performance analysis demonstrates that our proposed block cipher is more lightweight than RC5 in terms of the number of basic operations.
IEEE Globecom 2006, 2006
Radio Frequency Identification (RFID) systems have provided promising solutions for the effective identification of a large number of tagged objects. However, RFID systems suffer from unauthorized tag reading and potential eavesdropping, which is a challenging issue because of the shared radio medium and the limited size and cost of RFID tags. In this paper, based on a Linear Congruential Generator (LCG), we propose a lightweight block cipher that can meet the security and performance requirements of RFID systems. The trade-off between security and overhead is discussed. Based on the proposed block cipher, we further propose a secure protocol for RFID that can provide data confidentiality and mutual authentication between the reader and the tag. We also provide a performance analysis of our proposed block cipher.
Theory of Computing Systems, 2009
Metadata of the article that will be visualized in Online First. Journal name: Bulletin of Mathematical Biology. Article title: Morphogenetic Gradients and the Stability of Boundaries Between Neighboring Morphogenetic Regions.
In [15] we defined a class of functions called Type-2 Time Bounds (henceforth T2TB) for clocking the Oracle Turing Machine (henceforth OTM) in order to capture the long-missing notion of complexity classes at type-2. In the present paper we adopt the same notion and further advance this type-2 complexity theory along the lines of the classical one. Although the OTM is mostly accepted as a natural computing device for type-2 computation, the complexity theorems based on the OTM are highly sensitive to the conventions of the machine. In particular, the cost model for handling oracle answers turns out to be a crucial factor in the theory. Almost all of the existing literature on machine characterizations of type-2 computation is based on a cost model known as the answer-length cost model. In this paper we present a reasonable alternative called the unit cost model, and examine how this model shapes the outlook of the type-2 complexity theory. We prove two theorems that are opposi...
Formal Aspects of Computing, 2012
Modern web applications often suffer from command injection attacks. Even when equipped with sanitization code, many systems can be penetrated because of software bugs. It is desirable to discover such vulnerabilities automatically, given the bytecode of a web application. One approach is to symbolically execute the target system and construct constraints that match path conditions against attack patterns. Solving these constraints yields an attack signature, from which the attack process can be replayed. Constraint solving is the key to symbolic execution. For web applications, string constraints receive most of the attention because web applications are essentially text processing programs. We present simple linear string equation (SISE), a decidable fragment of the general string constraint system. SISE models a collection of regular replacement operations (such as the greedy, reluctant, declarative, and finite replacement), which are frequently used by text processing pr...
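The difference between greedy and reluctant replacement can be seen with Python's standard `re` module. This is only an illustration of those two regular replacement semantics. It does not model SISE itself, nor the declarative and finite replacements, and the tag-stripping pattern is a hypothetical sanitizer chosen for the example.

```python
import re

text = "<b>x</b>"

# Greedy: `.*` extends as far as possible, so one match spans the whole input.
greedy = re.sub(r"<.*>", "", text)

# Reluctant (non-greedy): `.*?` stops at the first `>`, so each tag is
# matched and removed separately and the text between the tags survives.
reluctant = re.sub(r"<.*?>", "", text)
```

Here `greedy` is the empty string and `reluctant` is `"x"`. A sanitizer written with one semantics in mind but executed with the other behaves differently on the same input, which is why a constraint solver has to model the replacement semantics precisely.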
Computer Communications, 2008
This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and education use, including for instruction at the author's institution and sharing with colleagues. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Word or Tex form) to their personal website or institutional repository. Authors requiring further information regarding Elsevier's archiving and manuscript policies are encouraged to visit: http://www.elsevier.com/copyright
Computer Communications, 2006
In this paper, based on a Linear Congruential Generator (LCG), we propose a new block cipher that is suitable for constructing a lightweight secure protocol for resource-constrained wireless sensor networks. From the cryptanalysis point of view, our building block is considered secure if the attacker cannot obtain the pseudo-random numbers generated by the LCG. Plumstead's inference algorithm for an LCG with unknown parameters demonstrates that it is impossible to significantly enhance the security of the system simply by increasing the size of the modulus. Therefore, we are motivated to embed the generated pseudo-random numbers in sensor data messages in order to provide security. Specifically, the security of our proposed cipher is achieved by adding random noise and random permutations to the original data messages. We also adopt Hull and Dobell's algorithm to select proper parameters for the LCG. The analysis of our cipher indicates that it can satisfy the security requirements of wireless sensor networks. We further demonstrate that secure protocols based on our proposed cipher satisfy the baseline security requirements: data confidentiality, authenticity, and integrity with low overhead. Performance analysis demonstrates that our proposed block cipher is more lightweight than RC5, a commonly used cipher in wireless sensor networks, in terms of the number of basic operations.
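Parameter selection for an LCG can be sketched with the standard Hull-Dobell full-period conditions for $X_{n+1} = (aX_n + b) \bmod m$ with nonzero increment b. Whether the paper applies the conditions in exactly this form is an assumption. The function names are hypothetical and the small modulus is chosen only so that the period can be checked by brute force.

```python
from math import gcd


def prime_factors(n):
    """Set of prime factors of n, found by trial division."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(a, b, m):
    """Hull-Dobell conditions for period m (nonzero increment b)."""
    if gcd(b, m) != 1:                       # b and m must be coprime
        return False
    if any((a - 1) % p != 0 for p in prime_factors(m)):
        return False                         # a - 1 divisible by each prime factor of m
    if m % 4 == 0 and (a - 1) % 4 != 0:
        return False                         # a - 1 divisible by 4 when m is
    return True


def cycle_length(a, b, m, seed=0):
    """Brute-force length of the cycle eventually reached from `seed`."""
    seen = {}
    x, step = seed, 0
    while x not in seen:
        seen[x] = step
        x = (a * x + b) % m
        step += 1
    return step - seen[x]


good = has_full_period(5, 3, 16)  # conditions hold, so the period is 16
bad = has_full_period(3, 3, 16)   # a - 1 = 2 is not divisible by 4
```

A full period ensures that the generator visits every residue before repeating. It says nothing about unpredictability, which the cipher addresses separately through noise and permutations.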
Journal of Computing Sciences in Colleges, Apr 1, 2005
Theoretically, using any Linear Congruence Generator (LCG) to generate pseudorandom numbers for c... more Theoretically, using any Linear Congruence Generator (LCG) to generate pseudorandom numbers for cryptographic purposes is problematic because of its predictableness. On the other hand, due to its simplicity and efficiency, we think that the LCG should not be completely ignored. Since the random numbers generated by the LCG are predictable, it is clear that we cannot use them directly. However, we shall not introduce too much complication in the implementation which will compromise the reasons, simplicity and efficiency, of choosing the LCG. Thus, we propose an easy encryption method using an LCG for email encryption. To see how practical in predicting random numbers produced by an LCG, we implement Plumstead's inference algorithm [2] and run it on some numbers generated by the easiest congruence: X n+1 = aX n + b mod m. Based on the result, we confirm the theoretical fault of the LCG, that is, simply increasing the size of the modulus does not significantly increase the difficulty of breaking the sequence. Our remedy is to break a whole random number into pieces and use them separately (with interference from another source, in our case, English text). We use 16-bytes random numbers and embed each byte of the random number as noise in one text character. In such a way, we can avoid revealing enough numbers for the attacker to predict.
J. Inf. Sci. Eng., 2016
This work proposes an improved artificial bee colony (ABC) algorithm, called the rank-based ABC a... more This work proposes an improved artificial bee colony (ABC) algorithm, called the rank-based ABC algorithm, which includes a rank-based selection mechanism in the onlooker bees phase and a modified abandonment mechanism in the scout bees phase for solving unconstrained and constrained optimization problems. In the onlooker bees phase, the probability that an onlooker bee selects a food source is determined using a nonlinear selective pressure function, which is based on a ranking of fitness instead of proportional total fitness values. A nectar source with a superior fitness rank has a large probability of being selected by onlooker bees as new solutions and so yields a similar “best solution pool,” which often comprises the best and several good solutions, therefore, the exploitation capability for searching good solution is enhanced for the basic ABC algorithm. Moreover, the modified abandonment mechanism is used in the scout bees phase to increase the exploration capability for se...
In [15] we defined a class of functions called Type-2 Time Bounds (henceforth T2TB) for clocking ... more In [15] we defined a class of functions called Type-2 Time Bounds (henceforth T2TB) for clocking the Oracle Turing Machine (henceforth OTM) in order to capture the long missing notion of complexity classes at type-2. In the present paper we adopt the same notion and further advance this apt type-2 complexity theory along the line of the classical one. Albeit the OTM is mostly accepted as a natural computing device for type-2 computation, the complexity theorems based on the OTM are highly sensitive to the convention of the machine. In particular, the cost model in dealing with the oracle answers turns out to be a crucial factor in the theory. Almost all existent literatures on the machine characterization for type-2 computation are based on a cost model known as answer-length cost model. In this paper we present a reasonable alternative called unit cost model, and examine how this model shapes the outlook of the type-2 complexity theory. We prove two theorems that are opposite to th...
International Conference on Computational Intelligence for Modelling, Control and Automation and International Conference on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06)
We present an approach for improving computer Go programs called CSDTA (Canonical Sequence Direct... more We present an approach for improving computer Go programs called CSDTA (Canonical Sequence Directed Tactics Analyzer), which analyzes the local tactics on a quadrant of the standard Go board based on a collection of canonical sequences (Joseki). We collect 1278 canonical sequences and their deviations in our system. Instead of trivially matching the current game and the collected canonical sequences, we define a notion of similar sequences with respect to the current game. This paper also explains how to extract the most suitable move from the candidate sequences for the next move. The simplicity of our method and its positive outcome make our approach suitable to be intergraded in a complete computer Go program for foreseeable improvement.
Journal of Advances in Information Technology, 2015
The notion of type-2 computability occurs naturally in many practical and theoretical settings in... more The notion of type-2 computability occurs naturally in many practical and theoretical settings in computer science. For examples, machine learning, programing languages, databases enquiry, complexity-theoretic problem reductions, and so on, are immediate applications of type-2 computation. However, there is no satisfactory type-2 complexity theory to characterize the computational cost of these widely ranged applications. Thus, the purpose of this thesis is to give a theoretical framework for analyzing the complexity of type-2 computation. We use the Oracle Turing Machine (OTM) as our standard formalism for type-2 computation. The best way to characterize the computational cost of type-2 computation is to give a robust notion of type-2 complexity classes. In order to do so, we first study the induced topologies determined by type-2 continuous functionals of type (N → N) × N r N. Then, based on the compact sets in the induced topologies, we define a type-2 almost-everywhere relation ≤ *2 over type-2 continuous functionals. The type-2 almost-everywhere relation ≤* 2 provides an analogous notion of asymptotic approach for complexity analysis in type-2. We also specify a clocking scheme for OTMs based on a class of computable functions called Type-2 Time Bounds ( T2TB). With the tools we developed, each type-2 time bound β ∈ T2TB determines a type-2 complexity class C(β). We also define a type-2 big-O notation—O(β)—which would be a useful tool for type-2 algorithm analysis. To justify our notion of type-2 complexity classes, we prove the Union Theorem, the Gap Theorem, the Compression Theorem, and the Speed-up Theorem in type-2 along the lines of classical complexity theory. Most of the theorems we proved are very different from their type-1 counterparts. 
We thus learn that the structure of type-2 complexity classes is not as sturdy as the structure in type-1; they are very sensitive to the topological constraint. With theses complexity results, we have a reasonable outlook for a general type-2 complexity theory.
IEEE Systems Journal, 2014
In this paper, we discuss how to prevent users' passwords from being stolen by adversaries in onl... more In this paper, we discuss how to prevent users' passwords from being stolen by adversaries in online environments and automated teller machines. We propose differentiated virtual password mechanisms in which a user has the freedom to choose a virtual password scheme ranging from weak security to strong security, where a virtual password requires a small amount of human computing to secure users' passwords. The tradeoff is that the stronger the scheme, the more complex the scheme may be. Among the schemes, we have a default method (i.e., traditional password scheme), system recommended functions, user-specified functions, user-specified programs, and so on. A function/program is used to implement the virtual password concept with a tradeoff of security for complexity requiring a small amount of human computing. We further propose several functions to serve as system recommended functions and provide a security analysis. For user-specified functions, we adopt secret little functions in which security is enhanced by hiding secret functions/algorithms.
2008 IEEE International Conference on Communications, 2008
People enjoy the convenience of on-line services, but online environments may bring many risks. I... more People enjoy the convenience of on-line services, but online environments may bring many risks. In this paper, we discuss how to prevent users' passwords from being stolen by adversaries. We propose a virtual password concept involving a small amount of human computing to secure users' passwords in on-line environments. We adopt user-determined randomized linear generation functions to secure users' passwords based on the fact that a server has more information than any adversary does. We analyze how the proposed scheme defends against phishing, key logger, and shoulder-surfing attacks. To the best of our knowledge, our virtual password mechanism is the first one which is able to defend against all three attacks together. I.
International Journal of Machine Learning and Computing, 2015
This study proposes a non-symmetrical weighted k-means (NSWKM) clustering algorithm to improve the accuracy of clustering results. The similarity distance of the original k-means algorithm is modified by adding weights in a non-symmetrical form to the distance measurement. Namely, different attribute weights are applied to different clusters so that the contribution of each attribute can be adjusted adaptively during the clustering process. In this work, the weights are obtained via an optimization process using a rank-based artificial bee colony (RABC) algorithm. Furthermore, the proposed NSWKM clustering algorithm combined with the RABC, termed NSWKM-RABC herein, is applied to the medical diagnoses of five disease data sets, including breast cancer, cardiac disease, diabetes, liver disease and hepatitis, to evaluate the performance of the proposed algorithm.
Index Terms: Data sets of diseases, medical diagnoses, non-symmetrical weighted k-means clustering algorithm, rank-based artificial bee colony algorithm.
I. INTRODUCTION
Data mining technology is an effective tool for mining useful knowledge from the databases at hand, and it is a critical analysis step in knowledge discovery in databases (KDD). Basically, the KDD process consists of getting data, choosing target data, preprocessing data, transforming data, discovering patterns/rules, and making the optimal decision for action. According to the research work of [1], presented in 2008, the top ten data mining algorithms were C4.5, k-means, SVM (Support Vector Machine), Apriori, EM (Expectation-Maximization), PageRank (Google's Page Rank), AdaBoost (Adaptive Boosting), kNN (k-Nearest Neighbor), Naïve Bayes, and CART (Classification and Regression Trees). Detailed techniques and references for the ten algorithms were presented in [1].
Among the ten algorithms, the k-means algorithm is an iterative method that divides a given data set into a user-specified number of clusters, k, using a similarity distance measure. Each instance is assigned to the cluster whose centroid is nearest to it. Generally, the distance is measured with the Euclidean or Manhattan distance, with the same weight, equal to 1, for every attribute. However, the contribution
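The per-cluster attribute weighting described in the abstract above can be illustrated with a minimal sketch of a single assignment step. It assumes that "non-symmetrical" means each cluster carries its own attribute weight vector and that the weights enter a Euclidean distance multiplicatively; the function names `weighted_distance` and `assign` and the hand-picked weights are hypothetical, and the RABC optimization that the paper uses to find the weights is not reproduced here.

```python
import math

def weighted_distance(x, centroid, weights):
    """Weighted Euclidean distance; one non-negative weight per attribute."""
    return math.sqrt(sum(w * (a - c) ** 2 for a, c, w in zip(x, centroid, weights)))

def assign(points, centroids, cluster_weights):
    """Assign each point to the cluster with the smallest weighted distance,
    measuring the distance to each centroid with that cluster's own weights."""
    labels = []
    for x in points:
        dists = [weighted_distance(x, c, w) for c, w in zip(centroids, cluster_weights)]
        labels.append(dists.index(min(dists)))
    return labels

points = [(1.0, 10.0), (9.0, 1.0), (6.0, 5.0)]
centroids = [(0.0, 0.0), (10.0, 0.0)]

# Classic k-means: every attribute has weight 1 in every cluster.
uniform = [(1.0, 1.0), (1.0, 1.0)]
# Hypothetical weights: cluster 0 nearly ignores the second attribute.
skewed = [(1.0, 0.01), (1.0, 1.0)]

print(assign(points, centroids, uniform))
print(assign(points, centroids, skewed))
```

With uniform weights the third point falls into cluster 1; once cluster 0 discounts the second attribute, the same point moves to cluster 0, which is the kind of adaptive adjustment the abstract describes.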
CS educators have paid a great deal of attention to Discrete Mathematics over the past several decades. Although there have been many suggestions for improving this course, it seems to us that the real purpose of Discrete Mathematics as a transitional course has long been forgotten. We believe that the most important objective of this course is to familiarize students with the format and structure of rigorous mathematical arguments for their future study in CS. Which subjects are taught is also important, because they are the tools with which we achieve that objective. Thus, we argue that complicated subjects should not be used in this transitional course because they will distract the student's attention from the underlying structure of the arguments that we want to emphasize. In this paper we start with some prevailing misconceptions in the effort to improve this course, and then we provide our solutions.
There are now a number of things called "higher-type complexity classes." The most prominent of these is the class of basic feasible functionals (CU93, CK90), a fairly conservative higher-type analogue of the (type-1) polynomial-time computable functions. There is, however, currently no satisfactory general notion of what a higher-type complexity class should be. In this paper we propose one such notion.
Lecture Notes in Computer Science
A classic result known as the speed-up theorem in machine-independent complexity theory shows that there exist some computable functions that do not have best programs for them [2, 3]. In this paper we lift this result into type-2 computation under the notion of our type-2 complexity theory depicted in [15, 13, 14]. While the speed-up phenomenon is essentially inherited from type-1 computation, we cannot directly apply the original proof to our type-2 speed-up theorem because the oracle queries can interfere with the speed of the programs, and hence the cancellation strategy used in the original proof is no longer correct at type-2. We also argue that a type-2 analog of the operator speed-up theorem [16] does not hold, which suggests that this curious phenomenon disappears in higher-typed computation beyond type-2.
Theorem 1 (The Speed-up Theorem [2, 3]). For any recursive function $r$, there exists a recursive function $f$ such that $(\forall i : \varphi_i = f)\,(\exists j : \varphi_j = f)\,(\overset{\infty}{\forall} x)\; r(\Phi_j(x)) \le \Phi_i(x)$.
Footnote 1: The original remarks were translated in [7], pages 82-83. More discussion about the relation between the computational speed-up phenomena and Gödel's speed-up results in logic can be found in [21].
Footnote 2: The negation of "for all but finitely many" is "there exist infinitely many", denoted by $\overset{\infty}{\exists}$.
2008 IEEE International Conference on Communications, 2008
In this paper, we discuss how to prevent users' passwords from being stolen by adversaries. We propose differentiated security mechanisms in which a user has the freedom to choose a virtual password scheme ranging from weak security to strong security. The tradeoff is that the stronger the scheme, the more complex the scheme may be. Among the schemes, we have a default method (i.e., traditional password scheme), system recommended function, user-specified function, user-specified program, etc. A function/program is used to implement the virtual password concept with a tradeoff of security for complexity requiring a small amount of human computing. We further propose a codebook approach to serve as system recommended functions and provide a security analysis. For user-specified functions, we adopt secret little functions, in which security is enhanced by hiding secret functions/algorithms.
2006 IEEE International Conference on Communications, 2006
In this paper, based on a Linear Congruential Generator (LCG), we propose a new block cipher that is suitable for constructing a lightweight secure protocol for resource-constrained wireless sensor networks. Based on Plumstead's inference algorithm, we are motivated to embed the generated pseudo-random numbers with sensor data messages in order to provide security. Specifically, the security of our proposed cipher is achieved by adding random noise and random permutations to the original data messages. The analysis of our cipher indicates that it can satisfy the security requirements of wireless sensor networks. We demonstrate that secure protocols based on our proposed cipher satisfy the baseline security requirements: data confidentiality, authenticity, and integrity with low overhead. Performance analysis demonstrates that our proposed block cipher is more lightweight than RC5 in terms of the number of basic operations.
IEEE Globecom 2006, 2006
Radio Frequency Identification (RFID) systems have provided promising solutions to effective identification of a large number of tagged objects. However, RFID systems suffer from unauthorized tag reading and potential eavesdropping, which becomes a challenging issue because of the shared radio medium and limited size and cost considerations in RFID. In this paper, based on a Linear Congruential Generator (LCG), we propose a lightweight block cipher that can meet the security and performance requirements of RFID systems. The trade-off between security and overhead is discussed. Based on the proposed block cipher, we further propose a secure protocol for RFID that can provide data confidentiality and mutual authentication between the reader and the tag. We also provide a performance analysis of our proposed block cipher.
Theory of Computing Systems, 2009
Metadata of the article that will be visualized in Online First. Journal Name: Bulletin of Mathematical Biology. Article Title: Morphogenetic Gradients and the Stability of Boundaries Between Neighboring Morphogenetic Regions
In (15) we defined a class of functions called Type-2 Time Bounds (henceforth T2TB) for clocking the Oracle Turing Machine (henceforth OTM) in order to capture the long-missing notion of complexity classes at type-2. In the present paper we adopt the same notion and further advance this type-2 complexity theory along the lines of the classical one. Although the OTM is mostly accepted as a natural computing device for type-2 computation, the complexity theorems based on the OTM are highly sensitive to the conventions of the machine. In particular, the cost model for dealing with the oracle answers turns out to be a crucial factor in the theory. Almost all existing literature on the machine characterization of type-2 computation is based on a cost model known as the answer-length cost model. In this paper we present a reasonable alternative called the unit cost model, and examine how this model shapes the outlook of type-2 complexity theory. We prove two theorems that are opposi...
Formal Aspects of Computing, 2012
Modern web applications often suffer from command injection attacks. Even when equipped with sanitization code, many systems can be penetrated due to software bugs. It is desirable to automatically discover such vulnerabilities, given the bytecode of a web application. One approach would be symbolically executing the target system and constructing constraints for matching path conditions and attack patterns. Solving these constraints yields an attack signature, based on which the attack process can be replayed. Constraint solving is the key to symbolic execution. For web applications, string constraints receive most of the attention because web applications are essentially text processing programs. We present the simple linear string equation (SISE), a decidable fragment of the general string constraint system. SISE models a collection of regular replacement operations (such as the greedy, reluctant, declarative, and finite replacement), which are frequently used by text processing pr...
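The difference between two of the replacement semantics named in the abstract above, greedy and reluctant, can be seen in a short sketch. It uses Python's `re` module purely as an illustration; the paper's formal definitions may differ in detail, and the sample string and pattern are made up for this example.

```python
import re

text = "<<a>>"

# Greedy: "<+" consumes the longest possible run of "<", so one replacement occurs.
greedy = re.sub(r"<+", "[", text)

# Reluctant: "<+?" consumes the shortest possible run, so each "<" is replaced separately.
reluctant = re.sub(r"<+?", "[", text)

print(greedy)
print(reluctant)
```

A sanitizer written with one semantics in mind but executed with the other produces a different output string, which is why a constraint solver for such programs has to model each replacement mode precisely.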
Computer Communications, 2008
Computer Communications, 2006
In this paper, based on a Linear Congruential Generator (LCG), we propose a new block cipher that is suitable for constructing a lightweight secure protocol for resource-constrained wireless sensor networks. From the cryptanalysis point of view, our building block is considered secure if the attacker cannot obtain the pseudo-random numbers generated by the LCG. Plumstead's inference algorithm for an LCG with unknown parameters demonstrates that it is impossible to significantly enhance the security of the system simply by increasing the size of the modulus. Therefore, we are motivated to embed the generated pseudo-random numbers with sensor data messages in order to provide security. Specifically, the security of our proposed cipher is achieved by adding random noise and random permutations to the original data messages. We also adopt Hull and Dobell's algorithm to select proper parameters used in the LCG. The analysis of our cipher indicates that it can satisfy the security requirements of wireless sensor networks. We further demonstrate that secure protocols based on our proposed cipher satisfy the baseline security requirements: data confidentiality, authenticity, and integrity with low overhead. Performance analysis demonstrates that our proposed block cipher is more lightweight than RC5, a commonly used cipher in wireless sensor networks, in terms of the number of basic operations.
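The parameter selection mentioned in the abstract above presumably rests on the Hull-Dobell theorem, which states that the generator X_{n+1} = (a X_n + b) mod m has full period m exactly when (1) b and m are coprime, (2) a - 1 is divisible by every prime factor of m, and (3) a - 1 is divisible by 4 whenever m is. The following is a minimal sketch of that check with a brute-force period count for small moduli; the function names and the sample parameters are hypothetical and are not taken from the paper.

```python
from math import gcd

def prime_factors(n):
    """Return the set of prime factors of n (trial division; fine for small n)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors

def has_full_period(a, b, m):
    """Check the three Hull-Dobell conditions for X_{n+1} = (a*X_n + b) mod m."""
    if gcd(b, m) != 1:
        return False
    if any((a - 1) % p != 0 for p in prime_factors(m)):
        return False
    if m % 4 == 0 and (a - 1) % 4 != 0:
        return False
    return True

def period(a, b, m, seed=0):
    """Count steps until the sequence returns to the seed (small m only).
    Returns None if the seed is not revisited within m steps."""
    x = seed
    for steps in range(1, m + 1):
        x = (a * x + b) % m
        if x == seed:
            return steps
    return None

print(has_full_period(5, 3, 16), period(5, 3, 16))
print(has_full_period(3, 3, 16), period(3, 3, 16))
```

A full period only guarantees that the generator visits every residue before repeating; as the abstract notes, it does not make the output unpredictable, which is why the cipher adds noise and permutations on top.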