tashfia islam - Academia.edu
Papers by tashfia islam
Proceedings of the 2022 International Conference on Management of Data
Classification, a heavily studied data-driven machine learning task, drives a large number of prediction systems involving critical decisions such as loan approval and criminal risk assessment. However, classifiers often demonstrate discriminatory behavior, especially when presented with biased data. Consequently, fairness in classification has emerged as a high-priority research area. Data management research is showing an increasing presence and interest in topics related to data and algorithmic fairness, including the topic of fair classification. The interdisciplinary efforts in fair classification, with machine learning research having the largest presence, have resulted in a large number of fairness notions and a wide range of approaches that have not been systematically evaluated and compared. In this paper, we contribute a broad analysis of 13 fair classification approaches and additional variants, over their correctness, fairness, efficiency, scalability, robustness to data errors, sensitivity to the underlying ML model, data efficiency, and stability, using a variety of metrics and real-world datasets. Our analysis highlights novel insights on the impact of different metrics and high-level approach characteristics on different aspects of performance. We also discuss general principles for choosing approaches suitable for different practical settings, and identify areas where data-management-centric solutions are likely to have the most impact. CCS CONCEPTS • General and reference → Empirical studies; • Computing methodologies → Machine learning; • Information systems → Data management systems.
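As a rough illustration of what a "fairness notion" can look like in practice, the sketch below computes the demographic parity difference, one commonly used notion. The abstract does not say which notions the paper evaluates, so this choice is an assumption, and the function name and toy data are hypothetical.

```python
def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest positive-prediction rate across groups.

    predictions: iterable of 0/1 classifier outputs.
    groups: iterable of group labels (e.g. values of a sensitive attribute),
            aligned with predictions.
    """
    rates = {}
    for g in set(groups):
        member_preds = [p for p, grp in zip(predictions, groups) if grp == g]
        rates[g] = sum(member_preds) / len(member_preds)
    return max(rates.values()) - min(rates.values())


# Toy data (hypothetical): group "a" is approved 3 times out of 4, group "b" once out of 4.
preds = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
gap = demographic_parity_difference(preds, groups)
print(gap)
```

A gap of 0 would mean both groups receive positive predictions at the same rate; larger values indicate a larger disparity under this particular notion.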
Being one of the most useful fields of data mining, sequential pattern mining is a very popular and much-researched domain. However, pattern mining alone is often not enough to understand the intricate relationships that exist between data objects or items. A correlation measure can improve the task of mining interesting information that is useful to the end user. In this paper, we propose a new correlation measure, SequentialCorrelation, for sequential patterns. Along with that, we propose a complete method called SCMine and design its efficient trie-based implementation. We use the measure to define a one-way or two-way relationship between data objects and subsequently classify patterns into two subsets based on order dependency. Our performance study shows that a number of insignificant patterns can be pruned and that the measure can give valuable insight into the datasets. SequentialCorrelation along with SCMine can be very useful in many real-life applications, especially because conventional c...
2017 20th International Conference of Computer and Information Technology (ICCIT), 2017
IP-based communication protocols will play a key role in enabling pervasive connectivity among the devices within an IoT network. One such protocol for resource-restricted platforms is the Datagram Transport Layer Security (DTLS) protocol. The applicability of DTLS to constrained environments is still a controversial issue because of its network overhead, memory management, and latency. This work presents a security scheme with two-way verification for wireless sensor nodes, employing established standards with rather insignificant data transfer and handshaking latency. The recommended security scheme is built with a new cipher suite that combines the authentication algorithm “Poly1305” with the encryption algorithm “ChaCha20”.
Journal of Optics, 2020
Recently, plasmonics has been used to trap light at the nanoscale to improve the absorption in solar cells. In this study, we construct a silicon thin-film solar cell (TFSC) using finite-difference time-domain (FDTD) simulation. The TFSC was designed with a TiO₂ antireflection layer, aluminum (Al) as a reflective layer, and silicon (Si) as an absorption layer. Silver nanoparticles of 50 nm diameter and 400 nm separation are embedded on the absorption layer. By optimizing the location of the nanoparticles, light scattering efficiency and near-field light concentration have been harnessed. The reflective Al layer reflects the light towards the upper layer and increases the overall absorption. The proposed structure allows sunlight to enter the absorption layer from all directions, breaking the molecular bonding and releasing electrons. We determined the short-circuit current, fill factor, open-circuit voltage, absorption efficiency, and conversion efficiency of our proposed model. The simulation results reveal that the TFSC with an embedded nanoparticle array, antireflection coating, and reflective layer has a higher short-circuit current of 23.98 mA/cm², a higher fill factor of 0.84, and an efficiency of 12.3%. The proposed TFSC increases the absorption efficiency by 43.5% and the conversion efficiency by 64.66% compared with a conventional TFSC. The findings will be helpful for designing more efficient and cost-effective solar cells that utilize the light-trapping mechanism of plasmonic nanoparticles of different sizes, shapes, and geometries.
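The quantities listed in this abstract are tied together by the standard relation η = (J_sc × V_oc × FF) / P_in. The sketch below uses it as a consistency check to back out the open-circuit voltage implied by the reported short-circuit current, fill factor, and efficiency. It assumes an incident power of 100 mW/cm² (the usual AM1.5G reference), which the abstract does not state, so the resulting voltage is an inferred illustration and not a figure reported by the paper. The function names are hypothetical.

```python
def conversion_efficiency(jsc_ma_cm2, voc_v, fill_factor, p_in_mw_cm2=100.0):
    """Efficiency as a fraction: (Jsc * Voc * FF) / Pin.

    mA/cm^2 times V gives mW/cm^2, matching the units of Pin.
    """
    return jsc_ma_cm2 * voc_v * fill_factor / p_in_mw_cm2


def implied_voc(efficiency, jsc_ma_cm2, fill_factor, p_in_mw_cm2=100.0):
    """Open-circuit voltage (V) implied by the other three quantities."""
    return efficiency * p_in_mw_cm2 / (jsc_ma_cm2 * fill_factor)


# Values from the abstract; p_in_mw_cm2 = 100.0 is an assumption (AM1.5G).
voc = implied_voc(efficiency=0.123, jsc_ma_cm2=23.98, fill_factor=0.84)
print(f"Implied open-circuit voltage: {voc:.3f} V")

# Round trip: plugging the implied voltage back in recovers the 12.3% efficiency.
print(f"Efficiency check: {conversion_efficiency(23.98, voc, 0.84):.3f}")
```

Under that assumption the implied open-circuit voltage comes out at roughly 0.61 V; a different incident power would scale it proportionally.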
Advances in Data Mining. Applications and Theoretical Aspects, 2018
In the field of data mining, classification and regression play a vital role, as they are useful in various real-life domains. Most real-life data suffer from the data imbalance problem, which hinders the performance of standard algorithms. A number of methods have been introduced for imbalanced data classification. However, most of them are designed for binary class imbalance problems. Furthermore, they suffer from various problems such as loss of useful information, likelihood of overfitting, and unexpected mistakes. The data imbalance problem also exists in regression analysis, although very few existing methods consider it. Hence, we propose an effective recursive ensemble method for multi-class imbalanced data classification. We also extend our method to propose an effective recursive method for solving the data imbalance problem in regression. Extensive performance analyses show that our proposed approach achieves high performance in multi-class classification on class-imbalanced data and in regression analysis on skewed or imbalanced data. The experimental results also show that our method outperforms various existing methods for imbalanced classification and regression.
IEEE Access, 2019
In machine learning, classification and regression are two of the most noteworthy topics, since they occur extensively in numerous real-world applications. However, real-life data is hardly ever balanced; skewed data is the common occurrence. This poses serious challenges to the standard techniques of classification and regression. The performance and effectiveness of these techniques are substantially affected by overfitting, creating a bias towards the majority class. In recent years, quite a few methods have been introduced for classification of imbalanced data. But most of them are designed for binary classes, and it is difficult or inefficient to extend them to multiple classes. Moreover, the data imbalance problem occurs frequently in regression analysis too, with only a handful of algorithms robust enough to tackle it. In this paper, we propose an effective recursive method for multi-class classification with imbalanced data. Our proposed algorithm partitions and balances the data, and is applied recursively, coupled with ensemble techniques. Furthermore, we extend our proposed method to solve the data imbalance problem in regression analysis. Experimental results demonstrate that the proposed recursive technique is effective and improves performance when compared to existing methods for classification and regression with imbalanced distributions.
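To make the idea of "balancing" a multi-class dataset concrete, here is a minimal sketch of random undersampling, where every class is reduced to the size of the smallest one. This is a generic baseline swapped in for illustration; it is not the recursive partition-and-ensemble algorithm the abstract describes, whose details are not given here. The function name and toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample_to_balance(samples, labels, seed=0):
    """Return (sample, label) pairs with every class cut to the minority-class size."""
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # The smallest class sets the per-class quota.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for y, items in by_class.items():
        for x in rng.sample(items, target):  # sample without replacement
            balanced.append((x, y))
    rng.shuffle(balanced)
    return balanced


# Toy data (hypothetical): three classes with 6, 3, and 2 examples.
X = list(range(11))
y = ["maj"] * 6 + ["mid"] * 3 + ["min"] * 2
balanced = undersample_to_balance(X, y)
print(Counter(label for _, label in balanced))
```

Undersampling discards majority-class examples, which is one source of the "loss of useful information" that motivates more elaborate approaches.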