Khalid Saleem | Quaid-i-Azam University, Islamabad, Pakistan
Papers by Khalid Saleem
Today schema matching is a basic problem in almost every data intensive distributed application, namely enterprise information integration, collaborating web services, ontology based agents communication, web catalogue integration and schema based P2P database systems. There has been a plethora of algorithms and techniques researched in schema matching and integration for data interoperability. Numerous surveys have been presented in the past to summarize this research. The need to extend these surveys arises from the increasingly dynamic nature of these data intensive applications. Today data is viewed as a semantic entity, motivating new algorithms and strategies. Evolving large scale distributed information systems are further pushing schema matching research to utilize processing power not available in the past, directly increasing the industry's investment in the matching domain. This article reviews the latest ...
Proceedings of the 2nd International Conference on Information System and Data Mining - ICISDM '18
The paper describes the use of Deep Convolution Neural Networks (DCNN) for the recognition of Snow Leopards from a data set of photos taken in the wild. The data set comprises 1500 images, captured in the Himalayas using motion sensing cameras. Besides the Snow Leopard, the images contain numerous living species, ranging from a butterfly to a human being. For the training phase we divided the data set into two classes, Snow Leopard and Other Animals. The Snow Leopard class contains photos showing one or more animals, from different angles, at varying sizes and with partial body views due to distance from the camera, against several backgrounds. The photos are converted to 200 x 200 grayscale images in the preprocessing phase. A 5-layer DCNN, consisting of 3 convolutional and 2 fully connected layers, is employed for the experimental setup. The Rectified Linear Unit (ReLU) is used as the activation function in the fully connected layers and the softmax function is applied for classification. The evaluation of the system shows an overall accuracy of 91%, along with a sensitivity of 0.90 and a specificity of 0.88 for Snow Leopard class identification.
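A minimal PyTorch sketch of such an architecture follows; the channel widths, kernel sizes, and pooling steps are assumptions, since the abstract specifies only the 3-convolutional-plus-2-fully-connected layout, ReLU activations, softmax classification, and the 200 x 200 grayscale input.

```python
import torch
import torch.nn as nn

class SnowLeopardNet(nn.Module):
    """5-layer DCNN: 3 convolutional + 2 fully connected layers,
    for 200x200 grayscale images and two classes (Snow Leopard / Other Animals)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 200 -> 100
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 100 -> 50
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 50 -> 25
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 25 * 25, 128), nn.ReLU(),  # ReLU in the fully connected layer
            nn.Linear(128, 2),                        # logits; softmax applied at inference
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = SnowLeopardNet()
logits = model(torch.randn(4, 1, 200, 200))   # batch of 4 grayscale images
probs = torch.softmax(logits, dim=1)          # softmax for classification
print(probs.shape)                            # torch.Size([4, 2])
```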
Advances in Intelligent Systems and Computing
Wireless sensor networks (WSNs) rely on effective deployment of sensing nodes. Efficient sensor deployment with ensured connectivity is a major challenge in WSNs. Several deployment approaches have been proposed in the literature to address the connectivity and efficiency of sensor networks. However, most of these works either lack efficiency or ignore connectivity issues. In this paper, we propose an efficient and connectivity-based algorithm by modifying Ant Colony Optimization (ACO) (Liu and He, J Netw Comput Appl 39:310–318, 2014). Traditional ACO algorithms ensure coverage at a high cost and with repetitive sensing, which results in resource wastage. Our proposed algorithm reduces the sensing cost with efficient deployment and enhanced connectivity. Simulation results indicate the ability of the proposed framework to significantly reduce the coverage cost as well as achieve a longer lifetime for WSNs.
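A toy sketch of the underlying ACO idea: ants pick sensor positions guided by pheromone, and a fitness that rewards unique coverage while penalizing repetitive sensing reinforces good placements. The grid, fitness form, and all constants are illustrative assumptions, not the paper's model.

```python
import random

CANDIDATES = [(x, y) for x in range(10) for y in range(10)]   # candidate sensor sites
TARGETS = [(random.uniform(0, 9), random.uniform(0, 9)) for _ in range(30)]
RANGE2, N_SENSORS, ANTS, ITERS, RHO = 2.0 ** 2, 8, 20, 50, 0.1

def covered(pos, t):
    return (pos[0] - t[0]) ** 2 + (pos[1] - t[1]) ** 2 <= RANGE2

def fitness(solution):
    # reward unique coverage, penalize overlap (repetitive sensing)
    hits = [sum(covered(p, t) for p in solution) for t in TARGETS]
    unique = sum(h > 0 for h in hits)
    overlap = sum(max(0, h - 1) for h in hits)
    return unique - 0.5 * overlap

pheromone = {c: 1.0 for c in CANDIDATES}
best, best_fit = None, float("-inf")
for _ in range(ITERS):
    for _ in range(ANTS):
        weights = [pheromone[c] for c in CANDIDATES]
        sol = random.choices(CANDIDATES, weights=weights, k=N_SENSORS)
        f = fitness(sol)
        if f > best_fit:
            best, best_fit = sol, f
    for c in pheromone:                               # evaporation
        pheromone[c] *= (1 - RHO)
    for c in best:                                    # reinforce the best placement found
        pheromone[c] += max(best_fit, 0.1) / N_SENSORS

print(best_fit, best)
```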
PeerJ Computer Science
Wireless networks face challenges in efficient utilization of bandwidth due to paucity of resources and lack of central management, which may result in undesired congestion. The cognitive radio (CR) paradigm can bring efficiency, better utilization of bandwidth, and appropriate management of limited resources. While the CR paradigm is an attractive choice, CRs selfishly compete to acquire and utilize available bandwidth, which may ultimately result in inappropriate power levels, causing degradation in the network's Quality of Service (QoS). A cooperative game theoretic approach can ease the problem of spectrum sharing and power utilization in a hostile and selfish environment. We focus on the challenge of congestion control that results in inadequate and uncontrolled access of channels and utilization of resources. The Nash equilibrium (NE) of a cooperative congestion game is examined by considering the cost basis, which is embedded in the utility function. The proposed algorithm ...
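A toy best-response sketch of a channel-congestion game of this kind: each radio picks the channel maximizing its utility, here a capacity share minus a congestion cost; the utility form and all constants are illustrative assumptions, not the paper's model. Iterating until no radio can improve yields a Nash equilibrium.

```python
import random

N_RADIOS, CHANNELS, CAPACITY, COST = 12, 3, 10.0, 0.8
choice = [random.randrange(CHANNELS) for _ in range(N_RADIOS)]

def utility(load):
    return CAPACITY / load - COST * load   # throughput share minus congestion cost

changed = True
while changed:                             # stop when no radio can improve: a Nash equilibrium
    changed = False
    for i in range(N_RADIOS):
        loads = [choice.count(c) for c in range(CHANNELS)]
        def u_if(c):
            # load on channel c if radio i moved there
            return utility(loads[c] + (0 if choice[i] == c else 1))
        best = max(range(CHANNELS), key=u_if)
        if u_if(best) > u_if(choice[i]) + 1e-9:
            choice[i], changed = best, True

print([choice.count(c) for c in range(CHANNELS)])   # near-even channel loads at equilibrium
```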
Proceedings of the 5th International Conference on Communication and Information Processing
Countries' development is mostly judged and categorized according to economic conditions, human development, and technology development. Ranking tools like the e-Government index by the United Nations (UN) are used by policy makers when drafting information and communication policies and allocating the resources to implement them. Despite their wide use, current e-Government ranking tools have some limitations. For example, the UN e-Government index does not handle changing technology enhancements in ICT. In this paper we assessed the UN e-Government index and its components, and identified and added social media as a new parameter to the UN e-Government index, using data collected from the government websites of the 193 UN member states. The alternate UN e-Government index responds to the needs of enhancements and expansions in Information and Communication Technology.
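A small sketch of how a composite index can be extended with a social-media component. The equal four-way weighting and the 0-100 raw scale are assumptions for illustration; the abstract does not give the paper's exact formula.

```python
# Extend a composite e-Government index with a hypothetical social-media
# sub-index (smi). Equal weights are an assumption, not the paper's formula.

def normalize(value, lo, hi):
    """Scale a raw score into [0, 1]."""
    return (value - lo) / (hi - lo)

def extended_index(osi, tii, hci, smi):
    # osi/tii/hci: existing normalized sub-indices; smi: new social-media component
    return (osi + tii + hci + smi) / 4.0

# e.g. a country with a raw social-media score of 62 on a 0-100 scale
smi = normalize(62, 0, 100)
print(extended_index(0.85, 0.60, 0.72, smi))
```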
Advances in Intelligent Systems and Computing
Researchers are devising new ways for robust digital content delivery in situations where telecommunication signal strength is very low, especially during natural disasters. In this paper, we present research work targeting two dimensions: (a) we selected the IANA standard for digital content classification, 20 types in 5 categories, and applied and compared six different lossless compression schemes (LZW, Huffman coding, PPM, Arithmetic Coding, BWT and LZMA) on these 20 data types; (b) we built a generic prototype application which encodes (for sending) and decodes (on receiving) the compressed digital content over SMS. Sending digital content via SMS over satellite communication is achieved by converting the digital content into text, applying lossless compression to the text, and transmitting the compressed text by SMS. The proposed method requires neither an Internet service nor any additional hardware in the existing network architecture to transmit digital content. Results show that overall the PPM compression method offers the best compression ratio (0.63) among all compression schemes. Thus PPM yields SMS transmission savings of up to 43%, while LZW performs the least with 17.6%.
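A sketch of the ratio comparison, using codecs from the Python standard library as stand-ins (LZW, Huffman, PPM, and arithmetic coding need third-party implementations). Ratio is taken as compressed size over original size, matching the paper's "smaller is better" 0.63 figure.

```python
import bz2
import lzma
import zlib

def ratios(data: bytes):
    """Compression ratio (compressed/original) per codec; smaller is better."""
    codecs = {
        "zlib (LZ77+Huffman)": zlib.compress(data, 9),
        "bz2 (BWT-based)": bz2.compress(data, 9),
        "lzma (LZMA)": lzma.compress(data),
    }
    return {name: len(out) / len(data) for name, out in codecs.items()}

sample = ("some digital content converted to text for SMS transport " * 200).encode()
for name, r in ratios(sample).items():
    print(f"{name}: ratio={r:.2f}, SMS saving={1 - r:.0%}")
```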
Semantic matching of schemas in heterogeneous data sharing systems is time consuming and error prone. Existing mapping tools employ semi-automatic techniques for mapping two schemas at a time. In a large-scale scenario, where data sharing involves a large number of data sources, such techniques are not suitable. In this paper we present a method which creates a mediated schema tree from a large set of input schema trees and defines mappings from the contributing schemas to the mediated schema. It is a two-phase approach. First, we use a set of linguistic matchers, which extract the semantics of all distinct node labels present in the input schemas and form clusters of semantically similar labels. Second, we use a tree-mining data structure, combined with the similar label clusters, to calculate the context of each node, which is used in mapping. Since the input schemas are trees, our tree mining algorithm uses node ranks calculated by pre-order traversal. Tree mining combined with semantic la...
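A minimal sketch of the pre-order ranking step, where each node's "context" is simplified to its ancestor label path (the paper's tree-mining structure is richer) and node labels are assumed unique.

```python
class Node:
    def __init__(self, label, children=()):
        self.label, self.children = label, list(children)

def preorder_rank(root):
    """Assign ranks by pre-order traversal and record each node's ancestor path."""
    ranks, contexts = {}, {}
    def visit(node, path):
        ranks[node.label] = len(ranks) + 1           # rank = order of first visit
        contexts[node.label] = path + [node.label]   # ancestor path as node context
        for child in node.children:
            visit(child, contexts[node.label])
    visit(root, [])
    return ranks, contexts

schema = Node("book", [Node("title"), Node("author", [Node("first"), Node("last")])])
ranks, contexts = preorder_rank(schema)
print(ranks)              # {'book': 1, 'title': 2, 'author': 3, 'first': 4, 'last': 5}
print(contexts["last"])   # ['book', 'author', 'last']
```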
Modularity is generally used as a measure for determining the goodness of a clustering. In this paper, we performed an analysis of a coauthorship network. The authors belong to the field of biology and are affiliated with research and academic institutes from all over the world. We took different centrality measures, such as betweenness, degree, PageRank, and closeness, and ranked authors on the basis of these measures. We also found cohesive groups of authors and detected communities in the form of clusters, using community detection methods such as edge betweenness, walktrap, and label propagation. Our results indicate that the different methods identified communities close to one another and clustered the authors of highly collaborating ...
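A sketch of this pipeline on a toy graph using networkx, which provides the centralities, label propagation, and modularity scoring used above (walktrap lives in python-igraph; girvan_newman is networkx's edge-betweenness method). The karate club graph stands in for the coauthorship network.

```python
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()   # stand-in for the coauthorship network

# Rank authors by several centrality measures
centralities = {
    "degree": nx.degree_centrality(G),
    "betweenness": nx.betweenness_centrality(G),
    "closeness": nx.closeness_centrality(G),
    "pagerank": nx.pagerank(G),
}
for name, scores in centralities.items():
    top = max(scores, key=scores.get)
    print(f"top author by {name}: {top} ({scores[top]:.3f})")

# Detect communities and score the partition with modularity
communities = list(community.label_propagation_communities(G))
print("communities:", [sorted(c) for c in communities])
print("modularity:", community.modularity(G, communities))
```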
The match cardinality aspect of schema matching is categorized as simple element level matching and complex structural level matching. Simple matching comprises 1:1, 1:n and n:1 match cardinality, whereas n:m match cardinality is considered complex matching. Most existing approaches and tools deliver good 1:1 local and global match cardinality but lack the capabilities for handling the complex cardinality issue. In this paper we demonstrate an automatic approach for the creation and validation of n:m schema mappings. Our technique is applicable to hierarchical structures like XML Schema. The basic idea is to propose an n:m node mapping between the children (leaf nodes) of two matching non-leaf nodes of two schemas. The similarity computation of the two non-leaf nodes is based upon the syntactic and linguistic similarity of node labels, supported by the similarity among the ancestral paths from the nodes to the root. The n:m mapping proposition is then verified with the help of mini-taxonomies...
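A toy sketch of the proposal step: if two non-leaf nodes score high on label similarity plus ancestor-path similarity, their leaf children are proposed as an n:m group. The string-similarity measure, weights, and threshold are assumptions, and the mini-taxonomy validation step is omitted.

```python
from difflib import SequenceMatcher

def label_sim(a, b):
    """Syntactic similarity of two labels in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def path_sim(path_a, path_b):
    """Average similarity of the aligned ancestor paths (root to node)."""
    sims = [label_sim(x, y) for x, y in zip(path_a, path_b)]
    return sum(sims) / max(len(path_a), len(path_b))

def propose_nm(node_a, leaves_a, path_a, node_b, leaves_b, path_b, threshold=0.6):
    # weights 0.7/0.3 and the threshold are illustrative assumptions
    score = 0.7 * label_sim(node_a, node_b) + 0.3 * path_sim(path_a, path_b)
    return (leaves_a, leaves_b) if score >= threshold else None

# a "name" node with two children vs. a "fullname" node with two children
mapping = propose_nm("name", ["first", "last"], ["person", "name"],
                     "fullname", ["givenName", "surname"], ["contact", "fullname"])
print(mapping)   # (['first', 'last'], ['givenName', 'surname']) -> a 2:2 mapping
```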
Scientific Reports
Block ciphers have been among the most reliable means by which data security is accomplished. A block cipher's strength against various attacks relies on its substitution boxes. In the literature, algebraic structures and chaotic systems-based techniques are widely available for designing cryptographic substitution boxes. Although algebraic and chaotic systems-based approaches have favorable characteristics for the design of substitution boxes, researchers have also pointed out weaknesses in these approaches. For the first time, multilevel information fusion is introduced to construct substitution boxes, with four layers: Multi Sources, Multi Features, Nonlinear Multi Features Whitening, and Substitution Boxes Construction. Our proposed design does not inherit the weaknesses of algebraic structures and chaotic systems, because our novel s-box construction relies on the strength of true random numbers. In our proposed method, true random numbers are generated from the i...
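A sketch of building a bijective 8-bit s-box from random bytes via a Fisher-Yates shuffle; os.urandom stands in for the true random source the paper uses, and the four fusion layers are not reproduced here.

```python
import os

def random_sbox():
    """Bijective 8-bit s-box: a random permutation of 0..255."""
    box = list(range(256))
    for i in range(255, 0, -1):            # Fisher-Yates driven by random bytes
        j = int.from_bytes(os.urandom(4), "big") % (i + 1)
        box[i], box[j] = box[j], box[i]
    return box

sbox = random_sbox()
assert sorted(sbox) == list(range(256))    # bijective: every output occurs exactly once
print("fixed points:", sum(i == v for i, v in enumerate(sbox)))
print(sbox[:8])
```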
In this paper, we describe a novel method for imputing weather temperature data. The technique is targeted toward unit imputation of a missing surface temperature value, for a specific weather station, on a specific date. The imputation method relies solely on the available daily maximum temperature data set of the weather station. We propose a hybrid approach, K-Nearest Temperature Trends (KNTT), which identifies a cluster of K years showing the nearest temperature trends to the year of the missing value's date. Next, the missing temperature value is imputed by taking the average of the values for the same date across the identified K-Trends years. We used the data set of temperature values from 38 weather stations of Pakistan, spanning 30 years (1980-2010), for our experiments. We evaluated our methodology using ME, MAE, and RMSE; the results show that our technique imputes accurately, with a lower error rate than the standard KNN technique.
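A sketch of the KNTT idea: find the K years whose daily-maximum series is closest to the year containing the gap, then impute the missing day as the mean of those years' values on the same day. The RMSE trend distance is an assumed choice; the paper's exact trend measure is not given in the abstract.

```python
import numpy as np

def kntt_impute(temps, target_year, day_index, k=3):
    """temps: dict year -> np.array of 365 daily maximum temperatures."""
    target = temps[target_year]
    mask = ~np.isnan(target)               # compare only on observed days
    dists = {
        y: np.sqrt(np.nanmean((series[mask] - target[mask]) ** 2))
        for y, series in temps.items() if y != target_year
    }
    nearest = sorted(dists, key=dists.get)[:k]          # the K nearest-trend years
    return float(np.mean([temps[y][day_index] for y in nearest]))

rng = np.random.default_rng(0)
temps = {y: 25 + 10 * np.sin(np.linspace(0, 2 * np.pi, 365)) + rng.normal(0, 1, 365)
         for y in range(1980, 1990)}
temps[1985][200] = np.nan                  # the missing observation
print(kntt_impute(temps, 1985, 200))
```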
Semantic matching of schemas in heterogeneous data sharing systems is time consuming and error prone. The dissertation presents a new robust automatic method which integrates a large set of domain specific schemas, represented as tree structures, based upon the semantic correspondences among them. The method also creates the mappings from the source schemas to the integrated schema. Secondly, the report gives an automatic technique to compute complex matchings between two schemas. Existing mapping tools employ semi-automatic techniques for mapping two schemas at a time. In a large-scale scenario, where data sharing involves a large number of data sources, such techniques are not suitable. Semi-automatic matching requires user intervention to finalize a certain mapping. Although it provides the flexibility to compute the best possible mapping, it degrades the time performance of the whole matching process. At first, the dissertation gives a detailed discussion of the state of the art in schema m...
The recent proliferation of multimedia information on the web has expanded user information needs from simple textual lookup to multi-modal exploration activities. Current search engines act as major gateways to access the immense amount of multimedia data. However, access to multimedia content is provided by aggregating disjoint multimedia search verticals. Such aggregation cannot consider the relationships among the multimedia search results, which remain only partially blended. Additionally, the search results are presented via linear lists, which cannot support the users' non-linear navigation patterns for exploring multimedia search results. Meanwhile, users are demanding more services from search engines, including adequate access to navigate, explore, and discover multimedia information. Our discovery approach allows users to explore and discover multimedia information by semantically aggregating disjoint verticals using sentence embeddings and transforming snippets into conce...
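A sketch of the semantic-aggregation step using the sentence-transformers library: embed result snippets from disjoint verticals and link cross-vertical results whose cosine similarity exceeds a threshold. The model choice and threshold are assumptions, not the paper's configuration.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")   # assumed embedding model

# (vertical, snippet) results from disjoint search verticals
results = [
    ("image", "Snow leopard photographed in the Himalayas"),
    ("video", "Documentary clip: snow leopards hunting at altitude"),
    ("news",  "Stock markets close higher on tech rally"),
]
embeddings = model.encode([snippet for _, snippet in results])
sims = util.cos_sim(embeddings, embeddings)

# link semantically related results across different verticals
for i in range(len(results)):
    for j in range(i + 1, len(results)):
        if results[i][0] != results[j][0] and float(sims[i][j]) > 0.5:
            print(f"link {results[i][0]}#{i} <-> {results[j][0]}#{j} "
                  f"({float(sims[i][j]):.2f})")
```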
Today schema matching is a basic task in almost every data intensive distributed application, namely enterprise information integration, collaborating web services, ontology based agents communication, web catalogue integration and schema based P2P database systems. There has been a plethora of algorithms and techniques researched in schema matching and integration for data interoperability. Numerous surveys have been presented in the past to summarize this research. The need to extend these surveys arises from the increasingly dynamic nature of these data intensive applications. Indeed, evolving large scale distributed information systems are further pushing schema matching research to utilize processing power not available in the past, directly increasing the industry's investment in the matching domain. This article reviews the latest application domains in which schema matching is being utilized. The paper gives a detai...
Multimedia Tools and Applications, 2021
Nowadays, web users frequently explore multimedia content to satisfy their information needs. Existing exploration approaches usually provide linear interaction mechanisms and do not exploit the multiple information modalities associated with results; they cannot treat multimedia documents as aggregated entities. The aggregation of results into multimedia documents, and nonlinear navigation within them, is usually not possible, and exploring multimedia content segregated into multiple verticals is tedious. In this research, we propose an approach to address these core issues in multimedia content exploration. We provide nonlinear and multimodal exploration of multimedia document results. We generate result spaces by exploiting multimodal similarity and semantic relationships among results, and enable their nonlinear and multimodal exploration via a search user interface (SUI) design. The result space connects retrieved multimedia documents and their aggregated media objects via multimodal simil...
Expert Systems with Applications
IEEE Access
In recent years, users' trust has gained attention in recommender systems. Trust plays a vital role in the recommendation of online products. Trust is a dynamic feature which evolves with the passage of time and varies from person to person. Trust-based cross domain recommender systems suggest items to users, usually based on ratings provided by similar users, which are often not available in the same domain. However, due to sparse rating scores, recommender systems cannot generate up-to-the-mark recommendations. In this research, we address the user cold start problem, mainly by modeling preference drift on a temporal basis. We approach this problem by adopting the 'No Overlap' cross domain scenario, using cross domain information. In this work, we propose a model called Trust Aware Cross Domain Temporal Recommendations (TrustCTR) that predicts the rating of an item for an active user from the most recent time. We generated user features and item features using a latent factor model and trained the proposed model. We also introduced the concept of trust relevancy, which expresses the degree of trust; computed the trusted neighbors in the target domain for an active user belonging to a source domain; and predicted the ratings of items for cold start users. We performed experiments on the public datasets Ciao and Epinions, used in cross domain form with the categories of Ciao as the source domain and Epinions as the target domain. We selected five different domains with a higher proportion of rating sparsity for observing the performance of our approach using MAE, RMSE, and F-measure. Our approach is a viable solution to the cold start problem and offers effective recommendations. We also compared the model with state-of-the-art methods; the model generates satisfactory results.
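A toy sketch of trust-weighted rating prediction for a cold-start user: blend a latent-factor estimate with ratings from trusted neighbors in the target domain, weighted by a trust-relevancy score. The blend factor, dimensions, and toy data are illustrative assumptions, not the TrustCTR model itself.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, dim = 5, 4, 3
U = rng.normal(0, 0.3, (n_users, dim))     # user latent factors
V = rng.normal(0, 0.3, (n_items, dim))     # item latent factors
mu = 3.5                                   # global rating mean

trust = {0: {1: 0.9, 3: 0.4}}              # trust relevancy of user 0's neighbors
ratings = {(1, 2): 4.0, (3, 2): 2.5}       # observed (user, item) -> rating

def predict(u, i, alpha=0.5):
    """Blend a latent-factor estimate with trust-weighted neighbor ratings."""
    latent = mu + U[u] @ V[i]
    neighbors = [(t, ratings[(v, i)]) for v, t in trust.get(u, {}).items()
                 if (v, i) in ratings]
    if not neighbors:                      # no trusted neighbor rated the item
        return latent
    social = sum(t * r for t, r in neighbors) / sum(t for t, _ in neighbors)
    return alpha * latent + (1 - alpha) * social

print(round(predict(0, 2), 2))             # rating of item 2 for cold-start user 0
```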
Journal of Medical Imaging and Health Informatics
IEEE Access
Massive advances in internet infrastructure are impacting e-healthcare services compared to conventional means. Therefore, extra care and protection are needed for extremely confidential patient medical records. With this intention, we have proposed an enhanced image steganography method to improve the imperceptibility and data hiding capacity of stego images. The proposed Image Region Decomposition (IRD) method embeds more secret information, with better imperceptibility, in patients' medical images. The algorithm decomposes grayscale magnetic resonance imaging (MRI) images into three unique regions: low-intensity, medium-intensity, and high-intensity. Each region is made up of k pixels, and in each pixel we operate on a block of n least significant bits (LSBs), where 1 ≤ n ≤ 3. Four classes of MRI images of different dimensions are used for embedding. Data of different volumes are used to test the images for imperceptibility, verified with quality factors. The proposed IRD algorithm is tested for performance on a set of brain MRI images using peak signal-to-noise ratio (PSNR), mean square error (MSE), and the structural similarity (SSIM) index. The results show that the MRI stego image remains imperceptible, like the original cover image, when adjusting the 2nd and 1st LSBs in the low-intensity region. Our proposed steganography technique provides a better average PSNR (49.27 dB) than other similar methods. The empirical results show that the proposed IRD algorithm significantly improves imperceptibility and data embedding capacity compared to existing state-of-the-art methods.
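A sketch of region-based LSB embedding in this spirit: split a grayscale image into three intensity regions, vary how many least-significant bits carry payload per region, and check imperceptibility with PSNR. The thresholds and bits-per-region used here are assumptions; the paper tunes them per region (e.g. 2 LSBs in the low-intensity region).

```python
import numpy as np

def embed(cover: np.ndarray, payload_bits: list) -> np.ndarray:
    """Embed payload bits into region-dependent numbers of LSBs."""
    stego, bit = cover.copy(), 0
    bits_for = lambda px: 3 if px < 85 else (2 if px < 170 else 1)  # assumed thresholds
    flat = stego.ravel()                    # view: writes go into stego
    for idx, px in enumerate(flat):
        if bit >= len(payload_bits):
            break
        chunk = payload_bits[bit:bit + bits_for(px)]
        value = int("".join(map(str, chunk)), 2)
        k = len(chunk)
        flat[idx] = (int(px) >> k << k) | value   # clear k LSBs, insert payload
        bit += k
    return stego

def psnr(a, b):
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(255 ** 2 / mse)

cover = np.random.default_rng(2).integers(0, 256, (64, 64), dtype=np.uint8)
bits = list(np.random.default_rng(3).integers(0, 2, 5000))
stego = embed(cover, bits)
print(f"PSNR: {psnr(cover, stego):.2f} dB")   # higher PSNR = less perceptible change
```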