sailaja kumar - Academia.edu
Papers by sailaja kumar
Advanced Science Letters, 2011
Computer Systems Science and Engineering
Standalone systems cannot handle the giant traffic loads generated by Twitter due to memory constraints. A parallel computational environment provided by Apache Hadoop can distribute and process the data over different destination systems. In this paper, a four-node Hadoop cluster integrated with RHadoop, Flume, and Hive is created to analyze the tweets gathered from the Twitter stream. Twitter stream data relevant to events/topics such as IPL-2015, cricket, Royal Challengers Bangalore, Kohli, and Modi is collected from May 24 to 30, 2016 using Flume. Hive is used as a data warehouse to store the streamed tweets. Twitter analytics such as the maximum number of tweets by users, the average number of followers, and the maximum number of friends are obtained using Hive. The network graph is constructed from each user's unique screen name and mentions using 'R'. A timeline graph of individual users is also generated using 'R'. In addition, the proposed solution analyses the emotions of cricket fans by classifying their Twitter messages into appropriate emotional categories using the optimized support vector neural network (OSVNN) classification model. To attain better classification accuracy, the performance of SVNN is enhanced using a chimp optimization algorithm (ChOA). Extracting the users' emotions toward an event is beneficial for prediction, and it becomes more powerful when coupled with visualizations. A bar chart and a word cloud are generated to visualize the emotional analysis results.
Indonesian Journal of Electrical Engineering and Computer Science, 2019
International Journal of Computer Applications, 2016
In a distributed environment, composite web services face enormous changes and challenges in their network environment. The primary issue in these systems is to distribute the load among the various components of the composite web services so as to improve performance by minimizing the response time. In this paper, we present a methodology to estimate the load sharing among composite web services using the traditional workflow concept combined with a game theory approach. In addition, we have developed a simulation model and observed that the results are promising for a better understanding of workload sharing among individual web services. We obtained a stable solution that ensures the cooperation of the web services, which further helps in capacity planning in a composite web services network environment.
Advances in Intelligent Systems and Computing, 2022
Nowadays, mail services are used extensively for communication. Due to the widespread demand for mail services, mail servers may suffer performance degradation. Performance is an open issue that is affected by many factors, including technical ones. To identify the factors that have an impact on the performance of mail services, we have carried out an experimental study focusing on a few prominent mail services. In this paper, the results of the experimental study and the results obtained from an activity-based performance prediction approach are compared and discussed. Regression analysis is used for the comparison, and the obtained value shows that the two sets of results are close to each other.
2016 International Conference on Circuits, Controls, Communications and Computing (I4C), 2016
Online Social Networks (OSNs) are the most powerful medium for individuals to communicate and share their valuable thoughts. ‘Twitter’ is one of the most popular OSNs and is rich in public data/tweets. In this paper we used the Twitter Streaming API through ‘streamR’, a package for the ‘R’ statistical programming language, to extract real-time tweets from Twitter. A tweet has many attributes which can be further analyzed to find the most significant information about the Twitter user. We considered three attributes: screen-name, follower-count and friend-count. Twitter data has scaled up from gigabytes to petabytes, and a standalone system cannot withstand or process this huge volume of data due to hardware constraints. We used the prevailing parallel computing environment provided by ‘Hadoop’ with the ‘Python’ programming language to analyze the Twitter users whose follower-count and friend-count are less than 5000. We identified the user with maximum follower-count as the influential user who can ...
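The selection step described in this abstract (keep users whose follower-count and friend-count are both below 5000, then pick the one with the highest follower-count) can be illustrated with a minimal single-process sketch. This is not the paper's Hadoop job: the `TwitterUser` record, the function name, and the sample users are hypothetical, and the paper's actual code is not shown in the source.

```python
from typing import Iterable, NamedTuple, Optional


class TwitterUser(NamedTuple):
    """Hypothetical record holding the three attributes named in the abstract."""
    screen_name: str
    follower_count: int
    friend_count: int


# Threshold taken from the abstract: both counts must be below 5000.
LIMIT = 5000


def most_followed_under_limit(
    users: Iterable[TwitterUser], limit: int = LIMIT
) -> Optional[TwitterUser]:
    """Return the eligible user with the highest follower count, or None."""
    eligible = [
        u for u in users
        if u.follower_count < limit and u.friend_count < limit
    ]
    return max(eligible, key=lambda u: u.follower_count, default=None)


# Made-up sample data, for illustration only.
sample_users = [
    TwitterUser("alice", 1200, 300),
    TwitterUser("bob", 4800, 150),
    TwitterUser("carol", 9000, 20),   # excluded: follower_count is not below 5000
    TwitterUser("dave", 4900, 7000),  # excluded: friend_count is not below 5000
]

top = most_followed_under_limit(sample_users)
print(top.screen_name if top else "no eligible user")
```

On a cluster, the filter would typically run in the map step and the maximum would be taken in the reduce step; the sketch keeps both in one function for readability.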
World Academy of Science, Engineering and Technology, International Journal of Educational and Pedagogical Sciences, 2016
Journal of Engineering and Technology, 2015
Sharing opinions through social media on various historical events has become very popular among millions of internet users. In general, online social networks (OSNs) help their users obtain accurate and up-to-date information about various historical events. 'Twitter', the most popular OSN, is a micro-blogging platform used for sending and receiving messages. This paper focuses on the role of Twitter in updating its followers with information about mission-critical, time-sensitive events through the communication opportunities offered by the Internet. An experimental study was conducted to collect Twitter data on the space research organization's event 'Mangalyaan', and around 1030 tweets were examined based on the keywords related to that event. The statistical software tool 'R' is used for text analysis to identify the most frequent words and the significance of those words in the target event.
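The frequent-word step mentioned in this abstract can be sketched as follows. The paper used 'R'; this illustration swaps in Python, and the tokenizer, the small stop-word list, and the sample tweets are all assumptions made for the example rather than details from the paper.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "is", "for"}


def most_frequent_words(
    tweets: Iterable[str], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Count lower-cased words across tweets, ignoring stop words."""
    counts: Counter = Counter()
    for tweet in tweets:
        words = re.findall(r"[a-z']+", tweet.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(top_n)


# Made-up tweets, for illustration only.
sample_tweets = [
    "Mangalyaan enters Mars orbit",
    "Proud of the Mangalyaan mission to Mars",
    "Mangalyaan mission is a success",
]

for word, count in most_frequent_words(sample_tweets):
    print(word, count)
```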
The analysis of Online Social Networks (OSNs) data is an emerging field involving sociology, statistics, and graph theory. Regression Discontinuity Design (RDD) is a quasi-experimental research design widely used in the social, behavioral and related sciences. In this paper, we propose a methodology to analyze data from the most popular micro-blogging OSN, ‘Twitter’. The methodology is implemented using the ‘R’ statistical tool. The tweets related to the ‘Mangalyaan’ event, India’s Mars Orbiter Mission launched on 5 November 2013 by the Indian Space Research Organization, are analyzed. The Twitter users who are expressive/non-expressive on this event are examined. In particular, the pattern in the users’ responses to this event is identified, which helps in predicting Twitter users’ social behavior and their involvement in similar events. The most frequent words reflecting relevance to this event are visualized. The visual results are helpful to understand ...
International journal of engineering research and technology, 2018
Online Social Networks (OSNs), an emerging multidisciplinary research field, have become an important element of the information society. OSNs provide a basis for maintaining social relationships, finding users with similar interests, and locating content and knowledge that has been contributed or endorsed by other users. The key aspect of many OSNs is that they are rich in data and provide unprecedented challenges and opportunities from the perspective of knowledge discovery and data mining. This survey paper reviews the current state of the art on selected key challenges in OSNs, such as data gathering techniques, heterogeneity, scalability and missing data. The review is intended to help the research community and also suggests that significant further research is required in this area.
Journal of Computational and Theoretical Nanoscience
Analyzing the heterogeneous data generated by social networking sites is a research challenge. Twitter is a massive social networking site. In this paper, a methodology is devised for processing the heterogeneous data, which helps in categorizing the data obtained from Twitter into different directories and understanding the text data explicitly. The methodology is implemented using the Python programming language. Python’s tweepy package is used to download the Twitter stream data, which includes images, videos and text data. Python’s Aylien API is used for analyzing the Twitter text data. Using this API, a sentiment analysis report is generated. Using Python’s matplotlib package, a pie chart is generated to visualize the sentiment analysis results. Further, an algorithm is proposed for sentiment analysis, which not only categorizes the tweets into positive, negative and neutral (as Aylien API does), but also categorizes the tweets into strongly and weakly, positive and negative based on ...
2016 International Conference on Recent Trends in Information Technology (ICRTIT), 2016
International Conference on Circuits, Communication, Control and Computing, 2014
Indonesian Journal of Electrical Engineering and Computer Science
In the era of rapid growth of cloud computing, performance calculation of a cloud service is an essential criterion for assuring quality of service. Nevertheless, it is a perplexing task to effectively analyze the performance of a cloud service due to the complexity of cloud resources and the diversity of Big Data applications. Hence, we propose to examine the performance of Big Data applications with Hadoop and thus to determine the performance in a cloud cluster. Hadoop is built on MapReduce, one of the widely used programming models in Big Data. In this paper, the performance analysis of the Hadoop MapReduce WordCount application for Twitter data is presented. A 4-node in-house Hadoop cluster was set up and an experiment was carried out to analyze the performance. Through this work, it was concluded that Hadoop is efficient for Big Data applications with 3 or more nodes and a replication factor of 3. Also, it was observed that system time was relatively higher than user time for Big Data applications beyond 80 GB. This experiment also revealed certain patterns in the actual data blocks used to process the WordCount application.
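For readers unfamiliar with WordCount, the map, shuffle, and reduce phases can be sketched in plain Python as below. This is an in-memory illustration of the programming model only, not the Hadoop job benchmarked in the paper; the function names and the sample lines are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key (the word)."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for word, value in pairs:
        grouped[word].append(value)
    return grouped


def reduce_phase(word: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: sum the grouped values to get the total count per word."""
    return word, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, vals) for w, vals in shuffle(pairs).items())


# Made-up input lines, for illustration only.
sample_lines = ["Hadoop counts words", "hadoop counts tweets"]
print(word_count(sample_lines))
```

In Hadoop the map and reduce functions run on different nodes and the framework performs the shuffle between them, which is where the cluster size and replication factor discussed in the abstract come into play.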