Joy Bose - Profile on Academia.edu
Papers by Joy Bose
Identification of Inefficient Radios for Efficient Energy Consumption in a Mobile Network
COMSNETS, 2024
In a mobile network, it is important to identify energy-inefficient RRUs (Remote Radio Units) to improve the overall energy efficiency of the network and achieve significant energy and cost savings. Existing solutions can identify inefficient RRUs based on hardware alarms or faults, but not based on energy consumption in real time for a given region. In this paper, we propose a network energy consumption model and a method to identify inefficient RRUs with respect to power consumption in real time. Our method involves ML models trained on historical performance management (PM) counter data to predict RRU energy consumption; RRUs with a higher divergence between predicted and actual energy consumption are then identified as inefficient. The system is trained and tested with simulated data based on a major network operator.
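The divergence idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: a one-feature least-squares line stands in for the paper's ML models, a single "load" value stands in for the PM counters, and the names `fit_line`, `flag_inefficient` and the threshold `k` are hypothetical, not taken from the paper.

```python
import random
from statistics import mean, pstdev

def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept (stand-in model)."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def flag_inefficient(model, resid_std, load, energy, k=5.0):
    """Return indices of RRUs whose actual energy exceeds the prediction
    by more than k residual standard deviations."""
    slope, intercept = model
    return [i for i, (l, e) in enumerate(zip(load, energy))
            if e - (slope * l + intercept) > k * resid_std]

# Synthetic "historical" data: energy grows linearly with load, plus noise.
random.seed(0)
hist_load = [random.uniform(0, 100) for _ in range(500)]
hist_energy = [2.0 * l + 50 + random.gauss(0, 1) for l in hist_load]
model = fit_line(hist_load, hist_energy)
residuals = [e - (model[0] * l + model[1]) for l, e in zip(hist_load, hist_energy)]
resid_std = pstdev(residuals)

# Synthetic "current" snapshot with two RRUs drawing far more than predicted.
cur_load = [random.uniform(0, 100) for _ in range(20)]
cur_energy = [2.0 * l + 50 + random.gauss(0, 1) for l in cur_load]
cur_energy[3] += 40.0
cur_energy[11] += 40.0
flagged = flag_inefficient(model, resid_std, cur_load, cur_energy)
```

Only positive divergence (actual above predicted) is flagged here, on the assumption that under-consumption is not the inefficiency of interest.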
IJMEMS, 2025
Data drift is the phenomenon where the generating model behind the data changes over time. Due to data drift, any model built on past training data becomes less relevant and less accurate over time. Thus, detecting and controlling for data drift is critical in machine learning models. Hierarchical Temporal Memory (HTM) is a machine learning model developed by Jeff Hawkins, inspired by how the human brain processes information. It is a biologically inspired model of memory that is similar in structure to the neocortex, and whose performance is claimed to be comparable to state-of-the-art models in detecting anomalies in time series data. Another benefit of HTM is its independence from the training and testing cycle: all learning takes place online with streaming data, and no separate training and testing cycle is required. In the sequential learning paradigm, the Sequential Probability Ratio Test (SPRT) offers some unique benefits for online learning and inference. This paper proposes a novel hybrid framework combining HTM and SPRT for real-time data drift detection and anomaly identification. Unlike existing data drift methods, our approach eliminates frequent retraining and ensures low false positive rates. HTMs currently work with one-dimensional (univariate) data. In a second study, we also propose an application of HTM to a multidimensional supervised scenario for anomaly detection, combining the outputs of multiple HTM columns, one for each dimension of the data, through a neural network. Experimental evaluations demonstrate that the proposed method outperforms conventional drift detection techniques such as the Kolmogorov-Smirnov (KS) test, Wasserstein distance, and Population Stability Index (PSI) in terms of accuracy, adaptability, and computational efficiency. Our experiments also provide insights into optimizing hyperparameters for real-time deployment in domains such as telecom.
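For readers unfamiliar with SPRT, the following is a minimal sketch of Wald's sequential test applied to a stream of scores. It is not the paper's HTM and SPRT framework: the Gaussian assumption with known `sigma`, the hypothesised means `mu0` (no drift) and `mu1` (drift), and the choice to restart the test whenever "no drift" is accepted are all assumptions made for illustration.

```python
import math

def sprt_drift(stream, mu0, mu1, sigma, alpha=0.01, beta=0.01):
    """Return the index at which drift (mean mu1) is accepted, or None.

    alpha is the target false-alarm rate, beta the target miss rate.
    """
    upper = math.log((1 - beta) / alpha)   # accept "drift" at or above this
    lower = math.log(beta / (1 - alpha))   # accept "no drift" at or below this
    llr = 0.0                              # cumulative log-likelihood ratio
    for i, x in enumerate(stream):
        # Gaussian log-likelihood ratio increment for one observation.
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return i
        if llr <= lower:
            llr = 0.0                      # no drift so far: restart the test
    return None

# Toy stream: 50 in-distribution scores followed by 50 shifted scores.
scores = [0.0] * 50 + [1.0] * 50
detected_at = sprt_drift(scores, mu0=0.0, mu1=1.0, sigma=1.0)
```

Because the test accumulates evidence one observation at a time, it needs no retraining window, which is one reason a sequential test fits an online setting.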
CODS COMAD, 2025
We describe a novel approach to automating unit test generation for Java methods using large language models (LLMs). Existing LLM-based approaches rely on sample usage(s) of the method under test (the focal method) and/or provide the entire class of the focal method as input prompt and context. The former approach is often not viable due to the lack of sample usages, especially for newly written focal methods. The latter approach does not scale well: the more complex the focal method and the larger its associated class, the harder it is to produce adequate test code (due to factors such as exceeding the prompt and context lengths of the underlying LLM). We show that augmenting prompts with concise and precise context information obtained by program analysis increases the effectiveness of generating unit test code through LLMs. We validate ...
* Equal Contribution. † The author was at Ericsson R&D during this study.
Mindfulness meditation has been proven to be effective in treating a range of mental and physical conditions. Mindful Art is a type of mindfulness meditation that comprises sessions of drawing, painting and sculpting with mindfulness for a given length of time. To date, the efficacy of mindful art has not been systematically studied. In this paper, we describe an experimental pilot study on two groups of participants: a beginner group of 21 participants and an experienced meditation group of 9 participants who had previously practiced mindfulness meditation for more than one year. The beginner group was instructed in mindfulness sitting and moving meditation, while the experienced group was instructed in mindful art making in addition to mindfulness meditation. The instructions were delivered remotely over Tencent Conference and WeChat. The sessions were 90 minutes each, twice per week, with 45 minutes of home practice daily, and the length of the study was 21 days. Th...
Background: Anxiety disorders, such as generalized anxiety disorder and social anxiety, are a major problem among adolescents and young adults. Structured mindfulness-based interventions such as Mindfulness Based Cognitive Therapy (MBCT) and Mindfulness Based Stress Reduction (MBSR) have been shown to be at least as effective as other interventions for treating anxiety, but a thorough analysis of the factors that make treatments effective is missing. Objective: The objective of this narrative synthesis is to synthesize mindfulness treatments for anxiety in young people aged between 12 and 25, and to examine the components of those interventions that are more effective in reducing anxiety. Methods: Studies were selected from 3 public databases (APA PsycInfo, Embase, Medline), as well as through a manual process to augment the searches. Studies of MBCT- and MBSR-based interventions, as well as their variants, were eligible. Anxiety had to be one of the measures in the study, although not necessarily the primary measure. After initial screening and removal of duplicates, 8 studies involving 423 participants were identified. Results: Identified themes included customizations for young people, homework and follow-ups, qualifications of the instructors, dropout rates, physical activity and subjective experience. Most studies showed a significant decrease in anxiety symptoms in the context of social phobia, chronic pain, stress and academic performance. However, different scales for measuring anxiety were employed across studies, making it difficult to combine or compare them. The amount of improvement in anxiety was variable. Interventions that included mindfulness information sessions for parents and interventions with mindful physical activity such as yoga showed better results. Conclusion: Recommendations are presented to enable more effective mindfulness interventions tailored for young people with anxiety.
Link-Adaptation for Improved Quality-of-Service in V2V Communication using Reinforcement Learning
arXiv (Cornell University), Dec 16, 2019
The number of machine learning, artificial intelligence or data science related software engineering projects using Agile methodology is increasing. However, there are very few studies on how such projects work in practice. In this paper, we analyze project issue-tracking data taken from Scrum (a popular tool for Agile) for several machine learning projects. We compare this data with corresponding data from non-machine learning projects, in an attempt to analyze how machine learning projects are executed differently from normal software engineering projects. On analysis, we find that machine learning project issues use different kinds of words to describe issues, have a higher number of exploratory or research-oriented tasks as compared to implementation tasks, and have a higher number of issues in the product backlog after each sprint, indicating that it is more difficult to estimate the duration of machine learning project tasks in advance. After analyzing this data, we propose a few ways in which Agile machine learning projects can be better logged and executed, given their differences from normal software engineering projects.
Story and Task Issue Analysis for Agile Machine Learning Projects
The usage of Agile methodology in planning and executing machine learning (ML) and data science related software engineering projects is increasing. However, there are very few studies using real data on how effective such planning is, or guidelines on how to plan such projects. In this paper, we analyze data taken from several software projects using Scrum tools. We compare the data for data science/ML and non-ML projects, in an attempt to understand whether data science and ML projects are planned or executed any differently from normal software engineering projects. We also perform a story classification task using machine learning to analyze story logs for Agile tasks for several teams. We find there are differences in what makes a good ML story as opposed to a non-ML story. After analyzing this data, we propose a few ways in which software projects, whether machine learning related or not, can be better logged and executed using Scrum tools like Jira.
Enhanced Alternate Action Recommender System Using Recurrent Patterns and Fault Detection System for Smart Home Users
We present a fault-tolerant alternate action recommender system for smart home Internet of Things (IoT) users that enriches the user experience with uninterrupted routines and alternative ways to carry out regular routines in the smart home system. Our system takes event data from smart home IoT devices as input, preprocesses it using big data handling techniques, applies our custom pattern-mining algorithm to derive the most probable and active recurrent patterns of an individual user, ensures that frequently used devices are up and running using our fault detection monitoring system, and finally recommends alternate ways of achieving the deviated actions. Our custom fault detection system is based on various parameters of the IoT devices and the context of the smart home users, so that the alternate recommendations given to the user are practical and useful in real time. We validated our system using user trials and various validation techniques.
arXiv (Cornell University), Nov 7, 2019
Boilerplate removal refers to the problem of removing noisy content, such as ads, from a webpage and extracting relevant content that can be used by various services. This can be useful for several web browser features such as ad blocking, accessibility tools such as read out loud, translation and summarization. Creating a training dataset for a boilerplate detection and removal model by labeling or tagging webpage data manually can be tedious and time consuming. Hence, a semi-supervised model, in which some of the webpage elements are labeled manually and labels for the others are inferred based on some parameters, can be useful. In this paper we present a solution for extracting relevant content from a webpage that relies on semi-supervised learning using Gaussian Random Fields. We first represent the webpage as a graph, with text elements as nodes and edge weights representing the similarity between nodes. We then label a few nodes in the graph using heuristics and label the remaining nodes by a weighted measure of similarity to the already labeled nodes. We describe the system architecture and a few preliminary results on a dataset of webpages.
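The graph-based labeling step described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the similarity weights are given directly as a matrix, the score convention (1.0 for relevant content, 0.0 for boilerplate) is chosen here, and the function name `propagate_labels` is hypothetical. Each unlabeled node is repeatedly set to the similarity-weighted average of its neighbours, which converges to the harmonic solution commonly used with Gaussian Random Fields.

```python
def propagate_labels(weights, seeds, iterations=200):
    """weights: symmetric n x n list of lists of non-negative similarities.
    seeds: dict mapping a node index to its fixed label score.
    Returns one score per node; seed nodes keep their given score."""
    n = len(weights)
    scores = [seeds.get(i, 0.5) for i in range(n)]  # neutral start for unknowns
    for _ in range(iterations):
        updated = scores[:]
        for i in range(n):
            if i in seeds:
                continue  # seed labels stay clamped
            total = sum(weights[i][j] for j in range(n) if j != i)
            if total > 0:
                updated[i] = sum(weights[i][j] * scores[j]
                                 for j in range(n) if j != i) / total
        scores = updated
    return scores

# Toy page with five text nodes in a chain: node 0 is seeded as content,
# node 4 as boilerplate, and adjacent nodes have similarity 1.
chain = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(5)] for i in range(5)]
scores = propagate_labels(chain, {0: 1.0, 4: 0.0})
is_content = [s >= 0.5 for s in scores]
```

Thresholding the scores at 0.5 is one simple way to turn them into content or boilerplate decisions; other cut-offs are equally possible.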
Intelligent and Secure Autofill System in Web Browsers
Advances in intelligent systems and computing, 2021
An associative memory for the on-line recognition and prediction of temporal sequences
This paper presents the design of an associative memory with feedback that is capable of on-line temporal sequence learning. A framework for on-line sequence learning has been proposed, and different sequence learning models have been analysed according to this framework. The network model is an associative memory with a separate store for the sequence context of a symbol. A sparse distributed memory is used to gain scalability. The context store combines the functionality of a neural layer with a shift register. The sensitivity of the machine to the sequence context is controllable, resulting in different characteristic behaviours. The model can store and predict on-line sequences of various types and length. Numerical simulations on the model have been carried out to determine its properties.
A Generic Visualization Framework based on a Data Driven Approach for the Analytics data
There are a number of analytics dashboard solutions available today, but currently there is no open standard for integrating different dashboards. In this paper, we provide a dashboard framework that combines data from different analytics sources, such as Google Analytics, Flurry, JSON and Excel files, to form a customizable user interface. Our framework uses two configuration files, one for generic meta information and the other for individual services, to configure the dashboard. In our interface, it is possible to program basic calculations based on data from different sources. It is also possible to incorporate interfaces like drag and drop to configure options. Our framework is based on a plugin architecture, which allows easy addition of new data sources. The framework and visualization tool are data driven, meaning that if the source data changes in the future, there is no need to amend the dashboard as well. Our solution can work with local data as well as remote data from AWS servers with added authentication. We present the components of our dashboard solution along with implementation details of a prototype dashboard for a web service.
Prediction of Throughput Degradation from Trouble Frequencies, given Environmental Unknowns
2022 14th International Conference on COMmunication Systems & NETworkS (COMSNETS)
The emergence of carrier aggregation technology to augment user throughput in LTE and 5G also results in passive intermodulation (PIM) artifacts in frequency-division duplexing (FDD)-based radio transceivers. While it is imperative to suppress PIM distortions, doing so in real time is more arduous. In practical scenarios, the transmission frequencies are unknown across telecom operators due to security concerns and a dynamically changing set of frequencies. PIM detection and mitigation in the face of such environmental unknowns becomes a challenge. In this paper, we address this challenge and propose an automated solution to mitigate PIM in real time. We propose a binary search-based solution that is amenable to real-time implementation. We show through simulations that this search, in tandem with a reinforcement learning based solution, can dynamically mitigate and cancel PIM. Results show that the time to converge, that is, to identify and mitigate the PIM in the uplink frequency, is reduced by a factor of ~200 (from 2500 ms to 12 ms) for around 200 DL PRB combinations.
Bokeh Effect in Images on Objects Based on User Interest
Humans pay visual attention to those objects in the visual field that they are most interested in seeing. The Bokeh effect is a popular blurring effect in photography, where the object of interest is emphasized by blurring other objects. In this paper, we apply the principle of visual attention to the user's object of interest to the post-processing of photos taken using a smartphone. We simulate the Bokeh effect by blurring objects in the image except those that the user is interested in. This adds a biologically inspired effect to the camera and gallery apps on the smartphone. We first define a hierarchy of user interests in different categories. We then create a user interest profile based on the user's demographics, apps and URLs. We build a user interest vector out of this hierarchy by using a word embedding model, taking the weighted average of the vectors of the words corresponding to the user interests. After this, we detect objects in the image and calculate the similarity of the detected objects to the user interest vector, returning a sorted list of objects the user is interested in. The Bokeh effect is applied to the image to blur other objects, thus giving a realistic touch to the image. Finally, we conduct a user study to validate the effectiveness of the system.
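The interest-vector step above can be sketched in a few lines. The three-dimensional vectors below are made-up toy values rather than the output of a real word embedding model, cosine similarity is assumed as the similarity measure (the abstract does not name one), and the helper names are hypothetical.

```python
import math

def weighted_average(vectors, weights):
    """Weighted mean of equal-length vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[d] for v, w in zip(vectors, weights)) / total
            for d in range(dim)]

def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_objects(interest_vec, object_vectors):
    """Sort detected object names from most to least similar to the interests."""
    return sorted(object_vectors,
                  key=lambda name: cosine(interest_vec, object_vectors[name]),
                  reverse=True)

# Toy embeddings (illustrative values only).
toy = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
    "tree": [0.1, 0.9, 0.1],
}
# A user mostly interested in dogs, somewhat in cats.
interest = weighted_average([toy["dog"], toy["cat"]], [0.7, 0.3])
detected = {name: toy[name] for name in ("car", "dog", "tree")}
ranked = rank_objects(interest, detected)
```

In this toy case the top-ranked object would stay sharp and the remaining objects would be candidates for blurring.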
arXiv, 2023
Mindfulness meditation has been proven to be effective in treating a range of mental and physical conditions. Mindful Art is a type of mindfulness meditation that comprises sessions of free drawing with mindfulness for a given length of time. To date, the efficacy of mindful art has not been systematically studied. In this paper, we describe an experimental pilot study on two groups of participants: a beginner group of 21 participants and an experienced meditation group of 9 participants who had previously practiced mindfulness meditation for one year. The beginner group was instructed in mindfulness sitting and walking meditation, while the experienced group was instructed in mindful drawing in addition to mindfulness meditation. The instructions were delivered remotely over WeChat, the sessions were 2 hours each, and the length of the study was 21 days. Blood pressure, pulse rate and breathing rate, as well as the subjective degree of relaxation, were recorded at every session. At the end of the study, the experienced group reported higher degrees of improvement in breath rate and relaxation, while the beginner group reported a greater degree of improvement in breath rate and relaxation, although their scores were lower on average than the experienced group.
A Privacy Preserving Approach for Home Ownership Prediction
2019 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), 2019
Web service providers have access to private user data such as the preferences and behaviors of users, which is used to provide customized or improved services and make predictions. Privacy regulations such as the General Data Protection Regulation (GDPR) mean that such user data should not be traceable back to the original user, i.e. the user's privacy should not be compromised. In this paper, we propose a system that uses machine learning to predict home ownership, i.e. whether the user is likely to be a homeowner or a renter, on the basis of the user's demographic data, in a way that preserves the user's privacy while making the predictions. Our system uses differentially private data perturbation along with homomorphic encryption of the Term Frequency-Inverse Document Frequency (TF-IDF) vectors as the privacy preservation technique to mask the real identities of the users whose home ownership is predicted. Our trained model is used for prediction on a sample dataset of a few thousand users. We get an accuracy of 69% in the prediction, which is around 4% lower than the algorithm's performance without the privacy preservation. This shows that it is feasible to implement privacy preservation techniques for demographic prediction with only a modest loss in prediction accuracy.
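As a rough illustration of differentially private data perturbation, the sketch below adds Laplace noise to a feature vector. The abstract does not say which mechanism the paper uses, so the Laplace mechanism, the `sensitivity` and `epsilon` values, and the function names are all assumptions; the homomorphic encryption step is not shown.

```python
import random

def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def perturb(vector, sensitivity, epsilon, rng):
    """Add Laplace noise with scale sensitivity / epsilon to every entry.
    A smaller epsilon means more noise and stronger privacy."""
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in vector]

rng = random.Random(42)
tfidf = [0.12, 0.0, 0.58, 0.31]          # hypothetical TF-IDF features
noisy = perturb(tfidf, sensitivity=1.0, epsilon=2.0, rng=rng)
```

The accuracy cost reported in the abstract is the kind of trade-off this noise introduces: the perturbed features are less informative than the originals.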
A good diagnostic assessment is one that can (i) discriminate between students of different abilities for a given skill set, (ii) be consistent with ground truth data and (iii) achieve this with as few assessment questions as possible. In this paper, we explore a method to meet these objectives. This is achieved by selecting questions from a question database and assembling them to create a diagnostic test paper according to a given configurable policy. We consider policies based on multiple attributes of the questions, such as discrimination ability and behavioral parameters, as well as a baseline policy. We develop metrics to evaluate the policies and perform the evaluation using historical student attempt data on assessments conducted on an online learning platform, as well as a pilot test on the platform administered to a subset of users. We are able to estimate student abilities 40% better with a diagnostic test as compared to the baseline policy, with questions derived from a la...
Security Mechanism for Packaged Web Applications
2017 IEEE International Conference on Web Services (ICWS), 2017
OAuth is an open security standard that enables users to grant an application specific and time-bound rights to access protected user resources stored on an external resource server, without needing to share their credentials with the application. Using OAuth, a client application gets an access token for further use through an HTTP redirect response from the resource server once the user authorizes the resource access. Unlike for websites, the main security challenge for locally installed packaged web applications is to handle the redirect response appropriately. This paper proposes a novel method to execute the OAuth flow from such applications with the help of the web runtime framework that manages the life cycle of these applications. We compare our approach with two other approaches to OAuth flow handling proposed in the literature. Experimenting with different categories of packaged web applications, we found that our approach blocked all illegal OAuth flow executions. Our approach also gives better OAuth response handling time and power consumption performance.
Identification of Inefficient Radios for Efficient Energy Consumption in a Mobile Network
In a mobile network, it is important to identify energy inefficient RRUs (Remote Radio Units) to ... more In a mobile network, it is important to identify energy inefficient RRUs (Remote Radio Units) to improve the overall energy efficiency of the network and achieve significant energy and cost savings. Existing solutions can identify inefficient RRUs based on hardware alarms or faults, but not energy consumption in real time for a given region. In this paper, we propose a network energy consumption model and method to identify inefficient RRUs with respect to power consumption in real time. Our method involves ML models trained on the historical data of performance management (PM) counters to predict the RRU energy consumption, on the basis of which the RRUs having a higher divergence between predicted and actual energy consumption are identified. The system is trained and tested with simulated data based on a major network operator.
COMSNETS, 2024
In a mobile network, it is important to identify energy inefficient RRUs (Remote Radio Units) to ... more In a mobile network, it is important to identify energy inefficient RRUs (Remote Radio Units) to improve the overall energy efficiency of the network and achieve significant energy and cost savings. Existing solutions can identify inefficient RRUs based on hardware alarms or faults, but not energy consumption in real time for a given region. In this paper, we propose a network energy consumption model and method to identify inefficient RRUs with respect to power consumption in real time. Our method involves ML models trained on the historical data of performance management (PM) counters to predict the RRU energy consumption, on the basis of which the RRUs having a higher divergence between predicted and actual energy consumption are identified. The system is trained and tested with simulated data based on a major network operator.
IJMEMS, 2025
Data Drift is the phenomenon where the generating model behind the data changes over time. Due to... more Data Drift is the phenomenon where the generating model behind the data changes over time. Due to data drift, any model built on the past training data becomes less relevant and inaccurate over time. Thus, detecting and controlling for data drift is critical in machine learning models. Hierarchical Temporal Memory (HTM) is a machine learning model developed by Jeff Hawkins, inspired by how the human brain processes information. It is a biologically inspired model of memory that is similar in structure to the neocortex, and whose performance is claimed to be comparable to state of the art models in detecting anomalies in time series data. Another unique benefit of HTMs is its independence from training and testing cycle; all the learning takes place online with streaming data and no separate training and testing cycle is required. In sequential learning paradigm, Sequential Probability Ratio Test (SPRT) offers some unique benefit for online learning and inference. This paper proposes a novel hybrid framework combining HTM and SPRT for real-time data drift detection and anomaly identification. Unlike existing data drift methods, our approach eliminates frequent retraining and ensures low false positive rates. HTMs currently work with one dimensional or univariate data. In a second study, we also propose an application of HTM in multidimensional supervised scenario for anomaly detection by combining the outputs of multiple HTM columns, one for each dimension of the data, through a neural network. Experimental evaluations demonstrate that the proposed method outperforms conventional drift detection techniques like the Kolmogorov-Smirnov (KS) test, Wasserstein distance, and Population Stability Index (PSI) in terms of accuracy, adaptability, and computational efficiency. 
Our experiments also provide insights into optimizing hyperparameters for real-time deployment in domains such as Telecom.
CODS COMAD, 2025
We describe a novel approach to automating unit test generation for Java methods using large lang... more We describe a novel approach to automating unit test generation for Java methods using large language models (LLMs). Existing LLMbased approaches rely on sample usage(s) of the method to test (focal method) and/or provide the entire class of the focal method as input prompt and context. The former approach is often not viable due to the lack of sample usages, especially for newly written focal methods. The latter approach does not scale well enough; the bigger the complexity of the focal method and larger associated class, the harder it is to produce adequate test code (due to factors such as exceeding the prompt and context lengths of the underlying LLM). We show that augmenting prompts with concise and precise context information obtained by program analysis increases the effectiveness of generating unit test code through LLMs. We validate * Equal Contribution † The author was at Ericsson R&D during this study
Mindfulness meditation has been proven to be effective in treating a range of mental and physical... more Mindfulness meditation has been proven to be effective in treating a range of mental and physical conditions. Mindful Art is a type of mindfulness meditation that comprises sessions of drawing, painting and sculpturing with mindfulness for a given length of time. To date, the efficacy of mindful art has not been systematically studied. In this paper, we describe an experimental pilot study on two groups of participants, a beginner group of 21 participants and an experienced meditation group of 9 participants, who had previously practiced mindfulness meditation for more than one year. The beginner group was instructed in mindfulness sitting and moving meditation, while the experienced group was instructed in mindful art making in addition to mindfulness meditation. The instructions were delivered remotely over Tencent Conference and WeChat. The sessions were of 90 minutes duration each, twice per week, with 45 minutes of home practice daily and the length of the study was 21 days. Th...
Background Anxiety disorders, such as generalized anxiety disorder and social anxiety, are a majo... more Background Anxiety disorders, such as generalized anxiety disorder and social anxiety, are a major problem among adolescents and young adults. Structured mindfulness based interventions such as Mindfulness Based Cognitive Therapy (MBCT) and Mindfulness Based Stress Reduction (MBSR) have been shown to be at least as effective as other interventions for treating anxiety, but a thorough analysis of different factors for effective treatments is missing. Objective The objective of this narrative synthesis is to synthesize mindfulness treatments for anxiety in young adults aged between 12 to 25, and examine components of those interventions that are more effective in reducing anxiety. Methods Studies were selected from 3 public databases (APA Psycinfo, Embase, Medline), as well as a manual process to augment the searches. Interventions involving Mindfulness based Cognitive Therapy (MBCT) and Mindfulness based Stress Reduction (MBSR) based studies, as well as their variants were eligible. Anxiety should be one of the measures in the study although it may not be the primary measure. After initial screening and removal of duplicates, 8 studies involving 423 participants were identified. Results Identified themes included customizations for young people, homework and follow ups, qualifications of the instructors, dropout rates, physical activity and subjective experience. Most studies showed a significant decrease in anxiety symptoms, in case of social phobia, chronic pain, stress and academic performance. However, variable scales for measuring anxiety were employed across studies, making it difficult to combine or compare them. The amount of improvement of anxiety was variable. Interventions that included mindfulness information sessions for parents and interventions with mindful physical activity such as yoga showed better results. 
Conclusion Recommendations are presented to enable more effective mindfulness interventions tailored for young people with anxiety.
Link-Adaptation for Improved Quality-of-Service in V2V Communication using Reinforcement Learning
arXiv (Cornell University), Dec 16, 2019
The number of machine learning, artificial intelligence or data science related software engineering projects using Agile methodology is increasing. However, there are very few studies on how such projects work in practice. In this paper, we analyze project issue-tracking data taken from Scrum (a popular Agile framework) for several machine learning projects. We compare this data with corresponding data from non-machine learning projects, in an attempt to analyze how machine learning projects are executed differently from normal software engineering projects. On analysis, we find that machine learning project issues use different kinds of words to describe issues, have a higher number of exploratory or research oriented tasks as compared to implementation tasks, and have a higher number of issues in the product backlog after each sprint, indicating that it is more difficult to estimate the duration of machine learning project related tasks in advance. After analyzing this data, we propose a few ways in which Agile machine learning projects can be better logged and executed, given their differences from normal software engineering projects.
Story and Task Issue Analysis for Agile Machine Learning Projects
The usage of Agile methodology in planning and executing machine learning (ML) and data science related software engineering projects is increasing. However, there are very few studies using real data on how effective such planning is or guidelines on how to plan such projects. In this paper, we analyze data taken from several software projects using Scrum tools. We compare the data for data science/ML and non-ML projects, in an attempt to understand if data science and ML projects are planned or executed any differently compared to normal software engineering projects. We also perform a story classification task using machine learning to analyze story logs for agile tasks for several teams. We find there are differences in what makes a good ML story as opposed to a non-ML story. After analyzing this data, we propose a few ways in which software projects, whether machine learning related or not, can be better logged and executed using Scrum tools like Jira.
Enhanced Alternate Action Recommender System Using Recurrent Patterns and Fault Detection System for Smart Home Users
We present a fault-tolerant alternate action recommender system for smart home Internet of Things (IoT) users, which enriches the user experience with uninterrupted routines and offers various ways to achieve the regular routines in the smart home system. Our system takes event data from the smart home IoT devices as input, preprocesses it using big data handling techniques to make it usable by our system, applies our custom pattern-mining algorithm to derive the highly probable and active recurrent patterns of an individual user, ensures that the frequently used devices are up and running using our fault detection monitoring system, and finally recommends alternate ways of achieving the deviated actions. Our custom fault detection system is based on various parameters of the IoT devices and the context of the smart home users, so that the alternate recommendations given to the user are practical and useful in real time. We validated our system using user trial methods and various validation techniques.
arXiv (Cornell University), Nov 7, 2019
Boilerplate removal refers to the problem of removing noisy content from a webpage such as ads and extracting relevant content that can be used by various services. This can be useful in several features in web browsers such as ad blocking, accessibility tools such as read out loud, translation, summarization etc. In order to create a training dataset to train a model for boilerplate detection and removal, labeling or tagging webpage data manually can be tedious and time consuming. Hence, a semi-supervised model, in which some of the webpage elements are labeled manually and labels for others are inferred based on some parameters, can be useful. In this paper we present a solution for extraction of relevant content from a webpage that relies on semi-supervised learning using Gaussian Random Fields. We first represent the webpage as a graph, with text elements as nodes and the edge weights representing similarity between nodes. After this, we label a few nodes in the graph using heuristics and label the remaining nodes by a weighted measure of similarity to the already labeled nodes. We describe the system architecture and a few preliminary results on a dataset of webpages.
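The graph-labeling step described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the standard harmonic-function form of Gaussian Random Field label propagation, in which each unlabeled node repeatedly takes the similarity-weighted average of its neighbours while the seed nodes stay fixed. The function name, the toy graph, the edge weights and the 0.5 threshold are all hypothetical.

```python
def propagate_labels(weights, seeds, iterations=200, tol=1e-9):
    """Harmonic label propagation on a weighted graph (illustrative sketch).

    weights: dict mapping node -> {neighbour: similarity weight}
    seeds:   dict mapping labeled node -> score
             (1.0 = relevant content, 0.0 = boilerplate)
    Returns a dict mapping every node to a score in [0, 1].
    """
    # Unlabeled nodes start at a neutral 0.5; seed nodes keep their label.
    scores = {node: seeds.get(node, 0.5) for node in weights}
    for _ in range(iterations):
        max_change = 0.0
        for node, neighbours in weights.items():
            if node in seeds:
                continue  # labeled nodes stay clamped
            total = sum(neighbours.values())
            if total == 0:
                continue  # isolated node: nothing to average over
            updated = sum(w * scores[m] for m, w in neighbours.items()) / total
            max_change = max(max_change, abs(updated - scores[node]))
            scores[node] = updated
        if max_change < tol:
            break  # scores have stopped moving
    return scores


# Hypothetical page graph: text elements as nodes, similarities as weights.
graph = {
    "headline": {"para1": 0.9, "para2": 0.8, "ad": 0.1},
    "para1": {"headline": 0.9, "para2": 0.9, "footer": 0.1},
    "para2": {"headline": 0.8, "para1": 0.9},
    "ad": {"headline": 0.1, "footer": 0.9},
    "footer": {"ad": 0.9, "para1": 0.1},
}
seeds = {"headline": 1.0, "ad": 0.0}  # labels assigned by heuristics

scores = propagate_labels(graph, seeds)
content = sorted(n for n, s in scores.items() if s >= 0.5)
print(content)
```

In this toy graph the two paragraphs inherit a high score from the headline, while the footer is pulled toward the ad; how the similarities and seed heuristics are chosen in the paper is not stated in the abstract.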
Intelligent and Secure Autofill System in Web Browsers
Advances in intelligent systems and computing, 2021
An associative memory for the on-line recognition and prediction of temporal sequences
This paper presents the design of an associative memory with feedback that is capable of on-line temporal sequence learning. A framework for on-line sequence learning has been proposed, and different sequence learning models have been analysed according to this framework. The network model is an associative memory with a separate store for the sequence context of a symbol. A sparse distributed memory is used to gain scalability. The context store combines the functionality of a neural layer with a shift register. The sensitivity of the machine to the sequence context is controllable, resulting in different characteristic behaviours. The model can store and predict on-line sequences of various types and length. Numerical simulations on the model have been carried out to determine its properties.
A Generic Visualization Framework based on a Data Driven Approach for the Analytics data
There are a number of analytics dashboard related solutions available today, but currently there is no open standard available to integrate different dashboards. In this paper, we provide a dashboard framework to combine data from different analytics sources such as Google Analytics, Flurry, JSON and Excel files, to form a customizable user interface. Our framework uses two configuration files, one for generic meta information and the other for individual services, to configure the dashboard. In our interface, it is possible to program basic calculations based on data from different sources. It is also possible to incorporate interfaces like drag and drop to configure options. Our framework is based on the plugin architecture, which allows easy addition of new data sources. The framework and visualization tool are data driven, meaning that if the source data changes in the future, there is no need to amend the dashboard as well. Our solution can work with local data as well as remote data from AWS servers with added authentication. We present the components of our dashboard solution along with implementation details of a prototype dashboard for a web service.
Prediction of Throughput Degradation from Trouble Frequencies, given Environmental Unknowns
2022 14th International Conference on COMmunication Systems & NETworkS (COMSNETS)
The emergence of carrier aggregation technology to augment user throughput in LTE and 5G also results in passive intermodulation (PIM) artifacts in frequency-division duplexing (FDD)-based radio transceivers. While it is imperative to suppress PIM distortions, doing so in real time is more arduous. In practical scenarios, the transmission frequencies are unknown across telecom operators due to security concerns and a dynamically changing set of frequencies. PIM detection and mitigation in the face of such environmental unknowns becomes a challenge. In this paper, we address this challenge and propose an automated solution to mitigate PIM in real time. We propose a binary search-based solution that is amenable to real-time implementation. We show through simulations that this search, in tandem with a reinforcement learning based solution, can dynamically mitigate and cancel PIM. Results show that the time to converge on identifying and mitigating the PIM in the uplink frequency is reduced by a factor of ~200 (i.e., from 2500 ms to 12 ms) for around 200 DL PRB combinations.
Bokeh Effect in Images on Objects Based on User Interest
Humans pay visual attention to those objects in the visual field that they are most interested in seeing. The Bokeh effect is a popular blurring effect in photography, where the object of interest is emphasized by blurring other objects. In this paper, we apply the principle of visual attention to the user's object of interest to post processing of photos taken using a smartphone. We simulate the Bokeh effect by blurring objects in the image except those that the user is interested in. This adds a biologically inspired effect to the camera and gallery apps in the smartphone. We first define a hierarchy of user interests in different categories. We then create a user interest profile based on the user's demographics, apps and URLs. We build a user interest vector out of this hierarchy by using a word embedding model, and take the weighted average of the vectors of the words corresponding to the user interests. After this, we detect objects in the image and calculate the similarity of the detected objects with the user interest vector, returning a sorted list of objects the user is interested in. The Bokeh effect is applied to the image to blur other objects, thus giving a realistic touch to the image. Finally, we conduct a user study to validate the effectiveness of the system.
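The interest-vector step in this abstract (a weighted average of word vectors, followed by a similarity ranking of detected objects) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the three-dimensional embeddings, the interest weights and the function names are all made up for the example. A real system would load vectors from a trained word embedding model.

```python
from math import sqrt


def weighted_average(vectors, weights):
    """Weighted mean of equal-length vectors."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) / total
            for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_objects(detected, interests, embeddings):
    """Sort detected object labels by similarity to the user interest vector.

    interests: dict mapping interest word -> weight (hypothetical profile)
    """
    words = [w for w in interests if w in embeddings]
    profile = weighted_average([embeddings[w] for w in words],
                               [interests[w] for w in words])
    scored = [(cosine(embeddings[obj], profile), obj)
              for obj in detected if obj in embeddings]
    return [obj for _, obj in sorted(scored, reverse=True)]


# Toy 3-dimensional embeddings, invented purely for illustration.
embeddings = {
    "pets": [0.8, 0.2, 0.1],
    "racing": [0.1, 0.8, 0.4],
    "dog": [0.9, 0.1, 0.0],
    "car": [0.0, 0.9, 0.3],
    "tree": [0.1, 0.1, 0.9],
}
interests = {"pets": 0.8, "racing": 0.2}

ranking = rank_objects(["tree", "car", "dog"], interests, embeddings)
print(ranking)  # the top-ranked object would be kept sharp, the rest blurred
```

With this profile the dog ranks first, so a Bokeh stage built on top of the ranking would blur the car and the tree.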
arXiv, 2023
Mindfulness meditation has been proven to be effective in treating a range of mental and physical conditions. Mindful Art is a type of mindfulness meditation that comprises sessions of free drawing with mindfulness for a given length of time. To date, the efficacy of mindful art has not been systematically studied. In this paper, we describe an experimental pilot study on two groups of participants: a beginner group of 21 participants and an experienced meditation group of 9 participants who had previously practiced mindfulness meditation for one year. The beginner group was instructed in mindfulness sitting and walking meditation, while the experienced group was instructed in mindful drawing in addition to mindfulness meditation. The instructions were delivered remotely over WeChat, the sessions lasted 2 hours each, and the study ran for 21 days. Blood pressure, pulse rate and breathing rate, as well as the subjective degree of relaxation, were recorded at every session. At the end of the study, the experienced group reported higher degrees of improvement in breath rate and relaxation, while the beginner group reported a greater degree of improvement in breath rate and relaxation, although their scores were lower on average than the experienced group.
A Privacy Preserving Approach for Home Ownership Prediction
2019 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT), 2019
Web service providers have access to private user data such as preferences and behaviors of users, which is used to provide customized or improved services and make predictions. Privacy regulations such as the General Data Protection Regulation (GDPR) mean that such user data should not be traceable back to the original user, i.e. the user's privacy should not be compromised. In this paper, we propose a system for predicting home ownership using machine learning, i.e. whether the user is likely to be a homeowner or a renter, on the basis of the user's demographic data, in a way that preserves the user's privacy while making the predictions. Our system uses differentially private data perturbation along with homomorphic encryption of the Term Frequency-Inverse Document Frequency (TF-IDF) vectors as the privacy preservation technique to mask the real identities of the users whose home ownership is predicted. Our trained model is used for prediction on a sample dataset of a few thousand users. We get an accuracy of 69% in the prediction, which is around 4% lower than the algorithm performance without the privacy preservation. This shows that it is feasible to implement privacy preservation techniques on demographic prediction with only a small loss in prediction accuracy.
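As a rough illustration of the data perturbation idea in this abstract, the sketch below adds Laplace noise to a TF-IDF vector. The abstract does not say which differential privacy mechanism the authors use, so the Laplace mechanism is an assumption here, as are the sensitivity value, the epsilon value and the function names. The homomorphic encryption step is not shown.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_vector(vector, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise with scale sensitivity / epsilon to each component.

    epsilon:     privacy budget; smaller values mean more noise
    sensitivity: assumed maximum change one user can cause in a component
                 (1.0 is a placeholder, not a value taken from the paper)
    """
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in vector]


# Hypothetical TF-IDF vector for one user.
tfidf = [0.2, 0.0, 0.7]
noisy = perturb_vector(tfidf, epsilon=1.0, rng=random.Random(0))
print(noisy)
```

A smaller epsilon gives stronger privacy but noisier features, which is the kind of trade-off behind the accuracy drop the abstract reports.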
A good diagnostic assessment is one that can (i) discriminate between students of different abilities for a given skill set, (ii) be consistent with ground truth data and (iii) achieve this with as few assessment questions as possible. In this paper, we explore a method to meet these objectives. This is achieved by selecting questions from a question database and assembling them to create a diagnostic test paper according to a given configurable policy. We consider policies based on multiple attributes of the questions such as discrimination ability and behavioral parameters, as well as a baseline policy. We develop metrics to evaluate the policies and perform the evaluation using historical student attempt data on assessments conducted on an online learning platform, as well as on a pilot test on the platform administered to a subset of users. We are able to estimate student abilities 40% better with a diagnostic test as compared to baseline policy, with questions derived from a la...
Security Mechanism for Packaged Web Applications
2017 IEEE International Conference on Web Services (ICWS), 2017
OAuth is an open security standard that enables users to grant an application specific and time-bound rights to access protected user resources stored on an external resource server, without needing to share their credentials with the application. Using OAuth, a client application obtains an access token for further use through an HTTP redirect response from the resource server once the user authorizes the resource access. Unlike websites, for locally installed packaged web applications the main security challenge is to handle the redirect response appropriately. This paper proposes a novel method to execute the OAuth flow from such applications with the help of the web runtime framework that manages the life cycle of these applications. We compare our approach with two other approaches for OAuth flow handling proposed in the literature. Experimenting with different categories of packaged web applications, we found that our approach blocks all illegal OAuth flow executions. Our approach also gives better OAuth response handling time and power consumption performance.
There are a number of inbound web services, which recommend content to users. However, there is no way for such services to prioritize their recommendations as per the users' interests. Here we are not interested in generating new recommendations, but rather organizing and
Given the ongoing controversy over biased news, it would be useful to have a system that can detect the extent of bias in online news articles and indicate it to the user in real time. In this paper we provide such a system. Here we measure bias in a given sentence or article as the word vector similarity with a corpus of biased words. We compute the word vector similarity of each of the sentences with the words taken from a Wikipedia Neutral Point of View (NPOV) corpus, measured using the word2vec tool, where our model is trained using Wikipedia articles. We then compute the bias score, which indicates how much that article uses biased words. This is implemented as a web browser extension, which queries an online server running our bias detection algorithm. Finally, we validate the accuracy of our bias detection by comparing bias rankings of a variety of articles from various sources. We get lower bias scores for Wikipedia articles than for news articles, which in turn score lower than opinion articles.
Too many applications on smartphones consume memory and slow down the performance of the device. Hence we need web applications which are lightweight and consume less memory on the mobile device. Web applications use the browser engine and take a lot of time to launch compared to a native application, especially upon device boot up or if the browser is not already running in the background. In this paper, we propose an intelligent framework to launch web applications as fast as native applications. The framework considers the user's usage of web applications and pre-launches the preferred web applications, thus enhancing the launch time performance. We provide the architecture and implementation details of the framework. We then analyze results of an experiment on various web applications to measure the effectiveness of the framework for fast launch of web applications after the device boots.
Background Push Notification Service is an essential
We examine issues involving the transmission of information by spike trains through networks made of real-time asynchronous spiking neurons. For our convenience we use a spiking model that has an intrinsic delay between an input and output spike. We look at issues involving transmission of a desired average level of stable spiking activity over many layers, and show how feedback reset
[Figure labels: Address Decoder (Kanerva); Data Memory (CMM); Address; Data; Output]
Associative memory is a version of Kanerva's SDM (Kanerva 88) as used by Furber (Furber 04)
MSc Dissertation, KCL, 2023