Ritu Chaturvedi - Profile on Academia.edu

Papers by Ritu Chaturvedi

Using Derived Sequential Pattern Mining for E-Commerce Recommendations in Multiple Sources

Lecture Notes in Computer Science, Dec 31, 2022

Mining Twitter Multi-word Product Opinions with Most Frequent Sequences of Aspect Terms

Lecture Notes in Computer Science, 2022

Intelligent Feature Selection on Multivariate Dataset using Advanced Data Profiling

2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS)

Internet users go to different social media platforms to read reviews or comments about a product before they buy or invest in one. Thus, for many companies it is very important to keep track of the sentiment of their customers' reviews, respond to them in time, and keep the brand value high. There is no dearth of models for mining sentiment from Twitter data, but these models fail when sarcasm is involved in tweets. Thus, sarcasm detection (when sarcasm is expressed implicitly, as opposed to being expressed using explicit words) can help in gaining better insight into customer sentiment in their opinions or reviews about a product or company. To detect sarcasm, this paper engineers features from tweets and their description in order to capture the sarcasm expressed in tweets. Each registered author on Twitter has the opportunity to self-describe. We utilize this self-description to extract extra information about person...

Predicting Student Performance in an ITS Using Task-Driven Features

2017 IEEE International Conference on Computer and Information Technology (CIT)

Intelligent Tutoring Systems (ITS) are typically designed to offer one-on-one tutoring on a subject to students in an adaptive way so that students can learn the subject at their own pace. The ability to predict student performance enables an ITS to make informed decisions towards meeting the individual needs of students. It is also useful for ITS designers to validate whether students are actually able to succeed in learning the subject. Predicting student performance is a function of two complex and dynamic factors: (f1) student learning behavior and (f2) their current knowledge in the subject. Learning behavior is captured from student interaction with the ITS (e.g. time spent on an assigned task) and is stored in the form of web logs. Student knowledge in the subject is represented by the marks they score in assigned tasks and is stored in a specific component of the ITS called the student model. In order to build an accurate prediction model, this raw data from the student model and web logs must be engineered carefully and transformed into meaningful features. Existing systems such as LON-CAPA predict students' performance using their learning behavior alone, without considering their (current) knowledge of the subject. Lack of proper feature engineering is evident from the low accuracy of their prediction models. This research proposes a model that predicts student success in assigned tasks with 96% accuracy by using features that are better informed not only about students in terms of the two factors f1 and f2 mentioned above, but also about the assigned task itself (e.g. the task's difficulty level). In order to accomplish this, an Example Recommendation System (ERS) is designed with a fine-grained student model (to represent student data) and a fine-grained domain model (to represent domain resources such as tasks).

Clustering Examples in Web-based Tutoring Systems based on Relevance of Concepts

2020 Fourth International Conference On Intelligent Computing in Data Sciences (ICDS), 2020

Web-based online tutoring systems (WOTS) have become extremely important and relevant in today’s world, especially with COVID-19 requiring schools, colleges and universities to offer alternate forms of delivery. Many studies have indicated that students find worked-out examples very useful when they are performing a task or studying for final exams. WOTS certainly have the capability to host hundreds of such examples in their repositories, but presenting students with such repositories may cause cognitive overload and may force them to bear the responsibility of searching for the most relevant examples when in need. This paper proposes an algorithm called CER (Clustering Examples based on Relevance) that organizes a collection of worked-out examples into coherent and relevant clusters - relevant to the learning concepts covered by them. When generating clusters, CER acknowledges not only the local relevance of a concept (using parameters such as mode) within a cluster but also its global relevance. CER is validated using Dunn’s index as the internal validity index - a score of 0.81 was achieved for CER. The external validity of CER was measured by comparing its results to a benchmark dataset that had properties of data that were common to the domain of CER.
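For readers unfamiliar with the internal validity index mentioned above, the following is a minimal sketch of one common form of Dunn's index: the smallest distance between points in different clusters divided by the largest within-cluster diameter. It is not the CER implementation; the function name `dunn_index`, the Euclidean distance, and the toy points are assumptions made for illustration, and other variants of the index exist.

```python
from itertools import combinations
from math import dist


def dunn_index(clusters):
    """Dunn's index for a list of clusters, each a list of numeric point tuples.

    Assumes at least two clusters and at least one cluster with two
    distinct points (otherwise the diameter is zero or undefined).
    """
    # Smallest distance between two points that lie in different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    # Largest distance between two points in the same cluster (its diameter).
    max_diameter = max(
        dist(p, q)
        for c in clusters
        for p, q in combinations(c, 2)
    )
    return min_separation / max_diameter


# Hypothetical toy data: two tight, well-separated clusters.
toy_clusters = [[(0, 0), (0, 1)], [(3, 0), (3, 1)]]
print(dunn_index(toy_clusters))
```

Higher values indicate compact clusters that lie far apart from one another.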

The HSPRec E-Commerce System Open Source Code Implementation

2021 IEEE/ACIS 20th International Fall Conference on Computer and Information Science (ICIS Fall), 2021

To promote big data application access, usage and deployment, this paper presents a downloadable open source code implementation of an E-Commerce Recommendation system, HSPRec (Historical Sequential Pattern Recommendation System), in Java. The HSPRec system is composed of six different modules for generating purchase/click sequential databases, mining sequential patterns, computing click purchase similarities, generating purchase sequential rules, computing weights for frequent purchase patterns through Weighted Frequent Purchase Pattern Miner, and normalizing the user-item ratings to predict level of interest. The source code of each module and the main runner are discussed under four headings: running environment, input data files and format, minimum support format, and output data files and format. The overall goal of the HSPRec system is to improve E-Commerce Recommendation accuracy by incorporating more complex sequential patterns of user purchase and click stream behavior learned through frequent sequential purchase patterns. HSPRec provides more accurate recommendations than the tested comparative systems.

2020 11th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), 2020

The 4-gram would be "I went to work". 4) Prune the list of candidate aspects in order to create a more relevant list of aspects. 5) Cross-compare the pruned aspects to the original data in order to find sentimental connections between the data.
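The 4-gram example above can be reproduced with a short sketch of word n-gram extraction. The function name `word_ngrams` and the whitespace tokenization are assumptions made for illustration; the paper's own tokenization and candidate-generation steps are not shown in this excerpt.

```python
def word_ngrams(text, n):
    """Return every run of n consecutive whitespace-separated words in text."""
    words = text.split()
    # A sentence with fewer than n words yields an empty list.
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


print(word_ngrams("I went to work", 4))
print(word_ngrams("I went to work", 2))
```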

Concept Extraction: A Modular Approach to Extraction of Source Code Concepts

2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), 2018

Code examples have always been one of the most sought-after pieces of information when it comes to understanding and mastering programming concepts. Research shows that extracting knowledge from such examples in any online tutoring system is a challenging task. Current methods rely upon specifically formed regular expressions that must be tailor-made to the input language, or on generation of an AST for the given input program. In our paper, we extend existing implementations in code recommendation software using a novel keyword-based search tree (k-BST) method. k-BST recommends relevant code fragments by extracting existing keywords, matching them with relevant coding examples by k-means clustering, and recommending the relevant coding examples back to the user. k-BSTs also address several major issues that modern knowledge extraction software often runs into, such as ease of use, extendibility to other domains and run time. With that in mind, k-BSTs are designed to tackle ease of use with po...

An Intelligent Tutoring System (ITS) is a computer system that provides direct customized instruction or feedback to students while they perform a task in a tutoring system, without the intervention of a human. One of the modules of an ITS is the student module, which helps to understand the student’s learning abilities. Several data mining techniques such as association rule mining, clustering and mining using Bayesian networks have been proposed to design effective student models in ITSs. This paper provides a comparative study of the various data mining techniques and tools that are used in student modeling. We also propose an example-driven approach that can integrate mined concept examples at different difficulty levels with Bayesian networks in order to influence student learning.

Intelligent tutoring systems (ITS) aim to provide customized resources or feedback on a subject (commonly known as the domain in ITS) to students in real time, emulating the behavior of an actual teacher in a classroom. This thesis designs an ITS based on an instructional strategy called example-based learning (EBL), which focuses primarily on students devoting their time and cognitive capacity to studying worked-out examples so that they can enhance their learning and apply it to similar graded problems or tasks. A task is a graded problem or question that an ITS assigns to students (e.g. task T1 in the C programming domain, defined as “Write an assignment instruction in C that adds 2 integers”). A worked-out example refers to a complete solution of a problem or question in the domain. Existing ITSs such as NavEx and PADS that use EBL to teach their domain suffer from several limitations such as (1) methods used to extract knowledge from given tasks and worked-out examples require hi...

The educational model for higher education has drifted from the traditional classroom to technology-driven models that merge classroom teaching with web-based learning management systems (LMS) such as Moodle and CLEW. Every teaching model has a set of supervised (e.g. quizzes) and/or unsupervised (e.g. assignments) instruments that are used to evaluate the effectiveness of learning. The challenge is in preserving student motivation in the unsupervised instruments such as assignments, as they are less structured than quizzes and tests. The research applies association rule mining specifically to find the impact of unsupervised course work (e.g. assignments) on overall performance (e.g. exam and total marks).
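As background for the association rule mining mentioned above, the sketch below computes the two standard rule measures, support and confidence, for a rule of the form "assignment outcome => exam outcome". The item labels and the four student records are hypothetical and chosen only for illustration; they are not data or results from the study.

```python
def support(transactions, items):
    """Fraction of transactions (sets) that contain every item in items."""
    items = set(items)
    return sum(items <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated probability of the consequent given the antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical records: one set of discretized outcomes per student.
students = [
    {"assignments_high", "exam_high"},
    {"assignments_high", "exam_high"},
    {"assignments_high", "exam_low"},
    {"assignments_low", "exam_low"},
]

print(support(students, {"assignments_high", "exam_high"}))
print(confidence(students, {"assignments_high"}, {"exam_high"}))
```

A rule is usually reported only when both measures exceed user-chosen minimum thresholds.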

International Journal of Data Warehousing and Mining, 2020

Existing work on multiple databases (MDBs) sequential pattern mining cannot mine frequent sequences to answer exact and historical queries from MDBs having different table structures. This article proposes the transaction id frequent sequence pattern (TidFSeq) algorithm to handle the difficult problem of mining frequent sequences from diverse MDBs. The TidFSeq algorithm transforms candidate 1-sequences to get transaction subsequences where candidate 1-sequences occurred as a (1-sequence, its subsequence id list) tuple or a (1-sequence, position id list) tuple. Subsequent frequent i-sequences are computed using the counts of the sequence ids in each candidate i-sequence position id list tuple. An extended version of the general sequential pattern (GSP)-like candidate-generate-and-frequency-count approach is used for computing supports of itemset (I-step) and separate (S-step) sequences without repeated database scans but with transaction ids. Generated patterns answer complex queries from MDB...

Energy Aware Distributed Clustering in Two-Tiered Sensor Networks

2009 Proceedings of 18th International Conference on Computer Communications and Networks, 2009

... Sensor Networks. Ataul Bari, Ritu Chaturvedi, Arunita Jaekel and Subir Bandyopadhyay, School of Computer Science, University of Windsor, 401 Sunset Ave., Windsor, ON N9B 3P4, Canada. E-mail: {bari1, rituch, arunita, subir}@uwindsor.ca ...

Enhancement of greenhouse gases associated with Canadian forest fire using multi sensor data

Forest fire is a common natural hazard that takes lives and causes billion-dollar property losses almost every year. In the recent past, the frequency of forest fires has increased in Canada and throughout the world, which is associated with changes in land use and land cover practice. Multi sensor satellites are now capable of providing information about

Variability of Atmospheric Properties during 2000-2009 over Major Cities in Canada

An elevated amount of greenhouse gases, total ozone column and significant changes in meteorological parameters were found to be associated with forest fires in Canada (Singh and Chaturvedi, 2008 COSPAR). In the present paper, we have carried out detailed analysis of multi sensor data for the period 2000-2009 over major Canadian cities (Vancouver, Calgary, Montreal, Toronto, Ottawa, Windsor, St.

Computers & Education, 2014

This paper posits the use of computer games as cognitive development tools that can provide players with transferable skills suitable for learning in the 21st century. We describe a method for categorizing single-player computer games according to the main cognitive function(s) engaged in by the player during gaming. Categorization was done in collaboration with a neuropsychologist, academic researchers, and research assistants. Twelve research assistants, mostly domain novices, were trained to categorize games according to a cognitive matrix developed by the neuropsychologist. They also categorized the games, and evaluated and commented on the relevance of the neuropsychologist's categorization of the games. Through the process of "critic proofing," computer games were reliably classified into primary and secondary cognitive categories, and the team was able to identify problems with both the categorization of certain games and the definitions of some of the cognitive functions in our cognitive matrix. Such an approach allowed for the identification of under-populated cognitive categories in the project's existing repository of games, and for further development of the cognitive representation framework, information useful for both researchers and designers in the gaming industry.

International Journal on Data Science and Technology

In recent years, technology has enabled universities and colleges to offer web-based courses, in which teachers (or experts) design, curate and upload all course material required to teach the course online so that students can learn at their own pace, time and location. This research proposes a tutoring framework called Example Recommendation System (ERS) that is based on the example-based learning (EBL) instructional method. ERS focuses on students devoting their time and cognitive capacity to studying worked-out examples so that they can enhance their learning and apply it to graded tasks assigned to them. ERS uses regular expression analysis to extract basic learning units (LU) (e.g. scanf is an LU in C programming) from all task solutions and worked-out examples and represents this knowledge in vector space. Then, these vectors are mined to generate a customized list of worked-out examples for each assigned task. The prime contribution of ERS's extraction module is its extendibility to new domains without requiring highly trained experts. Besides extendibility, ERS extracts LUs with 81% correctness for the domain of "Programming in C" and 95% for the domain of "Programming in Miranda". ERS's data mining model used for customization has 93% accuracy and an 88% F-score. ERS's educational impact is also evident from experiments showing that students score an average of 89% in tasks for which they use ERS's recommended worked-out examples, as opposed to an average of 73% for tasks that students attempt without ERS's assistance.

Mining product opinions with most frequent clusters of aspect terms

Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing

This paper addresses the problem of more accurately mining product aspect opinions from Twitter posts, in the presence of spam and noisy posts, by proposing an algorithm called Microblog Aspect Miner (MAM). MAM takes a three-step approach. It first classifies the microblog posts into subjective and objective posts using opinion scores of words from SentiWordNet. MAM then represents frequent nouns of subjective posts as vectors, using the Word2Vec algorithm, in such a way that nouns semantically similar to the products have similar vector values. Finally, the K-Means clustering algorithm is used to obtain the cluster of aspects relevant to the product and to separate out the noisy aspects, so that the most relevant aspects are ranked using the proposed Aspect-Product Similarity Threshold based on cosine similarities. Experiments show that this improves the accuracy of obtaining relevant aspects of products from microblog posts in comparison to existing aspect-based opinion mining (ABOM) systems such as Twitter Aspect Classifier (TAC).
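The final ranking step described above relies on cosine similarity between a product vector and candidate aspect vectors. The sketch below shows only that generic idea, namely keeping the aspects whose similarity meets a threshold and sorting them. It is not the MAM implementation: the function names, the two-dimensional vectors, and the threshold value are all assumptions made for illustration, and real word embeddings would have many more dimensions.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_aspects(product_vec, aspect_vecs, threshold):
    """Keep aspects at or above the threshold, most similar first."""
    scored = [
        (name, cosine_similarity(product_vec, vec))
        for name, vec in aspect_vecs.items()
    ]
    kept = [(name, score) for name, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for learned word embeddings.
product = (1.0, 0.0)
aspects = {"battery": (1.0, 0.0), "screen": (1.0, 1.0), "giveaway": (0.0, 1.0)}
for name, score in rank_aspects(product, aspects, threshold=0.5):
    print(name, round(score, 3))
```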

Mining Binary Data with Matrix Algebra

2015 IEEE International Conference on Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing, 2015