Sanjeera Siddiqui - Academia.edu
Papers by Sanjeera Siddiqui
Intelligent Natural Language Processing: Trends and Applications, 2017
This paper presents an innovative approach that explores the role of lexicalization in Arabic sentiment analysis. Sentiment analysis in Arabic is hindered by a lack of resources, by the language used in sentiment lexicons, by the assumption that dataset pre-processing is mandatory, and, as a major concern, by the repeated use of the same approaches. The key solutions proposed for these problems are, first, extending the lexicon to include more words, not restricted to Modern Standard Arabic; second, avoiding pre-processing of the dataset; and third, and most important, developing an Arabic sentiment analysis system using a novel rule-based approach. This approach uses heuristic rules to accurately classify tweets as positive or negative. A series of abstractions results in an end-to-end mechanism with a rule-based chaining approach. For each lexicon, the chain follows a sequence of rules, with appropriate positioning and prioritization of rules. The rules were expensive in terms of time and effort but produced strong results. The end-to-end rule-chaining approach achieved 93.9% accuracy on the baseline dataset and 85.6% accuracy on OCA, the second dataset. A further comparison with the baseline showed an increase in accuracy of 23.85%.
The British University in Dubai (BUiD), Dec 1, 2015
Named Entity Recognition, Question Answering, Information Retrieval, and Machine Translation are among the tasks that follow Natural Language Processing approaches, and Sentiment Analysis uses Natural Language Processing as one of the means of finding subjective text that indicates negative, positive, or neutral polarity. Sentiment Analysis, which combines text mining and natural language processing, has gained considerable prominence with the increased use of social media websites such as Facebook, Instagram, and Twitter. Sentiment Analysis is a growing field, yet far more research has been done on English than on Arabic. Sentiment analysis helps companies, governments, and other organizations improve their products and services based on reviews or comments. This dissertation not only describes the challenges faced by Arabic Natural Language Processing in the Sentiment Analysis task but also presents an innovative approach that explores the role of lexicalization in Arabic sentiment analysis. Sentiment analysis in Arabic is hindered by a lack of resources, by the language used in sentiment lexicons, by the assumption that dataset pre-processing is mandatory, and, as a major concern, by the repeated use of the same approaches. The key solutions proposed for these problems are, first, extending the lexicon to include more words, not restricted to Modern Standard Arabic; second, avoiding pre-processing of the dataset; and third, and most important, developing an Arabic sentiment analysis system using a novel rule-based approach. This approach uses heuristic rules that are triggered by an end-to-end mechanism for a particular word, so that tweets are accurately classified as positive or negative.
A series of abstractions results in an end-to-end rule-based chaining approach. For each lexicon, the chain follows a sequence of rules (i.e. rule A chains with rule B and, if required, rule C, and so on), with appropriate positioning and prioritization of rules. The rules were expensive in terms of time and effort but produced strong results. Experiments were conducted on two datasets. They were chosen for several reasons: they are available and have been used successfully by other researchers, they are rich and large enough to support conclusions, and they come with electronic resources such as lexicons. Two sets of four experiments were conducted. The first set used only two rules: "equal to" and "within the text". The second set used the rule-chaining mechanism. The end-to-end rule-chaining approach achieved 93.9% accuracy on one dataset, which is considered the baseline, and 85.6% accuracy on OCA, the second dataset. A further comparison with the baseline showed an increase in accuracy of 23.85%.
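The chaining idea described in the abstract can be illustrated with a minimal sketch. Only the rule names "equal to" and "within the text" come from the abstract; the lexicon entries, the token-based matching, the tie handling, and the `classify` function are assumptions made for illustration and are not the dissertation's actual rules.

```python
from typing import Callable, Optional, Sequence

# Toy lexicons: hypothetical entries, not taken from the dissertation.
POSITIVE = {"جميل", "رائع"}  # "beautiful", "wonderful"
NEGATIVE = {"سيء", "ممل"}    # "bad", "boring"

Rule = Callable[[str], Optional[str]]


def equal_to(text: str) -> Optional[str]:
    """Fire only when the whole text is exactly one lexicon entry."""
    stripped = text.strip()
    if stripped in POSITIVE:
        return "positive"
    if stripped in NEGATIVE:
        return "negative"
    return None


def within_text(text: str) -> Optional[str]:
    """Fire when lexicon entries occur as tokens; the majority polarity wins.

    Splitting on whitespace is an assumption of this sketch.
    """
    tokens = text.split()
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return None  # no match or a tie: let the next rule in the chain decide


def classify(text: str,
             chain: Sequence[Rule] = (equal_to, within_text),
             default: str = "undecided") -> str:
    """Apply rules in priority order; the first rule that fires decides."""
    for rule in chain:
        label = rule(text)
        if label is not None:
            return label
    return default
```

The order of the rules in `chain` plays the role of the positioning and prioritization mentioned in the abstract: a stricter rule is placed first and a looser one is consulted only when the stricter one does not fire.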
Social networking services such as Facebook and Twitter, and web-based hosting sites such as Flickr and YouTube, have become increasingly popular in recent years. One key feature of social media websites such as Twitter and Facebook is that they allow people worldwide to express and share their experiences, likes, and dislikes freely and directly. The opinions posted range from criticizing government officials to discussing top cricket players, commenting on top news, reviewing movies, suggesting new things, and so on. This development has driven a new field known as sentiment analysis. This emerging field has attracted a great deal of research interest; however, most current work focuses on English content, with less attention to Arabic. Arabic Sentiment Analysis focuses on datasets and dictionaries, however fewer efforts and contributions to this hinder progress in Sentiment Arab...
Lecture Notes in Computer Science, 2016
This paper compares two well-known Arabic parsers, the Stanford Parser and the Bikel parser, using the Arabic Treebank (ATB). The comparison covers model training and testing. For this purpose we built software that converts the ATB format to a grammar-structure format, converts the Arabic morphological tags to Penn tags, and evaluates the parsers' output by calculating Precision, Recall, F-Score, and Tag Accuracy. We also modified the Bikel parser to use the Penn tags in training in order to improve the Precision, Recall, F-Score, and Tag Accuracy results of the parse output.
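As a rough illustration of the evaluation measures named in this abstract, the sketch below computes precision, recall, and F-score over labeled constituent spans, plus tag accuracy over token tags. The `(label, start, end)` span representation and the function names are assumptions for this example; the abstract does not describe how the authors' software computes these scores.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Span = Tuple[str, int, int]  # (label, start, end): assumed representation


def precision_recall_f(gold: Sequence[Span],
                       predicted: Sequence[Span]) -> Tuple[float, float, float]:
    """Score predicted constituents against gold constituents."""
    # Multiset intersection counts each matching span at most as often
    # as it appears in both lists.
    matched = sum((Counter(gold) & Counter(predicted)).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


def tag_accuracy(gold_tags: List[str], predicted_tags: List[str]) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("tag sequences must have the same length")
    if not gold_tags:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)
```

For example, with gold spans `[("S", 0, 3), ("NP", 0, 1), ("VP", 1, 3)]` and predicted spans `[("S", 0, 3), ("NP", 0, 2), ("VP", 1, 3)]`, two of three spans match, so precision, recall, and F-score are all 2/3.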
The rule-based approach has successfully been used in developing many natural language processing systems. Systems that use rule-based transformations are based on a core of solid linguistic knowledge. The linguistic knowledge acquired for one natural language processing system may be reused to build knowledge required for a similar task in another system. The advantage of the rule-based approach over the corpus-based approach is clear for: 1) less-resourced languages, for which large corpora, possibly parallel or bilingual, with representative structures and entities are neither available nor easily affordable, and 2) morphologically rich languages, which even with the availability of corpora suffer from data sparseness. These advantages have motivated many researchers to fully or partially follow the rule-based approach in developing their Arabic natural language processing tools and systems. In this paper we describe our successful efforts that used the rule-based approach for different Arabic natural language processing tasks.
Advances in Intelligent Systems and Computing, 2016