Tokenization Research Papers - Academia.edu
Tokenization and blockchain in the art world. Andrey Khanov, 16:40, 19 March 2021. As a brief comment on today's "tokenization of paintings", I will cite (as a change of context) a remark by a physicist I know: "So should I stop believing in God just because the priest is a scoundrel?" It is no secret that, for most viewers, the meaning of a painting left the plane of the canvas long ago and dissolved into the space (the social scene) around the painting. This has already been understood and explained in detail (see Richard Rorty's neopragmatist aesthetics). Some artists collect symbols of their being artists: statements of principles of painterly composition, exhibitions in museums (whether the museum itself extended the invitation, a project at a museum, or a show obtained for a bribe), catalogues, articles, titles, records of sales, provenance, works held in collections, artist statements, and so on. Others, doing essentially the same thing in pursuit of self-reassurance (self-gratification), buy new tokens as licenses for their right to create. All of this is equally meaningless, yet each such "artist" (in quotation marks) believes that the symbol of a thing and the thing itself are one and the same. There are many such artists, but not all are like this. Supposedly the Moon and the finger pointing at it are the same thing. They coincide only in genuine (sincere, conceptual in the sense of a concept rather than a conception) creative work, in the process, and this requires effort over oneself; by itself, the identity does not hold. Where does the meaning of a painting reside? It is thought that the meaning of a painting finally left the plane of the canvas thanks to the gallery-driven contemporary art of the late twentieth century (global art fairs, the global art market: the immersion of the viewer's sensibility in a matrix, deliberately narrowed by the gallerist, of superficial markers of the names of his feelings about the idea of the painting) and to the crypto-technological...
Free Download here:
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3540250
Smart contracts promise a world without intermediaries. However, that promise has quickly proved elusive, including in the context of Initial Coin Offerings (ICOs), a vehicle for funding startups built on smart contracts and blockchain. Particularly as ICOs attract retail investors who are not code-literate, the question arises: is there a role for new intermediaries? This article assesses the possibility of an ICO auditor, providing a framework for understanding potential audit functions. In particular, it identifies three main roles: to translate the code for retail investors who are not code-sophisticates, to reconcile the code with promises made in other materials aimed at ICO participants, and to verify offline activity and identity where these remain important to the transactions. It then maps these functions onto emerging models.
- by Vanessa Villanueva Collao and +1
- Corporate Law, Securities Law, Tokenization, Gatekeepers
Chat applications such as Hike, WhatsApp, and Telegram have emerged in recent years to help people connect with each other across different platforms. The proposed network-based Android chat application is used for chatting with remote clients or users connected to the internet, and it does not let the user send inappropriate messages. This paper proposes a mechanism for building a professional chat application that will not permit the user to send inappropriate or improper messages to other participants, by incorporating a base-level implementation of natural language processing (NLP). Before a message is sent, the typed text is evaluated with natural language processing to find any inappropriate terms, such as vulgar words. The user can build their own dictionary containing vulgar or irrelevant terms. After the pre-processing steps of removing punctuation and numbers and converting the text to lower case, and the NLP steps of stop-word removal, stemming, tokenization, named entity recognition, and part-of-speech tagging, the system extracts keywords from the typed message. These keywords are compared with the terms in the dictionary to analyze the sentiment of the message. If the context of the message is negative, the user is not permitted to send it.
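The filtering flow described above, pre-processing, keyword extraction, and a dictionary lookup, can be sketched with NLTK. This is a minimal illustration under assumed details, not the authors' Android implementation: the blocked-term dictionary, the function name, and the example messages are placeholders.

```python
import string
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

# Requires the NLTK 'punkt' and 'stopwords' data packages.
# Hypothetical user-defined dictionary of disallowed terms.
BLOCKED_TERMS = {"idiot", "stupid"}

def is_message_allowed(message: str) -> bool:
    """Return False if the typed message contains any blocked keyword."""
    # Remove punctuation and digits, lower-case the text.
    cleaned = message.lower().translate(
        str.maketrans("", "", string.punctuation + string.digits))
    # Tokenize, drop stop words, and stem the remaining keywords.
    stemmer = PorterStemmer()
    stops = set(stopwords.words("english"))
    keywords = [stemmer.stem(tok) for tok in word_tokenize(cleaned)
                if tok not in stops]
    # Block the message if any stemmed dictionary term appears among the keywords.
    return not any(stemmer.stem(term) in keywords for term in BLOCKED_TERMS)

print(is_message_allowed("You are such an idiot!"))   # False: blocked
print(is_message_allowed("See you at the meeting."))  # True: allowed
```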
As blockchain is introduced into corporations' business models, ecosystems of companies that are international from inception are created. These cross-border ecosystems are called metaverses when they incorporate virtual reality, and therefore a greater sense of presence and embodiment, and when asset tokenization is used as a tool for exchanging value and for building their own economic and financial system. In this article, we give an overview of these ecosystems as platforms for international business and propose a new framework for business digitalization. In particular, we clarify several critical concepts such as blockchainization, tokenization, and virtualization in relation to regulation and to the new professional opportunities that this innovation generates.
On the web, the amount of operational data has been increasing exponentially over the past few decades, and data users' expectations have changed accordingly. Users expect deeper, more exact, and more detailed results. Retrieval of relevant results is always affected by how documents are stored and indexed. Various techniques have been designed to index documents, operating on the tokens identified within them. The tokenization process is primarily concerned with identifying tokens and their counts. In this paper, we propose an effective tokenization approach based on a training vector, and the results show the efficiency and effectiveness of the proposed algorithm. Tokenizing the given documents helps to satisfy users' information needs more precisely and sharply reduces the search space, and is regarded as part of information retrieval. Tokenization involves pre-processing the documents and generating their respective tokens; on the basis of these tokens, probabilistic IR computes its scores and yields a reduced search space. The number of tokens generated is the parameter used for the result analysis.
- by Vikram Chouhan and +1
- Tokenization, Stemming, Information Retrieval (IR), Indexing/Ranking
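The training-vector component of the approach above is not detailed in the abstract, but the basic tokenize, count, and index step that probabilistic IR scoring builds on can be sketched as follows; the documents, the regular expression, and the function names are illustrative assumptions, not the paper's algorithm.

```python
import re
from collections import Counter, defaultdict

def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs: dict[str, str]):
    """Return per-document token counts and an inverted index over the tokens."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    inverted = defaultdict(set)
    for doc_id, doc_counts in counts.items():
        for token in doc_counts:
            inverted[token].add(doc_id)
    return counts, inverted

docs = {"d1": "Tokenization reduces the search space.",
        "d2": "Probabilistic IR scores documents by token statistics."}
counts, inverted = build_index(docs)
print(counts["d1"])       # token counts that a scoring function would use
print(inverted["token"])  # documents containing a query term: reduced search space
```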
Mobile digital wallets have been spreading around the world. However, several factors influence the adoption and use of a mobile wallet. The purpose of this qualitative phenomenological study was to examine perceptions of mobile wallet use among users in Toronto, Canada.
Participants in this study included 17 individuals who had embraced and used a mobile wallet for different transactions. Their perceptions of the security mechanisms and their motivations for adoption allow a deeper understanding of their experiences during and after adoption. Most available research has focused mainly on users' initial adoption and use of mobile payments, whereas post-adoption usage has not been fully investigated; this research therefore tries to close that gap.
Amazon Mechanical Turk (MTurk) was used to recruit the participants, and Skype® was used to conduct the online interviews. The unified theory of acceptance and use of technology (UTAUT) model was used to describe the users' perceptions, and NVivo 12® software was used to analyze the transcribed data from the open-ended interviews. The findings identified the factors that influence the adoption and continued use of a mobile wallet.
The six themes that emerged from the analysis are Consumers' Mindset, Motivations for Adoption, Challenges in Mobile Wallet Enrollment, Physical and Mobile Wallet Comparison, Consumers' Security Perceptions, and Consumers' Perceptions of Mobile Wallet Transactions. The findings from this study may benefit consumers, device manufacturers, and mobile wallet application vendors. Future research is recommended to replicate the study using a quantitative methodology, in a different setting and with a larger sample of older adults as participants.
Asset tokenization refers to the usage of digital tokens to indicate asset ownership. These assets could range from real estate, fine art, and jewelry to smart contracts on a blockchain (to handle these ownership rights). Asset... more
The emerging field of token engineering aims to open "economy itself as design space," using blockchains and other distributed ledger and smart contract technologies in combination with game theory and behavioral economics. This paper describes the emergence of the Ethereum network as the principal site for the development of a cryptographically tokenized mode of economic life, and the reconstruction of human economic agency along game-theoretic lines in the figure of Homo cryptoeconomicus. The paper draws together questions from the economization and marketization, materiality, and human/non-human agency programs in economic sociology: token engineering represents both a new field of practice for a self-consciously performative economics and a potential point of intervention for a practice-oriented mode of social studies of finance.
Sign Language Translation (SLT) first uses a Sign Language Recognition (SLR) system to extract sign language glosses from videos. A translation system then generates spoken language translations from the sign language glosses. Though SLT has gathered interest recently, little study has been devoted to the translation system. This paper focuses on the translation system and improves performance by utilizing Transformer networks. We report a wide range of experimental results for various Transformer setups and introduce the use of Spatial-Temporal Multi-Cue (STMC) networks in an end-to-end SLT system with a Transformer. We perform experiments on RWTH-PHOENIX-Weather 2014T, a challenging SLT benchmark dataset of German sign language, and ASLG-PC12, a dataset involving American Sign Language (ASL) recently used in gloss-to-text translation. Our methodology improves on the current state of the art by over 5 and 7 BLEU-4 points, respectively, on ground truth glosses and on glosses predicted by an STMC network for the RWTH-PHOENIX-Weather 2014T dataset. On the ASLG-PC12 corpus, we report an improvement of over 16 BLEU-4 points. Our findings also demonstrate that end-to-end translation on predicted glosses provides even better performance than translation on ground truth glosses, which shows potential for further improvement in SLT by jointly training both the SLR and translation systems or by revising the gloss annotation system.
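As a rough sketch of the gloss-to-text translation component only (not the paper's exact architecture, which also uses STMC visual cues), a sequence-to-sequence Transformer over gloss and word vocabularies can be set up in PyTorch roughly as follows; the vocabulary sizes, layer counts, and toy batch are placeholders, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

# Illustrative sizes; real SLT systems use the dataset's actual vocabularies.
GLOSS_VOCAB, TEXT_VOCAB, D_MODEL = 1000, 2000, 256

class GlossToText(nn.Module):
    """Minimal gloss-to-text seq2seq model built on torch.nn.Transformer."""
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(GLOSS_VOCAB, D_MODEL)
        self.tgt_emb = nn.Embedding(TEXT_VOCAB, D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True)
        self.out = nn.Linear(D_MODEL, TEXT_VOCAB)

    def forward(self, gloss_ids, text_ids):
        # Causal mask so the decoder only attends to previously generated words.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(text_ids.size(1))
        hidden = self.transformer(self.src_emb(gloss_ids),
                                  self.tgt_emb(text_ids),
                                  tgt_mask=tgt_mask)
        return self.out(hidden)

model = GlossToText()
gloss = torch.randint(0, GLOSS_VOCAB, (4, 12))  # batch of gloss sequences
text = torch.randint(0, TEXT_VOCAB, (4, 15))    # shifted target sentences
logits = model(gloss, text)                     # shape: (4, 15, TEXT_VOCAB)
print(logits.shape)
```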
Text mining is the process of extracting interesting and non-trivial knowledge or information from unstructured text data. It is a multidisciplinary field which draws on data mining, machine learning, information retrieval, computational linguistics, and statistics. Important text mining processes are information extraction, information retrieval, natural language processing, text classification, content analysis, and text clustering. All these processes require a pre-processing step before performing their intended task. Pre-processing significantly reduces the size of the input text documents, and the actions involved in this step are sentence boundary determination, language-specific stop-word elimination, tokenization, and stemming. Among these, the most essential and important action is tokenization, which divides the textual information into individual words. Many open source tools are available for performing tokenization. The main objective of this work is to analyze the performance of seven open source tokenization tools. For this comparative analysis, we have taken Nlpdotnet Tokenizer, Mila Tokenizer, NLTK Word Tokenize, TextBlob Word Tokenize, MBSP Word Tokenize, Pattern Word Tokenize, and Word Tokenization with Python NLTK. Based on the results, we observed that the Nlpdotnet Tokenizer performs better than the other tools.
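For instance, two of the tools compared above can be exercised in a few lines; the example sentence is a placeholder, and the token lists may differ slightly between tools (e.g. in how punctuation and contractions are handled), which is exactly what such a comparison measures.

```python
from nltk.tokenize import word_tokenize   # requires the NLTK 'punkt' data package
from textblob import TextBlob             # requires the TextBlob corpora

sentence = "Tokenization splits text into individual words, doesn't it?"

# Two of the open source tokenizers compared in the paper.
print("NLTK:    ", word_tokenize(sentence))
print("TextBlob:", list(TextBlob(sentence).words))
```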
The development of information technology has been very rapid, resulting in an overflow of data. This data can be used to obtain information needed by the user. The problem is that not all information can be found easily, especially very specific information, and the same holds for information about tourism. One way to overcome this problem is to use natural language processing technology, especially a Question Answering System, which allows computers to understand the meaning of questions posed by users in natural language. This study built a simple Question Answering System application, developed with the PHP programming language and a MySQL database. The pre-processing techniques used are tokenization, part-of-speech tagging, and named entity recognition. The test results show that the application is able to answer user questions with an accuracy of 82.05%.
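The application above is written in PHP, but the three pre-processing steps it names can be illustrated in Python with NLTK; the example question and the printed summary are placeholders, not the authors' code.

```python
from nltk import word_tokenize, pos_tag, ne_chunk
# Requires the NLTK 'punkt', 'averaged_perceptron_tagger',
# 'maxent_ne_chunker', and 'words' data packages.

question = "What museums can I visit in Yogyakarta?"

tokens = word_tokenize(question)   # tokenization
tagged = pos_tag(tokens)           # part-of-speech tagging
entities = ne_chunk(tagged)        # named entity recognition

print(tagged)
# Print the entity types found (named entities are subtrees with a label).
print([subtree.label() for subtree in entities if hasattr(subtree, "label")])
```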
The problem of implementing modern technologies in the electric power industry is highly relevant worldwide. The article considers models of decentralized platforms that provide services for energy distribution and trading, along with their main advantages and disadvantages. Basic principles of tokenization were developed that allow optimization of energy systems and concentration of the crowdfunding process for the construction of new generation facilities.
Blockchain technology has brought about a deep transformation of the most diverse socioeconomic areas. In this short paper we analyze the impact of distributed ledger technology on the cultural industry, on its most traditional business models, and on its enabling function in channeling new creative expressions.
The real estate market is one of the main sectors of the Brazilian macroeconomy, directly influencing inflation indices and interest rates. However, even though the sector is of recognized importance, the reality we experience is marked by a lack of incentives and of credit available for production, aggravated by nearly insurmountable bureaucracy and very high costs attributed to investment vehicles, which end up making investment in the sector elitist and unattractive. On the other hand, we are currently living through a revolution based on blockchain technology, which, in general terms, consists of a distributed and decentralized database, similar to a ledger, linked to a peer-to-peer network and thus eliminating the need for the interference of intermediaries. Blockchain thus assures its adopters of transparency, immutability, simplification, agility, and savings in their processes, and, for the real estate market specifically, it may represent an interesting promise of simplification and access to new resources. This article therefore aims to evaluate fundraising for the development of a real estate project: a prototype commercial building intended for lease will be modeled in order to structure production financing through a security token offering, tied to self-executing conditions in smart contracts and with returns derived from the income generated by operating the asset. All the steps and costs necessary to structure the offering of the crypto-asset to the market will be considered, from the preparation of its whitepaper to the analysis of the quality of the investment, both for the investor and for the real estate developer. The aim is to evaluate the feasibility and optimization of the funding structure using blockchain technology, as well as to gauge the acceptance of this investment modality by the target audience through a directed survey. Keywords: Blockchain, Income-Generating Real Estate Projects, Investment Quality Analysis, Financing Instruments, Smart Contracts.
Peer review is a necessary and essential quality control step for scientific publications but lacks proper incentives. Indeed, the process, which is very costly in terms of time and intellectual investment, is not remunerated by the journals, nor is it openly recognized by the academic community as a relevant scientific output for a researcher. As a result, scientific dissemination suffers in timeliness, quality, and fairness. Here, to address this issue, we propose a blockchain-based incentive system that rewards scientists for peer reviewing other scientists' work and that builds up trust and reputation. We designed a privacy-oriented protocol of smart contracts called Ants-Review that allows authors to issue a bounty for open anonymous peer reviews on Ethereum. If the requirements are met, peer reviews are accepted and paid by the approver in proportion to their assessed quality. To promote ethical behaviour and inclusiveness, the system implements a gamified mechanism that allows the whole community to evaluate the peer reviews and vote for the best ones.
Sentence word segmentation and Part-Of-Speech (POS) tagging are common preprocessing tasks for many Natural Language Processing (NLP) applications. This paper presents a practical application for POS tagging and segmentation disambiguation using ...
This paper presents several proposals for the regulation of collaborative accommodation, treating the social distancing required by COVID-19 as an opportunity. To that end, it draws on existing solutions in European and United States comparative law, together with practical applications of new technologies. In this regard, it examines deregulation in Spain and the novel proposals that have emerged in Andalusia and Catalonia, alongside self-regulation through the reputational mechanisms of P2P platforms. Finally, several emerging technologies are applied to collaborative accommodation, such as blockchain technology, tokenization, and proptech, in particular the Internet of Things.
We present a novel method (“waste”) for the segmentation of text into tokens and sentences. Our approach makes use of a Hidden Markov Model for the detection of segment boundaries. Model parameters can be estimated from pre-segmented text which is widely available in the form of treebanks or aligned multi-lingual corpora. We formally define the waste boundary detection model and evaluate the system’s performance on corpora from various languages as well as a small corpus of computer-mediated communication.
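A toy illustration of HMM-based boundary detection (not the WASTE model itself): two hidden states mark whether a token ends a sentence, and Viterbi decoding picks the most probable state sequence. The hand-set probabilities and coarse observation classes are assumptions standing in for parameters that the paper estimates from pre-segmented corpora such as treebanks.

```python
import math

# Two hidden states per token: does this token end a sentence?
STATES = ["internal", "final"]
OBS = {"word": 0, "period": 1, "other": 2}   # coarse observation classes

# Hand-set log-probabilities; the paper estimates these from pre-segmented text.
start = {"internal": math.log(0.9), "final": math.log(0.1)}
trans = {"internal": {"internal": math.log(0.8), "final": math.log(0.2)},
         "final":    {"internal": math.log(0.9), "final": math.log(0.1)}}
emit = {"internal": [math.log(0.90), math.log(0.02), math.log(0.08)],
        "final":    [math.log(0.10), math.log(0.85), math.log(0.05)]}

def viterbi(observations):
    """Return the most probable state sequence for the observation classes."""
    path = {s: ([s], start[s] + emit[s][observations[0]]) for s in STATES}
    for obs in observations[1:]:
        path = {s: max(((prev_path + [s], score + trans[prev][s] + emit[s][obs])
                        for prev, (prev_path, score) in path.items()),
                       key=lambda x: x[1])
                for s in STATES}
    return max(path.values(), key=lambda x: x[1])[0]

tokens = ["The", "cat", "sat", ".", "It", "slept", "."]
classes = [OBS["period"] if t == "." else OBS["word"] for t in tokens]
print(list(zip(tokens, viterbi(classes))))   # periods decoded as sentence-final
```

In the toy version the observation classes are crude, so abbreviations would be mis-segmented; richer features and corpus-estimated parameters are what make the approach described in the abstract practical.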
The President of the United States, Donald Trump, signed an Executive Order December 11, 2019 allegedly to reduce discrimination toward Jewish people, particularly on college and university campuses. A primary condition of Liberation is the Freedom to define oneself; BUT, President Trump’s order defines Judaism as a “race,” a “color,” and a “national origin”/nationality, and not just a religion. Trump has demonstrated throughout his lifetime that he uses Jews, and by so doing, has exposed his hateful antisemitic bigotry by attempting to weaponize Jewish bodies!
Email has become an important means of electronic communication, but the viability of its usage is marred by Unsolicited Bulk Email (UBE) messages. UBE poses technical and socio-economic challenges to the usage of email. Besides, the definition and understanding of UBE differ from one person to another. To meet these challenges and combat this menace, we need to understand UBE. Towards this end, this paper proposes a classifier for UBE documents. Technically, this is an application of unstructured document classification using text content analysis, and we approach it with a supervised machine learning technique. Our experiments show that the success rate of the proposed classifier is 98.50%. This is the first formal attempt to provide a novel tool for UBE classification, and the empirical results show that the tool is strong enough to be implemented in the real world.
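A minimal sketch of the kind of supervised text-content classifier described above, using scikit-learn; the tiny corpus, the TF-IDF features, and the naive Bayes learner are illustrative assumptions rather than the paper's exact setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus; real UBE corpora contain thousands of messages.
emails = ["Win a free prize now, click here",
          "Meeting moved to 3pm, see agenda attached",
          "Cheap loans, limited offer, act now",
          "Please review the draft report before Friday"]
labels = ["ube", "ham", "ube", "ham"]

# Text-content features plus a supervised learner.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(emails, labels)
print(model.predict(["Claim your free offer now"]))   # likely 'ube'
```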
Technological innovations are creating new products, services, and markets that satisfy enduring consumer needs. These technological innovations create value for consumers and firms in many ways, but they also disrupt psychological ownership-the feeling that a thing is "MINE." The authors describe two key dimensions of this technology-driven evolution of consumption pertaining to psychological ownership: (1) replacing legal ownership of private goods with legal access rights to goods and services owned and used by others and (2) replacing "solid" material goods with "liquid" experiential goods. They propose that these consumption changes can have three effects on psychological ownership: they can threaten it, cause it to transfer to other targets, and create new opportunities to preserve it. These changes and their effects are organized in a framework and examined across three macro trends in marketing: (1) growth of the sharing economy, (2) digitization of goods and services, and (3) expansion of personal data. This psychological ownership framework generates future research opportunities and actionable marketing strategies for firms aiming to preserve the positive consequences of psychological ownership and navigate cases for which it is a liability.
Abstract: This article describes the ERIAL system, developed in the framework of the project of the same name, for information retrieval. After an initial external description of the project (Section 1), the modular LEIRA environment developed for this purpose is presented (Section 2), and then (Section 3) both the linguistic resources integrated into the computationally distinct modules of this environment and the modules themselves are described in detail. In Section 4 ...
Digital reviews now play a critical role in strengthening global consumer communications and influencing consumer purchasing patterns. Consumers can use e-commerce giants such as Amazon, Flipkart, and others to share their experiences and provide real insights about a product's performance to future buyers. Classifying reviews into positive and negative sentiment is required in order to derive relevant insights from a large set of reviews. Sentiment analysis is the computational extraction of subjective information from text. This contribution discusses the procedures involved in sentiment analysis pre-processing. Gain-ratio-based stemming, which leverages content sensitivity for the stemming step, is used in this pre-processing method. The review dataset is pre-processed using steps such as sentence segmentation, word tokenization, gain-ratio-based stemming, and lemmatization. The proposed optimization-based feature extraction and selection approach improves sentiment classification accuracy on the product review dataset. For feature extraction, Principal Component Analysis and Linear Discriminant Analysis are used, and optimization techniques such as Cuckoo Search Optimization (CSO) and the Genetic Algorithm (GA) are combined to find the predominant features for improving classification of the Amazon product review dataset. The proposed pre-processing and feature selection strategy was evaluated using three classifiers: Artificial Neural Network (ANN), Random Forest (RF), and Support Vector Machine (SVM).
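The PCA-plus-SVM portion of such a pipeline can be sketched as follows; the toy reviews are placeholders, and the gain-ratio stemming, LDA, and CSO/GA feature selection steps described above are omitted from this sketch.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

reviews = ["Battery life is excellent and the screen is bright",
           "Stopped working after a week, very disappointed",
           "Great value for money, highly recommended",
           "Terrible build quality and poor customer support"]
labels = [1, 0, 1, 0]   # 1 = positive, 0 = negative sentiment

# Vectorize the (pre-processed) reviews, reduce dimensionality with PCA,
# then train a linear SVM on the reduced features.
features = TfidfVectorizer().fit_transform(reviews).toarray()
reduced = PCA(n_components=2).fit_transform(features)
clf = SVC(kernel="linear").fit(reduced, labels)
print(clf.predict(reduced))   # predictions on the toy training reviews
```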
In 2019 I wrote my text La lucha contra el huachicol digital, which looked at the attacks on social media subscribers who earned money through those networks without paying them fees. This model, which I identified as digital huachicoleo, seemed to me the only way to fight against the labor exploitation we face today, a situation that remained constant until the popularization of NFTs.
The Internet is probably the most successful distributed computing system ever. However, our capabilities for querying and manipulating data on the internet are rudimentary at best. User expectations have grown over time along with the increasing amount of operational data over the past few decades. Data users expect deeper, more exact, and more detailed results. Result retrieval for a user query is always relative to the pattern of data storage and indexing. In information retrieval systems, tokenization is an integral part whose prime objective is to identify tokens and their counts. In this paper, we propose an effective tokenization approach based on a training vector, and the results show the efficiency and effectiveness of the proposed algorithm. Tokenizing documents helps to satisfy users' information needs more precisely and sharply reduces the search space, and is regarded as part of information retrieval. Pre-processing of the input document is an integral part of tokenization, which involves pre-processing the documents and generating their respective tokens; on the basis of these tokens, probabilistic IR computes its scores and yields a reduced search space. The comparative analysis is based on two parameters: the number of tokens generated and the pre-processing time.
In recent years, there has been an exponential growth in the number of complex documents and texts, which requires a deeper understanding of machine learning methods in order to classify texts accurately in many applications. Understanding rapidly growing short text is extremely important. Short text differs from traditional documents in its length. With the recent explosive growth of e-commerce and online communication, a new genre of text, short text, has been extensively applied in many areas, and numerous studies specialize in short text mining. Classifying short text is a challenge because of its natural characteristics, such as sparseness, large scale, immediacy, and non-standardization. With the rapid development of the web, web users and web services are generating more and more short text, including tweets, search snippets, product reviews, and so on. There is an urgent demand to understand short text: for instance, a good understanding of tweets can help advertisers place relevant advertisements alongside them, which generates revenue without hurting the user experience. Short text classification is one of the important tasks in natural language processing (NLP). Unlike paragraphs or documents, short texts are more ambiguous: they do not have enough contextual information, which poses a challenge for classification. We retrieve knowledge from an external knowledge source to reinforce the semantic representation of short texts, taking conceptual information as a kind of knowledge and incorporating it into deep neural networks. Here we study the different methods available for text classification and categorization.
We consider a set of natural language processing techniques based on finite-state technology that can be used to analyze huge amounts of text. These techniques include an advanced tokenizer, a part-of-speech tagger that can manage ambiguous streams of words, a system for conflating words by means of derivational mechanisms, and a shallow parser to extract syntactic-dependency pairs. We propose to use these techniques to improve the performance of standard indexing engines.
Structured Query Language (SQL) injection is a code injection technique that exploits security vulnerabilities occurring in the database layer of web applications [8]. According to the Open Web Application Security Project (OWASP), SQL injection is one of the top 10 web-based attacks [10]. This paper presents the basics of the SQL injection attack and the types of SQL injection attacks according to their classification. It also surveys different SQL injection attack detection and prevention techniques and concludes with a comparison of them. Vishal Andodariya, "SQL Injection Attack Detection and Prevention Techniques to Secure Web-Site", published in the International Journal of Trend in Scientific Research and Development (IJTSRD), ISSN 2456-6470, Volume 2, Issue 4, June 2018, URL: http://www.ijtsrd.com/papers/ijtsrd13034.pdf
In the software development process, copying an existing code fragment and pasting it with or without modification is a frequent practice. A clone is a copy or duplicate of an original fragment. Software clone detection is vital to reduce software maintenance cost and to understand the software better. There are many software code clone detection techniques, such as text-based, token-based, and abstract syntax tree-based approaches, and they are used to identify and detect the existence of clones in a software system. One approach is a token-based comparison of two programs to detect code clones. Our technique uses the token-based approach with a hashing algorithm, analyzing two source codes in the process of finding clones between them, and the percentage of cloning is reported as the result.
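A minimal sketch of token-based clone detection with hashing, under assumed details (the tokenizer, window size, and similarity measure are illustrative choices, not the authors' algorithm): sliding windows of tokens are hashed into fingerprints and the overlap between two sources is reported as a cloning percentage.

```python
import hashlib
import re

def tokens(source: str) -> list[str]:
    """Very rough lexer: identifiers, numbers, and single punctuation marks."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)

def fingerprints(source: str, window: int = 4) -> set[str]:
    """Hash every sliding window of tokens to build a fingerprint set."""
    toks = tokens(source)
    return {hashlib.md5(" ".join(toks[i:i + window]).encode()).hexdigest()
            for i in range(max(1, len(toks) - window + 1))}

def clone_percentage(src_a: str, src_b: str) -> float:
    """Share of token-window fingerprints from A that also occur in B."""
    fp_a, fp_b = fingerprints(src_a), fingerprints(src_b)
    return 100.0 * len(fp_a & fp_b) / len(fp_a)

a = "total = 0\nfor x in items:\n    total += x\nprint(total)"
b = "total = 0\nfor x in items:\n    total += x\nprint('sum', total)"
print(f"{clone_percentage(a, b):.1f}% of token windows are shared")
```

In this sketch, clones with renamed identifiers would score low; catching them would require normalizing identifier tokens before hashing.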
What is TOKENIZATION? Tokenization is one of the most disruptive and transformative features of the blockchain era, enabling greater market efficiency and letting people trade the value of goods according to supply and demand in an immediate and more transparent way. Tokenization itself is the representation and transformation of any object or asset on a blockchain. To do this, the physical characteristics, information, and value of the object are digitized on a blockchain so that, once registered, it can be stored, traded, exchanged, and so on in the real world. Any real-world object or activity can be tokenized, from a live show to mass-produced products, works of art, intellectual property, physical property, patents, and so on. Tokenization also adds value to production and financial chains in terms of speed, cost, transparency, and the immediacy of information. But what is a token? A token is among the most basic creations of the blockchain world, embodying the characteristics of the technology itself, such as security, transparency, immediate exchange, and incorruptibility, along with many others still being developed or discovered. Loosely defined, tokens are objects similar to coins, without the physical form, that have no legal tender value but are designed to acquire that value within the private ecosystem that uses and creates them, like Bitcoin, which, without having intrinsic value, is valued very highly among its ever-growing number of users. History offers examples of tokens: casino chips, or the coins that landowners gave their workers, which were exchanged in