Yoshinobu Kano | Shizuoka University
Papers by Yoshinobu Kano
Lecture Notes in Computer Science, 2023
Biocomputing 2008 - Proceedings of the Pacific Symposium, 2007
Recently, several text mining programs have reached a near-practical level of performance. Some systems are already being used by biologists and database curators. However, it has also been recognized that current Natural Language Processing (NLP) and Text Mining (TM) technology is not easy to deploy, since research groups tend to develop systems that cater specifically to their own requirements. One of the major reasons for the difficulty of deployment of NLP/TM technology is that re-usability and interoperability of software tools are typically not considered during development. While some effort has been invested in making interoperable NLP/TM toolkits, the developers of end-to-end systems still often struggle to reuse NLP/TM tools, and often opt to develop similar programs from scratch instead. This is particularly the case in BioNLP, since the requirements of biologists are so diverse that NLP tools have to be adapted and reorganized in a much more extensive manner than was originally expected. Although generic frameworks like UIMA (Unstructured Information Management Architecture) provide promising ways to solve this problem, the solution that they provide is only partial. In order for truly interoperable toolkits to become a reality, we also need sharable type systems and a developer-friendly environment for software integration that includes functionality for systematic comparisons of available tools, a simple I/O interface, and visualization tools. In this paper, we describe such an environment that was developed based on UIMA, and we show its feasibility through our experience in developing a protein-protein interaction (PPI) extraction system.
Proceedings of the 2022 International Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia
In order to deal with the variety of meanings and contexts of words, we created a Japanese Situation-dependent Sentiment Polarity Dictionary (SiSP) of sentiment values labeled for 20 different situations. The dictionary covers 25,520 Japanese words annotated by crowdworkers, with 10 responses for each situation of each word. Using our SiSP, we predicted the polarity of each word in the dictionary and of dictionary words in sentences, taking the context into account. In both experiments, situation-dependent prediction gave superior results in determining sentiment polarity.
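The dictionary structure described above lends itself to a simple lookup sketch. This is purely illustrative: the entry, the situation names, and the annotator values below are invented (the real SiSP covers 25,520 words with 10 annotator responses for each of 20 situations).

```python
from statistics import mean

# Hypothetical SiSP-style entry: word -> situation -> 10 annotator responses
# (negative, neutral, positive encoded as -1, 0, 1).
sisp = {
    "cold": {
        "weather": [-1, -1, 0, -1, -1, 0, -1, -1, -1, 0],
        "drink":   [1, 1, 1, 0, 1, 1, 1, 0, 1, 1],
    },
}

def polarity(word, situation):
    """Mean of annotator responses for a word in a given situation,
    or None when the word/situation pair is not in the dictionary."""
    responses = sisp.get(word, {}).get(situation)
    return mean(responses) if responses else None
```

The same word flips polarity with the situation: "cold" is negative for weather but positive for a drink, which is exactly the ambiguity a situation-independent dictionary cannot express.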
Transactions of the Japanese Society for Artificial Intelligence, 2022
In recent years, there has been much research on building dialogue systems with deep learning, which can generate relatively fluent responses to user utterances. Nevertheless, such systems tend to produce responses that lack diversity and are less context-dependent. Assuming that the problem is caused by the Softmax Cross-Entropy (SCE) loss, which treats all words equally without considering the imbalance in the training data, the Inverse Token Frequency (ITF) loss, which multiplies the SCE loss by a weight based on the inverse of the token frequency, was previously proposed and shown to improve dialogue diversity. However, sentence diversity depends not only on independent tokens but also on the frequencies of token sequences. Using sequence frequencies to compute weights that change dynamically with the context better captures the diversity we seek. We therefore propose the Inverse N-gram Frequency (INF) loss, which is weighted by the inverse of the n-gram frequency of the tokens instead of the token frequency. To confirm the effectiveness of the INF loss, we conducted metric-based and human evaluations of sentences generated by models trained on Japanese and English Twitter datasets. The metric-based evaluation used Perplexity, BLEU, DIST-N, ROUGE, and length as indices; the human evaluation assessed the coherence and diversity of the responses. In the metric-based evaluation, the proposed INF model achieved better Perplexity, DIST-N, and ROUGE scores than previous methods, and it was also rated superior in the human evaluation.
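The weighting idea above can be sketched in a few lines of plain Python. This is a simplified illustration of the general technique (bigram counts over the training corpus, add-one smoothing, per-token weighted cross-entropy), not the authors' implementation; the function names and smoothing choice are ours.

```python
import math
from collections import Counter

def ngram_counts(corpus, n=2):
    """Count n-grams over a tokenized corpus (a list of token lists)."""
    counts = Counter()
    for tokens in corpus:
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def inf_weight(context, token, counts, n=2, smooth=1.0):
    """Weight a target token by the inverse frequency of the n-gram it
    forms with its preceding context; add-one smoothing avoids division
    by zero for unseen n-grams."""
    gram = tuple(context[-(n - 1):]) + (token,)
    return 1.0 / (counts.get(gram, 0) + smooth)

def inf_loss(log_probs, targets, contexts, counts):
    """Mean cross-entropy with each token's loss scaled by its INF-style
    weight; log_probs is a list of {token: log-probability} dicts."""
    total = 0.0
    for lp, tgt, ctx in zip(log_probs, targets, contexts):
        total += -lp[tgt] * inf_weight(ctx, tgt, counts)
    return total / len(targets)
```

The key contrast with ITF is visible in `inf_weight`: the same token receives a different weight depending on the preceding context, so frequent stock continuations are down-weighted while rare continuations of the same token are not.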
This paper proposes a method for neural machine translation (NMT) of Japanese text with kanji decomposition. NMT models restrict the vocabulary size, which can be addressed by applying subword, character-level, or byte-based models. For Japanese text, however, the vocabulary is not minimized even at the character level because of the variety of kanji. We report experimental results for an NMT model using Japanese text with kanji decomposition, which is expected to both decrease the vocabulary size and preserve kanji information.
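Kanji decomposition as a preprocessing step can be illustrated with a toy component table. The two decompositions shown are standard ones, but a real system would use a full radical/IDS resource; the function name and table are our own illustration, not the paper's implementation.

```python
# Tiny invented decomposition table: kanji -> component sequence.
# 明 (bright) = 日 + 月; 森 (forest) = 木 + 木 + 木.
DECOMP = {"明": "日月", "森": "木木木"}

def decompose(text):
    """Replace each kanji with its component sequence when a
    decomposition is known; leave other characters unchanged."""
    return "".join(DECOMP.get(ch, ch) for ch in text)
```

After this step, many distinct kanji share component characters, which is how the vocabulary shrinks while the component information each kanji carries is retained.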
This file is a guideline presenting classification criteria for disease tweets (written in English). The guideline was used for the NTCIR-13 MedWeb task.
Searching precedents is an error-prone and time-intensive task in the legal field. Query construction is difficult: too few search terms yield an abundance of results, while too many increase the chance of missing an important piece. This research addresses the question: how should a precedent retrieval system without manual query construction be developed? A queryless system is studied that uses entire documents as input. First, the best-performing scoring functions are tested, followed by research on the best preprocessing steps using the scoring functions from the first step. The More Like This query from Elasticsearch and TF–IDF perform better as scoring functions than doc2vec and LDA. The preprocessing research shows that basic preprocessing steps perform well and that data-specific steps could increase performance. These results are a good starting point for creating a legal search system. To research the integration of a legal information retrieval into the legal fi...
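The document-as-query idea can be sketched with a plain TF-IDF cosine ranker: the entire query document is vectorized and scored against every stored precedent, so no manual query is constructed. This is an illustrative reimplementation of the general scoring scheme, not Elasticsearch's More Like This query; the tokenized example documents in the test are invented.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (token -> weight) for tokenized docs."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs], idf

def cosine(a, b):
    """Cosine similarity of two sparse vectors."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query_doc, corpus):
    """Rank corpus documents by similarity to the whole query document.
    Returns (score, index) pairs, best first."""
    vecs, idf = tfidf_vectors(corpus)
    qvec = {t: c * idf.get(t, 0.0) for t, c in Counter(query_doc).items()}
    return sorted(((cosine(qvec, v), i) for i, v in enumerate(vecs)), reverse=True)
```

Terms absent from the corpus get zero IDF weight in the query vector, so a long query document naturally emphasizes the terms it shares with the precedent collection.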
The Review of Socionetwork Strategies, 2022
We summarize the 8th Competition on Legal Information Extraction and Entailment. In this edition, the competition included five tasks on case law and statute law. The case law component includes an information retrieval task (Task 1) and the confirmation of an entailment relation between an existing case and an unseen case (Task 2). The statute law component includes an information retrieval task (Task 3), an entailment/question answering task based on retrieved civil code statutes (Task 4), and an entailment/question answering task without retrieved civil code statutes (Task 5). Participation was open to any group based on any approach. Eight different teams participated in the case law competition tasks, most of them in more than one task. We received results from six teams for Task 1 (16 runs) and six teams for Task 2 (17 runs). On the statute law task, there were eight different teams participating, most in more than one task. Six teams submitted a total of 18 runs for Task 3, 6 t...
Journal of Information Processing and Management, 2017
Recently, medical records are increasingly written on electronic media instead of on paper, thereby increasing the importance of information processing in medical fields. We have organized an NTCIR-10 pilot task for medical records. Our pilot task, MedNLP, comprises three tasks: (1) de-identification, (2) complaint and diagnosis, and (3) free. These tasks represent elemental technologies used to develop computational systems supporting widely diverse medical services. Development has yielded 22 systems for task (1), 15 systems for task (2), and 1 system for task (3). This report presents results of these systems, with discussion clarifying the issues to be resolved in medical NLP fields. Keywords: medical records, electronic health records (EHR), de-identification, named entity recognition (NER), shared task and evaluation.
To understand the key characteristics of NLP tools, evaluation and comparison against different tools is important. As NLP applications tend to consist of multiple semi-independent sub-components, it is not always enough to evaluate complete systems; a fine-grained evaluation of the underlying components is also often worthwhile. Standardization of NLP components and resources is significant not only for reusability, but also because it allows individual components to be compared in terms of reliability and robustness across a wider range of target domains. Because many evaluation metrics exist even within a single domain, any system seeking to aid inter-domain evaluation needs not just predefined metrics, but must also support pluggable user-defined metrics. Such a system would need to be based on an open standard to allow a large number of components to be compared, and would ideally include visualization of the differences between components. We have developed a pluggable...
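The pluggable user-defined metrics mentioned above could follow a simple registry pattern. This is a minimal sketch of the idea, not the actual U-Compare API; the decorator, registry, and metric names are hypothetical.

```python
# Registry mapping metric names to scoring functions; user-defined
# metrics plug in via the decorator without modifying the framework.
METRICS = {}

def metric(name):
    """Decorator that registers a metric function under a name."""
    def register(fn):
        METRICS[name] = fn
        return fn
    return register

@metric("exact_match")
def exact_match(gold, pred):
    """Fraction of positions where prediction equals the gold label."""
    hits = sum(g == p for g, p in zip(gold, pred))
    return hits / len(gold) if gold else 0.0

def evaluate(name, gold, pred):
    """Look up a registered metric by name and apply it."""
    return METRICS[name](gold, pred)
```

A framework built this way ships a few predefined metrics but lets any user register a domain-specific one under a new name, which is the "pluggable" property the passage calls for.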
Journal of Natural Language Processing, 2016
The Todai Robot Project aims at integrating various AI technologies including natural language processing (NLP), as well as uncovering novel AI problems that have been overlooked amid the fragmentation of the research field, through the development
New Frontiers in Artificial Intelligence
EPiC Series in Computing
A central issue in yes/no question answering is how to use a knowledge source given a question. While yes/no question answering has been studied for a long time, legal yes/no question answering differs greatly from other domains. Its most distinguishing characteristic is that legal issues require precise linguistic analysis of predicates, case-roles, conditions, and so on. We have developed a yes/no question answering system for the legal domain. Our system uses linguistic analysis to find correspondences between the predicates and arguments of problem sentences and knowledge source sentences. We applied our system to the COLIEE (Competition on Legal Information Extraction/Entailment) 2017 task. Our team shared second place in the COLIEE 2017 Phase Two task, which asks systems to answer yes or no given a problem sentence. This result shows that precise linguistic analyses are effective even without the big data approach with machine learning, rather better in its a...
2017 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)
“Are you a werewolf?” is one of the most popular communication games and is played globally. The AIWolf Project developed an agent, named “the AIWolf,” that can play “Are you a werewolf?”. An AIWolf utters its thoughts using the AIWolf Protocol. As it is difficult for humans to understand the AIWolf Protocol, translation into natural language is required when human players are involved. However, the conventional translation method works word-to-word, creating the impression that the utterances have been generated by a machine. This study aimed to make the utterances of the AIWolf sound more human. The authors set the target that a human player would be unable to distinguish human speech from that generated by AIWolves (the Turing test), and define this condition as the maximum value of humanity. The output of the translated AIWolf Protocol was paraphrased using data from Werewolf BBS logs. This study considers making the utterances of the AIWolf sound more human using Werewolf BBS logs and a possibility assignment equation with fuzzy sets. An experiment was conducted to confirm whether paraphrasing the utterances of the AIWolf using Werewolf BBS logs yields useful human-like speech. The experimental method was shown to produce slightly more human-like speech than the conventional method.
We present an overview of two tasks (Tasks 3 and 4) of COLIEE-2018 using the Japanese statute law (civil law) and its English translation. Task 3 is the task of retrieving articles relevant to deciding the appropriateness of a legal question, and Task 4 is the task of determining whether a legal question is entailed (correct) or not. There were 17 run submissions from eight teams (including two from the organizers) for Task 3 and seven run submissions from three teams for Task 4. We present a summary of the evaluation results of both tasks.
Language resources, including corpora and tools, normally need to be combined in order to achieve a user's specific task. However, resources tend to be developed independently in different, incompatible formats. In this paper we describe U-Compare, which consists of the U-Compare component repository and the U-Compare platform. We have been building a highly interoperable resource library, providing the world's largest ready-to-use UIMA component repository, including a wide variety of corpus readers and state-of-the-art language tools. These resources can be deployed as local services or web services, and can even be hosted on clustered machines to increase performance, while users do not need to be aware of such differences. In addition to the resource library, an integrated language processing platform is provided, allowing workflow creation, comparison, evaluation and visualization, using the resources in the library or any UIMA component, without any programming...
Electronic medical records are now often replacing paper documents, and thus the importance of information processing in medical fields has increased. We have already organized the NTCIR-10 MedNLP pilot task, the very first shared-task attempt to evaluate technologies for retrieving important information from medical reports written in Japanese, whereas the NTCIR-11 MedNLP-2 task has been designed for more advanced and practical use in the medical fields. This task consisted of three subtasks: (Task 1) extracting disease names and dates, (Task 2) adding ICD-10 codes to disease names, and (Task 3) a free task. Ten groups (24 systems) participated in Task 1, 9 groups (19 systems) in Task 2, and 2 groups (2 systems) in Task 3. This report presents the results of these groups, with discussion clarifying the issues to be resolved in medical natural language processing fields.
This paper describes an overview of the first QA Lab (Question Answering Lab for Entrance Exam) task at NTCIR-11. The goal of the QA Lab is to provide a module-based platform for advanced question answering systems and comparative evaluation for solving real-world university entrance exam questions. In this task, “world history” questions were selected from The National Center Test for University Admissions and from the secondary exams at five universities in Japan. This paper also describes the data used, baseline systems, formal run results, the characteristic aspects of the participating groups' systems, and their contributions.