Automated Detection of Reference Structures in Law

Reference Extraction and Resolution for Legal Texts

Lecture Notes in Computer Science, 2005

An application of information extraction to the legal domain is presented. Its goal is to automate the extraction of references from legal documents, their resolution, and the storage of the resulting information, so that services offered in digital libraries can process these information items automatically. References are extracted by matching the texts in the collection against sets of patterns, using grammars.
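
The pattern-matching step described in this abstract can be illustrated with a minimal sketch. It uses a single regular expression in place of the paper's grammars, and the citation form ("Article N of Law N/YYYY") as well as the names `Reference`, `REFERENCE_PATTERN`, and `extract_references` are assumptions made for illustration, not taken from the paper.

```python
import re
from typing import List, NamedTuple, Tuple


class Reference(NamedTuple):
    """One extracted reference plus its character span in the source text."""
    article: str
    law_number: str
    year: str
    span: Tuple[int, int]


# Hypothetical citation form, e.g. "Article 12 of Law 30/1992".
# A real system would use a grammar covering many more variants.
REFERENCE_PATTERN = re.compile(
    r"Article\s+(?P<article>\d+)\s+of\s+Law\s+(?P<number>\d+)/(?P<year>\d{4})"
)


def extract_references(text: str) -> List[Reference]:
    """Return every match of the citation pattern, in document order."""
    return [
        Reference(m.group("article"), m.group("number"), m.group("year"), m.span())
        for m in REFERENCE_PATTERN.finditer(text)
    ]


sample = "As stated in Article 12 of Law 30/1992 and Article 3 of Law 15/1999."
references = extract_references(sample)
```

Keeping the character span alongside the parsed fields is one way to let a later resolution step turn the matched text into a hyperlink without re-scanning the document.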

Dealing with automatic reference extraction in the legal domain digital libraries

Keywords: references, information extraction, legal texts. Abstract: We present an application of information extraction to the legal domain. The goal is to automate the extraction of references from legal documents (through content analysis). Information about the extracted references is stored and used by services offered in digital libraries. The work ranges from analysis of the legal domain to software implementation, together with some experiments. This work is carried out in collaboration with legal experts.

Solon: A Holistic Approach for Modelling, Managing and Mining Legal Sources

Algorithms

Recently there has been an exponential growth in the number of publicly available legal resources. Portals that allow users to search legal documents through keyword queries are now widespread. However, legal documents are mainly stored and offered in different sources and formats that do not facilitate semantic, machine-readable techniques, making it difficult for legal stakeholders to acquire, modify or interlink legal knowledge. In this paper, we describe Solon, a legal document management platform. It offers advanced modelling, managing and mining functions over legal sources, so as to facilitate access to legal knowledge. It utilizes a novel method for extracting semantic representations of legal sources from unstructured formats, such as PDF and HTML text files, interlinking and enhancing them with classification features. At the same time, utilizing the structure and specific features of legal sources, it provides refined search results. Finally, it allows users to connect a...

Linking European Case Law: BO-ECLI Parser, an Open Framework for the Automatic Extraction of Legal Links

International Conference on Legal Knowledge and Information Systems, 2017

In this paper we present the BO-ECLI Parser, an open framework for the extraction of legal references from case law issued by judicial authorities of European Member States. The problem of automatic legal link extraction from texts is tackled for multiple languages and jurisdictions by providing a common stack which is customizable through pluggable extensions in order to cover the linguistic diversity and specific peculiarities of national legal citation practices. The aim is to increase the availability in the public domain of machine-readable reference metadata for case law by sharing common services, a guided methodology and efficient solutions to recurrent problems in legal reference extraction that reduce the effort needed by national data providers to develop their own extraction solution. Keywords: natural language processing, legal references, case law databases, linked open data. Footnote 1: Council conclusions inviting the introduction of the European Case Law Identifier (ECLI) and a minimum set of uniform metadata for case law (CELEX:52011XG0429(01)).
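
The idea of a common stack customized through pluggable extensions can be sketched as a registry of per-language extractors. This is only an illustration of the design, not the BO-ECLI Parser's actual API: `register_extension`, `extract_links`, and both citation patterns are hypothetical.

```python
import re
from typing import Callable, Dict, List

Extractor = Callable[[str], List[str]]

# Registry shared by the common stack; extensions add themselves to it.
_EXTENSIONS: Dict[str, Extractor] = {}


def register_extension(language: str) -> Callable[[Extractor], Extractor]:
    """Decorator that plugs an extractor in under a language code."""
    def decorator(func: Extractor) -> Extractor:
        _EXTENSIONS[language] = func
        return func
    return decorator


@register_extension("en")
def extract_english(text: str) -> List[str]:
    # Hypothetical pattern for citations such as "Case C-123/04".
    return re.findall(r"Case\s+C-\d+/\d{2}", text)


@register_extension("it")
def extract_italian(text: str) -> List[str]:
    # Hypothetical pattern for citations such as "sentenza n. 12/2010".
    return re.findall(r"sentenza\s+n\.\s*\d+/\d{4}", text)


def extract_links(text: str, language: str) -> List[str]:
    """Common entry point: dispatch to the extension for the given language."""
    try:
        extractor = _EXTENSIONS[language]
    except KeyError:
        raise ValueError(f"no extension registered for language {language!r}") from None
    return extractor(text)
```

With this layout, supporting another jurisdiction means adding one decorated function, while callers keep using `extract_links` unchanged.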

Automatic Annotation Service: Utilizing a Named Entity Linking Tool in Legal Domain

2019

Texts referencing court decisions, statutes, and EU directives can be difficult to understand without context. It can be time-consuming and expensive to find related statutes or to learn about context-specific terminology. As a solution, we utilized an automatic annotation tool, Nelli, for extracting information and tailored it to a service that can automatically annotate legal documents to provide context to readers. The service can identify named entities and references to legal texts and link them to corresponding vocabularies and data sources by combining statistics- and rule-based named entity recognition with named entity linking. The results provide users with an enhanced reading experience, with contextual information and the possibility to access related materials such as statutes and court decisions.

Constructing a semantic network for legal content

Proceedings of the 10th …, 2005

The Dutch Tax and Customs Administration (DTCA) is one of many organizations that deal with a multitude of electronic legal data, from various sources and in different formats. In this paper, we describe the results of a study aimed at better access to these sources by having a supplier- and format-independent knowledge store that describes the sources and their interrelations in a semantic network. Furthermore, we developed parsers to automatically detect the identity of sources and typed references within the sources to other legal documents. These parsers can be used to fill and update the semantic network as new documents are added.
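
A semantic network of sources and typed references, filled incrementally as parsers process new documents, could be modelled roughly as follows. The class `SemanticNetwork`, its methods, the reference types, and the document identifiers are all assumptions made for illustration; the abstract does not describe the actual data model.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


class SemanticNetwork:
    """Directed graph whose edges are typed references between sources."""

    def __init__(self) -> None:
        # source id -> list of (reference type, target id)
        self._edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_reference(self, source: str, ref_type: str, target: str) -> None:
        """Record one typed reference detected by a parser."""
        self._edges[source].append((ref_type, target))

    def references_from(self, source: str, ref_type: Optional[str] = None) -> List[str]:
        """Targets cited by `source`, optionally filtered by reference type."""
        return [
            target
            for (kind, target) in self._edges.get(source, [])
            if ref_type is None or kind == ref_type
        ]

    def referenced_by(self, target: str) -> List[str]:
        """Sources that cite `target`, in sorted order."""
        return sorted(
            source
            for source, edges in self._edges.items()
            if any(t == target for _, t in edges)
        )


# Hypothetical identifiers and reference types.
network = SemanticNetwork()
network.add_reference("ruling-2004-17", "applies", "income-tax-act-art-3")
network.add_reference("ruling-2004-17", "cites", "ruling-2001-02")
network.add_reference("circular-2005-1", "applies", "income-tax-act-art-3")
```

Because references are stored independently of the supplier or file format of each source, adding a new document only requires calling `add_reference` for each reference its parser detects.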

Legalurn: A Framework for Organizing and Surfing Legal Documents on the Web

2005

Identifying resources is a critical issue in the vast information space of the web. Several identification systems have been defined, each tailored to a specific domain or application field, and each characterized by many limitations. In this paper we describe an identification system compliant with the URN specification that has been defined and implemented specifically for the legal domain, while providing several innovative features. The system makes it easy to manage references to juridical documents and to automate the distributed process of building hyperlinks. Moreover, the system provides a resolution service that associates a logical identifier with a physical resource (e.g. a URL), along with other facilities to ensure semantic coherence and unambiguity when uniform names are assigned. Finally, we briefly outline future work on investigating other relevant properties of the legal domain by representing laws as a directed graph.
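
The resolution service described here, which maps a logical identifier to a physical resource while keeping names unambiguous, can be sketched as a small registry. The class `UrnResolver`, the `example-law` namespace, and the URLs are hypothetical; the only rule borrowed from the URN specification is that the `urn` prefix and the namespace identifier are case-insensitive.

```python
from typing import Dict, Optional


class UrnResolver:
    """Maps normalized URNs to the URL of a physical resource."""

    def __init__(self) -> None:
        self._registry: Dict[str, str] = {}

    @staticmethod
    def normalize(urn: str) -> str:
        """Lowercase the 'urn' prefix and namespace id; keep the rest as-is."""
        scheme, nid, nss = urn.strip().split(":", 2)
        if scheme.lower() != "urn":
            raise ValueError(f"not a URN: {urn!r}")
        return f"urn:{nid.lower()}:{nss}"

    def register(self, urn: str, url: str) -> None:
        """Bind a URN to a URL, refusing conflicting bindings."""
        key = self.normalize(urn)
        if key in self._registry and self._registry[key] != url:
            raise ValueError(f"{key} is already bound to a different URL")
        self._registry[key] = url

    def resolve(self, urn: str) -> Optional[str]:
        """Return the URL bound to the URN, or None if it is unknown."""
        return self._registry.get(self.normalize(urn))


# Hypothetical namespace, identifier, and URL.
resolver = UrnResolver()
resolver.register("URN:Example-Law:act:2005;12", "https://example.org/acts/2005/12")
```

Rejecting a second, different URL for an already registered name is one simple way to enforce the unambiguity the abstract mentions; a production resolver would likely also support several locations per identifier.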

An Annotation Language for Semantic Search of Legal Sources

Language Resources and Evaluation, 2018

While formalizing legal sources is an important challenge, the generation of a formal representation from legal texts has been far less considered and requires considerable expertise. In order to improve the uniformity, richness, and efficiency of legal annotation, it is necessary to experiment with annotations and the annotation process. This paper reports on a first experiment, which was a campaign to annotate legal instruments provided by the Scottish Government's Parliamentary Counsel Office and bearing on Scottish smoking legislation and regulation. A small set of elements related to LegalRuleML was used. An initial guideline manual was produced to annotate the text using annotations related to these elements. The resulting annotated corpus is converted into a LegalRuleML XML compliant document, then made available via an online visualisation and query tool. In the course of annotating the documents, a range of important interpretive and practical issues arose, highlighting the value of a focused study on legal text annotation.

LegalVis: Exploring and Inferring Precedent Citations in Legal Documents

IEEE Transactions on Visualization and Computer Graphics, 2022

To reduce the number of pending cases and conflicting rulings in the Brazilian Judiciary, the National Congress amended the Constitution, allowing the Brazilian Supreme Court (STF) to create binding precedents (BPs), i.e., a set of understandings that both Executive and lower Judiciary branches must follow. The STF's justices frequently cite the 58 existing BPs in their decisions, and it is of primary relevance that judicial experts can identify and analyze such citations. To assist with this problem, we propose LegalVis, a web-based visual analytics system designed to support the analysis of legal documents that cite or could potentially cite a BP. We model the problem of identifying potential (i.e., non-explicit) citations as a classification problem. However, a simple score is not enough to explain the results, so we use a machine learning interpretability method to explain the reason behind each identified citation. For a compelling visual exploration of documents and BPs, LegalVis comprises three interactive visual components: the first presents an overview of the data showing temporal patterns, the second allows filtering and grouping relevant documents by topic, and the last one shows a document's text, helping to interpret the model's output by pointing out which paragraphs are likely to mention the BP, even if not explicitly specified. We evaluated our identification model and obtained an accuracy of 96%; we also made a quantitative and qualitative analysis of the results. The usefulness and effectiveness of LegalVis were evaluated through two usage scenarios and feedback from six domain experts.

Annotating legal documents with GaiusT 2.0

International Journal of Metadata, Semantics and Ontologies, 2017

We present the GaiusT 2.0 framework for annotating legal documents. The framework was designed and implemented as a web-based system to semi-automate the extraction of legal concepts from text. In requirements analysis these concepts can be used to identify requirements a software system has to fulfil to comply with a law or regulation. The analysis and annotation of legal documents in prescriptive natural language is still an open problem for research in the field. In GaiusT 2.0, a multistep process exploits a number of linguistic and technological resources to offer a comprehensive annotation environment. The modules of the system are presented as evolutions from corresponding modules of the original GaiusT framework, which in turn was based on a general-purpose annotation tool, Cerno. The application of GaiusT 2.0 is illustrated with two use cases, to demonstrate the extraction process and its adaptability to different law models.