HTML browsing and curation (indra.assemblers.html.assembler) — INDRA 1.22.0 documentation

Format a set of INDRA Statements into an HTML-formatted report which also supports curation.

indra.assemblers.html.assembler.DB_TEXT_COLOR = 'black'

The text color for database sources when shown as source count badges

indra.assemblers.html.assembler.READER_TEXT_COLOR = 'white'

The text color for reader sources when shown as source count badges

indra.assemblers.html.assembler.generate_source_css(fname, source_colors=None)[source]

Save a stylesheet defining color, background-color for the given sources

Parameters

fname (str) – Where to save the stylesheet
source_colors (Optional[List[Tuple[str, Dict[str, Union[str, Dict[str, str]]]]]]) – Colors defining the styles. Default: DEFAULT_SOURCE_COLORS.

class indra.assemblers.html.assembler.HtmlAssembler(statements=None, summary_metadata=None, ev_counts=None, beliefs=None, source_counts=None, curation_dict=None, title='INDRA Results', db_rest_url=None, sort_by='default', custom_stats=None, custom_sources=None)[source]

Generates an HTML-formatted report from INDRA Statements.

The HTML report format includes statements formatted in English (by the EnglishAssembler), text and metadata for the Evidence object associated with each Statement, and a JavaScript-based curation interface linked to the INDRA database (access permitting). The interface allows for curation of statements at the evidence level by letting the user specify the type of error and (optionally) provide a short description of the error.

Parameters

statements (Optional[list[indra.statements.Statement]]) – A list of INDRA Statements to be added to the assembler. Statements can also be added using the add_statements method after the assembler has been instantiated.
summary_metadata (Optional[dict]) – Dictionary of statement corpus metadata such as that provided by the INDRA REST API. Default is None. Each value should be a concise, constant-size (O(1)) summary, such as the evidence totals, rather than something that grows with the length of the statement list. The keys should be informative human-readable strings. This information is displayed as a tooltip when hovering over the page title.
ev_counts (Optional[dict]) – A dictionary of the total evidence available for each statement, indexed by hash. If not provided, the statements that are passed to the constructor are used to determine these, with whatever evidences these statements carry.
beliefs (Optional[dict]) – A dictionary of the belief of each statement, indexed by hash. If not provided, the beliefs of the statements passed to the constructor are used.
source_counts (Optional[dict]) – A dictionary of the itemized evidence counts, by source, available for each statement, indexed by hash. If not provided, the statements that are passed to the constructor are used to determine these, with whatever evidences these statements carry.
title (str) – The title to be printed at the top of the page.
db_rest_url (Optional[str]) – The URL to a DB REST API to use for links out to further evidence. If given, this URL will be prepended to links that load additional evidence for a given Statement. One way to obtain this value is from the configuration entry indra.config.get_config('INDRA_DB_REST_URL'). If None, the URLs are constructed as relative links. Default: None
sort_by (str or function or None) –
If str, it indicates which parameter to sort by, such as ‘belief’, ‘ev_count’, or ‘ag_count’. Those are the default options because they can be derived from a list of statements. However, if you give a custom list of stats with the custom_stats argument, you may use any of the parameters used to build it. The default, ‘default’, is mostly a sort by ev_count but also favors statements with fewer agents.
Alternatively, you may give a function that takes a dict as its single argument, a dictionary of metrics. The contents of this dictionary always include “belief”, “ev_count”, and “ag_count”. If source_counts are given, each source will also be available as an entry (e.g. “reach” and “sparser”). As with string values, you may also add your own custom stats using the custom_stats argument.
The value may also be None, in which case the sort function will return the same value for all elements, and thus the original order of elements will be preserved. This could have strange effects when statements are grouped (i.e. when grouping_level is not ‘statement’); such functionality is untested.
custom_stats (Optional[list]) – A list of StmtStat objects containing custom statement statistics to be used in sorting of statements and statement groups.
custom_sources (SourceInfo) –
Use this if the sources in the statements are from sources other than the default ones present in indra/resources/source_info.json. The structure of the input must conform to:
{
    "source_key": {
        "name": "Source Name",
        "link": "",
        "type": "reader|database",
        "domain": "",
        "default_style": {
            "color": "",
            "background-color": ""
        }
    },
    ...
}
Here, the "color" and "background-color" values must be color names or color codes allowed in an HTML document per the CSS3 specification: https://www.w3.org/TR/css-color-3/#svg-color
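
As an illustration of the function form of sort_by described above, the following is a minimal sketch of a sort function that takes the dictionary of metrics as its single argument. The function name and the sample metric dictionaries are hypothetical stand-ins; only the keys "belief", "ev_count", and "ag_count" come from the description above, and the descending order applied with sorted() is an assumption rather than documented assembler behavior.

```python
def sort_by_belief_then_evidence(metrics):
    """Hypothetical sort function: belief first, evidence count as tie-breaker.

    `metrics` is a dict that, per the parameter description, always
    includes "belief", "ev_count", and "ag_count".
    """
    return (metrics["belief"], metrics["ev_count"])


# Stand-in metric dictionaries (illustrative values, not real INDRA output).
rows = [
    {"belief": 0.65, "ev_count": 12, "ag_count": 2},
    {"belief": 0.99, "ev_count": 3, "ag_count": 2},
    {"belief": 0.65, "ev_count": 40, "ag_count": 3},
]

# Assumption: larger sort values are shown first.
ranked = sorted(rows, key=sort_by_belief_then_evidence, reverse=True)
for row in ranked:
    print(row)
```

Based on the constructor signature, a function like this would be supplied as sort_by=sort_by_belief_then_evidence when instantiating the HtmlAssembler.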

statements

A list of INDRA Statements to assemble.

Type

list[indra.statements.Statement]

model

The HTML report formatted as a single string.

Type

str

metadata

Dictionary of statement list metadata such as that provided by the INDRA REST API.

Type

dict

ev_counts

A dictionary of the total evidence available for each statement indexed by hash.

Type

dict

beliefs

A dictionary of the belief score of each statement, indexed by hash.

Type

dict

db_rest_url

The URL to a DB REST API.

Type

str

add_statements(statements)[source]

Add a list of Statements to the assembler.

Parameters

statements (list[ indra.statements.Statement ]) – A list of INDRA Statements to be added to the assembler.

make_json_model(grouping_level='agent-pair', no_redundancy=False, **kwargs)[source]

Return the JSON used to create the HTML display.

Parameters

grouping_level (Optional[str]) – Statements can be grouped at three levels, ‘statement’ (ungrouped), ‘relation’ (grouped by agents and type), and ‘agent-pair’ (grouped by ordered pairs of agents). Default: ‘agent-pair’.
no_redundancy (Optional[bool]) – If True, any group of statements that was already presented under a previous heading will be skipped. This is typically the case for complexes where different permutations of complex members are presented. By setting this argument to True, these can be eliminated. Default: False

Returns

json – A complexly structured JSON dict containing grouped statements and various metadata.

Return type

dict
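
To make the three grouping levels concrete, here is a minimal sketch that groups stand-in statements by a key derived from the grouping level. It is not the assembler's implementation: the tuple representation (type, agents, detail), the function name group_statements, and the sample data are all hypothetical, and real INDRA Statements carry much more structure.

```python
from collections import defaultdict


def group_statements(stmts, grouping_level="agent-pair"):
    """Group stand-in statements; each is a (type, agents, detail) tuple."""
    groups = defaultdict(list)
    for stmt in stmts:
        stmt_type, agents, _detail = stmt
        if grouping_level == "agent-pair":
            key = agents                  # ordered pair of agents
        elif grouping_level == "relation":
            key = (agents, stmt_type)     # agents and statement type
        elif grouping_level == "statement":
            key = stmt                    # ungrouped: one group per statement
        else:
            raise ValueError(f"Unknown grouping level: {grouping_level}")
        groups[key].append(stmt)
    return dict(groups)


# Hypothetical sample data.
stmts = [
    ("Phosphorylation", ("MAP2K1", "MAPK1"), "T185"),
    ("Phosphorylation", ("MAP2K1", "MAPK1"), "Y187"),
    ("Activation", ("MAP2K1", "MAPK1"), None),
    ("Activation", ("MAPK1", "MAP2K1"), None),
]

for level in ("statement", "relation", "agent-pair"):
    print(level, len(group_statements(stmts, level)))
```

Because agent pairs are ordered, the pair (MAPK1, MAP2K1) forms its own group separate from (MAP2K1, MAPK1) in this sketch.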

make_model(template=None, grouping_level='agent-pair', add_full_text_search_link=False, no_redundancy=False, **template_kwargs)[source]

Return the assembled HTML content as a string.

Parameters

template (a Template object) – Manually pass a Jinja template to be used in generating the HTML. The template is essentially responsible for rendering the output of make_json_model.
grouping_level (Optional[str]) – Statements can be grouped under sub-headings at three levels, ‘statement’ (ungrouped), ‘relation’ (grouped by agents and type), and ‘agent-pair’ (grouped by ordered pairs of agents). Default: ‘agent-pair’.
add_full_text_search_link (bool) – If True, a link with a text fragment search in the PMC journal will be added for the statements.
no_redundancy (Optional[bool]) –
If True, any group of statements that was already presented under a previous heading will be skipped. This is typically the case for complexes where different permutations of complex members are presented. By setting this argument to True, these can be eliminated. Default: False
All other keyword arguments are passed along to the template. If you are using a custom template with args that are not passed below, this is how you pass them.

Returns

The assembled HTML as a string.

Return type

str
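
The effect of no_redundancy can be pictured with a small sketch: a group is skipped when the same set of statements was already presented under an earlier heading, as happens when a complex is listed under each permutation of its members. This is only an illustration under stated assumptions; the function name, the (heading, statement hashes) representation, and the integer hashes are hypothetical and do not reflect the assembler's internal data structures.

```python
def drop_redundant_groups(groups):
    """Keep only the first group presenting a given set of statement hashes.

    `groups` is a list of (heading, stmt_hashes) pairs in display order.
    """
    seen = set()
    kept = []
    for heading, stmt_hashes in groups:
        key = frozenset(stmt_hashes)
        if key in seen:
            continue  # already shown under a previous heading
        seen.add(key)
        kept.append((heading, stmt_hashes))
    return kept


# Hypothetical example: one complex (hash 101) appears under both orderings.
groups = [
    (("BRAF", "RAF1"), [101]),
    (("RAF1", "BRAF"), [101]),
    (("MAP2K1", "MAPK1"), [202, 203]),
]
kept = drop_redundant_groups(groups)
print([heading for heading, _ in kept])
```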

append_warning(msg)[source]

Append a warning message to the model to expose issues.

save_model(fname, **kwargs)[source]

Save the assembled HTML into a file.

Other kwargs are passed directly to make_model.

Parameters

fname (str) – The path to the file to save the HTML into.

indra.assemblers.html.assembler.src_url(ev)[source]

Given an Evidence object, provide the URL for the source

Return type

str

indra.assemblers.html.assembler.tag_text(text, tag_info_list)[source]

Apply start/end tags to spans of the given text.

Parameters

text (str) – Text to be tagged
tag_info_list (list of tuples) – Each tuple refers to a span of the given text. Fields are (start_ix, end_ix, substring, start_tag, close_tag), where substring, start_tag, and close_tag are strings. If any of the given spans of text overlap, the longest span is used.

Returns

String where the specified substrings have been surrounded by the given start and close tags.

Return type

str
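
The tagging behavior described for tag_text can be sketched in plain Python as follows. This is an independent illustration, not the library's code: the name tag_text_sketch is hypothetical, end_ix is assumed to be exclusive (as in Python slicing), and overlaps are resolved by keeping the longest span and dropping any span that overlaps one already kept.

```python
def tag_text_sketch(text, tag_info_list):
    """Wrap spans of `text` in tags; on overlap, the longest span wins.

    Each entry is (start_ix, end_ix, substring, start_tag, close_tag),
    with end_ix assumed to be exclusive.
    """
    # Consider longer spans first so they take priority on overlap.
    by_length = sorted(tag_info_list, key=lambda t: t[1] - t[0], reverse=True)
    kept = []
    for span in by_length:
        start, end = span[0], span[1]
        if all(end <= k[0] or start >= k[1] for k in kept):
            kept.append(span)

    # Rebuild the text from left to right, inserting the tags.
    kept.sort(key=lambda t: t[0])
    pieces = []
    cursor = 0
    for start, end, substring, start_tag, close_tag in kept:
        pieces.append(text[cursor:start])
        pieces.append(start_tag + substring + close_tag)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "MEK phosphorylates ERK"
erk_start = text.index("ERK")
spans = [
    (0, 3, "MEK", "<b>", "</b>"),
    (0, len("MEK phosphorylates"), "MEK phosphorylates", "<i>", "</i>"),
    (erk_start, erk_start + 3, "ERK", "<b>", "</b>"),
]
tagged = tag_text_sketch(text, spans)
print(tagged)
```

In this example the span for "MEK" overlaps the longer span "MEK phosphorylates", so only the longer one is tagged.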

indra.assemblers.html.assembler.complete_source_counts(source_counts)[source]

Return source counts that are complete with respect to all sources.

This is necessary because the statement presentation module expects that all sources that appear in any statement source count appear in all statement source counts (even if the count is 0).
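
The idea of completing source counts can be sketched as follows. This is a simplified stand-in, not the library function: the name complete_source_counts_sketch and the sample hashes are hypothetical, and it fills in zeros only for sources that appear somewhere in the input, which matches the requirement stated above but may differ from how the actual function chooses the full set of sources.

```python
def complete_source_counts_sketch(source_counts):
    """Return source counts where every statement lists every source.

    `source_counts` maps a statement hash to a dict of {source: count}.
    Sources missing for a statement are added with a count of 0.
    """
    all_sources = set()
    for counts in source_counts.values():
        all_sources.update(counts)
    return {
        stmt_hash: {src: counts.get(src, 0) for src in sorted(all_sources)}
        for stmt_hash, counts in source_counts.items()
    }


# Hypothetical hashes; "reach" and "sparser" are the sources named earlier.
partial = {
    1111: {"reach": 3},
    2222: {"sparser": 1, "reach": 2},
}
completed = complete_source_counts_sketch(partial)
print(completed)
```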