HTML browsing and curation (indra.assemblers.html.assembler) — INDRA 1.22.0 documentation
Format a set of INDRA Statements into an HTML-formatted report which also supports curation.
indra.assemblers.html.assembler.DB_TEXT_COLOR = 'black'
The text color for database sources when shown as source count badges
indra.assemblers.html.assembler.READER_TEXT_COLOR = 'white'
The text color for reader sources when shown as source count badges
indra.assemblers.html.assembler.generate_source_css(fname, source_colors=None)[source]
Save a stylesheet defining the color and background-color for the given sources.
Parameters
- fname (str) – Where to save the stylesheet
- source_colors (Optional[List[Tuple[str, Dict[str, Union[str, Dict[str, str]]]]]]) – Colors defining the styles. Default: DEFAULT_SOURCE_COLORS.
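The type of source_colors is easier to read with a concrete value. The following is a minimal sketch, not INDRA's implementation: it assumes each tuple pairs a category name with a dict holding a text "color" and a "sources" mapping from source name to background color. The example sources, the color codes, the helper name render_source_css, and the ".source-<name>" selector are all hypothetical.

```python
from typing import Dict, List, Tuple, Union

SourceColors = List[Tuple[str, Dict[str, Union[str, Dict[str, str]]]]]

# Hypothetical example data: two categories, each with one text color and
# one background color per source (assumed layout, see the lead-in).
example_source_colors: SourceColors = [
    ("databases", {"color": "black",
                   "sources": {"biopax": "#66C2A5", "signor": "#8DA0CB"}}),
    ("reading", {"color": "white",
                 "sources": {"reach": "#6A3D9A", "sparser": "#A50026"}}),
]


def render_source_css(source_colors: SourceColors) -> str:
    """Build one CSS rule per source; the selector naming is illustrative."""
    rules = []
    for _category, style in source_colors:
        text_color = style["color"]
        for source, background in style["sources"].items():
            rules.append(
                f".source-{source} {{\n"
                f"    color: {text_color};\n"
                f"    background-color: {background};\n"
                f"}}"
            )
    return "\n\n".join(rules) + "\n"


css = render_source_css(example_source_colors)
print(css)
```

In this sketch the text color is shared by every source in a category, which mirrors the separate DB_TEXT_COLOR and READER_TEXT_COLOR constants above.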
class indra.assemblers.html.assembler.HtmlAssembler(statements=None, summary_metadata=None, ev_counts=None, beliefs=None, source_counts=None, curation_dict=None, title='INDRA Results', db_rest_url=None, sort_by='default', custom_stats=None, custom_sources=None)[source]
Generates an HTML-formatted report from INDRA Statements.
The HTML report format includes statements formatted in English (by the EnglishAssembler), text and metadata for the Evidence object associated with each Statement, and a JavaScript-based curation interface linked to the INDRA database (access permitting). The interface allows for curation of statements at the evidence level by letting the user specify the type of error and (optionally) provide a short description of the error.
Parameters
- statements (Optional[list[indra.statements.Statement]]) – A list of INDRA Statements to be added to the assembler. Statements can also be added using the add_statements method after the assembler has been instantiated.
- summary_metadata (Optional[dict]) – Dictionary of statement corpus metadata such as that provided by the INDRA REST API. Default is None. Each value should be a concise O(1) summary rather than something whose size scales with the length of the statement list (such as the evidence totals). The keys should be informative human-readable strings. This information is displayed as a tooltip when hovering over the page title.
- ev_counts (Optional[dict]) – A dictionary of the total evidence available for each statement indexed by hash. If not provided, the statements that are passed to the constructor are used to determine these, with whatever evidences these statements carry.
- beliefs (Optional[dict]) – A dictionary of the belief of each statement indexed by hash. If not provided, the beliefs of the statements passed to the constructor are used.
- source_counts (Optional[dict]) – A dictionary of the itemized evidence counts, by source, available for each statement, indexed by hash. If not provided, the statements that are passed to the constructor are used to determine these, with whatever evidences these statements carry.
- title (str) – The title to be printed at the top of the page.
- db_rest_url (Optional[str]) – The URL to a DB REST API to use for links out to further evidence. If given, this URL will be prepended to links that load additional evidence for a given Statement. One way to obtain this value is from the configuration entry indra.config.get_config('INDRA_DB_REST_URL'). If None, the URLs are constructed as relative links. Default: None
- sort_by (str or function or None) –
If str, it indicates which parameter to sort by, such as ‘belief’, ‘ev_count’, or ‘ag_count’. Those are the default options because they can be derived from a list of statements; however, if you give a custom list of stats with the custom_stats argument, you may use any of the parameters used to build it. The default, ‘default’, is mostly a sort by ev_count but also favors statements with fewer agents.
Alternatively, you may give a function that takes a dict as its single argument, a dictionary of metrics. The contents of this dictionary always include “belief”, “ev_count”, and “ag_count”. If source_counts are given, each source will also be available as an entry (e.g. “reach” and “sparser”). As with string values, you may also add your own custom stats using the custom_stats argument.
The value may also be None, in which case the sort function will return the same value for all elements, and thus the original order of elements will be preserved. This could have strange effects when statements are grouped (i.e. when grouping_level is not ‘statement’); such functionality is untested.
- custom_stats (Optional[list]) – A list of StmtStat objects containing custom statement statistics to be used in sorting of statements and statement groups.
- custom_sources (SourceInfo) –
Use this if the statements come from sources other than the default ones present in indra/resources/source_info.json. The structure of the input must conform to:
{
"source_key": {
"name": "Source Name",
"link": "",
"type": "reader|database",
"domain": "",
"default_style": {
"color": "",
"background-color": ""
}
},
...
}
Where the "color" and "background-color" values must be color names or color codes allowed in an HTML document per the CSS3 specification: https://www.w3.org/TR/css-color-3/#svg-color
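Two of the parameters above, sort_by (as a function) and custom_sources, can be sketched in plain Python. This is an illustration under stated assumptions, not part of INDRA: the weighting in belief_then_evidence, the source key "my_reader", its URL and domain value, and the helper check_custom_sources are hypothetical. The sketch only shows the shape of the values that would be passed to the constructor.

```python
# Part 1: a sort_by callable. It receives one dict of metrics and returns a
# sortable value; the weighting below is purely illustrative.
def belief_then_evidence(metrics: dict) -> float:
    return metrics["belief"] * 1_000_000 + metrics["ev_count"]


# Hypothetical metrics dicts with the three keys that are always present.
sample_metrics = [
    {"belief": 0.65, "ev_count": 12, "ag_count": 2},
    {"belief": 0.99, "ev_count": 3, "ag_count": 2},
    {"belief": 0.65, "ev_count": 40, "ag_count": 3},
]
# reverse=True is a choice made for this demo only.
ranked = sorted(sample_metrics, key=belief_then_evidence, reverse=True)

# Part 2: a custom_sources dict following the documented structure.
REQUIRED_KEYS = {"name", "link", "type", "domain", "default_style"}

custom_sources = {
    "my_reader": {  # hypothetical source key
        "name": "My Reader",
        "link": "https://example.org/my-reader",
        "type": "reader",
        "domain": "general",  # assumed value, for illustration
        "default_style": {"color": "white", "background-color": "#1f78b4"},
    }
}


def check_custom_sources(sources: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key, info in sources.items():
        missing = REQUIRED_KEYS - set(info)
        if missing:
            problems.append(f"{key}: missing keys {sorted(missing)}")
        if info.get("type") not in ("reader", "database"):
            problems.append(f"{key}: type must be 'reader' or 'database'")
        style = info.get("default_style", {})
        for field in ("color", "background-color"):
            if not style.get(field):
                problems.append(f"{key}: default_style needs {field!r}")
    return problems


print([m["ev_count"] for m in ranked])
print(check_custom_sources(custom_sources))
```

A check like this is optional; it simply catches a misspelled key or a missing style entry before the dictionary is handed to the assembler.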
statements
A list of INDRA Statements to assemble.
Type
list[indra.statements.Statement]
model
The HTML report formatted as a single string.
Type
str
metadata
Dictionary of statement list metadata such as that provided by the INDRA REST API.
Type
dict
ev_counts
A dictionary of the total evidence available for each statement indexed by hash.
Type
dict
beliefs
A dictionary of the belief score of each statement, indexed by hash.
Type
dict
db_rest_url
The URL to a DB REST API.
Type
str
add_statements(statements)[source]
Add a list of Statements to the assembler.
Parameters
statements (list[indra.statements.Statement]) – A list of INDRA Statements to be added to the assembler.
make_json_model(grouping_level='agent-pair', no_redundancy=False, **kwargs)[source]
Return the JSON used to create the HTML display.
Parameters
- grouping_level (Optional[str]) – Statements can be grouped at three levels, ‘statement’ (ungrouped), ‘relation’ (grouped by agents and type), and ‘agent-pair’ (grouped by ordered pairs of agents). Default: ‘agent-pair’.
- no_redundancy (Optional[bool]) – If True, any group of statements that was already presented under a previous heading will be skipped. This is typically the case for complexes where different permutations of complex members are presented. By setting this argument to True, these can be eliminated. Default: False
Returns
json – A complexly structured JSON dict containing grouped statements and various metadata.
Return type
dict
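The three grouping levels can be illustrated with plain tuples. This is a simplified sketch, not the grouping code used by the assembler: the Stmt record, the group_statements helper, and the way keys are built are hypothetical stand-ins, and real Statements carry far more structure.

```python
from collections import defaultdict
from typing import NamedTuple


class Stmt(NamedTuple):
    """Hypothetical stand-in for an INDRA Statement."""
    stmt_type: str
    agents: tuple  # ordered agent names


stmts = [
    Stmt("Phosphorylation", ("MAP2K1", "MAPK1")),
    Stmt("Phosphorylation", ("MAP2K1", "MAPK1")),
    Stmt("Activation", ("MAP2K1", "MAPK1")),
    Stmt("Activation", ("MAPK1", "MAP2K1")),
]


def group_statements(stmts, grouping_level="agent-pair"):
    """Group the stand-in statements by a key that depends on the level."""
    groups = defaultdict(list)
    for index, stmt in enumerate(stmts):
        if grouping_level == "statement":
            key = index  # ungrouped: every statement is its own group
        elif grouping_level == "relation":
            key = (stmt.agents, stmt.stmt_type)  # agents and type
        elif grouping_level == "agent-pair":
            key = stmt.agents  # ordered pair of agents only
        else:
            raise ValueError(f"Unknown grouping level: {grouping_level}")
        groups[key].append(stmt)
    return dict(groups)


for level in ("statement", "relation", "agent-pair"):
    print(level, len(group_statements(stmts, level)))
```

Because the pairs are ordered, ("MAP2K1", "MAPK1") and ("MAPK1", "MAP2K1") land in different groups at the agent-pair level.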
make_model(template=None, grouping_level='agent-pair', add_full_text_search_link=False, no_redundancy=False, **template_kwargs)[source]
Return the assembled HTML content as a string.
Parameters
- template (a Template object) – Manually pass a Jinja template to be used in generating the HTML. The template is responsible for rendering essentially the output of make_json_model.
- grouping_level (Optional[str]) – Statements can be grouped under sub-headings at three levels, ‘statement’ (ungrouped), ‘relation’ (grouped by agents and type), and ‘agent-pair’ (grouped by ordered pairs of agents). Default: ‘agent-pair’.
- add_full_text_search_link (bool) – If True, a link that performs a text-fragment search in the PMC journal article is added for the statements.
- no_redundancy (Optional[bool]) –
If True, any group of statements that was already presented under a previous heading will be skipped. This is typically the case for complexes where different permutations of complex members are presented. By setting this argument to True, these can be eliminated. Default: False
All other keyword arguments are passed along to the template. Use this to supply arguments that a custom template expects but that are not among the parameters listed here.
Returns
The assembled HTML as a string.
Return type
str
Append a warning message to the model to expose issues.
save_model(fname, **kwargs)[source]
Save the assembled HTML into a file.
Other kwargs are passed directly to make_model.
Parameters
fname (str) – The path to the file to save the HTML into.
indra.assemblers.html.assembler.src_url(ev)[source]
Given an Evidence object, provide the URL for the source
Return type
str
indra.assemblers.html.assembler.tag_text(text, tag_info_list)[source]
Apply start/end tags to spans of the given text.
Parameters
- text (str) – Text to be tagged
- tag_info_list (list of tuples) – Each tuple refers to a span of the given text. Fields are (start_ix, end_ix, substring, start_tag, close_tag), where substring, start_tag, and close_tag are strings. If any of the given spans of text overlap, the longest span is used.
Returns
String where the specified substrings have been surrounded by the given start and close tags.
Return type
str
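The span-tagging behavior described above, including the rule that the longest span wins when spans overlap, can be sketched as follows. This is an independent illustration rather than the INDRA source: the function name tag_spans is hypothetical, and the sketch assumes end_ix is exclusive, as in Python slicing.

```python
def tag_spans(text, tag_info_list):
    """Wrap spans of text in tags, keeping the longest of any overlapping spans."""
    # Visit spans from longest to shortest so longer spans are kept first.
    by_length = sorted(tag_info_list, key=lambda t: t[1] - t[0], reverse=True)
    kept = []
    for span in by_length:
        start, end = span[0], span[1]
        # Keep the span only if it does not overlap one already kept.
        if all(end <= k[0] or start >= k[1] for k in kept):
            kept.append(span)

    # Assemble the output from left to right.
    kept.sort(key=lambda t: t[0])
    pieces = []
    cursor = 0
    for start, end, _substring, start_tag, close_tag in kept:
        pieces.append(text[cursor:start])
        pieces.append(start_tag + text[start:end] + close_tag)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "MEK1 phosphorylates ERK2"
spans = [
    (0, 4, "MEK1", "<b>", "</b>"),
    (20, 24, "ERK2", "<b>", "</b>"),
    (20, 23, "ERK", "<i>", "</i>"),  # overlaps the longer ERK2 span
]
tagged = tag_spans(text, spans)
print(tagged)
```

The shorter "ERK" span is dropped because it overlaps the longer "ERK2" span, so each character of the text ends up inside at most one pair of tags.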
indra.assemblers.html.assembler.complete_source_counts(source_counts)[source]
Return source counts that are complete with respect to all sources.
This is necessary because the statement presentation module expects that all sources that appear in any statement source count appear in all statement source counts (even if the count is 0).
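A minimal sketch of the idea, assuming source_counts maps each statement hash to a dict of per-source counts: every source seen in any entry is added to every entry with a count of 0 where it was missing. The function name fill_missing_sources and the example hashes are hypothetical, and the library function may complete against a broader set of known sources than this sketch does.

```python
def fill_missing_sources(source_counts):
    """Give every statement an entry for every source seen anywhere."""
    all_sources = set()
    for counts in source_counts.values():
        all_sources.update(counts)
    return {
        stmt_hash: {src: counts.get(src, 0) for src in sorted(all_sources)}
        for stmt_hash, counts in source_counts.items()
    }


# Hypothetical hashes and counts.
example = {
    1234: {"reach": 2},
    5678: {"sparser": 1, "biopax": 3},
}
completed = fill_missing_sources(example)
print(completed)
```

The input dictionary is left unmodified; a new dictionary with uniform keys is returned.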