Jouni Väliaho - Academia.edu
Papers by Jouni Väliaho
Several immunodeficiency-related genes are known and a number of disease-causing mutations have been identified. Immunodeficiency-related mutation information has been collected into IDbases. We are currently maintaining 15 registries, which are available at http://www.uta.fi/imt/bioinfo/. The databases are available for the following disorders: adenosine deaminase, CD3 epsilon, CD3 gamma, JAK3, RAG1, RAG2, and ZAP-70 deficiencies, Bloom, X-linked hyper IgM and X-linked lymphoproliferative syndromes, X-linked agammaglobulinemia, and four forms of chronic granulomatous disease. The databases serve as knowledge bases for the disorders. They have been collected with the aid of the immunodeficiency research community, and we ask scientists performing mutation analyses to continue to submit data to the registries.
Human Mutation
Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine.
Molecular Medicine (Cambridge, Mass.), 2000
Bloom syndrome (BS) is characterized by mutations within the BLM gene. The Bloom syndrome protein (BLM) has similarity to the RecQ subfamily of DNA helicases, which contain seven conserved helicase domains and share significant sequence and structural similarity with the Rep and PcrA DNA helicases. We modeled the three-dimensional structure of the BLM helicase domain to analyze the structural basis of BS-causing mutations. The sequence alignment was performed for RecQ DNA helicases and Rep and PcrA helicases. The crystal structure of PcrA helicase (PDB entry 3PJR) was used as the template for modeling the BLM helicase domain. The model was used to infer the function of BLM and to analyze the effect of the mutations. The structural model with good stereochemistry of the BLM helicase domain contains two subdomains, 1A and 2A. The electrostatic potential of the model is highly negative over most of the surface, except for the cleft between subdomains 1A and 2A which is similar to the t...
Human Mutation, Jan 16, 2015
Knowledge about features distinguishing deleterious and neutral variations is crucial for interpretation of novel variants. Among the human protein kinases, Bruton tyrosine kinase (BTK) contains the highest number of unique disease-causing variations, yet these account for just 10% of all the possible amino acid variations caused by single nucleotide substitutions. Altogether 1495 such variants can appear in the BTK kinase domain (BTK-KD). We investigated them all with bioinformatic and protein structure analysis methods. Most disease-causing variations affect conserved and buried residues, disturbing protein stability. A minority of exposed residues is conserved, but strongly tied to pathogenicity. 67% of the variations are predicted to be harmful. In 39% of the residues, all the variants are likely harmful, while in 10% of sites all the substitutions are tolerated. Results indicate the importance of the entire kinase domain, involvement in numerous interactions, and intricate functional regulation ...
Background: Although biomedical information is growing rapidly, it is difficult to find and retrieve validated data, especially for rare hereditary diseases. There is an increased need for services capable of integrating and validating information as well as providing it in a logically organized structure. An XML-based language enables creation of open source databases for storage, maintenance and delivery for different platforms. Methods: Here we present a new data model called fact file and an XML-based specification, Inherited Disease Markup Language (IDML), that were developed to facilitate disease information integration, storage and exchange. The data model was applied to primary immunodeficiencies, but it can be used for any hereditary disease. Fact files integrate biomedical, genetic and clinical information related to hereditary diseases. Results: IDML and fact files were used to build a comprehensive Web and WAP accessible knowledge base, ImmunoDeficiency Resource (IDR), available at http://bioinf.uta.fi/idr/. A fact file is a user-oriented user interface, which serves as a starting point to explore information on hereditary diseases. Conclusion: The IDML enables the seamless integration and presentation of genetic and disease information resources on the Internet. IDML can be used to build information services for all kinds of inherited diseases. The open source specification and related programs are available at http://bioinf.uta.fi/idml/.
Advances in Genetics, 2001
... Giliani 6 , Lennart Hammarström 7 , Michael S. Hershfield 3 , Paul G. Heyworth 8 , Amy P. Hsu 9 , Aleksi Lähdesmäki 7 , Ilkka Lappalainen 10 , 1 , Luigi D. Notarangelo 6 , Jennifer M. Puck 9 ... 1 Institute of Medical Technology, FIN-33014 University of Tampere, Tampere, Finland. ...
A Molecular and Genetic Approach, 2013
Nucleic Acids Research, 2002
The ImmunoDeficiency Resource (IDR), freely available at http://www.uta.fi/imt/bioinfo/idr/, is a comprehensive knowledge base on immunodeficiencies. It is designed for different user groups such as researchers, physicians and nurses as well as patients and their families and the general public. Information on immunodeficiencies is stored as fact files, which are disease- and gene-based information resources. We have developed an inherited disease markup language (IDML) data model, which is designed for storing disease- and gene-specific data in extensible markup language (XML) format. The fact files written in IDML can be used to present data in different contexts and platforms. All the information in the IDR is validated by expert curators.
Journal of Clinical Immunology, 2000
Primary immunodeficiencies (IDs) are caused by inherited genetic defects leading to intrinsic defects in cells of the immune system. Most IDs are rare diseases and can be difficult to diagnose because similar symptoms characterize several disorders. Mutation detection is the most reliable method in such cases. These tests are not available at most centers, and physicians can have difficulties in finding laboratories that could analyze the genetic defects because certain genes are possibly analyzed by just one laboratory. The IDdiagnostics registry has been established to provide information for physicians and other health care professionals. The database at http://bioinf.uta.fi/IDdiagnostics currently contains information for the analysis of defects in 30 ID-related genes. Another part of IDdiagnostics is a database of clinical tests. Laboratories performing these analyses, either gene or clinical tests, are asked to submit their information to the database by using a printed form or electronic submission at http://bioinf.uta.fi/cgi-bin/submit/IDClini.cgi. The clinical test database contains information about tests for clinical data, immune status, and studies of function, antibody response, cell function, enzyme assays, clinical function, and apoptosis assays. Both services are freely available and regularly updated. The services aim at increasing the awareness of IDs and helping to obtain exact and early diagnosis.
Immunome Research, 2007
Background: The ImmunoDeficiency Resource (IDR) is a knowledge base for the integration of the clinical, biochemical, genetic, genomic, proteomic, structural, and computational data of primary immunodeficiencies. The need for the IDR arises from the lack of structured and systematic information about primary immunodeficiencies on the Internet, and from the lack of a common platform which enables doctors, researchers, students, nurses and patients to find validated information about these diseases. Description: The IDR knowledge base, first released in 1999, has grown substantially. It contains information for 158 diseases, from both a clinical and a molecular point of view. The database and the user interface have been reformatted. This new IDR release has a richer and more complete breadth, depth and scope. The service provides the most complete and up-to-date dataset. The IDR has been integrated with several internal and external databases and services. The contents of the IDR are validated and selected for different types of users (doctors, nurses, researchers and students, as well as patients and their families). The search engine has been improved and allows either a detailed or a broad search from a simple user interface. Conclusion: The IDR is the first knowledge base specifically designed to capture in a systematic and validated way both clinical and molecular information for primary immunodeficiencies. The service is freely available at http://bioinf.uta.fi/idr and is regularly updated. The IDR facilitates primary immunodeficiency informatics and helps to parameterise in silico modelling of these diseases. The IDR is also useful as an advanced education tool for medical students and physicians.
Immunological Reviews, 2000
The Internet contains scientific information in increasing amounts. It is possible to obtain the latest information, and Web services can easily be maintained and updated. We have set up three Internet services on immunodeficiencies. Immunodeficiency-related mutation information is available in immunodeficiency mutation databases (IDbases). Currently 14 registries are distributed, including information about Bloom syndrome (BLMbase), X-linked agammaglobulinemia (BTKbase), X-linked and autosomal recessive chronic granulomatous diseases (CYBBbase for X-linked CGD, CYBAbase for p22(phox) deficiency, NCF1base for p47(phox) deficiency, NCF2base for p67(phox) deficiency), CD3gamma and CD3epsilon deficiencies (CD3Gbase, CD3Ebase), X-linked hyper-IgM syndrome (CD40Lbase), T-B+ severe combined immunodeficiency (JAK3base), V(D)J recombination defects (RAG1base, RAG2base), X-linked lymphoproliferative syndrome (SH2D1Abase), and ZAP-70 deficiency (ZAP70base). Information on laboratories analysing the genetic defects is collected in the IDdiagnostics registry. Due to the rareness of immunodeficiencies there are very few laboratories performing genetic diagnostics. Such laboratories are listed in IDdiagnostics and physicians can use the registry to find a suitable laboratory for their diagnostic needs. Immunodeficiency Resource (IDR) is a comprehensive integrated knowledge base for all the information on immunodeficiencies, including clinical, biochemical, genetic, structural and computational data and analyses. All three services are available at http://www.uta.fi/imt/bioinfo/.
Immunological Reviews, 2005
Bruton's tyrosine kinase (Btk) is encoded by the gene that when mutated causes the primary immunodeficiency disease X-linked agammaglobulinemia (XLA) in humans and X-linked immunodeficiency (Xid) in mice. Btk is a member of the Tec family of protein tyrosine kinases (PTKs) and plays a vital, but diverse, modulatory role in many cellular processes. Mutations affecting Btk block B-lymphocyte development. Btk is conserved among species, and in this review, we present the sequence of the full-length rat Btk and find it to be analogous to the mouse Btk sequence. We have also analyzed the wealth of information compiled in the mutation database for XLA (BTKbase), representing 554 unique molecular events in 823 families and demonstrate that only selected amino acids are sensitive to replacement (P < 0.001). Although genotype-phenotype correlations have not been established in XLA, based on these findings, we hypothesize that this relationship indeed exists. Using short interfering-RNA technology, we have previously generated active constructs downregulating Btk expression. However, application of recently established guidelines to enhance or decrease the activity was not successful, demonstrating the importance of the primary sequence. We also review the outcome of expression profiling, comparing B lymphocytes from XLA-, Xid-, and Btk-knockout (KO) donors to healthy controls. Finally, in spite of a few genes differing in expression between Xid- and Btk-KO mice, in vivo competition between cells expressing either mutation shows that there is no selective survival advantage of cells carrying one genetic defect over the other. We conclusively demonstrate that for the R28C-missense mutant (Xid), there is no biologically relevant residual activity or any dominant negative effect versus other proteins.
Human Mutation, 2006
For the Immunogenetics Special Issue. X-linked agammaglobulinemia (XLA) is a hereditary immunodeficiency caused by mutations in the gene encoding Bruton tyrosine kinase (BTK). XLA patients have a decreased number of mature B cells and a lack of all immunoglobulin isotypes, resulting in susceptibility to severe bacterial infections. XLA-causing mutations are collected in a mutation database (BTKbase), which is available at http://bioinf.uta.fi/BTKbase. For each patient the following information is given (when available): the identification of the entry, a plain English description of the mutation followed by a reference, formal characterization of the mutation, and the various parameters from the patient. BTKbase is implemented with the MUTbase program suite, which provides an easy, interactive, and quality controlled submission of information to mutation databases. BTKbase version 8 lists mutation entries of 1,111 patients from 973 unrelated families showing 602 unique molecular events. The localization of the mutations on the gene and protein for BTK can be analyzed by clicking sequences on the web pages. The distribution of the mutations in the five structural domains is approximately proportional to the length of the domains, except for the Tec homology (TH) domain. The most frequently affected sites are CpG dinucleotides. The majority of the missense mutations are structural, disturbing Bruton tyrosine kinase (Btk) folding or decreasing stability. Many of the mutations affect functionally significant, conserved residues. The structural consequences of the mutations in all the domains have been studied based on crystallographic and nuclear magnetic resonance (NMR) structures as well as computer-aided molecular modeling.
Human Mutation, 1999
X-linked agammaglobulinemia (XLA) is an immunodeficiency caused by mutations in the gene coding for Bruton agammaglobulinemia tyrosine kinase (BTK). A database (BTKbase) of BTK mutations lists 544 mutation entries from 471 unrelated families showing 341 unique molecular events. In addition to mutations, a number of variants or polymorphisms have been found. Mutations in all the five domains of BTK cause the disease, the single most common event being missense mutations. Most mutations lead to truncation of the enzyme. The mutations appear almost uniformly throughout the molecule. About one-third of point mutations affect CpG sites, which usually code for arginine residues. The putative structural implications of all the missense mutations are provided in the database. BTKbase is available at http://www.uta.fi/imt/bioinfo.
Human Mutation, 2012
High-throughput sequencing data generation demands the development of methods for interpreting the effects of genomic variants. Numerous computational methods have been developed to assess the impact of variations because experimental methods are unable to cope with both the speed and volume of data generation. To harness the strength of currently available predictors, the Pathogenic-or-Not-Pipeline (PON-P) integrates five predictors to predict the probability that nonsynonymous variations affect protein function and may consequently be disease related. Random forest methodology-based PON-P shows consistently improved performance in cross-validation tests and on independent test sets, providing ternary classification and a statistical reliability estimate of results. Applied to missense variants in a melanoma cancer cell line, PON-P predicts variants in 17 genes to affect protein function. Previous studies implicate nine of these genes in the pathogenesis of various forms of cancer. PON-P may thus be used as a first step in screening and prioritizing variants to determine deleterious ones for further experimentation.
Human Mutation, 2006
For the Immunogenetics Special Issue. Primary immunodeficiencies (IDs) are a heterogeneous group of inherited disorders of the immune system. Immunodeficiency patients have increased susceptibility to recurrent and persistent, even life-threatening infections. Mutations in a large number of genes can cause defects in different cellular functions and lead to impaired immune response. To date, approximately 150 IDs and more than 100 affected genes have been identified. ID-related genes are distributed throughout the genome, and diseases can be inherited in an X-linked, an autosomal recessive, or an autosomal dominant way. We have collected ID mutation data into locus-specific patient-related mutation databases, IDbases (http://bioinf.uta.fi/IDbases). Mutations are described at DNA, mRNA, and protein levels with links to reference sequences and reference articles. The mutation data has been collated into entries along with some clinical information. IDbases offer an easy way, e.g., to find recently identified mutations, to reveal genotype-phenotype correlations, and to discover a specific mutation or to examine the most common mutations in a single immunodeficiency-related gene. At the moment we have databases for 107 ID genes with 4,140 public patient entries. An exhaustive statistical analysis of mutation data from the IDbases was made. Missense and nonsense mutations are the most common mutation types, and the most common single substitution is a nonsense mutation from tryptophan to a stop codon. Arginine is the most mutated as well as the most abundant mutant amino acid.
Human Mutation, 2005
A large number of disease-causing mutations have been identified from several protein kinases. KinMutBase is a comprehensive knowledge base for human disease-related mutations in protein kinase domains (http://bioinf.uta.fi/KinMutBase/). The latest version contains 582 different mutations for 1,790 cases in 1,322 families. KinMutBase entries are described on the DNA, mRNA, and protein level. Numbers for affected patients and families are also provided. KinMutBase has an extensive number of links and cross-references to literature, other databases, and information sources. There are numerous interactive pages about sequences, structures, mutation statistics, and diseases. A detailed statistical study was done on frequencies of different types of mutations both on the DNA and protein level in serine/threonine kinase (PSK) and tyrosine kinase (PTK). Three-dimensional structures indicate clustering of disease-related mutations mainly to conserved subdomains, and substrate and coligand binding amino acids, although mutations appear throughout the sequences. CpG containing codons, especially for arginine, constitute the majority of mutational hotspots. There are certain clear differences in mutation patterns and types between PSKs and PTKs.
Human Mutation, 2007
Communicated by Richard Cotton. PhenCode (Phenotypes for ENCODE; www.bx.psu.edu/phencode) is a collaborative, exploratory project to help understand phenotypes of human mutations in the context of sequence and functional data from genome projects. Currently, it connects human phenotype and clinical data in various locus-specific databases (LSDBs) with data on genome sequences, evolutionary history, and function from the ENCODE project and other resources in the UCSC Genome Browser. Initially, we focused on a few selected LSDBs covering genes encoding alpha- and beta-globins (HBA, HBB), phenylalanine hydroxylase (PAH), blood group antigens (various genes), androgen receptor (AR), cystic fibrosis transmembrane conductance regulator (CFTR), and Bruton's tyrosine kinase (BTK), but we plan to include additional loci of clinical importance, ultimately genomewide. We have also imported variant data and associated OMIM links from Swiss-Prot. Users can find interesting mutations in the UCSC Genome Browser (in a new Locus Variants track) and follow links back to the LSDBs for more detailed information. Alternatively, they can start with queries on mutations or phenotypes at an LSDB and then display the results at the Genome Browser to view complementary information such as functional data (e.g., chromatin modifications and protein binding from the ENCODE consortium), evolutionary constraint, regulatory potential, and/or any other tracks they choose. We present several examples illustrating the power of these connections for exploring phenotypes associated with functional elements, and for identifying genomic data that could help to explain clinical phenotypes.
BMC Biochemistry, 2012
Background: STAT1 is an essential transcription factor for interferon-γ-mediated gene responses. A distinct sumoylation consensus site (ψKxE), IKTE at residues 702-705, is localized in the C-terminal region of STAT1, where Lys703 is a target for PIAS-induced SUMO modification. Several studies indicate that sumoylation has an inhibitory role on STAT1-mediated gene expression but the molecular mechanisms are not fully understood. Results: Here, we have performed a structural and functional analysis of sumoylation in STAT1. We show that deconjugation of SUMO by SENP1 enhances the transcriptional activity of STAT1, confirming a negative regulatory effect of sumoylation on STAT1 activity. Inspection of a molecular model indicated that the consensus site is well exposed to SUMO conjugation in the STAT1 homodimer and that the conjugated SUMO moiety is directed towards DNA, and is thus able to form a steric hindrance affecting promoter binding of dimeric STAT1. In addition, oligoprecipitation experiments indicated that the sumoylation-deficient STAT1 E705Q mutant has higher DNA-binding activity on STAT1 responsive gene promoters than wild-type STAT1. Furthermore, the sumoylation-deficient STAT1 E705Q mutant displayed enhanced histone H4 acetylation on an interferon-γ-responsive promoter compared to wild-type STAT1. Conclusions: Our results suggest that sumoylation participates in regulation of STAT1 responses by modulating DNA-binding properties of STAT1.
Bruton&am... more Bruton's tyrosine kinase (Btk) is encoded by the gene that when mutated causes the primary immunodeficiency disease X-linked agammaglobulinemia (XLA) in humans and X-linked immunodeficiency (Xid) in mice. Btk is a member of the Tec family of protein tyrosine kinases (PTKs) and plays a vital, but diverse, modulatory role in many cellular processes. Mutations affecting Btk block B-lymphocyte development. Btk is conserved among species, and in this review, we present the sequence of the full-length rat Btk and find it to be analogous to the mouse Btk sequence. We have also analyzed the wealth of information compiled in the mutation database for XLA (BTKbase), representing 554 unique molecular events in 823 families and demonstrate that only selected amino acids are sensitive to replacement (P < 0.001). Although genotype-phenotype correlations have not been established in XLA, based on these findings, we hypothesize that this relationship indeed exists. Using short interfering-RNA technology, we have previously generated active constructs downregulating Btk expression. However, application of recently established guidelines to enhance or decrease the activity was not successful, demonstrating the importance of the primary sequence. We also review the outcome of expression profiling, comparing B lymphocytes from XLA-, Xid-, and Btk-knockout (KO) donors to healthy controls. Finally, in spite of a few genes differing in expression between Xid- and Btk-KO mice, in vivo competition between cells expressing either mutation shows that there is no selective survival advantage of cells carrying one genetic defect over the other. We conclusively demonstrate that for the R28C-missense mutant (Xid), there is no biologically relevant residual activity or any dominant negative effect versus other proteins.
Several immunodeficiency-related genes are known and a numb er of disease causing mutations have ... more Several immunodeficiency-related genes are known and a numb er of disease causing mutations have been identified. Immunodeficiency related mutation information have been collected into IDbases. We are currently maintaining 15 registries, which are available at http://www.uta.fi/imt/bioinfo/. The databases are available for the following disorders: adenosine deaminase, CD3 epsilon, CD3 gamma, JAK3, RAGI, RAG2, and ZAP-70 deficiencies, Bloom, X-linked hyper IgM and X-linked lymphoproliferative syndromes, X-linked agammaglobulinemia, and four forms of chronic granulomatous disease. The databases serve as knowledge bases for the disorders. They have been collected with the aid from the immunodeficiency research community and we ask the scientist performing mutation analyses to continue to submit data to the registries. (Less)
Human Mutation
Accurate prediction of the impact of genomic variation on phenotype is a major goal of computatio... more Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine.
Molecular medicine (Cambridge, Mass.), 2000
Bloom syndrome (BS) is characterized by mutations within the BLM gene. The Bloom syndrome protein... more Bloom syndrome (BS) is characterized by mutations within the BLM gene. The Bloom syndrome protein (BLM) has similarity to the RecQ subfamily of DNA helicases, which contain seven conserved helicase domains and share significant sequence and structural similarity with the Rep and PcrA DNA helicases. We modeled the three-dimensional structure of the BLM helicase domain to analyze the structural basis of BS-causing mutations. The sequence alignment was performed for RecQ DNA helicases and Rep and PcrA helicases. The crystal structure of PcrA helicase (PDB entry 3PJR) was used as the template for modeling the BLM helicase domain. The model was used to infer the function of BLM and to analyze the effect of the mutations. The structural model with good stereochemistry of the BLM helicase domain contains two subdomains, 1A and 2A. The electrostatic potential of the model is highly negative over most of the surface, except for the cleft between subdomains 1A and 2A which is similar to the t...
Human mutation, Jan 16, 2015
Knowledge about features distinguishing deleterious and neutral variations is crucial for interpr... more Knowledge about features distinguishing deleterious and neutral variations is crucial for interpretation of novel variants. Bruton tyrosine kinase (BTK) contains among the human protein kinases the highest number of unique disease-causing variations, still it is just 10% of all the possible single nucleotide substitution-caused amino acid variations. In the BTK kinase domain (BTK-KD) can appear altogether 1495 such variants. We investigated them all with bioinformatic and protein structure analysis methods. Most disease-causing variations affect conserved and buried residues disturbing protein stability. Minority of exposed residues is conserved, but strongly tied to pathogenicity. 67% of the variations are predicted to be harmful. In 39% of the residues, all the variants are likely harmful, while in 10% of sites all the substitutions are tolerated. Results indicate the importance of the entire kinase domain, involvement in numerous interactions, and intricate functional regulation ...
Background: Although biomedical information is growing rapidly, it is difficult to find and retri... more Background: Although biomedical information is growing rapidly, it is difficult to find and retrieve validated data especially for rare hereditary diseases. There is an increased need for services capable of integrating and validating information as well as proving it in a logically organized structure. A XML-based language enables creation of open source databases for storage, maintenance and delivery for different platforms. Methods: Here we present a new data model called fact file and an XML-based specification Inherited Disease Markup Language (IDML), that were developed to facilitate disease information integration, storage and exchange. The data model was applied to primary immunodeficiencies, but it can be used for any hereditary disease. Fact files integrate biomedical, genetic and clinical information related to hereditary diseases. Results: IDML and fact files were used to build a comprehensive Web and WAP accessible knowledge base ImmunoDeficiency Resource (IDR) available at http://bioinf.uta.fi/idr/. A fact file is a user oriented user interface, which serves as a starting point to explore information on hereditary diseases. Conclusion: The IDML enables the seamless integration and presentation of genetic and disease information resources in the Internet. IDML can be used to build information services for all kinds of inherited diseases. The open source specification and related programs are available at http:// bioinf.uta.fi/idml/.
Advances in Genetics, 2001
... Giliani 6 , Lennart Hammarström 7 , Michael S. Hershfield 3 , Paul G. Heyworth 8 , Amy P. Hsu... more ... Giliani 6 , Lennart Hammarström 7 , Michael S. Hershfield 3 , Paul G. Heyworth 8 , Amy P. Hsu 9 , Aleksi Lähdesmäki 7 , Ilkka Lappalainen 10 , 1 , Luigi D. Notarangelo 6 , Jennifer M. Puck 9 ... 1 Institute of Medical Technology, FIN-33014 University of Tampere, Tampere, Finland. ...
A Molecular and Genetic Approach, 2013
Nucleic Acids Research, 2002
The ImmunoDeficiency Resource (IDR), freely available at http://www.uta.fi/imt/bioinfo/idr/, is a... more The ImmunoDeficiency Resource (IDR), freely available at http://www.uta.fi/imt/bioinfo/idr/, is a comprehensive knowledge base on immunodeficiencies. It is designed for different user groups such as researchers, physicians and nurses as well as patients and their families and the general public. Information on immunodeficiencies is stored as fact files, which are diseaseand gene-based information resources. We have developed an inherited disease markup language (IDML) data model, which is designed for storing disease-and gene-specific data in extensible markup language (XML) format. The fact files written by the IDML can be used to present data in different contexts and platforms. All the information in the IDR is validated by expert curators.
Journal of Clinical Immunology, 2000
Primary immunodeficiencies (IDs) are caused by inherited genetic defects leading to intrinsic defects in cells of the immune system. Most IDs are rare diseases and can be difficult to diagnose because similar symptoms characterize several disorders. Mutation detection is the most reliable method in such cases. These tests are not available at most centers, and physicians can have difficulties in finding laboratories that could analyze the genetic defects because certain genes are possibly analyzed by just one laboratory. The IDdiagnostics registry has been established to provide information for physicians and other health care professionals. The database at http://bioinf.uta.fi/IDdiagnostics currently contains information for the analysis of defects in 30 ID-related genes. Another part of IDdiagnostics is a database of clinical tests. Laboratories performing these analyses, either gene or clinical tests, are asked to submit their information to the database by using a printed form or electronic submission at http://bioinf.uta.fi/cgi-bin/submit/IDClini.cgi. The clinical test database contains information about tests for clinical data, immune status, and studies of function, antibody response, cell function, enzyme assays, clinical function, and apoptosis assays. Both services are freely available and regularly updated. The services aim at increasing the awareness of IDs and helping to obtain exact and early diagnosis.
Immunome Research, 2007
Background: The ImmunoDeficiency Resource (IDR) is a knowledge base for the integration of the clinical, biochemical, genetic, genomic, proteomic, structural, and computational data of primary immunodeficiencies. The need for the IDR arises from the lack of structured and systematic information about primary immunodeficiencies on the Internet, and from the lack of a common platform which enables doctors, researchers, students, nurses and patients to find validated information about these diseases. Description: The IDR knowledge base, first released in 1999, has grown substantially. It contains information for 158 diseases, from both a clinical and a molecular point of view. The database and the user interface have been reformatted. This new IDR release has a richer and more complete breadth, depth and scope. The service provides the most complete and up-to-date dataset. The IDR has been integrated with several internal and external databases and services. The contents of the IDR are validated and selected for different types of users (doctors, nurses, researchers and students, as well as patients and their families). The search engine has been improved and allows either a detailed or a broad search from a simple user interface. Conclusion: The IDR is the first knowledge base specifically designed to capture in a systematic and validated way both clinical and molecular information for primary immunodeficiencies. The service is freely available at http://bioinf.uta.fi/idr and is regularly updated. The IDR facilitates primary immunodeficiency informatics and helps to parameterise in silico modelling of these diseases. The IDR is also useful as an advanced education tool for medical students and physicians.
Immunological Reviews, 2000
The Internet contains scientific information in increasing amounts. It is possible to obtain the latest information, and Web services can easily be maintained and updated. We have set up three Internet services on immunodeficiencies. Immunodeficiency-related mutation information is available in immunodeficiency mutation databases (IDbases). Currently 14 registries are distributed, including information about Bloom syndrome (BLMbase), X-linked agammaglobulinemia (BTKbase), X-linked and autosomal recessive chronic granulomatous diseases (CYBBbase for X-linked CGD, CYBAbase for p22(phox) deficiency, NCF1base for p47(phox) deficiency, NCF2base for p67(phox) deficiency), CD3gamma and CD3epsilon deficiencies (CD3Gbase, CD3Ebase), X-linked hyper-IgM syndrome (CD40Lbase), T-B+ severe combined immunodeficiency (JAK3base), V(D)J recombination defects (RAG1base, RAG2base), X-linked lymphoproliferative syndrome (SH2D1Abase), and ZAP-70 deficiency (ZAP70base). Information on laboratories analysing the genetic defects is collected in the IDdiagnostics registry. Because immunodeficiencies are rare, very few laboratories perform genetic diagnostics. Such laboratories are listed in IDdiagnostics, and physicians can use the registry to find a suitable laboratory for their diagnostic needs. Immunodeficiency Resource (IDR) is a comprehensive integrated knowledge base for all the information on immunodeficiencies, including clinical, biochemical, genetic, structural and computational data and analyses. All three services are available at http://www.uta.fi/imt/bioinfo/.
Immunological Reviews, 2005
Bruton's tyrosine kinase (Btk) is encoded by the gene that when mutated causes the primary immunodeficiency disease X-linked agammaglobulinemia (XLA) in humans and X-linked immunodeficiency (Xid) in mice. Btk is a member of the Tec family of protein tyrosine kinases (PTKs) and plays a vital, but diverse, modulatory role in many cellular processes. Mutations affecting Btk block B-lymphocyte development. Btk is conserved among species, and in this review, we present the sequence of the full-length rat Btk and find it to be analogous to the mouse Btk sequence. We have also analyzed the wealth of information compiled in the mutation database for XLA (BTKbase), representing 554 unique molecular events in 823 families and demonstrate that only selected amino acids are sensitive to replacement (P < 0.001). Although genotype-phenotype correlations have not been established in XLA, based on these findings, we hypothesize that this relationship indeed exists. Using short interfering-RNA technology, we have previously generated active constructs downregulating Btk expression. However, application of recently established guidelines to enhance or decrease the activity was not successful, demonstrating the importance of the primary sequence. We also review the outcome of expression profiling, comparing B lymphocytes from XLA-, Xid-, and Btk-knockout (KO) donors to healthy controls. Finally, in spite of a few genes differing in expression between Xid- and Btk-KO mice, in vivo competition between cells expressing either mutation shows that there is no selective survival advantage of cells carrying one genetic defect over the other. We conclusively demonstrate that for the R28C-missense mutant (Xid), there is no biologically relevant residual activity or any dominant negative effect versus other proteins.
Human Mutation, 2006
For the Immunogenetics Special Issue. X-linked agammaglobulinemia (XLA) is a hereditary immunodeficiency caused by mutations in the gene encoding Bruton tyrosine kinase (BTK). XLA patients have a decreased number of mature B cells and a lack of all immunoglobulin isotypes, resulting in susceptibility to severe bacterial infections. XLA-causing mutations are collected in a mutation database (BTKbase), which is available at http://bioinf.uta.fi/BTKbase. For each patient the following information is given (when available): the identification of the entry, a plain English description of the mutation followed by a reference, formal characterization of the mutation, and the various parameters from the patient. BTKbase is implemented with the MUTbase program suite, which provides an easy, interactive, and quality-controlled submission of information to mutation databases. BTKbase version 8 lists mutation entries of 1,111 patients from 973 unrelated families showing 602 unique molecular events. The localization of the mutations on the gene and protein for BTK can be analyzed by clicking sequences on the web pages. The distribution of the mutations in the five structural domains is approximately proportional to the length of the domains, except for the Tec homology (TH) domain. The most frequently affected sites are CpG dinucleotides. The majority of the missense mutations are structural, disturbing Bruton tyrosine kinase (Btk) folding or decreasing stability. Many of the mutations affect functionally significant, conserved residues. The structural consequences of the mutations in all the domains have been studied based on crystallographic and nuclear magnetic resonance (NMR) structures as well as computer-aided molecular modeling.
Human Mutation, 1999
X-linked agammaglobulinemia (XLA) is an immunodeficiency caused by mutations in the gene coding for Bruton agammaglobulinemia tyrosine kinase (BTK). A database (BTKbase) of BTK mutations lists 544 mutation entries from 471 unrelated families showing 341 unique molecular events. In addition to mutations, a number of variants or polymorphisms have been found. Mutations in all five domains of BTK cause the disease, the single most common event being missense mutations. Most mutations lead to truncation of the enzyme. The mutations appear almost uniformly throughout the molecule. About one-third of point mutations affect CpG sites, which usually code for arginine residues. The putative structural implications of all the missense mutations are provided in the database. BTKbase is available at http://www.uta.fi/imt/bioinfo.
Human Mutation, 2012
High-throughput sequencing data generation demands the development of methods for interpreting the effects of genomic variants. Numerous computational methods have been developed to assess the impact of variations because experimental methods are unable to cope with both the speed and volume of data generation. To harness the strength of currently available predictors, the Pathogenic-or-Not-Pipeline (PON-P) integrates five predictors to predict the probability that nonsynonymous variations affect protein function and may consequently be disease related. The random forest-based PON-P shows consistently improved performance in cross-validation tests and on independent test sets, providing a ternary classification and a statistical reliability estimate of results. Applied to missense variants in a melanoma cancer cell line, PON-P predicts variants in 17 genes to affect protein function. Previous studies implicate nine of these genes in the pathogenesis of various forms of cancer. PON-P may thus be used as a first step in screening and prioritizing variants to determine deleterious ones for further experimentation.
Human Mutation, 2006
For the Immunogenetics Special Issue. Primary immunodeficiencies (IDs) are a heterogeneous group of inherited disorders of the immune system. Immunodeficiency patients have increased susceptibility to recurrent and persistent, even life-threatening infections. Mutations in a large number of genes can cause defects in different cellular functions and lead to impaired immune response. To date, approximately 150 IDs and more than 100 affected genes have been identified. ID-related genes are distributed throughout the genome, and diseases can be inherited in an X-linked, an autosomal recessive, or an autosomal dominant way. We have collected ID mutation data into locus-specific patient-related mutation databases, IDbases (http://bioinf.uta.fi/IDbases). Mutations are described at DNA, mRNA, and protein levels with links to reference sequences and reference articles. The mutation data has been collated into entries along with some clinical information. IDbases offer an easy way, e.g., to find recently identified mutations, to reveal genotype-phenotype correlations, and to discover a specific mutation or to examine the most common mutations in a single immunodeficiency-related gene. At the moment we have databases for 107 ID genes with 4,140 public patient entries. An exhaustive statistical analysis of mutation data from the IDbases was made. Missense and nonsense mutations are the most common mutation types, and the most common single substitution is a nonsense mutation from tryptophan to a stop codon. Arginine is the most mutated as well as the most abundant mutant amino acid.
Human Mutation, 2005
A large number of disease-causing mutations have been identified from several protein kinases. KinMutBase is a comprehensive knowledge base for human disease-related mutations in protein kinase domains (http://bioinf.uta.fi/KinMutBase/). The latest version contains 582 different mutations for 1,790 cases in 1,322 families. KinMutBase entries are described on the DNA, mRNA, and protein level. Numbers for affected patients and families are also provided. KinMutBase has an extensive number of links and cross-references to literature, other databases, and information sources. There are numerous interactive pages about sequences, structures, mutation statistics, and diseases. A detailed statistical study was done on frequencies of different types of mutations on both the DNA and protein level in serine/threonine kinases (PSKs) and tyrosine kinases (PTKs). Three-dimensional structures indicate clustering of disease-related mutations mainly to conserved subdomains, and substrate and coligand binding amino acids, although mutations appear throughout the sequences. CpG-containing codons, especially for arginine, constitute the majority of mutational hotspots. There are certain clear differences in mutation patterns and types between PSKs and PTKs.
Human Mutation, 2007
Communicated by Richard Cotton. PhenCode (Phenotypes for ENCODE; www.bx.psu.edu/phencode) is a collaborative, exploratory project to help understand phenotypes of human mutations in the context of sequence and functional data from genome projects. Currently, it connects human phenotype and clinical data in various locus-specific databases (LSDBs) with data on genome sequences, evolutionary history, and function from the ENCODE project and other resources in the UCSC Genome Browser. Initially, we focused on a few selected LSDBs covering genes encoding alpha- and beta-globins (HBA, HBB), phenylalanine hydroxylase (PAH), blood group antigens (various genes), androgen receptor (AR), cystic fibrosis transmembrane conductance regulator (CFTR), and Bruton's tyrosine kinase (BTK), but we plan to include additional loci of clinical importance, ultimately genomewide. We have also imported variant data and associated OMIM links from Swiss-Prot. Users can find interesting mutations in the UCSC Genome Browser (in a new Locus Variants track) and follow links back to the LSDBs for more detailed information. Alternatively, they can start with queries on mutations or phenotypes at an LSDB and then display the results at the Genome Browser to view complementary information such as functional data (e.g., chromatin modifications and protein binding from the ENCODE consortium), evolutionary constraint, regulatory potential, and/or any other tracks they choose. We present several examples illustrating the power of these connections for exploring phenotypes associated with functional elements, and for identifying genomic data that could help to explain clinical phenotypes.
BMC Biochemistry, 2012
Background: STAT1 is an essential transcription factor for interferon-γ-mediated gene responses. A distinct sumoylation consensus site (ψKxE), IKTE (residues 702-705), is localized in the C-terminal region of STAT1, where Lys703 is a target for PIAS-induced SUMO modification. Several studies indicate that sumoylation has an inhibitory role on STAT1-mediated gene expression, but the molecular mechanisms are not fully understood. Results: Here, we have performed a structural and functional analysis of sumoylation in STAT1. We show that deconjugation of SUMO by SENP1 enhances the transcriptional activity of STAT1, confirming a negative regulatory effect of sumoylation on STAT1 activity. Inspection of a molecular model indicated that the consensus site is well exposed to SUMO conjugation in the STAT1 homodimer and that the conjugated SUMO moiety is directed towards DNA, and is thus able to form a steric hindrance affecting promoter binding of dimeric STAT1. In addition, oligoprecipitation experiments indicated that the sumoylation-deficient STAT1 E705Q mutant has higher DNA-binding activity on STAT1-responsive gene promoters than wild-type STAT1. Furthermore, the sumoylation-deficient STAT1 E705Q mutant displayed enhanced histone H4 acetylation on an interferon-γ-responsive promoter compared to wild-type STAT1. Conclusions: Our results suggest that sumoylation participates in regulation of STAT1 responses by modulating DNA-binding properties of STAT1.