Gaurav Bhalotia | University of California, Berkeley

Papers by Gaurav Bhalotia

Conference on Management of Data, 2016

Copyright information: Taken from "Tools for loading MEDLINE into a local relational database". BMC Bioinformatics 2004;5:146. Published online 7 Oct 2004. PMCID: PMC524480. Copyright © 2004 Oliver et al; licensee BioMed Central Ltd. Database schema designers determine how entities and elements are converted to table names or field names in the database schema. See Figure 2 for the author table.

Copyright information: Taken from "Tools for loading MEDLINE into a local relational database". BMC Bioinformatics 2004;5:146. Published online 7 Oct 2004. PMCID: PMC524480. Copyright © 2004 Oliver et al; licensee BioMed Central Ltd. In Step 2, the conversion software parses the XML files and either loads the data directly into the database (2a), or writes the data out to intermediate text files (2b). If intermediate text files are generated, data from those files are loaded into the database as a separate step in Step 3.
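
The two loading paths described in this caption can be sketched roughly as follows. This is a minimal illustration, not the tools from the paper: the XML sample, its element names (`CitationSet`, `Citation`, `PMID`, `Title`), the `citation` table, and all function names are hypothetical simplifications, and SQLite stands in for whatever relational database is actually used.

```python
import csv
import io
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified sample; real MEDLINE XML is far richer.
SAMPLE_XML = """
<CitationSet>
  <Citation><PMID>1</PMID><Title>First article</Title></Citation>
  <Citation><PMID>2</PMID><Title>Second article</Title></Citation>
</CitationSet>
""".strip()


def parse_citations(xml_text):
    """Step 2: parse the XML and yield (pmid, title) rows."""
    root = ET.fromstring(xml_text)
    for node in root.iter("Citation"):
        yield int(node.findtext("PMID")), node.findtext("Title")


def load_rows(conn, rows):
    """Insert rows into a hypothetical 'citation' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS citation (pmid INTEGER PRIMARY KEY, title TEXT)"
    )
    conn.executemany("INSERT INTO citation VALUES (?, ?)", rows)
    conn.commit()


def write_intermediate(rows, out):
    """Step 2b: write rows to a tab-delimited text stream."""
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)


def load_intermediate(conn, src):
    """Step 3: load the intermediate text into the database."""
    rows = ((int(pmid), title) for pmid, title in csv.reader(src, delimiter="\t"))
    load_rows(conn, rows)


def fetch_all(conn):
    return conn.execute("SELECT pmid, title FROM citation ORDER BY pmid").fetchall()


def run_direct():
    """Step 2a: parse and load straight into the database."""
    conn = sqlite3.connect(":memory:")
    load_rows(conn, parse_citations(SAMPLE_XML))
    return fetch_all(conn)


def run_via_text():
    """Steps 2b and 3: parse to intermediate text, then load that text."""
    buf = io.StringIO()
    write_intermediate(parse_citations(SAMPLE_XML), buf)
    buf.seek(0)
    conn = sqlite3.connect(":memory:")
    load_intermediate(conn, buf)
    return fetch_all(conn)
```

Both paths should leave the same rows in the table; the intermediate-file route only adds a separate bulk-load step.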

The classical association rule mining algorithms assume a horizontal data layout, wherein each row in the database records a transaction and the items present in that transaction. Of late, there has been considerable interest in alternative vertical data representations, wherein each item is associated with a column of values representing the transactions in which it is present. The vertical mining algorithms that have been proposed show performance improvements over their horizontal counterparts, but suffer from some limitations: they are either efficient only for certain database sizes, or assume specific characteristics of the database contents, or are applicable only to special kinds of database schemas. To address the above limitations, we present a new vertical mining algorithm called VIPER (Vertical Itemset Partitioning for Efficient Rule-extraction). VIPER is a "general-purpose" algorithm, which makes no assumptions about the underlying database, and integrates a number of novel optimizations. We analyze the performance of VIPER for a range of synthetic database workloads. Our experimental results indicate significant performance gains, especially for large databases, over previously proposed vertical and horizontal mining algorithms.
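
To make the horizontal versus vertical distinction concrete, here is a minimal sketch that converts a horizontal transaction table into a vertical one and counts itemset support by intersecting transaction-id sets. It illustrates only the layout idea from the abstract; it is not VIPER, whose representation and optimizations are not described here. The function names and the sample data are assumptions made for illustration.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Convert horizontal {tid: items} into vertical {item: set of tids}."""
    vertical = defaultdict(set)
    for tid, items in transactions.items():
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def support(vertical, itemset):
    """Count transactions containing every item by intersecting tid sets."""
    tid_sets = [vertical.get(item, set()) for item in itemset]
    if not tid_sets:
        return 0
    return len(set.intersection(*tid_sets))


# Hypothetical horizontal layout: one row per transaction.
horizontal = {
    1: {"bread", "milk"},
    2: {"bread", "butter"},
    3: {"bread", "milk", "butter"},
    4: {"milk"},
}

vertical = to_vertical(horizontal)
```

In the vertical form, the support of an itemset comes from a set intersection over the columns of its items rather than from a scan over all transaction rows.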

IEEE Data(base) Engineering Bulletin, 2001

Keyword search of documents on the Web has been enormously successful. Much data is resident in databases: organizational, government, scientific, and medical data; the deep web.

In many applications, users specify target values for certain attributes/features without requiring exact matches to these values in return. Instead, the result is typically a ranked list of the top k objects that best match the specified feature values. User subjectivity is an important aspect of such queries, i.e., which objects are relevant to the user and which are not depends on the perception of the user. Due to the subjective nature of similarity-based retrieval, the answers returned by the system to a user query often do not satisfy the user’s information need right away; either because the weights and the distance functions associated with the features do not accurately capture the user’s perception or because the specified target values do not fully capture her information need or both. The most commonly used technique to overcome this problem is query refinement. In this technique, the user provides to the system some feedback on the “relevance” of the answers to the user...
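
As a rough illustration of the kind of query described above, the sketch below ranks objects by a weighted distance to the target feature values and returns the top k. The weighted L1 distance, the function names, and the sample objects are all assumptions made for this example; the abstract does not specify the distance functions, and the refinement step itself is not shown.

```python
import heapq


def weighted_distance(obj, target, weights):
    """Weighted L1 distance over the features named in the target (an assumed metric)."""
    return sum(weights[f] * abs(obj[f] - target[f]) for f in target)


def top_k(objects, target, weights, k):
    """Return the k objects closest to the target under the given weights."""
    return heapq.nsmallest(
        k, objects, key=lambda o: weighted_distance(o, target, weights)
    )


# Hypothetical objects and query.
objects = [
    {"id": "a", "price": 100, "rooms": 3},
    {"id": "b", "price": 120, "rooms": 4},
    {"id": "c", "price": 200, "rooms": 4},
]
target = {"price": 110, "rooms": 4}

# Two weightings of the same query, standing in for two user perceptions.
price_sensitive = top_k(objects, target, {"price": 1.0, "rooms": 10.0}, k=2)
rooms_sensitive = top_k(objects, target, {"price": 0.01, "rooms": 10.0}, k=2)
```

The same target values yield different top-k lists under the two weightings, which is the gap between system weights and user perception that the abstract points to.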
