Eric Tschetter - Academia.edu
Papers by Eric Tschetter
Druid is an open source data store designed for real-time exploratory analytics on large data sets. The system combines a column-oriented storage layout, a distributed, shared-nothing architecture, and an advanced indexing structure to allow for the arbitrary exploration of billion-row tables with sub-second latencies. In this paper, we describe Druid’s architecture, and detail how it supports fast aggregations, flexible filters, and low latency data ingestion.
Current trends in scientific collaboration focus on developing effective Web-based communication tools, such as instant messaging (i.e., IM) or video-conferencing. One objective is to provide informal communication opportunities for collaborating scientists. This paper focuses on developing an assistive technology for predicting where breakdown situations have likely occurred in chat communications. The immediate goal of the assistive technology is to help users cope with discrepancies between their expectations and the tools they are using during scientific collaboration. The ideas in this paper also have far-reaching implications, introducing a method that can be used by any human-interactive application to help the application understand when the user does not know what to do.
Footnote 1: A breakdown situation is a situation in which a conversation "breaks down" due to some influence external to the actual conversation. A technical failure with the communication tool or some other peripheral device is one example.
Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, 2014
Druid is an open source data store designed for real-time exploratory analytics on large data sets. The system combines a column-oriented storage layout, a distributed, shared-nothing architecture, and an advanced indexing structure to allow for the arbitrary exploration of billion-row tables with sub-second latencies. In this paper, we describe Druid's architecture, and detail how it supports fast aggregations, flexible filters, and low latency data ingestion.
Proceedings of the 50th Hawaii International Conference on System Sciences (2017), 2017
The Real-time Analytics Data Stack, colloquially referred to as the RADStack, is an open-source data analytics stack designed to provide fast, flexible queries over up-to-the-second data. It is designed to overcome the limitations of either a purely batch processing system (it takes too long to surface new events) or a purely real-time system (it's difficult to ensure that no data is left behind and there is often no way to correct data after initial processing). It will seamlessly return best-effort results on very recent data combined with guaranteed-correct results on older data. In this paper, we introduce the architecture of the RADStack and discuss our methods of providing interactive analytics and a flexible data processing environment to handle a variety of real-world workloads.