7th Data Workshop

November 6, 2024 in Hangzhou, China

Seventh International Workshop on Data:

Data Acquisition & Analysis in the Era of AI

AGENDA

The workshop will be held on Wednesday, November 6th 2024, in Hangzhou, China.

NOTE: Times listed below are in Hangzhou local time. Click "link to world clock" to convert to your local time.

Datasets:

Water Supply System Dataset: Non-Invasive Sensor Data for Smart Water Pumps

9:00 - 9:20

Carmen Cheh, Aaron Han Yen Tay, Zhen Wei Ng, Binbin Chen, Xin Lou, Zaki Masood, David K.Y. Yau

BD3: Building Defects Detection Dataset for Benchmarking Computer Vision Techniques for Automated Defect Identification

9:20 - 9:40

Praveen Kottari, Pandarasamy Arjunan

Data-driven Soil Moisture Sensing with mmWave Radar

9:40 - 10:00

Yujie Zhuang, Yang Zhao, Jie Liu

Systems and Tools:

NILMInspector: An Interactive Tool for Data Visualization and Manipulation in Load Disaggregation

10:00 - 10:20

Mazen Bouchur, Andreas Reinhardt

Semantic Provisioning of IoT Devices for Autonomous Fault Detection Services

10:20 - 10:40

Hervé Pruvost, Andreas Wilde

A hybrid actor- and microservices-based platform for scalable smart building application deployment

10:40 - 11:00

Flavia de Andrade Pereira, Kyriakos Katsigarakis, Nikos Kostis, Dimitrios Rovas

Reviews and New Challenges:

Large Language Models for the Creation and Use of Semantic Ontologies in Buildings: Requirements and Challenges

11:00 - 11:20

Ozan Baris Mulayim, Lazlo Paul, Marco Pritoni, Anand Krishnan Prakash, Malavikha Sudarshan, Gabe Fierro

A Critical Review of Household Water Datasets

11:20 - 11:40

Justus Breyer, Maximilian Petri, Muhammad Hamad Alizai, Klaus Wehrle

Panel discussion:


ABOUT THE WORKSHOP

As the enthusiasm for and success of the Internet of Things (IoT), Cyber-Physical Systems (CPS), and Smart Buildings grow, so too does the volume and variety of data collected by these systems. How do we ensure that this data is of high quality, and how do we maximize the utility of collected data such that many projects can benefit from the time, cost, and effort of deployments? With the development of large AI models such as Large Language Models (LLMs), how can we combine cyber-physical data with these powerful tools? Large AI models, including recent varieties based on the transformer architecture, may assist in the acquisition, analysis, manipulation, and consumption of data.

The DATA: Data Acquisition To Analysis in the Era of AI workshop aims to look broadly at interesting data from interesting sensing systems and/or how such data can be adapted to large models. The workshop considers problems, solutions, and results from all across the real-world data pipeline. We solicit submissions on unexpected challenges and solutions in the collection of datasets, on novel datasets of interest to the community, on experiences and results (explicitly including negative results) in using prior datasets to develop new insights, and on discussions of impact and newfound opportunities with large AI models.

LLMs could enhance data quality through sophisticated data cleaning, preprocessing, and augmentation techniques. LLMs can facilitate analysis of data streams while identifying anomalies, inconsistencies, and potential biases. Generative AI can also create synthetic datasets that maintain the essential characteristics of real-world data while expanding the available training samples. This may be valuable when real data is challenging to obtain due to privacy concerns or logistical constraints. Transformer models can integrate multi-modal data, such as blending textual inputs from sensor logs with quantitative data from measurements. This new flavor of AI-driven analysis can factor in more contextual information, opening new areas of research in enhancing the predictive and diagnostic capabilities of data-driven AI systems deployed in smart environments.

Furthermore, new areas of future work may emerge from exploring the ethical implications of deploying LLMs within these domains—ensuring that the benefits of AI are equitably distributed while safeguarding user privacy. The workshop's focus on privacy challenges and solutions becomes increasingly relevant in the era of AI, where the capacity to analyze vast amounts of sensitive data poses significant risks.

The workshop aims to bring together a community of application researchers and algorithm researchers in the sensing systems and building domains to promote breakthroughs from integration of the generators and users of datasets. The workshop will foster cross-domain understanding by enabling both the understanding of application needs and data collection limitations.

CALL FOR PAPERS

The workshop seeks contributions across two major thrusts, but is open to a broad view of interesting questions around the collection, dissemination, and use of data as well as interesting datasets:

The collection, evaluation, analysis, and use of data

New and interesting datasets, including but not limited to:

To enable the longevity and continued utility of submitted datasets, all datasets must be uploaded to a permanent data repository such as Zenodo or CRAWDAD as part of the camera-ready preparation. Submissions may refer to datasets hosted on personal or temporary hosting, but this hosting must be made permanent by the time of publication.

Submission Format

Submissions may range from 1 to 5 pages in PDF format, excluding references, using the standard ACM conference template. DATA 2024 follows a single-blind review policy: the names and affiliations of all authors must be present in the submitted manuscript. Submissions are strongly encouraged to use only as much space as needed to clearly convey the significance of the work. We fully expect many submissions, especially datasets, to use only 1-2 pages, but wish to allow those interested in fully elucidating positions on data collection and use, or insights from reproducibility efforts, ample space to do so.

Submission Site

HotCRP link

Important Dates (UTC-12)

Workshop Paper Due: September 15, 2024, AoE (extended to September 20, 2024, AoE)

Workshop Paper Notification: September 27, 2024, AoE

Workshop Paper Camera Ready: October 4, 2024, AoE

Workshop Day: November 6th, 2024

ORGANIZATION

Web Chair

THE VENUE

The Dragon Hotel, Hangzhou, China

The 7th DATA workshop is co-located with SenSys 2024.

For venue details, visa information, etc., please visit the SenSys venue page.