Gerald Friedland - Academia.edu
Papers by Gerald Friedland
eLearn, 2009
Progress in multimedia capture, analysis, and delivery, combined with the rapid adoption of broadband communication, has resulted in educational multimedia systems that have advanced traditional forms of teaching and learning. New trends in technology, such as mobile multimedia or advanced approaches for the automatic analysis of multimodal signals, offer novel and exciting opportunities for teaching and learning. However, many scientists have stopped working on the topic because they were disappointed with mediocre results that seemed to have no impact. In this article, we investigate the reasons for this and argue that the question of how multimedia can really make education more exploratory and enjoyable is as yet unanswered, and we are just beginning to understand the real contribution of multimedia to education. Based on a brief overview of the history of educational multimedia systems and a rough analysis of the current situation, we venture a glimpse into the future and argue that educational multimedia is (still) a vivid and relevant area for research.
Appscio: A Software Environment for Semantic Multimedia Analysis
2008 IEEE International Conference on Semantic Computing, 2008
Automated Lecture Recording
Encyclopedia of Multimedia, 2008
Proceedings of the international workshop on Educational multimedia and multimedia education - Emme '07, 2007
This paper introduces a system that enables a presenter to work directly on the projected screen without interrupting the presentation flow by mouse or keyboard use. The presenter uses their fingers as a natural pointing device. The system consists of an LED and a webcam as well as a small software system that runs on the presenter's notebook. The camera observes the LED position and the software converts it into mouse events.
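The last step of that abstract, turning an observed LED position into a pointer position, can be illustrated with a minimal sketch. The abstract does not describe the actual detection or calibration method, so everything here is an assumption: the frame is a plain grayscale grid, the LED is taken to be the brightest pixel above a hypothetical threshold, and the camera view is assumed to cover exactly the projected screen. The names `find_led` and `to_screen` are illustrative and do not come from the paper.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale rows, pixel values 0..255 (assumed format)


def find_led(frame: Frame, threshold: int = 200) -> Optional[Tuple[int, int]]:
    """Return (x, y) of the brightest pixel at or above threshold, or None."""
    best = None
    best_value = threshold - 1
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value > best_value:
                best_value = value
                best = (x, y)
    return best


def to_screen(pos: Tuple[int, int],
              cam_size: Tuple[int, int],
              screen_size: Tuple[int, int]) -> Tuple[int, int]:
    """Linearly scale camera pixel coordinates to screen coordinates.

    Assumes the camera image is aligned with the projected screen and
    that both dimensions of cam_size are greater than 1.
    """
    x, y = pos
    cam_w, cam_h = cam_size
    scr_w, scr_h = screen_size
    return (x * (scr_w - 1) // (cam_w - 1), y * (scr_h - 1) // (cam_h - 1))


# A tiny 4x3 frame with a bright spot in the bottom-right corner.
frame = [
    [10, 12, 11, 9],
    [10, 13, 12, 8],
    [11, 10, 14, 255],
]
led = find_led(frame)
if led is not None:
    pointer = to_screen(led, (4, 3), (1920, 1080))
```

A real system would also need perspective calibration between camera and projection, and a way to hand `pointer` to the operating system as a mouse event; both are left out here.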
Lecture Notes in Computer Science, 2005
Webcasting and recording of university lectures has become common practice. While much effort has been put into the development and improvement of formats and codecs, few scientists have studied how to improve the quality of the signal before it is digitized. Lecture halls or seminar rooms are not professional recording studios. Good-quality recordings require full-time technicians to set up and monitor the signals. This paper describes a tool that eases studioless voice recording by automating several tasks usually handled by audio technicians. The expert system measures the quality of the sound hardware used, monitors possible hardware malfunctions, prevents common user mistakes, and provides gain control and filter mechanisms.
Opportunities and challenges of parallelizing speech recognition
Automatic speech recognition enables a wide range of current and emerging applications such as automatic transcription, multimedia content analysis, and natural human-computer interfaces. This article provides a glimpse of the opportunities and challenges that parallelism provides for automatic speech recognition and related application research from the point of view of speech researchers. The increasing parallelism in computing platforms opens three major possibilities for speech recognition systems: improving recognition accuracy in non-ideal, everyday noisy environments; increasing recognition throughput in batch processing of speech data; and reducing recognition latency in real-time usage scenarios. We describe technical challenges, approaches we've taken, and possible directions for future research to guide the design of efficient parallel software and hardware infrastructures.
Multimedia event detection (MED) on user-generated content is the task of finding an event, e.g., a Flash mob or Attempting a bike trick, using its content characteristics. Recent research has focused on approaches that use semantically defined "concepts" trained with annotated audio clips. Using audio concepts allows us to show semantic evidence of their relationship to events, by looking at the probability distribution of the audio concepts per event. However, while the concept-based approach has been useful in image detection, audio concepts have generally not surpassed the performance of low-level audio features like Mel Frequency Cepstral Coefficients (MFCCs) in addressing the unstructured acoustic composition of video events. Such audio-concept-based systems could benefit from temporal information, due to one of the intrinsic characteristics of audio: it occurs across a time interval. This paper presents a multimedia event detection system that uses audio concepts; it exploits the temporal correlation of audio characteristics for each particular event at two levels. The first level involves analyzing the short- and long-term surrounding context information for the audio concepts, through an implementation of a Hierarchical Deep Neural Network (H-DNN), to determine engineered audio-concept features. At the second level, we use Hidden Markov Models (HMMs) to describe the continuous and non-stationary characteristics of the audio signal throughout the video. Experiments using the TRECVID MED 2013 corpus show that an HMM system based on audio-concept features can perform competitively when compared with an MFCC-based system.
Estimating User-Produced-Multimedia Classification Performance Efficiently with Python
Audio-based multimedia retrieval tasks may identify semantic information in audio streams, i.e., audio concepts (such as music, laughter, or a revving engine). Conventional Gaussian Mixture Models (GMMs) have had some success in classifying a reduced set of audio concepts. However, multi-class classification can benefit from context window analysis and the discriminating power of deeper architectures. Although deep learning has shown promise in various applications such as speech and object recognition, it has not yet met the expectations for other fields such as audio concept classification. This paper explores, for the first time, the potential of deep learning in classifying audio concepts on User-Generated Content videos. The proposed system comprises two cascaded neural networks in a hierarchical configuration to analyze the short- and long-term context information. Our system outperforms a GMM approach by a relative 54%, a Neural Network by 33%, and a Deep Neural Network by 12% on the TRECVID-MED database.
The placing task: A large-scale geo-estimation challenge for social-media videos and images
The Benchmark as a Research Catalyst: Charting the Progress of Geo-prediction for Social Multimedia
Multimodal Location Estimation of Videos and Images, 2014
2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008
State-of-the-art speaker diarization systems for meetings are now at a point where overlapped speech contributes significantly to the errors made by the system. However, little if any work has yet been done on detecting overlapped speech. We present our initial work toward developing an overlap detection system for improved meeting diarization. We investigate various features, with a focus on high-precision performance for use in the detector, and examine performance results on a subset of the AMI Meeting Corpus. For the high-quality signal case of a single mixed-headset channel signal, we demonstrate a relative improvement of about 7.4% DER over the baseline diarization system, while for the more challenging case of the single far-field channel signal the relative improvement is 3.6%. We also outline steps towards improvement and moving beyond this initial phase.
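For readers unfamiliar with the term, a relative improvement in diarization error rate (DER) is the reduction in error expressed as a fraction of the baseline error, not the absolute difference in percentage points. The sketch below shows the arithmetic only. The DER values in it are hypothetical, chosen so the result comes out near the 7.4% figure quoted above; the abstract does not report the underlying baseline or improved DER.

```python
def relative_improvement(baseline_der: float, new_der: float) -> float:
    """Relative error reduction in percent: (baseline - new) / baseline * 100."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - new_der) / baseline_der * 100.0


# Hypothetical values for illustration only, not taken from the paper.
baseline = 20.0   # baseline DER in percent (assumed)
improved = 18.52  # DER with overlap handling in percent (assumed)

gain = relative_improvement(baseline, improved)  # roughly 7.4 (percent, relative)
```

With these assumed numbers the absolute gain is only 1.48 percentage points, which is why the relative and absolute figures should not be confused.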
Teaching Privacy: Multimedia Making a Difference
IEEE MultiMedia, 2015
Article about an interdisciplinary collaboration, Teachingprivacy.com, to create principles and guidelines for teaching young people and the general population about privacy. Julia Bernd, Blanca Gordo, Jaeyoung Choi, Bryan Morgan, Nicholas Henderson, Serge Egelman, Daniel Garcia, Gerald Friedland. Teaching Privacy: Multimedia Making a Difference. IEEE MultiMedia, IEEE Computer Society, pp. 2-9, January-March 2015.
Message from the workshop co-chairs
Proceedings - 2011 International Conference on Network-Based Information Systems, NBiS 2011, 2011
The future of multimedia education and educational multimedia
Proceedings of the international workshop on Educational multimedia and multimedia education - Emme '07, 2007
This document introduces a panel for the ACM Workshop on Educational Multimedia and Multimedia Education (ACM EMME 2007) held in conjunction with ACM Multimedia 2007.
Proceedings of the 2004 ACM SIGMM workshop on Effective telepresence - ETP '04, 2004
This paper presents E-Chalk, a software system that transforms a large touch-sensitive screen into a smart teaching tool. The instructor writes on the screen using a special stylus and the software emulates a classical chalkboard. The lecturer can paste images to the board, send queries to remote web services, activate a computer algebra system, and paste interactive Java applets on the board. A copy of the lecture's audio, the board strokes (and an optional video) is stored on a server. The lecture is also transmitted live over the Internet and can be synchronized with teleconferencing systems for student feedback.
Pharmacokinetic Drug Interactions Between Opioid Agonist Therapy and Antiretroviral Medications
JAIDS Journal of Acquired Immune Deficiency Syndromes, 2006
Opioid dependence and HIV/AIDS are 2 of the most serious yet treatable diseases worldwide. Global access to opioid agonist therapy and HIV treatment is expanding but when concurrently used, problematic pharmacokinetic drug interactions can occur. We reviewed English, Spanish, French, and Italian language articles from 1966 to 2005 in Medline using the following keywords: HIV, AIDS, HIV therapy, antiretroviral therapy, HAART, drug interactions, methadone, and buprenorphine. Additionally, we reviewed abstracts from national and international meetings and conference proceedings. Selected references from these articles were reviewed as well. Clinical case series and carefully controlled pharmacokinetic interaction studies have been conducted between methadone and most approved antiretroviral therapies. Important pharmacokinetic drug interactions have been demonstrated within each class of agents, affecting either methadone or antiretroviral agents. Few studies, however, have been conducted with buprenorphine. The metabolism of both therapies, description of the known interactions, and clinical implications and management of these interactions are reviewed. Certain interactions between methadone and antiretroviral medications are known and may have important clinical consequences. To optimize care, clinicians must be alert to these interactions and have a basic knowledge regarding their management.
International Journal on Artificial Intelligence Tools, 2004
This article presents a system that maps classroom lectures directly into web-based education lessons. In the classroom, the lecturer writes on a wide, pen-sensitive display. The system tracks all actions and makes it possible for the lecture to be replayed from the web at any time. The remote viewer can follow the progress of the lecture: audio, the creation of the board content, and an optional video image of the instructor are transmitted. In addition to the usual drawing functionality, the board can handle a range of multimedia elements from the Internet. The board can integrate different kinds of modules, invoked by board drawings. One of these modules is described here: a computer algebra system that evaluates mathematical expressions or plots functions is placed at the lecturer's disposal through handwriting recognition.