MarsyasX: multimedia dataflow processing with implicit patching

Implicit patching for dataflow-based audio analysis and synthesis

2005

Programming software for audio analysis and synthesis is challenging. Dataflow-based approaches provide a declarative specification of computation and result in efficient code. Most practitioners of computer music are familiar with some form of dataflow programming where audio applications are constructed by connecting components with “wires” that carry data. Examples include networks of unit generators in Music-V style languages and visual patches in Max/MSP or Pd.

Eclipse: Heterogeneous Multiprocessor Architecture for Flexible Media Processing

IEEE Design & Test of Computers, 2002

Eclipse is a scalable architecture template for designing data-dependent stream-processing subsystems of media-processing SoCs. It combines application configuration flexibility with the efficiency of function-specific coprocessors that concurrently execute the tasks of one or more applications.

SONIC – A Plug-In Architecture for Video Processing

Lecture Notes in Computer Science, 1999

This paper presents the SONIC reconfigurable computing architecture and the first implementation, SONIC-1. SONIC is designed to support the software plug-in methodology to accelerate video image processing applications. SONIC differs from other architectures through the use of Plug-In Processing Elements (PIPEs) and the Application Programmer's Interface (API). Each PIPE contains a reconfigurable processor, a scaleable router that also formats video data, and a frame-buffer memory. The SONIC architecture integrates multiple PIPEs together using a specialised bus structure which enables flexible and optimal pipelined processing. SONIC-1 communicates with the host PC through the PCI bus and has 8 PIPEs. We have developed an easy-to-use API which allows SONIC-1 to be used by multiple applications simultaneously. Preliminary results show that a 19-tap separable 2-D FIR filter implemented on a single PIPE achieves processing rates of more than 15 frames per second operating on 512 x 512 video transferred over the PCI bus. We estimate that using all 8 PIPEs, we could obtain real-time processing rates for complex operations such as image warping.
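To illustrate why separability matters for a filter such as the 19-tap one mentioned above, here is a minimal Python sketch. It is not SONIC code: the function names, the zero-padded borders, and the small integer kernel are assumptions chosen for illustration. A separable 2-D FIR filter is applied as a 1-D pass over the rows followed by a 1-D pass over the columns, which for 19 taps costs 2 x 19 = 38 multiplications per pixel instead of 19 x 19 = 361.

```python
def fir_1d_same(signal, taps):
    """Apply a 1-D FIR kernel centred on each sample, with zero-padded borders.

    The output has the same length as the input. The kernel is applied
    without flipping, which equals convolution when the taps are symmetric.
    """
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0
        for k, tap in enumerate(taps):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += tap * signal[j]
        out.append(acc)
    return out


def separable_filter(image, taps):
    """Filter a 2-D image (list of rows) with the same 1-D kernel on both axes."""
    # Pass 1: filter every row.
    rows = [fir_1d_same(row, taps) for row in image]
    # Pass 2: filter every column of the row-filtered image.
    cols = [fir_1d_same(list(col), taps) for col in zip(*rows)]
    # cols is indexed [column][row]; transpose back to [row][column].
    return [list(row) for row in zip(*cols)]


# A unit impulse reveals the equivalent 2-D kernel: the outer product of the taps.
impulse = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
response = separable_filter(impulse, [1, 2, 1])
print(response)
```

The impulse response printed here is the outer product of the 1-D taps with themselves, which is the 2-D kernel a non-separable implementation would have to apply directly.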

Computer-aided parallelization of continuous media applications

Proceedings of the seventh ACM international conference on Multimedia (Part 1), 1999

Parallel servers for I/O- and compute-intensive continuous media applications are difficult to develop. A server application comprises many threads located in different address spaces as well as files striped over multiple disks located on different computers. The present contribution describes the construction of a continuous media server, the 4D beating heart slice server, based on a computer-aided parallelization tool (CAP) and on a library of parallel file system components enabling the combination of pipelined parallel disk access and processing operations. Thanks to CAP, the presented architecture is concisely described as a set of threads, operations located within the threads, and flows of data and parameters (tokens) between operations. Continuous media applications are supported by allowing tokens to be suspended during a period of time specified by a user-defined function. Our target application, the 4D beating heart server, supports the extraction of freely oriented slices from a 4D beating heart volume (one 3D volume per time sample). This server application requires both a high I/O throughput for reading from disk the set of 4D subvolumes (extents) intersecting the desired slices and a large amount of processing power to extract these slices and to resample them into the display grid. With a server configuration of 3 PCs and 24 disks, up to 7.3 slices can be delivered per second, i.e. 43 MB/s are continuously read from disks and 4.1 MB/s of slice parts are extracted, transferred to the client, merged, buffered and displayed. This performance is close to the maximal performance deliverable by the underlying hardware. The observed single stream server delay jitter varies between 0.6s (52% of maximal display rate) and 1.4s (92% of the maximal display rate). For the same resource utilization, the jitter is proportional to the number of streams that are accessed synchronously.
The presented 4D beating heart application suggests that powerful continuous media server applications can be built on top of a set of simple PCs connected to SCSI disks.

MarsyasX

2008

The design and implementation of multimedia signal processing systems is challenging, especially when efficiency and real-time performance are desired. In many modern applications, software systems must be able to handle multiple flows of various types of multimedia data such as audio and video. Researchers frequently have to rely on a combination of different software tools for each modality to assemble proof-of-concept systems that are inefficient, brittle and hard to maintain. Marsyas is a software framework originally developed to address these issues in the domain of audio processing. In this paper we describe MarsyasX, a new open-source cross-modal analysis framework that aims at a broader scope of applications. It follows a dataflow architecture where complex networks of processing objects can be assembled to form systems that can handle multiple and different types of multimedia flows with expressiveness and efficiency.
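The idea of assembling a network of processing objects without drawing individual connections can be sketched in a few lines of Python. This is an illustration only and not the MarsyasX API: the class names `Block`, `Series` and `Fanout`, the `process` method, and the list-of-floats frame format are all assumptions made for this example. The point it shows is that the composite type alone decides how children are wired, so no explicit "wires" are declared.

```python
class Block:
    """A processing object that maps one frame (a list of floats) to another."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def process(self, frame):
        return self.fn(frame)


class Composite(Block):
    """Base class for blocks that contain other blocks."""

    def __init__(self, name, children):
        self.name = name
        self.children = list(children)


class Series(Composite):
    """Children are chained in insertion order: output of one feeds the next."""

    def process(self, frame):
        for child in self.children:
            frame = child.process(frame)
        return frame


class Fanout(Composite):
    """Every child receives the same input; their outputs are concatenated."""

    def process(self, frame):
        out = []
        for child in self.children:
            out.extend(child.process(frame))
        return out


# The connections follow from the nesting: gain -> (mean, peak).
net = Series("net", [
    Block("gain", lambda f: [2.0 * x for x in f]),
    Fanout("features", [
        Block("mean", lambda f: [sum(f) / len(f)]),
        Block("peak", lambda f: [max(abs(x) for x in f)]),
    ]),
])

print(net.process([1.0, -3.0, 2.0]))
```

Adding another feature extractor in this sketch means appending one child to the `Fanout`; no other part of the network description changes.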

Flexible scheduling for dataflow audio processing

Proc. ICMC, 2006

The notions of audio and control rate have been a pervasive feature of audio programming languages and environments. Real-time computer music systems depend on schedulers to coordinate and order the execution of many tasks over the course of time. In this paper we describe the scheduling infrastructure of Marsyas-0.2, an open source framework for audio analysis and synthesis. We describe how to support multiple, simultaneous, dynamic control rates while retaining the efficiency of block audio ...

Exploring the concurrency of an MPEG RVC decoder based on dataflow program analysis

IEEE Transactions on Circuits and Systems for Video Technology, 2009

This paper presents an in-depth case study on dataflow-based analysis and exploitation of parallelism in the design and implementation of an MPEG reconfigurable video coding decoder. Dataflow descriptions have been used in a wide range of digital signal processing (DSP) applications, such as applications for multimedia processing and wireless communications. Because dataflow models are effective in exposing concurrency and other important forms of high level application structure, dataflow techniques are promising for implementing complex DSP applications on multicore systems, and other kinds of parallel processing platforms. In this paper, we use the CAL language as a concrete framework for representing and demonstrating dataflow design techniques. Furthermore, we also describe our application of the dataflow interchange format package (TDP), a software tool for analyzing dataflow networks, to the systematic exploitation of concurrency in CAL networks that are targeted to multicore platforms. Using TDP, one is able to automatically process regions that are extracted from the original network, and exhibit properties similar to synchronous dataflow (SDF) models. This is important in our context because powerful techniques, based on static scheduling, are available for exploiting concurrency in SDF descriptions. Detection of SDF-like regions is an important step for applying static scheduling techniques within a dynamic dataflow framework. Furthermore, segmenting a system into SDF-like regions also allows us to explore cross-actor concurrency that results from dynamic dependences among different regions. Using SDF-like region detection as a preprocessing step to software synthesis generally provides an efficient way for mapping tasks to multicore systems, and improves the system performance of video processing applications on multicore platforms.
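Static scheduling of an SDF region starts from its balance equations: for every edge, the number of tokens produced per schedule period must equal the number consumed. The following Python sketch computes the smallest integer firing counts (the repetition vector) for a small graph. It is a generic textbook-style illustration, not code from TDP or CAL; the function name `repetition_vector`, the edge tuple layout, and the example graph are assumptions made here.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(actors, edges):
    """Solve the SDF balance equations q[src] * produce == q[dst] * consume.

    actors: list of actor names; the graph is assumed to be connected.
    edges:  list of (src, produce, dst, consume) tuples with positive rates.
    Returns the smallest positive integer firing counts as a dict,
    or None if the rates are inconsistent (no periodic schedule exists).
    """
    # Propagate relative firing rates outward from the first actor.
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:
        changed = False
        for src, produce, dst, consume in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * produce / consume
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * consume / produce
                changed = True
    if len(rate) != len(actors):
        raise ValueError("graph is not connected")

    # Every edge, including ones that close cycles, must balance.
    for src, produce, dst, consume in edges:
        if rate[src] * produce != rate[dst] * consume:
            return None

    # Scale the fractional rates to the smallest integer solution.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    counts = {name: int(r * scale) for name, r in rate.items()}
    common = reduce(gcd, counts.values())
    return {name: n // common for name, n in counts.items()}


# Hypothetical three-actor chain: A produces 2 tokens, B consumes 3, and so on.
q = repetition_vector(["A", "B", "C"], [("A", 2, "B", 3), ("B", 1, "C", 2)])
print(q)
```

Once such a vector exists, a schedule that fires each actor the listed number of times returns every buffer to its initial fill level, which is what makes compile-time scheduling of SDF-like regions possible.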