Reconfigurable Hardware Research Papers - Academia.edu
2025
In this paper, we report the design and implementation of a reconfigurable system that exploits regional clocking resources that exist in Xilinx Virtex-4 FPGAs for increased performance and, for the first time, enhanced reliability. Unlike previous approaches, our system is able to individually manage the regional clock buffers (BUFRs) to adjust the frequency delivered to each hardware task and to detect and recover from faults affecting the clock-tree on-the-fly. Towards this end, we propose global and regional clock multiplexers, named GCMUX and RCMUX respectively, which allow for switching to spare clocking resources whenever needed. These multiplexers are based on the inner programmable interconnection points of the FPGA, leading to zero area overheads.
2025, Journal of biomedical informatics
This paper introduces keytagging, a novel technique to protect medical image-based tests by implementing image authentication, integrity control and location of tampered areas, private captioning with role-based access control, traceability and copyright protection. It relies on the association of tags (binary data strings) to stable, semistable or volatile features of the image, whose access keys (called keytags) depend on both the image and the tag content. Unlike watermarking, this technique can associate information to the most stable features of the image without distortion. Thus, this method preserves the clinical content of the image without the need for assessment, prevents eavesdropping and collusion attacks, and obtains a substantial capacity-robustness tradeoff with simple operations. The evaluation of this technique, involving images of different sizes from various acquisition modalities and image modifications that are typical in the medical context, demonstrates that a...
2025, The Scientific World Journal
In FPGA-based control system design, partial reconfiguration is especially well suited to implementing preemptive systems. In real-time systems, the deadline of a critical task can compel the preemption of a noncritical one. In addition, an asynchronous event can demand immediate attention and force the launch of a reconfiguration process to implement a high-priority task. If the asynchronous event is scheduled in advance, an explicit activation of the reconfiguration process is performed. If the event cannot be programmed in advance, as in dynamically scheduled systems, an implicit activation of the reconfiguration process is required. This paper provides a hardware-based approach to explicit and implicit activation of the partial reconfiguration process in dynamically reconfigurable SoCs and includes all the necessary tasks to cope with this issue. Furthermore, the reconfiguration service introduced in this work allows remote invocation of the reconfiguration process and then th...
2025, Optics Express
Flexible photonic integrated circuit technology is an emerging field expanding the usage possibilities of photonics, particularly in sensor applications, by enabling the realization of conformable devices and introduction of new alternative production methods. Here, we demonstrate that disposable polymeric photonic integrated circuit devices can be produced in lengths of hundreds of meters by ultra-high volume roll-to-roll methods on a flexible carrier. Attenuation properties of hundreds of individual devices were measured confirming that waveguides with good and repeatable performance were fabricated. We also demonstrate the applicability of the devices for the evanescent wave sensing of ambient refractive index. The production of integrated photonic devices using ultrahigh volume fabrication, in a similar manner as paper is produced, may inherently expand methods of manufacturing low-cost disposable photonic integrated circuits for a wide range of sensor applications.
2025
The increasing demand from mobile users calls for better utilization of the available spectrum. In this regard, Cognitive Radio (CR) technology represents a novel solution for avoiding spectrum scarcity. The implementation of a CR system follows two different approaches: Mitola Radio, which takes control of all parameters, and Cognitive Radio with just Spectrum Sensing (SS) techniques. This paper is concerned with SS techniques. From a theoretical point of view, CR is addressed through several solutions based on novel SS methods. However, system and hardware development is progressing at a slower pace. Today, field-programmable gate arrays (FPGAs) and regular desktop computers are fast enough to handle complete baseband processing chains. There are several platforms, both open source and commercial, providing such solutions for supporting the development of novel SS methods. The aim of this paper is to give an overview of some of the available platforms and testbeds and the...
2025, Application-Specific Systems, Architectures, and Processors
Biosequence similarity search is an important application in modern molecular biology. Search algorithms aim to identify sets of sequences whose extensional similarity suggests a common evolutionary origin or function. The most widely used similarity search tool for biosequences is BLAST, a program designed to compare query sequences to a database. Here, we present the design of BLASTN, the version of BLAST that searches DNA sequences, on the Mercury system, an architecture that supports high-volume, high-throughput data movement off a data store and into reconfigurable hardware. An important component of application deployment on the Mercury system is the functional decomposition of the application onto both the reconfigurable hardware and the traditional processor. Both the Mercury BLASTN application design and its performance analysis are described.
2025
This paper presents a demonstrator for partial reconfiguration of FPGAs applied to image processing tasks. The main goal of the project is to develop an environment which allows users to assess some of the advantages of using dynamic reconfiguration. The demonstration platform is built around a Xilinx Virtex-5 FPGA, which is used to implement a chain of four reconfigurable filters for processing images. Using a graphical interface, the user can choose which filter goes into which reconfigurable slot, submit images for processing and visualize the outcome of the whole process.
2025
Description: In the field of bioinformatics we can find applications that pose a challenge for the design of new processor architectures in terms of performance, since their characteristics differ from those of applications for ...
2025, IEEE Access
Elliptic curve cryptography (ECC) is a widely deployed family of public-key cryptographic algorithms used in the design of key exchange, digital signature, and secure multiparty computation protocols. A compact and high-performance implementation of ECC is essential to enable deployments of associated protocols in privacy-preserving applications. Scalar point multiplication (SPM), the chief and performance-limiting primitive in ECC, is computationally intensive. To speed up the computation of SPM with low resource consumption, this paper presents ComCrypt, a novel compact and low-latency hardware architecture over any generic prime field. The proposed design features novel unified hardware architectures for low-level finite field arithmetic primitives. These architectures are developed by introducing optimizations at both the algorithmic and circuit levels. In these basic primitives, parallelism opportunities are exploited at the algorithmic level to increase the achievable frequency, whereas a novel resource-sharing strategy is deployed to reduce the hardware cost at the circuit level. As a result, the proposed SPM design produces better area-time product and efficiency results. It is implemented on Xilinx Virtex-7, Kintex-7, and Virtex-6 FPGA platforms for a 256-bit modulus length. It significantly improves latency and resource consumption compared to the existing ECC-based hardware accelerators.
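To make the role of scalar point multiplication concrete, here is a minimal software sketch of the textbook double-and-add method over a prime field. It is not the ComCrypt architecture: the toy curve y^2 = x^3 + 2x + 2 over F_17, the base point (5, 1), and all function names are assumptions chosen for illustration, and the code is not constant-time.

```python
from typing import Optional, Tuple

Point = Optional[Tuple[int, int]]  # None stands for the point at infinity

# Toy parameters (assumed for illustration only, far too small for real use).
P_MOD = 17        # prime field modulus
A, B = 2, 2       # curve: y^2 = x^3 + A*x + B over F_P_MOD
G = (5, 1)        # base point on the toy curve


def is_on_curve(pt: Point) -> bool:
    """Check the curve equation; the point at infinity counts as on the curve."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0


def point_add(p: Point, q: Point) -> Point:
    """Add two points using the affine group law."""
    if p is None:
        return q
    if q is None:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # q is the negative of p (covers doubling with y = 0)
    if p == q:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k: int, p: Point) -> Point:
    """Compute k*p by right-to-left double-and-add."""
    result: Point = None
    addend = p
    while k > 0:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


print(scalar_mult(2, G), scalar_mult(3, G))
```

Each loop iteration performs one doubling and at most one addition, and every one of those needs several modular multiplications plus an inversion, which is why the underlying finite field arithmetic dominates the cost. The modular inverse via `pow(x, -1, m)` requires Python 3.8 or later.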
2025, Volume 17, Issue 2
Impossible-differential cryptanalysis is one of the powerful methods utilized for evaluating the robustness of block ciphers; however, mCrypton is one of the block ciphers whose master key has not been recovered with this method in the single-key scenario. This paper first clarifies the branch number of the linear layer of mCrypton block ciphers with an observation. It has been shown that the branch number of the linear layer in mCrypton block cipher is four. Then, using this result, a 4-round impossible differential in a single-key scenario has been found. On the other hand, by exploiting the result of several observations, some vulnerabilities in the key-schedule algorithm were discovered and introduced. As a result, by exploiting the discovered vulnerabilities and 4-round property, impossible-differential cryptanalysis was successfully applied to seven rounds of mCrypton-64. To our knowledge, this is the first impossible differential cryptanalysis applied on mCrypton-64. In addition, this method requires $2^{36.0}$ bytes of memory, $2^{59.0}$ chosen plaintexts (with the corresponding ciphertexts), and $2^{59.6}$ encryptions to recover the master key.
2025
Multi-Index Driver Drowsiness Detection Method Based on Driver's Facial Recognition Using Haar Features and Histograms of Oriented Gradients.
2025, 1996 IEEE International Symposium on Circuits and Systems. Circuits and Systems Connecting the World. ISCAS 96
Artificial evolution can automatically derive the configuration of a reconfigurable hardware system such that it performs a given task. Individuals of the evolving population are evaluated when instantiated as real circuits, so if constraints inherent to human design (but not to evolution) are dropped, then the natural physical dynamics of the hardware can be exploited in new ways. The notion of an artificially evolving 'species' (SAGA) allows the open-ended incremental evolution of complex circuits. Theoretical arguments are given, as well as the real-world example of an evolved hardware robot controller.
2025, Proceedings 2002 NASA/DoD Conference on Evolvable Hardware
The purpose of this paper is twofold: first, to illustrate a stand-alone board-level evolvable system (SABLES) and its performance, and second, to illustrate some problems that occur during evolution with real hardware in the loop, or when the intention of the user is not completely reflected in the fitness function. SABLES is part of an effort to achieve integrated evolvable systems. SABLES provides autonomous, fast (tens to hundreds of seconds), on-chip evolution involving about 100,000 circuit evaluations. Its main components are a JPL Field Programmable Transistor Array (FPTA) chip used as transistor-level reconfigurable hardware, and a TI DSP that implements the evolutionary algorithm controlling the FPTA reconfiguration. The paper details an example of evolution on SABLES and points out certain transient and memory effects that affect the stability of solutions obtained by reusing the same piece of hardware for rapid testing of individuals during evolution. It also illustrates how specifications not completely reflected in the fitness function, such as the time scales of response for logical circuits, may lead to overall unsatisfactory solutions. Both such situations can be handled with appropriate modification of the fitness function and additional testing.
2025, Lecture Notes in Computer Science
In this paper we describe the hardware evolution of analog circuits performing signal separation tasks using JPL's Stand-Alone Board-Level Evolvable System (SABLES). SABLES integrates a Field Programmable Transistor Array chip (FPTA-2) and a Digital Signal Processor (DSP) implementing the Evolutionary Platform (EP). The FPTA-2 is a second generation reconfigurable mixed signal array chip whose cells can be programmed at the transistor level. Its chip architecture consists of an 8x8 matrix of reconfigurable cells. The FPTA-2 is reconfigured by evolution to achieve circuits that can extract a target signal that is combined with an undesired component or to perform the separation of a combination of two signals. The paper also considers an adaptive filter where the fitness function depends on the input signal. The results demonstrate that SABLES is not only able to perform signal separation and extraction, but it is also flexible enough to adapt to different input signals without human intervention, such as in the case of self-tuning and adaptive filters.
2025, NASA/DoD Conference on Evolvable Hardware, 2003. Proceedings.
This paper presents experimental results of fast intrinsic evolutionary design and evolutionary fault recovery of a 4-bit Digital to Analog Converter (DAC) using the JPL stand-alone board-level evolvable system (SABLES). SABLES is part of an effort to achieve integrated evolvable systems and provides autonomous, fast (tens to hundreds of seconds), on-chip evolution involving about 100,000 circuit evaluations. Its main components are a JPL Field Programmable Transistor Array (FPTA) chip used as transistor-level reconfigurable hardware, and a TI DSP that implements the evolutionary algorithm controlling the FPTA reconfiguration. The paper describes an experiment consisting of the hierarchical evolution of a 4-bit DAC using 20 cells of the FPTA chip. Fault recovery is demonstrated after applying stuck-at 0 faults to all switches of one particular cell, and using evolution to recover functionality. It has been verified that the functionality can be recovered in less than one minute after the fault is detected, while the evolutionary design of the 4-bit DAC from scratch took about 3 minutes. (1 LSB = Vref/16, for Vref = 453 mV.)
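The footnote's relation between the least significant bit and the reference voltage can be worked through numerically. The sketch below assumes an ideal unipolar 4-bit DAC whose output is simply the input code times one LSB; that transfer function and the names used are illustrative assumptions, not a description of the evolved circuit.

```python
V_REF_MV = 453.0                    # reference voltage from the footnote, in mV
N_BITS = 4                          # resolution of the DAC
LSB_MV = V_REF_MV / 2 ** N_BITS     # 1 LSB = Vref / 16


def ideal_dac_output_mv(code: int) -> float:
    """Ideal output in mV for a digital input code (assumed unipolar DAC)."""
    if not 0 <= code < 2 ** N_BITS:
        raise ValueError("code must fit in N_BITS bits")
    return code * LSB_MV


# Print the ideal transfer table for all 16 input codes.
for code in range(2 ** N_BITS):
    print(f"{code:04b} -> {ideal_dac_output_mv(code):.4f} mV")
```

With these figures one LSB is 28.3125 mV, so under the ideal model the full-scale code 1111 would correspond to 424.6875 mV.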
2025
Universidad Complutense de Madrid, Facultad de Informática, academic year 2009/2010
The goal of this project is to develop an environment for injecting faults into an FPGA (in this case a Virtex-II Pro) in order to evaluate the impact that alterations in the configuration memory of a reconfigurable device may have on a given design. These alterations emulate the effect that a cosmic particle could have on a RAM cell of the configuration memory. To do this, we load a specific circuit onto the FPGA, having first connected it to an input/output module that lets us send data to and receive data from the board. Once the circuit is loaded, we apply a series of inputs to it and collect its outputs. We then insert faults into the circuit by successively modifying the configuration bits of the FPGA and check whether the output is the expected one.
1 Framework
FPGAs (Field Programmable Gate Arrays) are devices containing logic blocks whose interconnection and functionality are programmable. Anything from a simple logic gate to a complex sequential circuit can be programmed on them. There are other ways of implementing digital circuits, such as ASICs (Application-Specific Integrated Circuits), microcontrollers (with a fixed instruction set), and CPLDs [9] (based on non-reconfigurable memories, but slower and denser than an FPGA). The advantages of FPGAs [3], [5] are that they are reconfigurable devices, they cost less than ASICs, and their circuits run faster than on other reprogrammable devices. Moreover, since this is a hardware-based solution, execution is "in parallel", which is not the case in a microcontroller, where instructions are executed sequentially. Reconfiguration also allows the execution of multiple tasks, multiplexed in time, which makes true hardware multitasking conceivable. We will work with a board based on a Virtex-II Pro FPGA [1], [8], [10], such as the one shown in Figure 1.1.
Figure 1.1 Photograph of the XUP Virtex-II Pro Development System board
As in any system, we needed to be able to read and write data on our FPGA. We had to find a way to communicate with the board. In previous courses we had worked with FPGAs (in that case on a board based on the Spartan III model), for which we entered the inputs using the switches and buttons provided on that board and viewed the outputs through both the LEDs and the 7-segment display of that board model.
• gFrecClk: indicates the clock frequency of the board. Using this generic we can implement our input/output module on another board that has a different clock. In our case, the clock frequency of the Virtex II Pro is 100 MHz.
• gBaud: indicates the transmission speed of the input/output unit. In our case, as stated earlier, we use 9600 bps.
From the 100 MHz board clock (Clk), we want to provide a signal with a frequency of 9600 Hz (ClkBaud). This clock therefore has a period of 104.167 µs, and is at '1' for a single clock cycle, remaining at '0' the rest of the time. Figure 2.7 shows how the clock frequency is divided.
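The division of the 100 MHz board clock down to a 9600 Hz tick can be modelled in software. The following Python sketch is only an illustration of the arithmetic: the function names and the round-to-nearest choice are assumptions, not taken from the project's hardware description. Only the names gFrecClk, gBaud, Clk and ClkBaud and the figures 100 MHz and 9600 bps come from the text.

```python
def baud_divider(g_frec_clk: int, g_baud: int) -> int:
    """Board clock cycles per baud tick, rounded to the nearest integer."""
    return round(g_frec_clk / g_baud)


def clk_baud(g_frec_clk: int, g_baud: int, cycles: int) -> list:
    """Model ClkBaud: 1 for a single Clk cycle once per divider cycles, else 0."""
    divider = baud_divider(g_frec_clk, g_baud)
    wave = []
    counter = 0
    for _ in range(cycles):
        counter += 1
        if counter == divider:
            wave.append(1)   # one-cycle pulse
            counter = 0
        else:
            wave.append(0)
    return wave


divider = baud_divider(100_000_000, 9600)
period_us = 1e6 / 9600       # nominal ClkBaud period in microseconds
print(divider, round(period_us, 3))
```

Because 100 MHz is not an exact multiple of 9600 Hz, an integer divider gives a tick rate that is only approximately 9600 Hz (about 9599.7 Hz with the rounding assumed here), which is well within the tolerance usually expected of a serial link.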
2.2. User interface
5. Run: Running means sending the test bench data to the circuit and receiving the outputs it produces in response to those inputs. To do so, the data, given as strings of 0s and 1s, are translated into the bytes to be sent. To receive, we perform the same process in reverse, so that from a series of bytes we obtain a representation as strings of 0s and 1s. While running, the output obtained is compared with a golden output (located in salidas/Golden.txt) ...
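The Run step described above, converting strings of 0s and 1s to bytes and back and then comparing against a golden output, can be sketched as follows. The function names, the most-significant-bit-first packing, and the requirement that each string be a multiple of 8 bits long are assumptions made for this illustration; the text does not specify them.

```python
def bits_to_bytes(bits: str) -> bytes:
    """Pack a string of '0'/'1' characters into bytes, MSB first (assumed)."""
    if len(bits) % 8 != 0 or set(bits) - {"0", "1"}:
        raise ValueError("expected a multiple of 8 characters, each '0' or '1'")
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def bytes_to_bits(data: bytes) -> str:
    """Inverse of bits_to_bytes: expand each byte into 8 characters."""
    return "".join(format(b, "08b") for b in data)


def compare_with_golden(outputs, golden):
    """Return the indices of test vectors whose output differs from golden."""
    return [i for i, (out, ref) in enumerate(zip(outputs, golden)) if out != ref]


# Hypothetical example: two output vectors, the second one corrupted.
golden = ["01000001", "11111111"]
observed = ["01000001", "11111110"]
print(compare_with_golden(observed, golden))
```

In a fault-injection campaign, a non-empty list of mismatching indices for a given modified configuration bit would indicate that the injected fault changed the observable behaviour of the circuit.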
2025
Now that we know how to measure $\cot\theta$, we need to know the value of $\cot\beta$. Referring to Figure 8, we suppose we have a camera with an $l \times l$ CID plane (usually 256 x 256), focal length $f_2$, and pixel size $b \times b$. Knowing which pixel we are looking at, we can calculate $\cot\beta_i$ from the following relationship: $\cot\beta_i = (n - i)\,b / f_2$. According to the above, if we have two look-up tables, one loaded with the values of $\cot\beta_i$, and the other one with the values of $\cot\alpha_i$ calculated from data extracted from the resultant bit plane of Figure 7d, we can find the value corresponding to $R_i$.
2025
Tao is a high-performance platform for implementing reconfigurable hardware designs.
2025
A large number of the world’s most important cultural heritage structures are built with carbonate stones and are particularly sensitive to deteriorative factors; as a consequence, the survival of many irreplaceable historical properties is in jeopardy. The deterioration of the stone is determined not only by physical and chemical effects, but also by biological agents. Plants are one of the least studied organisms in relation to building stone biodegradation. It is important to know the different species actively contributing to the biodeterioration of building stone, because their presence often leads to physical and chemical actions that cause deterioration of the structure of historical buildings. Moreover, these biodeteriorative factors can be used as bio-indicators of the state of abandonment of these monuments. The aim of this research was to identify flora, especially plant species growing on the walls of four Moroccan historical monuments. The results of the studies ca...
2025, 2008 Design, Automation and Test in Europe
The inductance and coupling effects in interconnects and non-linear receiver loads have resulted in complex input signals and output loads for gates in modern deep submicron CMOS technologies. As a result, the conventional method of timing characterization, which is based on lookup tables with input slew and output load capacitance as indices, is no longer adequate. The focus has now shifted to current source based standard cell models, which are based on the fundamental property of transconductance of MOSFETs. In this paper, we propose a systematic methodology for obtaining a current based delay model for gates, which can accommodate both single (SIS) and multi-input (MIS) switching signals of arbitrary shape and complex non-linear output loads. We use an analytical model for the gate output current expressed as a function of the node voltages. This results in an average error of less than 0.5%, with a maximum standard deviation of 2.5% in error, when compared with SPICE for a large number of standard cells. When compared with SPICE, using the proposed models gives stage delay and output slew with an average error of less than 3% and 2% respectively for arbitrary inputs and output load combinations.
2025, 2015 Asia-Pacific Symposium on Electromagnetic Compatibility (APEMC)
Integrated circuits (ICs) sometimes fail when their power supply is disrupted by external noise, such as might occur during an electrical fast transient (EFT). A delay model was proposed in [1] which can be used to predict the variations in the delays through logic circuits caused by electromagnetically induced noise in the power supply voltage. This model is relatively simple and requires few parameters, giving it the potential to be used even when the IC is a "black box" and little information is available about the inner circuits. While design information might be approximated through testing, critical process characteristics needed for accurate results may not be available. The parameter of greatest concern is the velocity saturation index, since this parameter can exponentially increase the impact of power supply noise on delay. This paper describes an investigation of the sensitivity of the delay model in [1] to the velocity saturation index. Results indicate that the estimated delay, found while treating much of the circuit as a black box, is largely insensitive to the velocity saturation index. This result suggests that the model can be used effectively to predict electromagnetically induced delay errors, even when only limited process or circuit information is known.
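The model of [1] is not reproduced in this abstract. As a rough illustration of why the velocity saturation index matters, the sketch below uses the well-known alpha-power law approximation, in which gate delay is proportional to V_DD / (V_DD - V_th)^alpha. Using this law here is an assumption and may differ from the model in [1]; the function names and all numeric values are hypothetical.

```python
def relative_delay(vdd, vth, alpha):
    """Gate delay, up to a constant factor, under the alpha-power law.

    alpha plays the role of the velocity saturation index
    (assumed model, not necessarily the one in [1]).
    """
    if vdd <= vth:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return vdd / (vdd - vth) ** alpha


def delay_increase(vdd_nominal, droop, vth, alpha):
    """Ratio of the delay during a supply droop to the nominal delay."""
    disturbed = relative_delay(vdd_nominal - droop, vth, alpha)
    nominal = relative_delay(vdd_nominal, vth, alpha)
    return disturbed / nominal


if __name__ == "__main__":
    # Hypothetical operating point: 1.2 V supply, 0.4 V threshold, 0.2 V droop.
    for alpha in (1.0, 1.3, 2.0):
        print(alpha, round(delay_increase(1.2, 0.2, 0.4, alpha), 3))
```

Sweeping `alpha` for a fixed droop in this way gives a quick feel for how strongly a predicted delay change depends on a parameter that may be unknown for a black-box IC.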
2025, IEEE Transactions on Computer-aided Design of Integrated Circuits and Systems
This paper proposes a systematic strategy to efficiently explore the design space of field-programmable gate array (FPGA) routing architectures. The key idea is to use stochastic methods to quickly locate near-optimal solutions in designing FPGA routing architectures without exhaustively enumerating all design points. The main objective of this paper is less to report the specific numerical results obtained than to show the applicability and effectiveness of the proposed optimization approach. To demonstrate the utility of the proposed stochastic approach, we developed the tool for optimizing routing architecture (TORCH) software based on the versatile place and route tool. Given FPGA architecture parameters and a set of benchmark designs, TORCH simultaneously optimizes the routing channel segmentation and switch box patterns using the performance metric of average interconnect power-delay product estimated from placed and routed benchmark designs. Special techniques - such ...
2025
Master's in Electronics and Telecommunications Engineering. This work presents digital signal processing techniques, specifically for video processing, using FPGA technology. It consists of a theoretical introduction to topics such as the role of artificial vision today, image recognition, and mathematical techniques for image processing and morphological analysis. It addresses the role of FPGAs in current technology and their advantages when used in digital signal processing. Finally, the algorithm implemented on the FPGA for edge detection in video processing is demonstrated and explained, concluding with an analysis of its efficiency and a discussion of improvements for possible future work in terms of optimizing resource usage and processing speed.
2025
Spike-based processing technology is capable of very high-speed throughput, as it does not rely on sensing and processing sequences of frames. It also allows building complex, hierarchically structured cortical-like layers for sophisticated processing. In this paper we summarize the fundamental properties of this sensing and processing technology as applied to artificial vision systems, and the AER (Address Event Representation) protocol used in hardware spiking systems. Finally, a four-layer system for character recognition is described. The system is loosely based on Fukushima's Neocognitron. Realistic simulations using figures from existing AER devices are provided, which show recognition delays under 10 μs.
2025, Proceedings of XXVI Conferencia Latinoamericana de Informática CLEI’2000, Monterrey, Mexico.
Abstract. A computational model oriented to processing lists of linear recursive functions is proposed. This conceptual model, called the Parallel-Recursive Machine (Máquina Paralelo-Recursiva, MPR), exploits two important computing schemes that govern its organization and operation: parallelism and recursion. The approach exploits fine-grained parallelism through pipelining, concurrency, and dataflow control. Recursion is used to organize the structure of the MPR and to control the massive computation of long chains of linear recursive functions. The essential components of the model are a queue of linear recursive functions to be evaluated, two processor pipelines, message passing as the communication mechanism between processors, and a set of elementary operators that form its basic instruction repertoire. This work presents the MPR model and a study of its performance.
2025, 2013 8th International Workshop on Reconfigurable and Communication-Centric Systems-on-Chip (ReCoSoC)
Due to the recent popularity of context-sensitive applications, there is a growing need for reliable, long-lifetime ubiquitous sensor nodes. The severe energy-efficiency requirements of these energy-scarce devices require complementing traditional circuit-level energy-saving techniques with architecture-level methods. Traditional approaches such as exploiting parallelism, however, have limited impact in sensor node processors because of their control-dominated, event-based, and irregular data processing workload patterns. Executing event-based tasks in specialized finite state machines relieves the onboard microcontroller, but at the penalty of reduced post-manufacturing configurability. An architecture proposal for configurable finite state machines assisting sensor node processors is presented, which allows saving energy through task off-load while maintaining system flexibility. Simulations demonstrate 46% energy savings when compared to a sensor node that executes tasks in a microcontroller. This gain comes at a relatively minor area overhead.
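To make the idea of a configurable finite state machine concrete, the following minimal sketch models one in software as a transition table that can be replaced at run time. It is only an illustration under assumptions: the class name `TableFSM`, the states, the events, and the actions are all hypothetical and do not describe the architecture proposed in the paper.

```python
class TableFSM:
    """Finite state machine whose behaviour is defined entirely by a table.

    The table maps (state, event) -> (next_state, action). Loading a new
    table changes the behaviour without changing the machine itself.
    """

    def __init__(self, table, start):
        self.table = dict(table)
        self.state = start

    def reconfigure(self, table, start):
        """Replace the transition table, e.g. to off-load a different task."""
        self.table = dict(table)
        self.state = start

    def step(self, event):
        """Consume one event; unknown events leave the state unchanged."""
        next_state, action = self.table.get((self.state, event), (self.state, None))
        self.state = next_state
        return action


# Hypothetical sensing task: wake the microcontroller only when needed.
THRESHOLD_TASK = {
    ("idle", "sample"): ("check", "read_adc"),
    ("check", "above"): ("idle", "wake_mcu"),
    ("check", "below"): ("idle", None),
}

if __name__ == "__main__":
    fsm = TableFSM(THRESHOLD_TASK, "idle")
    for event in ("sample", "below", "sample", "above"):
        print(event, "->", fsm.step(event), fsm.state)
```

The design point this illustrates is that flexibility lives in the table contents, so the same simple machine can serve different event-driven tasks.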
2025
The advantages of dynamic reconfiguration can only be exploited if devices, tools and design flows are available to support the partial reconfiguration of FPGA-based systems. For a number of applications, enabling the swap of cores at run-time, under software control, is an essential feature that allows tailoring the system response to the needs of different methods, standards and power/performance requirements. The paper proposes a method to support the exchange of intellectual property (IP) cores during system operation. The approach is based on the definition of a base system, with reserved or dynamic areas, where different cores may be plugged in, providing timesharing of the system resources. It is shown how bitstream-level IP cores can be used in a design flow that allows different cores to be used in one or more host areas, with minimal intervention from the designer. A demonstration system, along with example applications, is presented to illustrate the approach.
2025
Reconfigurable hardware platforms are the key to extensible high-speed networks. They provide flexibility without hindering performance through the internet. Current development of the Field-programmable Port Extender (FPX), a reconfigurable hardware platform, allows reconfiguration through an ATM network. However, the majority of the internet today is based on the highly popular TCP/IP protocol. The contribution of this work will allow modular components to be reprogrammed via TCP/IP.
2025
FPGA technology has become widely used for real-time network intrusion detection. In this paper, a novel packet classification architecture called BV-TCAM is presented, which is implemented for an FPGA-based Network Intrusion Detection System (NIDS). The classifier can report multiple matches at gigabit per second network link rates. The BV-TCAM architecture combines the Ternary Content Addressable Memory (TCAM) and the Bit Vector (BV) algorithm to effectively compress the data representations and boost throughput. A tree-bitmap implementation of the BV algorithm is used for source and destination port lookup while a TCAM performs the lookup of the other header fields, which can be represented as a prefix or exact value. The architecture eliminates the requirement for prefix expansion of port ranges. With the aid of a small embedded TCAM, packet classification can be implemented in a relatively small part of the available logic of an FPGA. The design is prototyped and evaluated in a Xilinx FPGA XCV2000E on the FPX platform. Even with the most difficult set of rules and packet inputs, the circuit is fast enough to sustain OC48 traffic throughput. Using larger and faster FPGAs, the system can work at speeds greater than OC192.
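The bit-vector idea mentioned above can be sketched in a few lines: each header field is looked up independently and yields a vector with one bit per rule, and the bitwise AND of those vectors identifies every rule that matches the packet, which is how multiple matches can be reported. This is a software illustration only, not the BV-TCAM hardware design; the rule set and function names are hypothetical, and the per-field vectors are computed on the fly here instead of being precomputed.

```python
# Hypothetical 4-rule set. Bit i of a vector is set when rule i accepts a value.
RULES = [
    {"proto": "tcp", "dport": range(80, 81)},     # rule 0: TCP port 80
    {"proto": "tcp", "dport": range(0, 1024)},    # rule 1: TCP well-known ports
    {"proto": "udp", "dport": range(53, 54)},     # rule 2: UDP port 53
    {"proto": None, "dport": range(0, 65536)},    # rule 3: any protocol, any port
]


def field_vector(field, value):
    """Return the bit vector of rules whose `field` accepts `value`."""
    vec = 0
    for i, rule in enumerate(RULES):
        wanted = rule[field]
        if field == "proto":
            ok = wanted is None or wanted == value   # None acts as a wildcard
        else:
            ok = value in wanted                     # port ranges, no prefix expansion
        if ok:
            vec |= 1 << i
    return vec


def classify(proto, dport):
    """AND the per-field vectors and list the indices of all matching rules."""
    vec = field_vector("proto", proto) & field_vector("dport", dport)
    return [i for i in range(len(RULES)) if (vec >> i) & 1]


if __name__ == "__main__":
    print(classify("tcp", 80))
    print(classify("udp", 53))
```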
2025, Computer Networks
This paper presents the dynamic hardware plugins (DHP) architecture for implementing multiple networking applications in hardware at programmable routers. By enabling multiple applications to be dynamically loaded into a single hardware device, the DHP architecture provides a scalable mechanism for implementing high-performance programmable routers. The DHP architecture is presented within the context of a programmable router architecture which processes flows in both software and hardware. Implementation options are described as well as the prototype testbed at Washington University in Saint Louis which utilizes the partial reconfiguration capability of modern field programmable gate arrays.
2025
Continuing growth in optical link speeds places increasing demands on the performance of Internet routers, while deployment of embedded and distributed network services imposes new demands for flexibility and programmability. IP address lookup has become a significant performance bottleneck for the highest-performance routers. New commercial products utilize dedicated Content Addressable Memory (CAM) devices to achieve high lookup speeds. This paper describes an efficient, scalable lookup engine design, able to achieve high performance with the use of a small portion of a reconfigurable logic device and a commodity Random Access Memory (RAM) device. Based on Eatherton's Tree Bitmap algorithm [1]..
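For readers unfamiliar with the lookup problem itself, the sketch below shows longest-prefix matching with a plain one-bit-per-level binary trie for IPv4. It is deliberately not the Tree Bitmap algorithm, which compresses multi-bit trie nodes using bitmaps; the class names and the example routes are hypothetical.

```python
import ipaddress


class Node:
    """One trie node: two children (bit 0 / bit 1) and an optional next hop."""
    __slots__ = ("children", "next_hop")

    def __init__(self):
        self.children = [None, None]
        self.next_hop = None


class PrefixTrie:
    """Binary trie for IPv4 longest-prefix match (illustrative only)."""

    def __init__(self):
        self.root = Node()

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            b = (bits >> (31 - i)) & 1        # walk from the most significant bit
            if node.children[b] is None:
                node.children[b] = Node()
            node = node.children[b]
        node.next_hop = next_hop

    def lookup(self, address):
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node.next_hop                  # a /0 default route, if any
        for i in range(32):
            node = node.children[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop          # remember the longest match so far
        return best


if __name__ == "__main__":
    table = PrefixTrie()
    table.insert("0.0.0.0/0", "default")
    table.insert("10.0.0.0/8", "A")
    table.insert("10.1.0.0/16", "B")
    print(table.lookup("10.1.2.3"), table.lookup("10.2.0.1"))
```

A trie like this needs up to 32 memory accesses per lookup, which is the cost that multi-bit schemes such as Tree Bitmap aim to reduce.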
2025
High-performance rule processing systems are needed by network administrators in order to protect Internet systems from attack. Researchers have been working to implement components of intrusion detection systems (IDS), such as the highly popular Snort system, in reconfigurable hardware. While considerable progress has been made in the areas of string matching and header processing, complete systems have not yet been demonstrated that effectively combine all of the functionality necessary to perform rule processing for network systems. In this paper, a framework for implementing a rule processing system in reconfigurable hardware is presented. The framework integrates the functionality to scan data flows for regular expressions, fixed strings, and header values. It also allows modules to be added to perform extended functionality to support all features found in Snort rules. To prove the framework viable, a system has been built that scans all bytes of Transmission Control Protocol/Internet Protocol (TCP/IP) traffic entering and leaving a network's gateway at multi-gigabit rates. Using Xilinx FPGA hardware on the Field programmable Port eXtender (FPX) platform, the framework can process 32,768 complex rules at data rates of 2.5 Gbps. Systems to handle data at 10 Gbps rates can be built today using the same framework in the latest reconfigurable hardware devices such as the Virtex 4.
2025, International Journal of Scientific and Engineering Research
An inverse multiplexing method for irreducible polynomials, based on the theory of substitution boxes, is presented in this paper. Following a series of successful experiments, the new approach was put into practice. For reasons of increased complexity and security, the affine conversion period in the Galois field GF(2^8) has been increased to the maximum value of the period between input and output of 102, the Strict Avalanche Criterion (SAC) has been reduced to nearly half of its original value, and the results are bijective. It was decided to use the number 112 after the Bit Independent Criterion effect had been reduced to produce good results. According to the researchers, these breakthroughs are being used to protect information security and to strengthen the Advanced Encryption Standard currently in use. With the addition of a new s-box, their encryption will be more secure and private, making our services even more valuable.
2025, 2007 IEEE International Conference on Microelectronic Systems Education (MSE'07)
This paper presents a tool that simulates a reconfigurable cache whose parameters can be changed at runtime through a special instruction at the ISA level. The tool was developed through a series of laboratory exercises in Computer Architecture. The proposed tool simulates a cache system that can be reconfigured within a variety of 298 combinations of C, W and L (cache capacity, block size and number of blocks per set) without changing its architecture. The students are introduced to reconfigurable hardware architecture while refreshing their knowledge on Computer Architecture issues like Digital Design, Register Transfer Level and Computer System Level.
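How the three cache parameters interact can be shown with a short calculation: capacity, block size, and blocks per set together determine the number of sets, and from that the split of an address into tag, index, and offset. The sketch below is an assumption-laden illustration, not the tool described in the paper; the function names are hypothetical and the restriction to power-of-two values is an assumption.

```python
def cache_geometry(capacity_bytes, block_bytes, blocks_per_set):
    """Return (number_of_sets, offset_bits, index_bits).

    Assumes every parameter is a power of two (an assumption of this sketch).
    """
    for value in (capacity_bytes, block_bytes, blocks_per_set):
        if value <= 0 or value & (value - 1):
            raise ValueError("parameters must be positive powers of two")
    sets = capacity_bytes // (block_bytes * blocks_per_set)
    if sets == 0:
        raise ValueError("capacity too small for this block size and set size")
    return sets, block_bytes.bit_length() - 1, sets.bit_length() - 1


def split_address(addr, capacity_bytes, block_bytes, blocks_per_set):
    """Split an address into (tag, set index, block offset)."""
    sets, offset_bits, index_bits = cache_geometry(
        capacity_bytes, block_bytes, blocks_per_set)
    offset = addr & (block_bytes - 1)
    index = (addr >> offset_bits) & (sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


if __name__ == "__main__":
    # Hypothetical configuration: 8 KiB cache, 32-byte blocks, 2 blocks per set.
    print(cache_geometry(8192, 32, 2))
    print(split_address(0x12345, 8192, 32, 2))
```

Changing any one parameter at run time changes the index and tag widths, which is why a reconfigurable cache must reinterpret addresses after each reconfiguration.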
2024
MGSim is an open source discrete event simulator for on-chip hardware components, developed at the University of Amsterdam. It is intended to be a research and teaching vehicle to study the fine-grained hardware/software interactions on many-core and hardware multithreaded processors. It includes support for core models with different instruction sets, a configurable multi-core interconnect, multiple configurable cache and memory models, a dedicated I/O subsystem, and comprehensive monitoring and interaction facilities. The default model configuration shipped with MGSim implements Microgrids, a many-core architecture with hardware concurrency management. MGSim is furthermore written mostly in C++ and uses object classes to represent chip components. It is optimized for architecture models that can be described as process networks.
2024
to disseminate and use for academic purposes both the report itself and the code.
2024
A novel approach for the time-resolved analysis of high-speed sequences of particle images is presented. The proposed method aims to minimize PIV errors in the lower velocity range by locally and dynamically adjusting the interframe time interval of the PIV pairs in a recorded high-speed sequence. The algorithm performs the operation on a local basis, thus providing the same level of accuracy across the full velocity dynamic range of the flow. The present results indicate better performance than state-of-the-art PIV analysis based on multi-grid offset PIV, and the method is successfully applied to synthetic and real flow cases. Figure 1 shows the application of the adaptive multi-frame PIV (AMF-PIV) algorithm to a cross-flow jet.
2024, IEEE Transactions on Computers
Low-level computer vision algorithms have extreme computational requirements. In this work, we compare two real-time architectures developed using FPGA and GPU devices for the computation of phase-based optical flow, stereo, and local image features (energy, orientation, and phase). The presented approach requires a massive degree of parallelism to achieve real-time performance and allows us to compare FPGA and GPU design strategies and trade-offs in a much more complex scenario than previous contributions. Based on this analysis, we provide suggestions to real-time system designers for selecting the most suitable technology, and for optimizing system development on this platform, for a number of diverse applications.
2024, IEEE/ASME Transactions on Mechatronics
Soft computing techniques are generally well suited for vehicular control systems that are usually modeled by highly nonlinear differential equations and working in unstructured environments. To demonstrate their applicability in real-world applications, two intelligent controllers based on fuzzy logic and artificial neural network are designed for performing a wall-following task. Based on performance and flexibility considerations, the two controllers are implemented onto a reconfigurable hardware platform, namely a field-programmable gate array. As comparative studies of these two embedded hardware controllers designed for the same vehicular application are limited in literature, this research also presents an evaluation of the two controllers, comparing them in terms of hardware resource requirements, operational speeds, and trajectory tracking errors in following different predefined trajectories.
Index Terms: Autonomous vehicles, embedded design, field-programmable gate array (FPGA), fuzzy logic control, neural network control.
I. INTRODUCTION. Designs of autonomous car-like robots have received increased attention in recent years due to their potential applicability and usefulness in the automotive and robotics industry. Different areas of research, such as parking, navigation, vehicle tracking, and lane-following systems, are all actively being pursued. Autonomous vehicles are a promising technology that will improve the quality of life by providing collision avoidance, reducing traffic gridlock, and allowing the replacement of dangerous tasks currently performed by human drivers.
A. Motivation. Lane-following and parallel parking are two frequently utilized tasks that drivers are faced with every day. Current research involves designing algorithms for the execution of these tasks. Wall-following is a task much similar to lane-following and parallel parking. Although wall-following and lane-following may...
2024, IEEE Transactions on Very Large Scale Integration (VLSI) Systems
We describe a system, developed as part of the Cameron project, which compiles programs written in a single-assignment subset of C called SA-C into dataflow graphs, and then into VHDL. The primary application domain is image processing. The system consists of an optimizing compiler which produces dataflow graphs, and a dataflow graph to VHDL translator. The method used for the translation is described here, along with some results on an application. The objective is not to produce yet another design entry tool, but rather to shift the programming paradigm from HDLs to an algorithmic level, thereby extending the realm of hardware design to the application programmer. Keywords: Adaptive computing, Configurable, Image processing, Reconfigurable components, Reconfigurable computing, Reconfigurable systems
2024, IFIP Advances in Information and Communication Technology
This work presents an experimental characterization of electrochemically gated graphene field-effect transistors (EGFETs) to measure extracellular cell signals. The performance of the EGFETs was evaluated using cardiomyocyte cells. Extracellular signals with a peak value of 0.4 picoamperes (pA) embedded in a noise level of 0.1 pA were recorded. Signals in current mode were compared with signals recorded as a voltage. Signals below 28 µV in magnitude can be detected in a noise floor of 7 µV with a signal-to-noise ratio of 4.
2024
This work aims to design a methodology for building neural-network-based applications on a Muren platform. The applications are restricted to control systems and image-based pattern recognition. The architecture of the Muren development system is described; it is based on 2 ZISC processors with 78 neurons each, a Spartan II FPGA, memory banks, and additional communication logic.
2024, IEEE Transactions on Parallel and Distributed Systems
In this paper, we describe an FPGA-based coprocessor architecture that performs a high-throughput branch-and-bound search of the space of phylogenetic trees corresponding to the number of input taxa. Our coprocessor architecture is designed to accelerate maximum-parsimony phylogeny reconstruction for gene-order and sequence data and is amenable to both exhaustive and heuristic tree searches. Our architecture exposes coarse-grain parallelism by dividing the search space among parallel processing elements (PEs), and each PE exposes fine-grain memory parallelism for its lower-bound computation, the kernel computation performed by each PE. Inter-PE communication is performed entirely on-chip. When used for maximum-parsimony reconstruction for gene-order data, our coprocessor achieves a 40X improvement over software in search throughput, corresponding to a 14X end-to-end application improvement when including all communication and systems overheads.
Index Terms: Biology and genetics, distributed systems, parallelism and concurrency, reconfigurable hardware.
The heterogeneous computing model, where a general-purpose CPU is accelerated using a special-purpose coprocessor, is a common technique for 3D rendering [1], high-definition video playback [2], and simulation and gaming [3] but has only recently begun to emerge as a widely used, mainstream technique in scientific computing. This is evidenced by the integration of coprocessor devices into recent high-performance computers such as Los Alamos's Roadrunner, NCSA's Lincoln, Cray's line of XT5 and XT5h computers, and SGI's RASC enhancement to their Altix computers. Each of these systems includes integrated programmable or reconfigurable coprocessors, specifically IBM PowerXCell processors in the case of Roadrunner [4], NVIDIA GT200-series Tesla processors in the case of Lincoln, and Field Programmable Gate Array (FPGA) coprocessors in the case of Cray [5] and SGI.
Although Digital Signal Processors (DSPs), Graphics Processing Units (GPUs), and now so-called Stream Processors are attractive coprocessor devices due to their relatively simple programming model, FPGAs currently remain a popular choice for heterogeneous scientific computing. In this case, a special-purpose hardware version of an application's most expensive computation, or kernel computation, is implemented in custom FPGA logic. The kernel computations that are traditionally implemented on FPGAs are O(n) computations where input data are streamed through a pipeline on the coprocessor. Such implementations are common for numerical linear algebra [7],