Nader Rafla - Profile on Academia.edu

Papers by Nader Rafla

Energy-Efficient Black Hole Router Detection in Network-on-Chip

2022 IEEE 35th International System-on-Chip Conference (SOCC)

OSA Continuum, 2021

In this work, the novel usage of a physically unclonable function composed of a network of Mach-Zehnder interferometers for authentication tasks is described. The physically unclonable function hardware is completely reconfigurable, allowing for a large number of seemingly independent devices to be utilized, thus imitating a large array of single-response physically unclonable functions. It is proposed that any reconfigurable array of Mach-Zehnder interferometers can be used as an authentication mechanism, not only for physical objects, but for information transmitted both classically and quantumly. The proposed use-case for a fully-optical physically unclonable function, designed with reconfigurable hardware, is to authenticate messages between a trusted and possibly untrusted party; verifying that the messages received are generated by the holder of the authentic device.

2011 IEEE Workshop on Microelectronics and Electron Devices, 2011

2006 49th IEEE International Midwest Symposium on Circuits and Systems, 2006

Finite State Machines (FSMs) are among the more complex structures found in almost all digital systems today. Hardware Description Languages are used for high-level digital system design. VHDL (VHSIC Hardware Description Language) supports several coding styles for FSMs. Therefore, a coding style must be chosen to achieve specific performance goals and to minimize resource utilization for implementation in a reconfigurable computing environment such as an FPGA. This paper is a study of the tradeoffs that can be made by changing coding styles. A comparative study of three different FSM coding styles is presented to address their impact on performance and resource utilization for the most commonly used encoding methods for FPGA designs. The results show that a particular coding style leads to savings in resource utilization with a significant performance improvement over the others, while the others show consistent performance regardless of the resource utilization outcome.
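As a rough illustration of what "encoding methods" means for an FSM, the sketch below generates state codes for three schemes. The abstract does not name the encodings it compares, so binary, Gray, and one-hot are assumptions here, and the state names are hypothetical.

```python
def binary_codes(n):
    """Sequential binary codes: uses the fewest flip-flops."""
    width = max(1, (n - 1).bit_length())
    return [format(i, f"0{width}b") for i in range(n)]


def gray_codes(n):
    """Reflected Gray codes: consecutive states differ in one bit."""
    width = max(1, (n - 1).bit_length())
    return [format(i ^ (i >> 1), f"0{width}b") for i in range(n)]


def one_hot_codes(n):
    """One-hot codes: one flip-flop per state, simple decode logic."""
    return [format(1 << i, f"0{n}b") for i in range(n)]


# Hypothetical four-state machine, used only for illustration.
STATES = ["IDLE", "LOAD", "RUN", "DONE"]

for label, encode in [("binary", binary_codes),
                      ("gray", gray_codes),
                      ("one-hot", one_hot_codes)]:
    codes = encode(len(STATES))
    print(label, dict(zip(STATES, codes)))
```

One-hot trades more flip-flops for simpler next-state logic, which is one reason encoding choice can affect both resource use and speed on an FPGA.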

2010 53rd IEEE International Midwest Symposium on Circuits and Systems, 2010

I would like to sincerely thank my advisor Dr. Nader Rafla for his valuable guidance and support while completing my graduate education. I am grateful for his confidence in me that I could do a good job with my thesis. It has been a great pleasure, in fact, an honor to work with him. I would also like to thank Dr. Jennifer A. Smith and Dr. Thad Welch for being on my thesis committee and guiding and encouraging me throughout my research work. Finally, I would like to thank my family for their unwavering support and encouragement. I am grateful to my son for being patient and understanding during the entire process. Thank you all.

A new microelectronics program at Boise State University: The Idaho Microelectronics Manufacturing Research Center (IM2RC)

University/Government/Industry Microelectronics Symposium, 1989. Proceedings., Eighth

The microelectronics industry in Boise and the surrounding Inter-Mountain region has grown rapidly during the past decade. This growth has led to an urgent need for local microelectronics education opportunities at the associate, bachelor, and graduate levels. In 1995, an AAS program in Semiconductor Technology was created at BSU. Then, with strong support from local industry (particularly Micron Technology), the State of Idaho transformed what had been a satellite engineering program from University of Idaho, Moscow ID into a new School of Engineering at Boise State. BS degree programs in Civil, Electrical, and Mechanical engineering were established, a nationwide search for a new faculty was performed, and classes began in August 1996. Microelectronics is the clear focus of BSU's new program, with four of the new faculty specializing in this area

Vision-taction integration for surface representation

IEEE International Conference on Systems Engineering

A method called vision-taction exploration (VTE) for generating surface descriptions from range vision and tactile sensor data is described. The range vision systems available provide sparse 3D data about surfaces. These data are partially processed to provide primary surface features such as surface points and surface normals. With the use of tactile and force-torque sensors under position control, supplementary data

High level synthesis using vivado HLS for optimizations of SHA-3

2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS)

Hash functions represent a fundamental building block of many network security protocols. The SHA-3 hashing algorithm is the most recently developed hash function, and the most secure. Implementation of the SHA-3 hashing algorithm in Hardware Description Language (HDL) is time demanding and tedious to debug. On the other hand, High-Level Synthesis (HLS) tools offer potential solutions to the hardware design. HLS tools provide us with advanced capabilities for design evaluation and a wide variety of optimization techniques. In this paper, the SHA-3 hashing algorithm and its implementation on a Xilinx® Zynq-7000 SoPC are explored. The SHA-3 hashing algorithm is initially coded in the C programming language and then implemented with Xilinx Vivado HLS. The HLS tool enabled us to quickly analyze our design and make suitable optimizations, which led to increased throughput of the SHA-3 hashing algorithm, up to 2000 Mbps. After pipelining the synthesized hardware design, it was capable of hashing a block of 1088 bits in 70 clock cycles.
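The block size and cycle count quoted above relate to throughput by simple arithmetic, sketched below. The abstract does not state the SHA-3 variant or the clock frequency: the 1088-bit block matches the rate of SHA3-256, which is assumed here, and the 100 MHz clock is a hypothetical value chosen only to show the calculation.

```python
import hashlib


def throughput_mbps(block_bits, cycles_per_block, clock_hz):
    """Throughput in Mbps when one block is absorbed every cycles_per_block cycles."""
    return block_bits * clock_hz / cycles_per_block / 1e6


# hashlib reports the sponge rate in bytes as block_size (136 bytes for SHA3-256).
rate_bits = hashlib.sha3_256().block_size * 8
print(rate_bits)  # 1088

# Hypothetical 100 MHz clock; the real design's clock is not given in the abstract.
print(round(throughput_mbps(rate_bits, 70, 100e6), 1))
```

Under these assumptions the throughput scales linearly with clock frequency, so the figure printed here should not be read as the paper's reported result.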

Optimization of a Quantum-Secure Sponge-Based Hash Message Authentication Protocol

2018 IEEE 61st International Midwest Symposium on Circuits and Systems (MWSCAS)

Hash message authentication is a fundamental building block of many networking security protocols such as SSL, TLS, FTP, and even HTTPS. The sponge-based SHA-3 hashing algorithm is the most recently developed hashing function as a result of a NIST competition to find a new hashing standard after SHA-1 and SHA-2 were found to have collisions, and thus were considered broken. We used Xilinx High-Level Synthesis to develop an optimized and pipelined version of the post-quantum-secure SHA-3 hash message authentication code (HMAC) which is capable of computing an HMAC every 280 clock cycles with an overall throughput of 604 Mbps. We cover the general security of sponge functions from both a classical and a quantum computing standpoint for hash functions, and offer a general architecture for HMAC computation when sponge functions are used.
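For readers unfamiliar with HMAC over SHA-3, a minimal software sketch using Python's standard library follows. It uses the standard nested HMAC construction with SHA3-256; the paper's hardware architecture and chosen variant may differ, and the key and message values are placeholders.

```python
import hashlib
import hmac


def sha3_hmac(key: bytes, message: bytes) -> str:
    """Compute an HMAC tag over message using SHA3-256 (assumed variant)."""
    return hmac.new(key, message, hashlib.sha3_256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Check a tag in constant time to avoid timing side channels."""
    return hmac.compare_digest(sha3_hmac(key, message), tag)


# Placeholder key and message, for illustration only.
key = b"shared-secret-key"
message = b"authenticated payload"

tag = sha3_hmac(key, message)
print(verify(key, message, tag))          # genuine message
print(verify(key, message + b"!", tag))   # tampered message
```

The receiver recomputes the tag with the shared key, so any change to the message or key produces a mismatch.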

tile Memory Array Base Conductive Bridge Memr

Much excitement has been gen potential uses of chalcogenide glasses and o circuits as "memristors" or as non-volatile memristor is a fourth passive two terminal postulated by Leon Chua in 1971 and rediscov Conductive Bridge Memristor (CBM) change response to current passing through it by dissolving a conductive molecular bridge insulating chalcogenide film. This paper outlines the design and simulatio memory using an array of CBM devices integr access transistors and read/write access circuitr We have designed and simulated a large mem using CBM devices accessed by an NMOS tran row/column read and write drivers. The desi cascode op-amp configured to integrate current a strategy for sensing the device resistance. Ea connected to the array through a single mini transistor. The design has been simulated usin for the PMC (Programmable Metallization demonstrate the feasibility of accessing the without exceeding the write threshold, and disc speed vs. array size associated with ...

SIFT Keypoint Descriptor Matching Algorithm: A Fully Pipelined Accelerator on FPGA(Abstract Only)

Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

The Scale Invariant Feature Transform (SIFT) algorithm is one of the classical feature extraction algorithms in Computer Vision. It consists of two stages: keypoint descriptor extraction and descriptor matching. The SIFT descriptor matching algorithm is a computationally intensive process. In this work, we present the design and implementation of a hardware core accelerator for the descriptor-matching algorithm on a field programmable gate array (FPGA). Our proposed hardware core architecture is able to cope with the memory bandwidth and hit the roofline performance model to achieve maximum throughput. The matching-core was implemented using Xilinx Vivado® EDA design suite on a Zynq®-based FPGA Development board. The proposed matching-core architecture is fully pipelined for 16-bit fixed-point operations and consists of five main submodules designed in Verilog, High Level Synthesis, and System Generator. The area resources were significantly reduced compared to the most recen...
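To make the descriptor-matching stage concrete, here is a small software sketch of brute-force nearest-neighbour matching over integer descriptors. The abstract does not specify the distance metric or acceptance rule, so squared Euclidean distance and a nearest/second-nearest ratio test with a threshold of 0.8 are assumptions, and the short four-element descriptors stand in for real 128-element SIFT descriptors.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two integer descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_descriptors(queries, references, ratio=0.8):
    """Return (query_index, reference_index) pairs that pass the ratio test.

    Requires at least two reference descriptors.
    """
    matches = []
    for qi, q in enumerate(queries):
        dists = sorted((sq_dist(q, r), ri) for ri, r in enumerate(references))
        best, second = dists[0], dists[1]
        # Distances are squared, so the ratio threshold is squared as well.
        if best[0] < (ratio ** 2) * second[0]:
            matches.append((qi, best[1]))
    return matches


# Toy integer descriptors, for illustration only.
references = [[0, 0, 0, 0], [100, 100, 100, 100]]
queries = [[1, 0, 0, 0], [50, 50, 50, 50]]
print(match_descriptors(queries, references))
```

The second query is equidistant from both references, so the ratio test rejects it as ambiguous. Every query is compared against every reference, which is why the workload is dominated by memory bandwidth and regular arithmetic that pipelines well.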

HexCell: a Hexagonal Cell for Evolvable Systolic Arrays on FPGAs: (Abstract Only)

Proceedings of the 2018 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2018

This paper presents a novel cell architecture for evolvable systolic arrays. HexCell is a tile-able processing element with a hexagonal shape that can be implemented and dynamically reconfigured on field-programmable gate arrays (FPGAs). The cell contains a functional unit, three input ports, and three output ports. It supports two concurrent configuration schemes: dynamic partial reconfiguration (DPR), where the functional unit is partially reconfigured at run time, and virtual reconfiguration circuit (VRC), where the cell output port bypasses one of the input data or selects the functional unit output. Hence, HexCell combines the merits of DPR and VRC including resource-awareness, reconfiguration speed and routing flexibility. In addition, the cell structure supports pipelining and data synchronization for achieving high throughput for data-intensive applications like image processing. A HexCell is represented by a binary string (chromosome) that encodes the cell's function an...

Speech recognition has seen dramatic improvements in the last decade, though those improvements have focused primarily on adult speech. In this paper, we assess child-directed speech recognition and leverage a transfer learning approach to improve child-directed speech recognition by training the recent DeepSpeech2 model on adult data, then apply additional tuning to varied amounts of child speech data. We evaluate our model using the CMU Kids dataset as well as our own recordings of child-directed prompts. The results from our experiment show that even a small amount of child audio data improves significantly over a baseline of adult-only or child-only trained models. We report a final general Word-Error-Rate of 29% over a baseline of 62% that uses the adult-trained model. Our analyses show that our model adapts quickly using a small amount of data and that the general child model works better than school grade-specific models. We make available our trained model and our data colle...
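Word-Error-Rate is conventionally the word-level edit distance between the recognizer output and the reference transcript, divided by the number of reference words. The sketch below shows that standard definition; it is not the authors' evaluation script, and the example sentences are made up.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Made-up transcripts: two substitutions out of three reference words.
print(word_error_rate("the red ball", "a red bell"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.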

Simulation Study of Sweeping Actuation of a Magnetic Shape Memory Micropump with Electromagnetic Coils

Visually guided tactile and force-torque sensing for object recognition and localization

HLS Implementation of Linear Discriminant Analysis Classifier

2020 IEEE International Symposium on Circuits and Systems (ISCAS), 2020

Data classification has improved significantly over time and nowadays is used for a variety of purposes and applications. This paper demonstrates the design and implementation of a multivariate linear discriminant classifier algorithm on a Field Programmable Gate Array (FPGA) as a System on Chip (SoC). The classifier is optimized using High Level Synthesis (HLS) techniques. The optimized design is placed on the programmable logic part of the chip while its controller is built on the embedded processor of the same chip. The paper details the process of the classifier design and optimization and reports on the power consumption, resource utilization, latency, and algorithm accuracy before and after optimization.

With the growing complexity of modern digital systems and embedded system designs, the task of verification has become the key to achieving the faster time-to-market requirement for such designs. This paper describes a graduate-level course, Verification of Digital Systems using SystemVerilog, offered at Boise State University as a part of the Master of Science program in Computer Engineering. This course teaches not only syntax and semantics but also coverage-driven, constrained-random, and assertion-based verification methodologies employing the advanced features of SystemVerilog to ensure that designs meet the required specifications. The course also emphasizes the practical aspects of verification methodologies by providing students with hands-on experience with commercial verification tools such as QuestaSim, the Advanced Functional Verification suite from Mentor Graphics. Course goals are explained along with course content, format, and benefits to students. The course is desig...

Low-Complexity and Resource-Aware Compression Algorithm for FPGA Bitstreams

Runtime Packet-Dropping Detection of Faulty Nodes in Network-on-Chip

Due to the impact of ongoing deep sub-micron technology, billions of transistors are crammed in an integrated circuit to combine multiple systems on a single chip. Network-on-Chip (NoC) has become the communication infrastructure among these systems’ components. On the other hand, scaling down the feature size has increased the probability of faults which could be experienced in runtime. Therefore, online fault detection is considered in the system design. This paper presents an efficient method to detect and avoid faulty nodes that silently discard packets from the network. This method deals with control faults of the NoC routers, where the packets are received but are not saved in the buffers. In this work, a high level fault model is proposed. Also, a detection technique and fault tolerant method is presented. The proposed scheme is analyzed and evaluated. The results show 3.91%, 9.97%, and 8.82% overhead in area, power, and performance, respectively, while guaranteeing packet de...