Massimo Ruo Roch | Politecnico di Torino
Papers by Massimo Ruo Roch
Advances in intelligent systems and computing, 2017
The paper presents the MicroElectronics Cloud Alliance (MECA) project, which brings together eighteen partners from higher education institutions and enterprises to develop a cloud-based European infrastructure and organisation for education in micro- and nanoelectronics. It provides a range of open educational resources, remote access to and sharing of educational and professional software, and remote and practice-based learning facilities. Each university will provide remote access to its facilities, laboratory experiments, or software systems for the partners in a cloud teaching system, giving them access to new resources. Shared resources can be optimized, reducing the cost per institution and increasing the available computational and structural power.
Proceedings of 6th International Fuzzy Systems Conference, Nov 22, 2002
This paper describes the architecture of a VLSI fuzzy chip designed to run at very high speed: the processing time is 320 ns when four inputs are processed, whatever the fuzzy system. Processing is faster when fewer than four inputs are processed and reaches 100 ns for two inputs. The chip has been designed in 0.7 μm CMOS technology, its architecture is pipelined, and only the active rules are processed. To do that, the fuzzy system to be processed is first converted into an equivalent one in which all the rules are present, and it is then loaded into the chip memory. Because of the size of the chip's rule memory, such a chip can be built when each of the four inputs has no more than seven membership functions (MFs) and the overlap of the input MFs is not higher than two. The design has been done in VHDL and synthesized with the Cadence Opus software obtained via Europractice. The chip was sent to the ES2 foundry for fabrication last December; we received it back at the end of February, and it has recently been tested successfully. The chip layout is described at the end of the paper.
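As a rough illustration of why processing only the active rules pays off, the sketch below counts rules under the limits stated in the abstract (four inputs, at most seven MFs per input, an overlap of at most two MFs). It assumes a complete rule base with one rule per combination of input MFs, and that an overlap of two means at most two MFs fire per input. The function names are hypothetical and are not taken from the paper.

```python
def total_rules(num_inputs: int, mfs_per_input: int) -> int:
    """Size of a complete rule base: one rule per combination of input MFs."""
    return mfs_per_input ** num_inputs


def max_active_rules(num_inputs: int, max_overlap: int) -> int:
    """Upper bound on rules that can fire at once when at most
    `max_overlap` MFs are non-zero for each input value."""
    return max_overlap ** num_inputs


# Limits quoted in the abstract: 4 inputs, 7 MFs per input, overlap of 2.
print(total_rules(4, 7))       # 2401 rules stored
print(max_active_rules(4, 2))  # 16 rules active per evaluation
```

Under these assumptions, at most 16 of the 2401 stored rules need to be evaluated for any given input vector.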
Electronics Letters, Feb 1, 2016
A simple and effective technique to skip the computation of reliable portions of a frame (windows) in turbo code decoding is proposed. The proposed criterion relies on a very simple approximation of the cross-entropy measure by means of thresholding. This criterion features negligible complexity and low memory requirements. Simulation results show that, in the best case, up to 20% of windows can be skipped with no error-rate degradation. Such a significant reduction in computation can also be exploited to directly reduce power consumption.
2022 IEEE 22nd International Conference on Nanotechnology (NANO), Jul 4, 2022
Proceedings of the 3rd World Congress on New Technologies, Aug 1, 2022
Lecture notes in electrical engineering, Aug 8, 2015
In recent years, smart lighting systems have attracted a lot of attention due to the increasing interest in reducing wasted power. This work describes the implementation on an embedded platform of a smart lighting system in which the lamps communicate with each other, creating a cooperative network, to trim the amount of light in a given place. The proposed implementation relies on the spread spectrum technique and on optical orthogonal codes borrowed from optical communication research. Experimental results obtained on a Freescale Freedom board prove the feasibility of the proposed system.
Low-latency Network-on-Chip (NoC) applications place tight constraints on the clock budget for communication among nodes. This is a critical aspect of NoC-based designs, where the number of clock cycles spent on communication depends mainly on the topology and the routing algorithm. This work deals with logarithmic-diameter topologies, which were proposed for computer networks, and shows that an optimal shortest-path routing algorithm can be implemented efficiently on this kind of topology by means of a very simple circuit. The proposed circuit is then exploited to reduce the area and power consumption of a recently proposed NoC-based design. Experimental results show that the proposed circuit reduces area and power consumption by about 14% and 10%, respectively, compared with a shortest-path routing-table-based design.
Sensors, Mar 4, 2020
In recent years, the need for new, efficient video compression methods has grown rapidly as frame resolution has increased dramatically. In 2013, the Joint Collaborative Team on Video Coding (JCT-VC) effort produced the H.265/High Efficiency Video Coding (HEVC) standard, which represents the state of the art in video coding standards. Nevertheless, new algorithms and techniques to improve coding efficiency have since been proposed. One promising approach relies on embedding directional capabilities into the transform stage. Recently, the Steerable Discrete Cosine Transform (SDCT) has been proposed to exploit a directional DCT using a basis with different orientation angles. The SDCT leads to a sparser representation, which translates into improved coding efficiency. Preliminary results show that the SDCT can be embedded into the HEVC standard, providing better compression ratios. This paper presents a hardware architecture for the SDCT that works at a frequency of 188 MHz and reaches a throughput of 3.00 GSample/s. In particular, this architecture supports 8K Ultra High Definition (UHD) (7680 × 4320) at a frame rate of 60 Hz, which is one of the highest resolutions supported by HEVC.
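The quoted figures can be cross-checked with a short calculation. The sketch below is only a plausibility check, not the authors' derivation: it assumes 4:2:0 chroma subsampling (1.5 samples per pixel), which the abstract does not state, and the function name is hypothetical.

```python
def required_samples_per_second(width: int, height: int, fps: int) -> int:
    """Sample rate for a video stream, ASSUMING 4:2:0 chroma subsampling,
    i.e. 3/2 samples per pixel (one luma plus two quarter-size chroma planes)."""
    return width * height * fps * 3 // 2


CLOCK_HZ = 188_000_000          # clock frequency quoted in the abstract
THROUGHPUT = 3_000_000_000      # 3.00 GSample/s quoted in the abstract

required = required_samples_per_second(7680, 4320, 60)
print(required)                 # 2985984000, about 2.99 GSample/s
print(required <= THROUGHPUT)   # True: 8K UHD at 60 Hz fits the throughput
print(THROUGHPUT / CLOCK_HZ)    # roughly 16 samples per clock cycle
```

Under that assumption, 8K UHD at 60 Hz needs about 2.99 GSample/s, just within the stated 3.00 GSample/s, which corresponds to roughly 16 samples per cycle at 188 MHz.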
Springer eBooks, 1994
VLSI technologies make it possible to implement high-performance processors for special applications; in the field of Artificial Intelligence in particular, several hardware solutions that speed up the execution of PROLOG programs are available today. Fast execution of PROLOG programs suggests using the language in new application fields as well, such as real-time control. The programming benefits of using PROLOG or similar declarative languages instead of procedural ones are evident: describing rules is simpler and more effective, even if it requires programmers to adopt different methods and perspectives.
Electronics, Mar 21, 2023
ACS applied electronic materials, Jan 26, 2023
IEEE Transactions on Emerging Topics in Computing, Jul 1, 2016
In recent decades, CMOS technology has undergone an extraordinary evolution. Because of the continuous scaling process, CMOS transistors are now so small that millions can easily fit on a single chip. Shrinking transistor sizes have complex consequences for the performance of both the transistor itself and the system based upon it. Understanding and teaching the CMOS scaling process and its consequences for circuits is an increasingly difficult task. Furthermore, the scaling process is reaching an end, due to continuously growing fabrication costs and the unavoidable physical limits on the smallest achievable size. As a consequence, many emerging technologies, such as carbon nanotubes and nanowires, are being studied as possible CMOS substitutes. Describing and teaching these new technologies, alongside the scaled transistor itself, with a complete and well-organized approach presents further difficulties. To solve these problems, in past years we started the development of TAMTAMS, a tool conceived to analyze CMOS circuits from device to system level. The tool is based on models derived from the literature or, in some cases, internally developed and verified. It allows users to analyze the main characteristics of a CMOS transistor, such as currents, threshold voltage, or mobility, considering different technology nodes and parameters, and to understand how they influence circuit performance. The tool structure is open and modular, allowing easy integration and comparison of further CMOS technologies. In this paper, we present a total overview of the original tool, TAMTAMS Web. While the general concept behind the tool is still the same, the tool was completely rewritten around a Web interface. TAMTAMS Web is freely accessible to students and to anyone interested in CMOS technology.
As a future development, several post-CMOS technologies will be added to TAMTAMS Web, allowing a comparison with state-of-the-art CMOS. TAMTAMS Web is actively used in the Integrated System Technology (IST) course held at the Politecnico di Torino. It defines a new way of learning, because students learn and understand modern electronic technology both by using TAMTAMS Web as an instrument and by taking part in its development as part of the IST course. Index terms: scaling process, emerging technologies, web-based services, education.
AIP Advances, Dec 1, 2020
The requirement of high memory bandwidth for next-generation computing systems has shifted attention to the development of devices that can combine storage and logic capabilities. Domain wall-based spintronic devices intrinsically combine both capabilities, making them suitable for both non-volatile storage and computation. Co\Pt and Co\Ni were the technology drivers of perpendicular Nano Magnetic Logic (pNML) devices, but because of power constraints and depinning fields, novel CoFeB\MgO layers appear more promising. In this paper, we investigate the Ta2\CoFeB1\MgO2\Ta3 stack at the simulation and experimental levels to show its potential for the next generation of magnetic logic devices. Micromagnetic simulations are used to support the experiments. At the experimental level, we first measure the switching field distribution of patterned magnetic islands, the saturation magnetization Ms via VSM, and the domain wall speed in magnetic nanowires. Then, at the simulation level, we focus on the magnetostatic analysis of magnetic islands, quantifying the stray field that can be achieved with different layout topologies. Our results show that the achieved coupling is strong enough to realize logic computation with magnetic islands, a step forward in the direction of low-power, perpendicularly magnetized logic devices.
Lecture notes in electrical engineering, Aug 8, 2015
Network-on-Chip (NoC) has been gaining interest in recent years thanks to its regular and scalable design. Several topologies have been proposed, and there is a need for a general framework for their test, validation, and comparison. In this article, a framework based on the OpenSPARC T2 processor is presented, in which the NoC replaces the Cache Crossbar. With the introduction of protocol translators, it is possible to accommodate any NoC inside the T2. Processor regression tests can be used to validate the design and evaluate timing performance.
Lecture notes in electrical engineering, 2021
The SARS-CoV-2 pandemic stressed the need to increase the adoption of remote teaching. Technical courses, specifically electronic engineering ones, suffered from the lack of real lab experiments carried out directly by students. In this paper, a new approach is presented, based on the use of very low-cost experimental boards that act both as a measurement instrument and as a programmable prototype circuit. A first board, targeted at experiments for analog and digital electronics courses, has been designed and is described in this paper.
In the last decade, Quantum-dot Cellular Automata (QCA) technology has been one of the most studied emerging technologies. The magnetic implementation, NanoMagnet Logic (NML), is particularly interesting as an alternative to CMOS technology. The main advantages of NML circuits reside in the possibility of mixing logic and memory in the same device, the expected low power consumption, and the remarkable tolerance to heat and radiation. The behavior of NML and QCA circuits differs from that of their CMOS counterparts. Consequently, the architecture organization must be tailored to their characteristics, and it is important to identify which applications are best suited to this technology. The contribution reported in this paper represents a considerable step forward in this direction. We present an optimized NML implementation of a hardware accelerator for biosequence analysis. The architecture leverages the systolic array structure, which is the best organization for this technology due to the regularity of the layout. The circuit is described using a VHDL model and simulated to verify correct functionality from the application point of view, and performance is evaluated in terms of both speed and power consumption. Results show that NML technology with the appropriate clock solution can reach a considerable reduction in power consumption over CMOS. This analysis highlights quantitatively, and not only qualitatively, that NML is well suited to Massively Parallel Data Analysis applications.
Journal of Low Power Electronics and Applications, Oct 31, 2022