Zhenis Otarbay - Academia.edu

Papers by Zhenis Otarbay

Method of Coordination of Motion of Swarm Robotic Systems

Scientific Journal of Astana IT University

Maintaining a specific geometric pattern is essential in various applications where groups of autonomous robots must follow a given path. Proper organization of the geometric pattern can lead to several benefits such as cost reduction, increased system reliability, and efficiency while providing a reconfigurable and flexible structure of the system. Military missions and traffic systems are examples where maintaining certain geometric patterns is widely used. However, little is known about how to develop an effective algorithm that guarantees collision avoidance and obstacle avoidance while maintaining the geometric pattern. This paper presents an algorithm for moving a group of autonomous mobile robots in formation that maintains the required geometric pattern and ensures the avoidance of collisions and obstacles. The proposed algorithm is behavior-based and utilizes a set of rules that allow the robots to navigate around obstacles and avoid collisions. ...

A Concept of Unbiased Stochastically Deterministic Policy Gradient for Better Convergence in Bipedal Walker

2022 International Conference on Smart Information Systems and Technologies (SIST)

A Concept of Unbiased Deep Deterministic Policy Gradient for Better Convergence in Bipedal Walker

2022 International Conference on Smart Information Systems and Technologies (SIST)

After a quick overview of convergence issues in the Deep Deterministic Policy Gradient (DDPG), which is based on the Deterministic Policy Gradient (DPG), we put forward a peculiar, non-obvious hypothesis that 1) DDPG can be a type of on-policy learning and acting algorithm if we consider rewards from a mini-batch sample as a relatively stable average reward during a limited time period and a fixed Target Network as a fixed actor and critic for that period, and 2) an overestimation in DDPG with the fixed Target Network within a specified time may not be an out-of-boundary behavior for low-dimensional tasks but a process of reaching regions close to the real Q value's average before converging to better Q values. To show empirically that DDPG with a fixed or stable Target may not exceed Q value limits during training in OpenAI's Pendulum-v1 environment, we simplified ideas from Backward Q-learning, which combines on-policy and off-policy learning, and call this concept the unbiased Deep Deterministic Policy Gradient (uDDPG) algorithm. In uDDPG we separately train the Target Network on actual Q values or discounted rewards between episodes (hence "unbiased" in the abbreviation). uDDPG is an anchored version of DDPG. We also use a simplified Advantage, the difference between the current Q Network gradient over actions and the current simple moving average of this gradient, in updating the Action Network. Our purpose is to eventually introduce a less biased, more stable version of DDPG. A uDDPG version (DDPG-II), with a function "supernaturally" obtained during experiments that damps weaker fluctuations during policy updates, showed promising convergence results.
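The "discounted rewards between episodes" mentioned in this abstract can be illustrated with a minimal sketch of a per-episode discounted return, G_t = r_t + gamma * G_{t+1}. This is only the standard return computation, not the authors' uDDPG training code; the function name `discounted_returns` and the default `gamma` value are assumptions made for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one finished episode.

    `rewards` is the list of per-step rewards; `gamma` is a hypothetical
    discount factor (the abstract does not state the value used).
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

Returns of this kind are computed from observed rewards only, which is one way to read the abstract's phrase "actual Q values" as training targets that do not depend on a bootstrapped estimate.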

Development of Innovative Digital Technologies for Enterprise Management

Scientific Journal of Astana IT University

Organizational efficacy and corporate strategy are substantially affected by enterprise resource planning (ERP) technologies. However, the anticipated benefits of ERP can be difficult to manage and realize. Workarounds further complicate outdated and evolving business procedures, which impedes the ongoing development of ERP. As a result, reaping the rewards is sometimes challenging. The complexity of realizing advantages also rises when subsidiaries are not able to discover new benefits on their own, even though benefits may differ from subsidiary to subsidiary. ERP systems, which integrate, synchronize, and centralize corporate data, are frequently viewed as a key resource for businesses that thrive in a fast-evolving global market. The selection of ERP systems is a crucial and challenging strategic choice because of the high cost of purchasing, installation, and implementation as well as the variety of offers. Because there are many different tangible and intangible re...

Development and Control of a Shoulder Joint for Humanoid Robotics Application

Development of a Shoulder Joint for Humanoid Robotics Application

2021 20th International Conference on Advanced Robotics (ICAR), 2021

In this paper we present the design and the kinematic model of a low-inertia, high-stiffness, tendon-driven shoulder joint for humanoid robotics applications. The current system includes three limbs connected in a parallel architecture with 2 DOFs of mobility that allow rotations about the pitch and roll axes. In addition, a third DOF can be easily included to implement the yaw rotation. Motion is produced by three tendons spaced 120 deg apart and moved by pulleys connected to motors integrated in the base of the joint. The forces applied by the three tendons to the moving platform are measured with precise load cell sensors integrated in the joint structure. A customized motherboard was developed to integrate the micro-controller unit, the motor drivers and the instrumentation amplifiers. The forward kinematic model of a single limb of the parallel shoulder joint was obtained using screw theory, and the inverse kinematics was calculated from the orientation matrix. Preliminary tests of the joint were conducted using a customized graphical user interface that facilitates monitoring and controlling the actuators' status and all the system variables.

Development of a Teach Pendant for Humanoid Robotics with Cartesian and Joint-Space Control Modalities

2019 28th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), 2019

This paper presents the design, construction and testing of a teach pendant for humanoid robotics applications. The system is equipped with a touch-based Graphical User Interface (GUI) from which the robot’s joints and the robot’s end-effectors can be easily controlled in the joint and Cartesian space respectively. A visual representation of the legs’ pose was integrated in the interface, allowing the operator to test the motion of the limbs before their actual execution on the real robot. The forward and inverse kinematic models were formalized according to the Denavit-Hartenberg convention and implemented in Python 3 with the support of the Tkinter, NumPy and Matplotlib libraries. The chassis of the teach pendant was designed using SolidWorks software to accommodate a 9-inch display with a touch sensor, a 5000 mAh battery, a Raspberry Pi 3, and an ATmega168 microcontroller. On the front panel, rotary encoders and several buttons are provided to access the menu and precisely tune the control variables.
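For readers unfamiliar with the Denavit-Hartenberg convention named in this abstract, the following is a minimal sketch of forward kinematics built from standard DH link transforms. It is not the paper's implementation: it uses plain Python lists rather than NumPy to stay self-contained, and the function names and the two-link planar example are assumptions chosen for illustration.

```python
import math


def dh_transform(theta, d, a, alpha):
    """Standard DH link transform: Rz(theta) * Tz(d) * Tx(a) * Rx(alpha)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(dh_rows):
    """Chain the link transforms; each row is (theta, d, a, alpha)."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, d, a, alpha in dh_rows:
        T = matmul4(T, dh_transform(theta, d, a, alpha))
    return T


def end_effector_position(dh_rows):
    """Extract the (x, y, z) translation of the final frame."""
    T = forward_kinematics(dh_rows)
    return (T[0][3], T[1][3], T[2][3])


if __name__ == "__main__":
    # Hypothetical planar arm: two unit-length links, second joint bent 90 deg.
    arm = [(0.0, 0.0, 1.0, 0.0), (math.pi / 2, 0.0, 1.0, 0.0)]
    print(end_effector_position(arm))
```

In a joint-space control mode the operator would set each `theta` directly, while a Cartesian mode would need the inverse mapping from a target position back to joint angles, which this sketch does not cover.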

Universal whole-genome Oxford nanopore sequencing of SARS-CoV-2 using tiled amplicons

Scientific Reports

We developed a comprehensive multiplexed set of primers adapted for the Oxford Nanopore Rapid Barcoding library kit that allows universal SARS-CoV-2 genome sequencing. This primer set is designed so that any variant of the primer pool can be set up for whole-genome sequencing of SARS-CoV-2 using single- or double-tiled amplicons from 1.2 to 4.8 kb with the Oxford Nanopore. This multiplexed set of primers is also applicable to tasks such as targeted SARS-CoV-2 genome sequencing. We propose here an optimized protocol to synthesize cDNA using Maxima H Minus Reverse Transcriptase with a set of SARS-CoV-2 specific primers, which gives high yields of cDNA template from RNA and is capable of long-length cDNA synthesis from a wide range of RNA amounts and quality. The proposed protocol allows whole-genome sequencing of the SARS-CoV-2 virus with tiled amplicons up to 4.8 kb on low-titer virus samples and even where RNA degradation has occurred. This protocol reduces the time and cost from RNA to genome se...

