Rajendra Kumar - Academia.edu
Papers by Rajendra Kumar
International Journal of Computer and Electrical Engineering, 2010
In this paper, an optimized all-pairs shortest path algorithm is presented for unweighted, undirected graphs with an additive error of at most 2. The algorithm can also be extended to weighted graphs, but it does not work for directed graphs because the commutative property does not hold there. The running time of the algorithm is $O(n^{5/2})$, where n is the number of vertices in the graph. The algorithm is much simpler than existing algorithms. Upper bounds on the size of a maximal independent set of such graphs are also studied.
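The abstract does not spell out the algorithm, so the following is only a minimal sketch of one well-known way to obtain an additive error of at most 2 on unweighted, undirected graphs: the high-degree/low-degree split with a dominating set. It is not claimed to be the paper's method, and it does not try to reproduce the $O(n^{5/2})$ bound (the dominating set here is chosen naively). The names `bfs`, `approx_apsp`, `threshold`, `star` and `sparse` are hypothetical.

```python
import heapq
from collections import deque


def bfs(adj, src):
    """Exact hop distances from src in an unweighted graph."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def approx_apsp(adj):
    """Distance estimates est[u][v] with exact <= est <= exact + 2.

    adj maps each vertex to the set of its neighbours (undirected).
    Assumption: a vertex is 'high degree' if its degree >= floor(sqrt(n)).
    """
    n = len(adj)
    threshold = max(1, int(n ** 0.5))
    high = {v for v in adj if len(adj[v]) >= threshold}

    # Naive dominating set: every high-degree vertex is a dominator
    # or gets one 'star' edge to an adjacent dominator.
    dominators = set()
    star = {v: set() for v in adj}
    for v in sorted(high):
        d = next((w for w in adj[v] if w in dominators), None)
        if d is None:
            dominators.add(v)
        else:
            star[v].add(d)
            star[d].add(v)

    # Sparse graph: edges touching a low-degree vertex, plus star edges.
    sparse = {
        u: {v for v in adj[u] if u not in high or v not in high} | star[u]
        for u in adj
    }

    # Exact distances from every dominator in the full graph.
    exact = {d: bfs(adj, d) for d in dominators}

    est = {}
    for u in adj:
        # Dijkstra on the sparse graph, seeded with shortcut edges
        # u -> d of weight dist(d, u) for every reachable dominator d.
        heap = [(0, u)]
        heap += [(exact[d][u], d) for d in dominators if d != u and u in exact[d]]
        heapq.heapify(heap)
        dist = {}
        while heap:
            dx, x = heapq.heappop(heap)
            if x in dist:
                continue
            dist[x] = dx
            for y in sparse[x]:
                if y not in dist:
                    heapq.heappush(heap, (dx + 1, y))
        est[u] = dist
    return est
```

The estimate never undershoots, because every edge used corresponds to a real path. It overshoots by at most 2 because the last high-degree vertex on a shortest path is within one hop of a dominator whose distances are exact.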
International Journal of Computer Applications, 2010
In this paper, we present an algorithm to compute optimized all-pairs shortest paths in an unweighted, undirected graph with an additive error of at most 2. The algorithm can also be extended to weighted graphs, but it does not work for directed graphs because the commutative property does not hold there. The algorithm runs in $O(n^{5/2})$ time, where n is the number of vertices in the graph. The algorithm is much simpler than existing algorithms. A study of upper bounds on the size of a maximal independent set of such graphs has also been performed.
International Journal of Computer Applications, 2010
In this paper we present control flow prediction (CFP) in a parallel register sharing architecture to achieve a high degree of ILP. The main idea is to go a step beyond the prediction of a common branch and give the architecture information about the CFG (Control Flow Graph) components of the program, so that it can make better branch decisions for ILP. The navigation bandwidth of the prediction mechanism depends on the degree of ILP. It can be increased by increasing control flow prediction at compile time. This increases the size of initiation, which allows overlapped execution of multiple independent flows of control. Multiple branch instructions can also be allowed. These are intermediate steps toward increasing the size of the dynamic window in order to exploit a high degree of instruction level parallelism.
International Journal of Computer and Electrical Engineering, 2011
Exploiting the potential performance of superscalar processors has shown that the processor must be fed with sufficient instruction bandwidth. The fetcher and the Instruction Stream Buffer (ISB) are the key elements in achieving this target. Current ISBs do not support the instruction stream beyond basic blocks. The split-line instruction problem aggravates this situation for x86 processors. With the implementation of a Line Weighted Branch Target Buffer (LWBTB), the ISB can predict advance branch information and the reassembly of cache lines. Code generation for a parallel register sharing architecture is inherently complex and involves issues that are not present in sequential code compilation. To resolve such issues, a consistency contract between the code and the machine can be defined, and the compiler is required to preserve the contract during code transformation. Achieving a high level of parallelism at a faster clock speed requires distributing processor resources and avoiding primitives that require single-cycle global communication. The Raw microprocessor distributes its resources, including instruction streams, register files, memory ports and ALUs, over a pipelined two-dimensional mesh interconnect [4]. In this paper, we propose a compiler, RPCC, for general-purpose sequential programs on the Raw machine.
Acta Informatica Malaysia
Vein recognition is the most accurate and secure technology in the field of biometrics. We have seen many criminal cases involving risks such as forgery or theft. All biometric traits have their own advantages and disadvantages. For example, in the long term a person's fingerprint may be damaged due to environment, aging or ethnicity; face recognition has medium accuracy; iris recognition is costly. Vein recognition is a solution to these problems. The most important point about a vein recognition system is that it works on living persons only, because the infrared radiation is absorbed by the hemoglobin present in the blood of living persons. Skin integrity does not affect the accuracy or readability of finger vein recognition. Including vein recognition among biometric traits has many benefits; one is that users are not required to make physical contact with the device. The technology can also be used for medical purposes: children below the age of 12 have very smooth veins that doctors sometimes cannot locate for an injection, but that NIR cameras can detect. In this paper we present the various techniques of palm vein recognition applied today, the need for and scope of vein recognition systems, their implementation challenges, applications, merits and demerits, and finally a conclusion.
Acta Electronica Malaysia
Un-wired computers (mobile computers) such as laptops, netbooks, notebooks and Personal Digital Assistants (PDAs) are the fastest growing segments of the computing industry; this changing aspect of computing has led to the invention of Mobile Ad Hoc Networks (MANETs). Mobile ad hoc networking is a new era of infrastructure-less communication networks for mobile devices (hosts, nodes, etc.). Mobile nodes that are within radio range of each other can communicate directly, and they can use intermediate nodes as routers when they are moving away or getting out of range of the connected nodes. This property, along with the ability to switch from one network topology to another, helps improve node mobility. Because of this mobility and the undefined infrastructure, security is a major concern in mobile ad hoc networks. In this paper we provide a detailed analysis of the performance of the ad hoc routing protocol AODV in mobile ad hoc networks with and without the presence of a malicious node. We used the QualNet version 5.0 simulator to measure the effect of an attack on mobile ad hoc networks, which gives a clear picture of the throughput, variations in CBR, packet delivery delay, average jitter and end-to-end delay when an attack affects the network.
International Journal of Computer Applications …, 2010
International Journal of Computer Applications, 2010
Exploiting the potential performance of superscalar processors has shown that the processor must be fed with sufficient instruction bandwidth. The fetcher and the Instruction Stream Buffer (ISB) are the key elements in achieving this target. Current ISBs do not support the instruction stream beyond basic blocks. The split-line instruction problem aggravates this situation for x86 processors. With the implementation of a Line Weighted Branch Target Buffer (LWBTB), the ISB can predict advance branch information and the reassembly of cache lines. Code generation for a parallel register sharing architecture is inherently complex and involves issues that are not present in sequential code compilation. To resolve such issues, a consistency contract between the code and the machine can be defined, and the compiler is required to preserve the contract during code transformation. Achieving a high level of parallelism at a faster clock speed requires distributing processor resources and avoiding primitives that require single-cycle global communication. The Raw microprocessor distributes its resources, including instruction streams, register files, memory ports and ALUs, over a pipelined two-dimensional mesh interconnect [4]. In this paper, we propose a compiler, RPCC, for general-purpose sequential programs on the Raw machine.
In this paper we present a novel heuristic for the selection of hyperblocks in if-conversion. If-conversion has proved to be a promising method for exploiting ILP in the presence of control flow. In predication, if-conversion turns the control dependency between branches and the remaining instructions into a data dependency between the predicate definition and the predicated structures of the program. As a result, control flow is transformed into optimized traditional data flow, and branch scheduling becomes reordering of serial instructions. The degree of ILP can be increased by overlapping the execution of multiple program paths. The main idea is to go a step beyond the prediction of a common branch and give the architecture information about the CFG (Control Flow Graph) components of the program, so that it can make better branch decisions for ILP. The navigation bandwidth of the prediction mechanism depends on the degree of ILP. It can be increased by increasing control flow prediction in procedural languages at compile time. This increases the size of initiation, which allows overlapped execution of multiple independent flows of control. Multiple branch instructions can also be allowed as intermediate steps toward increasing the size of the dynamic window in order to exploit a high degree of ILP.
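As an illustration of the control-to-data dependency transformation described above, here is a minimal sketch of if-conversion on a single two-way branch. It is not the hyperblock selection heuristic from the paper; the function names and the arithmetic select are assumptions chosen for illustration, and real if-conversion is performed by the compiler on predicated machine instructions rather than in source code.

```python
def abs_diff_branchy(a, b):
    """Original form: the result depends on a branch (control dependency)."""
    if a > b:
        r = a - b
    else:
        r = b - a
    return r


def abs_diff_if_converted(a, b):
    """If-converted form: both paths are computed and a predicate selects.

    The branch is gone; later instructions depend only on the value of
    the predicate p (data dependency), so they can be freely reordered.
    """
    p = int(a > b)          # predicate definition (1 or 0)
    t = a - b               # would be guarded by p on predicated hardware
    f = b - a               # would be guarded by not-p
    return p * t + (1 - p) * f  # select without branching
```

Both functions return the same value for integer inputs; the second trades extra computation on the untaken path for straight-line code.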
In this paper, we present hardware and compiler issues in exploiting instruction level parallelism. In this context, solutions based on balanced scheduling are presented, and the balanced scheduler is compared with the traditional scheduler. Balanced scheduling combined with three compiler optimizations (loop unrolling, trace scheduling and cache locality analysis) is very helpful in increasing ILP speedup. Loop unrolling and trace scheduling increase ILP by giving the scheduler a larger space of instructions from which to select. Cache locality analysis, on the other hand, uses the available ILP more efficiently. With loop unrolling, the compiler generates more ILP by duplicating the loop body a number of times equal to the unrolling factor. The balanced scheduler increases its advantage over the traditional scheduler when more ILP is available. We show how loop unrolling, trace scheduling and cache locality analysis in association with balanced scheduling can reduce interlock cycles by up to 5%. The same optimizations with the traditional scheduler reduce cycles by not less than 15%.
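The loop unrolling step mentioned above can be sketched by hand. This is only an illustration under assumptions: the unrolling factor of 4, the dot-product workload and the function names are all hypothetical, and in practice the compiler performs the transformation on machine-level code. The four separate accumulators show why unrolling exposes ILP: the four multiply-adds in one unrolled iteration do not depend on each other.

```python
def dot_rolled(xs, ys):
    """Baseline loop: each iteration depends on the previous total."""
    total = 0
    for i in range(len(xs)):
        total += xs[i] * ys[i]
    return total


def dot_unrolled4(xs, ys):
    """Loop body duplicated 4 times (unrolling factor 4)."""
    n = len(xs)
    limit = n - n % 4           # largest multiple of 4 not exceeding n
    acc0 = acc1 = acc2 = acc3 = 0
    for i in range(0, limit, 4):
        # Four independent operations a scheduler could overlap.
        acc0 += xs[i] * ys[i]
        acc1 += xs[i + 1] * ys[i + 1]
        acc2 += xs[i + 2] * ys[i + 2]
        acc3 += xs[i + 3] * ys[i + 3]
    for i in range(limit, n):   # remainder iterations
        acc0 += xs[i] * ys[i]
    return acc0 + acc1 + acc2 + acc3
```

With integer inputs the two functions agree exactly; with floating-point inputs the reordered additions may differ in the last bits.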
Instruction Level Parallelism (ILP) is not a new idea. Unfortunately, ILP architectures are not well suited to all conventional high-level language compilers and compiler optimization techniques. Instruction level parallelism is a technique that allows a sequence of instructions derived from a sequential program (without rewriting) to be parallelized for execution on multiple pipelined functional units. As a result, performance increases while working with existing software. At the implicit level it is achieved by modifying the compiler, and at the explicit level it is achieved by exploiting the parallelism available in the hardware. To achieve a high degree of instruction level parallelism, it is necessary to analyze and evaluate the techniques of speculative execution and control dependence analysis, and to follow multiple flows of control. Researchers are continuously discovering ways to increase parallelism by an order of magnitude beyond current approaches. In this paper we present the impact of control flow support on highly parallel architectures with 2 cores and 4 cores. We also investigate the scope of explicit and implicit parallelism. For our experiments we used the Trimaran simulator. The benchmarks are tested on abstract machine models created with the Trimaran simulator.