Behzad Hosseini - Academia.edu

Papers by Behzad Hosseini

Similarity; Heuristics; Shingle

Near-duplicate detection is the task of identifying documents that are the same or nearly the same. Among near-duplicate detection algorithms, fingerprinting is considered here for its use in analysis, plagiarism detection, and the repair and maintenance of social software. Fingerprints are used to identify duplicated material in the same way as cryptographic hash functions, which are secure against destructive attacks; such functions serve as high-quality fingerprinting functions. Cryptographic hash algorithms include MD5 and SHA1, which have been widely applied in file systems. In this paper, using available heuristic algorithms for near-duplicate detection, sets of similar document pairs are grouped according to a certain threshold, and each set is identified according to whether it is near-duplicate. Documents are then compared using the fingerprinting algorithm, and finally the total value is calculated using the standard method.

The Search of New Issues in the Detection of Near-duplicated Documents

Near-duplicate detection is the task of identifying documents that are the same or nearly the same. Among near-duplicate detection algorithms, fingerprinting is considered here for its use in analysis, plagiarism detection, and the repair and maintenance of social software. Fingerprints are used to identify duplicated material in the same way as cryptographic hash functions, which are secure against destructive attacks; such functions serve as high-quality fingerprinting functions. Cryptographic hash algorithms include MD5 and SHA1, which have been widely applied in file systems. In this paper, using available heuristic algorithms for near-duplicate detection, sets of similar document pairs are grouped according to a certain threshold, and each set is identified according to whether it is near-duplicate. Documents are then compared using the fingerprinting algorithm, and finally the total value is calculated using the standard method.
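
The abstract names shingles, hash-based fingerprints, and a similarity threshold but gives no parameters, so the following is only a minimal sketch of that general idea and not the paper's actual method. The shingle size `k`, the `threshold` value, the use of Jaccard similarity as the comparison measure, and all function names are assumptions made for illustration; MD5 is used because the abstract mentions it.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased, whitespace-split)."""
    words = text.lower().split()
    if len(words) < k:
        # Text shorter than k words: treat the whole text as one shingle.
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def fingerprints(text, k=3):
    """Fingerprint a document as the set of MD5 digests of its shingles."""
    return {hashlib.md5(s.encode("utf-8")).hexdigest() for s in shingles(text, k)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(doc_a, doc_b, k=3, threshold=0.5):
    """Flag a document pair whose fingerprint similarity reaches the threshold.

    Both k and threshold are illustrative values, not taken from the paper.
    """
    return jaccard(fingerprints(doc_a, k), fingerprints(doc_b, k)) >= threshold
```

With these assumed defaults, two nine-word sentences that differ only in their last word share 6 of 8 distinct shingles, giving a similarity of 0.75, so the pair would be flagged as near-duplicate.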

