RFR(JDK11/NIO) 8202285: (fs) Add a method to Files for comparing file contents

Joe Wang huizhe.wang at oracle.com
Tue May 1 21:54:17 UTC 2018


Thanks John for the background and detailed information.

-Joe

On 4/30/2018 6:18 PM, John Rose wrote:

On Apr 30, 2018, at 4:47 PM, Joe Wang <huizhe.wang at oracle.com> wrote:

Are there real-life use cases? It may be useful for example to check if the files have the same header.

After equality comparison, lexical comparison is a key use case. By allowing the user to interpret the data around the mismatch, the comparison can be made sensitive to things like locales.

As Paul implies, finding a mismatch is the correct operation to build equality checks on top of, because (a) a mismatch has to be detected anyway to prove inequality, and (b) giving the location of the mismatch, instead of throwing it away, unlocks a variety of other operations.

If you want real-life use cases, look at uses of /usr/bin/cmp in Unix shell scripts. The cmp command is to Unix files what Paul's array mismatch methods are to Java arrays. Here's a man page reference:

https://docs.oracle.com/cd/E19683-01/816-0210/6m6nb7m6c/index.html

As with the array mismatch methods, the cmp command allows the user to specify optional offsets within each file to start comparing, as well as an optional length to stop comparing after.

See the file BufferMismatch.java for the (partial) application of these ideas to NIO buffers.

I suppose the Java-flavored version of "cmp - file" would be a file comparator which would take a byte buffer as a second operand, and return an indication of the location of the mismatch. Note that "cmp - file" compares a computed stream against a stored file.

I think Paul and I have sketched a natural "sweet spot" for performing bitwise comparisons on stored data. It's up to you how much to implement. I suggest that, if you don't feel inspired to do it all in one go, that you leave room in the code for future expansions (maybe as with BufferMismatch), and perhaps file a follow-up RFE.

- John
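For illustration only, here is a minimal sketch of the idea John describes: find the position of the first differing byte, then build the equality check on top of it. The class FileMismatchSketch, its methods mismatch and sameContent, and the 8192-byte buffer size are hypothetical names and choices made for this example; they are not the API under review in this thread. The only JDK calls assumed are the existing Arrays.mismatch range method and InputStream.readNBytes, both available since JDK 9.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not the method proposed in the RFR.
final class FileMismatchSketch {
    // Assumed chunk size; any positive value works.
    private static final int BUFFER_SIZE = 8192;

    /**
     * Returns the position of the first byte at which the two files differ,
     * or -1 if their contents are identical. If one file is a proper prefix
     * of the other, the length of the shorter file is returned.
     */
    public static long mismatch(Path a, Path b) throws IOException {
        try (InputStream inA = Files.newInputStream(a);
             InputStream inB = Files.newInputStream(b)) {
            byte[] bufA = new byte[BUFFER_SIZE];
            byte[] bufB = new byte[BUFFER_SIZE];
            long offset = 0;
            while (true) {
                // readNBytes fills the buffer unless end of stream is reached.
                int nA = inA.readNBytes(bufA, 0, BUFFER_SIZE);
                int nB = inB.readNBytes(bufB, 0, BUFFER_SIZE);
                // Relative index of the first difference in this chunk, or -1.
                int i = Arrays.mismatch(bufA, 0, nA, bufB, 0, nB);
                if (i >= 0) {
                    return offset + i;
                }
                // No difference, so nA == nB; a short chunk means both ended.
                if (nA < BUFFER_SIZE) {
                    return -1;
                }
                offset += nA;
            }
        }
    }

    /** Equality check layered on mismatch, as point (a) above suggests. */
    public static boolean sameContent(Path a, Path b) throws IOException {
        return mismatch(a, b) == -1;
    }
}
```

A caller that wants a lexical or locale-sensitive comparison could take the returned position and inspect the data around it in each file, which is the kind of follow-on operation that point (b) has in mind. Optional start offsets and a length limit, as in cmp, are left out of this sketch.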



More information about the core-libs-dev mailing list