09-17-2007, 03:08 PM   #4
trog
Human being with feelings
 
Join Date: Aug 2007
Posts: 2

Quote:
Originally Posted by rob
It would definitely be a nice feature. I'd assume that plain MD5 would be a bit slow in some cases though. There must be a way to do a similar check that isn't as costly. rsync is able to do so very quickly. Maybe there is some way to pull a checksum directly out of the file system?
I've often considered a "quick and dirty md5sum" variant that would be useful in situations where you don't want to spend the time running md5sum over an entire file.

Basically you could just md5 the first (say) 100k bytes of a file, the middle 100k bytes, and the last 100k bytes.
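To make that concrete, here's a rough Python sketch of the idea. The function name `quick_md5` and the `chunk_size` parameter are made up for illustration. Mixing the file size into the hash and falling back to a full hash for small files are my own assumptions, not anything the feature would have to do.

```python
import hashlib
import os


def quick_md5(path, chunk_size=100 * 1024):
    """Hash only the first, middle, and last chunk_size bytes of a file.

    Hypothetical helper: a sketch of the sampling idea, not a drop-in
    replacement for a full md5sum.
    """
    size = os.path.getsize(path)
    digest = hashlib.md5()
    # Assumption: include the file size so files of different lengths
    # with identical sampled regions still get different digests.
    digest.update(str(size).encode("ascii"))
    with open(path, "rb") as f:
        if size <= 3 * chunk_size:
            # The three windows would overlap, so just hash the whole file.
            digest.update(f.read())
        else:
            middle = (size - chunk_size) // 2
            for offset in (0, middle, size - chunk_size):
                f.seek(offset)
                digest.update(f.read(chunk_size))
    return digest.hexdigest()
```

For a large file this reads at most 300k bytes no matter how big the file is, which is where the speedup would come from. The flip side is that a change that falls entirely between the sampled windows doesn't alter the result.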

While this wouldn't be as accurate as hashing the full file (a change that falls entirely outside the sampled regions would go unnoticed), you might be able to get "good enough" accuracy to make doing comparisons much faster.

Obviously some testing would be good to tweak the numbers and style (e.g., maybe it'd be more effective to md5sum 10k bytes, then skip a big chunk, then the next 10k bytes, and so on).
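The "sample, skip, sample" style could look something like the sketch below. Again, `strided_md5`, `sample_size`, and `stride` are hypothetical names, and the 10k / 1 MB defaults are just placeholders that would need the testing mentioned above.

```python
import hashlib


def strided_md5(path, sample_size=10 * 1024, stride=1024 * 1024):
    """Hash sample_size bytes at every multiple of stride in the file.

    Hypothetical helper: the default numbers are untested guesses.
    """
    if stride < sample_size:
        raise ValueError("stride must be at least sample_size")
    digest = hashlib.md5()
    with open(path, "rb") as f:
        offset = 0
        while True:
            f.seek(offset)
            block = f.read(sample_size)
            if not block:
                # Reading at or past end of file returns b"", so we're done.
                break
            digest.update(block)
            offset += stride
    return digest.hexdigest()
```

With these defaults roughly 1% of a large file gets read, spread evenly across it, so the cost grows with file size, unlike the fixed three-window version. Whether that catches more real-world differences is exactly the kind of thing that would need measuring.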

I'd still keep the option of a complete md5 check, though, for situations where data integrity is important.