Checksums ELI5
I recently had a conversation with a friend who was studying for the CISA (Certified Information Systems Auditor), an auditing certification for information systems. His book did a less-than-stellar job of describing how checksums are used to verify data integrity.
His question originally was:
can you explain hash data?
Well, yes, if you elaborate on what type of hash data you need explained.
The process of applying a mathematical algorithm to the data that travel in the network and placing the results of this operation with the hash data is used for controlling data integrity
Ahhh yes, the use of a checksum. For some, the aforementioned definition is all you need.
Here’s how using checksums should be explained:
Checksums Explained
The Goals
You want to send a file.
You want to make sure that the whole file is sent, and that nothing changed while that file was sent.
The How
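Before you send the file, run its contents through an algorithm (a hash function) that produces a short combination of letters and numbers.
Send that string, something like ‘595f44fec1e92a71d3e9e77456ba80d1’, along with the data. It is practically unique to those exact contents: change the file even slightly and you get a different string.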
When it gets to where it’s going, apply the same algorithm to the file contents and see if the combination matches.
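The three steps above can be sketched in a few lines of Python. This is only an illustration: it assumes SHA-256 from the standard `hashlib` module as the algorithm (the explanation above doesn't name one), and the `checksum` and `verify` helpers and the sample file contents are made up for the example.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Turn the data into a hex string of letters and numbers."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected: str) -> bool:
    """Apply the same algorithm to the received data and compare."""
    return checksum(data) == expected


# Sender: compute the checksum before sending (hypothetical contents).
file_contents = b"hello, this is the file I want to send"
sent_checksum = checksum(file_contents)

# Receiver: the same bytes arrived, so the combination matches.
print(verify(file_contents, sent_checksum))  # True

# Receiver: one character changed on the way, so it no longer matches.
print(verify(b"hello, this is the file I want to sent", sent_checksum))  # False
```

Note that the receiver never needs the original file to run the check, only the received bytes and the checksum that came with them.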
The Followup
So if it doesn’t match, there is likely a virus or some type of attempted hack?
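Close! A mismatch tells you that the file you received is not the file that was sent, but not why. Either the file contents were maliciously tampered with, or some of the data was lost or corrupted in transit by accident. It’s important to remember that files get transferred in little packets.
Sometimes the packets don’t make it, and sometimes they are intercepted. Comparing a checksum (hash) of the file contents before and after the transfer gives you strong evidence that the file is the same at both the source and destination. One caveat: anyone who can alter the file in transit can also swap out a checksum sent alongside it, so catching deliberate tampering requires getting the checksum through a channel you trust.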
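To make the packet point concrete, here is a toy model in Python. It assumes SHA-256 via `hashlib` and a made-up packet size of 16 bytes; `split_into_packets` and `checksum_of_packets` are hypothetical helpers, and this is not how any particular network protocol behaves. It only shows that losing one packet changes the checksum.

```python
import hashlib
from typing import Iterable, List


def split_into_packets(data: bytes, size: int) -> List[bytes]:
    """Cut the file contents into fixed-size pieces (the last may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def checksum_of_packets(packets: Iterable[bytes]) -> str:
    """Feed packets to the hash one at a time, in the order they arrive."""
    digest = hashlib.sha256()
    for packet in packets:
        digest.update(packet)
    return digest.hexdigest()


data = b"0123456789" * 10               # 100 bytes of sample file contents
packets = split_into_packets(data, 16)  # 7 packets
source_checksum = checksum_of_packets(packets)

# Every packet arrives: the destination checksum matches the source.
print(checksum_of_packets(packets) == source_checksum)  # True

# The fourth packet is lost: the destination checksum no longer matches.
received = packets[:3] + packets[4:]
print(checksum_of_packets(received) == source_checksum)  # False
```

Hashing the packets one by one gives the same result as hashing the whole file at once, which is why the check works no matter how the data was split up for the trip.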