Graphs, Hashes, and Compression, Oh My!

by Matthew McCullough 

Thursday, 14 June 2012 15:20

Git is a version control system. We can look at it from that high level. Git is a content tracking system. Some teachers advise us to look at it from that lowered elevation. But I will take you to the very bottom. The floor. The code. The algorithms. The directed acyclic graph of hashed bit sequences made efficient through LZW commpression and deferred garbage collection determined by node reachability via hash relationships.

“But why?”, you may ask. “Why go this deep?”” Git is a tool that works so well for so many. It mystically corrects anticipated `merge` conflicts. It’s “where did code come from” results from `blame` are impressive. The ability to re-write history through `rebase` is awesome. The globally unique identifier nature of a hash-produced ref is revolutionary.

Uber-geeks are magic-slayers. We want and need to know precisely how things work. Like a hard 50 push-up workout, this study will make working with Git at the daily developer level a fraction of the effort — like a mere ten push-ups. Let’s dig into the guts of Git.