Last Updated: February 25, 2016 · lucasrinaldi

Git is easier than you think

Here I will run through some Git internals to help you understand it more easily. I personally think it is an awesome tool, and I am sure a lot of people think the same. Linus did a great job.

You can either follow my lead and create a new Git repo to use as an example, or use one that you already have.

Init and first version

Go to your terminal (OS X or Linux; I won't explain this for Windows, but you can
download Cygwin if you want) and type this to create our repo:

~$ mkdir mynewrepo
~$ cd mynewrepo/
~/mynewrepo$ touch index.html
~/mynewrepo$ echo "<title>Coder Wall Post</title>" > index.html 
~/mynewrepo$ git init
~/mynewrepo$ git add index.html 
~/mynewrepo$ git commit -m "Created index page"

You will notice that a hidden folder named .git was created. This is where your repository lives, and each repo has its own .git folder. Feel free to check what is inside it.

Changes and second version

We now have a repo with one file and one commit. This commit is the first version of our repo.

Let's try this:

~/mynewrepo$ echo "<div>The simplicity of Git</div>" >> index.html 
~/mynewrepo$ cat index.html 
 <title>Coder Wall Post</title>
 <div>The simplicity of Git</div>
~/mynewrepo$ git add index.html 
~/mynewrepo$ git commit -m "Changes made on index.html"

We just created the second version of it. Now comes the cool part: let's dig into .git.

Diving into Git

Go into the mynewrepo/.git folder. You will see a folder named objects/, which is where the Git "object database" lives. Each file in its subfolders is an object, and it can be a blob (the contents of one file), a tree (a list of blob or tree objects), a commit (a version of our code) or a tag.

~/mynewrepo/.git/objects$ find .
./1e/172dd78d8d0c06d70dc2368d2473d8823dfef1
./52/fd4a0493ef07bec1e4bbef605f53417cf268b9
./75/ea055e99cf2facf24b33125c0afff33cc12235
./a0/47435aad7147188c24614c01d46442ecae2b55
./d9/26b468aa2bd7e5d530ff4328a9d74f01c07c66
./e8/64d2df7cc14394b814459908b894e61ca93dba
./info
./pack

Inside a Git Object

These objects are all compressed with zlib. To read their contents, we can use a tool that Git provides.

Reading the object with git cat-file:

~/mynewrepo/.git/objects$ git cat-file -p d926b468aa2bd7e5d530ff4328a9d74f01c07c66
<title>Coder Wall Post</title>

The output is the contents of index.html as of our first commit, so we know that it's a blob object.
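
To get a feel for what git cat-file does with such a file, here is a minimal Python sketch, not Git's actual implementation. It assumes the usual loose-object layout: the zlib-compressed bytes hold a small header (type, a space, the size, a NUL byte) followed by the contents. The function name parse_loose_object is made up for this example, and the stored bytes are built in memory as a stand-in, so it runs without a repo.

```python
import zlib


def parse_loose_object(raw: bytes):
    """Decompress a loose object and split it into (type, size, body)."""
    data = zlib.decompress(raw)
    # The header and the body are separated by the first NUL byte.
    header, _, body = data.partition(b"\x00")
    obj_type, size = header.split(b" ")
    return obj_type.decode(), int(size), body


# In-memory stand-in for a file under .git/objects/ (assumption: same
# contents as the first version of index.html from this post).
content = b"<title>Coder Wall Post</title>\n"
stored = zlib.compress(b"blob %d\x00" % len(content) + content)

obj_type, size, body = parse_loose_object(stored)
print(obj_type, size)
print(body.decode(), end="")
```

In a real repo you could pass the bytes of one of the files listed above instead, read with open(path, "rb").read(). Objects that Git has moved into the pack folder are stored differently and this sketch does not handle them.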

Let's try another:

~/mynewrepo/.git/objects$ git cat-file -p 75ea055e99cf2facf24b33125c0afff33cc12235
tree 52fd4a0493ef07bec1e4bbef605f53417cf268b9
parent e864d2df7cc14394b814459908b894e61ca93dba
author Lucas Rinaldi <...@gmail.com> 1378043411 +0200
committer Lucas Rinaldi <...@gmail.com> 1378043411 +0200

Changes made on index.html

This is a commit object. It's our last commit; notice the message we added earlier.

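Each object's filename is the SHA-1 hash of its contents prefixed with a small header (the object type and size). The first two characters of the hash become the folder name and the remaining 38 the file name. Commit objects include the author, the committer and the timestamps, so if you are following along, notice that your commit hashes aren't the same as mine, while blobs with identical contents get identical hashes. Git calls this naming scheme "content-addressable".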
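
As a rough sketch of how such a name can be derived, the Python below hashes a header (type, a space, the size in bytes, a NUL byte) followed by the contents, and then splits the hash into folder and file name. The helper names git_object_id and loose_object_path are my own for illustration, not part of Git.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 hex digest Git would use to name this object."""
    header = f"{obj_type} {len(body)}".encode() + b"\x00"
    return hashlib.sha1(header + body).hexdigest()


def loose_object_path(object_id: str) -> str:
    """First two hex characters are the folder, the rest is the file name."""
    return f".git/objects/{object_id[:2]}/{object_id[2:]}"


empty_blob = git_object_id("blob", b"")
print(empty_blob)                     # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(loose_object_path(empty_blob))  # .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

If you want to compare against Git itself, git hash-object index.html prints the id Git computes for a file, and it should match git_object_id("blob", ...) applied to the same bytes.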

So, what the freck does this mean?

The commit object describes the complete version of your code at the moment of your commit. Everything you need to know is included in it or reachable from it:

tree 52fd4a0493ef07bec1e4bbef605f53417cf268b9
parent e864d2df7cc14394b814459908b894e61ca93dba
author Lucas Rinaldi <...@gmail.com> 1378043411 +0200
committer Lucas Rinaldi <...@gmail.com> 1378043411 +0200

The hash that appears after tree identifies the state of the repository tree at that moment: which files and folders are inside it. Let's check:

~/mynewrepo/.git/objects$ git cat-file -p 52fd4a0493ef07bec1e4bbef605f53417cf268b9
100644 blob 1e172dd78d8d0c06d70dc2368d2473d8823dfef1    index.html

Because we have just one file, that is all that appears in the tree. I will not add more files and folders because this text is already long enough, but try adding some, especially folders. You will see how cool the way Git deals with them is... cof cof..recur..cof..sion.
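
For the curious, here is a small Python sketch of how the raw (already decompressed, header removed) body of a tree object can be read. It assumes the usual layout where each entry is the mode, a space, the name, a NUL byte and then the 20 raw bytes of the entry's hash. The function name parse_tree is hypothetical, and the tree body is rebuilt by hand from the output above rather than read from disk.

```python
def parse_tree(body: bytes):
    """Yield (mode, name, object_id) for each entry in a raw tree body."""
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\x00", space)
        mode = body[pos:space].decode()
        name = body[space + 1:nul].decode()
        # The hash is stored as 20 raw bytes, not as 40 hex characters.
        object_id = body[nul + 1:nul + 21].hex()
        yield mode, name, object_id
        pos = nul + 21


# Hand-built stand-in for the tree shown above (one entry, index.html).
blob_id = "1e172dd78d8d0c06d70dc2368d2473d8823dfef1"
tree_body = b"100644 index.html\x00" + bytes.fromhex(blob_id)

for mode, name, object_id in parse_tree(tree_body):
    # Simplification: mode 40000 means a folder (another tree); everything
    # else is treated as a blob here, which ignores submodule entries.
    kind = "tree" if mode == "40000" else "blob"
    print(mode, kind, object_id, name)
```

A folder shows up as an entry whose hash points to another tree object, which you would parse the same way. That is the recursion.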

And the parent hash is just a pointer to the previous commit.
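
Following parent pointers is all it takes to walk the history. The Python sketch below parses the text form of a commit and follows the first parent until it reaches a commit without one. The names parse_commit and walk_history are made up for this example, the commits live in a plain dict instead of the object database, and the first commit is a simplified stand-in: its tree hash is my guess from the object listing above and its author lines are left out, because the post never prints that object.

```python
def parse_commit(text: str):
    """Split a commit's text into header fields and the message."""
    header, _, message = text.partition("\n\n")
    fields = {"parent": []}
    for line in header.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            fields["parent"].append(value)  # merge commits have several
        else:
            fields[key] = value
    return fields, message


def walk_history(commits: dict, start: str):
    """Yield (hash, message) from start back to the root, first parents only."""
    current = start
    while current is not None:
        fields, message = parse_commit(commits[current])
        yield current, message.strip()
        current = fields["parent"][0] if fields["parent"] else None


SECOND = (
    "tree 52fd4a0493ef07bec1e4bbef605f53417cf268b9\n"
    "parent e864d2df7cc14394b814459908b894e61ca93dba\n"
    "author Lucas Rinaldi <...@gmail.com> 1378043411 +0200\n"
    "committer Lucas Rinaldi <...@gmail.com> 1378043411 +0200\n"
    "\n"
    "Changes made on index.html\n"
)
# Assumption: simplified stand-in for the first commit (see note above).
FIRST = "tree a047435aad7147188c24614c01d46442ecae2b55\n\nCreated index page\n"

commits = {
    "75ea055e99cf2facf24b33125c0afff33cc12235": SECOND,
    "e864d2df7cc14394b814459908b894e61ca93dba": FIRST,
}

for commit_id, message in walk_history(commits, "75ea055e99cf2facf24b33125c0afff33cc12235"):
    print(commit_id[:7], message)
```

This is roughly the idea behind git log, minus everything that makes the real thing fast and complete.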

Branches, heads, refs, tags

For this small project it is easy to find the last commit and check all the files, but can you imagine doing that for a huge project? How long would it take to find all the information?

That's why Git writes down the hash of the last commit made:

~/mynewrepo/.git$ cat refs/heads/master 
75ea055e99cf2facf24b33125c0afff33cc12235

And now you just saw what a branch is: just the hash of the last commit made on the branch, which leads to its whole history.

~/mynewrepo$ git checkout -b new_branch
Switched to a new branch 'new_branch'
~/mynewrepo$ ls .git/refs/heads/
master
new_branch
~/mynewrepo$ cat .git/HEAD
ref: refs/heads/new_branch
~/mynewrepo$ cat .git/refs/heads/new_branch
75ea055e99cf2facf24b33125c0afff33cc12235

Because Git supports multiple branches, it also writes down which branch you are currently working on. Since we just switched to new_branch, that is what HEAD points to:

~/mynewrepo$ cat .git/HEAD
ref: refs/heads/new_branch

HEAD can contain either the name of a branch, as above, or the hash of a commit directly (a "detached HEAD"). It is the same thing with tags: a lightweight tag is just a file under refs/tags/ containing the hash of the tagged commit.
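
Putting HEAD and branches together, finding the current commit is a two-step lookup. Here is a minimal Python sketch of it. The function name resolve_head is hypothetical, and the demo builds a throwaway folder that mimics the relevant part of .git instead of touching a real repo. It also assumes every ref is stored as its own file; Git can pack refs into a single packed-refs file, which this sketch ignores.

```python
import tempfile
from pathlib import Path


def resolve_head(git_dir: Path) -> str:
    """Return the commit hash that HEAD currently points to."""
    head = (git_dir / "HEAD").read_text().strip()
    if head.startswith("ref: "):
        # HEAD names a branch: the branch file holds the commit hash.
        ref_path = git_dir / head[len("ref: "):]
        return ref_path.read_text().strip()
    # Otherwise HEAD holds a commit hash directly (detached HEAD).
    return head


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp)  # stand-in for mynewrepo/.git
    (git_dir / "refs" / "heads").mkdir(parents=True)
    (git_dir / "refs" / "heads" / "master").write_text(
        "75ea055e99cf2facf24b33125c0afff33cc12235\n"
    )
    (git_dir / "HEAD").write_text("ref: refs/heads/master\n")
    print(resolve_head(git_dir))
```

On a real repo, git rev-parse HEAD gives you the same answer and handles all the cases this sketch skips.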

Conclusion

I hope this little summary of Git's guts has helped you understand it and get more interested in it. For me Git is not really straightforward; I think its commands are not intuitive and sometimes I get confused by them. Knowing the things that I wrote here helped me understand the machine better.

Feel free to add more knowledge :).