Github and the Magical Black Hole of Version Control
03 Oct 2014
In celebration of my first-ever (successful!) pull request, I’m dedicating this post to the use of git and Github. The concept of git and Github was a huge source of frustration to me early in the summer, when I painfully managed to get my website onto Github without any understanding of what I was doing. I’m still learning how to understand and use these tools effectively and appropriately, and I’m sure I will be for a very long time.
Git isn’t something you can really see, certainly not in the way you can see the results of HTML or Ruby that you wrote, and a beginner uses git and Github differently from someone working on a team. And unlike with other concepts, there aren’t many resources out there that explain the basics or how to improve on your own. This tutorial is great for learning the commands, but when I started, I didn’t understand what I was accomplishing with these commands or why.
First you need to understand that git is a method of version control that tracks your changes, while Github is a website that makes git much easier to use and understand. Github is built on top of git, not the other way around. I think of git as a magical black hole that permanently absorbs information from files and can communicate between your computer and Github, but only if you tell it to and are responsible about doing so.
You can connect files on your computer to Github by creating a folder, or directory, adding in the files that you want to put on Github, and sprinkling magical git ‘init’ dust on the folder to transform it into a repository. This doesn’t change your files; it just adds a hidden .git folder inside and tells git to watch the folder. You then create a repository on Github, and tell git as directed that you want to connect the repository on your computer and the repository on Github.
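As a rough sketch of that setup, the commands might look like the following. The folder name, the file, and the Github URL are all made up for illustration (Github shows you the real URL when you create the repository there), and everything happens in a throwaway directory so nothing real is touched.

```shell
# Work in a throwaway directory.
workdir="$(mktemp -d)"
cd "$workdir"

# Create a folder and put a file in it (hypothetical names).
mkdir my-website
cd my-website
echo "<h1>Hello</h1>" > index.html

# Sprinkle the 'init' dust: this adds a hidden .git folder and nothing else.
git init

# Tell git where the repository on Github lives.
# Placeholder URL: replace it with the one Github gives you.
git remote add origin https://github.com/your-username/my-website.git

# List the connections git now knows about.
git remote -v
```

Nothing has been sent anywhere at this point; git only knows the address.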
Once your folder is a git repository, git can see the files in it, but it can’t do anything with them. You need to ‘add’ those files. Now they’re ‘staged’ but git still isn’t doing anything with them; it just has the potential. If you want git to actually save those files or the new changes in them, you need to ‘commit’ them. The files on your computer will be unchanged, but now copies of those files, any past versions of those files that you committed, and the ‘commit’ messages you added will permanently reside in the magical black hole of git. It’s only at this point that you can tell Github about these files, and if you’re ready, ‘push’ them into the ‘master’ branch of the repository on Github.
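Here is a minimal sketch of the ‘add’, ‘commit’, ‘push’ sequence. To keep it runnable without an account, a local bare repository (the hypothetical fake-github.git) stands in for Github, and the name and email are placeholders.

```shell
workdir="$(mktemp -d)"
cd "$workdir"

# Stand-in for the repository on Github: a bare repository in a local folder.
git init --bare fake-github.git

# The repository on your computer.
git init my-website
cd my-website
git config user.name "Example Person"        # placeholder identity,
git config user.email "person@example.com"   # set for this repository only
git remote add origin "$workdir/fake-github.git"

echo "<h1>Hello</h1>" > index.html

git status --short              # git sees the file but is not tracking it
git add index.html              # stage the file
git status --short              # the file is now staged
git commit -m "Add home page"   # save it into the black hole
git log --oneline               # the commit and its message

# Send the commit to the remote. HEAD means "the branch I am on now",
# which is master in this post and often main on newer setups.
git push -u origin HEAD
```

Until the last line runs, the stand-in for Github knows nothing about the file.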
If you change your documents and want the changes to appear on Github, you need to ‘add’, ‘commit’, and ‘push’, or just ‘add’ and ‘commit’ if you want git to save and track your changes but you don’t want them to appear online yet. You might notice that you can edit files in Github and think that this will be easier, but it won’t be. This will likely result in conflicts, or differences between versions, which are frustrating and confusing and are not supposed to happen while you’re working alone. Later, when you’re contributing directly to a repository with other developers, you will have to carefully ‘pull’ their changes as they add them, thereby updating what you have locally on your computer, while making and pushing your own changes. This is when conflicts typically occur.
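To see what a ‘pull’ does, here is a small sketch where a second copy of the repository plays the part of another developer (or of you editing on the website). As before, a local bare repository stands in for Github, and all folder names, file names, and identities are invented.

```shell
workdir="$(mktemp -d)"
cd "$workdir"
git init --bare fake-github.git      # stand-in for Github

# Your computer: make a first commit and push it.
git clone fake-github.git my-computer
cd my-computer
git config user.name "Example Person"
git config user.email "person@example.com"
echo "first line" > notes.txt
git add notes.txt
git commit -m "Add notes"
git push -u origin HEAD
cd ..

# Someone else: a separate copy that changes the file and pushes.
git clone fake-github.git teammate
cd teammate
git config user.name "Example Teammate"
git config user.email "teammate@example.com"
echo "second line" >> notes.txt
git add notes.txt
git commit -m "Add a second line"
git push origin HEAD
cd ..

# Back on your computer: your copy is unaware of the change until you ask.
cd my-computer
cat notes.txt     # still only the first line
git pull          # fetch the new commit and update your files
cat notes.txt     # now both lines
```

No conflict happens here because only one side changed the file. If both copies had committed different edits to the same line before the pull, git would stop and ask you to sort it out.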
Git doesn’t like conflicts because its purpose is to orchestrate different people working on the same project and to encourage organized, planned changes. To git, even if you’re working alone, the repository on your computer and the repository on Github are still separate places. It expects that you are working only on the files on your computer and that you, like your files, are unaware of what is happening on Github until you ask git and it tells you.
If you’re interested in someone else’s project, you can ‘fork’ it, copying it to your own Github account, or ‘clone’ it, downloading it to your computer, where you can play with it as you wish. If eventually you do this to an open source project and you commit useful changes to your forked or cloned version, you can submit a ‘pull request’ to ask those in charge of the original project to ‘pull’ your changes into their own main version of the project, and actually use them.
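Forking and opening a pull request both happen on the Github website, so they can’t be shown with git alone. The sketch below only imitates the idea: one local bare repository plays the original project, and a bare clone of it plays your fork. All names are hypothetical.

```shell
workdir="$(mktemp -d)"
cd "$workdir"

# Build a pretend original project with one commit in it.
git init original-src
cd original-src
git config user.name "Project Owner"
git config user.email "owner@example.com"
echo "Original project" > README.txt
git add README.txt
git commit -m "Start project"
cd ..
git clone --bare original-src original.git   # "the project on Github"

# Fork: your own copy of the project (imitating Github's Fork button).
git clone --bare original.git my-fork.git

# Clone: download your fork to your computer and change something.
git clone my-fork.git my-copy
cd my-copy
git config user.name "Example Person"
git config user.email "person@example.com"
echo "A helpful fix" >> README.txt
git add README.txt
git commit -m "Improve README"
git push origin HEAD
cd ..

# Your fork has the new commit; the original does not.
git -C my-fork.git log --oneline
git -C original.git log --oneline
```

That gap between the two logs is what a pull request is for: it asks the owners of the original to bring your commit across.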
If you want to make a series of changes you’re unsure about or take your work on differing paths, you can create a new branch. Branches separate different series of changes. In practice, if you have a currently functioning project but want to change it, you will probably create and commit your new changes to a ‘development’ branch, where the changes are completely separate from the main ‘master’ branch that your functioning project most likely lives in. The changes in ‘development’ will leave your master branch untouched until you ‘merge’ the branches, adding the new changes from the new branch into the original branch.
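A small sketch of that branch-and-merge flow, run entirely on your own computer. The project and file names are invented, and the main branch’s name is looked up rather than assumed, since it is master in this post but may be main on newer setups.

```shell
workdir="$(mktemp -d)"
cd "$workdir"
git init my-project
cd my-project
git config user.name "Example Person"        # placeholder identity
git config user.email "person@example.com"

# A functioning project on the main branch.
echo "working version" > app.txt
git add app.txt
git commit -m "Working project"

# Remember what the main branch is called.
main_branch="$(git rev-parse --abbrev-ref HEAD)"

# Create a 'development' branch and switch to it.
git checkout -b development
echo "experimental change" >> app.txt
git add app.txt
git commit -m "Try an experiment"

# Switch back: the main branch is untouched.
git checkout "$main_branch"
cat app.txt       # only "working version"

# Merge the new changes into the main branch.
git merge development
cat app.txt       # both lines
```

If the experiment had gone badly, skipping the merge would have left the main branch exactly as it was.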
Git does far, far more than I can even begin to talk about, but I wanted to address some basic misunderstandings that I’ve struggled with and have seen others struggling with. There is lots of information out there, but almost all of it assumes some prior level of understanding (and that you’re working on a large project with others), even when explaining the basics.
This has been another overly long post, but if you’re interested in the difficulties of beginner programmers (whether you’ve been one yourself or not), this piece is an incredibly interesting, non-git-related blog post that I recently read and highly recommend.
P.S. Save completely separate backups before attempting anything in git that you aren’t confident about.