Why version control matters
Version control enables designers and developers to work on the same code at the same time without overwriting each other’s changes. Version control allows you to do several things, but let’s focus on a few key points and why they matter.
Let’s say you're creating a new feature for your project. At some point during this process, you realize that the code is not functioning and you need to revert to a previous working state. Version control allows you to go back to a previous state if you realize that a modification was a mistake or not a good idea. With Git, you can experiment and make changes with confidence, knowing that if a committed file is deleted, it can be recovered.
If you work on a team, you have likely come across the situation where multiple people are working on the same file at the same time. For some, the remedy is shouting “hey, I’m working on file xyz, no one touch it!” But is there a better way? With Git, multiple people can make changes to the same file or collection of files without the concern that modifications will go undetected.
Another advantage of using version control is having the ability to attach a date to your changes. Whether you’re working solo or on a team, having the option to jump back to any moment in time is invaluable. With version control, once a file has been edited, you can save it and reference it later. Do you need a file as it was on October 19th, 2005? Now it's only a click or a command away. Commit messages are another powerful communication tool. They help you understand the motivation and history behind a specific change, whether it happened two weeks ago or two years ago.
What makes Git a solid choice
Like most aspects of technology, there are several options available to meet your version control needs. And like most tools, you need to understand the advantages and disadvantages of your options to make the best decision for your needs.
The most popular version control tool in recent years was Subversion (commonly referred to as SVN). A centralized service, SVN required developers to connect to the master copy of the code to commit changes. In our current landscape of remote working and distributed teams, this aspect of SVN is less than ideal. SVN can be slower to work with and less flexible when working with large numbers of branches.
This was part of why Git was created. Designed as a lightweight, distributed VCS, Git is intended to ensure that developers are never constrained by a network connection or left waiting for code updates from coworkers. Git fits the modern development mentality of making fast, small changes and making them often. Iteration is the key, and Git gives the developer an easy-to-use map of how a project has progressed.
In addition to speed, Git offers many other advantages. A developer is not constrained by a network connection; changes can be made offline, then pushed to a remote repo later. Git also fits different workflows, enabling teams of different types to use the VCS in a way that suits them best. And, unlike SVN, switching between or merging branches is straightforward.
Last, and most important, Git can remove some of the fear from making changes. The changes from any commit can be “undone” with the various options Git provides. This is crucial, especially for young developers. Git’s easy branching and multiple options for reverting changes mean you can experiment with your code without fearing any lasting consequences.
To summarize, what makes Git attractive to developers?
- fewer constraints
- flexible workflows
- intuitive branching
- changes can be easily rolled back
- useful for distributed teams
It’s easy to see why Git has become such a popular choice in the last 5–10 years, surpassing all other version control systems.
Having said all that, there are some disadvantages to using Git. No technology is perfect, and Git is no exception to that general rule. How does it fare poorly in comparison to a VCS like SVN?
First, Git can be difficult to grasp for those brand new to version control. Although we listed “flexible workflows” as an advantage of Git, this flexibility can also be what makes it hard to get started. If you're new to programming in general and are not familiar with common practices, Git’s “anything goes” approach can be daunting. Not to worry though, that’s why we (and others) are creating a guide on this subject.
Relatedly, teams can cause themselves problems if they do not make the effort to agree on and communicate a workflow. Because each developer has a copy of the repo, agreeing to follow the same conventions for different events is vital. Whereas SVN forces team members to contribute changes to the “trunk” in the same manner, Git is flexible, and as such teams can get themselves into a painful mess very quickly.
As well, Git does not work well with large repos. Whereas an SVN repo can be several GB in size, Git will have trouble with a repo greater than 1 GB. Also, Git does not handle large files (or binary files) well, whereas this is not an issue with SVN.
Last, for some organizations, having multiple copies of their code on different computers is a security concern. For industries like banking and health care, this approach is not feasible or desired when compared to a centralized store of the code on internal infrastructure.
However, all these concerns aside, there is a reason why Git is the fastest growing VCS. The number of Git repos hosted on Beanstalk surpassed the number of SVN repos in mid-2015. And the trend is twofold: SVN repos are dropping while Git repos rise. For most teams, the advantages Git offers far outweigh any other aspects of the tool. If you’re transitioning to Git from SVN or another tool, we recommend "Migrating to Git" as a reference.
Let’s Git started
Before we can begin exploring Git and its capabilities, you’ll want to make sure that you have it installed on your computer. If you haven’t already, we have a few guides that can help you get set up below.
Setting up your Git config is one of the first tasks you’ll want to tackle when setting up your Git environment. It allows you to instruct Git on how you want it to function for you. Once you've installed Git, you’ll want to link your username and email address to your Git profile. Git uses this information from your configuration to stamp each commit you make with your credentials. You can run this command to add your username:
git config --global user.name "Ashley Harpp"
You can read the output of your git config with this command:
git config --global user.name
Then, you should see an output like this:
Ashley Harpp
The username should belong to the person who will be making the commits. This is another one-time setup, but it’s an important one. You can also tell Git what text editor to open by default when you’re using it.
For example, you can run this command to tell Git that you want the program "Atom" as your default text editor:
git config --global core.editor "atom --wait"
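The identity setup described above mentions both a username and an email address, so here is a minimal sketch that sets and reads back both. The scratch HOME directory and the address ashley@example.com are assumptions made for this illustration only; they keep the example from touching your real configuration.

```shell
# Sandbox (assumption for this sketch): point HOME at a scratch directory so
# the --global commands below write there instead of your real ~/.gitconfig.
# Drop these two lines to configure your actual Git profile.
export HOME="$(mktemp -d)"
export XDG_CONFIG_HOME="$HOME/.config"

# Link a name and an email address to your commits.
git config --global user.name "Ashley Harpp"
git config --global user.email "ashley@example.com"   # placeholder address

# Read the values back to confirm they were stored.
configured_name="$(git config --global user.name)"
configured_email="$(git config --global user.email)"
echo "$configured_name <$configured_email>"
```

Reading a value back right after setting it is a quick way to catch a typo before it ends up stamped on your commits.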
Working with repositories
What is a repository?
If you’re going to use a form of version control, you’ll need to know how to work with repositories. But what exactly does a repository do? What is its purpose? A repository is a place where your project lives. It’s a directory that will store all of your code, text and image files.
When you make changes to your project, your repository should always reflect the most recent updates. The general idea and best practice is to commit early, and commit often. In order to use Git, you must have a repository. Repositories are the most basic component of Git.
How do I create a repository?
Note: You can access Git by using various Git graphical user interface (GUI) clients, but this guide is focusing on the command line. Understanding how to use Git on the command line could make the transition to a GUI client easier down the road.
Now that you know what a repository is, it’s time to create one. If you’re just getting started with Git, the git init command is one you don’t want to forget. This command creates a new Git repository. Let’s see how this works in action. Let’s say you have a new project; we’ll call it “my-first-project”. If you’re working on the command line, you’ll first need to navigate to that directory on your local machine and initialize it with Git. You can do that by typing the following:
cd my-first-project
git init
Once you’ve done that, you have created a Git repository. You can confirm that the initialization was successful by looking for the .git directory. This directory is hidden by default, but you can locate it by typing:
ls -a
This command tells your shell to list all of the files in the directory, including those that are hidden.
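If you’d like to try the whole sequence without touching a real project, here is a small sketch. It creates “my-first-project” inside a temporary directory (the mktemp location is our assumption, used only so the example cleans up easily) and then checks for the .git directory in two ways.

```shell
# Create a throwaway project directory (the temp location is an assumption).
project_dir="$(mktemp -d)/my-first-project"
mkdir -p "$project_dir"
cd "$project_dir"

# Turn the directory into a Git repository.
git init -q

# List everything, including hidden entries such as .git.
ls -a

# Two ways to confirm the repository exists.
if [ -d .git ]; then repo_ready="yes"; else repo_ready="no"; fi
inside_work_tree="$(git rev-parse --is-inside-work-tree)"
echo "repository initialized: $repo_ready"
```

The git rev-parse check is handy in scripts, because it also works from any subdirectory of the project.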
Creating a repository on Beanstalk
The video below explains how to create a repository on Beanstalk.
The staging area
In Git, the staging area is a place where changes live until they are ready to be committed. When you run the git add (file) command on a new file, or on a file that already exists and has been changed, that version of the file goes to the staging area and stays there until it is explicitly committed. Let’s discuss two fundamental commands you’ll need to know to use Git.
The git add command is simple, but it holds much value. If you are adding a new file to your repository, you’ll need to know how to tell Git that it exists and that you’d like to include it in the repo. This is where git add comes into action. Let’s say you want to add a new blog page called “my-blog.html”. You’ve created the file in your project directory, but how do you tell Git that the file is there? You can run:
git add my-blog.html
If you’ve used SVN in the past, it’s important not to confuse the git add command with svn add. With the svn add command, you place a file under version control. The git add command, in contrast, does not directly change the repository’s history at all. It isn’t until you use the git commit command that the changes are recorded in your repo. You’ll need to use git add each time you update a file.
Tip: Files in your local directory are either tracked or untracked. Tracked files are files that are not new to your repository. If you're working on an existing repo, then all of the files are tracked. However, if you're adding a new file to a repository, then that file is untracked because Git does not have a record of its existence in any of the previous working states. This means that in order to track the file, it needs to be added to your repo using the git add and git commit commands.
With this in mind, you can think of git add as a command with multiple use cases. You can track brand new files, stage modified files, and mark files that have merge conflicts as resolved. Let’s look at an example of how you can use git add to stage changes to a file that is already tracked.
Let’s say you have added the file “beanstalk.html”. You’ve already used the git add command to add it to the staging area, and committed the file to your repo. Now you want to update the file and commit the new changes to your repo. After editing the file, let’s run the git status command to see how Git responds to your changes. Run:
git status
On branch develop
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   beanstalk.html

no changes added to commit (use "git add" and/or "git commit -a")
Git is saying that the “beanstalk.html” file has been modified. Even though Git is tracking the file, you’ll still need to use the git add command to stage your changes before you can commit them to the repo. Now let’s run the git add command to add the file to the staging area, and then we’ll run git status again.
git add beanstalk.html
git status
On branch develop
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   beanstalk.html
Now the file has been added to the staging area and is waiting on you to commit it. But let’s say you notice a typo, and you need to make one more change to that file. You open the “beanstalk.html” file and fix the typo and then run git status again.
On branch develop
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   beanstalk.html

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        modified:   beanstalk.html
Is Git saying that you have two of the same file in your repo? Not at all. These modified files represent two separate versions of the same file. Let’s examine this in detail.
Under the “Changes to be committed” section, the “beanstalk.html” file is listed as modified. This entry represents the initial change made before correcting the typo. When the initial change was made, the “beanstalk.html” file was added to the staging area. But the most recent change (where we fixed the typo) has not been added to the staging area yet. That’s where the “Changes not staged for commit” section comes into play. In a color terminal, red denotes that the latest version of the file has not been added to the staging area, and green shows that a version was added to the staging area but not committed. To fix this, we’ll use the git add command again to stage the most recent update where the typo was fixed, and then check the status:
git add beanstalk.html
git status
On branch develop
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        modified:   beanstalk.html
Now we’re back to normal, and ready to commit our changes to the repo. But how do we do that? By running:
git commit
This will open your text editor so that you can enter your commit message. If you want to enter the commit message without opening a text editor, you could run this command instead:
git commit -m "your commit message goes here"
- The staging area is for files that have been modified or added, but not committed.
- Commit atomic changes, instead of committing one file at a time or a set of minor changes.
- The credentials used in your git config connect you to your commits.
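To see the staged and unstaged states side by side, here is a sketch of the “beanstalk.html” walkthrough that you can run in a scratch repository. The temporary directory, the file contents, the commit messages, and the placeholder email address are all assumptions for illustration. It uses git status --short, a compact form of git status that prints one column for staged changes and one for unstaged changes.

```shell
# Scratch repository (temp location and identity are assumptions).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Ashley Harpp"
git config user.email "ashley@example.com"

# First commit, so that beanstalk.html becomes a tracked file.
echo "<h1>Beanstalk</h1>" > beanstalk.html
git add beanstalk.html
git commit -q -m "add beanstalk page"

# Edit and stage the file, then edit it again without staging.
echo "<p>First edit</p>" >> beanstalk.html
git add beanstalk.html
echo "<p>Typo fix</p>" >> beanstalk.html

# Left column = staged change, right column = unstaged change.
status_both="$(git status --short)"
echo "$status_both"          # MM beanstalk.html

# Stage the latest version; only the staged column remains.
git add beanstalk.html
status_staged="$(git status --short)"

# Commit, and the working directory is clean again.
git commit -q -m "update beanstalk page"
status_clean="$(git status --short)"
```

The “MM” line is the short form of seeing the same file under both “Changes to be committed” and “Changes not staged for commit”.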
The power of branches
Branching is an essential aspect of version control, and Git is no exception. Branches allow you to have a separate working environment to add new features, fix bugs, and test code without affecting the main source code. In this section, we’ll discuss how to create branches, their benefits, and how to use them on your team.
Creating branches may sound intimidating, but Git allows you to do this with ease. Whether you’re working on a new project or a project that already exists, you’ll likely need to make improvements or test code after it’s live. To help make this process efficient, you’ll want to have a solid branching workflow. First, let’s discuss the basics of branching, and then we can discuss workflows and strategies later.
You can create a branch by running this command:
git branch my-new-branch
To confirm that you've created the branch, run:
git branch
This command will give you a list of all the branches in your repository. By default, Git gives you a “master” branch. This branch is important, and is for production-ready code. We’ll discuss why later.
To delete a branch you can run:
git branch -d my-new-branch
Git offers a few protective layers when running this command. First, you cannot delete a branch that you’re currently working on. This would be equal to using a chainsaw to cut a table that you’re standing on, which is not a good situation. Git will warn you of this impending travesty by displaying an error message like this:
error: Cannot delete the branch 'my-new-branch' which you are currently on.
That means you’ll need to check out a different branch first, and then you can delete the branch you were previously using. But what is git checkout and how does it work?
The git checkout command allows you to switch between different branches that were created using the git branch command. You can run the git checkout command like this:
git checkout develop
Switched to branch 'develop'
This command tells Git to navigate to the “develop” branch.
Another protective layer is that Git will not allow you to delete a branch if it has changes that are not merged. You can force Git to delete a branch by running:
git branch -D my-new-branch
You’ll want to use this command with discretion. It should be used only once you have decided that the branch is not needed.
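Here is a sketch that exercises both protective layers in a scratch repository, so you can watch Git refuse before you force anything. The temporary directory, the empty commits, and the placeholder identity are assumptions for illustration. Note that the uppercase -D flag is the one that forces deletion, while the lowercase -m flag renames a branch.

```shell
# Scratch repository (temp location and identity are assumptions).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Ashley Harpp"
git config user.email "ashley@example.com"
git commit -q --allow-empty -m "my first commit"
initial_branch="$(git symbolic-ref --short HEAD)"

# Create a branch, switch to it, and add a commit that exists only there.
git branch my-new-branch
git checkout -q my-new-branch
git commit -q --allow-empty -m "work in progress"

# Layer 1: Git refuses to delete the branch you are currently on.
if git branch -d my-new-branch 2>/dev/null; then
  delete_current="allowed"
else
  delete_current="refused"
fi

# Layer 2: after switching away, -d still refuses because of unmerged work.
git checkout -q "$initial_branch"
if git branch -d my-new-branch 2>/dev/null; then
  delete_unmerged="allowed"
else
  delete_unmerged="refused"
fi

# Forcing the deletion with -D removes the branch anyway.
git branch -D my-new-branch >/dev/null
remaining="$(git branch --list my-new-branch)"
echo "current: $delete_current, unmerged: $delete_unmerged"
```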
Understanding how branches work
Now that you know how to create, list, and delete branches, let’s discuss how branches function within your project. First, it’s important to know that a branch is not a separate repository. Branches point to a set of commits. Creating a branch does not change or affect the repository or its history. If you create a new branch, you are creating a new pointer to your latest commit. This means if you want to add new features, fix bugs, and test code, you’ll want to create a new branch for each unique task, and work within that branch until it’s ready to be merged into your master branch. Now let’s discuss how to determine which branch you’re working on, and how to check out branches and make changes to them.
You can always check to see which branch you’re currently working on by running:
git branch
You should see something like this. The asterisk symbolizes which branch you’re working on.
* develop
  master
  my-new-branch
You may have noticed the master branch in the output. The master branch is the default working branch. It is your official code base, which Git creates for you once a repo has been initialized. We do not recommend using the master branch for testing purposes; it's only for production-ready code.
Now let’s make a commit so that we can put everything we just learned together. Let’s add a new file to our develop branch.
git checkout develop
git add about-us.html
git commit -m "adding about us page"
Now that we have added and committed the new file, let’s check the history by running the git log command.
commit e3306c3c804cdb3105a5ddcd0528415d3443ad9
Author: Ashley Harpp <firstname.lastname@example.org>
Date:   Fri Feb 19 16:24:03 2016 -0500

    adding about us page

commit 377bf6d3fedcdb02baea7a01d8914820d365db9c
Author: Ashley Harpp <ashleyharpp@ashleys-MBP.net>
Date:   Tue Feb 16 22:00:37 2016 -0500

    my first commit yay.
This commit will only show in the develop branch because that’s where Git is instructed to make the commit. If you change to the master branch, you’ll see that the log is different. Let’s switch to the master branch and check the log to see what commits are in that branch.
git checkout master
git log
commit 377bf6d3fedcdb02baea7a01d8914820d365db9c
Author: Ashley Harpp <ashleyharpp@ashleys-MBP.net>
Date:   Tue Feb 16 22:00:37 2016 -0500

    my first commit yay.
You may have noticed that the commit that added “about-us.html” is not showing in the log. This is because commits are recorded on the branch where they are created.
To summarize, branching gives you peace of mind. It confirms that any features or bug fixes that are currently in progress will live in their own independent line of development. Now that you know how to create branches and commit to them, how can you integrate those changes into the main source code? You can use the git merge command to add the new changes to your master branch.
At some point, you’ll want to combine your develop or feature branch with your code in production. You can do this by using the git merge command. Here’s how this would look in its simplest form.
git checkout master
git merge develop
By running the commands above, you are doing two things. First, you are telling Git that you want to switch to your master branch. Second, you are instructing Git to take all the commits from the develop branch and integrate them with the master branch. When merging, you cannot select which commits you want to merge. Instead, Git will look for the commits that do not exist in the current branch and integrate those.
The current branch will be updated to reflect the changes of the merge. The branch that was merged into the master (in this case, the develop branch) will remain unchanged. To keep your repository neat and tidy we recommend deleting a branch after it has been merged.
But what if your situation is more complex? What if you’re working on a feature branch, but then the master branch is being updated while your feature is still in development? This is when you may encounter a conflict. We’ll discuss merge conflicts in our advanced Git topics guide. In the meantime, we recommend reading over "Basic merge conflicts" from ProGit and dealing with merge conflicts from Tower. For now, we’ll focus on resolving simple problems you may encounter when merging.
- Creating a branch does not change or affect the repository history.
- Branches point to a set of commits.
- The git checkout command allows you to navigate to different branches.
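The branch, commit, and merge steps from this section can be sketched end to end in a scratch repository. The temporary directory, the file contents, and the placeholder identity are assumptions for illustration. Instead of reading git log by eye, the sketch counts commits with git rev-list --count so the difference between the two branches is easy to see.

```shell
# Scratch repository (temp location and identity are assumptions).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
git config user.name "Ashley Harpp"
git config user.email "ashley@example.com"

echo "home" > index.html
git add index.html
git commit -q -m "my first commit yay."

# Create the develop branch, switch to it, and commit a new file there.
git branch develop
git checkout -q develop
echo "about" > about-us.html
git add about-us.html
git commit -q -m "adding about us page"

# The new commit exists on develop only.
develop_count="$(git rev-list --count develop)"
master_count="$(git rev-list --count master)"
echo "develop: $develop_count commits, master: $master_count commits"

# Merge develop into master, then tidy up the merged branch.
git checkout -q master
git merge -q develop
merged_count="$(git rev-list --count master)"
git branch -d develop >/dev/null
```

Because master had no new commits of its own in this sketch, Git can simply move the master pointer forward to the develop commit.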
Resolving basic conflicts
While merging is a powerful tool, there are some boundaries. What happens when you’re trying to merge, but a line of code was changed in more than one branch? Which version does Git choose? The simple answer is, Git does not choose at all; you do. Let’s face it, merge conflicts can be downright scary. But once you understand how and why they occur, you’ll have a better idea of how to resolve them.
There are a few things you’ll want to understand when dealing with conflicts. First, conflicts only occur on your local machine and not on the server. Yes, a conflict may slow down productivity but it will only affect you — not your teammates or colleagues. When you have a merge conflict Git is saying, “hey! I see that there has been more than one change done to the same file. I need help deciding which file I need to merge.” Git is asking you to help it decide what to do.
One command you’ll want to be comfortable with during this period is git status. This command tells you which changes are tracked, untracked, staged, or not staged. Think of it as a messenger friend that will keep you in the loop about changes that you haven’t committed or forgot about. The git status command will tell you which files need to be resolved. When you run git status you may receive an output like this:
git status
# On branch develop
# You have unmerged paths.
#   (fix conflicts and run "git commit")
Now your messenger friend (via git status) is saying that you need to decide what the conflicted files should contain. You can edit the necessary files and then run git add to tell Git that you have resolved the conflict. At this point, you can commit those changes to create a merge commit and continue working. Of course, settling conflicts can be a bit more detailed. You can read more about how to deal with conflicts in your Git repo in our help article here.
Tip: If you’re in a bind and need to back out of a merge that has stopped on a conflict, you can do so by running:
git merge --abort
- Having a conflict means that Git needs help deciding what to do.
- Git status is your best friend.
- Merging is reversible.
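If you’d like to meet a conflict in a safe place first, this sketch creates one on purpose and then shows both ways out: aborting the merge, and resolving it. The temporary directory, the file greeting.txt and its contents, and the placeholder identity are assumptions for illustration.

```shell
# Scratch repository (temp location and identity are assumptions).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
git config user.name "Ashley Harpp"
git config user.email "ashley@example.com"

echo "Hello" > greeting.txt
git add greeting.txt
git commit -q -m "add greeting"
git branch develop

# Change the same line on master...
echo "Hello from master" > greeting.txt
git commit -q -am "master edit"

# ...and on develop.
git checkout -q develop
echo "Hello from develop" > greeting.txt
git commit -q -am "develop edit"

# The merge stops on a conflict; git status marks the file as unmerged (UU).
git merge master >/dev/null 2>&1 || true
conflict_status="$(git status --short)"
echo "$conflict_status"

# Way out 1: abort and return to the state before the merge.
git merge --abort
after_abort="$(git status --short)"

# Way out 2: merge again, settle the content by hand, stage it, and commit.
git merge master >/dev/null 2>&1 || true
echo "Hello from both branches" > greeting.txt
git add greeting.txt
git commit -q -m "merge master into develop"
second_parent="$(git rev-parse HEAD^2)"
master_tip="$(git rev-parse master)"
```

The final commit has two parents, which is what makes it a merge commit: the previous develop commit and the tip of master.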
Now we have a general idea of the function of branches, merging, and how to settle conflicts. But what if you’re working on a team where there are several feature branches, and bug fixes going on at once? How do you decide what to merge, and when to do it? There are many different branch workflows that exist, and we discuss some of them here.
Working with remote repositories
With Subversion, it’s easy to picture the “source of truth” for your code. It’s the code stored in a central location that teammates must check out from. With Git, the model is different, so there does not have to be a single source of truth.
However, with a distributed team, it’s a good practice to use a code hosting service and to treat it as your “source of truth”. All teammates can pull from and push to this master copy of your repository. This is where the concept of a remote repository comes into play. With Git, most of your work is done locally: you make changes to files, stage those changes, then commit them.
But with a remote repository, you can then push those changes to the remote repository so that all your team members can access it. And with a solid development process, teammates can conduct reviews and merge branches when appropriate.
Having the master repo in a location that everyone on a team can access gives the team the “source of truth” to work from and enables a robust development workflow. All the advantages of Git (offline access, speed) are still available, but with the added advantages of having a centralized copy of your code.
Creating a remote repository is usually as straightforward as clicking a “Create new repository” button in various code hosting services. But once your team has a remote repo in place, how do you get it to your computer? This is where cloning in Git comes into play.
git clone https://accountinfo.git.beanstalkapp.com/test-repo.git
When you clone a repository, you are copying all the data from your remote host to your working copy. You will also have access to all the commit and file history. This means, if something goes wrong, you are generally safe. You should be able to use any clone to restore the server to a previous state in time. If you’d like to get fancy and clone a repository into a specific local directory, you can run this command:
git clone https://accountinfo.git.beanstalkapp.com/test-repo.git mylocaldirectory
This command is like the first git clone command, except this one names a specific directory or folder to receive the cloned repository. There are many protocols that you can use to clone a repo. Another common method is using the SSH protocol. This means that the URL you’re using when running the git clone command does not include "https://". Instead, an SSH URL begins with "ssh://", or simply uses the shorter scp-like form user@host:path/to/repo.git (a generic pattern, not a real address).
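You can try both forms of git clone without a hosting account. In this sketch a bare repository on your own disk stands in for the remote host; that stand-in, the temporary directory, and the directory names are assumptions for illustration, not how Beanstalk is set up.

```shell
# A bare repository on local disk plays the role of the remote host.
workspace="$(mktemp -d)"
cd "$workspace"
git init -q --bare test-repo.git

# Form 1: Git names the new directory after the repository (test-repo).
git clone -q test-repo.git 2>/dev/null

# Form 2: you choose the directory name yourself.
git clone -q test-repo.git mylocaldirectory 2>/dev/null

# Each clone remembers where it came from under the remote name "origin".
origin_url="$(git -C mylocaldirectory remote get-url origin)"
echo "cloned from: $origin_url"
```

Git warns that the cloned repository is empty in this sketch, which is expected because nothing has been pushed to the stand-in remote yet.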
Pushing and Pulling
You’ve now made some changes to your Git repository on your computer. You have a remote repository in place and have your Git configuration pointed to it. How do you get your changes to your team? By doing a push:
git push remote_repo_name branch_name
This will take your changes and add them to the remote repository. Teammates who have access to this repo can now take your changes and add them to their own local copies of the repository.
As well, you can do the same with theirs. This activity is known as a “pull”. When you run the git pull command with the name of a remote, you sync changes made on the remote repo into your local repository. This is great when you are collaborating with other developers and want to mesh the changes they’ve made with yours.
But what if you only want to view what has been changed, and not sync the changes with your local repository? This is where you can use the git fetch command.
The git fetch command tells Git that you want to import commits from your remote repository. But instead of integrating the changes into your local branches, Git stores the commits in separate remote-tracking branches. This is another marvel of Git. Now you have the opportunity to review any code before merging it in with your local repository. To fetch from a remote, you can run:
git fetch remote-repo
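To see the difference between fetching and pulling, here is a sketch with two clones of the same remote. As before, a bare repository on local disk stands in for the hosted repo; that stand-in, the directory names mine and teammate, the file contents, and the placeholder identity are assumptions for illustration.

```shell
# A bare repository on local disk plays the role of the hosted remote.
workspace="$(mktemp -d)"
cd "$workspace"
git init -q --bare remote.git

# Your clone: commit locally, then push to the remote.
git clone -q remote.git mine 2>/dev/null
cd mine
git config user.name "Ashley Harpp"
git config user.email "ashley@example.com"
echo "home" > index.html
git add index.html
git commit -q -m "my first commit yay."
branch="$(git symbolic-ref --short HEAD)"
git push -q origin "$branch"

# A teammate clones the remote and receives that commit.
cd "$workspace"
git clone -q remote.git teammate

# You push a second commit.
cd "$workspace/mine"
echo "about" > about-us.html
git add about-us.html
git commit -q -m "adding about us page"
git push -q origin "$branch"

# The teammate fetches: the commit arrives, but their files do not change.
cd "$workspace/teammate"
git fetch -q origin
behind_after_fetch="$(git rev-list --count "HEAD..origin/$branch")"
if [ -f about-us.html ]; then file_after_fetch="present"; else file_after_fetch="absent"; fi

# The teammate pulls: the fetched commit is integrated into their branch.
git pull -q origin "$branch"
behind_after_pull="$(git rev-list --count "HEAD..origin/$branch")"
if [ -f about-us.html ]; then file_after_pull="present"; else file_after_pull="absent"; fi
echo "after fetch: $file_after_fetch, after pull: $file_after_pull"
```

Between the fetch and the pull, the teammate could inspect the incoming commit (for example with git log on the remote-tracking branch) before deciding to integrate it.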
Now that you understand the most basic components of Git, here’s a summary of what we’ve covered so far.
- The importance of version control, and why it matters. With Git, you can reverse changes, make multiple modifications, and track your history with one tool.
- We learned why Git makes a solid choice due to its flexibility, and how it compares with older VCS like SVN.
- We also discussed how to begin working on your local repository by setting up your git config, adding files to the staging area, and then committing them.
- We learned how branches function, the basics of merge conflicts, and how to resolve simple problems you may face when merging.
- We covered the basics of working with a remote repository by using the git clone, git push, git pull, and git fetch commands.
These commands are the tip of the iceberg when working with Git. There are countless Git workflows, and there’s a ton of flexibility when Git is utilized to its potential. Still, there are more complex and robust features that Git offers that we have yet to discuss. We’ll cover more advanced Git topics in the following guide.