Version Control System

Why use a version control system (VCS)?

  • Allows more than one developer to work on the same file simultaneously.
  • Keeps track of the revision history of a file and allows rolling back to a previous version if needed.
  • Assists in integrating the work done on a file by many developers simultaneously. In most cases, edits to the same file can be combined without losing any work. In rare cases, when conflicting edits are made to the same line of a file, the VCS requests human assistance in deciding what to do.

Repositories and working copies

Version control uses a repository (a database of changes) and a working copy where a developer does the work.

A working copy (sometimes called a checkout) is a developer's personal copy of the file. The developer makes arbitrary edits to this copy without affecting other developers. When the edits are complete, the developer commits the changes to the repository.

A repository is a database of all the edits to, and/or historical versions (snapshots) of, the file. The repository may contain edits that have not yet been applied to a developer's working copy. The developer can update the working copy to incorporate any new edits or versions that have been added to the repository since the last update.
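The relationship between a repository and its working copies can be sketched in a few lines of Python. This is a toy model, not how any real VCS is implemented: the class names `Repository` and `WorkingCopy`, the single-file snapshot list, and the developers Alice and Bob are all assumptions made for illustration.

```python
class Repository:
    """A toy repository: an ordered list of snapshots of one file."""

    def __init__(self, initial_text: str) -> None:
        self.snapshots = [initial_text]

    def latest_revision(self) -> int:
        return len(self.snapshots) - 1


class WorkingCopy:
    """A developer's private copy, checked out from a repository."""

    def __init__(self, repo: Repository) -> None:
        self.repo = repo
        self.revision = repo.latest_revision()
        self.text = repo.snapshots[self.revision]

    def commit(self) -> int:
        """Record the current text as a new snapshot in the repository."""
        self.repo.snapshots.append(self.text)
        self.revision = self.repo.latest_revision()
        return self.revision

    def is_out_of_date(self) -> bool:
        """True if the repository holds snapshots this copy has not seen."""
        return self.revision < self.repo.latest_revision()

    def update(self) -> None:
        """Take the newest snapshot (this sketch assumes no local edits)."""
        self.revision = self.repo.latest_revision()
        self.text = self.repo.snapshots[self.revision]


repo = Repository("hello\n")
alice = WorkingCopy(repo)
bob = WorkingCopy(repo)

alice.text = "hello, world\n"   # the edit touches only Alice's working copy
bob_before_commit = bob.text    # still "hello\n"

alice.commit()                  # the repository now holds an edit Bob lacks
bob_is_stale = bob.is_out_of_date()

bob.update()                    # Bob incorporates the new snapshot
```

The sketch deliberately leaves out merging: `update` simply replaces the text, which is only safe when the working copy has no uncommitted edits.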

Distributed and centralized version control

There are two general varieties of version control: centralized and distributed. Some popular version control systems are Mercurial (distributed), Git (distributed), Subversion (centralized), TFS (centralized), Perforce (centralized), and CVS (centralized).

The main difference between centralized and distributed version control is the number of repositories. In centralized version control, there is just one repository, and in distributed version control, there are multiple repositories.

In centralized version control, each developer gets his or her own working copy, but there is just one central repository. As soon as a developer commits, other developers can update and see the changes. For other developers to see the changes, two things must happen:

  1. The developer commits
  2. The other developers update

In distributed version control, each developer gets his or her own repository and working copy. After a developer commits, other developers have no access to the changes until the developer pushes them to the central repository. When a developer updates, he or she does not get other developers' changes unless those changes have first been pulled into his or her own repository. For other developers to see the changes, four things must happen:

  1. The developer commits
  2. The developer pushes
  3. The other developers pull
  4. The other developers update

Notice that the commit and update commands only move changes between the working copy and the local repository, without affecting any other repository. By contrast, the push and pull commands move changes between the local repository and the central repository, without affecting the working copy.
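The four steps of the distributed workflow can be traced with a small Python sketch. It is a toy model under stated assumptions: `Repo` and `Developer` are hypothetical names, a repository is reduced to a list of change descriptions, and `push` and `pull` simply copy that list, which real systems do not do.

```python
class Repo:
    """A toy repository: a list of change descriptions."""

    def __init__(self) -> None:
        self.commits: list[str] = []


class Developer:
    """Owns a local repository and a working copy, as in distributed VCS."""

    def __init__(self, central: Repo) -> None:
        self.central = central
        self.local = Repo()
        self.local.commits = list(central.commits)    # initial clone
        self.working_copy = list(self.local.commits)  # initial checkout

    def edit(self, change: str) -> None:
        self.working_copy.append(change)

    def commit(self) -> None:
        # working copy -> local repository; no other repository is touched
        self.local.commits = list(self.working_copy)

    def push(self) -> None:
        # local repository -> central repository (assumes no one else pushed)
        self.central.commits = list(self.local.commits)

    def pull(self) -> None:
        # central repository -> local repository; working copy is untouched
        self.local.commits = list(self.central.commits)

    def update(self) -> None:
        # local repository -> working copy (assumes no uncommitted edits)
        self.working_copy = list(self.local.commits)


central = Repo()
alice = Developer(central)
bob = Developer(central)

alice.edit("fix typo")
alice.commit()
after_commit = list(central.commits)     # central still has nothing

alice.push()
after_push = list(bob.local.commits)     # Bob's repository still has nothing

bob.pull()
after_pull = list(bob.working_copy)      # Bob's working copy is unchanged

bob.update()
after_update = list(bob.working_copy)    # only now does Bob see the change
```

Each intermediate variable captures the state just before the next step, which shows why skipping any one of the four steps leaves the other developer without the change.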

Conflicts

A conflict occurs when two developers make simultaneous, different changes to the same line in a file. In this case, the VCS cannot automatically decide which of the two edits to use (or a combination of them, or neither!). Manual intervention is required to resolve the conflict.
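The rule above can be illustrated with a minimal three-way merge in Python. This is a sketch under a strong simplifying assumption, namely that both developers only change lines in place so all three versions have the same number of lines. The function name `merge_lines` is hypothetical, and real merge tools use diff algorithms that also handle inserted and deleted lines.

```python
def merge_lines(base, ours, theirs):
    """Three-way merge of line lists that all have the same length.

    Returns (merged, conflicts). A conflicting line is left as None in
    merged, and its index is recorded in conflicts for a human to resolve.
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this sketch only handles in-place line edits")

    merged, conflicts = [], []
    for index, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            merged.append(o)          # both sides agree (or neither changed)
        elif o == b:
            merged.append(t)          # only the other developer changed it
        elif t == b:
            merged.append(o)          # only we changed it
        else:
            merged.append(None)       # both changed it differently: conflict
            conflicts.append(index)
    return merged, conflicts


base = ["a = 1", "b = 2", "c = 3"]
ours = ["a = 10", "b = 2", "c = 30"]     # we edited lines 0 and 2
theirs = ["a = 1", "b = 20", "c = 33"]   # they edited lines 1 and 2

merged, conflicts = merge_lines(base, ours, theirs)
print(merged)      # ['a = 10', 'b = 20', None]
print(conflicts)   # [2]
```

Lines 0 and 1 merge automatically because only one side touched each of them. Line 2 was changed differently by both sides, so the sketch refuses to choose and reports it.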

Merging changes

In a centralized version control system, a developer can update the working copy at any moment, even if it contains uncommitted changes. The VCS merges the developer's uncommitted changes in the working copy with the ones in the repository. The implicit merging that a centralized VCS performs when a developer updates is a common source of confusion and mistakes.

Version control best practices

  • Use a descriptive commit message

A descriptive message indicates the purpose of the change to anyone examining it. It also helps anyone looking for changes related to a given concept, because they can search through the commit messages.

  • Make each commit a logical unit

Each commit should have a single purpose and should completely implement that purpose. This makes it easier to locate the changes related to some particular feature or bug fix, to see them all in one place, to undo them, to determine the changes that are responsible for buggy behavior, etc. The utility of the version control history is compromised if one commit contains code that serves multiple purposes, or if code for a particular purpose is spread across multiple different commits.

During the course of one task, a developer may notice another issue and want to fix it too. If a single file contains changes that serve multiple purposes, the developer may need to save all the edits, then re-introduce them in logical chunks, committing along the way.

  • Incorporate others’ changes frequently

Work with the most up-to-date version of the files possible. That means a developer should update the working copy very frequently.

When two developers make conflicting edits simultaneously, manual intervention is required to resolve the conflict. But if someone else has already completed a change before a developer even starts to edit, it is a huge waste of time to create, then manually resolve, conflicts. The developer would have avoided the conflicts if the working copy had already contained the other developer's changes before editing started.

  • Share your changes frequently

The reason is the same as the reason for incorporating others’ changes frequently.

  • Coordinate with your co-workers

If you plan to make significant changes to (a part of) a file that others may be editing, coordinate with them so that one of you can finish work (commit) before the other gets started. This is the best way to avoid conflicts. A special case of this is any change that touches many files (or parts of them), which requires you to coordinate with all your teammates.

  • Don’t commit generated files

Version control is intended for files that people edit. Generated files should not be committed to version control. For example, do not commit binary files that result from compilation, such as .dll files or .exe files. Generated files are not necessary in version control; each user can re-generate them (typically by running a build program such as MSBuild or NAnt). Generated files can bloat the version control history (the size of the database that is stored in the repository). Eventually, this degrades the performance of the version control system.
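One way to apply this rule is to filter the list of candidate files before committing. The Python sketch below is only an illustration: the pattern list, the `bin/` and `obj/` output directories, and the function names are assumptions. In practice most version control systems provide their own ignore mechanism, which is the better place for such patterns.

```python
from fnmatch import fnmatch

# Hypothetical patterns for build output; adjust to the project's layout.
GENERATED_PATTERNS = ["*.dll", "*.exe", "bin/*", "obj/*"]


def is_generated(path: str) -> bool:
    """True if the path matches any pattern for generated files."""
    return any(fnmatch(path, pattern) for pattern in GENERATED_PATTERNS)


def files_to_commit(paths: list[str]) -> list[str]:
    """Keep only the files that people edit by hand."""
    return [path for path in paths if not is_generated(path)]


candidates = [
    "src/Program.cs",
    "bin/Debug/app.exe",
    "lib/util.dll",
    "README.txt",
]
print(files_to_commit(candidates))   # ['src/Program.cs', 'README.txt']
```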

  • Understand your merge tool

The least pleasant part of working with version control is resolving conflicts. If you follow best practices, you will have to resolve conflicts relatively rarely.

You are most likely to create conflicts at a time you are stressed out, such as near a deadline. You do not have time, and are not in a good mental state, to learn a merge tool. So, you should make sure that you understand your merge tool ahead of time. When an actual conflict comes up, you don’t want to be thrown into an unfamiliar UI and make mistakes. Practice on a temporary repository to give yourself confidence.
