Lab 2: Version Control
Overview
In this lab, you will learn the basics of Mercurial, a distributed version control system. You will use Mercurial for the remainder of this class to safely back up your lab work and store its history, including saving a final version for grading.
Please note: From this point on, I will only grade your work saved in Mercurial. Emailed project code, USB keys, printouts, etc... are not acceptable methods of submitting your work and will be returned without being graded.
Pre-Lab
Before class, create an account at BitBucket.org, a free service that can host both Mercurial and Git repositories. BitBucket is a "freemium" service. You can have an unlimited number of private repositories in your account, but you can only share your repositories with a limited number of users. Thus, the free service is suitable only for small groups, but is sufficient for typical class projects.
- Go to the BitBucket.org website
- Click the Signup button
- Enter your name, email, and password to create a new account
This is your personal account.
My goal is that you will become comfortable using version control in this class and will continue to use this tool in future classes.
Next, create a private repository (i.e. a version control folder) in your account. This repository will hold all your course work for this class, including C source code, MIPS assembly code, Python code, PDF documents, plain text files, etc. Inside the repository, you can create as many nested folders as you want to organize things. Although the repository is private, you will grant me read-only access to it so that I can grade your work.
- Choose Repositories->Create Repositories (if there isn't a big obvious create button on the screen already)
- Enter this project name: 2015_fall_ecpe170 (Why did I choose this name? I hope you'll use this tool for other courses in future semesters / years, and wanted an organized naming scheme)
- Make sure the private box is checked - your work is not to be shared by default
- Select the repository type: Mercurial
- Select the language type: C (This just helps the web source code browser. You can - and will - be checking all kinds of code into this project, including C, MIPS, and Python)
- Enter a description: ECPE 170 (or whatever you want)
- Select Create Repository
Finally, add a special instructor account as a read-only user on the private repository you just created. For grading, I will use the last commit that was pushed to your repository before the posted deadline.
- While on your repository web page, select the Gears Wheel (icon)
- Select Access Management on the left side
- Under the Users area, enter the username: pacific_ecpe170 (The drop-down entry should be labeled "Pacific ECPE 170"). Click Add.
- Select the Read button to grant this instructor account read-only access
Checkpoint to obtain pre-lab credit - due by start of class:
Email me ( jshafer -at- pacific.edu) the "hg clone" command displayed under the "Clone" button on the Overview tab of your repository. Be sure to select the HTTPS tab.
This command should look like this: hg clone https://yourname@bitbucket.org/yourname/2015_fall_ecpe170
I will use this command to get an initial (empty) copy of your coursework repository, and pull the latest updates throughout the semester for grading.
I will also grant you read-only access to a repository containing initial code for future labs.
Lab Part 1 - Using your Personal Repository
Assuming that you have already created your repository at BitBucket (or at one of any number of other online services), the first thing to do is make a copy of the repository on your local computer. This action -- making a complete copy of a repository -- is called cloning.
When cloning a repository, you need to determine two things:
- What is the repository I want to clone, and where is it stored?
(The hg clone command emailed in the pre-lab contains the answer to this question)
- Where do I want to place the repository on my local computer?
(My personal preference is a folder in my home directory with an obvious name, so I know anything inside it is under version control)
First, make sure you are at your home directory. Then, create a directory to hold all of your repositories, and then enter it.
unix> cd
unix> mkdir bitbucket
unix> cd bitbucket
Now, inside the nice obvious "bitbucket" directory, place a clone of your repository. (Make sure to replace both occurrences of yourname in the URL with your actual BitBucket username):
unix> hg clone https://yourname@bitbucket.org/yourname/2015_fall_ecpe170
http authorization required
realm: Bitbucket.org HTTP
user: yourname
password:
destination directory: 2015_fall_ecpe170
no changes found
updating to branch default
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
A side note: Why is the program for Mercurial named hg? Hg is the chemical symbol for the element Mercury... not the best name, but it is very short and fast to type!
Think of this cloned repository as your own private sandbox. It has a complete copy of all files in your project. You can add, modify, and delete files here to your heart's content, and it won't affect the online repository until you explicitly upload your changes.
Now, enter the folder for your new repository:
unix> cd 2015_fall_ecpe170
Inside the repository, open a configuration file (hgrc) for Mercurial in a hidden directory (.hg):
unix> gedit .hg/hgrc &
Append the following lines to the bottom of the existing configuration file, save it, and close the editor. Be careful not to delete the existing content! Note that the username key must remain as written; only its value changes.
[ui]
username = Your Name <your@email.com>
For example, on my computer, the complete hgrc file looks like:
[paths]
default = https://jshafer@bitbucket.org/jshafer/2015_fall_ecpe170
[ui]
username = Jeff Shafer <my.real.email@goes.here>
Inside the repository, let's create a directory structure to keep organized.
unix> mkdir lab{02..12..1}
This uses "brace expansion" in the Bash shell to generate the list of names lab02, lab03, lab04, and so on before mkdir runs, so one command creates every directory. (The trailing ..1 is the step size and can be omitted.) It is the equivalent of this longer command: mkdir lab02 lab03 lab04 lab05 lab06 lab07 lab08 lab09 lab10 lab11 lab12
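A low-risk way to see what a brace expression will turn into is to hand it to echo before giving it to mkdir. The sketch below assumes Bash 4 or newer (zero-padded ranges are not available in older versions or in plain sh); the variable name expanded is only for illustration.

```shell
#!/usr/bin/env bash
# Preview the expansion: echo prints the words the shell generates,
# without creating anything on disk.
expanded=$(echo lab{02..12})
echo "$expanded"

# The optional third field is the step size, so this prints only
# every other name in the range.
echo lab{02..12..2}
```

Once the preview looks right, replacing echo with mkdir creates the directories.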
You will use this one repository for all of your work in ECPE 170.
Whenever you work on a lab, be sure you are working inside the corresponding repository lab folder created above.
For other classes, create separate repositories to stay organized.
By design, Mercurial won't track empty folders. To bypass this -- and to eliminate any possibility of grading confusion -- we're going to put a text file named author containing your name in each folder. The echo command (as the name implies) writes the quoted string to standard output. The greater-than sign (>) redirects that output to the specified file, which is created if it doesn't exist and overwritten if it does. Repeat once per folder.
unix> echo "YOUR NAME" > lab02/author
unix> echo "YOUR NAME" > lab03/author
unix> echo "YOUR NAME" > lab04/author
unix> echo "YOUR NAME" > lab05/author
unix> echo "YOUR NAME" > lab06/author
unix> echo "YOUR NAME" > lab07/author
unix> echo "YOUR NAME" > lab08/author
unix> echo "YOUR NAME" > lab09/author
unix> echo "YOUR NAME" > lab10/author
unix> echo "YOUR NAME" > lab11/author
unix> echo "YOUR NAME" > lab12/author
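Typing eleven near-identical commands invites typos, so the same result can be sketched with a shell loop. The version below works in a scratch directory (created with mktemp, an assumption made so that it cannot disturb your real repository) and creates only three folders to keep it short; inside your repository you would need just the for loop.

```shell
# Work in a throwaway directory so the example is safe to run anywhere.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir lab02 lab03 lab04

# The glob lab*/ matches every lab directory; each match already ends in "/".
for dir in lab*/; do
    echo "YOUR NAME" > "${dir}author"
done

# List the files that were created.
ls lab*/author
```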
At this point, you have a simple directory structure and some files in your repository.
Go to the BitBucket.org website and log in. Select your repository. View your repository files by clicking on the Source tab. Is anything there?
Lab Report:
(1) Take a Screenshot (using the Ubuntu Screenshot tool) showing what the BitBucket source view looks like at this point.
By design, you have to explicitly tell Mercurial when to save a version, and when to copy it from your local repository to a remote repository, such as the BitBucket site. Until you do that, there will be nothing shown at the BitBucket website. Let's start that process now.
It is a good practice to commit work to the version control system frequently - whenever you are at a convenient stopping point.
First, check and see the status of your current repository:
unix> hg status
? lab02/author
? lab03/author
? lab04/author
? lab05/author
? lab06/author
? lab07/author
? lab08/author
? lab09/author
? lab10/author
? lab11/author
? lab12/author
The question marks indicate files that are not currently tracked by the version control system. You need to explicitly tell Mercurial to track a file by using the add command - otherwise its history will not be recorded. Let's add all of the "author" files now:
unix> hg add lab*/author
Re-run the hg status command now, and note the difference:
unix> hg status
A lab02/author
A lab03/author
A lab04/author
A lab05/author
A lab06/author
A lab07/author
A lab08/author
A lab09/author
A lab10/author
A lab11/author
A lab12/author
Instead of a "?" indicating an untracked file, you get an "A" indicating a newly-added file.
If you forget what the little letters in front of each filename mean, you can always find out from the built-in Mercurial help system:
unix> hg help status | less
Simply adding a file is not enough. Mercurial won't track every keystroke you make. Rather, it will save snapshots whenever you explicitly tell it to via the commit command. Let's do that now:
unix> hg commit -m "Creating initial directory structure for class repository"
The commit command saves a snapshot of all files that Mercurial is tracking (via prior use of the add command). This snapshot is saved to your local repository. Notice my use of the optional -m argument to specify the commit message. This message appears in the log of all commits made to the repository. Imagine you've been working on a large project for several weeks, and have dozens of commits. You want to look at the code right before you made a major change to the project. Having descriptive commit messages makes it easier to find the right place in the project history.
Re-run the hg status command now, and note that the newly added files no longer appear. This is because you've told Mercurial to save a snapshot of their contents, and they have not been modified since the last snapshot.
unix> hg status
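The whole untracked, added, committed cycle can be rehearsed in a scratch repository before you touch your coursework. The sketch below assumes a throwaway repository made with hg init, and it passes a placeholder author with -u because the scratch repository has no hgrc of its own.

```shell
sandbox=$(mktemp -d)
hg init "$sandbox/demo"
cd "$sandbox/demo"

echo "YOUR NAME" > author
before_add=$(hg status)      # the file is untracked at this point
hg add author
after_add=$(hg status)       # the file is now scheduled to be added
hg commit -u "Sandbox User <sandbox@example.com>" -m "Add author file"
after_commit=$(hg status)    # nothing is left to report

printf 'before add:   %s\nafter add:    %s\nafter commit: [%s]\n' \
    "$before_add" "$after_add" "$after_commit"
```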
Go to the BitBucket.org website again. View your repository files by clicking on the Source tab. Is anything there?
Lab Report:
(2) Take a Screenshot (using the Ubuntu Screenshot tool) showing what the BitBucket source view looks like at this point.
Mercurial is what is known as a Distributed Version Control System. Essentially, the repository on your local disk is an independent entity from the repository on the BitBucket server. In order to have a good backup, you need to push from your repository to the BitBucket repository.
unix> hg push
pushing to https://yourname@bitbucket.org/yourname/2015_fall_ecpe170
http authorization required
realm: Bitbucket.org HTTP
user: yourname
password:
searching for changes
remote: adding changesets
remote: adding manifests
remote: adding file changes
remote: added 1 changesets with 11 changes to 11 files
Go to the BitBucket.org website again. View your repository files by clicking on the Source tab. Is anything there? There should be!
Lab Report:
(3) Take a Screenshot showing what the BitBucket source view looks like at this point.
(4) Take a Screenshot showing what the BitBucket version history looks like at this point (under the Commits tab)
Note: Do you see warnings like this when you use Mercurial? "bitbucket.org certificate with fingerprint 24:9c:45:8b:9c:aa:ba:55:4e:01:6d:58:ff:e4:28:7d:2a:14:ae:3b not verified (check hostfingerprints or web.cacerts config setting)". If so, add two lines to your hgrc file to specify which HTTPS certificate authority to use for verification. For Ubuntu and other Debian-based Linux distributions, the appropriate lines to add are:
[web]
cacerts = /etc/ssl/certs/ca-certificates.crt
Let's continue working in our repository. Enter the lab02 folder:
unix> cd lab02
Create a text file called reasons_to_use_vcs.txt and open it in an editor:
unix> gedit reasons_to_use_vcs.txt &
Go read the discussion on version control systems at http://stackoverflow.com/questions/1408450/why-should-i-use-version-control.
In your text file, briefly summarize three reasons to use version control systems in the form of a bulleted list.
This is a good point to add the new file to Mercurial, and check it in:
unix> hg add reasons_to_use_vcs.txt
unix> hg status
A lab02/reasons_to_use_vcs.txt
unix> hg commit -m "Lab 2: Added file on why version control is awesome"
unix> hg push
I always run the hg status command before an hg commit. Why? The status command will show you all of the new/modified/deleted files that will be included as part of the commit. Further, it shows you any untracked files that you forgot to add (by mistake).
Lab Report:
(5) Take a Screenshot showing what the BitBucket version history looks like at this point (under the Commits tab)
Let's see how Mercurial handles existing files that are modified.
Open the text file and add two more reasons to use version control to your bulleted list. Save the file, and exit.
What is the status of the repository now?
unix> hg status
M lab02/reasons_to_use_vcs.txt
? lab02/reasons_to_use_vcs.txt~
The "M" indicates a modified file, reflecting the edit you just made. The second line (for the file reasons_to_use_vcs.txt~) is an annoyance. Gedit isn't smart enough to realize we're using a VCS, and thus it saves the prior file version in a file ending with a tilde (~) character. Just ignore it. Better text editors (Emacs, anyone?) don't have this problem.
The Mercurial diff command can show you the difference between the current version of a file, and the file as last committed to the repository. In other words, what has changed?
unix> hg diff
The output of this command is in "diff" (difference) format, the format produced by the common Unix diff utility. Let's see if you can deduce what each line represents.
Lab Report:
(6) What does the line starting in --- represent?
(7) What does the line starting in +++ represent?
(8) What do the lines starting in + represent?
(9) Can you guess what a line starting in - would represent? (Depending on the edits you made, you might have a few of these lines too)
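To experiment with the format outside of Mercurial, the standalone diff utility produces the same style of output when given the -u (unified) flag. The file names and contents below are made up for illustration; try predicting the output before running it.

```shell
sandbox=$(mktemp -d)
cd "$sandbox"

# Two versions of the same short list: one line changed, one line added.
printf 'Reason one\nReason two\nReason three\n' > old.txt
printf 'Reason one\nReason 2\nReason three\nReason four\n' > new.txt

# diff exits with status 1 when the files differ, which is not an error here.
diff -u old.txt new.txt > changes.diff || true
cat changes.diff
```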
Now commit your updated work:
unix> hg commit -m "Lab 2: Added additional reasons on why version control is awesome"
unix> hg push
Version control systems (like Mercurial) provide special commands to copy or move files. You should use these special commands, instead of the generic Unix cp or mv commands, because they preserve version history. (Thus, for example, if you rename a file, you could see its history back through its original filename).
To move a file within a Mercurial repository, use: hg mv <filename> <destination_filename>
To copy a file within a Mercurial repository, use: hg cp <filename> <destination_filename>
There are more topics that we haven't covered in this introductory tutorial. For example:
- Branching, Merging, and resolving Conflicts - In this scenario, different users modify the same files at the same time, committing them to their own private repositories, and then trying to combine their work together at a later date. That is really what these tools are designed to help with!
- Reverting to an earlier version of a file (or project)
Lab Report:
(10) What benefits would a large team of developers get from version control? Identify at least two.
(11) What benefits would a single developer (working alone) get from version control? Identify at least two.
(12) What kind of files should you put in version control?
(13) What kind of files should you not put in version control? Why?
(14) What is the difference between an add, a commit, and a push?
(15) Why is it better to use hg cp instead of plain-old regular cp to copy a file within the repository?
(Optional) Lab report:
(1) How would you suggest improving this lab in future semesters?
Lab Report Submission: Check this lab report into your BitBucket.org personal repository (under the lab02 folder) by the posted deadline. Reports must be submitted as PDF files, not as LibreOffice files (Choose File->Export as PDF).
Lab Part 2 - Using the Class Repository
In ECPE 170, you will use two different repositories:
- Your personal repository used for your classwork and submitting assignments. You have write access to this repository.
- The class repository used to obtain the latest boilerplate code. You have read-only access to this repository.
By placing boilerplate code in a repository instead of .tar.gz files on the website, it will be easier to update them with bug fixes as the class progresses.
Clone the class repository now. (Make sure to replace yourname with your BitBucket username):
unix> cd ~/bitbucket
unix> hg clone https://yourname@bitbucket.org/shafer/2015_fall_ecpe170_boilerplate
http authorization required
realm: Bitbucket.org HTTP
user: yourname
password:
destination directory: 2015_fall_ecpe170_boilerplate
Get an authorization error? Check with the instructor and make sure your username was granted read-permissions on this repository.
At the beginning of each future lab, you will use the following process to obtain the boilerplate code:
- Pull the latest version of the class repository (because I may have added or updated files)
- Update your current working directory of the class repository - Without this, you won't see the new/updated files!
- Copy the files you want from the class repository into your personal repository
- Add the new files to version control in your personal repository
- Commit the new files in your personal repository, so you can easily go back to the original starter code if necessary.
- Push the version to the bitbucket.org website
- Begin work on the lab!
Lab Part 3 - Solidifying Your Knowledge
There is no formal post-lab. However, reading these two articles on version control and distributed version control systems will help solidify your hands-on experience today, as well as give you more exposure to branching and merging. They're easy to read and have nice visual diagrams on these abstract concepts.