3 Git and GitHub
Before we start: everybody make sure you have Git installed. Open a terminal and type:
git --version
If Git is not installed, follow the instructions in the textbook.
Goal for the day: create a repository, push something to the repository, connect RStudio to GitHub, clone the class notes.
We want to avoid this:
This is particularly true when more than one person is collaborating on editing the file. It is even more important when there are multiple files, as there are in software development and, to some extent, data analysis.
Git is a version control system that provides a systematic approach to keeping versions of files.
But we have to learn some things:
I use < > to denote a placeholder. So if I say <filename>, what you eventually type is the filename you want to use, without the < >.
3.1 Why use Git and GitHub?
Sharing.
Collaborating.
Version control.
We focus on the sharing aspects of Git and GitHub, but introduce some of the basics that permit you to collaborate and use version control.
3.2 What is Git?
3.3 What is GitHub?
Basically, it’s a service that hosts the remote repository (repo) on the web. This facilitates collaboration and sharing greatly.
There are many other features, such as:
- a recognition system: rewards, badges, and stars, for example.
- hosting web pages, like the class notes for example.
- forks and pull requests,
- issue tracking
- automation tools
It has been described as a social network for software developers.
The main tool behind GitHub is Git.
This is similar to how the main tool behind RStudio is R.
3.4 GitHub accounts
Once you have a GitHub account, you are ready to connect Git and RStudio to this account.
A first step is to let Git know who we are. This will make it easier to connect with GitHub. We start by opening a terminal window in RStudio (remember you can get one through Tools in the menu bar). Now we use the git config command to tell Git who we are. We will type the following two commands in our terminal window:
git config --global user.name "Your Name"
git config --global user.email "your@email.com"
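To check that Git stored your identity, you can read the values back. The sketch below is only an illustration: it uses a throwaway repository and repository-level settings (no --global) so it does not touch your real configuration, and the name and email are placeholders.

```shell
# Work in a throwaway repository so the global settings are left untouched
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Without --global, the setting applies to this repository only
git config user.name "Your Name"
git config user.email "your@email.com"

# Read the values back to confirm they were stored
git config --get user.name
git config --get user.email
```

With the --global versions above, the same check is git config --global --get user.email.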
Consider adding a profile README.md. Instructions are here. It looks like this.
3.5 Repositories
You are now ready to create a GitHub repository (repo). This will be your remote repo.
The general idea is that you will have at least two copies of your code: one on your computer and one on GitHub. If you add collaborators to this repo, then each will have a copy on their computer. The GitHub copy is usually considered the main (previously called master) copy that each collaborator syncs to. Git will help you keep all the different copies synced.
Let’s go make one on GitHub…
Then create a directory on your computer. This will be the local repo, which we connect to the GitHub repository.
First, copy the location of your GitHub repository.
It should look something like this:
https://github.com/your-username/your-repo-name.git
git init
git remote add origin <remote-url>
Now the two are connected.
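One way to confirm the connection is to list the remotes. This is a minimal sketch that assumes a throwaway directory and the placeholder URL from above; adding a remote only records the URL, so nothing is contacted over the network at this point.

```shell
# Create a throwaway local repo (the directory is only for illustration)
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Record the GitHub location under the conventional name "origin"
git remote add origin https://github.com/your-username/your-repo-name.git

# List the remotes with their URLs (one line for fetch, one for push)
git remote -v
```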
3.6 Overview of Git
The main actions in Git are to:
- pull changes from the remote repo, in this case the GitHub repo
- add files or, as we say in Git lingo, stage files
- commit changes to the local repo
- push changes to the remote repo, in our case the GitHub repo
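The four actions can be sketched end to end as follows. To keep the example runnable offline, it assumes a bare repository in a temporary directory as a stand-in for the GitHub repo; the file name, identity, and commit message are made up for illustration.

```shell
# Stand-in for GitHub: a bare repository in a temporary directory (assumption)
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"

# Local repo connected to the stand-in remote
git init --quiet "$work/local"
cd "$work/local"
git config user.name "Your Name"
git config user.email "your@email.com"
git remote add origin "$work/remote.git"

echo "hello" > notes.txt
git add notes.txt                       # stage the file
git commit --quiet -m "Add notes.txt"   # commit to the local repo
git branch -M main                      # make sure the branch is called main
git push --quiet -u origin main         # push to the remote repo
git pull --quiet                        # pull (nothing new to bring in here)
```

In a typical session the pull comes first, so you start from the latest version before making changes.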
3.6.1 The four areas of Git
3.6.2 Status
git status <filename>
3.6.3 Add
Use git add to put a file in the staging area.
git add <filename>
git status <filename>
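To see what staging does, compare the status before and after the add. The sketch below assumes a throwaway repo and a made-up file called notes.txt, and uses the --short option to keep the output to one line per file.

```shell
# Throwaway repo with one new file (names are only for illustration)
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
echo "first draft" > notes.txt

git status --short notes.txt   # "??" marks a file Git is not tracking yet
git add notes.txt
git status --short notes.txt   # "A" marks a new file that is now staged
```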
3.6.4 Commit
Use
git commit -m "must add comment"
to move all the added files to the local repository. The files are now tracked, and a copy of this version is kept going forward. This is like adding V1 to your filename.
You can commit files directly without using add by explicitly listing the files at the end of the commit command. This works only for files that Git already tracks:
git commit -m "must add comment" <filename>
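Both ways of committing can be sketched as follows, assuming a throwaway repo, a made-up file notes.txt, and a placeholder identity set at the repository level. The first commit needs add because the file is new; the second names the file directly because it is already tracked.

```shell
# Throwaway repo with a repository-level identity (placeholders)
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Your Name"
git config user.email "your@email.com"

# First version: stage, then commit
echo "first draft" > notes.txt
git add notes.txt
git commit --quiet -m "Add first draft of notes"

# Second version: the file is tracked, so we can name it in the commit
echo "second draft" > notes.txt
git commit --quiet -m "Revise notes" notes.txt

# One line per commit, newest first
git log --oneline
```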
3.6.5 Push
To move our commits to the upstream (remote) repo, we use
git push -u origin main
The -u flag sets the upstream, so in the future you can simply use git push to push changes. Going forward we can just type:
git push
Here we need to be careful: if we are collaborating, this will affect the work of others. It might also create a conflict.
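Here is a sketch of what can go wrong when two people push to the same repo. It assumes a bare repository in a temporary directory as a stand-in for GitHub and two made-up collaborators, A and B. B's push is rejected because the remote has a commit B does not have yet, so B pulls (here with an explicit merge) and pushes again.

```shell
# Stand-in for GitHub (assumption so the sketch runs offline)
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"

# Collaborator A creates the first commit and pushes it
git init --quiet "$work/a"
cd "$work/a"
git config user.name "Collaborator A"
git config user.email "a@example.com"
git remote add origin "$work/remote.git"
echo "from A" > a.txt
git add a.txt
git commit --quiet -m "Add a.txt"
git branch -M main
git push --quiet -u origin main

# Collaborator B gets a copy at this point
git clone --quiet --branch main "$work/remote.git" "$work/b"

# A pushes one more commit that B does not have
echo "more from A" >> a.txt
git commit --quiet -m "Extend a.txt" a.txt
git push --quiet

# B commits without pulling first, so the push is rejected
cd "$work/b"
git config user.name "Collaborator B"
git config user.email "b@example.com"
echo "from B" > b.txt
git add b.txt
git commit --quiet -m "Add b.txt"
if ! git push --quiet; then
  echo "Push rejected: the remote has commits we do not have yet"
  git pull --quiet --no-rebase --no-edit   # merge the remote changes first
  git push --quiet                         # now the push goes through
fi
```

Because A and B edited different files, the merge is automatic. If both had edited the same lines, Git would stop and ask B to resolve the conflict by hand.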
3.6.6 Fetch
To update our local repository with the latest changes from the remote one, without yet changing our working files, we use
git fetch
3.6.7 Merge
Once we are sure this is good, we can merge the fetched changes into our local files:
git merge
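The fetch, inspect, merge sequence can be sketched like this. The example assumes a bare repository in a temporary directory standing in for GitHub and a second copy playing the role of a collaborator; all names are made up. The merge target origin/main is spelled out here to make explicit what is being merged.

```shell
# Stand-in for GitHub (assumption so the sketch runs offline)
work="$(mktemp -d)"
git init --quiet --bare "$work/remote.git"

# Our copy: create the initial commit and push it
git init --quiet "$work/mine"
cd "$work/mine"
git config user.name "Your Name"
git config user.email "your@email.com"
git remote add origin "$work/remote.git"
echo "line 1" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes.txt"
git branch -M main
git push --quiet -u origin main

# A collaborator clones, changes the file, and pushes
git clone --quiet --branch main "$work/remote.git" "$work/collaborator"
cd "$work/collaborator"
git config user.name "Collaborator"
git config user.email "collaborator@example.com"
echo "line 2" >> notes.txt
git commit --quiet -m "Add a second line" notes.txt
git push --quiet

# Back in our copy: fetch downloads the new commit but leaves our files alone
cd "$work/mine"
git fetch --quiet
git log --oneline HEAD..origin/main   # commits on the remote that we do not have yet
git merge --quiet origin/main         # bring them into our local files
cat notes.txt
```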
3.6.8 Pull
It is common to want to skip the separate fetch and merge steps and just update everything. For this we use
git pull
3.6.9 Checkout
If you want to restore a specific file to the version stored in the repository, you can use:
git checkout <filename>
Note that this restores the file from your local repository, not directly from the remote. It also overwrites any uncommitted changes to that file in your working directory without warning, so use it only if you are sure you want to get rid of your local changes.
You can also use checkout to retrieve an older version:
git checkout <commit-id> <filename>
You can get the commit-id either on the GitHub webpage or using
git log <filename>
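As an illustration, the sketch below commits two versions of a made-up file, notes.txt, and then brings back the first one. It assumes a throwaway repo and a placeholder identity, and uses HEAD~1 (the commit before the latest one) in place of a commit-id copied from the log.

```shell
# Throwaway repo with a repository-level identity (placeholders)
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Your Name"
git config user.email "your@email.com"

# Commit two versions of the same file
echo "version 1" > notes.txt
git add notes.txt
git commit --quiet -m "Version 1"
echo "version 2" > notes.txt
git commit --quiet -m "Version 2" notes.txt

# The log lists the commit-ids for this file, newest first
git log --oneline notes.txt

# Look up the id of the older commit and restore the file from it
old_commit="$(git rev-parse HEAD~1)"
git checkout "$old_commit" notes.txt
cat notes.txt
```

After this, notes.txt holds the older content and the change is staged, so committing it would record the old version as the newest one.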
If you are asked for a password when connecting or pushing, you will want to read this and avoid it. Git becomes impractical to use if you have to enter a password each time you push.
3.7 Branches
Git can be even more complex. We can have several branches. These are useful for working in parallel or testing stuff out that might not make the main repo.
We won't go over this. But you should at least know these three commands:
git remote -v
git branch
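To get a feel for what the branch listing shows, here is a small sketch. It assumes a throwaway repo with one commit and creates a hypothetical second branch called experiment just so there is something to list; the asterisk in the output marks the branch you are currently on.

```shell
# Throwaway repo with one commit (names are only for illustration)
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet
git config user.name "Your Name"
git config user.email "your@email.com"
echo "hello" > notes.txt
git add notes.txt
git commit --quiet -m "Add notes.txt"
git branch -M main

# Create a second branch without switching to it, then list the branches
git branch experiment
git branch
```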
3.8 Clone
If you want your own local copy of an existing remote repo, you can clone it:
git clone <repo-url>
pwd
mkdir git-example
cd git-example
git clone https://github.com/rairizarry/murders.git
cd murders
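Cloning also sets up the connection to the remote for you, so there is no need for git init or git remote add. The sketch below shows this offline by cloning a small local repository that stands in for one hosted on GitHub; the directory names and the README file are assumptions made for illustration.

```shell
# Stand-in for a repo hosted on GitHub (assumption so the sketch runs offline)
work="$(mktemp -d)"
git init --quiet "$work/source-repo"
cd "$work/source-repo"
git config user.name "Your Name"
git config user.email "your@email.com"
echo "# Example" > README.md
git add README.md
git commit --quiet -m "Add README"

# Clone it into a new directory called my-copy
cd "$work"
git clone --quiet "$work/source-repo" my-copy
cd my-copy

ls                  # the files come along
git remote -v       # origin already points at the source
git log --oneline   # and so does the history
```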
3.9 Using Git in RStudio
Go to File, New Project, Version Control, and follow the instructions. Then notice the Git tab.
For more memes see Meme Git Compilation by Lulu Ilmaknun