Introduction to Version Control Systems & GitHub
This documentation shares knowledge about the GitHub platform: its advantages, features, and importance in building and sharing projects or code files online. It also covers the need for online repositories, branches, commits, and pull requests.
- Version Control
- Tools for Version Control
- GitHub and Git
- Git Features
- Git Operations & Commands
Version control can be thought of as a management system that tracks the changes we make to a project from start to finish. The changes might be adding new files, modifying old files, changing the source code, and so on.
Whenever we make changes to the project, the version control system creates a snapshot of the entire project and saves it. These snapshots are known as versions. A snapshot is the entire state of the project at a particular time: it records which files the project contained at that time and what changes we had made.
For example, suppose we started building a web page and initially created the start page, say, “index.html”. Then we added an “about.html” page to it. Then we made some changes to the “about.html” page by adding some text, changing the page layout, etc.
Now, the VCS (version control system) detects that some modifications have been made and that something new has been created. We can consider each of these modifications as a different version.
Version 1 – index.html webpage
Version 2 – Addition of “about.html” web page
Version 3 – Modification of “about.html” web page
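As a rough sketch, the three versions above could be recorded with Git (the tool covered later in this document) as three commits. The temporary directory, the file contents, and the "Demo User" identity are placeholders chosen for illustration, not part of the original example.

```shell
set -eu
# Work in a throwaway directory so nothing else on the machine is touched.
work=$(mktemp -d)
cd "$work"
git init -q .
# Placeholder identity; Git needs one before it will record a commit.
git config user.name "Demo User"
git config user.email "demo@example.com"

# Version 1: the start page.
echo "<h1>Home</h1>" > index.html
git add index.html
git commit -q -m "Version 1: add index.html"

# Version 2: add the about page.
echo "<h1>About</h1>" > about.html
git add about.html
git commit -q -m "Version 2: add about.html"

# Version 3: modify the about page.
echo "<p>Some text</p>" >> about.html
git add about.html
git commit -q -m "Version 3: modify about.html"

# List the saved versions, newest first, and count them.
git log --oneline
version_count=$(git rev-list --count HEAD)
echo "versions saved: $version_count"
```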
Question. Can we go back to a previous version if we make a mistake?
Yes, that is the whole purpose of a version control system.
Sometimes we make changes and then decide we don’t want them. The VCS always keeps the older versions neatly packed inside it. If at some point we want to roll back to a previous version, we can.
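One possible way to roll a file back with Git is sketched below. It is only one of several approaches; the file name, its contents, and the identity values are assumptions made for the example.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

# Save a version we are happy with and remember its ID.
echo "original layout" > about.html
git add about.html
git commit -q -m "Add about.html"
good_version=$(git rev-parse HEAD)

# Save a later version that we end up regretting.
echo "a change we regret" > about.html
git commit -q -am "Change about.html"

# Bring about.html back exactly as it was in the earlier version,
# then record the restored state as a new version.
git checkout -q "$good_version" -- about.html
git commit -q -m "Restore the earlier about.html"

restored=$(cat about.html)
echo "$restored"
```

Nothing is lost in this approach: the regretted version is still in the history, and the restored content is simply saved as a newer version on top of it.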
Why Version Control?
The first thing that a version control system provides is collaboration.
Let’s say there are 3 developers working on a particular project. If everyone works in isolation, or even if everyone works in the same shared folder, there will sometimes be conflicts when several of us try to modify the same file. In the end, when we try to merge the work together, we end up with a lot of conflicts, and we don’t know who has made which changes.
A VCS provides us with a shared workspace and records who has made which changes and what has been changed. We can see when someone has changed something in our project, and we can view everyone’s work properly. The project evolves as a whole from the start, and we save a lot of time that would otherwise go into resolving conflicts.
Saving a version of the project after making changes is essential. Questions then arise: how much should we save? Should we save the entire project, or only the changed part?
If we save only the changes, it is hard to view the entire project at a given time. If we save the entire project every time, we store a large amount of redundant data and occupy unnecessary space.
Another problem arises when we name these different versions. Even if we are very organized and use a clear naming scheme, as the number of versions grows there is a good chance that we will lose track of them.
The third problem is knowing what the difference between the versions is, and what exactly was changed.
If we have a VCS, we don’t need to worry about any of this.
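To illustrate the last two problems, here is a small sketch using Git: every version gets an ID automatically, and the tool can report what changed between two versions. The file name "notes.txt" and the identity values are made up for the example.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "first line" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "second line" >> notes.txt
git commit -q -am "Extend notes"

# No manual naming scheme: each version has an ID, an author, a date, and a message.
git log --format='%h %an %ad %s' --date=short

# Compare the previous version (HEAD~1) with the current one (HEAD).
changes=$(git diff HEAD~1 HEAD)
echo "$changes"
```

In the diff output, lines that were added are prefixed with "+" and lines that were removed are prefixed with "-".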
A VCS also provides us with a backup. We have a central server where all the project files are located. Apart from that, every developer has a copy of the files on their own machine, known as a local copy.
Every time developers start their work, they fetch the project files from the central server and store them on their local machine. When they are done working, they transfer their changes back to the central server.
In a crisis, such as a server crash, we don’t need to worry: a copy of the entire project is stored on the local machines. Even if one developer forgot to keep their copy up to date, there is usually someone else whose copy is current.
Helps to Analyze the Project
When we have finished the project, we may want to know how it evolved and what the drawbacks were. If we need to analyze the entire span of work, the VCS provides a proper description of what exactly was changed and when it was changed.
Version Control Tools
Four of the most widely used version control tools are:
- Git (a distributed version control system)
- Subversion (centralized; does not provide local backup functionality)
- CVS (centralized; does not provide local backup functionality)
- Mercurial (distributed; very similar to Git)
Question. Is Git open source? Yes, Git is an open-source tool.
Git and GitHub
For now, just think of a repository as a data space where we store all the project files and related files. In a distributed version control system, we have a central repository and local repositories. Developers first make changes in their local repositories and then push those changes to the central repository. Developers also periodically pull data from the central repository into their local repositories, which keeps the local copies up to date and doubles as a backup.
GitHub hosts the central repository, and Git is the tool that allows us to create a local repository.
Git is a version control tool that supports all of these operations, i.e., fetching data from the central server and pushing local changes to it.
GitHub is a code hosting platform for version control and collaboration. It is a company that hosts the central repository on a remote server. In short, it can be thought of as a social network for developers, where they share their code.
In a distributed version control system, we do not always need an internet connection. We need it only when we push to or pull from the central server.
What is Git?
Git is a distributed version control tool that supports distributed, non-linear workflows and provides data assurance for developing quality software.
Different Features of Git
- Light Weight
- Open Source
Git allows distributed development of the code. As we already know, every developer has a local copy of the entire development history, and changes are copied from one repository to another. It is immaterial whether the developers reside at different geographical locations; they can still work together.
Git is compatible with existing systems and protocols. Migration from other version control systems to Git is possible. If we have an SVN or SVK repository and want to migrate to Git, it can be accessed directly using git-svn.
Git tracks the state of the project as a graph of commit and tree objects. It also includes tools with which we can navigate and visualize all of our work.
It allows us to do non-linear software development. Git has a lightweight branching model, so we can have multiple independent branches. The master branch starts at the beginning of the project and, by the end, contains the entire project.
Git uses lossless compression to compress data on the client side, so the local repository stays small.
Git is fast. We do not always need an internet connection to work in the distributed environment, because we work with our local repository. Git is written in C, which reduces runtime overhead and makes it faster.
Git was created by Linus Torvalds, the creator of the Linux kernel, who built it for the development of the Linux kernel. The source code is available, and we can modify and use it.
It is very reliable. We have multiple backups. We can also make a duplicate copy of the central repository.
Git uses SHA-1 to name and identify objects. Whenever we commit a change, Git creates a commit object. SHA-1 is a cryptographic hash function, not an encryption technique: the ID of each object is computed from its content, so any change to the content changes the ID. No change is hidden from the group of developers.
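The idea that an object's ID is derived from its content can be checked with the git hash-object command, which prints the ID Git would assign to some content. The sample strings below are arbitrary and only serve as an illustration.

```shell
set -eu
# The same content always gets the same object ID...
id_a=$(printf 'hello\n' | git hash-object --stdin)
id_b=$(printf 'hello\n' | git hash-object --stdin)
# ...and even a one-character change gives a completely different ID.
id_c=$(printf 'hello!\n' | git hash-object --stdin)

echo "first:   $id_a"
echo "same:    $id_b"
echo "changed: $id_c"
```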
Git is released under the GNU General Public License (GPL) version 2 and is free of charge. We also save money by not needing costly servers.
What is a Repository?
A repository is a directory or storage space where our projects live. It can be local to a computer, or it can be a storage space on GitHub or another online host.
The types of repository are:
- Central Repository
- Local Repository
Git keeps the repository data in a “.git” folder inside the project’s root folder, both locally and centrally.
Git Operations & Commands
To create a central repository, we first need a GitHub account. Create one, and then use the web interface to create a central repository.
Install Git on your local machine and run Git Bash. On Windows, go to a folder or directory, right-click, and choose the “Git Bash” option.
git init
Use this command to create a local repository.
git remote add origin <link>
This links the local repository with the central repository. To find the link, go to the GitHub account and then to the central repository. Click the green “Clone or download” button (labeled “Code” in newer versions of the interface) and copy the HTTPS URL.
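Creating a local repository and linking it to a central one might look like the sketch below. To keep it runnable without a GitHub account, a bare repository in a temporary directory stands in for the central repository; with GitHub, the copied HTTPS URL would take the place of that path. The directory names are assumptions.

```shell
set -eu
work=$(mktemp -d)

# Stand-in for the central repository (on GitHub this lives on their server).
git init -q --bare "$work/central.git"

# Create the local repository inside a project folder.
mkdir "$work/project"
cd "$work/project"
git init -q .                     # creates the hidden .git folder

# Link the local repository to the central one under the name "origin".
git remote add origin "$work/central.git"

# Show the configured link.
git remote -v
origin_url=$(git config --get remote.origin.url)
```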
git pull origin <branch_name>
To pull files from the central repository.
git push origin <branch_name>
To push files from the local repository to the central repository.
git clone <link>
Use this command to clone (download) an existing repository from GitHub. It copies the entire repository, not a single file.
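The round trip between two developers can be sketched as follows. A local bare repository again stands in for GitHub, and the developer names, directories, and file contents are invented for the example. The branch name is read from Git rather than hard-coded, because the default may be "master" or "main" depending on the installation.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/central.git"        # stand-in for GitHub

# Developer A clones the (still empty) central repository and adds a file.
git clone -q "$work/central.git" "$work/dev_a"
cd "$work/dev_a"
git config user.name "Dev A"                  # placeholder identity
git config user.email "a@example.com"
echo "hello" > index.html
git add index.html
git commit -q -m "Add index.html"
branch=$(git symbolic-ref --short HEAD)       # name of the current branch
git push -q origin "$branch"

# Developer B clones the central repository and receives A's work.
git clone -q "$work/central.git" "$work/dev_b"

# Developer A publishes one more change.
echo "more" >> index.html
git commit -q -am "Update index.html"
git push -q origin "$branch"

# Developer B pulls to catch up with the central repository.
cd "$work/dev_b"
git pull -q --ff-only origin "$branch"
pulled=$(cat index.html)
echo "$pulled"
```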
There is an intermediate layer called the “index” (also known as the staging area), which sits between the workspace (project folder) and the local repository. When we want to commit changes to the local repository, we first have to add the changed files to the index.
git status
Shows us the files that have been added to the index and are ready to commit.
git add <file_name>
This adds a file to the index.
git add -A
This adds all the files to the index.
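A short sketch of how files move from the workspace into the index. The two file names are placeholders. In the short status format, "??" marks a file Git is not tracking yet and "A" marks a file that has been added to the index.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
echo "home" > index.html
echo "about" > about.html

# Nothing is staged yet: both files are reported as untracked.
before=$(git status --short)
echo "$before"

# Stage a single file; the other one stays untracked.
git add index.html
middle=$(git status --short)
echo "$middle"

# Stage everything that is left.
git add -A
after=$(git status --short)
echo "$after"
```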
git commit
This records a snapshot of the repository at a given time and commits it to the local repository. Committed snapshots never change unless we change them explicitly.
git commit -m "<message>"
Commit with a commit message. The -m option is optional; if we leave it out, Git opens an editor in which we enter the commit message.
git log
To see the commits that Git has recorded.
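Staging, committing, and then inspecting the result could look like this. The file name, the commit message, and the identity are illustrative assumptions.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "home" > index.html
git add index.html

# -m supplies the message inline, so no editor is opened.
git commit -q -m "Add the start page"

# Each commit is stored with its ID, author, date, and message.
git log

last_message=$(git log -1 --format=%s)
commit_id=$(git rev-parse HEAD)
echo "latest commit: $commit_id ($last_message)"
```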
Parallel Development – Branching
A branch is a pointer to a commit.
There are two types of branches:
- Local Branches
- Remote-tracking branches
git branch <branch_name>
This creates a new branch. Since we were initially on the master branch, the new branch contains all the files in the master branch.
git checkout <branch_name>
To switch to another branch.
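The sketch below creates a branch and switches to it, and also checks the statement that a branch is only a pointer to a commit. The branch name "feature" is a made-up example, and the starting branch is read from Git because it may be called "master" or "main" depending on the setup.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "home" > index.html
git add index.html
git commit -q -m "Add index.html"

base=$(git symbolic-ref --short HEAD)     # the branch we start on

# Create a new branch; it points at the same commit as the starting branch.
git branch feature
base_commit=$(git rev-parse "$base")
feature_commit=$(git rev-parse feature)

# Switch to the new branch.
git checkout -q feature
current=$(git symbolic-ref --short HEAD)

# List the local branches; the asterisk marks the current one.
git branch
```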
Parallel Development – Merging
git merge <branch_name>
This merges a branch into the master branch. While performing this operation, we need to be on the master branch.
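A minimal merge might look like the following sketch. The branch name "feature" and the file names are assumptions, and the target branch is read from Git instead of assuming it is called "master".

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "home" > index.html
git add index.html
git commit -q -m "Add index.html"
base=$(git symbolic-ref --short HEAD)     # the branch we will merge into

# Do some work on a separate branch.
git branch feature
git checkout -q feature
echo "about" > about.html
git add about.html
git commit -q -m "Add about.html"

# Switch back to the target branch, then merge the other branch into it.
git checkout -q "$base"
git merge -q feature

merged_into=$(git symbolic-ref --short HEAD)
git log --oneline
```

Because the target branch had no new commits of its own in this sketch, Git can simply move its pointer forward to the tip of "feature".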
Parallel Development – Rebasing
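Rebasing is another kind of merging. The advantage is that we get a much cleaner, linear project history. The commits of the current branch are replayed on top of the tip of the branch named in the command (for example, master), so the resulting history contains the commits of both branches.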
git rebase <branch_name>
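The sketch below lets two branches diverge and then rebases one onto the other. The branch and file names are invented for the example, and the starting branch is read from Git because it may be "master" or "main".

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q .
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
echo "home" > index.html
git add index.html
git commit -q -m "Add index.html"
base=$(git symbolic-ref --short HEAD)

# Start a branch, then let the starting branch move ahead by one commit.
git branch feature
echo "contact" > contact.html
git add contact.html
git commit -q -m "Add contact.html"

# Meanwhile, the other branch gets a commit of its own.
git checkout -q feature
echo "about" > about.html
git add about.html
git commit -q -m "Add about.html"

# Replay the commits of "feature" on top of the tip of the starting branch.
git rebase -q "$base"

# The history is now a single straight line with no merge commits.
git log --oneline --graph
merge_commits=$(git rev-list --merges --count HEAD)
fork_point=$(git merge-base feature "$base")
base_tip=$(git rev-parse "$base")
```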
Pushing from the Local Repository to the Central Repository
To push to a repository over SSH, we first need an SSH key.
ssh-keygen
To generate an SSH key.
Go to the GitHub account, then to Settings, then to “SSH and GPG keys”.
Add the SSH key: provide a name and paste the entire public key.
ssh -T git@github.com
To test SSH authentication.
git push origin <name_of_local_branch>
This pushes the local branch to the central repository, where it becomes a remote branch.
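Pushing a branch and checking that it arrived can be sketched as below. So that the example runs without a GitHub account or an SSH key, a local bare repository stands in for the central repository; with GitHub, the remote would be the repository's SSH or HTTPS URL instead. The branch name "feature" is an assumption.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/central.git"    # stand-in for the central repository

git init -q "$work/project"
cd "$work/project"
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
git remote add origin "$work/central.git"

echo "home" > index.html
git add index.html
git commit -q -m "Add index.html"

# Create a local branch and publish it to the central repository.
git branch feature
git push -q origin feature

# Ask the central repository which branches it now has.
remote_branches=$(git ls-remote --heads origin)
echo "$remote_branches"
```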