In my two years as an open source maintainer, I have found that the biggest barrier for newcomers participating in open source is not actually the coding part. In fact, Tern gets lots of Python programmers who know what changes are required in the code. The biggest barrier to entry I’ve seen is the Git workflow surrounding opening and updating a pull request. While opening a new pull request (PR) is a fairly straightforward process, updating that same PR can be difficult if your repository is not set up properly.
This comes as no surprise to me because when I first started contributing to open source, the hardest part of getting started was trying to understand how I could make Git do what I needed it to. I still remember being asked to update one of my first PRs and, despite my best efforts searching all corners of the internet, I couldn’t figure out how to do it. I ended up closing the pull request, re-cloning my fork, and opening an entirely new pull request 🤦‍♀️. The good news is that by the end of this article you won’t have to go through the same struggles I did.
Configuring your Git development environment properly is the best way to avoid frustration as an open source contributor. While the GitHub WebUI is fairly intuitive, working fluently with Git on the command line is not. However, you don’t need to be an expert in Git to be successful contributing to open source.
Open source projects will vary in scope, size, and complexity. The nature of these factors will determine how to pick the project you want to contribute to, but most projects will generally follow a maintainer/contributor model.
Maintainers are the project leaders and are responsible for the overall health and direction of a project. They are the final reviewers of pull requests (PRs) and are ultimately responsible for the code that gets merged into the project.
Contributors contribute to the project in the form of documentation, code review, debugging, or code. Contributors suggest changes to the project by opening PRs for maintainers to review and ultimately merge into the project’s code base.
These instructions assume that you have a GitHub account, the git package installed in your development environment, and access to a command line shell (the one included in Visual Studio Code will work just fine). Also note that while Git contains a multitude of intricacies, I try to keep the instructions as high level as I can.
1. General setup
Pick your project
The very first step in setting up your development environment is to pick the project you want to contribute to. If you’re unsure and want to browse around, take a look at GitHub’s #good-first-issue tag or the Good First Issue project page for a list of projects with beginner-level work.
Optional: Create an SSH key
While it’s not required to have an SSH key registered with GitHub, it can make your development workflow easier and more secure. You only have to do it once. GitHub’s documentation around how to do this is great. Check out articles on generating a new SSH key and adding a new SSH key to your GitHub account to learn how to do this.
Setting up your Git configs
Git config is a tool that sets general Git configuration options and, more specifically, can help you customize your identity. When you open a PR for an open source project, you will likely need to “sign off” your commit. Configuring your username and email can save you time with this step later. To configure your sign-off identity, run:
git config --global user.name "Your Name"
git config --global user.email "your.email@address.com"
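If you want to confirm what Git will use for your sign-off, you can read the values back. The following is a minimal sketch that sets the identity in a throwaway repository (without --global, so your real configuration is left alone). The name and email are the same placeholders used above, and the variable names are just for illustration.

```shell
set -eu

# Throwaway repository so this demo never touches your global settings.
demo_repo=$(mktemp -d)
git init -q "$demo_repo"
cd "$demo_repo"

# Same commands as above, minus --global: they apply to this repository only.
git config user.name "Your Name"
git config user.email "your.email@address.com"

# Read the values back to confirm what Git will use when signing off.
configured_name=$(git config --get user.name)
configured_email=$(git config --get user.email)
echo "Sign-off identity: $configured_name <$configured_email>"
```

Running git config --get user.name outside of the throwaway repository shows your global value instead.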
Fork the project
Once you’ve configured your identity with Git, you need to fork the project you are planning to contribute to. Because write access to open source projects is controlled by the maintainer(s), you won’t ever be making direct changes to the project repository. Instead, we fork (AKA copy) the project source code to our own personal account so we have somewhere to make changes without affecting the original repository. It’s easiest to fork the project via the GitHub WebUI. Once you navigate to the GitHub page of the project you would like to contribute to, look for a “Fork” button in the upper right-hand corner.
Clone your fork
Once you’ve forked the main project repository, the next step is to clone your fork of the project to your working environment. You’ll want to do this on the command line using your shell of choice:
git clone git@github.com:rnjudge/tern.git
cd tern
2. Branch/environment setup
You can think of branches in Git as a way to organize and separate your changes. Different branches can point to different sets of changes without muddling the two sets of changes together. In the next set of steps we will set up your working branches so that you can easily update and rebase your work with the main project repository.
Add an upstream remote repository
A remote repository is a Git repository that’s hosted somewhere on the internet. When you clone your fork of the project, you are creating a local copy of your forked remote repository. When you run git clone, Git automatically gives the remote repository you cloned from the name origin. You can list your remotes using the git remote command. After running the clone command above, you should see your “origin” remote listed:
git remote -v
origin git@github.com:rnjudge/tern.git (fetch)
origin git@github.com:rnjudge/tern.git (push)
Your fork (the “origin” remote repository you cloned) is not automatically kept up to date with the main project. This means that if changes get merged to the main project repository, neither your fork nor your local clone will know about those changes by default. This matters when we want to rebase a PR onto the latest state of the main project repository. To make this easier in the future, we add a remote repository named “upstream” that points to the main project.
If and only if you have your SSH key set up, run:
git remote add upstream git@github.com:tern-tools/tern.git
Otherwise, add the remote using https:
git remote add upstream https://github.com/tern-tools/tern.git
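To see how the two remotes relate without touching GitHub, here is a small sketch that uses local directories as stand-ins: upstream-project plays the main project and fork.git plays your fork. Those directory names, the demo identity, and the variable names are assumptions made for this example only.

```shell
set -eu
# Demo identity so commits work in a clean environment (placeholder values).
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-in for the main project repository, with a branch named main.
git init -q upstream-project
git -C upstream-project symbolic-ref HEAD refs/heads/main
echo "demo" > upstream-project/README
git -C upstream-project add README
git -C upstream-project commit -q -m "Initial commit"

# Stand-in for your fork on GitHub: a bare copy of the main project.
git clone -q --bare upstream-project fork.git

# Clone the fork. Git names this remote "origin" automatically.
git clone -q fork.git tern
cd tern

# Add the main project as a second remote named "upstream".
git remote add upstream "$sandbox/upstream-project"

# List both remotes with their fetch and push locations.
git remote -v
remote_names=$(git remote | sort | tr '\n' ' ')
```

After this runs, git remote -v lists both origin (the fork) and upstream (the main project), which is the state the rest of this article relies on.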
Create a “home base” branch to track changes on the main project
“Home base branch” is not a technical Git or GitHub term, but a phrase I use to describe a branch we’ll use to keep track of upstream repository changes, which will help us easily rebase our PRs in the future. This means you won’t use your home base branch for development or to make changes to source code. Rather, you’ll use it to rebase your development branches and create new working branches from it. The up branch helps you easily stay in sync with the upstream repository.
In this example, my home base branch is named up (but you can name it whatever you want). Your other development branches will be based on this up branch as well. In the following set of commands, Tern’s main branch is named main. In some repositories, it may still be named master, so you might need to edit the commands accordingly. Run these commands to set up your up branch to track changes in the upstream project repo:
git fetch upstream
git checkout -b up upstream/main
git push origin up:refs/heads/main
Note that you will only need to set up the up branch once.
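If you’d like to check that your home base branch is wired up the way you expect, the sketch below replays the three commands against local stand-in repositories and then asks Git which branch up tracks. The directory names (upstream-project, fork.git), the demo identity, and the variable names are assumptions for illustration.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-ins: upstream-project is the main project, fork.git is your fork.
git init -q upstream-project
git -C upstream-project symbolic-ref HEAD refs/heads/main
echo "demo" > upstream-project/README
git -C upstream-project add README
git -C upstream-project commit -q -m "Initial commit"
git clone -q --bare upstream-project fork.git
git clone -q fork.git tern

# The main project moves ahead after you forked it.
echo "newer" > upstream-project/CHANGELOG
git -C upstream-project add CHANGELOG
git -C upstream-project commit -q -m "Upstream change made after the fork"

cd tern
git remote add upstream "$sandbox/upstream-project"

# The three setup commands from this section.
git fetch -q upstream
git checkout -q -b up upstream/main
git push -q origin up:refs/heads/main

# Ask Git which remote branch "up" tracks, and compare fork and upstream.
tracked_branch=$(git rev-parse --abbrev-ref 'up@{upstream}')
fork_main=$(git --git-dir="$sandbox/fork.git" rev-parse main)
upstream_main=$(git -C "$sandbox/upstream-project" rev-parse main)
echo "up tracks $tracked_branch"
```

Because up was created from upstream/main, Git records that relationship, which is why a plain git pull --rebase on up later knows where to pull from. The push also brings the fork’s main branch level with the main project.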
3. General workflow
Now that you have created your home base branch, you’re ready to start coding! The following workflow will help you make changes, submit a new PR, and update the same PR if necessary.
Create a working branch
The up branch is your home base branch that we'll use to create working branches. The working branches are where we will actually make changes to the code. Whenever you want to create a new working branch, run the following commands:
First, make sure up branch is current.
git checkout up
git pull --rebase
Then, create and switch to your working branch.
git checkout -b working_branch_name
The working branch can be named whatever you want.
Make and commit your changes
Any changes you make to the project source code will be associated with the working branch you’re on (to see what branch you’re on, run “git branch” and look for the asterisk). Once you think your changes are sufficient and ready to be submitted to the upstream project, you’ll first need to add the files for staging. Staging your changed files for commit is a way of telling Git that the files are ready to be committed. In order to stage the files for commit, run:
git add <file/directory>
If you changed a lot of files in one particular directory, you can git add entire directories this way as well. Note that Git will only stage files that have been changed if you add an entire directory where some files are unchanged. To check which files have been staged for commit, you can run git status. If you want to delete a file as part of your commit, run git rm file to both delete the file and stage the removal for commit.
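Here is a short sketch of that staging behavior in a throwaway repository. The docs directory and its file names are made up for the example, as is the demo identity.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Starting point: three committed files in one directory.
mkdir docs
echo "one" > docs/a.txt
echo "two" > docs/b.txt
echo "old" > docs/obsolete.txt
git add docs
git commit -q -m "Add docs"

# Edit one file, leave one untouched, and remove one.
echo "one, edited" > docs/a.txt
git add docs                    # stages only the file that changed
git rm -q docs/obsolete.txt     # deletes the file and stages the removal

# Show what is staged for the next commit.
git status --short
staged_files=$(git diff --cached --name-only | tr '\n' ' ')
echo "Staged: $staged_files"
```

Note that docs/b.txt does not show up as staged even though the whole directory was added, because it has no changes.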
Commit your staged changes
Once you’ve staged all the desired files for commit, it’s time to commit the changes. Using the -s option will sign off your commit using the Git config information that you set in the General setup section.
git add <file/directory>
git commit -s
Important note: I do not encourage using the -m option with git commit. git commit -m <msg> allows you to use the given <msg> as the commit message at the same time that you commit your code changes. This, however, does not lend itself to detailed or well-organized commit messages, because messages written with the -m option tend to be short one-liners. I gave a talk at All Things Open about why and how to write good commit messages (which has also been converted to blog form). Many open source projects will also have commit message requirements for the project. Look at these before writing your commit message. Once you’ve finished writing your commit message, save and exit the commit message prompt.
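To show what a signed-off commit ends up looking like, here is a sketch in a throwaway repository. Normally git commit -s opens your editor. To keep this example non-interactive, it reads a multi-line message from a file with the -F option instead. The file name, message text, and identity are placeholders.

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Identity for this throwaway repository only (placeholder values).
git config user.name "Your Name"
git config user.email "your.email@address.com"

echo "print('hello')" > hello.py
git add hello.py

# A subject line, a blank line, then a body, written to a temporary file.
msg_file=$(mktemp)
cat > "$msg_file" <<'EOF'
Add hello script

The body explains what changed and why, so reviewers have
context without having to open the diff.
EOF

# -s appends the Signed-off-by trailer built from user.name and user.email.
git commit -q -s -F "$msg_file"

last_message=$(git log -1 --format=%B)
echo "$last_message"
```

The last line of the printed message is the Signed-off-by trailer that many projects check for.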
Push your changes
Once you’ve saved and exited the commit message prompt, it’s time to push your changes to your remote fork. Up until now, your changes have lived in the local copy of the “origin” remote repository. Pushing your changes will upload your changes to the remote repository on GitHub.
git push origin <working_branch_name>
Opening the pull request
Once you’ve pushed your changes, you can use the GitHub WebUI to open the PR. Simply navigate to the main project page and GitHub will automatically suggest opening a PR from the changes that most recently got pushed to your fork.
4. Editing your commit after you’ve already opened a PR
Rebase your changes with upstream
If you are asked to make changes to your PR, it’s possible that other commits could’ve been merged to the upstream repository since you first submitted your changes. In order to make sure you’re not picking up any stale code or files when you update your PR, it’s best to rebase your changes with the upstream remote repository. To do this easily, use the home base branch. Keep in mind that the home base branch is called up in this example but you may have named it something else.
The following set of commands will first fetch any changes from upstream and then apply them to your up branch. This means you will update your up branch to match with the latest changes in the repository where you are submitting your PR. After you switch back to your working branch, the git rebase up command will then update your working branch (that contains your PR changes) to contain the latest changes from upstream (via the up branch) while preserving the changes you made on your working branch. This process enables you to pick up changes from the upstream repository so that your PR can be merged into the upstream repository without conflicts. It also ensures that any continuous integration tests that run once your PR is updated or submitted will run against the latest changes to the code base.
git checkout up
git pull --rebase
git checkout <existing_pr_working_branch_name>
git rebase up
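If you want to watch this happen before trying it on a real PR, the sketch below rehearses the whole sequence with local stand-in repositories. The directory names (upstream-project, fork.git), the fix_typo branch, the file names, and the demo identity are all assumptions for the example. It uses -m only so the script can run without opening an editor.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
sandbox=$(mktemp -d)
cd "$sandbox"

# Stand-ins: upstream-project is the main project, fork.git is your fork.
git init -q upstream-project
git -C upstream-project symbolic-ref HEAD refs/heads/main
echo "demo" > upstream-project/README
git -C upstream-project add README
git -C upstream-project commit -q -m "Initial commit"
git clone -q --bare upstream-project fork.git
git clone -q fork.git tern
cd tern
git remote add upstream "$sandbox/upstream-project"
git fetch -q upstream
git checkout -q -b up upstream/main

# Your PR work lives on a working branch created from up.
git checkout -q -b fix_typo
echo "my change" > feature.txt
git add feature.txt
git commit -q -s -m "Add feature file"

# Meanwhile, someone else's change gets merged upstream.
echo "other change" > "$sandbox/upstream-project/other.txt"
git -C "$sandbox/upstream-project" add other.txt
git -C "$sandbox/upstream-project" commit -q -m "Unrelated upstream change"

# The four commands from this section.
git checkout -q up
git pull -q --rebase
git checkout -q fix_typo
git rebase -q up

# Your commit now sits directly on top of the latest upstream commit.
git log --oneline
parent_of_my_commit=$(git rev-parse HEAD~1)
upstream_tip=$(git -C "$sandbox/upstream-project" rev-parse main)
```

After the rebase, the working branch contains both the upstream file and your own, and your commit’s parent is the newest upstream commit.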
Now that your working branch is current, you can make your changes. To make changes to source code files you will edit the file(s), and run git add to stage them for commit like you did before. To update your PR with these changes, you can amend your previous commit. Amending your commit will also give you the opportunity to edit your commit message if you need to. If you only need to make changes to the commit message and no source code files, skip the “git add” and just run:
git commit --amend
If you don’t need to update your commit message, just save and exit the commit message prompt after running amend. If you want to update your commit message, make changes now before saving and exiting the commit message prompt.
Re-push your changes
Make sure that you are still on your working branch where you just amended your commit. Amending your commit locally does not update the commit in the remote repository or change the pull request. To update your PR, you’ll need to re-push your changes to your forked remote repository:
git push -f origin <existing_pr_working_branch_name>
The -f/force push option can be dangerous when used incorrectly as it can overwrite the commit history in the remote repository with your own local history. Here, however, it is required because you amended the old commit and are intentionally rewriting Git history on your fork to include your latest edits. Amending gave your changes a new Git commit ID, and the force push replaces the old commit on your fork with the new one.
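The sketch below shows why the force push is needed, using a local bare repository (fork.git) as a stand-in for your fork on GitHub. The directory names, the fix_typo branch, and the demo identity are assumptions for the example. It uses --amend --no-edit, which keeps the existing commit message, so that the script runs without opening an editor.

```shell
set -eu
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
sandbox=$(mktemp -d)
cd "$sandbox"

# Build a stand-in fork with one commit on main, then clone it.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "demo" > seed/README
git -C seed add README
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed fork.git
git clone -q fork.git tern
cd tern

# First version of the PR commit, pushed to the fork.
git checkout -q -b fix_typo
echo "first attempt" > feature.txt
git add feature.txt
git commit -q -s -m "Add feature file"
git push -q origin fix_typo
id_before=$(git rev-parse HEAD)

# A reviewer asks for a change: edit, stage, and amend the same commit.
echo "second attempt" > feature.txt
git add feature.txt
git commit -q --amend --no-edit
id_after=$(git rev-parse HEAD)

# A plain push is refused because the branch history was rewritten.
if git push -q origin fix_typo 2>/dev/null; then
  plain_push="accepted"
else
  plain_push="rejected"
fi

# The force push replaces the old commit on the fork with the amended one.
git push -q -f origin fix_typo
remote_id=$(git --git-dir="$sandbox/fork.git" rev-parse fix_typo)
echo "plain push: $plain_push"
echo "before: $id_before"
echo "after:  $id_after"
```

The two printed IDs differ because amending creates a new commit, and after the force push the branch on the fork points at the new one.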
If you go look at the original PR you opened you’ll see that it now contains your most recent changes. If CI/CD tests are configured for the repository you’re submitting your PR to, you will see those triggered again to re-run on your latest set of changes.
Continuing to contribute
Remember that your up branch is your home base branch that you’ll use to keep track of changes. You won’t use it for development or to make changes to source code. If you want to start working on a new issue for the same project, you can go straight to section 3, “General workflow”, since your home base branch is already set up at this point.
Phew, that was a lot!
I hope this setup helps you find success in open source. If you ever have any questions you can reach out to me on Twitter (@rosejudge5) or via Tern’s GitHub page.