There’s a huge difference between writing small scripts for fun and shipping software professionally. In this article, let’s look at some of the common processes and tools that you should learn if you want to get a great job as a software engineer. If you’ve ever wondered what git, GitHub, PRs, TDD, CI and CD are, read on to find out what they mean and why they matter!
If you’ve been writing software for a while, at some point you’ve probably thought “I really wish I could just go back to how my code was 20 minutes ago!” You may have a text editor that allows you to undo the last few edits, but the most consistent and reliable way to “go back in time” (especially across multiple computers and multiple developers) is to use a version control system.
Version control systems allow you to keep track of the changes you’ve made to code over time. It’s a little like “track changes” in Google Docs, but the difference is that you can save changes across a set of files, not just within an individual file.
Imagine you’re adding a new “about us” page to a website. You might need to create a new HTML page, add some new rules to your CSS to make it display correctly, and upload a couple of images for the page. With a version control system, you can “check in” all of the changes to those different files with a single commit message, “Add about us page”, and when someone looks back through the history of the commits, they’ll easily be able to find when the change was made and which files were affected.
In addition, most version control systems support “branching.” With branches, you can have different versions of your code being developed at the same time, so one team can update your ticketing functionality while another changes how your email sending works. While there is now debate among high-performing teams about whether they should continue to use branches, learning how to use them will help you to work in most engineering organizations.
There are still a number of version control systems out there, but these days, git is by far the most popular. It’s not the easiest to learn, but once you do, it is incredibly powerful. Look out for a future blog post giving you some hands-on experience with the basics of git.
Collaborating on GitHub
If you’re writing software with other people, you’re going to need some place to share the code. There are lots of ways to host a git repository (all of the files in a project, plus information on the history of those files). One of the most popular is GitHub. GitHub was designed to make it easy for teams to collaborate, whether or not they know or trust each other. It’s where most open source software lives, and as a professional developer it’s important to have a GitHub account and to know how to use it to collaborate with your team.
Discussing new features
If you’re using git to keep track of your changes, and sharing your code with your team on GitHub, you’re probably going to use “Pull Requests” for discussions around new features.
When you create a Pull Request (often shortened to “PR”) in GitHub, it creates a special web page where you and your team can discuss, make, review and approve a set of changes. Historically, teams created PRs when they were done with a branch so they could get input on their changes and get them “merged in.” These days, teams will often start a PR before they even write most of their code, to create a place to share and discuss everything from the business requirements and the design mockups to the final implementation.
Improving your software design
Given its name, when people first hear about “Test Driven Development” (TDD), most assume that it’s all about making sure that your code really works by writing lots of tests. But Kent Beck, one of the most famous developers in the Agile software development movement, famously stated that “correctness is a side effect” of TDD. That’s the kind of statement that only a programmer would make! What he means is that the fact that you happen to have a bunch of tests that prove your code does what you think it does - well, that’s just a bonus. The real benefit of TDD is that you design simpler, better software.
When you apply TDD (often called “test driving your code”), you start by thinking of the simplest possible thing you want your code to do. You then write a test and run it. And it returns “red” (the test failed). No big surprise, as you haven’t actually written the software yet. If your test starts off green, you have some digging to do, as either you made a mistake when writing the test, or someone else has already written the code you were planning to work on!
Next up, you write the simplest possible code that will make the test pass. Generally, if it takes more than a handful of lines of code (roughly 2 to 10), you’re probably taking too big a step. Then you re-run the tests and hopefully they’re green now (they are passing and the code is doing what you wanted it to do). This is a good time to commit your changes to git to make sure you have a copy of the working code with all of the tests passing.
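To make the red and green steps concrete, here is a minimal sketch in Python. The slugify function and its test are hypothetical, invented purely for illustration (they turn a page title such as “About Us” into a URL-friendly string). The test is a plain function of the kind a test runner such as pytest can collect, but here it is simply called directly.

```python
# Step 1 (red): write the test first. Calling it before slugify exists
# fails with a NameError, which is the expected "red" result.
def test_slugify_lowercases_and_hyphenates():
    assert slugify("About Us") == "about-us"


# Step 2 (green): the simplest code that makes the test pass.
# slugify is a hypothetical helper invented for this example.
def slugify(title):
    return title.lower().replace(" ", "-")


# Re-run the test: it now passes silently, so this is a good moment to commit.
test_slugify_lowercases_and_hyphenates()
```

Notice how small the step is: one behavior, one test, and a two-line implementation.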
Finally, you get to review the code and see whether you can “refactor” it. Refactoring is the process of changing the implementation of your code - usually simplifying it or otherwise improving it, without changing its external behavior (all the tests should still pass).
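As a rough illustration of refactoring, the Python sketch below shows two versions of a hypothetical total_price function (the names and the checks are made up for this example). The implementation changes, but the same checks pass against both versions, which is what tells you the external behavior has not changed.

```python
def total_price_before(prices):
    # Original implementation: correct, but more verbose than it needs to be.
    total = 0
    for index in range(len(prices)):
        total = total + prices[index]
    return total


def total_price_after(prices):
    # Refactored implementation: same external behavior, simpler code.
    return sum(prices)


def check_total_price(total_price):
    # The same tests are run against whichever version is passed in.
    assert total_price([]) == 0
    assert total_price([5]) == 5
    assert total_price([1, 2, 3]) == 6


check_total_price(total_price_before)
check_total_price(total_price_after)
```

In a real project you would not keep both versions around. You would edit the function in place and rely on the existing tests to confirm nothing broke.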
And that is the “Red - Green - Refactor” process that is at the heart of TDD - and most modern software development.
Did you remember to run the tests?
As you start to work on a larger development team, sometimes you might download the latest code from GitHub, run the tests, and find that they don’t all pass. There are two common reasons this could occur. It might be that the last developer to save their changes and push them up to GitHub forgot to run the tests and broke something without realizing it (they created a “regression”). Or sometimes, the code “worked on their laptop,” but doesn’t work for other people. Perhaps they added a file or a configuration variable that they forgot to check into version control, or perhaps there is something else that’s different about their laptop.
Either way, that’s not good, and especially once you have more than 3-4 developers, it can waste a lot of time as multiple developers all try to figure out what’s wrong with the last set of changes made.
One of the best fixes for this is to set up something called “Continuous Integration” (CI). With CI, every time a developer pushes changes to GitHub, a server is spun up, all of the tests are automatically run, and an email is sent to the dev team if any of the tests failed. That way, the developer who “broke the build” can go back and look at their code and tests, fix any issues and then push it up again. That ensures that either the code on GitHub is working or everyone is notified immediately, keeping the code in good shape and allowing the dev team to focus on adding new features - not fixing bugs.
Let’s go live
Companies don’t typically get value from software until it’s released. Whether it’s a website, a mobile app or even an embedded script that runs on a router or in a smart speaker, it’s only going to start adding value to customers (and hence to the company) once it is shipped.
Even just 10-20 years ago, shipping software was a difficult, and therefore rare, activity. A team would painstakingly pull together (integrate) all of the various features they’d been working on, check through the application to make sure they hadn’t broken something, and then they’d finally ship the changes - whether it was updating a website or mailing out a CD-ROM to their clients.
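These days, it’s common to do something called Continuous Delivery (CD). With CD, every time you finish a feature and merge it into your master/trunk branch (the “main” branch for your code), it gets automatically tested (CI) and then automatically deployed to production. (Strictly speaking, when every passing change goes live with no manual approval step, the practice is usually called Continuous Deployment; Continuous Delivery only requires that the code is always ready to release.) It’s why companies like Facebook and Etsy now deploy to production hundreds or thousands of times a week.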
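One way to picture a deployment pipeline is as an ordered list of stages where a failure stops everything after it, so broken code never reaches production. The Python sketch below is only an assumption-laden toy: run_pipeline, the stage names, and the stand-in functions are hypothetical, and a real pipeline would run actual test suites and release tooling.

```python
def run_pipeline(stages):
    """Run each (name, step) pair in order and stop at the first failure."""
    completed = []
    for name, step in stages:
        if not step():
            return completed, name  # name of the stage that failed
        completed.append(name)
    return completed, None


deployed = []  # records what has been "released" to production


def run_tests():
    return True  # stand-in for the automated test run (CI)


def failing_tests():
    return False  # stand-in for a test run with a failure


def deploy():
    deployed.append("new feature")  # stand-in for a real production release
    return True


ok_result = run_pipeline([("test", run_tests), ("deploy", deploy)])
bad_result = run_pipeline([("test", failing_tests), ("deploy", deploy)])
print(ok_result, bad_result, deployed)
```

In the second run the test stage fails, so the deploy stage never executes and nothing new is released.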
With CD, you minimize the “time to value,” with features often being available to users within just a couple of days of the developers starting to implement them. As a junior developer, you won’t have to know how to set up a deployment pipeline, but don’t be surprised if you end up using one to push your changes to production — possibly even on your first day at your new job!
Professional software development is a team sport, not a solitary activity. By learning key tools and concepts for collaboration, you’ll be in a much better position to get a job as a professional software developer. In addition, you’re more likely to get a good job with a good company, because teams that follow modern software development practices are generally the teams you’d want to work with.
Head of Data Science
Peter is a veteran technologist, CTO, entrepreneur, and longtime educator, having taught digital literacy at Columbia and authored numerous programming books.