Leading a project

After working here for one and a half years, I’ve led four projects in my team. I’ve recently completed the most sophisticated project so far™ and am really proud of it.


At its peak, there were seven engineers, two business analysts and one engineering manager actively contributing to the project. It took around one quarter to deliver.

This time, I applied what I learned from my earlier experience. Here are 10 of those lessons:

Bear in mind that some of these items may not apply to startups. Large corporations and startups have different priorities.

Requirements

1. Get the requirements correct

“Writing code is the easy part. Writing the right code is hard” - my skip manager

And Stack Overflow agrees with this.

As an engineer, you usually get complete requirements from the product manager/owner, engineering manager, CTO, or CEO (depending on the organization’s size) to build a new feature or system, and you execute them. But this is only partially true when you’re leading a project. You should know why the requirements are the way they are, because your team members may ask why something is designed one way and not another. In fact, you should be worried if your team doesn’t care about it.

Before starting a project, you should be clear on its objective and goal. Once that’s done, you can proceed with requirements gathering, which means reaching out to many different teams & stakeholders. Always jot down the decisions & assumptions somewhere for future reference. It does not matter if it’s in a JIRA card, comments, Google Docs, or Notion. For example, you may have reached a decision to only allow users from Malaysia to access the website for legal reasons. Document that so engineers can refer to it in the future and make more informed decisions.

If you don’t get the requirements right, you might build something totally wrong and of no value. That translates to wasted time and money, although the actual consequences are far greater than that (psychology, opportunity cost, etc.). Building the wrong thing also means that you’ll probably need to pivot the project very soon after it’s launched, or worse, before it’s even launched. For example, the initial requirement says to support only Google OAuth for login. Then the requirement suddenly changes while the project is in progress, and now you must support only Active Directory for authentication (and remove Google OAuth). This gives the engineers a lot of trouble because a huge part of the code has to be ripped out, refactored, and rewritten to support the new auth.

If you’re working in a startup, the cost of ‘building the wrong thing faster’ is probably lower than ‘building the right thing but taking a longer time.’ In a startup, you should be able to pivot quickly, but that’s different for large corporations. If you find the secret recipe after many pivots, you’ll probably have to rewrite your messy codebase eventually in a cleaner way, using a better tech stack. Many tech companies have gone through this phase.

2. Prepare the tasks for the team

We follow the Scrum methodology. Towards the end of each sprint, we prepare the JIRA tickets for the team members, and anyone can pick up any ticket in the next sprint. In the past, I did not do a good job of preparing tasks for the team. Unsurprisingly, this led to unexpected tickets being brought up in the middle of sprints and causing interruptions. This was a frustrating experience for the engineers on the team. They also lacked clarity about the sprint, which, again, was frustrating.

This time, I’m allocating 2 to 3 hours every week with our Engineering Manager and Business Analysts to groom the tasks in detail for the next sprint. We emphasize writing the objective and acceptance criteria of each ticket to make it clear to the engineer who will pick up the task next sprint. For example, when a user goes to this URL, we should respond with a redirect to the ‘new URL’. This also helps us design and run QA testing on the ticket later.
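To make the redirect example concrete, here is a minimal sketch of how such an acceptance criterion could be expressed in code. The paths, the `resolveRedirect` function, and the 301 status are all assumptions made up for illustration; they are not taken from the actual project.

```typescript
// Hypothetical acceptance criterion from a ticket:
// "When a user visits /old-pricing, respond with a 301 redirect to /pricing."

interface RedirectResponse {
  status: number;
  location: string;
}

// Hypothetical redirect table; the paths are invented for this example.
const legacyRedirects = new Map<string, string>([
  ["/old-pricing", "/pricing"],
]);

// Returns a redirect response for a known legacy path, or null otherwise.
function resolveRedirect(path: string): RedirectResponse | null {
  const target = legacyRedirects.get(path);
  return target === undefined ? null : { status: 301, location: target };
}

// The criterion written as a check that engineers and QA can both read.
function meetsAcceptanceCriterion(): boolean {
  const response = resolveRedirect("/old-pricing");
  return (
    response !== null &&
    response.status === 301 &&
    response.location === "/pricing"
  );
}
```

Writing the criterion this precisely in the ticket means the same sentence can later be turned into a unit test or a QA test case with little interpretation.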

As a result, engineers take much less time to implement the tickets because they are clear on what is expected from each one. Our sprint burndown chart has been smooth since we started doing this (I wish I could share the charts!).

3. Build a map / design doc

This may only be true for some, but I always prefer, and encourage others, to prepare a map or plan before kickstarting a project. For a backend project, this means you should at least produce a system architecture diagram of the components you are building. Sometimes you may also want to agree on contracts (e.g., TypeScript types/interfaces) that describe what the data types/structures look like.

This will help the team get on the same page and understand how the data flows through the system. If your team doesn’t understand this, you should be worried because they might be building the system wrongly.
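As an illustration of the kind of contract mentioned above, here is a small sketch of a TypeScript interface shared between two components, plus a runtime guard for the consumer. The event name and all of its fields are hypothetical; the post does not describe the real contracts used in the project.

```typescript
// Hypothetical contract agreed on during design; every name is illustrative.
interface UserSignedUpEvent {
  type: "user.signed_up";
  userId: string;
  country: string; // e.g. "MY"
  occurredAt: string; // timestamp as a string, e.g. ISO 8601
}

// Runtime guard so a consumer can reject payloads that break the contract.
// It only checks field presence and primitive types, not value formats.
function isUserSignedUpEvent(value: unknown): value is UserSignedUpEvent {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    candidate["type"] === "user.signed_up" &&
    typeof candidate["userId"] === "string" &&
    typeof candidate["country"] === "string" &&
    typeof candidate["occurredAt"] === "string"
  );
}
```

Once the producer and consumer teams agree on a type like this, both sides can build in parallel against it, and a change to the contract becomes a visible, reviewable diff.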

As discussed in point #5 (Shifting left), this will also help you uncover potential problems and pitfalls in your system design, and other engineers can help you improve it. If you identify potential issues with your design in this planning phase, you can either redesign it or make an informed decision to proceed anyway. Otherwise, you might need to rewrite the system from scratch when you hit the issue in prod.

Ideally, you should come up with the map together with the team, not on your own. This gives your team a greater sense of ownership of the project because they contributed from the beginning! It’s best to start with a blank canvas with your team and figure it out together. In the past, I used to work on the design alone and let other engineers review it. People ended up not really looking at it when implementing the tasks. Now, I collaborate with everyone on the team to come up with the system design together, and everyone feels like the design is ours instead of mine.

Without a map or plan, the engineers may be writing code aimlessly. You may reach the destination, but probably not in the ideal way. In the worst case, you don’t even reach your goal and need to take a step back, reflect on the route you took, and rework it from the start.

It’s like driving a car to a destination. Of course you can drive around, look at the street signboards, and hope you’ll reach your destination. If you’re driving a short distance, chances are high that you’ll get there successfully. However, what if you’re driving 500km to some new place you’re not familiar with? Do you feel confident that you’ll get there just by looking at the street signboards? I’d prefer to open up Google Maps, see what the whole journey looks like, check for potential traffic jams and roadblocks, and listen to turn-by-turn navigation instructions. That’s more convenient to me.

Bear in mind that this won’t be your final design. It’s impossible to get things right in the first iteration. After you start writing code and uncover cans of worms, feel free to adjust your map accordingly with your team. This map is just to guide you where you’re going. When you hit a blocker, you may want to regroup with your team and discuss a better design.

Don’t mistake motion for progress

Sometimes, engineers are just eager to get their hands dirty and write code while neglecting how that code could end up affecting the system as a whole.

Furthermore, without a plan, everyone on the team might have a different mental image of what the project and software look like as a whole. This is very bad for the team because the code written by each engineer will look very inconsistent with the others’. One engineer might follow the Clean Code approach while another uses a completely different structure. Combined, this results in a codebase that is unpleasant to work with.

Again, this may not be applicable to all scenarios. If you’re working at a startup, on R&D projects, or exploring totally new areas that you’re not familiar with, you probably don’t have to come up with a very clear map in the beginning.

Collaboration

4. Communicate on risks early

As soon as you find a new risk or blocker, raise it with your stakeholders. Do not wait until it’s too late, because that makes it difficult for others to help you. If you raise the flag early, the stakeholders might be able to de-scope some of the requirements. If you’re lucky, other teams might be able to lend you some help to speed up the work. You might also be able to negotiate a deadline extension.

Great engineers underpromise and over-deliver

Pro tip: If you’re seeking clarification from other people and it’s difficult to describe in words, you might want to record your screen and explain it visually instead. This is really helpful for async discussions and for keeping evidence.

5. Shifting left

It’s cheaper to fix an issue early in the cycle.

Imagine that you write thousands of lines of code and push them to prod, only to find that the code is not working as expected. Or worse, it’s causing an intermittent bug or performance issue that you’re unaware of. How would you feel? You have invested so many hours writing the code, writing unit tests & integration tests, getting a review from your colleague, deploying to staging and deploying to prod, just to find that the PR needs to be rolled back. Not to mention the damage and opportunity cost caused by downtime or by the wrong results users see from the code you’ve shipped. I would feel tired and sad, but of course I’d learn a lot of lessons from it.

Now imagine that you’re writing code, writing unit tests, and then you submit your PR for review. Right after that, someone points out that you can fix the correctness of the result and improve the performance of your code by using a certain technique. You can now return to your code, fix it, and test it again. After that, you ship your code to production and it performs really well. How would you feel? I’d feel very humbled and would learn a lot of lessons as well. This feedback loop is much faster (which means cheaper) than the one in the previous paragraph.

This is a process that you need to encourage in your team. It won’t work if only one person is doing it. Everyone must participate and pay extra attention to each step.

In the past, I was eager to ship my code to prod and did not test it thoroughly. As a result, I caused some trouble in staging and QA testing (thankfully it did not go to prod). It slowed me down and also slowed down the QA engineers because I was taking up their time to test my badly written code.

The lesson here is to always try to catch issues as early as possible. Read your code again before requesting a review for your PR. Quickly test your code manually & run all unit/integration tests before committing your code to avoid causing regressions. This will save face for all of us someday :)

6. Making tradeoffs

This isn’t an ideal world. You’ll need to compromise something in order to achieve something.

Negotiate and think about what’s best for the team.

In this project, we made several tradeoff calls for the team’s benefit, but we agreed to never compromise on our code quality. Refer to point #9.

In the past, I did not do this well. I tried to make the project perfect and avoided making tradeoff calls. Sure, we can have 100% unit test coverage, complete integration test suites, a bug-free implementation, highly optimized code, and auto-scaling compute, but should we do all of them? In reality, time pressure limits your moves and you can’t choose to do all of them. If you try to do them all, you might squeeze your team, add risk to the project deadline, and burn people out. In my experience, finding balance is the hardest part.

Another dimension to watch out for is the financial budget. Is the way you want to implement it financially sound for the company? For example, do you really need an overprovisioned Kafka cluster to support your system, or can you just use managed AWS EventBridge to handle the load? Do you really need multi-region Kubernetes clusters with tens of always-up nodes, or can you live with serverless container offerings like Google Cloud Run? Do you really need a multi-region, multi-master AWS Aurora DB, or can you live with AWS RDS or even SQLite?

It takes wisdom to make this call. And sometimes, you might make the wrong call, and that’s okay as long as you’re aware of the risks. Consult the other team members or reach out to other teams if you need help & clarification on your decision. You’ll never learn unless you make a wrong decision here ;)

There is no right answer, only wrong answers

Execution

7. Delegate tasks

When I was leading this project, I was so busy NOT writing code. I was busy preparing tasks for the next sprint, reviewing PRs, pairing with team members to unblock them, seeking clarification from stakeholders, writing documents, attending meetings, etc. This left me no time to actually write code.

Most of the code was written by the engineers in the team and not me. Make it easy for people to contribute to the project. This actually empowers the team members as well.

Trust your team. If you find anyone on the team struggling, offer to help them. Refer to point #10 (Pair programming).

8. Leverage the expertise of the team

Generally, people are keen to contribute to projects. As the project leader, celebrate their ideas and help bring those ideas into the project.

This is especially true for me because I’m more familiar with a different programming language from the one I currently use at the company. I came from a Python background, but now I work with NodeJS. I consider my general programming & system design knowledge good, but my NodeJS knowledge specifically is still pretty basic. Someone else on the team might have more experience with the tech stack than I do, and their suggestions are valuable to the project.

Create opportunities for everyone. Let everyone on the team participate and contribute. Again, this gives the team members a greater sense of ownership of the project/system.

While we were working on the project, one of the engineers suggested a way to improve the code by removing one extra regex execution. Considering this piece of code sits in the hot path of the stack and receives millions of requests per day, this benefits us greatly in terms of efficiency (less CPU power is needed). He also suggested ways to improve the typing in the codebase for more safety.
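The actual change is not shown here, so the following is only a sketch of what removing an extra regex execution can look like. The pattern and function names are hypothetical. A common form of this problem is calling `test()` and then `exec()` with the same pattern on the same input, which runs the regex twice for every match.

```typescript
// Hypothetical route pattern; the real code from the project is not shown in this post.
const USER_PATH = /^\/users\/(\d+)$/;

// Before: the regex runs twice for every matching request.
function extractUserIdTwice(path: string): string | null {
  if (!USER_PATH.test(path)) {
    return null;
  }
  const match = USER_PATH.exec(path);
  return match?.[1] ?? null;
}

// After: a single exec() both checks for a match and captures the group.
function extractUserIdOnce(path: string): string | null {
  const match = USER_PATH.exec(path);
  return match?.[1] ?? null;
}
```

Both functions return the same result, so a change like this is easy to verify with existing unit tests. The saving per call is small, and it only adds up because the code sits in a hot path.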

In the past, I used to think that leading a project meant you must have the best technical knowledge on the team. Now, I think the project lead must have enough technical knowledge and let other members of the team put forward their ideas.

This is partly inspired by the Oppenheimer movie. Oppenheimer himself was a great physicist. He gathered great minds from across the US to work on a common goal: building the greatest nuclear bomb. The Manhattan Project wouldn’t have been successful if the experts in each field had not collaborated well together.

9. Be very picky when reviewing PRs

Code is read far more often than it is modified. If harmful code goes to prod, you’ll slow down the team’s velocity for the next 3 months. It will make modifications difficult for the team.

It slows the entire team down from delivering new features because no one understands that code anymore. Making a simple change to the code requires a lot of mental gymnastics. Imagine that turning blue buttons red requires hundreds of lines of code changes because your code is so convoluted. The past PRs that made the buttons blue were written poorly but were accepted into prod. Now a lot of poorly written code and tech debt has accumulated in the codebase, and a seemingly trivial change becomes cumbersome work. This is the kind of situation we want to avoid.

“Software engineering is programming integrated over time” - The Flamingo book

It’s easy to write code as if you’re going to use it for maybe the next 6 months or one year and you’re the only maintainer of the codebase. But what if the code you write today is going to live for the next 5 to 10 years and be modified by tens or hundreds of developers who come and go? “Simplicity is not a simple thing”.

It’s easy to identify a correctness regression you introduced in your code. Compare the results from your service before and after the deployment and you’ll see the difference. You can also prevent this with unit tests & integration tests. It’s also easy to measure a performance regression introduced by your code. Compare how the service performs before and after your deployment, using either load tests or server metrics. Some programming languages, like Go, even have built-in benchmark testing. But what about spaghetti code? How do you know if your poorly written & structured code affects engineers’ productivity? AFAIK, there is no easy way to measure this.

I reviewed over 60 PRs throughout this project, and of course others reviewed a fair share as well. I once took around six days to review a big PR with over 1,500 lines of code changes. It required my full attention and focus. The PR proposed a significant change to the system architecture that made the components handle more than they should. After reviewing it, I suggested another way of achieving the same thing, but with a better separation of concerns. After one day, the original author raised another PR with much cleaner code!

10. Pair programming

Help the team members grow and succeed together. If someone is stuck for many hours or days, it’s clear that they need help and someone has to step in.

If you find someone struggling or blocked, give them a helping hand and pair with them. I usually block out around 30 minutes to 1 hour every time someone raises a problem that requires my attention.

It’s super important to speak in a way that makes people more confident during the session. Never ever look down on people because, remember, you were once a junior as well!

I find pair programming to be a really useful way for the team to grow, as everyone learns from each other. There’s no better way to learn.

Honorable mention

I’m forever grateful to my Engineering Manager for the trust, guidance, and support given to me & the team. Thanks also to the secondary project driver, who helped me a lot in driving this project.

Conclusion

To summarize, my role in leading the project is to:

  • Collaborate & get alignment with stakeholders on the project requirements & direction
  • Collaborate & align with the team on the project direction
  • Prepare & orchestrate the work
  • Encourage engineers to collaborate (make people talk to each other)
  • Review the artifacts (mostly PRs, sometimes documents) to keep them aligned with the project goal

For now, this works great for me. However, for future projects, I might need to change my style here and there depending on the team’s dynamics and maturity. In my opinion, it’s important to adapt to the situation and not stick to the rules blindly. Not all teams and projects are equal. Even if you don’t change teams, the same team will feel different in 3 months (hopefully because the team is getting more mature).

Nevertheless, seeing how an ambiguous objective translates to requirements, then little by little into code that is shipped to prod by the team and used by millions of users in the region, is the most satisfying feeling✨ I’ve had in a long time. This is a relatively small project compared to what other teams are doing, but it is huge for me.

I’m still learning.

© Fadhil Yaacob 2024