How 8 New York Software Engineers Solved Their Biggest Technical Challenges

Tech professionals often face challenges in their work, but how can those challenges become opportunities for growth?

Written by Rachael Millanta
Published on Mar. 17, 2022

For software engineers, dealing with technical challenges is often a juggling act. Whether it’s migrating to a new tool, building a platform at scale or any number of other intricate tasks, tech professionals expect to come up against a variety of roadblocks in their daily duties. But how can these challenges be turned into opportunities for professional growth?

Some top engineers believe that a change in mentality is of prime importance.

David Michael, a senior software engineer at Coinbase, believes that moving beyond the quick-fix mentality was an important step in his career journey. “We almost gave up many times,” he said. “The issues were so vexing, so diffuse and persistent that it seemed like an intractable problem. In the end, the solution was not just one fix, but a number of small fixes at different layers that added up to the solution we were after.”

Guayo Mena, a principal engineer at Parsley Health, added that when overcoming technical challenges, working as a team player can be a challenge in itself. “With such a big team, we knew we would have to be very organized if we wanted everyone to work continuously without getting blocked,” he said. “I was expecting that three times the people would mean three times the productivity, but I quickly realized that we would also have three times the planning, coordination, code reviews and answering questions.”

It’s clear that when it comes to technical challenges, software engineers often have a lot of balls in the air. Built In NYC sat down with eight industry leaders to talk about the biggest challenges they’ve faced in their work, how they overcame them and what new career skills they gained in the process.

 

Omar Andrade
Senior Software Engineer • Northwestern Mutual

 

What’s the biggest technical challenge you’ve faced recently in your work? What made this particular challenge so tricky?

As part of the migration to a new collaboration tool, we had an opportunity to improve security and compliance by automating the access process for users. This project required us to choose a data source that would best meet timelines, known patterns and enterprise architecture standards. At large companies like Northwestern Mutual, it’s not unusual to have multiple sources for the same data, each having pros and cons. In this continuous journey of modernization, we knew we had to make some early tradeoffs in selecting a data source to reach our initial target state.

 

How did you and your team overcome this challenge in the end? What were some of the specific tools or technologies you used?

As an internal engineering consultant, I have to be familiar with both legacy and modern tool sets to build the best solutions. In this case, I partnered with the migration team to understand the business objectives, overall migration timeline and their ability to support an automated access process. We considered all of this information along with technical feasibility to land on a batch service run in our Kubernetes environment written in JavaScript using Node.js. While we had to make compromises, this solution best reconciled a wish list of wants with what was possible. We were able to hit the migration timeline with lower risks and connect with teams that will be future partners in delivering the ideal target state after the initial migration is complete.

While we had to make compromises, this solution best reconciled a wish list of wants with what was possible.”

 

How did this challenge help you grow as an engineer or strengthen a specific skill?

During large efforts like this that span different teams and technologies, it’s not always easy to identify who to talk to, what service to use or how to get something done. Making new connections with people who could help with this in the future was a big gain for me. This effort also strengthened my software design and execution skills by providing experience reconciling an aggregate of ideal state requirements with what’s technically feasible given timelines, known patterns and enterprise standards.

 

 


 

David Michael
Senior Software Engineer • Coinbase

 

What’s the biggest technical challenge you’ve faced recently in your work? What made this particular challenge so tricky?

Protocol engineering involves facing big technical challenges every day. In fact, the whole practice of protocol engineering is relatively new and requires in-depth knowledge of blockchain protocols, cloud operations, networking and programming with a sprinkling of human communication. It’s like SRE, DevOps, blockchain engineering and software development all rolled into one. Developing this practice at Coinbase is without a doubt the biggest and most rewarding challenge we have faced.

One of our most recent challenges concerned a group of community validators on the Celo protocol. For months, we were having unacceptably low signature rates, sometimes as low as 60 percent signature success. Most of the time this low signature rate was not affecting validator scores because these signature misses were not sequential, just rocky. However, several of the validators began suffering score hits resulting in loss of potential rewards. A performance condition that initially was just not up to our own standards of excellence was now beginning to take a bite out of our delegators’ rewards.

A performance condition that initially was just not up to our own standards of excellence was now beginning to take a bite out of our delegators’ rewards.”

 

How did you and your team overcome this challenge in the end? What were some of the specific tools or technologies you used?

Our first naive approach was to throw more hardware at the problem, assuming the nodes were underpowered. This made no difference in validator behavior, so we set out to be more methodical in our approach. After a discussion with the community, we discovered that a recent change in the consensus engine was discarding up to one-third of valid signatures as an optimization. The core protocol team quickly released a patch, which helped, but our validators still hovered at around a 75 percent signing rate. There was clearly something else going on.

Our next stop on the checklist was peer ingress and node identity, core requirements for healthy participation. After a quick netcat we confirmed that our firewalls were restricting access to the node’s p2p port. We patched our firewall and confirmed our validators now had ingress. Finally, we checked the validator’s enode id and found that it could not resolve its external IP address. The solution thankfully was fairly straightforward — we had to tell the node its external IP.

Only after correcting the core consensus code, realigning our port mappings and resolving node identity issues were our validators back to a near 100 percent signature rate.
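The port-ingress check the team ran with netcat can be reproduced in a few lines of Python. This is a hedged sketch, not Coinbase's tooling: the function name is illustrative, and while geth-style nodes commonly expose p2p on port 30303, your node's own configuration is authoritative.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Equivalent in spirit to `nc -z host port`: a False result for a
    validator's p2p port suggests a firewall or port-mapping problem.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If the port turns out to be reachable but peers still cannot dial the node, the next suspect, as in the Coinbase case, is node identity, for example a node advertising an unresolvable external IP in its enode record.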

 

How did this challenge help you grow as an engineer or strengthen a specific skill?

Overcoming this challenge required collaboration with Celo’s protocol team, digging into network connectivity issues and ultimately booting nodes with a modified configuration. There were issues converging in almost every layer of protocol operations from core protocol to our own infrastructure and the individual nodes.

What I love about this effort is that it includes all the elements of how diverse the role of a protocol engineer can be. It involves knowledge of a protocol and its consensus mechanisms, human communications with the protocol’s core development team and community, an understanding of how p2p works in general and an understanding of our orchestration platform. Protocol engineering is the ambiguous problem space we find ourselves in every day.

As I said, we almost gave up many times. But now, finding all the little solutions beyond the quick-fix mentality seems prototypical of the work that we do.

 

 

Avalon McRae
Senior Software Engineer • Formation Bio

 

What’s the biggest technical challenge you’ve faced recently in your work? What made this particular challenge so tricky?

TrialSpark’s tech team recently hit a size and scale where we want to start breaking up our monolith and reducing coupling across our modules. We realized that some of our biggest pain points weren’t necessarily due to a shared repository, but were caused by poor separation of domains within the repository. 

At the same time, we started a project to introduce automated updates for a subject’s status. This involved both adding a sizable new domain to the code and also interacting with existing domains across the code base, since statuses need to be referenced in many places and the automated updates had a variety of sources. In some cases, our data managers configure specific fields that should propagate a disposition update. In other cases, a clinical research coordinator manually discontinues a subject if they withdraw consent. We wanted to create better domain separation as part of this new feature, but also find the right balance of refactoring and delivering in a timely manner when leveraging our existing services. This was tricky because building subject disposition in clinical trials is a known difficult problem to solve in the industry with no status quo for how to handle it.

Building subject disposition in clinical trials is a known difficult problem to solve in the industry with no status quo for how to handle it.”

 

How did you and your team overcome this challenge in the end? What were some of the specific tools or technologies you used?

After identifying some of the key domains that we wanted to split and how we should draw those lines in our code base, we also introduced Blinker, a Python library for broadcast signaling via events and listeners. This allowed us to start introducing more service layer separation within the monorepo and helped us to decide where it made sense to draw the lines within our domain — specifically, which services should reference each other directly and which areas should be able to listen for signals without being aware of what’s dispatching them. This helped the team start to plan for a more decoupled, event-driven system with a very low-lift approach, enabling us to more easily split these modules into their own services going forward. By the end of the project, we had successfully leveraged Blinker to encapsulate our disposition updates within their own service, as well as created a collection of generic signals that we could reuse in other areas of the code base.

 

How did this challenge help you grow as an engineer or strengthen a specific skill?

Identifying the right way to split a domain and which services should access each other is often a challenge in many code bases. Once the boundaries start to blur, it can be much easier to follow existing patterns in the code base rather than restructure the flow of control with each addition. Even after working in this code base for almost two years, introducing Blinker and starting to break code out based on how it should be split, rather than how it’s currently split, helped me learn even more about the complexities of our domain and allowed me to build stronger opinions about how our services should be split and why. This has helped me write cleaner code, provide better PR reviews and start carving out a path for the team on how we can begin splitting our services and modules better.

 

 


 

Emily Obaditch
Engineering Lead • Flowcode®

 

What’s the biggest technical challenge you’ve faced recently in your work? What made this particular challenge so tricky?

When I started at Flowcode, one of our now core products, Flowpage, was in its earliest stages with only a few thousand lines of code. For the past two years, my team and I have grown Flowpage to what it is today, but not without some bumps. One particular challenge that has persisted throughout this project is efficient state management on the frontend.

After migrating our state management tool from Apollo Client to Redux Toolkit — a prerequisite for merging our frontend code bases into a monorepo — we were noticing performance issues due to a large amount of data being held in Redux. By refactoring the store structure and trimming it down to a leaner object, we shaved seven seconds off network requests and noticed much smoother page navigation. State management can be a tricky and nuanced task, but doing it incorrectly will result in a clunky store or passing data to components that do not even need it. Doing it right will mean a noticeably more seamless user experience.

 

How did you and your team overcome this challenge in the end? What were some of the specific tools or technologies you used?

Our state management has gone through a few iterations and refactors, and like most code written, it will continue to be improved and iterated on. We are using Redux Toolkit, which has given us a lightweight framework to take advantage of Redux while writing simpler code faster. While we’re not at the end of the road to a perfect application state, we’ve come a long way.

Right now, my team and I are continuing this effort of state management improvement by removing some of our current middleware and adding in RTK Query to support optimistic UI updates and automated state caching. Migrating from one tool to another always brings its challenges but I am enjoying navigating a new landscape of state management tools and am looking forward to the positive impact our users will feel!

While we’re not at the end of the road to a perfect application state, we’ve come a long way.”

 

How did this challenge help you grow as an engineer or strengthen a specific skill?

As a frontend-leaning engineer, I love to make things look beautiful, but can sometimes overlook the foundation beneath a good-looking page. Being constantly aware of how efficient state management can improve user experience — often more so than a flashy new animation or layout — is a must for a solid frontend developer. It may not be the prettiest thing to work on, but I’ve learned that proper state management is the backbone to a successful frontend application.

Refactors and migrations can be daunting, and it’s often hard to get buy-in from key stakeholders, but it was awesome to see this behind-the-scenes work pay off in a more seamless user experience. As an engineer, I’ve grown in my ability to communicate how tech-focused projects, as opposed to product-focused ones, can often be the solution needed to move the needle on a key metric.

 

 

Will Burdett
Senior Data Engineer

 

What’s the biggest technical challenge you’ve faced recently in your work? What made this particular challenge so tricky?

The biggest technical challenge we’ve faced has been getting a Tableau server running in AWS so that we can confirm whether it is a tool we will need to support. The challenge was amplified because I was learning so much about how our infrastructure is set up and the limitations of different approaches. I am not usually an infrastructure person, though I have some experience there, and I thought this would be a great task for learning a lot while helping the team.

I was learning so much about how our infrastructure is set up and the limitations of different approaches.”

 

How did you and your team overcome this challenge in the end? What were some of the specific tools or technologies you used?

The biggest help in overcoming this challenge was taking a step back and finding a more direct path outside of our tooling. We decided to bypass Terraform because of the complexity of getting a Tableau server running for a proof of concept. This allowed us to get it into the hands of the people who wanted to test the software more quickly.

 

How did this challenge help you grow as an engineer or strengthen a specific skill?

The skill I gained here was an understanding of how Terraform works. I had never used it before, but given that Medly relies heavily on it, this task was important in preparing me to use it in the future.

 

 

Vishal Salunkhe
Enterprise Solutions Architect • MarketAxess

 

What’s the biggest technical challenge you’ve faced recently in your work? What made this particular challenge so tricky?

MarketAxess has strived to be the market leader in corporate bond trading since its inception. Year-over-year business growth underlined what we already knew was overdue: a technical overhaul of our trading platform to support increasing trading volume with the least possible end-to-end latency. There were many things on the wish list, such as ensuring that the new trading platform was scalable, fault tolerant, distributed and microservices-based. It also had to be cloud-native and cloud-agnostic, meaning the platform could be deployed on any major cloud service provider with minimal change. But the biggest and trickiest challenge was breaking the monolith into independent services, one at a time, and seamlessly integrating the new trading platform with the existing one at various junction points to ensure continuity of business, all while developing, testing and deploying the new platform and its services. One of our senior architects used the analogy of replacing a plane’s engines while it’s still flying; that’s what we were doing.

 

How did you and your team overcome this challenge in the end? What were some of the specific tools or technologies you used?

The existing platform was the cumulative work of hundreds of engineers over the last 22 years, so the required knowledge and expertise were distributed and diverse, demanding extensive collaboration across teams. The team took a bottom-up approach to map and build a blueprint of the existing platform. Data domains were identified, and microservices were built around those domains using REST, gRPC, Protobuf and Redis as the data store to serve the data required by the workflow engine. This allowed the team to isolate and focus on building the workflow engine itself. We chose the actor pattern, implemented with the Akka toolkit, to build a fault-tolerant, highly concurrent and elastic workflow engine with built-in backpressure. Kafka was our messaging platform of choice due to the guarantees it provides, such as scalability and durability. We used Protobuf to build the DTO model for efficient transfer of data across the wire. A suite of services and an anti-corruption layer were built for communicating with the existing services and platform, keeping inbound and outbound event contracts the same. This allowed us to gradually migrate the business onto the new platform.
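The team's engine runs on Akka in the JVM ecosystem; purely as an illustration of the actor pattern with built-in backpressure (not the actual MarketAxess code), the core idea fits in a few lines of Python: one thread per actor, messages processed sequentially from a bounded mailbox.

```python
import queue
import threading


class Actor:
    """Toy actor: a single worker thread draining a bounded mailbox.

    A full mailbox makes `tell` block, which is the simplest form of
    backpressure: fast producers are slowed to the consumer's pace.
    """

    _STOP = object()  # sentinel that shuts the actor down

    def __init__(self, handler, mailbox_size: int = 16):
        self._handler = handler
        self._mailbox = queue.Queue(maxsize=mailbox_size)
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message) -> None:
        self._mailbox.put(message)  # blocks when the mailbox is full

    def stop(self) -> None:
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self) -> None:
        while True:
            msg = self._mailbox.get()
            if msg is self._STOP:
                return
            self._handler(msg)  # one message at a time, in arrival order
```

Akka layers supervision, routing and location transparency on top of this; the sketch only shows the mailbox-per-actor, one-message-at-a-time model that makes the pattern attractive for a concurrent workflow engine.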

The knowledge and expertise required was very distributed and diverse with extensive collaboration across teams.”

 

How did this challenge help you grow as an engineer or strengthen a specific skill?

This technical challenge helped me grow as an engineer by allowing me to work on different approaches to modernizing the existing platform, such as blue-green deployment and A/B testing. These are different from pure green field projects where there are no constraints coming from the existing system. The experience also instilled in me the importance of collaboration at various levels with other teams and people who are domain and subject matter experts.

 

 


 

Elisha Greenwald
Tech Lead, Developer Tools • TPT (formerly Teachers Pay Teachers)

 

What’s the biggest technical challenge you’ve faced recently in your work? What made this particular challenge so tricky?

I joined Teachers Pay Teachers (TpT) in June 2021 as a member of the developer tools platform team. The team was facing challenges supporting an array of internally built custom tooling. In the past, the custom internal tooling had been helpful because the tools were flexible and customizable enough to meet engineers’ demands. The challenge was that TpT is growing, both in engineering headcount and in tech projects. With all that growth, it was becoming difficult to maintain and scale our tooling.

One of the many internal tools our team has developed lets us spin up ephemeral, on-demand environments (ODEs) for local testing and CI integration tests. The ODEs are useful for many reasons, but with so many services being added to them, they were getting difficult to scale and support. Additionally, our fleet of integration tests that run against the ODEs was becoming flaky. Intermittent failures and retries became the norm. The flakiness was getting frustrating, and engineers started to ignore the results.

With all the growth, it was becoming difficult to maintain and scale our tooling.”

 

How did you and your team overcome this challenge in the end? What were some of the specific tools or technologies you used?

The team had to carefully consider our options due to the wide usage of the tools. One option was to double down on our custom tooling. In addition to the scalability issues, we were also concerned about the difficulty of learning our internal custom tools and wanted to make it easy for new engineers to learn our stack. We tried enhancing internal documentation and providing helpful user feedback, but even with the robust documentation, we felt it would be simpler and ultimately easier to maintain if we went with off-the-shelf solutions.

We’re still planning to support a subset of our custom tooling for the tools that are most impactful, but we’ll consider SaaS solutions for other tools. For SaaS solutions, we’re currently evaluating license options for a new deployment system as well as for CI capabilities. Although we’ll still have to write some glue code for our integrations, the team is trying to offload the maintenance and support to companies that specialize in these tools.

 

How did this challenge help you grow as an engineer or strengthen a specific skill?

There are tradeoffs with any approach to a decision like this. One of TpT’s core company values is to chart a course and iterate, which was ultimately our approach here. It would be great to do a big-bang, splashy approach and replace everything at once, but past experience shows that small, incremental change is the safer and sounder approach.

Our goal is to get to an end state where we’re in much better shape than we were originally. We’re pursuing the path of least breakage for our engineers and optimizing for easy adoption. Our plan right now is to replace our current functionality as it is and slowly improve our process. For me, it has been a growth experience to go through the process of getting budget approval, doing a vendor evaluation and, hopefully very soon, implementation!

 

 

Guayo Mena
Principal Engineer • Parsley Health

 

What’s the biggest technical challenge you’ve faced recently in your work? What made this particular challenge so tricky?

We recently had to implement a redesign for our whole website and it was considerably different from the design we had at the time. This meant we had to build a new UI library almost from scratch. There was very little we could repurpose and the deadline was very aggressive — we had about six weeks.

 

How did you and your team overcome this challenge in the end? What were some of the specific tools or technologies you used?

The first thing we did was bring more talent to the team. We borrowed some people from other teams and hired some new folks as well to put together around 15 people to work on this project. With such a big team, we knew we would have to be very organized if we wanted everyone to work continuously without getting blocked.

We had to split a big problem into smaller problems. We decided to use atomic design, a methodology created by Brad Frost. In summary, we looked for patterns that repeated often in the design and created very simple React components from them, called atoms. Next, we started combining those components into more complex ones called molecules and organisms. We had user stories for all atoms, molecules and organisms, and we prioritized the atoms that appeared in the design most often. We put all of those stories in a single backlog, like a stack of cards, so that when anyone on the team finished a task, they could simply grab the next one from the top of the backlog. We constantly had to update the backlog to make sure the tickets at the top were not blocked by other work in progress.

I was put in a position where it was required of me, more than ever, to be a team player.”

 

How did this challenge help you grow as an engineer or strengthen a specific skill?

I learned how to work with a larger team. I have to admit that initially I was expecting that three times the people would mean three times the productivity, but I quickly realized that we would also have three times the planning, coordination, code reviews and answering questions. Most importantly, I also needed to rely more on my teammates. I could not review every single PR or answer every single question. I was put in a position where it was required of me, more than ever, to be a team player.