
Improving Diversity in Data Through an Effective Tech Strategy

Nas Radev
March 21, 2023
Read: 9 min

In this blog post, I detail my belief that choosing an effective tech strategy can greatly improve our ability to hire diverse talent and our chances of delivering lasting business benefits to an organisation.

Key takeaways for the reader are:

  1. Modern tech has removed the need for a STEM degree to be an effective technologist. This vastly expands the talent pool, especially in data.
  2. Modern tech makes hard skills easier. Soft skills are what can make or break an organisation’s culture and ability to derive business value from data. This is because digital transformation and becoming more data-driven are about culture and process more than technology.
  3. Tech decision-makers and engineering managers must consciously choose not to create inflated barriers to entry to their technology estates. ‘Build over buy’ should be minimised: ‘build’ means you need more specialised skills, which creates a barrier to entry. A happy medium between ‘build’ and ‘buy’ can be struck.
  4. The above is especially applicable in the field of data. This is because data is at the intersection of business and technology and is a field that is wide open to fresh talent with the right soft skills.
  5. Customer support specialists and other professionals with highly developed people skills can make for excellent hiring pools. This increases diversity and brings much-needed soft skills.

STEM degrees in the DEI context

Solving the lack of diversity in data & technology is not simply about getting more people from underrepresented groups into STEM degrees.

We often talk about the necessity to encourage more young women and people from other underrepresented groups to do a STEM degree in order to bring more diversity to technology. This is of course correct for many deeply technical disciplines.

However, I argue that modern tech work is becoming less and less dependent on deep technical skills, and I believe this unlocks a huge opportunity to diversify our field.

'Hard skills' are getting easier

This interactive visualisation shows more than 1,500 modern tools currently active in the field of data. What becomes apparent after analysing some of the more niche ones is that there is fantastic tech out there. It abstracts away and automates many of the really tricky parts of working with data: ingestion, transformation, quality control, data warehousing, streaming and creating interactive dashboards.

Even working on data science problems is getting easier. Things that used to take a small army of engineers can now be done, with equal success, in a couple of clicks.

This is partly due to the drive over the last decade to enable more self-service. Many powerful tools are tailored to the needs of the business user more so than the software developer. The market has realised that techies are a scarce resource and has been self-correcting to make non-techies more capable.

The big positive of these self-service developments is that, even as our field evolves and becomes more complex, hard skills are actually getting easier. Yes, we still need deeply technical experts to create tools for the rest of us. But the vast majority of data professionals can then benefit from all the best practices distilled into these tools, using them to create business value more rapidly and with less focus on the engineering. It means we need less of the traditional software engineering skill set.

This is great news for diversity, especially in a world where only 18% of new Computer Science graduates are women.

People and technology

The thing I love the most about the field of data is that it sits at the intersection of technology and business (meaning any formal organisation, both for-profit and otherwise). It has the potential to impact nearly every single one of them. On the other hand, I struggle deeply with the fact that we still haven't quite figured out how to consistently generate tangible and obvious business/organisational value from it. It is my belief that this is largely due to the lack of diversity in our field.

Allow me to elaborate.

When an organisation starts on a journey of digital transformation and tries to deliver more value from its data, it will tend to overemphasise technical skills such as data architecture, data engineering, data science and business intelligence. Digital transformation, and data transformation in particular, is typically positioned in the domain of technology.

However, transformation does not start with tech. It starts with people. It passes through processes. It then materialises as culture. Somewhere in between, “tech happens” but it is not the main driver.

The times I have seen data transformation programmes work out well are the ones where the right people-people were involved, i.e. people with what we traditionally refer to as soft skills (a term we have been looking to substitute for a more comprehensive one and are open to suggestions). These are the people who, through empathy and curiosity, seek to understand and improve pockets of the business. And the really great ones are usually technologically savvy, but not necessarily engineers per se. They are not doing tech for the sake of doing tech but rather using it as an enabler.


Where do you find tech-savvy “people-people”?

As a matter of fact, we can find them in many fields out there.

Take customer service for example. Customer service professionals get calls from strangers and have only a few seconds to try and figure out how to assist them. This requires a high degree of empathy and analytical skills, and perhaps even a pinch of curiosity. It also requires an understanding of systems and processes. All of these are valuable skills in data too.

In fact, as part of the Talent Accelerator Programme at Infinite Lambda, we parachuted customer service professionals into the midst of a data transformation programme. And we saw them do very well.

They asked a lot of questions and focused on understanding the current systems and processes before rushing to solutions. They became allies to the business folks well before their software engineering counterparts did. And then they used accessible, powerful modern data tech to drive results. This showed us a new way to open doors to diversity in tech.

While women currently hold only 26.7% of tech-related jobs, they hold 69.5% of customer service jobs.

I strongly believe that many people in customer service can have a prosperous career in the field of data, with minimal retraining, because of the way they wield critical skills that help them drive business value.

I have seen this first-hand: the soft skills that are often of secondary importance to many of us in tech are others’ primary strengths. And the hard skills can easily be taught in a short space of time.

Customer service definitely isn’t the only such field, but it helps to illustrate the point, especially in a world where tools like OpenAI’s GPT are already making leaps and bounds in reducing the need for menial work done by humans.

Rigorous training remains vital

In our Talent Accelerator Programme, we hire a lot of people from non-technical backgrounds. Some of our top performers used to be in customer service. Others were full-time mums, sales assistants, finance professionals and people from a myriad of other backgrounds. Now they are all highly capable analytics engineers successfully delivering critical data capabilities for large organisations.

It takes 4 months of intensive training and, needless to say, we pay them full-time salaries so they can focus on it 8 hours per day. We teach them how to use modern data tools and key concepts like data governance, quality control and collaborative development.

We then deploy them on projects for a few months, working under the close supervision of someone more experienced so they can soak in best practices and ways of working.

In 16 months, they are indistinguishable from an early-to-mid-level data professional in terms of hard skills. But their soft skills are, generally, far superior, as demonstrated by the ease with which they grasp business processes and communicate with business stakeholders.

The above is just one example of how we can rapidly bring more diverse talent to our field and yet remain confident that they will bring heaps of value to the organisation.


A commitment to enable

As technologists, it is our responsibility not to create artificial barriers to diversity.

The way we architect and engineer data platforms can create or remove barriers to entry. There is usually more than one way to build a data platform. There is custom software development (e.g. building pipelines and encoding logic in Python, Java or Scala). And there are off-the-shelf tools that range anywhere from delivering the entire data platform end-to-end to highly specialised, modular tools.

Custom development could create barriers

Now, more than ever, it is worth designing a platform in a way that minimises the need for highly specialised skills.

If we build a platform almost entirely from scratch, using a lot of custom development, we may end up relying exclusively on software engineers to be able to make heads or tails of it. The engineering team would need to work on evolving the data sets that the business needs as well as the software aspects of the platform, fixing bugs, reacting to changes and building new functionalities. All while the business is waiting to get numbers and insights.
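To make this concrete, below is a deliberately simplified, hypothetical sketch of what even a modest custom ingestion pipeline can involve. The API endpoint, file name and pagination scheme are placeholders rather than a real system, but the point stands: HTTP, JSON, pagination, retries and failure handling are second nature to a software engineer and a genuine barrier to everyone else.

```python
# Hypothetical sketch of a hand-rolled ingestion pipeline (placeholder endpoint
# and file name). Even this minimal version assumes comfort with HTTP, JSON,
# pagination, retries and error handling before any business value appears.
import json
import time
import urllib.request

API_URL = "https://api.example.com/orders"  # placeholder source system
RAW_FILE = "orders_raw.jsonl"               # placeholder landing file


def fetch_page(url: str, retries: int = 3, backoff: float = 2.0) -> dict:
    """Fetch one page of results, retrying on transient failures."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=30) as response:
                return json.loads(response.read())
        except Exception:
            if attempt == retries - 1:
                raise
            time.sleep(backoff * (attempt + 1))


def run_pipeline() -> None:
    """Walk through the paginated API and land raw records for later transformation."""
    url = API_URL
    with open(RAW_FILE, "w", encoding="utf-8") as sink:
        while url:
            page = fetch_page(url)
            for record in page.get("results", []):
                sink.write(json.dumps(record) + "\n")
            url = page.get("next")  # assumes the API exposes a 'next' link


if __name__ == "__main__":
    run_pipeline()
```

Multiply this by every source system, add orchestration, deployment and monitoring, and it becomes clear why only engineers can maintain such an estate.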

The engineering team eventually becomes a bottleneck, always behind on business requests. The business folks are unable to help themselves because they are not software engineers, and they cannot help their colleagues in tech either, so all they can do is wait. If they choose not to wait, they pick up a pet tool and get things done on their own, which has downstream implications typically referred to as shadow IT.

In this scenario, hiring more people into the data team is challenging. Well-trained engineers are expensive and rare, and they tend to come from a computer science background. In my experience, it is much harder to retrain non-technical people into software engineers. This presents a higher barrier to entry, which is bad for diversity and creates a bottleneck to rapidly transforming and evolving an organisation through data and technology.

The problem with all-in-one solutions

At the other extreme, if we pick a single tool that does almost everything end-to-end, we need anyone who uses the platform or contributes to it to know that tool well. As our data estate matures, we tend to use the tool more and more, in all its idiosyncratic ways. I see a lot of forcing the tool to do things it was never meant to do, building various workarounds in the process. This often creates a big bowl of spaghetti that is not only difficult to unpick but also requires a lot of highly specialised training in this particular tool that is not really transferable to many other tools.

This is a different type of bottleneck. Arguably, such tools should be easy for a beginner to pick up. They are usually point-and-click, with intuitive user interfaces and great documentation on the things they do well (though not on the things they were never designed to do, which are not always immediately obvious).

Sadly, most deployments of such all-in-one tools grow so complex that they almost become a “programming language” in their own right. This leads to the same challenges and barriers to entry as custom software development: it is too niche, too specialised and it makes it harder to let people in.

In both scenarios (heavy custom development and full reliance on a single end-to-end tool), it becomes difficult to bring non-experts in, which creates a problem not just for diversity but also for driving value for the organisation as a whole.

A happy medium

A happy medium does exist. Highly specialised custom development skills and tools are necessary, but only insofar as they enable less specialised skills to be highly impactful.

Here, I refer to software engineering, site reliability engineering, data engineering and deep expertise in sophisticated all-in-one platforms as highly specialised skills. They take years to learn, they are constantly evolving and they can be applied to a plethora of use cases. Such skills are useful when developing advanced processes necessary for the security, scalability or robustness of a platform.

At the same time, modular off-the-shelf tools like Fivetran, Snowflake, ThoughtSpot and Tableau are what I would categorise as less specialised data skills, as they can be picked up quite quickly by non-technical folks. Such tools have a focused feature set that solves one particular problem well and leaves other problems to other tools.

One of the great advantages of these tools is they have a low level of operational complexity. They are usually SaaS tools, available on demand through a web page for any tech-savvy non-expert user to start leveraging after minimal onboarding.

The organisations that get it right are the ones that deploy highly specialised skills sparingly, mostly to create robust guard rails for the less specialised skills to be used freely, effectively and safely.

This includes:

  • Automating policies such as security and data quality;
  • Setting up DevOps and DataOps best practices that allow people to work with and connect modern modular tools as well as to deliver datasets and insights in a collaborative manner.

This scenario enables new talent to jump into data much more quickly and focus on transforming the business and driving value, while a smaller group of people focuses on instrumenting the underlying technologies that enable all of this.
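As a small, hypothetical illustration of what such a guard rail might look like, here is a sketch of an automated data quality check (the file, column names and rules are placeholders). A specialist writes and automates something like this once, for example in CI or on a schedule, so that less specialised colleagues get fast, consistent feedback on the datasets they publish.

```python
# Hypothetical data quality guard rail (placeholder file, columns and rules):
# set up once by a specialist, run automatically on every new dataset.
import sys

import pandas as pd

REQUIRED_COLUMNS = {"order_id", "customer_id", "order_date", "amount"}


def run_checks(path: str) -> list[str]:
    """Return a list of human-readable data quality failures for the given file."""
    df = pd.read_csv(path)
    failures = []

    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        failures.append(f"missing required columns: {sorted(missing)}")
        return failures  # the remaining checks depend on these columns

    if df["order_id"].duplicated().any():
        failures.append("duplicate order_id values found")
    if df["amount"].isnull().any():
        failures.append("null values found in amount")
    if (df["amount"] < 0).any():
        failures.append("negative values found in amount")

    return failures


if __name__ == "__main__":
    problems = run_checks(sys.argv[1] if len(sys.argv) > 1 else "orders.csv")
    for problem in problems:
        print(f"Data quality check failed: {problem}")
    sys.exit(1 if problems else 0)
```

In practice, checks like these often live in the tooling itself (dbt tests are a common example), which is exactly the point: the specialised work is done once, up front, and everyone else builds on top of it safely.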

This is good for diversity and good for the business.

Conclusion

The way we do technology can either open or shut doors to new talent, especially talent from a non-technical background.

In this blog post, I argued that there are pools of talented people with highly valuable soft skills that can be tapped into to add real firepower to any organisation looking to become more data-driven. Furthermore, these talent pools may prove a great source for plugging the diversity gap in tech.

In my next blog post, I will explore various ways of deploying data talent across the organisation in pursuit of optimising the value that can be delivered. This will cover structures such as centralised data teams and domain-oriented data capabilities like the data mesh. I will share my observations of the power of aligning data-savvy people with soft skills within domains, which is a continuation of the key takeaways of this blog post.

Meanwhile, learn more about how we manage culturally diverse teams at Infinite Lambda and read the inspiring story of one of the engineers who joined our Talent Accelerator Programme.
