According to Tim O’Reilly, the open source advocate: "We’re entering a world in which data may be more important than software." More accurately, I would suggest, we are awakening to a world in which data has always been more important than software. A perfect example of this point is the rise of test automation tools as the ‘next big thing’ during the 1990s – with market leaders such as Mercury QuickTest Professional and Rational Robot (now owned by HP and IBM respectively) becoming industry standards.

As more and more vendors have entered the market with new technologies, it has become possible to achieve the benefits of test automation – increased efficiency, reduced costs, greater resource flexibility and so on – across the entire enterprise. Yet few organisations have been able to realise these advantages, despite making the significant initial investment required. Instead, most software development projects are still forced to compromise on quality in order to meet tight deadlines and budgets. This is why it is time to awaken to the importance of data in maximising the value of our testing effort.

More often than not, testing focuses on the “Happy Path” and pays far less attention to negative and non-functional testing. In order to test your application rigorously, you need to be able to provision high-quality, ‘fit for purpose’ data, regardless of which methodologies or tools you use. In short, data matters! Despite this, data continues to be overlooked when project requirements are designed. This has major implications for your ability to deliver quality software to market on time, on budget and within scope. If you don’t understand the data, you can’t understand the business model you are testing.

For instance, when I bought my first car back in the early 1980s, it was possible for almost anyone to look under the bonnet and find the problem. In most cases, as long as you were armed with the requisite Haynes manual, you could see how each component fitted together and therefore understand how the engine worked. It was so simple that thousands could ‘test’ their car on the drive at home.

However, the advent of modern car engines means that this pleasure is no longer afforded to us. Encased in aluminium, engines have become so complex that even mechanics have had to abandon the time-honoured oily rag in favour of expensive diagnostic computers to understand what is going on ‘beneath the covers’. This evolution has forced mechanics to become specialists in certain types of engine, narrowing the pool of people who truly understand any given engine.

In recent years, a similar trend has emerged within the software industry. As organisations grow, merge and diversify, their data and business models develop a labyrinthine complexity that requires users to have specialist knowledge. However, at the same time, business drivers are forcing most organisations to outsource their testing efforts. As a result, fewer testers working on each system truly understand the data they are working with, what data they need and how it flows between systems. Without this knowledge, it is very difficult to rigorously test the business model.

For those of you experienced in traditional development methodologies, consider how many times you have heard comments like:

“OK – you found a defect. What are the steps to reproduce it? What did the data look like?”

“The requirement documentation is ambiguous; do I need to guess how many types of account there are?”

“It worked in Development, it should work in Test!”

“Use production data to test with, it’s all the same anyway!”

“Why do I need to test the same code more than once with different data? If it works – It works!”

“Non-functional testing takes care of itself: we do not need to provide data for the process.”

Imagine how much ambiguity would be removed if we understood the data we were working with:

  • Increased business communication
  • Decreased coding complexity
  • A solid foundation for test cases
  • Increased scope for test coverage
  • Decreased scope creep and increased delivery quality

So, how can we increase the understanding of our business models within these operational constraints? The simple answer would be to ensure that all documentation (data flows, dictionaries etc.) is kept up to date. However, this is easier said than done. Writing and maintaining documentation requires specialist knowledge, which is in increasingly short supply, and it is also the first thing to be dropped in order to meet implementation deadlines. With the speed of change in today’s complex applications, it is perhaps unrealistic to expect documentation to keep pace. Another, more reliable, solution is to profile – sample and analyse – the data during design.

Data profiling – known elsewhere as data sampling or analysis – enables users to fully understand the existing data and data flows, whilst also highlighting exactly what data is needed to cover all the scenarios required to test the application fully. Automating your data profiling allows you to build a full picture of your data relationships and identify which elements are sensitive, eliminating the need to rely on error-prone human factors such as specialist knowledge or documentation.
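To make the idea concrete, the sketch below shows the kind of column-level summary a basic profiling pass produces: inferred types, null rates, distinct counts and a crude sensitivity flag. It is a minimal illustration in Python with pandas, using hypothetical file and column names; commercial profiling tools go much further, analysing relationships and data flows across systems.

```python
# A minimal data-profiling sketch (hypothetical file and column names).
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Summarise each column: type, null rate, distinct values, samples."""
    rows = []
    for col in df.columns:
        series = df[col]
        rows.append({
            "column": col,
            "inferred_type": str(series.dtype),
            "null_pct": round(series.isna().mean() * 100, 1),
            "distinct": series.nunique(dropna=True),
            "sample_values": series.dropna().unique()[:5].tolist(),
            # Crude sensitivity flag based on column name alone (illustrative)
            "possibly_sensitive": any(
                key in col.lower()
                for key in ("name", "email", "dob", "account", "postcode")
            ),
        })
    return pd.DataFrame(rows)

if __name__ == "__main__":
    customers = pd.read_csv("customers.csv")  # hypothetical extract
    print(profile(customers).to_string(index=False))
```

Even a summary this simple surfaces questions – why is a column 40% null, why does a ‘type’ field have twelve distinct values when the requirements mention three – that documentation alone rarely answers.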

Once you fully understand your data, it is easy to analyse the gaps between the data you have and the data you need for high-quality testing, and to fill them using coverage techniques. By enabling your personnel to understand the data and business models, and by provisioning ‘fit for purpose’ data, you improve the quality of testing.
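As a simple illustration of that gap analysis, the sketch below compares the attribute combinations present in an existing test data set against the combinations the test cases require, and reports what is missing. The attribute names and values are hypothetical and stand in for whatever your profiling reveals.

```python
# A minimal coverage-gap sketch: which required combinations of test
# attributes are missing from the existing test data? (Hypothetical values.)
from itertools import product

# Combinations the test cases require: every account type in every status.
account_types = ["current", "savings", "loan"]
statuses = ["active", "dormant", "closed"]
required = set(product(account_types, statuses))

# Combinations actually present in the existing test data.
existing_data = [
    ("current", "active"),
    ("current", "closed"),
    ("savings", "active"),
]
covered = set(existing_data)

missing = sorted(required - covered)
print(f"Coverage: {len(covered)}/{len(required)} combinations")
for account_type, status in missing:
    print(f"  missing: account_type={account_type}, status={status}")
```

The same principle scales up with pairwise and other combinatorial techniques, so that the data you provision is driven by the scenarios you need to cover rather than by whatever happens to exist in production.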

Ray Scott has over 25 years of experience in the IT industry, including 15 years as a project development lead. His newest venture is GT Agile, Grid-Tools’ Agile Consultancy Practice.