Improving Product Data Quality and Spend Analysis through Commodity Coding

A DataFlux White Paper
Prepared by: DataFlux Corporation

Organizations can spend as much as 60 percent of revenue to acquire the goods and services necessary to conduct business. Procurement professionals are being asked to reduce the organization's overall spend, some by as much as 20 percent in a year, while simultaneously improving supplier collaboration. Companies now realize more than ever the effect of procurement strategies on their profitability and viability.

Since all organizations have data on their products, inventory, parts and services (and most organizations have more product data than they have customer data), this information is becoming increasingly important to the overall health of a business. However, poor-quality product data has been causing problems for as long as product data has been collected. The unique challenges in the management of product data can inhibit the search for supply chain optimization, spend management and a more unified view of the enterprise.

The problems with product or item data stem from the structure and conventions of this type of information. Unlike customer data, which has a relatively small set of defined and universal attributes (name, address, email address, phone number), product data is much more complex. For example, the definition and description of a 60-watt light bulb may be completely different from one system to the next within the same company. Just imagine how inconsistent or unreliable product data can be if it arrives from a dozen different suppliers in your trading network.

Organizations are also grappling with the fact that enterprise resource planning (ERP), supply chain management (SCM) and other applications have done little to solve these issues. These applications can encapsulate the processes that drive a business every day, yet they typically have no integrated data quality capabilities to find and eliminate bad data. Moreover, creating additional ERP or SCM applications on top of existing applications (which essentially develops redundant silos of product information) has only complicated an already complex task.

Issues such as duplicate product numbers, obsolete product IDs and inconsistent item descriptions exist across the organization, impacting every level of the operation. An inability to understand the products that are being sold can affect the organization's ability to plan for new products in the future. Similarly, a confused, disparate view of direct and indirect spending can foil the most well-intentioned spend management efforts. The bottom line is that poor-quality product data creates difficulties in controlling the costs of production, the productivity of the company, and the delivery of finished goods. After all, the data within your applications drives every decision, from long-range strategic planning to day-to-day operations.

Addressing the Product Data Quality Problem

Despite the difficulties that product data poses, a number of organizations are taking steps to directly address these data quality issues.
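As a concrete illustration of the problem these organizations face, consider the 60-watt light bulb mentioned above. The following minimal sketch (hypothetical records, not drawn from any real system) shows how the same item can appear under three different descriptions, so that a naive exact-match comparison finds no duplicates at all:

```python
# Hypothetical records showing how one 60-watt bulb might be stored
# by three different systems within the same company.
records = [
    {"part_no": "LB-60W",  "description": "BULB, INCANDESCENT, 60 WATT"},
    {"part_no": "060-BLB", "description": "60w light bulb"},
    {"part_no": "9921",    "description": "Lightbulb 60 W incandescent"},
]

# A naive exact-match comparison on raw descriptions finds no duplicates:
# all three records look like distinct items, though they are one product.
unique_descriptions = {r["description"] for r in records}
print(len(unique_descriptions))  # prints 3, not 1
```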
Given the complex nature of product data, and the lack of standards across (and even within) organizations, companies often look to data quality technology to standardize, validate and verify the integrity of this information. Data quality technology typically grew out of the customer data realm; in fact, data quality software started as a way to cleanse and de-duplicate database marketing and customer relations data. A standard process would include:

- Data analysis: Use data profiling or data discovery to uncover strengths and weaknesses in the data.
- Data improvement: Start to address the known problems in the data through automated standardization, verification, matching, clustering and enrichment practices.
- Data controls: Since new data is always arriving in an organization, apply monitoring techniques to find and flag bad, suspicious or non-compliant information.

This data quality process can provide the foundation for better management of product information. A data discovery effort can quickly determine if there are potential duplicates in the data set or if data lacks standards across systems.

The most vexing problem for product data quality programs is the second phase. How can you improve data without a standard method for organizing, classifying and managing product data? Recently, companies have embraced industry-standard commodity coding systems like UNSPSC (the United Nations Standard Products and Services Code) and eCl@ss to provide a vendor-neutral, objective way of classifying data. A standard code, when applied to a product or inventory item, can be used as a way to reference and sort this data across any application.

For example, within UNSPSC the code 10122101 has the same commodity description and meaning for every organization (in this case, Pig Food). With this code appended to the record, every organization supporting this code can compare prices between various pig food suppliers more effectively. Or, a company can reconcile every product data entry within its applications that carries the 10122101 code and begin to see how much the company is spending on that type of product.

These standards are a way of acknowledging that product data can and will have unique representations within different systems. By providing a single, universal method for classifying that information, however, the data quality problems inherent in product data need not cause significant problems within business processes.

Using Commodity Classification to Improve Spend Analysis

Commodity classification lets organizations group related items at a detailed level, validating comparisons between items within the group. One of the primary hierarchies used in every spend analysis implementation is a commodity classification structure. This structure allows a company to analyze its expenditures based on commodity type, then drill into that type to look at product groups. Table 1 provides an example hierarchy of a UNSPSC code for personal digital assistants (PDAs).

Table 1 - Sample commodity coding structure

UNSPSC Code   Product Classification
43000000      Communications and Computer Equipment and Peripherals
43170000      Hardware and Accessories
43171800      Computers
43171804      Personal Digital Assistants (PDAs)

With the 43171804 code attached to any PDA in the company's data sources, the organization can more readily understand how much is spent on PDAs.
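As Table 1 suggests, each pair of digits in a UNSPSC code narrows the classification by one level (segment, family, class, commodity), so parent codes can be derived by zero-padding a prefix. The following minimal sketch (illustrative Python, not DataFlux functionality, using hypothetical purchase records) shows how items tagged with a commodity code roll up the hierarchy for spend analysis:

```python
from collections import defaultdict

# Titles taken from Table 1 above.
UNSPSC_TITLES = {
    "43000000": "Communications and Computer Equipment and Peripherals",
    "43170000": "Hardware and Accessories",
    "43171800": "Computers",
    "43171804": "Personal Digital Assistants (PDAs)",
}

def unspsc_hierarchy(code):
    """Expand an 8-digit commodity code into its four hierarchy levels
    (segment, family, class, commodity) by zero-padding each prefix."""
    return [code[:n].ljust(8, "0") for n in (2, 4, 6, 8)]

# Hypothetical purchase records already tagged with a commodity code.
purchases = [
    {"item": "PDA model A", "code": "43171804", "spend": 12000.00},
    {"item": "PDA model B", "code": "43171804", "spend": 8500.00},
]

# Roll spend up to every level of the hierarchy for drill-down analysis.
spend_by_level = defaultdict(float)
for p in purchases:
    for level in unspsc_hierarchy(p["code"]):
        spend_by_level[level] += p["spend"]

for code in sorted(spend_by_level):
    print(code, UNSPSC_TITLES.get(code, "(other)"), spend_by_level[code])
```

Running the sketch prints the spend total at every level of the hierarchy, which is exactly the drill-down view that a spend analysis tool builds on.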
From here, a spend analysis tool gives the organization the ability to determine what types of PDAs it is buying, who it is buying from, and how much it is spending with each vendor or for each type of PDA. However, if the company's groups were limited to a more generic category like "Computer Hardware," all computer hardware would be part of the same group; everything from an inkjet printer to a high-end server might fall into the same category.

Today, most commodity classification is done as a service, usually through an offshore services engagement. Corporate information is sent in bulk to these services on a regular basis to have an industry-standard code attached to product records. This approach has several risks:

- Accuracy: The staff working for these services firms does not know the client's business and may not know the correct code for an item.
- Inconsistency: Two people may code the same item in different ways, or an individual can inadvertently apply different codes to the same item.
- Expense: These engagements are typically priced on a per-record basis, creating a costly recurring expense.
- Timeliness: Coding services usually take weeks to complete, which means the organization continues to use questionable data while the data is being analyzed off-site.

By contrast, automated coding by a data quality system is less risky. Rules are built into the system by product specialists or procurement professionals, the people who know the parts and the business. The resulting output from data quality technology is consistent and offers higher degrees of accuracy. It is easy to modify the rules and make the system more intelligent. The process can be run at any time, on any data, as often as the company needs to run it. Answers are available in minutes, not weeks. And given the ability to process these files internally, as often as required, the per-record coding expense is eliminated.

The DataFlux Approach

Since 1997, DataFlux has been a pioneer in developing technology for a variety of data quality problems. Designed for business analysts and data stewards, not just the IT department, DataFlux technology allows the individuals who know what the data should look like to create more usable information. This capability extends into the realm of product data quality through the DataFlux Accelerator for Commodity Coding. DataFlux Accelerators combine software, templates, documentation and business rules to deliver the functionality and processes necessary to solve unique problems with corporate information.

The Accelerator for Commodity Coding helps organizations classify product and service description data according to either the UNSPSC or eCl@ss taxonomies. The foundation of this offering is a match and classification engine that takes product data descriptions and assigns the correct commodity code. Although product description data is often incomplete and non-standardized, DataFlux clears this hurdle by using industry-leading data profiling, data quality and data matching functionality to find the correct classification or the best group of approximate matches. This match process can accommodate misspellings, missing information, redundant data and other anomalies that traditionally hamper the process of accurately classifying product and service description data.
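The paper does not describe the internals of the DataFlux match engine, so the sketch below only illustrates the general idea behind token-based approximate matching: normalize each description into word tokens, then score its overlap against already-classified reference descriptions, so that misspellings or missing words weaken a match rather than destroy it. The reference data and the scoring choice (Jaccard overlap) are assumptions made purely for illustration:

```python
import re

# Hypothetical reference data: normalized descriptions already classified
# to a commodity code (in practice this comes from the reference database).
REFERENCE = [
    ("43171804", "personal digital assistant pda handheld organizer"),
    ("43171800", "desktop computer tower workstation"),
]

def tokens(text):
    """Lowercase a description and split it into word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def classify(description):
    """Return the best-matching code and its 0-1 token-overlap score."""
    query = tokens(description)

    def score(ref):
        ref_tokens = tokens(ref[1])
        return len(query & ref_tokens) / len(query | ref_tokens)

    best = max(REFERENCE, key=score)
    return best[0], score(best)

# Extra words, missing words and odd casing still land on the right code.
print(classify("Handheld PDA organizer, 64MB"))  # ('43171804', ...)
```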
The technology for this Accelerator relies on the DataFlux Data Quality Integration Platform, which supplies the data access, workflow, match engine and classification processes used to assign the correct commodity codes to your data. All of these components working together provide a robust and configurable commodity classification engine. This capability is equally valuable when integrated into batch runs or invoked through real-time web service calls, which make it easy to integrate commodity coding functionality into enterprise applications like SAP or other systems. With this functionality integrated into other applications, you can make standardized product information a core component of the IT infrastructure.

The Steps to Better Product Information

The DataFlux Accelerator for Commodity Coding is an innovative technology built on an easy-to-follow process. The steps in this process provide a blueprint to address product data quality issues, regardless of the originating application.

Prepare and Process Reference Data

The first phase prepares the reference data that will support all subsequent matching efforts. Here, reference data is taxonomy data (either the UNSPSC or eCl@ss registries), as well as any other data that is already classified. This information may need to be cleansed in DataFlux jobs prior to load to ensure successful matches in subsequent steps. Reference data is then processed so that individual words, and information about those words (like frequency distributions), are identified and calculated. Once processed, the reference data is loaded into a pre-defined schema in a relational database system, where it is ready to support the classification process.

Run Classification Process

The classification process takes product and service descriptions and runs them through a match engine that attempts to assign the correct commodity code. One or many possible answers are returned to the user, either in a database table or in some other report format like a text or HTML file. DataFlux technology also has an interface that allows business users to review and confirm matches. Figure 1 shows how a business analyst can link certain attributes to different product types.

Figure 1 - DataFlux technology allows users to review and refine match criteria.

Confirm Matches through a Web Interface

A web server hosts the DataFlux Match Reviewer application. This application reads data from the reference database and displays it in the user's web browser. The Match Reviewer interface provides a number of features that allow a business user to refine the matches of records to industry standards like UNSPSC and eCl@ss. Figure 2 shows an introductory screen, displaying statistics on how many records currently have confirmed matches to a commodity code.

Figure 2 - An introductory screen displays the number of confirmed and unconfirmed matches.

The user can see a description of the item and the commodity code that the software believes is the closest match. This initial match uses a pre-built database of product information that is already classified into industry-standard codes.
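The paper does not specify how the engine decides which groupings are confirmed automatically and which are sent to the Match Reviewer. A common pattern, assumed here purely for illustration, is a confidence threshold: strong matches are confirmed without intervention, and everything else is queued for a business user to accept or correct:

```python
# Hypothetical candidate matches: (item description, proposed code, confidence).
candidates = [
    ("PDA model A 64MB",   "43171804", 0.92),
    ("handheld organizer", "43171804", 0.41),
]

AUTO_CONFIRM_THRESHOLD = 0.85  # assumed cutoff; would be tuned per data set

# Strong matches are confirmed automatically; weak ones go to the review
# queue that a business user works through in the Match Reviewer.
confirmed = [c for c in candidates if c[2] >= AUTO_CONFIRM_THRESHOLD]
review_queue = [c for c in candidates if c[2] < AUTO_CONFIRM_THRESHOLD]

print(len(confirmed), "auto-confirmed;", len(review_queue), "queued for review")
```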
However, an automatic match is only the start of a more personalized program. The Match Reviewer application lets users review any match grouping that has not been automatically confirmed by the match engine. Users can view the hierarchy of the code to see if there is another similar class or family that is applicable to the item. Manually confirmed matches are stored in a database table for further analysis or for use as part of the match process. Results can also be written to other file formats. Figure 3 shows a more detailed view of the Match Reviewer.

Figure 3 - The Match Reviewer allows users to drill down into individual codes to ensure that the right code is selected.

Create Reports

One of the critical elements of any data management effort is to take the results from the engagement and communicate them throughout the organization. This serves two purposes: it gives staff more faith in the data they use to make decisions, and it provides validation that the project is delivering the necessary results for all consumers of the information. The DataFlux Accelerator for Commodity Coding provides the ability to report on various facets of product data following the classification of items to a uniform code. Classification results can be stored in the reference database, stored in another database, sent to flat files, or sent to HTML files.

One of the simplest reports sorts product information by the newly established commodity code. Figure 4 shows a sample HTML report, where the annual spend is displayed for several different parts, providing a quick analysis of spending trends.

Figure 4 - A sample HTML report provides information on the items that most frequently appear in the data source.

Summary

As companies expand into new markets or add new products to their set of offerings, they face a new and varied set of challenges. Companies now operate in a more global competitive landscape, where information on customers, products, suppliers and finances can be the key to good (and, all too often, bad) decisions. The quest to become more competitive through more effective use of data resources is driving organizations to take another look at their data management practices.

At the same time, many companies have invested in various procurement systems in an effort to improve procurement processes. Yet current procurement systems do not meet organizations' most pressing procurement need: to create and implement the best sourcing strategies. To succeed in getting more control over corporate spend, companies need:

- Accurate information about items purchased and supplier quality, timeliness, performance, price and even technological advancement.
- A method to rank suppliers based on the information most important to them.
- Flexible, business-focused strategies designed to deliver continual cost savings.

The DataFlux Accelerator for Commodity Coding provides a vital first step in this process. By assigning industry-standard classification codes to product information, companies can develop a fundamental way to analyze and manage data and realize more intelligence from their product data.