IDEA Tech Tip: Normalizing Vendor Names
Q: Does IDEA provide a way to normalize vendor names? For example, I would like to change AMAZON MKTPLACE PMTS to AMAZON, WAL-MART002 to WAL-MART, and UPS 091238547 to UPS. The database I am working with has at least 500,000 records.
A: If you are using IDEA 10, first look for exact duplicates, such as vendors that have the same phone number, address, email, etc. You remove those from your population and then run the Fuzzy Duplicate feature on the rest. This will cut down on the number of transactions you will need to look at.
The Fuzzy Duplicate feature allows you to play with a tolerance. The default is 0.7, and using a higher number will give you fewer duplicates while a lower number will give you more duplicates. This is probably the best place to start but you will still have to review the results as two vendors can have similar names but be different companies.
Depending on the other information in the file, you might also look for duplicates that have the same address, same phone numbers and same email addresses.
– Brian Element, CPA, CISA, CIDA, CISE, CFE – IDEA Certified Instructor and Financial Advisor at Public Works and Government Services Canada
A: You could clean up the file by creating one field with nested functions or a series of fields using each function separately until you get the desired end result. I will use your example to create a one-equation field.
First right-click in the Vendor Name field and choose Append Field from the pick list. Give the field a name, like Shrtvendname, make it virtual character, and use a length of 50 to make sure you do not truncate a large name.
For your example, the equation @left(@justletters(vendorname),@isini(" ", vendorname)-1) will return: AMAZON, WALMART, UPS. To explain the function, you are going inside Vendor Name and having IDEA look at just the letters. Then starting at the left of the field, pick up letters up to a space, then back up one character to remove the trailing space.
This may not work for all your vendors because you could have vendors like FOREVER21, which is the name of the vendor and losing the 21 will be confusing if you have another vendor named FOREVER YOURS FASHIONS.
Cleaning up a field as you described is not an easy task but through inventive use of functions and possible multiple fields it can be done.
- Steven Luciani, CPA, CGA, CIDA – IDEA Certified Trainer and Electronic Commerce Audit Specialist at Canada Revenue Agency
For more IDEA tech tips or to ask your own IDEA questions, visit IDEAScripting.com.
About Alain Soublière:
Alain Soublière has many years of experience working with computer audit software. He worked in a senior management role as the IDEA Product Manager for many years before becoming Director of Product strategy for CaseWare Analytics and more recently the Chief Product Strategist.
Connect: Alain Soublière