From date format errors to process disruptions: The broader implications of data quality

By Stephen Koch, Global Head of Data Quality, SmartStream

In this blog article, Stephen Koch, Global Head of Data Quality at SmartStream, shares a personal experience with his daughter’s medical insurance claim denial and draws a parallel to the failure of NASA’s Mars Climate Orbiter. The common thread in both situations is the impact of incorrect data format, emphasising the critical role of data quality. Stephen highlights how a seemingly minor discrepancy, such as a date format error, can disrupt processes, underscoring the importance of ensuring accurate data formats and units to prevent errors with potentially serious consequences. This blog talks about the broader implications of data quality in various contexts, particularly in the financial industry, where errors can lead to exposure to risk, miscalculated NAVs for funds, and missed trading opportunities.

For those who don’t live in the U.S., one of the highlights of our medical insurance system is that your insurance company sends you an “explanation of benefits,” or EOB, after you visit a doctor or medical facility. The EOB tells you which procedures the insurance will pay for and which it will not, depending on the restrictions in your policy. Attached to each denial or approval are various codes, along with a brief explanation of what each code means. The EOB also includes a detailed description of your rights and the steps to appeal any denial of coverage.

A week or so ago, I received an EOB from my healthcare insurance company. I usually put them aside to review later but, for some reason, I opened this one immediately. The notice informed me that my daughter’s recent dental checkup had not been paid for. The reason given was that she was not covered under my plan. This could not be right, I thought: I had the insurance card in front of me, and her name was clearly printed on it as being covered.

The plan had recently changed so, thinking that I had not given her dentist the new plan information, I called the dentist’s office and confirmed the details. All was good: they had the right information, so there should have been no problem.

Next stop, I reached out to the insurance company.  The insurance company representative quickly confirmed that my daughter was, indeed, listed on the account and the procedures performed were covered by the plan.  I should owe nothing. The mystery deepened; it looked like this was not a simple fix. There must be another place where the process had been disrupted. I needed to look deeper.

I surmised there were three places where the failure could have occurred. The first was the dentist’s office. Someone could have submitted the claim incorrectly, perhaps with the wrong claim code, so I called the office back. We walked through the claim, and nothing in it seemed likely to have caused a rejection. The rejection reason was that she was “not covered in the plan,” but the plan numbers were all correct. The second place was the insurance company. The representative had already confirmed she was listed on the account, and I confirmed on their website that she was listed on my policy. I searched through the insurance website to see if I could find something that had caused the process to fail.

At the same time, I started to work with the third place, my company’s benefits group. They confirmed everything that I had already learned. There seemed to be no reason for the claim to have failed, but it had. At this point I was as frustrated as could be, and I still had no clue what the real problem was.

All this time working on this issue reminded me that I had an appointment coming up and I had not yet confirmed it. I took some time away from tracking down the EOB problem to call my doctor. The nurse started the conversation, as always, by confirming that she was indeed talking to the right person. “Can you give me your full name and date of birth?” Date of birth. DATE OF BIRTH!  I quickly confirmed my appointment, hung up and opened the insurance website again. I clicked away until I was on the participant page and there it was.

The date in the system was wrong. Her birthday had been entered in European format (DD/MM/YYYY), not U.S. format (MM/DD/YYYY). It turns out the value was entered in Europe, by someone who didn’t realize that the system used the U.S. date format. Birthdate was a critical matching field, and when it did not match, it triggered the rejection. Fixing that one date was enough to get things flowing again.
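To see why a single date can break a matching field, here is a minimal Python sketch. The birthdate, the function names and the matching check are all hypothetical, invented for illustration; they are not taken from the insurer's system. The sketch reads the same string under each convention and shows that the two readings are different days.

```python
from datetime import date, datetime

US_FORMAT = "%m/%d/%Y"        # MM/DD/YYYY
EUROPEAN_FORMAT = "%d/%m/%Y"  # DD/MM/YYYY


def parse_date(text: str, fmt: str) -> date:
    """Parse a date string under an explicitly stated format."""
    return datetime.strptime(text, fmt).date()


def is_ambiguous(text: str) -> bool:
    """True when the string is a valid date in both formats but names two different days."""
    try:
        return parse_date(text, US_FORMAT) != parse_date(text, EUROPEAN_FORMAT)
    except ValueError:
        # Valid in at most one format (for example, 25 cannot be a month).
        return False


# Hypothetical birthdate of 4 July 2010, typed using the European convention.
entered = "04/07/2010"
intended = parse_date(entered, EUROPEAN_FORMAT)  # what the person meant
stored = parse_date(entered, US_FORMAT)          # what a U.S.-format system reads

# A matching rule that compares birthdates now fails.
birthdate_matches = intended == stored
```

Only dates whose day is 12 or lower can be misread this way, which is part of what makes the error hard to spot: many records pass through untouched. Storing dates in an unambiguous form such as ISO 8601 (YYYY-MM-DD) is one common way to avoid the problem.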

In December 1998, NASA launched the Mars Climate Orbiter. After more than nine months in flight, the probe was set to reach Mars and make final preparations to begin its mission. The orbiter was intended to study the Martian climate, including how much water vapor was in the atmosphere. Unfortunately, its mission was cut short by a catastrophic failure. The multimillion-dollar orbiter crashed into the planet and was destroyed.

At first the mission team was unable to find the problem. What system had failed? It appeared that none had. So, what was the reason it had crashed?

Controlling a spacecraft millions of miles away is a tricky thing. Humans are very good at responding to changing conditions on the fly, but responding to changes occurring approximately 34 million miles away is fraught with danger. First, any data sent from the spacecraft takes anywhere from 3 to 5 minutes to reach Earth, depending on the relative position of the planets. Once this data is received, the humans on Earth need to craft a response and send back instructions, which again take 3 to 5 minutes to arrive. Then they need to wait the same amount of time again to find out whether the instructions worked and, if not, what more needs to be done.
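The delay follows directly from the distance and the speed of light. As a rough check, the Python sketch below computes the one-way signal time for the 34 million miles mentioned above. The function name is hypothetical, and the calculation assumes a straight-line path at that single distance, ignoring everything else that affects real deep-space communication.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # exact by definition
KM_PER_MILE = 1.609344                 # exact by definition


def one_way_delay_minutes(distance_miles: float) -> float:
    """Minutes a radio signal needs to cover the given distance."""
    distance_km = distance_miles * KM_PER_MILE
    return distance_km / SPEED_OF_LIGHT_KM_PER_S / 60


# Roughly three minutes each way at 34 million miles.
delay = one_way_delay_minutes(34_000_000)

# Sending an instruction and hearing back takes at least twice that.
round_trip = 2 * delay
```

The delay grows in proportion to the distance, so the wait lengthens as the planets move apart.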

A lot of systems are therefore automated as much as possible to limit the need for this interaction to occur.  Only very critical instructions are held for human interaction. Like me with my daughter’s claim, the space agency looked at all the points of failure. Did a particular automated system fail? Did a person fail to send a manual command? Did the radio signal fail? Was there a materials failure?

As with my daughter’s claim, the problem had to do with a difference between the U.S. and the rest of the world. NASA had changed its basic units of measurement from the English system (inches and feet) to the metric system (centimeters and meters) to be in line with the rest of the world. The system used to slow the probe into orbit around Mars was programmed using metric measurements. Unfortunately, one of the vendors providing the actual instructions to the probe sent its figures in English units. It was determined that these figures, read as metric measurements, caused the probe to undershoot its targeted orbit and then crash into the surface of Mars.
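One common defense against this kind of mix-up is to never pass a bare number between systems, and to keep the unit attached to the value so the receiver must convert it. The Python sketch below illustrates the idea with lengths. The `Length` class and the 1,000-foot figure are hypothetical, chosen only for illustration; this is not a description of NASA's or the vendor's software.

```python
from dataclasses import dataclass

# Exact conversion factors to meters (international definitions).
METERS_PER_UNIT = {"m": 1.0, "cm": 0.01, "ft": 0.3048, "in": 0.0254}


@dataclass(frozen=True)
class Length:
    """A length that always carries its unit with it."""
    value: float
    unit: str

    def to_meters(self) -> float:
        if self.unit not in METERS_PER_UNIT:
            raise ValueError(f"Unknown length unit: {self.unit!r}")
        return self.value * METERS_PER_UNIT[self.unit]


# Hypothetical figure sent by a vendor working in feet.
vendor_value = Length(1000.0, "ft")

# Misread: the bare number is assumed to already be in meters.
misread_meters = vendor_value.value

# Correct: convert using the unit attached to the value (about 304.8 m).
actual_meters = vendor_value.to_meters()
```

Here the misread value is more than three times the true one. An unknown unit raises an error instead of being silently accepted, which turns a hidden data quality problem into a visible one.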

These are two radically different situations, yet they show a common problem that can affect any dataset: the data values can be correct while the format is wrong. In the first case, something as simple as how a date value is written can corrupt an otherwise robust process. The same is true of the dates captured for financial instruments. Both derivatives and fixed income securities make extensive use of dates, from expiration and maturity dates to issuance and coupon dates. A wrong date in any of these fields can result in all types of exposure. Evaluated pricing, which is used to value bonds that rarely have active bids and asks, relies heavily on these dates. Fund accountants rely on good dates for accruing coupons. Traders rely on the dates to understand the various yields (e.g., yield to call and yield to maturity) they need to choose which bonds to purchase.

Wrong units are the bane of derivatives. For example, if you are buying an oil future that is based on 1,000 gallons but is incorrectly marked as 1,000 barrels, the settlement value will be 42 times greater than the actual value, because a barrel of oil holds 42 U.S. gallons. If the price of a future is quoted in cents per bushel but is marked as dollars per bushel, the client may receive an erroneous margin call.
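Both mistakes are easy to reproduce with a little arithmetic. The Python sketch below uses hypothetical prices, contract sizes and function names, picked only to make the numbers concrete. It relies on two known factors: 42 U.S. gallons to the barrel and 100 cents to the dollar.

```python
from decimal import Decimal

# One U.S. oil barrel holds 42 U.S. gallons.
GALLONS_PER_UNIT = {"gallon": Decimal(1), "barrel": Decimal(42)}
# Dollars represented by one unit of each price quotation.
DOLLARS_PER_QUOTE_UNIT = {"dollars": Decimal(1), "cents": Decimal("0.01")}


def oil_contract_value(size: Decimal, size_unit: str, dollars_per_gallon: Decimal) -> Decimal:
    """Dollar value of an oil contract, given the unit its size is marked in."""
    return size * GALLONS_PER_UNIT[size_unit] * dollars_per_gallon


def grain_contract_value(bushels: Decimal, quote: Decimal, quote_unit: str) -> Decimal:
    """Dollar value of a grain contract, given the unit its price is quoted in."""
    return bushels * quote * DOLLARS_PER_QUOTE_UNIT[quote_unit]


# Hypothetical oil future: 1,000 gallons at $2.50 per gallon.
price = Decimal("2.50")
correct_oil = oil_contract_value(Decimal(1000), "gallon", price)
mislabeled_oil = oil_contract_value(Decimal(1000), "barrel", price)
oil_ratio = mislabeled_oil / correct_oil        # 42

# Hypothetical grain future: 5,000 bushels quoted at 450 cents per bushel.
quote = Decimal("450")
correct_grain = grain_contract_value(Decimal(5000), quote, "cents")
mislabeled_grain = grain_contract_value(Decimal(5000), quote, "dollars")
grain_ratio = mislabeled_grain / correct_grain  # 100
```

In both cases the quantity and the price are "correct" as numbers. Only the label is wrong, yet the valuation is off by a factor of 42 or 100.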

Data is more than just values. Values without context are worthless. As these examples show, getting either the format or the unit wrong can result in anything from annoyance at one end of the spectrum, as in the case of my daughter’s dental claim, to catastrophic loss at the other, as in the destruction of a multimillion-dollar space probe. In the financial world, such errors can result in exposure to risk, recalculated NAVs for hedge and mutual funds, and missed trading opportunities.