Passionate about data

by Stephen Koch, Head of Data Quality | 11 July 2023

Several years ago, I attended an event at my daughter’s school. It was one of those events where parents got the opportunity to mingle with other parents over some refreshments. Inevitably, the discussions turned to occupations and there were lawyers, doctors and teachers, amongst others. I fell in and out of conversations as I wandered around the room until I ended up in a conversation with another parent.

Initially, she seemed reluctant to talk about her profession, claiming her job was difficult to explain and not something most people understood. This intrigued me because so many were so willing to talk about themselves and she was not. I pressed her, assuring her that I was, indeed, interested. What could it be that she did that normal people did not understand? I began to make up crazy jobs she could hold and needed to know if her job matched my imagination.

As soon as she began to explain, I understood. She explained that her job was a specialisation in the financial services world. I reassured her that I was very familiar with that world, having worked in it for many years myself. She went on to explain that she was responsible for managing a group in operations that sourced the reference data of financial instruments and made the data fit for purpose for the various users in her firm, from Accountants to Investment Advisors to Traders. She explained that it was her work that created the building blocks of all the other processes that take place in financial institutions.

The painstaking work of her staff was to gather and clean instrument data to make possible the execution, reconciliation and clearing of all financial security trades. It supported the creation of position reports, P&L calculations, regulatory reporting, risk analysis and the creation of customer statements. It was the basis for all the financial world. She spoke with passion about what she did, but in her eyes I saw that look people get when they think the listener’s eyes are about to glaze over with boredom.

Mine certainly were not. What she was describing was not foreign or boring to me. In fact, it was exactly what I did. I knew the passion she felt about the importance of what she and her team did because my team and I did the same thing. Whether one shares a passion for books or wine or painting or politics, finding someone who shares that passion and understands what you are talking about is an exciting thing. It always feels good to know that, whatever your passion is, there is someone who understands you and the things you face, whether good or bad. Her surprise at learning of our common ground was quickly replaced by the joy one feels when one stumbles upon a kindred spirit. We spent the next hour or so in a quiet corner discussing all things data.

Data drives not only the financial world, but the whole world. When will it rain and how much, a farmer may want to know. What is the strength of a particular material, a builder or engineer building a bridge or a plane asks. What are the opinions of a particular district, a politician needs to determine to run a successful campaign. All this data must be captured from somewhere, must be verified, and then stored in a way that it can be easily accessed for its purpose.

Financial reference data, the data that describes and captures the essence of financial instruments, is my passion. The importance of getting the data correct cannot be overstated. Without clean and correct data it would be impossible for the financial markets to run smoothly and efficiently. Buyers would not know whether the investments they made were appropriate and sellers would be unsure they received proper compensation. Trust in the markets would quickly erode, and chaos would follow.

The recent concern over the funds deposited at Silicon Valley Bank reminded me of another bank collapse in the early 1990s. A fairly large regional bank became insolvent, putting its depositors’ funds at similar risk. At our firm we had a particular client who invested solely in CDs from various regional banks, being careful to invest just enough so that his investments would be covered by federal insurance. All of his trades were done with this protection in mind.

Unfortunately, the descriptive data that our firm had employed was faulty. The database being used at that time only had space for 20 characters in the name description. The person who added the name to the database, while trying to fit all the characters they could, had shortened the description and removed what they thought were inconsequential words. How wrong they were.

Our client, keeping to his strict strategy, purchased the maximum number of CDs from the regional bank that would be covered by the insurance. Then he moved on to find another CD. His broker found him one that was paying interest at the top of the range and, based on the description, fit his strategy. Unbeknownst to him, the CD he picked was the one where the data input clerk had removed “extraneous” characters. Those extraneous characters were the name of the bank that was issuing the CD. They left instead the name of the city branch and the word “bank.”

The result of this error was that the client had invested double the maximum amount of cash covered by the FDIC insurance. When the bank collapsed a short time later, the client found himself with an uncovered loss. There was a lot of finger pointing, first at the data input clerk, then at the broker who suggested the CD, then at the client who did not research the CD thoroughly enough. Ultimately, the firm covered his large loss.

How could this happen? What safeguards could we put in place to make sure that it did not happen again? It was this case that started my journey to the passion I feel about data today. It was this case that showed me how minute changes to the way we describe things, in this case a description of a financial instrument, could have dire consequences for those who rely on those descriptions.

This blog will touch on various aspects of how we handle data, how it can go wrong and how we work to prevent errors from creeping into the data that drives so many of our processes. Some who read it will understand the passion we in this community feel. I hope that those who don’t share this passion about this particular dataset will gain some insights into their own uses for whatever type of data they work with.