white paper

Why using search engines for financial risk assessment is . . . risky

Using a search engine for financial risk assessment won’t give you the results you need

Comprehensive due diligence is more important now than ever. The new normal brought about by COVID-19 has significantly impacted individuals and businesses. As a result, financial institutions have dealt with increased business due to the CARES Act and the issuance of Paycheck Protection Program loans.

However, this uptick in business for financial institutions has also meant an increase in opportunity for fraudulent activity. With so much data and information seemingly at our fingertips, it is easy to think that vetting a new customer or checking the reputation of a business can be done through your favorite search engine.

This simplicity is a deception.

The truth is, search engines are not reliable tools for meaningful due diligence, particularly when it comes to identifying suspicious financial activity, fraud, money laundering, criminal involvement, or other forms of legal malfeasance.

In fact, organizations relying on mainstream search engines for fraud protection and regulatory compliance may be unwittingly exposing themselves to dangerously high levels of risk. Beyond its technical limitations, one of the worst things a basic internet search can do is fool a financial institution’s stakeholders into believing they have done their due diligence when they haven’t.

Not all search algorithms are the same

To understand why conventional search engines are such an inadequate tool for uncovering critical financial risk factors—and why targeted software solutions developed specifically for fraud prevention are so much more effective—it helps to understand some of the capabilities of these search engines.

A search engine’s bots—called “spiders” or “crawlers”—follow links on the internet and index what they find. Every time a user conducts a query, that search engine employs a series of algorithms to pull relevant content from within its index and display it—ranked according to what it thinks is most relevant for that specific search.

In order to identify the web pages most likely to satisfy a search query, search engine algorithms use several indicators, including term relevance, credibility of the website, usability, location, etc. Of course, another factor is that others conducting similar searches have found those pages useful, too.
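The index-then-rank process described above can be illustrated with a toy example. This is a minimal sketch, not how any real search engine works: the pages, the credibility weights, and the scoring rule (term count multiplied by credibility) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical pages: URL -> (page text, credibility weight between 0 and 1).
PAGES = {
    "news.example/fraud-case": ("bank fraud case filed against acme corp", 0.9),
    "blog.example/acme-review": ("acme corp review great service", 0.4),
    "forum.example/thread": ("fraud fraud fraud click here", 0.1),
}


def build_index(pages):
    """Map each term to the pages containing it, with a per-page term count."""
    index = defaultdict(dict)
    for url, (text, _credibility) in pages.items():
        for term in text.lower().split():
            index[term][url] = index[term].get(url, 0) + 1
    return index


def search(query, pages, index):
    """Rank pages by matched-term count weighted by the page's credibility."""
    scores = defaultdict(float)
    for term in query.lower().split():
        for url, count in index.get(term, {}).items():
            scores[url] += count * pages[url][1]
    return sorted(scores, key=scores.get, reverse=True)


INDEX = build_index(PAGES)
RESULTS = search("acme fraud", PAGES, INDEX)
print(RESULTS)
```

Even in this toy version, the ranking reflects general relevance and credibility signals, not whether a page contains the specific risk information an investigator needs.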

However, search engines can only provide results for pages to which they have full access. Websites or sections within them that require logins—or some kind of information to be entered in order to access them—aren’t going to show in the search engine results pages (SERPs). They also can’t return sites or pages that aren’t indexable.
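One common reason a page is not indexable is that the site asks crawlers to stay away through a robots.txt file. The sketch below uses Python’s standard `urllib.robotparser` module to show how a well-behaved crawler would interpret such rules. The robots.txt content, the `ExampleBot` user agent, and the `bank.example` URLs are hypothetical, and login walls and paywalls block crawlers by other means that this sketch does not cover.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an illustrative site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /accounts/
Disallow: /court-records/search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url, agent="ExampleBot"):
    """Return True if the rules allow this (hypothetical) crawler to fetch the URL."""
    return parser.can_fetch(agent, url)


for url in (
    "https://bank.example/about",
    "https://bank.example/accounts/login",
    "https://bank.example/court-records/search?name=smith",
):
    print(url, "->", "crawlable" if crawlable(url) else "excluded")
```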

More isn’t always better

More is better in many ways, but the great paradox of big data is that more information does not necessarily mean better information. In fact, the ease with which certain types of information can be found is often inversely proportional to the amount of information available. When the Google search engine was introduced in 1998, there were about 2.4 million websites on the internet. Now there are more than 1.5 billion websites worldwide, containing 33 zettabytes (33 trillion gigabytes) of data. By 2025, the International Data Corp. (IDC) estimates that volume will explode to 175 zettabytes and grow exponentially from there.

  • 1998
    2.4 million websites on the internet
  • 2020
    More than 1.5 billion websites on the internet containing 33 zettabytes of data
  • 2025
    175 zettabytes of data expected, IDC estimates
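To put the figures above in perspective, the short calculation below converts zettabytes to gigabytes and derives the annual growth rate implied by the jump from 33 to 175 zettabytes. It takes the 2020 and 2025 figures at face value and assumes steady compound growth between them, which is a simplification.

```python
# Decimal units: 1 zettabyte = 10**21 bytes and 1 gigabyte = 10**9 bytes.
GB_PER_ZB = 10**12

data_2020_zb = 33   # figure cited for 2020
data_2025_zb = 175  # IDC estimate cited for 2025

data_2020_gb = data_2020_zb * GB_PER_ZB
growth_factor = data_2025_zb / data_2020_zb
years = 2025 - 2020

# Assumption: growth compounds at a constant rate each year.
implied_annual_growth = growth_factor ** (1 / years) - 1

print(f"2020 volume: {data_2020_gb:,} GB")
print(f"Growth factor 2020 to 2025: {growth_factor:.1f}x")
print(f"Implied annual growth: {implied_annual_growth:.0%}")
```

Under these assumptions, the volume of data grows by roughly 40 percent each year, more than a fivefold increase over five years.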

With so much more data to sift through on the internet, it’s more important than ever to know where to look for information, what to look for, and how to determine the quality and integrity of the information one uncovers. It’s also much easier for criminals to hide information in obscure corners of the internet, and for investigators to miss vital information buried deep inside databases where conventional bots can’t roam.

“Often, the most revealing aspect of a search is what’s not there that should be there,” says Jim Richards, founder of RegTech Consulting and the former global head of financial crimes risk management for Wells Fargo.

Legitimate businesses should have licenses and certifications, for instance, and professionals should have records of activity and alliances in their chosen field. Diligent investigators using conventional search engines can unearth these and other types of pertinent information—but, says Richards, “While Google can be a useful tool for compliance, it shouldn’t be the only tool.”

Haystacks vs. needles

Indeed, there is a profound difference between an open-ended internet search for a person or for crime data and a search conducted using software specifically designed to locate information contained in court records and identify suspicious patterns of financial activity. If data were haystacks, a basic search engine would collect hundreds of them and invite you, via links, to search for needles. Risk-assessment software is more like a giant magnet that pulls the needles up and leaves the rest of the hay behind.

The universe of information that a search engine covers is also quite small compared to the entirety of information available on the internet. It is estimated that standard search engines only capture about 4% of the internet’s total web pages. The other 96% of the internet exists on the so-called deep web: data requiring logins and authentication, such as banking accounts and emails; information hidden behind paywalls; and content safely ensconced within proprietary databases.

In most cases, SERPs present pages that may or may not lead an investigator to the information they seek with regard to criminal activity. Most searches yield many thousands, if not millions, of results, and examining them with anything close to the thoroughness required for responsible due diligence is an extraordinarily tedious and time-consuming task.

A better risk-assessment tool

Because search engines are consumer-oriented, advertising-driven tools, they are in no way designed to find the kinds of information financial investigators, compliance officers, or BSA/AML professionals need. Suppose a new customer or corporate client has been involved in civil litigation in several states and has received several fines and sanctions but no criminal charges or convictions. Unless the matters were covered in news reports, the likelihood that they’d show up in a standard internet search is very low.

To locate the information, an investigator would have to search individual court records in each state manually. Of course, that is possible, but it is tedious, time-consuming work that exposes you to the risk of error as vital information can easily slip through the net.

By contrast, a software solution using intelligent analytics can be programmed to search such databases automatically, flag any activity involving the person or entity being searched, and evaluate or score the level of risk they represent. It might also have access to proprietary databases curated to contain such information as up-to-date property records, liens, judgments, defaults, bankruptcies, and other pertinent financial data.
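The flag-and-score idea can be sketched in a few lines. This is an illustration only and does not represent any particular product: the record types, the weights, the review threshold, and the entity IDs are all hypothetical, and a real tool would calibrate them against actual outcomes.

```python
from dataclasses import dataclass

# Hypothetical weights per record type.
WEIGHTS = {
    "bankruptcy": 30,
    "judgment": 25,
    "lien": 20,
    "sanction": 15,
    "civil_suit": 10,
}


@dataclass(frozen=True)
class Record:
    entity_id: str
    kind: str
    state: str


def flag_and_score(entity_id, records, threshold=40):
    """Collect every record for the entity and sum its weights, capped at 100."""
    hits = [r for r in records if r.entity_id == entity_id]
    score = min(100, sum(WEIGHTS.get(r.kind, 5) for r in hits))
    return {"hits": hits, "score": score, "review": score >= threshold}


# Illustrative records drawn from several (hypothetical) state databases.
RECORDS = [
    Record("E-1001", "civil_suit", "TX"),
    Record("E-1001", "sanction", "NY"),
    Record("E-1001", "lien", "CA"),
    Record("E-2002", "civil_suit", "FL"),
]

REPORT = flag_and_score("E-1001", RECORDS)
print(REPORT["score"], REPORT["review"])
```

Here the entity with civil litigation, a sanction, and a lien across three states is flagged for review even though none of those records would necessarily appear in a news story.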

Furthermore, if someone is searching for information on a single individual, a well-designed algorithm can match the name being searched with an identity profile distinguishing it from the hundreds or thousands of other people in the world with the same name. 
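A simplified version of that matching step might look like the following. It is a sketch under stated assumptions: the profile fields (name, birth year, state), the normalization rule, and the requirement that at least one attribute besides the name must agree are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    birth_year: int
    state: str


def normalize(name):
    """Lowercase, drop periods, and collapse whitespace so name variants compare equal."""
    return " ".join(name.lower().replace(".", "").split())


def match_score(query, candidate):
    """Count agreeing attributes; 0 means the names themselves do not match."""
    if normalize(query.name) != normalize(candidate.name):
        return 0
    score = 1
    if query.birth_year == candidate.birth_year:
        score += 1
    if query.state == candidate.state:
        score += 1
    return score


def best_match(query, candidates, minimum=2):
    """Return the strongest candidate, or None if only the name (or nothing) matches."""
    if not candidates:
        return None
    top = max(candidates, key=lambda c: match_score(query, c))
    return top if match_score(query, top) >= minimum else None


QUERY = Profile("John A. Smith", 1975, "OH")
CANDIDATES = [
    Profile("john a smith", 1982, "TX"),
    Profile("John A Smith", 1975, "OH"),
    Profile("Jon Smith", 1975, "OH"),
]
MATCH = best_match(QUERY, CANDIDATES)
print(MATCH)
```

A name match alone is rejected here, which mirrors the point above: the name is only the starting point, and additional identifiers are what separate one individual from the others who share it.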

Identifying suspicious activity

Financial investigators are often looking for evidence of fraud and money laundering as part of BSA/AML compliance obligations. Money laundering, in particular, doesn’t happen in a vacuum; it requires networks of people working together to filter money through banks in seemingly legitimate ways. Deception is part of the game.

Also, BSA/AML regulations require data security. Search engines are not designed to record a reliable data trail for auditors, regulators, or managers. Beyond bookmarking websites, there is no way to document an investigative trail, should regulators want to investigate a Suspicious Activity Report (SAR).

Conversely, an intelligent fraud-prevention tool will keep customer data secure, create detailed logs, and have built-in reporting capabilities allowing users to access and organize data in any number of ways. It will also include ways to score levels of risk and more quickly clear legitimate entities, freeing up time and resources.  
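As an illustration of what a detailed, defensible log might involve, the sketch below records each investigative step with a timestamp and links every entry to the previous one through a hash, so later tampering can be detected. The `AuditTrail` class, its field names, and the hash-chaining approach are assumptions made for this example; they do not describe how any specific product stores its logs.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Hypothetical append-only log in which each entry hashes the one before it."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(entry):
        """Hash every field except the stored hash itself."""
        payload = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def record(self, user, action, detail):
        entry = {
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "action": action,
            "detail": detail,
            "prev": self.entries[-1]["hash"] if self.entries else self.GENESIS,
        }
        entry["hash"] = self._digest(entry)
        self.entries.append(entry)
        return entry

    def verify(self):
        """Return True only if no entry has been altered, removed, or reordered."""
        prev = self.GENESIS
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True


trail = AuditTrail()
trail.record("analyst1", "search", "court records: entity E-1001")
trail.record("analyst1", "flag", "lien found in CA; escalated for review")
print(trail.verify())
```

A browser history or a folder of bookmarks offers nothing comparable: there is no record of who ran which search, when, or whether the record was changed afterward.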


Perhaps more than at any other time, financial risk assessments are of critical importance to financial institutions. Increased business for those institutions is likely to be coupled with a heightened risk of fraudulent activity. While standard internet search engines are undoubtedly useful to finance and compliance professionals, they are fraught with risks when compared to a dedicated database and analytics solution.

Identifying information and behavior patterns specific to fraud, money laundering, or other criminal activity is crucial in determining the reputation of a new customer or business. A dedicated solution not only provides peace of mind by delivering accurate results, but it also saves valuable time and money.

It’s true that today’s internet contains a vast and ever-expanding wealth of publicly available data. Still, financial investigators need better filters and search mechanisms to make that data more useful.

And with the right tool, one person can do more reliable, higher-quality work in less time—work that financial institutions can be confident is thorough, accurate, defensible, and protected.

Thomson Reuters CLEAR

See how CLEAR makes it easier to locate people, businesses, assets and affiliations, and other critical information

Thomson Reuters is not a consumer reporting agency and none of its services or the data contained therein constitute a ‘consumer report’ as such term is defined in the Federal Fair Credit Reporting Act (FCRA), 15 U.S.C. sec. 1681 et seq. The data provided to you may not be used as a factor in consumer debt collection decisioning, establishing a consumer’s eligibility for credit, insurance, employment, government benefits, or housing, or for any other purpose authorized under the FCRA. By accessing one of our services, you agree not to use the service or data for any purpose authorized under the FCRA or in relation to taking an adverse action relating to a consumer application.