This AI Startup Wants To Automate Your Tedious Document Searches

For the casual internet user, a quick Google search is often all it takes to find plenty of information on any particular topic.

But for specialized financial research, analysts often find themselves laboriously searching proprietary databases, regulatory filings, and paywalled sources that aren’t even indexed by the big search engines, says Jack Kokko, the founder and CEO of financial search engine company AlphaSense.

That’s why he and cofounder and CTO Raj Neervannan created AlphaSense, which applies natural language processing and machine learning techniques to let users find relevant information in financial documents.

“It started from my first job out of college as an analyst at Morgan Stanley, where I was, as every analyst, going through these huge piles of paper on my desk and trying to find information very manually—nights and days spent toiling through that information and still fearing that I’m missing a lot,” Kokko says.

The San Francisco-based company takes in information from thousands of licensed data sources, as well as public web sources like news reports, and automatically processes them to extract meaning on a sentence-by-sentence level.

“When a company’s talking about building a semiconductor fab in Shenzhen or just production growth in China, those two mean the same thing, even though the words are very different,” Kokko says. “We have clustering algorithms that are able to understand that those topics mean the same thing.”

That lets corporate customers search for information on esoteric topics and find results substantially faster than they would by looking for the data by hand, or with the individual search engines built into specialized databases. AlphaSense says it counts more than 450 companies as customers, including financial services firms with a total of more than $3 trillion in assets under management.

In one case, Kokko says, representatives from a big Wall Street firm receiving a demo from AlphaSense tried searching for information on an obscure corner of the Japanese electric power industry, and happily discovered their firm had already researched the topic.

“Our system found a whole bunch of research by that top firm on that topic that even that firm themselves didn’t know about,” he says. “You can’t rely on people to know everything about their own internal content or the content they produce, let alone thousands of others that are offering and producing content.”

For Kasandra Davis, a senior manager in investor relations at Applied Materials, which supplies tools to computer chip manufacturers, AlphaSense’s tool makes it easy to search for information on the semiconductor industry. It also lets her quickly pull together what the company’s own executives have said about particular topics at conferences, without laborious searches through individual transcript files.

“I would have to go into the transcript of each one of those conferences and search for those words,” she says. “You can imagine what that would have been like from a time perspective.”

And, she says, she recently used the tool to locate information comparing video file sizes for ultra-high-definition TV and virtual reality, something that was hard to find through a general-purpose search engine.

“I did use Google to do that search on VR file size versus HD file size, and I couldn’t find anything to do [with] the comparison,” she says.

To ensure the tool continues to find relevant information, AlphaSense uses a mix of completely automated clustering processes and human-supervised machine learning, says Kokko. Company experts can tag sample content to train the search engine to understand, say, when a report is expressing positive or negative sentiment, and when it’s talking about the past or making a prediction about the future.

“We constantly refine and have the algorithms get better by comparing to what humans are doing,” he says.

Part of what makes processing all those documents practical is the rise of on-demand cloud computing services, letting AlphaSense activate servers to churn through documents and run those statistical models only as needed, rather than build a huge data center of its own, says Kokko.

“Before Amazon [Web Services], we really couldn’t have done this: The processing capabilities would have required a very, very big company’s resources,” he says. “Now, we were able to start small with a startup’s resources just focusing, using computing resources for minutes and shutting down, launching and shutting down thousands of machines, and being able to afford it, instead of owning those resources.”

And for better or worse, he says the search engine similarly lets its clients do research they couldn’t have previously done without hiring scores of human analysts.

“We’ve got some one-man shops that can do the work of 20 analysts that they don’t have to hire,” he says. “Or, they can actually just elevate the level of research that each person can do.”

 

Fast Company