Remember natural language processing? NLP emerged several years ago, but it wasn’t until 2018 that AI researchers proved it was possible to train a neural network once on a large amount of data and then use it again and again for different tasks. In 2019, OpenAI’s GPT-2 and Google’s T5 appeared and proved surprisingly good (it’s now integrated into Google Duplex). Concerns were even raised about their possible misuse.
But since then things have gone, well, pretty exponential.
2022 has seen a true “Cambrian explosion” of NLP start-ups and large language models.
This year, Google released LaMDA, a large language model for chatbot applications. Then DeepMind released AlphaCode and later Flamingo, a language model capable of visual understanding. In July alone, the BigScience project released BLOOM, a massive open-source language model, and Meta announced that it had trained a single language model capable of translating between 200 languages.
We are now reaching a kind of tipping point where many more commercial NLP applications, some built on these open-source and publicly available platforms, will come to market. You could almost say a gold rush has begun, with start-ups trying to build on this technology and an arms race developing between the major language model vendors.
One such startup is Humanloop, a University College London (UCL) AI spin-out that claims to make it “dramatically” easier for businesses to adopt this new wave of NLP technology through a suite of tools that help humans “teach” AI algorithms. This means a lawyer, doctor or banker can put a piece of knowledge into the platform, which the software then applies at scale across a large data set, enabling wider application of AI in various industries.
It has now raised a $2.6 million seed funding round led by Index Ventures, with participation from Y Combinator, LocalGlobe and Albion.
Humanloop was founded in 2020 by a team of leading computer scientists from UCL and Cambridge, along with Google and Amazon alumni. Applications for its platform, the company says, could include building a picture of a national real estate market from unstructured data on the internet; reading electronic health records to identify people who might be candidates for new therapies; and even moderating comments in Facebook groups.
“People would be shocked if they knew what language-based AI is capable of now,” CEO Raza Habib said in a statement. “But getting the data into a form the algorithm can use is the biggest challenge. With Humanloop, we want to democratize access to AI and enable the next generation of intelligent self-service applications – by enabling any business to take their domain expertise and effectively distill it into a machine learning model.”
Humanloop says its advance rests on the rise of “probabilistic deep learning,” where algorithms can work out what they don’t know, remove noise from datasets, find the right things, and ask humans for help with the parts they don’t understand.
Cohere AI ($164.9 million in funding) and OpenAI, with GPT-3, are other startups building their own large language models and putting them behind APIs. Snorkel AI ($135.3 million in funding) is another new startup in this space.
However, Humanloop says it focuses less on developing the models and more on the tools needed to adapt them to specific use cases.
“What a lot of people don’t know is that it’s not the lack of proper algorithms that keeps AI from being pervasive in every workplace – it’s the lack of properly labeled data,” adds Erin Price-Wright, a partner at Index Ventures who led the investment. “In fact, machine learning itself is becoming more commoditized and ready-to-use, but it’s still very difficult for non-technical people to bring their knowledge to a machine and help the algorithm refine its model. That’s why Humanloop allows people to edit data.”
If the NLP gold rush is well and truly underway, expect a whole bunch of other startups to pop up soon…