Felix recalls his journey to startup CEO in the NLP milieu after coming from a computer and mechanical engineering background; plus the origin story of NeuralSpace. He discusses the difficulty behind building models for low-resource languages when faced with less than 1% of data available compared to FIGS.
The CEO recounts his experience connecting with investors to raise a USD 1.7m seed round. He explains how machine translation fits into their framework, with the main use-case being building chatbots for low-resource languages.
Felix reveals how they collect data and their plans to work directly with linguistic professionals to cover different pronunciations, tones, and emotions for text-to-speech. He shares plans to add more speech capabilities and cover nearly any fundamental NLP project or problem.
First up, Florian and Esther talk about the language industry news of the week, with the final agenda for SlatorCon Remote complete. The two-part event will feature 15 presentations and panels from over 25 language industry CEOs, senior executives, technologists, and end-client team leaders.
Esther talks about Transline Gruppe swapping majority owners as former backer LEAD Equities exits via sale to investment firm Blue Cap. Florian then reviews Slator’s coverage of the skills required as a post-editor, MT engineer, and PE consultant.
In M&A, ZOO Digital announces the launch of ZOO Korea, alongside the acquisition of a 51% stake in Seoul-based subtitling and dubbing provider, WhatSub Pro. Meanwhile, US-based Verbit expanded its transcription services in the UK with the acquisition of market research specialist Take Note.
Florian: First tell us about your professional background. You studied computer science and mechanical engineering, so how did you come across language tech and NLP? What triggered that for you?
Felix: It was due to multiple interesting aspects of language technology that I ended up in that industry. It was not an academic motivation at all. I have been traveling around the world quite a bit and I found it fascinating how culture influences language and how language influences culture. In Indonesia, for example, there are five different words for the ocean, for ocean and rain, ocean and wind, and so on, because the ocean so strongly influences daily life. I found that fascinating. When you look at the alphabet of local Indian languages, such as Tamil, it is literally art. It is so beautiful. Tibetan, as a language, is extremely beautiful when you just look at the alphabet and the letters, so there are different aspects that motivate me as a person. Then when you take my academic background into account, I find the mathematical models powering NLP technology very interesting. Deep learning is obviously a term that everyone has heard about, but there are also loads of others that work on a purely statistical level, and I have studied mathematics and statistics, so I find that very fascinating. You can abstract numbers from a word, represent them as a vector, play around with these vectors, and in the end translate them back into words. That entire pipeline I find very fascinating.
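The word-to-vector-to-word pipeline Felix describes can be sketched in a few lines. This is a minimal illustration, not NeuralSpace's method: the three-dimensional vectors are invented toy values, and the helper names `cosine` and `nearest_word` are hypothetical. Real systems learn vectors with hundreds of dimensions from large text corpora.

```python
import math

# Toy 3-dimensional word vectors; the numbers are invented for illustration.
EMBEDDINGS = {
    "ocean": [0.9, 0.1, 0.0],
    "rain": [0.1, 0.9, 0.0],
    "wind": [0.0, 0.1, 0.9],
    "storm": [0.4, 0.5, 0.6],
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_word(vector, exclude=()):
    """Map a vector back to the closest word in the toy vocabulary."""
    candidates = {w: v for w, v in EMBEDDINGS.items() if w not in exclude}
    return max(candidates, key=lambda w: cosine(vector, candidates[w]))


# "Play around" with the vectors: add two of them element by element...
combined = [r + w for r, w in zip(EMBEDDINGS["rain"], EMBEDDINGS["wind"])]

# ...then translate the result back into a word.
print(nearest_word(combined, exclude=("rain", "wind")))
```

With these toy values, adding the vectors for "rain" and "wind" and looking up the closest remaining word returns "storm".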
Esther: Tell us more about NeuralSpace. What is it in a nutshell? What was the origin story of founding the company? What is your mission with the company?
Felix: We founded it three years ago now when we were at the end of our master’s. My two co-founders are both from India and came to Germany to do a master’s in computer science. We got along really well and wanted to build something in the NLP space. In the beginning, we were not sure what was the best thing to do, but we always had an extreme passion for low-resource languages. Low-resource languages are any local languages that are spoken in Asia and Africa. We had been experimenting with early tools that were developed by Google, Microsoft Azure, AWS, and so on. We found quite quickly that they support only a few languages and as soon as it goes into these low-resource languages, the performance drops quite significantly. When you just look at machine translation for English to French or English to German, you have accurate translations, but then do Yoruba to French and every second word is wrong, so it is a massive challenge. We can have a massive impact when we provide technology in everyday languages. If you have that language barrier when using technology, it is a big burden for people. That is how we started. We then built a software developer toolkit. It is based on an API. You can build everything in the command-line interface. We also have a web interface that is completely no code with lots of dashboards. It is basically a collection of functionalities within the NLP space for low-resource languages.
Esther: Is there another way to define what exactly is a low-resource language? How do you define it? When does a language no longer qualify as low-resource?
Felix: The pure academic and scientific definition is a language with a much smaller data set available. English has a massive amount of data. Our data comes from somewhere on the internet: maybe you collect tweets, maybe you collect translations of government documents. For example, in Canada, they publish everything in French and English, so you have a nearly perfect translation that is available on the internet. English is by nature the language that is predominantly represented on the internet. When we work on low-resource languages we often have 1% of the data available, or sometimes less than 0.1%, and then it gets increasingly difficult because most of the state-of-the-art models are built for English. GPT-3 is a brilliant example. It is all built in English and then people try to adapt it to low-resource languages, and this is quite difficult with so little data available.
Florian: What about synthetic data? Does that hold any promise? There is a language with a decent amount of data but still low-resource and then you somehow create synthetic data to train models or is that not something you are looking at?
Felix: People have done that. We do it ourselves as well. It is working. It can be used, at least at NeuralSpace, to create data that is semantically similar to data we already have. For example, in Yoruba when you say, I want to eat out tonight at restaurant X, Y, Z, we can create at least 10 similar sentences. You can also create a hundred similar sentences of another person expressing the same thing in Yoruba, but there is still value in the original data that we can collect, like how people interact, which the machine can rarely replicate. It can augment it, but not get that original value that human-labeled or human data can have.
Florian: You recently raised a seed round of 1.7 million dollars. How did you connect with investors? What was your 30-second pitch to them? How did you explain to them what you are trying to build so they get it?
Felix: Connecting to investors was luckily not that difficult. We went through an accelerator program by Techstars before, so they fully facilitate the connection to investors themselves. I just try to pitch as often as I can. There are different events that are organized in London and our lead investor comes from Silicon Valley. We have not met in person but they saw me pitching at one of these events and got in touch later. It was quite an easy pitch, I would say, because the partner was working for Google Research on language products, so he knew the issue of low-resource languages. In 30 seconds, I said, we want to be the NLP provider for low-resource languages and at the moment we believe we can get that technology to people in the most efficient way through a developer toolkit. We may shift focus a bit in a few years. We might build more direct custom solutions, but at the moment we are giving developers the possibility to provide voice functionality, for example, or other NLP functionalities, without needing to build the most complicated deep learning models. They see the benefits and hopefully will use them.
Florian: Now you have a minimal viable product. You have launched this and you are starting to get early commercial traction, is that correct? Who are you getting commercial traction with and what are they building? Are you surprised by what they are now starting to build on it?
Felix: At the moment we give 500 US dollars of free credit, so people are still signing up, and with that they get quite far and build whatever they want. Customers that we target are chatbot or conversational AI development companies. We target them specifically because we have seen the landscape and it has some big players. There is Google Dialogflow, Azure LUIS, and Amazon Comprehend on AWS. In terms of the number of languages, they have around 30 languages and we have 87, so nearly three times the number of languages. That is already quite a good approach, and we have also seen people trying to get around that language constraint by translating and then analyzing in English. For example, you translate Yoruba to English, then analyze it in English, but normally you lose some information because translations are not that accurate. We do all of that natively in the 87 languages that we provide and it gives great results in terms of performance and accuracy. Our investors are a huge help, so they definitely make plenty of introductions. We also work with some partners who find prospects for us. All of us founders are part of communities, like Google Developer Circles here in London, the machine learning meet-up, PyData, and so on. You get to know people and you get to speak to them.
Esther: How do you decide which ones to offer? There must be many more candidates than you are able to work on.
Felix: There are plenty, and so far we have selected 87 based on where we can potentially have the highest impact, so how many speakers there are in that language, and what it takes for us to acquire data. For example, there is Wolof in Senegal, which we would love to offer. We do not at the moment because we cannot find any data. The data needs to be collected and either we do it ourselves or we work with a data acquisition partner. We looked at where most speakers are and your attention is automatically drawn to India because it has around 1.3 billion people. We now have 12 languages for India. We also offer something which is called Hinglish or Benglish, so a mix of Hindi or Bengali with English. In the Arabic-speaking world, it is called Arabizi, so it is a mix of Arabic and English or Arabic and French sometimes. I think in African countries, all the people used to speak that way but we see a huge attraction to what people like now.
Florian: That is a competitive advantage because you could theoretically go to Dialogflow but if your language is not there then what are you going to do? Write an email to Google? Probably not, so is that the key competitive advantage or are you seeing other things that are differentiating you from those big tech solutions?
Felix: The number of languages, definitely. Dialogflow, for example, is offering Finnish. We have seen that recently being added. All our models are built with the sole purpose of fitting well to low-resource languages, and I believe Big Tech takes a different approach. They build something for English and then try to adapt it to the other languages, and we take exactly the opposite approach and put the low-resource language first. The model is not meant to be the best for English.
Esther: Does machine translation fit into the framework and what you are focusing on in any way, and how so?
Felix: Definitely, so machine translation is only one app or one feature on our platform. There is one brilliant example being designed by a customer which was very interesting. They built the language understanding capabilities, which are in the end used by a chatbot, in English, although none of the team members spoke a foreign language themselves, or at least not the languages that they wanted to cover. They use the translation feature to do what we call a one-to-hundred chatbot. You design it in one language and then you apply it to 100 languages. They translate their data with the machine translation app on our platform and can then quite easily offer chatbots in many more languages.
Florian: Are you seeing actual use cases being built? Is it e-commerce websites? Somebody asks something and you take the first triage of what a person wants? What are some other use cases you are seeing being built?
Felix: A lot, actually. What we do not do in Europe, or very rarely do, are WhatsApp chatbots, so you see that a lot. I have not seen it in Europe but it is hugely popular in India and other Asian countries. You even have a GP literally offering a chatbot to book an appointment. Maybe all they do is check some basic symptoms, because the mobile network is more or less everywhere in India but maybe not the strongest. It is just not sufficient to go onto a website, which is only in English and so on. They very often offer a chatbot for a business or for a GP or something else. What we have also seen is a government project in India. It was not built with us, but one of our close partners was offering a COVID vaccination booking system through WhatsApp. It was a chatbot that offered to book your COVID vaccination, automatically used the dates that were available, and then you could have it booked through that.
Florian: Is this mostly text-based or can you also do voice-to-text and then people interact using their voice because there are some languages you cannot even type, or you can type but people struggle typing.
Felix: These use cases that I mentioned were all purely text-based. We want to get to the level of voice. In India, through a mobile phone, people could say, please transfer five rupees to that person, so it would be ideal to get to that level, where different dialects are understood and, in the beginning, small voice commands can be used to trigger some actions.
Florian: Is that mostly in WhatsApp now, so you do not see any of the other chat apps or Facebook chat?
Felix: There are plenty of others. I do not see that much traction on Messenger compared to WhatsApp. Other people also use it on their website, like a chatbot popping up. I also know one other company that has done a lot of customer call centers, so they did an artificial voice that is helping you with an FAQ and then you are forwarded to a real person if the issue is too complicated. Even voice commands have already been handled a lot.
Florian: Outside of India, the region that everyone is looking at is Ukraine at the moment. Is that a language you currently offer and are you seeing additional activity right now?
Felix: We do offer the language. We do not have any projects at the moment being built on NeuralSpace in Ukraine. We would love to obviously. I have been thinking about what we could do. There are a lot of low-resource languages in Ukraine as well. I think there are 20 regional languages. They are dialects, so not purely different languages, but it would be interesting. Lots of people are stuck in their hometowns at the moment trying to get help.
Florian: How do you go about collecting the data? Do you talk to somebody at the university, do you go to a company, how do you source that data if it is not freely available on the internet?
Felix: At the moment we work mostly with data acquisition companies, so they have their own process, but we can design these processes. What we want is everyday conversations because they are valuable. Conversations where people interact in the kitchen, in the extreme case, where nobody tries to speak the language well, which are obviously difficult to get. Our data acquisition company has some recording tools and we try to do that. What we have also been thinking about is working directly with linguistic professionals and getting them to read a certain text that covers different pronunciations, different emphases, and even emotions. For example, read that section in a very excited tone. We are also thinking about text-to-speech, so we want to produce real speech output from text input, which is at the moment a bit boring, in my opinion. When you take the Big Tech providers, it is hello Florian, how are you today? It is so flat and neutral, so we want to add some spice to that.
Esther: There is a lot of demand for experienced developers in the areas of machine learning and deep learning in NLP. How do you find competing for talent in those areas? What is your pitch? What do you offer to your potential recruits?
Felix: You are right. It is a huge race and it is driven by the uptake in remote work. American companies suddenly offer African employees or Indian employees a very, very high salary that we cannot compete with, so it is tricky. We have done a couple of positions based on that mostly and we offer every employee equity in the company. It does not matter what level they work at, they will get equity. If they have 10 years of experience, they will get equity. It is not always the same amount, but there is a bond that we want to establish with the company. Otherwise, it is a lot about the culture and that is all about the people working with us, so there is quite a flat hierarchy. We do set learning as the number one criterion, so it is fine when you make mistakes but you should try to learn from that. That is especially valuable for the ones who just joined from university because it is a huge pressure in the first moment. You have to give them enough time so that they can feel safe with us. We always want to invest in ourselves as well, but it is tricky. Once they join they normally stay very long because they enjoy the work atmosphere, but it is a race at the moment.
Florian: When they join you, do they typically already have a PhD in that field, or is their path open to going into any ML area? How specific would that profile already be?
Felix: So far, we have not hired anyone with a PhD. Two of our founders are on their way to getting a PhD and we generally try to communicate and pass that knowledge on as much as we can. We want people to have previous NLP experience because otherwise it is extremely difficult to pick up, and the pace of research is very fast. If you only start studying NLP when you start working with us, I do not know if you will ever catch up, so you need to have some previous NLP experience. We have also hired platform engineers who are pure experts in Microsoft Azure, who know exactly how to scale, and so on. Even for them, we require some machine learning knowledge, because some models, in the end, need to scale to thousands and thousands of users. It is not that easy to find those people. Platform engineers are extremely in demand.
Florian: In the broader NLP space, what cool things do you think are coming out in the next two, three years? What are the most exciting research areas?
Felix: It depends on a lot of the smaller achievements, but I want to have voice-to-voice live translation and that is something that I am hugely fascinated by. There are companies, for example, Translated, who have done something in that direction. I find that extremely fascinating because technologically it all depends on so many different levels. You need to have nearly perfect speech-to-text. You need to have nearly perfect translation and then you need to have, again, nearly perfect text-to-speech. You do not want a neutral language or neutral voice, you want to have some excitement. It will be fascinating if you can have that. Even on the edge, so without an internet connection. That is a different field, but I find it quite fascinating, where people try to get very high performance on tiny models that can fit on a mobile device or in a car’s computer, and do not need to have an internet connection to actually work. What happens at the moment when you use Siri on your phone is that it directly goes to the web, it goes to the Cloud. It is processed there because there is a massive language model behind it and then it goes back. Your phone is a transfer agent, but if all the computation could be done on the mobile phone itself, it would be terrific. It would open up so many use cases and address data privacy concerns. Customers would be happier to use it.
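Why each stage has to be nearly perfect can be shown with simple arithmetic. The sketch below assumes, as a simplification, that errors in the three stages are independent, so the chance of an utterance passing through all of them correctly is the product of the per-stage accuracies. The 95% figures are hypothetical, not measured values from any system, and `end_to_end_accuracy` is an illustrative helper name.

```python
from math import prod


def end_to_end_accuracy(stage_accuracies):
    """Chance that every stage is correct, assuming independent errors."""
    return prod(stage_accuracies)


# Hypothetical per-stage accuracies for a cascaded voice-to-voice pipeline.
stages = {
    "speech_to_text": 0.95,
    "translation": 0.95,
    "text_to_speech": 0.95,
}

overall = end_to_end_accuracy(stages.values())
print(f"{overall:.3f}")
```

With these assumed numbers, three stages that are each 95% accurate give an end-to-end accuracy of only about 86%, which is why small errors in any one stage matter so much.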
Esther: What is on the roadmap for NeuralSpace in the next year to year and a half?
Felix: We are not there yet to build speech-to-speech live translation, but we want the NeuralSpace platform to cover nearly any fundamental NLP project or NLP problem. We generally see the different apps on the NeuralSpace platform as building blocks and they can be combined in different ways to build an end solution. At the moment we have our first language understanding app live on the platform. We want to add speech capabilities in Q2, so we will have maybe not all 87 but maybe 40 to 50 languages for speech-to-text, and then we also invest quite a bit in what is called auto NLP, which is the customization of some NLP functionalities for your data set. We can already do that for language understanding, but this would be for speech. Google does not allow that at the moment, so we want you to go to our platform, say 10 different words with your voice about a topic that you find interesting, and then the entire model should adapt and be customized for you. That would also be extremely fascinating. There are just these fundamental building blocks at the moment. Later on, we will think about meta apps, so for example, voice-to-voice translation.