April 23, 2021
Lilt CEO Spence Green on Predictive Translation and the AI Agency’s Roadmap
Five years ago, Slator pegged language startup Lilt as one to watch. Today, Spence Green, CEO and co-founder of the tech-enabled language service provider (LSP), joins SlatorPod to talk about the journey from research concept to enterprise-scale AI translation agency.
Having raised USD 25m in Series B funding in early 2020, Spence recalls hiring researchers, developers, and enterprise sales professionals, and the benefit of the monthly standups Lilt holds to bring together their research and customer service teams.
Spence touches on performance improvements in their interactive translation system and cites prediction scores as high as 78% — meaning that nearly 8 in 10 AI-generated translation suggestions are unchanged by human hands.
He identifies key items on their roadmap, outlining Lilt’s connector-first approach, as well as enhancing the customer review cycle with in-built tools.
With Lilt’s investors backing an AI-agency thesis, Spence says there’s still more that can be done by the industry to further automate translation and localization, and explains that the ability to dramatically reduce costs for customers involves a huge automation effort.
First up, Florian and Esther discuss the language industry news of the week, as they review the possible impact on US-based linguists of America’s PRO Act, which passed in a Democrat-led House in early March.
The two talk about TransPerfect’s phenomenal Q1 2021, as the US-based Super Agency delivers 21% organic growth in the first three months of the year, adding USD 40m to top-line revenues in one quarter.
TransPerfect’s Q1 performance sees the company enter a two-horse race with UK-based RWS (post-SDL acquisition) for the title of world’s largest LSP, which looks set to be a close fight based on RWS’ latest trading update.
Florian discusses AI video generation and dubbing company Synthesia’s USD 12.5m funding round, while Esther talks about ZOO’s latest financials (SlatorPro) — in which the UK-based media localizer lifted revenue forecasts.
Then there’s new kid on the media localization block LinQ, which made its official debut in March 2021, and was set up by none other than Björn Lifvergren, founder and former CEO of BTI Studios, which merged with Iyuno (now Iyuno-SDI Group) in 2019.
Florian: When we first covered Lilt back in 2016, we called you a company to watch. Tell us a bit more about your background. What brought you into the translation localization tech space?
Spence: My and my co-founder’s background is that we are both researchers. We both worked on machine translation and we met working on Google Translate. For both of us, the motivation for working on language translation was the opportunity to have an impact in the world and specifically in the area of information access. Language presents a barrier for a lot of people that do not speak English and so both of us went to grad school to work on that problem.
How we came to think about localization was that machine translation systems can give you a prediction but they cannot tell you whether they are right. That is a problem in settings where you need to know whether the translation is correct, such as business or book publishing, which is what we were originally thinking about. We started to think about how we could use machines that are scalable and efficient to make more information available in the world, information that right now is mostly translated by people. That was the original thinking behind going down this path, which eventually led to a company.
Esther: Can you share a bit more around that tech piece as it pertains to Lilt and how it went from this kind of vision that you described through to implementation at scale?
Spence: Originally, within the academic context, what we built is called an interactive machine translation system. This is a machine translation system that is designed for interaction with a human being, and this idea is very old. It goes back to the late sixties and it was a thread of research that had periods where people worked on it and then periods when people did not work on it very much. Philipp Koehn was working on it as a side project in the late 2000s and early 2010s.
When I was at Google I got to know Franz Och, who was the Chief Architect of Google Translate. Franz had worked on it when he was in grad school, in the late 90s and early 2000s. He wrote a paper in 2003 that was quite good but people did not really cite it. He and I were having lunch one day and I was asking him about this paper and he said, ‘I think this was always a good idea and somebody should work on it because that paper should have more citations’. I thought if he thinks it is a good idea maybe it is something we ought to go investigate a bit more. That turned into the research project, restarting this thread of research on interactive machine translation. There never really was a thought of starting a company. It was just this thing of getting human-in-the-loop to work.
Florian: From a tech perspective, is latency still an issue or has it been solved? What are some of the other key issues when you interact with the machine translation?
Spence: There are some interesting engineering issues that come up. In the research setting nobody had really gotten a system to work that demonstrated that it could make people faster and more accurate. We got those results in the research setting, so then we had this research prototype and the question was, how do you enable people to use it? There are two problems that you have when you deploy a system like this. The two key things that it does are self-training, so it learns as you use it, and interaction, so it retranslates every time you type a word.
You have two problems there. One is the synchronization of the updates as you have multiple people working on the same model and then you have latency as you rightly point out, because if you are typing you have a budget of about 200 milliseconds. Otherwise, the system feels sluggish and 200 milliseconds, this is a quarter of the circumference of the world at the speed of light so the speed of light starts to become a problem. This is enabled by the public cloud, now you have bunches of regions that you can run in and so the engineering challenge is to have a system that runs in multiple regions around the world. It is near users and it can be responsive, but then it can also coordinate global updates so you have a globally consistent system that is learning. That has been an engineering challenge, an interesting one for five years now and the system is quite fast globally in all the regions that we run in.
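As a rough illustration of the latency budget described above, the sketch below estimates how far away a server can be while a request still makes the round trip within the budget. It is a back-of-the-envelope calculation under assumptions that are not from the interview: signals in optical fiber travel at roughly two thirds of the speed of light, the request has to go out and come back, and some portion of the budget (here a hypothetical 100 milliseconds) is spent on the model itself rather than on the network.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458   # speed of light in vacuum
EARTH_CIRCUMFERENCE_KM = 40_075.0   # equatorial circumference
FIBER_FACTOR = 2 / 3                # assumption: propagation speed in fiber relative to vacuum


def reachable_fraction(budget_ms, processing_ms=0.0, fiber_factor=FIBER_FACTOR):
    """Fraction of the Earth's circumference a server can be away from the user
    while a round trip still fits into the latency budget."""
    # Time left for the network after the (assumed) server-side processing time.
    network_s = max(budget_ms - processing_ms, 0.0) / 1000.0
    # Halve the distance because the signal has to travel there and back.
    one_way_km = network_s * SPEED_OF_LIGHT_KM_S * fiber_factor / 2
    return one_way_km / EARTH_CIRCUMFERENCE_KM


if __name__ == "__main__":
    # Whole 200 ms budget spent on the network: about half the circumference.
    print(f"{reachable_fraction(200):.2f}")
    # Hypothetical 100 ms reserved for the model: about a quarter.
    print(f"{reachable_fraction(200, processing_ms=100):.2f}")
```

The exact fraction moves a lot with these assumptions, but in each case a single data center cannot serve every user inside the budget, which is the reason for running in multiple regions.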
Esther: When you are thinking about those problems or removing some of those hurdles, what kind of people are you engaging? What kind of profiles of employees are you bringing on board? Who is on board with Lilt at the moment?
Spence: These systems are quite complicated to build, and that is still the case, although neural network approaches to building machine translation systems have broadened the number of people working in this area. The old statistical systems were very idiosyncratic in the way that they were trained and the way that they were built, and there was just a small group of people that built them. Now more people build them because the models and architectures are used for lots of different tasks, so a broader array of people who are generally good at machine learning can work on them. I think there is still quite a bit of black magic to get them to work correctly.
The core of the team is still a research team. We have about 15 researchers split between San Francisco and Berlin that build the core technology, and around them is an engineering team that builds the application infrastructure and all of the user-facing software. Then as we have become a service provider, a lot of the change in the past two years or so has been adding in a services team and a sales team that knows how to sell in this industry. It is a merging of software technology people with the domain knowledge of how to sell language services in the enterprise. We have an interesting dynamic internally as these two groups of people do not ordinarily interact with each other. Every month there is a services research staff meeting where they meet and they talk about problems. It is well attended and people get excited about this, so that is a good thing.
Florian: You took the business from pure SaaS tech and then added the services layer on top. Was that a reaction to the dynamics that were bubbling up and making sure that these two parts of the business interacted well because it must have been very challenging?
Spence: Yes, a services business is very different operationally than a software business but there is tremendous power in these, what are known as technology enabled services businesses, in that you can build the process and the software in parallel and do that in an agile fashion and that gives you a tremendous operational lever. It is much more challenging to coordinate, but when you are trying to automate what has historically been a function that is very manual, you have to do both at the same time. It is not sufficient to just build the technology and then assume that a service is going to adapt to it. I think in lots of different industries entrepreneurs are finding that this is true.
Florian: Were there particular challenges that you remember from going from a visionary tech to the CEO of a growth services company that is now employing hundreds of staff?
Spence: As the company starts to grow and get larger, the bigger part of this job becomes ensuring that we have a strong company culture and combating complacency and ensuring that we continue to move quickly and innovate. That is the real trick and that comes down to hiring and company culture.
Esther: Tom Tunguz from Redpoint talked about the investment thesis that they were operating on at the time, based on this idea of the AI agency. It is a question of looking at the layer of the economy that is occupied by outsourced service providers or agencies. Then you think about law or accounting and these kinds of traditionally manual professional services, and look at what part of that work can be automated and therefore delivered at a lower cost. How do you see localization and translation fitting into that spectrum? Do you think there is still a lot of work being done by LSPs or translators that is not yet automated but could be with the current state of the tech?
Spence: It is a question of unit economics and 20 cents a word or whatever big agencies are charging. For a single eight and a half by 11 sheet of paper that works out to $60 to $80 and that is what businesses are paying right now. With technology, we have cut that in half and that is still a lot of money. Until we can get that down to a dollar or something like that we have not solved the problem.
The problem is making it possible for everybody to have the same experience that you and I are having speaking English and that is just not true right now. If you navigate pretty much any company’s website or just go around the internet, just flick it into Korean and click a few links, and then you get kicked back into English. That is the problem that needs to be solved and it is an operational problem. It is an efficiency problem, and it is not a problem that can be fixed with hiring, which is the traditional solution and a manual approach. There just are not enough translators in the world for the amount of information that there is so there has to be a technology enabled solution if we want to live in a world where everybody has the same experience, irrespective of the language that they speak.
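The per-page arithmetic in the answer above can be sketched as follows. The rate of 20 cents a word and the target of roughly a dollar per page come from the interview; the page length of 300 to 400 words is an assumption, chosen because it is what the quoted range of $60 to $80 implies at that rate.

```python
CENTS_PER_WORD = 20                 # "20 cents a word", as quoted in the interview
WORDS_PER_PAGE_RANGE = (300, 400)   # assumption: implied by the $60 to $80 per page figure


def page_cost_dollars(words_per_page, cents_per_word):
    """Cost in dollars of translating one page at a given per-word rate."""
    return words_per_page * cents_per_word / 100


def rate_for_target_page_cost(target_dollars, words_per_page):
    """Per-word rate in cents needed to reach a target cost per page."""
    return target_dollars * 100 / words_per_page


if __name__ == "__main__":
    for words in WORDS_PER_PAGE_RANGE:
        today = page_cost_dollars(words, CENTS_PER_WORD)
        halved = page_cost_dollars(words, CENTS_PER_WORD / 2)
        needed = rate_for_target_page_cost(1.0, words)
        print(f"{words} words: ${today:.0f} today, ${halved:.0f} halved, "
              f"{needed:.2f} cents per word for a $1 page")
```

Under these assumptions, a dollar per page corresponds to roughly a quarter to a third of a cent per word, which is 60 to 80 times lower than the agency rate quoted.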
Florian: We frantically try to keep up to date with what is going on and the current state of the art in machine translation, but from where you are at, what are some of the most exciting developments right now? Is there anything that stands out in 2021?
Spence: The transformer and its variants are the dominant architecture, not just within machine translation but more broadly in NLP and in sequence modeling, so that has not really changed since that paper was published in June of 2017. Our team in particular is focused on domain adaptation, so model learning, and we have had some tremendous improvements in that.
What we track in production as our translators are working is next-word prediction accuracy: as they are typing, the system proposes the next word, and we measure how often they pick that versus typing something different. At the beginning of 2020, it was in the 60% range. The figure from last month was 78%, so that means that the system is getting eight out of 10 of the words it predicts right. Now people are still reading every word but this is a meaningful improvement in how much work the machines are doing to augment what people do. That does not come at the expense of people. We are hiring lots of translators right now but it makes it possible for each one of them to do multiple times the amount of work per day than they would ordinarily do, and that is how we amplify the very limited and constrained human resources that we have to be able to make more information available.
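A minimal sketch of how a next-word prediction accuracy figure like the one above could be computed from an interaction log. The `PredictionEvent` record, the sample log, and the exact-match comparison are all hypothetical; the interview does not describe how Lilt logs or normalizes these events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PredictionEvent:
    suggested: str   # word the system proposed next
    typed: str       # word the translator actually entered


def next_word_accuracy(events):
    """Share of events in which the translator kept the suggested word."""
    if not events:
        raise ValueError("need at least one prediction event")
    accepted = sum(1 for event in events if event.suggested == event.typed)
    return accepted / len(events)


# Hypothetical log for illustration only: eight of ten suggestions are kept.
SAMPLE_LOG = [
    PredictionEvent("the", "the"),
    PredictionEvent("contract", "contract"),
    PredictionEvent("shall", "shall"),
    PredictionEvent("be", "be"),
    PredictionEvent("ended", "terminated"),
    PredictionEvent("by", "by"),
    PredictionEvent("either", "either"),
    PredictionEvent("side", "party"),
    PredictionEvent("in", "in"),
    PredictionEvent("writing", "writing"),
]

if __name__ == "__main__":
    print(f"{next_word_accuracy(SAMPLE_LOG):.0%}")
```

A production metric would likely need decisions this sketch leaves out, such as how to treat casing, punctuation, and partially typed words.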
Esther: How are you scaling with established localization or enterprise localization buyers that are used to a more traditional way of approaching localization from pricing to the whole thing?
Spence: Our solution is that we have tried to make it as easy to adopt as possible, and so that means if you have a translation management system and you want to keep that, we can plug into that. If you do not want a translation management system and you want us to integrate directly with all of your business systems, we can also do that. We have a customer surface that looks very much like a service provider. You have an account manager and what we call a strategist, who is your linguistic resource, and then a solutions architect, so a lot of that would be familiar. The bigger issue is that historically, one of the main things that people who run localization optimize for is risk mitigation. One of the biggest barriers to entry in this industry is simply that people do not change things unless they are compelled to, and this presents a barrier for new technologies, new processes, and new ideas. The single thing that I would like to see change is more openness to new ideas and innovation.
Florian: When you talk to end clients, the marketing people, those global marketing teams, they talk about global customer experience, they take it from a broader perspective. How can we elevate that conversation around localization so it becomes part of this bigger picture and we are talking the same lingo to these people? Is that something that you are focusing on?
Spence: It is. I think that language should be thought about as a competency of customer experience so what do I mean by that? Companies spend a bunch of time on customer experience because the world is competitive and there are lots of alternatives. One way that you can differentiate is the degree to which you personalize the experience for each customer so that you can build loyalty with your customer base. One way to not build loyalty is, for example, if I buy a Toyota and the gauges, the dashboard, the instruments and the display are all in Japanese and I cannot change that. That does not make me feel like a high value customer of Toyota and that is how a lot of digital experiences are on the internet.
When you move up from localization into the strategic level of thinking about customer experience, then you need to think about what the experience of the customer is when they go to the website, when they use the app or the support site, when they call in to support. Across all the different components, the touchpoints where you interact with the business, is that consistent in the same way that it is for English? For most businesses, it just really is not, and so localization is kind of a function box that you put documents into and you get documents out of. It is not a strategic function that helps a business stitch all of these things together. There is an operational and strategic element to that and I think that is where language is properly situated in the enterprise.
Esther: How do you think that LSPs can help raise the profile of localization within these enterprise organizations? Is it a case of helping the localization leaders have a seat or access to the executive table? How does localization get elevated to a strategic level?
Spence: It is the responsibility. As a service provider, we are not employees of the business, so when the localization job is focused on vendor management there is just not going to be any change effected internally. When that job is transformed into a strategic enabler of other parts of the business, the language that is used is one of customer experience and growth. That is what they are designed to do, and if you can present a business case in those terms, with data to support that business case, then this can be thought of differently in the enterprise.
Esther: Is there data to support that, when you are talking about return on investment or value add for clients? How are people looking at this in a tangible way when you are building that kind of business case? What really speaks to customers?
Spence: Our research team last year built a prototype product, which we use for some of our customers, and it does two things. It crawls websites and then it connects to an analytics product like Google Analytics, and for a page that you show in English and in another language it shows the differences in engagement. The results are really strong, even in surprising cases. We have a customer that is a software company, and you would think all software engineers speak English, because 20 years ago most programming languages did not even support Unicode so you just had to speak English. It turns out that if you give software engineers the opportunity to read things in their native language, then they will just do it, and you can actually prove that. When you prove it visually in a data-oriented way, then it is not a difficult case to make to expand that experience to other languages. Going into a business meeting and saying we have these QA metrics or we have this cost basis, nobody cares.
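The comparison step of a tool like the one described above might look like the sketch below. It covers only the final calculation, the relative difference in an engagement metric between the English version of a page and its localized versions. The metric, the page paths, and the numbers are made up for illustration, and no real analytics API is called; in practice the values would come from an analytics export.

```python
def engagement_lift(page_metrics, baseline="en"):
    """Relative difference in engagement per page and locale versus the baseline locale.

    page_metrics maps a page path to {locale: engagement value}, where the value
    is any per-visit engagement measure (here: hypothetical seconds on page).
    """
    lifts = {}
    for page, by_locale in page_metrics.items():
        base = by_locale.get(baseline)
        if not base:
            continue  # no baseline data for this page, nothing to compare against
        lifts[page] = {
            locale: (value - base) / base
            for locale, value in by_locale.items()
            if locale != baseline
        }
    return lifts


# Hypothetical data for illustration only.
SAMPLE_METRICS = {
    "/docs/install": {"en": 40.0, "de": 50.0, "ja": 60.0},
    "/docs/api": {"de": 35.0},  # skipped: no English baseline
}

if __name__ == "__main__":
    for page, by_locale in engagement_lift(SAMPLE_METRICS).items():
        for locale, lift in by_locale.items():
            print(f"{page} [{locale}]: {lift:+.0%} vs en")
```

A real comparison would also need to account for differences in audience and traffic volume between locales, which this sketch ignores.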
Florian: Where do you see the improvement trajectory of this all? Where do you see this heading over the next five years? Do you feel it is going to be companies or customers on the internet who expect a quasi-perfect, natively adapted quality? How fast will we see this?
Spence: In the time that we have been doing this, what has been most surprising to me is that over the last couple of years the rate at which MT is getting better is extraordinary, but companies’ quality expectations are going up at least as fast. What most companies are talking about is their brand voice or their copy-editing guidelines, and historically companies spend a lot of time writing their English copy-editing guidelines. Then it sort of goes through this localization meat grinder and it comes out the other side as generic German or something like that. The real opportunity is to have that degree of specificity and precision in the authoring of the texts, both on the source side and the target side. That is just really hard to do with automation and so I think that is a frontier. Yes, now systems can generate outputs that are both fluent and adequate, but are they preferred?
Florian: You raised some money, Series B was the last series about one and a half years ago. How is it to raise capital for a localization translation related startup? What is the perception out there in the VC community about this space generally?
Spence: Historically, going back a number of years, there was less investment in this industry, but in the last year or two there has been more venture investment, more private equity investment, and more outside interest. The objective that we have, if Lilt achieves anything, is catalyzing change towards all products and services being available in all languages. We are not going to achieve that on the small amount of capital that we have raised. The business is growing and we will do more, but we need lots of companies, lots of capital, and lots of investment to catalyze that change in the world. It is generally a good thing that there is more outside interest in the language industry and that people are building companies and technology in service of this mission, which is why we started this company and what we hope to achieve with it.
Florian: How would you weigh this availability of capital versus genuine excitement around the industry in the investment community? Do you think it is 50/50?
Spence: There is a lot of money out there, that is true. I also think that there is more interest. More investors now have this tech-enabled thesis than a few years ago because I think the experience of building these machine learning companies is that enterprises need a lot of help with change and adopting these technologies is quite hard. You just end up building a services function to drive adoption and that leads to a very different business model than a pure SaaS business. More companies are being built this way and therefore there are more examples of this and that generally helps investors develop a thesis.
Esther: When you are pursuing that goal of making products and services available for everybody everywhere, what kind of features does that depend on? What have you added in the past 12 to 18 months that have been some of the key developments for the roadmap?
Spence: I will give two examples. The first is we have a point of view with our integration with the enterprise which is connector first and let me describe what we mean by that. I think 10 years ago the business systems that people used, be they content management systems, source code repositories, document management systems, general enterprise content management, those systems did not have really great multilingual support. Then you ended up needing these big pieces of middleware, which we call translation management systems to kind of augment the deficiencies of those systems and do workflow.
Today the systems that companies are using have awesome multilingual support, and so what we have been building towards is building the multilingual capability directly into those systems. You can think about it in much the same way you use Google Translate. It is integrated directly with your phone, it is integrated with your browser, it is integrated with your email, and you have it accessible right there where you are doing content authoring. That is how we are building out our enterprise integrations, and so that means we will connect to a TMS if you already have that and that is your integration point, or we will build directly into your systems if you are in the process of re-platforming and doing digital transformation. We have gone from a couple of these connectors to 30 or 40 of them and we have a whole team internally working on building these out. That is one big change that has made it much easier for companies to access language and implement multilingual workflows than before.
The second part is that internally companies are wanting to go to market faster and move faster. The internal pieces of the production process transform a piece of content from one language into another. One of the bottlenecks is QA, and so for the last year and a half we have been working on what we call auto review. It is basically a grammatical error correction system that starts to automate part of what a traditional reviewer would do. We spent a lot of time building support for the translation part of the production process, and very little time trying to augment the review part of the process. That takes quite a bit of time, especially when you get into having internal stakeholders review things such as the preferred translation and the brand voice in the other language, where a lot of that happens at the review phase. The team has been building this auto review product that we are going to publish a paper on in the next couple of months and then hopefully get into production later this year. That is going to increase quality and also fix what has been a production bottleneck for a long time, so look out for that.
Florian: Things are going quite well for the language industry and probably will do so for the next five years. How do you see the next two, three, four years in this industry? Would you agree that it looks quite strong?
Spence: The pandemic is a horrible situation and a lot of people have been hurt and suffered. One consequence of the pandemic has been that more companies now are focusing on their online and digital experiences and as a consequence of that thinking, they are also realizing that they need to speak to their customers in their own language. In the past three to six months, it seems like we have just had a number of conversations that are more strategic in nature and there are these sort of digital initiatives going on to pivot companies more broadly, not just those in Silicon Valley or in tech hubs, towards having comprehensive online presences. That is the exogenous event that will accelerate the need for language technology and services and I think that presents a lot of opportunity for those of us working in this field.