Samuel Läubli, Partner and CTO at TextShuttle, joins SlatorPod to talk about the ins and outs of a language technology provider, the current state of machine translation, and his experience as a researcher and an entrepreneur.
The CTO touches on his background in Computational Linguistics and his decision to go back to academia in 2016 to learn more about the then-emerging neural models for machine translation. He gives his take on the current state of machine translation, particularly weaknesses around sentence-by-sentence structure and limited control.
Samuel discusses his thesis, which tackles three key challenges in MT for professionals: quality, presentation, and adaptability. The conversation turns philosophical as Samuel debates whether machine translation can become truly creative without artificial general intelligence — or if it will always be considered imitation.
He then walks listeners through TextShuttle’s business model as well as the key problems the company solves for clients, ranging from producing MT systems to helping with configurations, workflows, and training translators.
Samuel also shares his insights on the future of MT, unpredictable as it may be, and TextShuttle’s initiatives with controllability and the adaptive machine translation paradigm.
First up, Florian and Esther discuss the language industry news of the week, in a tech-centered episode.
This week, RSI platform Interactio announced that it had raised USD 30m in series A funding, led by VCs Eight Roads Ventures and Storm Ventures.
Esther delves into Straker’s 100-page annual report, which showed the Australia-listed LSP’s 13% revenue growth to USD 22.6m for the 12 months to March 31, 2021. Straker shares jumped more than 14% the day results were announced.
The duo also discusses Akorbi, another fast-growing language service provider (LSP), which recently acquired the low-code process automation platform RunMyProcess from Fujitsu — a surprising move by the company as they expand to business software unrelated to translation.
Heading to Japan, Florian goes over Honyaku Center’s 2020 financial results, which saw revenues decline 14% to USD 91m and operating income nearly halved to USD 3.8m.
Florian closes the Pod full circle with more machine translation news: a research paper presented by Bering Lab about IntelliCAT, an MT post-editing and interactive translation model; and, out of big tech, Microsoft Document Translation, a recent addition to their enterprise MT offerings.
Florian: Tell us a bit more about your personal background. How did you get into the machine translation space, on the one hand as a researcher, and then also as an entrepreneur?
Samuel: I got into this when I studied, 10 years or so ago, at the University of Zurich. I got attracted to the field of Computational Linguistics and honestly, I did not know much about it. I was interested in programming, but not so much in classic IT. I started studying Computational Linguistics and at the time we had a very good professor, Martin Volk, who got me interested in it. I went on to do a masters in Edinburgh and then I had my first job in the industry at Autodesk, a software company in the US. My job was to make machine translation better or to use machine translation to make translators faster in localizing software products and so that is how I got into it. Working at a big company can be interesting and it has its benefits, that is for sure.
There was a paradigm shift on the horizon, neural machine translation. I felt like it was a good idea to go back into academia to learn more about it because at the time people who had a background in statistical machine translation really did not know much about it, and I do not regret this at all. When I went back to academia in 2016, this was very much the right decision. That is also how TextShuttle started because at the time there was not a commercial offering for neural machine translation. At some point, Google shifted, of course, and then DeepL came along, but that was all a bit later. It was kind of clear to us that this was going to have an impact on the industry as well and it might be a good idea to start a company there.
Florian: Tell us a bit about the founding and how it started?
Samuel: TextShuttle was actually founded by two professors at the University of Zurich in 2009 and at the time statistical machine translation was the big thing and the idea was to produce systems that are geared to the media industry. Basically subtitling because they had short sentences and you could show in research that you have an advantage if you post-edit this output, even at the time. It turned out it was not so easy. Machine translation was not easy to pitch to people. There was a lot of resistance, not only from professional translators who produced the subtitles at the time but also machine translation was not in the news as such. We reactivated it a bit later because this paradigm shift was coming.
Florian: You reactivated the company, what was the vision at the time and how has it changed until now?
Samuel: The idea really was to get breakthroughs that were happening in research to people, to users and professional translators in particular as quickly as possible because the bigger a company is, the more of a delay you usually have. You build your machine translation pipeline, which costs a lot of money and you have got to monetize it. For quite some time many companies were saying neural machine translation is a thing. We did not want to do this. We did not have this legacy or technical debt whatsoever so that was the goal.
Esther: How good is machine translation in your view and tell us about the current state of machine translation?
Samuel: It is an interesting question obviously and one that you can talk about for hours. I think machine translation today is at a level where it is certainly useful. It is useful to both professional translators and also everyone on the internet, so for professionals, it has really been proven in many language combinations and domains. You can save time if you use machine translation, this is not about replacing them, but it is a productivity tool. For everyone out there, obviously, it is a tool to get a gist of what is in a text. I think this works fairly well, probably not accurate in every detail but you can get the idea of what the text is about.
It has weaknesses and these are important to talk about too. Machine translation is very often still sentence based so the context of a document is not really considered when a sentence is translated. Basically, when you translate a text you get a sequence of translated sentences, you do not get a translated text back from a system. There is little control. I think this is also very annoying with machine translation today. We have seen features such as a politeness feature that DeepL offers, for example, but other than that there is not so much you can control about machine translation. It is usually not too bad, but if you want to turn a few knobs and want the output to be more like this or more like that, it is not so easy.
Florian: With this document level content, is the general problem computation? It is just too much or is it that the knobs are not quite right? What is the core problem to getting this better?
Samuel: You can frame it as a hardware problem, so the problem is you have a sentence and then technically you have a maximum length of a sentence you produce in the target language. The basic problem is at every position so, at every word, there can be any word in the vocabulary of the target language. If you combine every word times every word and so on throughout a whole sentence, this gives you a huge amount of possible combinations. It is already a very expensive problem and when you compute the translation of a sentence, there are a lot of calculations involved. Obviously, if you want to go beyond that level, technically it could be similar to just computing a sentence, but then you need even more hardware. That has actually been done in research. It has been done successfully. It is just too slow and it is too expensive these days but this is going to change.
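To make the "every word times every word" arithmetic concrete, here is a minimal sketch of how the number of candidate outputs grows with length. The vocabulary size and the sentence and document lengths are illustrative assumptions, not figures from the conversation.

```python
import math


def search_space_size(vocab_size: int, length: int) -> int:
    """Count the sequences of exactly `length` tokens over `vocab_size` words."""
    return vocab_size ** length


def search_space_magnitude(vocab_size: int, length: int) -> float:
    """Return log10 of that count, which stays readable for huge values."""
    return length * math.log10(vocab_size)


if __name__ == "__main__":
    VOCAB_SIZE = 32_000      # assumed vocabulary size, for illustration only
    SENTENCE_LENGTH = 25     # assumed maximum sentence length in tokens
    DOCUMENT_LENGTH = 500    # assumed document length in tokens

    for label, length in [("sentence", SENTENCE_LENGTH),
                          ("document", DOCUMENT_LENGTH)]:
        exponent = search_space_magnitude(VOCAB_SIZE, length)
        print(f"{label}: roughly 10^{exponent:.0f} candidate sequences")
```

Real systems never enumerate this space. They use approximate search such as beam search, but the cost of scoring still grows with the amount of text the model has to consider at once, which is why going from sentences to whole documents needs more hardware.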
Esther: You mentioned that part of the vision of TextShuttle was to get MT into the hands of professional translators. That also ties in with your doctoral thesis that you published recently. Can you give us just a quick overview of your thesis and some of the context around it?
Samuel: The thesis basically tackles three key challenges in machine translation for professionals. Overall, I think there has not been so much research about machine translation in the context of professional translation. There are all these technical papers out on arXiv, but the typical machine translation researcher does not necessarily care about professionals. I am not even sure if the big companies really do. The first of the three challenges I was tackling was quality. There was this study or experiment about human parity, where Microsoft claimed, we are as good as professional translators in translating news now, which turned out not to be the case, depending on the evaluation setup.
It was also about presentation. We found that the way in which you show translations to professionals, and these are not necessarily machine translations but can be fuzzy matches, actually has an impact on how fast they work and how accurately they can work. Then there was another part about adaptability. That was a bit more technical. The gist is machine translation can incorporate resources that professionals have been using for a long time, such as translation memories and terminology databases, but this is not necessarily available to them because of the tooling they use.
Florian: You mentioned the top to bottom versus left to right aspect. Do we know what is the most common variety?
Samuel: It is about how sentences are shown on a screen and there are a few weird things happening if you work with a CAT tool. People have now accepted it. It was a huge discussion back in the late 90s, early 2000s. You have this CAT tool, you basically throw out all the layout the text has and you basically format it as an Excel spreadsheet. Then there is the source sentence on the left and the target sentence on the right. This is the classical arrangement. At least that is what the study participants told us, so most translators we recruited for this study were used to having the source sentence to the left and then the translation to the right.
There is another configuration where your source sentence is on the top and then underneath it, you can write out a translation. Think about the number of eye movements or the distance you cover with your eyes: if you constantly go from left to right, it is more than if it is just top to bottom. There is probably less movement involved there, but we have not used eye-tracking. One of the findings was that even if participants did not normally use this setup, when they used it in the study, they found it rather convenient and it made them faster, so one explanation could be that it is just closer together. Well, in most languages, at least.
Florian: I keep saying that truly creative MT is impossible unless we achieve artificial general intelligence at some point. As the expert, what are your thoughts on the technical feasibility of that?
Samuel: I find predictions hard to make. In machine translation, they were always so wrong. I do not want to go back to the 60s but it has been a five-year problem for so long really and then it is solved. If you think about how a machine translation system is trained or how it learns, what does it do? It looks at existing translations and then tries to imitate these translations when you bring in a new text. Now you can go two ways when you argue about this, so you can say since it imitates it can never be truly creative because it is always going to imitate what it sees in texts.
On the other hand, if you think about how a human translator learns, it is probably also with looking at the translations and then reproducing them to some extent. Honestly, I could not say. At some point, there is so much text and so many things to combine and you add some more technology to the mix. Maybe it is not truly creative in the sense that these machines will then think and will also be able to play chess but it could at least seem as if it was creative.
Florian: To me truly creative would be that it takes somewhat editorial judgements. That it is very differently worded, but it still means the same at the core. As a human translator, you could always go and defend it, but could a machine ever start taking these types of decisions?
Samuel: I think it could because we often talk about these cultural differences in translations or languages, but if the training material you use contains examples of when people have made this transfer, why could not the machine reproduce it? If the human translator produces a translation from Chinese to English, that is then in the training material and so for this exact entity, maybe the machine would even be able to reproduce it. Hard to say. It is not going to be thinking and considering the webpage it is on or the background image and everything. You have to at least take more information into account than just the text when you produce the creative translation.
If you think about politeness that is one feature, how you would brief a human maybe, and there can be other things. Even if you want a text to be shorter or longer, these things are relatively easy to adjust even today. Not necessarily available in CAT tools or your online system but you can technically do it. Not sure if it is creative but at least it is more controlled in the sense that it would maybe produce a text that comes closer to what a briefed human being would produce.
Esther: What are your thoughts about MT as a managed service in terms of being a growth area of the language industry?
Samuel: I think machine translation as a managed service clearly has its benefits. There is definitely a need for it in the language services industry. If you just think about how many resources there are out there, translation memories, termbases and so on, that people have been curating for years, decades, sometimes, and they are not necessarily really used. When people use machine translation these days, of course, you can connect your system and then tie it into your CAT tool in some way but these resources that are there are not considered. If you have your own technical expertise, maybe you can mix them together but as soon as this is not the case, then you probably need some kind of managed service.
Esther: What type of problems are you looking to solve for clients? What else is there that drives clients to come to a company like TextShuttle?
Samuel: It does not make sense for every small language services department to hire their own technical specialists to do machine translation. You also do not have a person that creates a CAT tool for you typically. Maybe some people do, but typically you do not. That is among the key problems we solve and if you go more into the details it is also about making translation specialists faster. You have these resources and if you produce a domain-adapted system, the output is going to be closer to what you eventually want, so you have less post-editing effort. It is not all about that.
We do not work with tons of companies yet. If you look at our website there are a few companies, bigger players in Switzerland mostly that may have their own language services department, but they have been regarded, to my understanding at least, as a cost center. In many big corporations, they have translation specialists and it is always at the last minute they have to translate something and it costs them X thousand dollars. This is horrible, but now these people, and not only the translation departments, have translation needs, so if they want to translate a contract quickly, they have a legal department.
It would be very handy if they could actually use the resources that have been produced inside the company because there is a way in which certain words are translated. If you go to your online service of choice, you might have to anonymize the document and then you have mistranslated words that are in there. For our clients, for example, if they can make their machine translation system available to the entire company for them to translate, it is a way to make visible the value they are providing. They are not just a cost center and that is one of the main drivers for our business and our clients.
Florian: How do you do sales and marketing? Is it client prospects, they contact you or are you actively going out there and doing lead generation and business development?
Samuel: Honestly we could do a lot more there. We are 10 people now and it is profitable so far. That is good, even without lots of business development and all this pipeline, but that is where we need to go next. We have a person that does marketing, among many other things. Our sales pitches are probably too technical. Honestly, it is hearsay, so clients hear it has been successful at company X. They want to have this too so they contact us. It is not so much us going out there and doing this actively which we will probably need to change soon.
Florian: Have you ever considered taking on outside investment? What is your stance because in this industry everybody is getting a series A, series B?
Samuel: It is crazy. It is a possibility, certainly. Again, we are surprised how well this has worked out in a sense. Personally, I am quite happy about how this went down so far because I think if we had raised capital earlier, we certainly would have grown faster, but I do not think the product would have been better or it would have been better by now. We were able to make decisions based on actual needs. There is an actual problem, we solve it and then we can take a bit longer and implement it the way we want.
There has been a focus on how end-users, so professional translators again, really use machine translation. It is not necessarily about having to tick all of these boxes in an enterprise sales process, and then we need to get into it. For the users that actually use it at the end of the day, that is not necessarily useful to them. I know this all sounds a bit idealistic. Of course, this will also change to some extent and it will have to in the future.
Esther: You are a team of 10 people at the moment. What are those types of roles and are you hiring for anything in particular right now?
Samuel: Yes, it is mostly people with a technical background. Since we have close ties to academia, the University of Zurich mainly, but also ETH in Zurich, we have a good way to get hold of young talent. I would say that seven out of these 10 people have a background in machine learning or at least informatics. We are hiring a backend engineer now, but we will be hiring an account manager soon and also other non-technical roles so this is all on the horizon.
Florian: What do you think is the big deal with GPT-3 and these big multipurpose language models in the context of a narrower use case like machine translation? Are you excited?
Samuel: Language modeling, obviously, that is a part of machine translation. You ingest tons of text and then you model the language in that way and you can create new texts. Basically, a machine translation system is a language model, it just has another input source, basically the source text. I am excited about the ability to actually get hold of so much language data and then train a model with all of this data. Also, lots of resources are used there, so this is a very interesting area. People have tried incorporating big language models into machine translation. If you look at the state of the art systems today that is not necessarily a component so I am not sure where this is going. I am not excited, as an MT researcher, particularly about GPT-3. It is very interesting but not sure if it has a direct impact on machine translation today at least.
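The point that a translation system is a language model with one extra input can be written down compactly. This is the standard autoregressive factorization, not something specific to TextShuttle's systems. The symbols are assumptions introduced here: x for the source text, y for the output, and T for the number of output tokens.

```latex
% Plain language model: each token depends only on the tokens before it.
p(y) = \prod_{t=1}^{T} p\left(y_t \mid y_{<t}\right)

% Translation model: the same factorization, additionally conditioned on the source text x.
p(y \mid x) = \prod_{t=1}^{T} p\left(y_t \mid y_{<t}, x\right)
```

The only difference between the two lines is the conditioning on x, which is the "other input source" Samuel refers to.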
Florian: Briefing the machine in one language and having an output in another language, would that be a feasible scenario?
Samuel: Sure, I think this will work and Facebook has a model where this works very similarly. It is just that it is cross-language. It is not monolingual. You can even do it with GPT-2 or 3. You can make a prompt and say ‘the man goes to the shop’ in French and then it will complete the sentence. This already works with these models. Basically, the concepts are very simple there. It is inherent learning of translation because translated sentences are out there on the internet and if you use all of them to train a model, then the model can translate somehow.
When we were studying computational linguistics, machine learning for natural language processing, essentially, we built systems like these 15 years ago, or 10 years ago. They did not have context and the sentences that would come out were odd. In that sense, it is very interesting to see how fluent this suddenly is and how you can give even more context and it will still produce a meaningful continuation. The concept itself is not really new. That does not mean it is not interesting for the industry. You can get a continuation for your prompt but how can you then steer it to go in the right direction? I guess that is what these companies are now trying to figure out. That is interesting.
Esther: What about the role of Big Tech and particularly the fact that you have Amazon, Google, Microsoft, offering increasing customization and differentiated machine translation offerings as well?
Samuel: There are offerings from these companies. If we are talking about professional translation, it does not seem to me that these companies are interested in solving problems that professional translators have. This is primarily geared to end-users who use raw machine translation. That is my understanding and I guess there is more value there. In that sense, I am not so sure if customization is really a priority for them. Of course, you can now lexicalize your machine translations with Amazon, give it a glossary and do things.
Esther: Are you likely to bump up against them if you are ever talking to customers?
Samuel: No, that is not typically a problem. Maybe DeepL as a competitor, but that is mostly because for many people in the German-speaking market DeepL is good machine translation these days. Slator once described TextShuttle as a boutique player and I think that is true. If you want a machine translation system for your team of professionals, it is going to make them faster and you can commoditize it. You can make a business case. That is not the same as just going and connecting some system and then seeing what comes out. This is a different setup and honestly, I think many people would also struggle to actually integrate these services meaningfully into their processes. TextShuttle does not only produce machine translation systems, we also consult people. We tell them, well, how can you configure this? How can you configure your workflows? How can you train your translators and so on? I think that is also a differentiator there.
Florian: MT outlook in the next two, three, four, five years? What are you guys working on?
Samuel: Again, predictions are always wrong in this field. I stand by it. On the immediate horizon, I mentioned controllability. One thing is this whole focus on sentences and contexts getting larger and larger, document-level translation. This is clearly coming. You could argue it is solved in research. It is a matter of how much money people want to invest to run it now.
What are we working on? We are also working on controllability in the sense of using resources that have been created. It is funny, people were so excited for a long time, and still are, I think, about this adaptive machine translation paradigm, so the idea where you edit a translation and then the system does it the same way in future projects or even in the same document. At TextShuttle, we find this a bit weird in the sense that if at a certain company, you have your glossary and you want to translate X as Y, why does the machine translation system have to make a mistake and you have to correct it and maybe correct it again for the system to eventually learn what is already known upfront? This is domain adaptation where you do not necessarily need to fine-tune your systems and do long training, but if you activate this translation memory, the MT output is going to look like that. If you connect this glossary, the MT output is going to look like that. That is what we were working on, for example, and I think others will be too. I guess that is going to be interesting for people who use machine translation in a professional context in particular.
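As a rough illustration of the "translate X as Y" idea, the sketch below checks whether MT output respects a glossary that is known upfront. It is not TextShuttle's method, which the conversation does not describe. The function name, the glossary entries, and the example sentences are all hypothetical.

```python
def glossary_violations(source: str, target: str,
                        glossary: dict[str, str]) -> list[tuple[str, str]]:
    """List (source_term, required_target_term) pairs the output ignores.

    A pair is reported when the source term occurs in the source text but
    the required translation does not occur in the target text. Matching is
    a case-insensitive substring check, so inflected forms are not handled.
    """
    src = source.lower()
    tgt = target.lower()
    return [
        (src_term, tgt_term)
        for src_term, tgt_term in glossary.items()
        if src_term.lower() in src and tgt_term.lower() not in tgt
    ]


if __name__ == "__main__":
    # Hypothetical company glossary: "invoice" must be rendered as "Rechnung".
    glossary = {"invoice": "Rechnung"}
    source = "Please pay the invoice."
    for mt_output in ["Bitte zahlen Sie die Faktura.",
                      "Bitte zahlen Sie die Rechnung."]:
        print(mt_output, "->", glossary_violations(source, mt_output, glossary))
```

A check like this only detects a mismatch after the fact. The approach Samuel describes goes further and has the system produce the glossary term in the first place, so nothing needs to be corrected.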