The Great ChatGPT and Translation Debate

SlatorPod #162 - ChatGPT and Translation Panel

This week, SlatorPod hosts its very first debate with guests Adam Bittlingmayer, CEO of ModelFront; Varshul Gupta, Co-founder of Dubverse; and Mihai Vlad, General Manager of Language Weaver.

To start off, the panel participants reflect on their recent experience with ChatGPT since its launch in November 2022 and how this shapes their views on large language models (LLMs). Varshul and Adam talk about how clients view ChatGPT.

Mihai agrees with the idea that the language services industry is exceptionally well-prepared for the launch of ChatGPT due to its experience with human-machine interaction. Varshul discusses how LLMs have influenced startups like Dubverse to build prototypes that can handle edge cases.

Mihai shares the challenges of deploying LLMs in large enterprises. Adam and Varshul highlight how parameters such as security, data privacy, latency, throughput, and cost are essential to consider in an enterprise setting.

Varshul and Mihai talk about the potential of multilingual content generation from scratch and how it will affect production costs. Varshul shares how they continue to attract users throughout this AI hype and the importance of adding a UX on top of LLMs.

Subscribe on YouTube, Apple Podcasts, Spotify, Google Podcasts, and elsewhere.

Adam discusses the potential for LLMs to assist translators in their work, although the implementation of this tech may take some time to become the new normal. Varshul and Mihai debate how services-focused companies should react to the rapid advancements in LLMs, whether you wait to see how things pan out or go all in to stay ahead of the curve.

The panel rounds off with emerging use cases for LLMs, from building prompt-based systems for more concise translations to addressing long-tail languages that are often overlooked by machine learning due to the fragmentation of the language industry.

Transcript

Florian: Briefly, just introduce yourself to the audience, talk a bit about your business and your role at the business. We’re starting with Varshul, then go to Mihai and then Adam.

Varshul: I’m Varshul the CEO and founder of Dubverse. We at Dubverse essentially are into video dubbing using AI. So any piece of video content you can convert from language A to language B, C, D. Now, why we’re trying to do that? A very core important piece, value proposition that we offer to our creators is that content should not be limited in one language. If any piece of content is created to be consumed and that needs to be consumed in multiple languages, so that’s the larger value proposition that we bring on. We deal with transcriptions, translations and text-to-speech and happy to chat more on, deeper on translations with GPT.

Mihai: I head up Language Weaver. We started our journey with Language Weaver in the government and public sector space and then we developed our machine translation product to be suitable for enterprises. Right now we’re enterprise grade and government grade machine translation. And as we’ve expanded into other markets, we’ve expanded also our NLP stack beyond machine translation, a bit of summarization and some quality estimation, so I think we’re going to be talking around that. But yeah, thanks for having us and maybe one bit of trivia. The majority of the traffic that passes through RWS passes through Language Weaver, so RWS is our biggest customer.

Adam: Hi, I’m Adam. My background sort of started in the consumer side in Google Translate, and before that I had been in Android. So if you kind of think about what happened there, we kind of made the machine translation, instant translation free for most of humanity at this point. Four billion people have an Android phone. And now at ModelFront, the mission is to make human quality translation not free, but radically more efficient. It is very inefficient and so we do that by predicting which machine translated segments, or human translated segments even, do not require human post-editing or do not require human review to get the same final quality, but skipping a significant portion of the segments. And at ModelFront, so we mainly work with buyers, translation buyers who have large volumes, tens of millions or hundreds of millions or even a few billion a year. And then it’s like a machine translation API, it basically gets integrated in their TMS.

Florian: A broad spectrum from the MT side, but also, of course, startup with Varshul there and Adam. So I want to start just by kind of collecting everybody’s maybe personal experience over the past three to four months with ChatGPT. It feels like it’s been ages, right? But it’s only been like three, four months and ChatGPT and other LLMs. So my personal one, maybe just to start, like I remember when it came out, I tested it, I dug out my old OpenAI account, like, oh, wow, it still works, I can still log in, I can still use it, so I didn’t have to kind of wait in line. Started playing around, then initially produced, like a short video where ChatGPT pretended to be a translation manager and talk to each other and we put it in a Synthesia video and uploaded it on YouTube. And lo and behold, it got like four times more views than the average podcast we do, which is ridiculous. But so there was this wow moment, and then it was followed by like, okay, what does this actually mean for the industry? Like, we kind of initially, okay, can it translate? Yes, it can translate and now it’s been just a journey of unpacking potential repercussions for the past two to three months. So just wondering what your experience was, maybe starting with you, Mihai?

Mihai: Yeah, so maybe two stories here and I’m not going to talk much about the translation. I think we’ll have enough time to cover that. But the first thing is my behavior changed fundamentally after obsessing over these LLMs, starting with OpenAI’s. So in developing a habit to constantly ask questions through this chat-like interface and getting useful answers, I just found myself every time when I engage with a website or with a company, and they have this intercom or HubSpot bubble, when you’re trying to interact with something, you almost like want an answer instantly and every time when you can get to engage with potentially a human is a bit bizarre. It’s like you’re learning very quickly this habit of getting insight and answers very quickly. And then when I think on the other side, bit of trivia, so every time when someone leaves Language Weaver and they join other companies and whatnot, as a parting gift, what we used to do is to build a model, a language model around the emails that they sent. So it’s all kept inside the walls of the company and build a bot for them so that we have a bit of a memory of what they’re doing. So we’ve been doing that for two or three years, especially when CEOs have left and it was kind of used to get a sense of what models or language models produce. But, yeah, it’s just been shocking to see the amount of value and the nuance and the direction and the possibilities that these models are providing. So it’s just mega exciting to be in this business now or in this area.

Varshul: I think since the time that GPT has come out right, I’ve been too excited and too psyched about this, essentially. The way I’m looking at this, essentially, is that entire human knowledge now is wrapped under API. Is there a better impactful way to deal with any form of textual technology? I don’t think so. I mean, LLMs is something that will be massively impactful and it’s just sort of very early days for people trying to figure out how to use it, be it prompts, be it embedding, be it vector spaces, but I think going forward, this will massively disrupt a lot of workflows, user behaviors at startups, so pretty excited about what’s going to come. A little bit of an anecdote, like I come from a land of about 25 official and about 100 unofficial languages and thousands of dialects, right, so this is what I was pretty much most excited about. Wherein is it just around the context of English being a modality or are there more other languages covered as well? Which is where people like us, folks like us, essentially can leverage LLMs of translation. So, yeah, I think this has been my little journey around this.

Adam: I would say technically LLMs in some form have existed really for five years or more, what’s technically an LLM, right. This new class of LLMs is something different and it’s definitely interesting. So sort of my experience starts though going back years to the time when I was writing out the code at ModelFront. And right now we’re obviously exploring this and it’s not an, if it’s a when. It’s happening on the quality prediction side and on the machine translation side and I also try to use it in my personal tasks. I have tons of annoying personal tasks to do and I almost view it as like a measure of how useless the task is, right? The better ChatGPT does at it, the more pointless the task was.

Florian: Was it for you more incremental when you saw it? When you first logged in and got kind of the chat interface? Were you like, okay, well, this is a logical next extension having followed LLMs for four or five years or was it still like, okay, this is a new quality here?

Adam: You know what they say, right? Quantity has a quality all its own, right? So if you make something radically bigger, all of a sudden it takes on new properties, so I do think there’s some of that. There’s an interview with me and you and Slator going, I think it’s got to be two years at this point saying, hey, this was the logical extension of the current paradigms and I would say again, this is the transformer-based model, right? A former colleague of mine, right, Jacob was one of those co-authors in 2016, now it’s just transformers, just bigger and more and more languages, right? But that does fundamentally change things. I saw this great article. I was reading a great article on the plane today from… George that the Hindenburg crash happened actually 30 years after the Wright brothers first flight. So if you kind of look at this thing where you say, okay, these balloons basically, right, balloons was like one thing, and The New York Times, of course, it’s always The New York Times, right? New York Times was saying these heavier than air machines, meaning airplanes, they’re never going to compete with balloons, right? And obviously there were still a lot of people investing tons in the balloon paradigm as the real next paradigm was already taking off.

Mihai: I think your question was super interesting, like was it an incremental change? Were you expecting this to happen? And I think the majority of us have had an experience primarily with ChatGPT and I’m still scratching my head because chat interfaces existed, OpenAI’s Playground existed, and we played with it, and yet it didn’t give us this experience. And somehow I found very recently an interview with Sam Altman, one of the founders and the current CEO of OpenAI, and it was at Greylock Partners. And he was predicting, so that was like six, seven months ago, he was kind of making the point that the adoption of these technologies will go through a really good user interface. And he was making the points like, look, no one has actually taken on Google, no one has actually taken on Search, and he was kind of like insisting on the user interface. And yet I think there’s more than the combination of the two really good models and the chat interface. Yes, it’s easy to understand the chat interface, but I think it’s ultimately a collection of more layers. Maybe a safety layer, maybe more models behind the chat interface, maybe more filters, a bit of history passing on context. I think we probably want to simplify this and say, well, it’s just a chat interface on top of a large language model. That’s the killer application and I think there’s more layers to the onion. TL;DR in this answer, like I was not expecting this, definitely. It makes sense looking backwards, but it felt very surprising and it still feels very surreal.

Florian: Varshul from your early clients, established clients, what were some of the initial feedback you got, inbound you got? Has there been a lot of kind of feedback or was it relatively muted so far?

Varshul: Essentially, when we talk about languages, right, specifically in India, it’s very contextual. So Hindi is the most popular language that’s been spoken, and I’m a native Hindi speaker, but then when we converse day to day, it’s not pure Hindi, it’s a mix of Hindi and English, right, so that’s what… I mean, for the first time when we were trying this out, we were surprised to see the results, right, because practically all of the translation systems that are currently built are based on textual translations, or bookish translations, versus we do more of AV content, audiovisual, more conversational, more day to day sort of stuff. So that’s one of the biggest problems that our clients used to have, wherein even if you do machine translation, let’s say from English to Hindi, it’s still not consumable. So we went through a cycle of trying to come up with feature sets which can deal with more mixed languages of translation, but now with GPT essentially we are looking to remove that sort of custom build stack and then essentially just work our way with prompts wherein it can now essentially scale to each and every language domain use case, right, and I don’t have to custom build config for all of my clients. Yeah.

Adam: We definitely get the question sort of like, how does this all fit together, right? If you kind of look at what we do, it’s a little bit… Fundamentally, if you found a technology business, it means that you believe there is a gap between what it is possible, at least for the right team to build and what people can get in their hands today. So that gap existed when we founded the company, right, which is why we founded it. Arguably, this sort of makes that gap a little bit bigger even, so what is possible to build and available on the research side has advanced yet again, and then you just go back to your day-to-day operations inside a translation team or a translation agency, and it’s still basically manual in a lot of cases. Fully manual, right. So from that perspective, we always welcome anything that makes people use machine translation more or that makes their machine translation better, right. But the real sort of, let’s say, attitude shifts or thinking shifts, I would say there’s actually two things working hand in hand, which is ChatGPT on the one hand, which is sort of this positive bubbly thing, and the bad economy on the other hand. And because they’re happening at the same time, it’s very hard to disentangle those two, but there is definitely a new openness to more automation.

Florian: Let me pose an idea here. Or, like I’ve been advocating kind of recently the past two or three months, right, that’s my framework now that the language translation industry was probably one of the best prepared for this launch because it’s been in this kind of expert-in-the-loop mode. It’s kind of been operating like that for years. We say at Slator, like it’s no longer like human-in-the-loop, it’s kind of expert-in-the-loop because the human can’t really add a ton to good machine translation anymore. But do you think this is a naive view or would you agree that as an industry and now I’m not just talking about machine translation, but the whole language services industry has been or is probably one of the best prepared? What do you think, Mihai?

Mihai: I totally agree with it. It’s not a naive view at all and I think there’s more layers to that. So yes, maybe the whole localization industry, or let’s call it language industry, was lucky to have had this human machine interaction honed across, let’s say, machine translation. But the tasks, the linguistic tasks go beyond just post-editing what the machine got wrong. Maybe you change the nuance a bit. Maybe you change a bit the culture, the tuning. Maybe you label the data differently. So there’s a lot of linguistic tasks that are designed to work together with the machine and I think we’re incredibly well prepared for this. I’m wondering whether the industry is going to react to the plethora of opportunities or the new use cases that are being created. Or they would want just a better technology so they do not inject more effort into, I don’t know, cultural biasing or correcting what the machine does, so I think that’s one of the big questions I’m having. But on the other side, I think there’s obviously the questions like, okay, so you’ve got the human machine parity, human machine interaction, and then the machine gets better and does more things like, where does this lead us? Do we need all this interaction power or human-in-the-loop? And my answer would be yes, because technically, I think what the localization industry is is more like similar to the legal industry or the insurance industry. So yes, technically you can generate a contract with ChatGPT or any kind of LLM, but you still want the assurance and the oversight of someone skilled that this is a good thing, so it goes beyond the human machine interaction.
It’s almost like it’s a layer of assurance and comfort. And the same thing, back to the zeppelin example from Adam, is that, yes, planes can land by themselves today, but it’s much safer, at least for me, to think that there are two pilots on one of these flights that actually do more than interact with the machine, but also oversee it. So I think we will not see the massive shift that we think this technology would potentially be generating, because we are still selling into and offering a layer of assurance on top of what these technologies provide.

Florian: Now, Mihai just mentioned that it’s still interesting to see how the industry will react to it. Now, Varshul, you with a startup, like a couple of years old, are probably at the very fast end of reacting to it, so how have you reacted to it? You kind of alluded to it before, but like, okay, this has been around for like 10% of your company’s life, which for most other companies it’s not, right, so have you reacted to it? How are you reacting to it? Has it kind of influenced your roadmap already so far?

Varshul: 100%. Look at it this way. When we were thinking about existing machine translation systems, then the immediate question that comes up is scalability and we as a platform right now allow 30 languages. So we practically cannot train our models language to language or domain to domain, which is why we’re looking at a more generalized sort of a solution. Even if not LLMs, we would have been happy with some sort of a multi-machine translation, multi-language machine translation sort of system, right. So that has always been our approach, wherein if we solve for a specific vertical or a domain, then it becomes very restricted to that specific domain, which is why having GPT and looking at more generalized solutions changes our roadmap like anything. We have been able to build prototypes and show results where these are now handling more edge cases. It’s still dreaming at some places, but then it essentially handles some of the edge cases that our users had pointed out. So, yeah, definitely. I’m assuming this should change everybody’s roadmap right away, so that’s a given.

Adam: I would agree with the overall assertion, right, basically with what you said, right, that the startups can react always a little bit faster. And I would say that we were born sort of assuming that this is the way things are going and it’s exactly as Varshul said, right, that we have to deal with a lot of languages from day one, right. And so we sort of started with this previous generation of LLMs from day, 100 plus language support from day one and so it’s maybe not a huge shift. Our customers expect us to be sort of their AI guys and pull this stuff in and it’s logical that we do because for the reasons that Varshul said that it makes it easier to build things. I think that the opposite thing holds, right? It’s not necessarily just companies that are 100 years old, let’s say, that have trouble adopting this for good reasons, but teams within those companies also, right? Even on the buyer side, there are so many processes and mentalities and so on around the old way of doing things.

Florian: Let’s talk about kind of the enterprise deployment side of these LLMs, Mihai. So typically I think people tend to underestimate how hard it is to deploy these kind of cutting edge technologies within large enterprises or sophisticated government organizations. What type of challenges do you see when you try to roll this out? I mean, this being like your current solution, right, which is one part of AI at large companies. And how do you think this may kind of influence the way these LLMs eventually become like enterprise ready, right? I mean, it’s one thing for me to go and use the chat box and test a little bit of translation, but it’s another thing when you have millions and millions of words in a second. So talk a bit about that.

Mihai: Maybe I’ll start with a parallel, so if LLMs and these technologies are the thing that get us excited and we see lots of emerging applications, like we were not expecting this to, I don’t know, summarize well, as well as generate marketing content as well as translate, so it’s mega exciting. So naturally to your previous question, the customers would be saying we want this or how exactly will it benefit us? But then throughout this journey, I learned that customers do not buy technologies, especially in the enterprise. They buy products and solutions like full packages and a good example is like you do not buy an electric car, you buy a Tesla, as an example. No affiliation to Tesla, but in essence, yes, it is an electric car, but it is also potentially the safest and it’s got the best UX and I don’t know, it’s got the self driving module. And this is the mindset that enterprises have and governments have when they buy something. They do not just buy an amazing technology, they want a full package. So as an example, when it comes to NLP applications, in this case machine translation, they want to know precisely what’s happening with the data. They want to be able to call a human when things go wrong if the things go wrong. They want preference on geography. They want an assured throughput. They want an assured latency. And if they ask, if they spot an error like, I don’t know, maybe the diacritics or maybe the decimal separator or the nuance of the translation in this case isn’t good, they want you to fix it and they want you to fix it at scale. And then they want this thing to be integrated with their systems and then they want some different models to speak different nuances. So I think when you play the enterprise game, you need to take all these things into consideration. You can make an amazing solution that no one buys because it’s going to be super expensive, like the NeXT computer from Steve Jobs. Amazing technology.
No one actually could afford the price even if it had perfect object-oriented programming. So I think what we’re going to be seeing with these large language models is it’s definitely a race to see how big do they have to be or how small do they have to be to still retain value and then be able to, I don’t know, deploy them on premise or package them in your solution or not blow the bank. And then just to be transparent, like Christian was saying in the previous podcast, which I think is a brilliant episode, and people should definitely check it out, what a brilliant guy, and he was saying it’s like, look, large language models ultimately create a competitive pressure to rethink your technology, especially if, I don’t know, the GPT turbo cost is lower and they’re going for reach. You need to rethink your business model and think exactly how you’re going to be providing this for the enterprise, so lots of homework to be done by a lot of the providers.

Florian: Adam, what are some of the parameters that people need to be aware of in an enterprise setting? Like, okay, quality is one thing, but then if it’s super slow, then you can’t use it, so what are like the 3, 4, 5, 6 parameters that you need to hit if you want to use something at the millions or tens of millions of words scale per, I don’t know, second, minute, what have you?

Adam: I would say before any of that is just the security and data privacy. That’s the most obvious thing. Why, let’s say, there isn’t going to be a single LLM to rule the world. Why this is going to fragment.

Florian: Does anybody know what’s the privacy policy of OpenAI? I mean, I’m quite cautious what I’m putting in there right now. Is it just free for all or when you sign up for the paid version, they actually have some kind of privacy policy? Does anybody know?

Mihai: I looked very briefly into it and also I’m very cautious and I have friends who acknowledge their enterprise prohibited access to it. They work in development and they just don’t want developers to post code that potentially could be absorbed. But in essence, I think if you pay the plus subscription, they say that they retain the data, but they’re not looking at it after 30 days. And I’m not sure if it’s live yet, but there are some enterprise offerings whereby I don’t know, maybe they segregated the data, but we don’t know yet. There are, however, super interesting companies, like in this trifecta of LLM players. It’s OpenAI, it’s Cohere and Anthropic, and I think Cohere is probably the one that’s going very hard on the enterprise and data privacy and the multilingual aspect of these LLMs, so very interesting company to look into.

Florian: We had Nick, I probably said it ten times now because I’m so proud. I had him on the podcast about a year ago, six months ago, when they were still available. Now it’d be a little harder to get Nick from Cohere. But just very briefly back, Adam so you’re saying, I mean, privacy, data security, so then quality, latency, throughput, can you just give listeners a bit of an idea what type of performance would be required that obviously a ChatGPT couldn’t deliver? In terms of the throughput, for example and cost.

Adam: If you look at human translation workflows, they can be post machine translation, post-editing workflows. By the standards of pure machine translation workflows, the large ones or other systems like Ads or Search or whatever, they’re not that high throughput. If we’re talking like Google, Facebook scale, they’re not that high throughput. The largest are at a few billion words a year. It’s not that much, right. They are a little bit spiky, right, so it really comes down to price per word and then the price of these things. And usually, no matter what, if you can do it at a reasonable scale, the machine price is still radically less than the price of the human, stuff that it’s hopefully making more efficient, right. If there’s one thing, it’s very hard to see the future, but if there is one thing I think we can predict with confidence is that the efficiency of these things will go up, the price will go down. That always happens with technology, so I’m not too concerned about that.

Florian: Varshul, have you hit the limitations of, I don’t know, whatever API you plug in right now? And what were some of the limitations so far?

Varshul: Primarily we are on the stage of looking at some eval metrics around how we can better measure this, but then, more importantly, throughput matters to us, yes, but then accuracy beats throughput any day, right, so that’s how we look at our current systems. In terms of some mission critical use cases, I believe throughput could become a bottleneck. But then largely, if you have to talk about… If it anyway goes through a cycle of human review, then the difference between 2 seconds and 4 seconds would really not matter a lot, right. Which is why I would say that throughput will only come as an issue for mission critical cases. But cost, definitely, as Adam said, I think as this gets commoditized, there’ll be competition around pricing, around throughput, and this will keep on getting better and better.

Florian: Adam, so we had Christian on the podcast from Microsoft and they did some early testing using these LLMs to do quality estimation evaluation, which is very much in your wheelhouse. What’s your take on their work? I mean, they published it, right. Big papers on arXiv. We reported on it and then is this kind of still like research grade or can this be, I don’t know, I’m not a techie but kind of deployed in live production, or if that even makes sense?

Adam: I would say very much both, so it is mainly still research, but it can be deployed in production. So there is at least in our space, whether we call it machine learning or the translation industry, there’s a big overlap between things that are mostly research, but still already possible to deploy in production though not deployed in production, obviously, in most places. I think currently still roughly half of the human translation volume is from scratch. Roughly half, right.

Florian: It’s incredible. It’s so hard for outsiders to understand. We’re so frontier, but then there’s this kind of legacy giant volume of corporate activity that’s coming and following some of that, right. But specifically, for MTQE, it’s something that you’re looking at and you think you could also adopt or where do you see that?

Adam: I think there’s a ton of self selection there because by definition, anyone doing MTQE is on the more advanced end of the spectrum. Whether they’re a provider like we are, or whether they’re a user of it, they’re all more on the advanced end of the spectrum.

Florian: Mihai, give us your view on multilingual content generation from scratch. I know a lot of stuff will be written and it needs to be translated, but maybe you could see kind of the translation supply chain changing a bit in terms of that. A lot of content would be generated from scratch, multilingual instead of like you have a source and then you translate into certain targets. Do you think this could be… There’s two ways to look at it, right? One is market expansion because a lot more content is getting generated and then the other one would be like maybe it’s taking some share away from the kind of legacy or core translation market because people just prompt their LLMs to do multilingual copy. Convoluted question, but like multilingual content generation from scratch, thoughts?

Mihai: To be honest, I think it’s the most exciting area right now and the one where the gains will be much higher than, let’s take machine translation as an example. Let’s assume that the accuracy for the top languages are, I don’t know, 90, 95%, whatever it is, however you want to measure it. Okay, so LLMs came out and we have better context for what the quality will get at 96%, 97%. Great, like the gains are harder and harder to attain. But there is this universe out there which is closer to the creation of content, which is unclear right now how players that were just focusing on translation downstream can tap into. And I think what you’re alluding to is like the majority of the traffic we’re looking at, we’re looking at it from the point of view of localization and that’s, I don’t know, a fraction of the whole content gets translated, what Adam was mentioning. But I think we’re going to be seeing smart language service providers moving closer to the creation of content if not offering services because I think the translation, as you were alluding to is going to happen closer to the generation. We’re not going to be having creation then a TMX file, export it, then send it to a localization agency, then get it back, put it up. Canva, or whomever built a localization business around their core products. It feels to me like the multilingual aspects are getting close or more upstream, if you call it upstream. And I think the cost per variant or the cost per page created will drop so much that we will not be… I think we’re going to be seeing lots of drafts and lots of variations and I think that the way we generate content will change fundamentally. And if we can create it in language to begin with, it generally changes the complexion of the whole marketing translation landscape. So I’m not sure if my answer was less convoluted than your question, but it is definitely an area where a lot of people should be looking if they’re not looking already.

Florian: We’ve been talking internally very kind of extensively about this and then initially I was so excited in a sense of like, oh, almost every translation kind of could start, which is the multilingual prompt, but then internally I got a lot of pushback from people saying, no, no, no. I mean, there’s oftentimes you need… I mean, the translation is required because you have a source and that thing needs to go very kind of close to the source into a number of different languages out there. So anyway, it’s something that I think the jury is very much still out there, how people are going to use these generation capabilities, right? Maybe Varshul from Dubverse, have you seen this, like people generating things instead of just converting from one?

Varshul: Essentially, I mean, if we talk about that duplication of a content in multiple languages, it’s also a form of creation. From that lens, this is kind of creating content, right? That’s what we at Dubverse do at the very core, wherein the idea of trying to convert your content into multiple languages right now is very hard with a lot of human-in-the-loop, human dependence, a lot of info cycles versus if you’re able to power this with AI. Then the cost of production comes down, so as the cost of production comes down, we’ll see a lot of more and more interesting sort of content being generated rather than created. I think that’s what happened with Search, right? I mean, the entire idea of looking at Search as an indexing problem versus a generation problem massively changed the way that Bing had come up with their own version of GPT, right? And then Google had to dance, so that’s how I think I look at it. And most of these problems will now be a form of generation problems, so yeah.

Florian: Quickly about the perception for you Varshul, like Dubverse, right? I mean, there’s so much buzz in the market over the past four or five, three, four months. So many kind of simple front end offerings are popping up with a nice website and you can easily scroll through and it promises you the moons and the stars, right? Now, how has this kind of hype impacted your maybe top of funnel and just kind of your ability to cut through the noise out there and make sure that people understand, hey, this is actually something that’s robust. It’s been out there for a couple of years and it’s working as opposed to just a simple front-end to some API.

Varshul: It’s definitely more than that because the way we look at this, any sort of use case or a workflow is pretty much intricate, right? It’s not just a function of trying to just chat through a chat window and then solve for your use case. I was just creating some presentation. There’s another app called Gamma App. Again, no affiliations, but there the ability to create certain content is very fast, right, because you are right now talking about democratization of a design skill set. Now I don’t have to go to a designer. I can just go to this app and start creating my presentation, so that’s what will happen across use cases. But then I think the workflow of any use case, be it dubbing, subtitling, presentation, creation, so on and so forth, will require more these intricate tools. So I think then the importance of UX on top of LLMs becomes massively important because any startup, any product and any business essentially can add a UX layer on top of these LLMs and then help deliver that use case very well because LLMs now are commodity like how we’ve been discussing. So a UX essentially, which ends up collecting more niche data is how we are looking at our current systems.

Florian: Have you seen, I guess, competition for mind share from these, there’s probably 1000 or 2000 .ai, something with language out there now that weren’t around five months ago? Have you seen kind of competition from them for just mind share, people’s attention?

Varshul: On the other hand, we’ve actually seen this competition benefiting us because just as a function of better SEO and better having some sense of reaching out to the user, we have been able to acquire a lot of users, especially since we started our product in this chat itself. So that way we’ve already done about 37,000 hours of content, roughly in about 30 languages, which essentially says that people are excited about this. People do want to give it a try. People do want to end up going through that experience, that magic of trying to create your content into multiple languages in our case, and so on and so forth. I think that’s what it’s largely benefited us, if I have to say.

Florian: Same here. We’re seeing traffic going up, like really arcane like niche search terms are now all of a sudden starting to rank quite high. Hey, Adam, so when you’re triaging with ModelFront like, you’re triaging what needs to still go into kind of the expert-in-the-loop side of things. Where could an LLM support translators, maybe post-editors day to day in that part? Okay, you’ve used the LLM now or implemented your own version of it, of course, to kind of triage what’s perfect out of human-in-the-loop but what remains human-in-the-loop. Where do you see LLMs potentially helping translators there?

Adam: Firstly, I would caveat that and say that so far, despite the best attempts of many, if we even go back to the translation memory, right, Jochen Hummel said, I first tried to give this to the translators. They didn’t really want to adopt it and it ended up being not even the LSPs, but the very front of the chain, the enterprises who sort of used this first and then things kind of… The fish rots at the head, as they say, right. So I will say that overall, it’s easier to introduce disruptive technology that way. I think the incentive alignments are much clearer, but then eventually it does get there. Right now, most translators have a TM. So I think that the same thing will happen and I do think that we will see the CAT basically have these kinds of capabilities, but only when it becomes the new normal. Where the buyers expect it from the LSPs, the LSPs expect these efficiencies from the translators and so on, because translators are not going to cut corners and sort of risk their jobs for a little bit more efficiency. I think eventually we will get there. In a sense, I believe that Lilt had the right idea, at this point almost ten years ago, right. Basically, this is what’s needed. This is what we as consumer users are used to, right. Which is it knows what my cousin’s dog had for breakfast yesterday and so it’s showing me this search result and this ad based on that. And if I don’t click on that, after a few times, it starts to understand and serve me up something else and this is sort of what we expect. And I understand the translators, fundamentally. I’m a big user of machine translation, by the way, right, so I love languages, speak eight of them, right, and I’m sort of a compulsive… If I can’t remember a word in one, I have to go look it up, right. And it is just super annoying when any sort of automatic tool, if you just have to correct the same thing again and again, right?
That is the difference between a human and bad automation or good automation and bad automation is like if I tell you once, I don’t know. No, I mean Zug the city in Switzerland, not the train, right? It’s like I don’t have to tell you ten times in the same article and the technology exists to make this a reality. The thing is, they don’t complain when it works. They’ll start taking it for granted. That will be an achievement for humanity, but currently that is not the case. Currently, people, our fellow humans working as translators have to correct the same thing again and again a thousand times in the same document, or even a million times.

Mihai: I just want to add to what Adam said. Like maybe an interesting observation is like it’s fascinating that we’re looking at these large language models or maybe not so large language models, depending on the privacy and to kind of refine or create efficiency or eliminate drag and frustration. What Adam’s ModelFront is actually fixing and solving and identify where to deploy your efforts. And I think this is a very specific mature use case application of LLMs, which is post-editing. So billions of dollars are being spent every year on that. And on the other end we’ve got LLMs in the case of Dubverse, who is being maybe looked at as a holistic one-stop shop technology to kind of power multiple things almost like the Swiss Army knife, no pun intended, to kind of have a lot of functionality on a lot of languages into one single model. And I think the power of these technologies with all their emerging capabilities is that I think we’ll patch mature use cases to gain productivity and maybe create new use cases, be that through applications over the top, but equally, sometimes it can augment enterprise use cases. The way we’re looking at it is the majority of our traffic at Language Weaver isn’t necessarily localization. I would say it’s a small fraction that it is and what we’re dealing with is in most of the cases, something that’s super noisy. It’s not perfectly formatted content that you get out of a PDF. It’s actually mangled, you don’t quite know what language you’re dealing with, you don’t necessarily know what’s happening. And I think the value is to ultimately make more sense of this content and maybe apply some sense of a filter so then the translations are better. And then on top of that, what Adam was mentioning, like the example of Zug, is it just opens up the possibility to have a wider context window. 
So right now we’re translating sentence by sentence, and our context is that sentence plus/minus two other sentences, allowing the model or a combination of models, we don’t necessarily have to be maniacal about it has to be this or that. But maybe combining these two models will actually provide more information and more insight into the machine translation or maybe cleaner input data into the machine translation models so that the output is of a higher quality. And then you can apply another model to, I don’t know, vary the output and make it more fluent or change its tone and bias. I think the answer for these larger language models apply, just to summarize, is like maximizing productivity, enabling emerging use cases, and then augmenting existing technologies to ultimately derive better output or better results.

Florian: You mentioned context, but if you want to add all of these gazillion things as additional context, doesn’t that impact the data security, the privacy part of the equation because where would it get all this context from unless you give it to it?

Mihai: I’ll answer this one first. So this is precisely where you’re probably not going to be building your business in sending your context or no enterprise, as Adam was mentioning, will not be sending their whole documents as context just to get a better translation. It has to be something that’s either on premise or highly secure or someone that signed an enterprise contract that you know that if you push that information, it’s not going to be absorbed just to make that large language model better. So I think providers and suppliers will need to have a better answer within their enterprise stack if they offer enterprise solutions to deal with this. But the answer will not be, let me just send all my sensitive information as context to an open API and then get the results out there.

Varshul: I think in this case, from security standpoint, all of these open-source LLMs will be sort of put to use way better than, let’s say, a closed-source LLM, right? The ability to use in your data specifically from an enterprise context is very important because that’s what the sort of prompting allows to have more and more context. So with this, there might be a very big opportunity for either these enterprises having a version of open-source LLM and then fine-tuned to their use case with their context so they don’t end up sending all of their data to OpenAI and then have something which is more on premise. But then again, it has to justify the cost and throughput for them to take that move.

Adam: I agree. I think basically, right, the cat is out of the bag, so to speak. So there’s going to be very competitive open-source LLMs. There already are the first green shoots of that, and that’s what’s going to happen because of all of the various forces that want decentral. To go back to this privacy standpoint, I think you Florian, since you know the sort of Swiss and German markets very well, if we look at machine translation… Machine Translation, Mihai described the privacy terms of ChatGPT, of OpenAI, and they’re pretty similar to the machine translation API privacy policies for the cloud-based ones, right. Which is, right, if you’re just using the free version, well, it is what it is, right? And if you pay up, of course you can get your own and that’s not good enough for some people, especially if they’re a direct competitor to the provider itself, or in OpenAI’s case, the major investor of the provider, which is Microsoft, right? But Florian, you know very well what happened in Central Europe, which is employees at all the companies just use DeepL and translate.google.com, the consumer version, nonstop every day for their daily work, whether the head of security allows it or not and that sort of forces the hand of the enterprises to provide something. And not only should they provide something and send a nasty email, but they should provide something and do something like customize it or something, so that it actually works slightly better, so that the people in their large organization are incentivized to use it. And to me, it smells very similar with this kind of ChatGPT scenario, right, where you got to do something because there is a lot of useless tasks inside a large enterprise and so there’s a lot of rational decisions by employees to use something like ChatGPT. And so everyone’s got to provide it, whether they like it or not.

Florian: You mentioned open-source. I still have a hard time wrapping my head around this. So we got this ChatGPT that’s closed-source despite OpenAI being open. So, what, all of a sudden, three months later, we have, like, I don’t know, 500 giga open-source things lying on some Hugging Face that I can download and run locally? Or like how?

Adam: Enterprise facing companies, like you mentioned Cohere or I could see RWS doing that, right? Saying, hey, we’ve got this enterprise grade one, it’s on prem, blah, blah, blah, right? And you don’t have to go Hugging Face yourself. It’s going to cost you a few hundred k.

Mihai: I think the interesting thing here is whilst you have on one end potentially the largest of largest models properly trained, with reinforcement learning and safety layers on top of them and whatever secrets are getting to such good quality and that’s one extreme. And that’s in the cloud and it’s probably going to be very hard to shrink, even if the stories that the model isn’t large, it made gigs so technically it could sit on your phone. But I think to run it properly on a capacity, you need that extreme for hundreds of thousands of users. On the other end, you’ve got slightly smaller large language models running on your computer, on your Mac and everything in between. And I think the size of the model and the capacity will be an element. But I think Adam makes a very interesting point, is I think providing the right value to the employees of an enterprise that have to do this meaningless task over and over will probably be important. And if you can build a business case, if you remove 10% of that effort and you free the resources to do something more intelligent, what is the value of that? Is it 100k, is it a million, is two million, who knows? And this is I think where the size of the model will be potentially less important than just solving the answer whilst abiding to data privacy. I think enterprises and everyone should get very clued up to the point that what their asset is is the data that they produce. And if they’re happy to give it away, be it enterprise or assets, as consumers for some efficiency gains, I don’t think they’re getting a good bargain and I think they have to be more clued up. And I think we’re going to see more nuance on how you trade your data and how you deal with it than we’ve seen before. Let’s see how it evolves.

Florian: Let’s take the point of view of a services focused company. Many LSPs and there’s thousands, right, are very much focused on the service. Yes, they have tech, they buy some, they license some, they maybe develop a bit themselves, but at the end of the day they’re very much service focused companies. Like how should they react to this right now? Should they kind of wait and observe or let things shake out a bit and see who’s kind of becoming the winner? Or should they really be on top of this and remain very much cutting edge?

Varshul: I think in this case what will end up happening is after a certain period of holdout you’ll start to see patterns. Probably enterprise starts adopting more open LLMs, more consumer use cases are on OpenAI. So I think all of this will play out to a level where this will be part of our workflow, day to day workflow, right. And that’s not just valid for LSPs, enterprise, but be it any SME or SME sort of a business, right? So one way to look at this could be to sort of wait it out and then actually see once the storm settles. So how are the patterns emerging?

Florian: What do you think, Mihai? Like watch or be freaking out about every… To me it’s very hard, like every day there’s a new breakthrough and you feel like everything’s obsolete like 24 hours later. So how do you react to that?

Mihai: Personally, it’s hard for me to keep up. It’s just crazy, the adoption and the excitement and even Christian at the previous podcast had an interesting point. We’re at the top of the hype. We’re seeing a lot of cool demos, cool applications and then we might hit a wall, things will cool down, but I’ll tell you what feels different. So I joined Language Weaver five or six years ago when the neural MT wave was starting and it took a while to latch on. Interestingly, we were using some neural models at the back end of the last stages of the model just to make the translations a bit smoother and I think the majority of the top providers were doing that. And then we were pushing to adopt this technology and we had the luxury of laggards or cynics will say, well, this will never catch on, this is nothing, it’s to be ignored. And we had the luxury of two to three years, and then it was the right technology. But what’s happening right now with these models is like, people are not questioning whether they will take the risk and do something about it. They are just doing it and they are doing it at the rate that it’s incredibly hard to keep up. So if you layer on top of that, Andy Grove’s idea of only the paranoid survive, I think it’s a massive risk not to act right now, especially if you’re a service provider. So what if you’re wrong? Fine, you would have wasted some resources, but the positive impact, it’s an asymmetric bet. If it goes right, you’ll be the first and you’re going to be ahead of the curve and you will be maximizing the gains like Adam was mentioning. You will be more efficient and especially if you’re a service provider or a technology-enabled service provider, because you have the relationship with your customers. And if you can deliver a better service because you just got yourself a better technology, you should do it. And you should go 100% into this because again, worst case scenario, nothing happens.
It’s a small investment, but if it goes well, everyone has to gain. So maybe, just to summarize, the answer is the rate of adoption cannot be ignored. It feels very different than it felt five or six years ago. And I think the risk of not acting is larger. And I think in this area, the saying is like many a false step was made by standing still. I totally believe that this is the case with these technologies. To be proven wrong by Christian in six months, we’ll see.

Florian: Let’s end with your kind of top two use cases that you’re seeing maybe in the next two to three months, maybe even personal for your business, for the core of your business. We’re playing around at Slator, for example, we’re building a model that answers questions based on all of our content so our users can log in and test this out. So this doesn’t really have anything to do with translation itself. It’s just kind of a business application. So maybe starting with you, Adam, where do you see cool, interesting use cases in the next two to three months that you might accelerate, you might start, you might integrate into the business?

Adam: I think that fundamentally, the fragmentation is what defines the language industry tasks and our tasks in other areas, right, so fragmentation. So I think that if you look at sort of like the biggest volume, most automatable use cases, in some sense, the stuff that we focus on, right, where it’s extremely repetitive and high volume and so on and you have a lot of data, those don’t benefit as much from this type of thing because they were already in some sense the most automatable, the best handled by machine learning. And what’s really interesting is all these little corners, these little niches where it didn’t justify a huge investment in machine learning. But all of a sudden, machine learning became so much more accessible that now you can get something and you don’t have to know how to code or you know how to code, but you don’t have to go code something, right? For all of us here, it’s also faster to just open up, openai.com for some little thing. So I really think that it’s the niches, but the long tails really make a difference, right? So you take something like Google Search, something like 50% of Google Search queries have never been queried before. Isn’t that amazing? Which shows you the exponential expressiveness of language, right? It’s just the math. This many characters combined, right? These are the number of combinations, right? So basically, if you had something that only covered sort of the things that are very common, you’d cover, okay, half, but there’s still half that’s uncovered. And so anything that is able to deal with the long tail in some is very powerful. And so that would kind of be my answer for where all of this goes. It allows folks to get AI for many, many… I need to provide a code of conduct to pass this security review or something. Okay, give me a code of conduct for a company, a startup called ModelFront and these are our basic goals, we’re incorporated in the US, et cetera, et cetera. Okay. Very niche.

Varshul: I think on the sort of business side, right, like we believe that language is something that inherently has a lot of context, specifically talk about different languages and translation as a very core problem. So we’re looking to build systems which are more around prompts, where you can essentially prompt your way through a translation and get it more concise in a way that, hey, this particular content needs to be translated for kids, so then it would definitely not do expletives, right? It’ll basically keep it very simple, sort of a language format. That’s one thing that we’re exploring on the business side and I think on the personal side, I hate doing taxes, so if there’s a way that it can do taxes for me, then I’m all yours.

Mihai: Cool business idea. I look at these things in two camps. The use cases that I find interesting. I think it’s augmenters of existing technologies, as I mentioned, so this can make machine translation better by surrounding it with extra modules. Ironically similar to the SMT days when you had multiple models in sequence to derive a good translation. I think we’re going back to that rather than just using a single model, so that’s camp one. So I think MT specifically will get better over the next few months. That’s one. The second topic is I think what you mentioned initially is like the multitude of the ease and the access to creating more content and the variations of that content will literally explode, which I think will turn this whole content business upside down. We started in a world of scarce content and perfect content before we were going to a market and I think with so many variations and the ease of creating them and the low cost of creating them, it will give us space to experiment and see what’s working. So I think we’re going to be going to let’s say a large… The economics of this content transforming or creation will change fundamentally. And there are some amazing examples of that, like this Drake and The Weeknd, exactly content that came out and it’s amazing. So there’s definitely something upstream of content that will change everything and just to the sheer increase of volume and the reduction of cost. And I think the third bucket, it’s something that fascinates me and I never quite thought it would work. It’s that systems used to work in clearly defining the interfaces amongst them, and these are usually called APIs, application programming interfaces and you have to spend so much time to make sure that you have the right parameter in this API code, with this API server, it has to be pristine and you spend hours and hours with engineers to do this to get two systems to work.
And surprisingly, these systems are able to talk to each other using language and I’m seeing all this sequencing of agents through Auto-GPT or whatever other technology will come in the next two to three days that are starting to talk to each other and fulfill tasks and I think this is the most interesting use case. We probably haven’t seen it, but I don’t know, linking these large language models with the external world and sequencing them and getting them to communicate through language will, I think, open up endless possibilities in fulfilling end tasks. Like booking a wine tasting is probably flirting, another podcast. So I think there’s a 10 X effect in just combining the models and we don’t quite know what’s going to come out, but something cool will come out for sure.

Florian: We force the machines to use our language to communicate with each other.

Adam: This is happening, right? Maybe Mihai is referencing LangChain in the LangChain ecosystem.

Mihai: Bingo and there’s also this Auto-GPT, I’m not sure if you played a bit with it, something you can like sequences. So it’s almost like LangChain plus plus. It’s worth looking into. It’s super interesting. And apparently, like, people, I mean, talk about hype. Apparently they got more stars on GitHub than PyTorch, which is in a matter of a week or two, which is bonkers. But, hey, we’re definitely riding either the hype or the beginning of something.