Where the ‘Translator Still Feels Like a Translator’ with Bureau Works’ Gabriel Fairman

SlatorPod #190 - Gabriel Fairman, CEO, Bureau Works

Gabriel Fairman, Founder and CEO of Bureau Works, joins SlatorPod to talk about the potential of generative AI in translation management.

Gabriel shares the origin story of Bureau Works, where over the years his perspective shifted towards viewing translation as an information management challenge, leading Bureau Works to transition into a tech-enabled business.

Gabriel discusses the challenges and opportunities presented by large language models, touching on issues of cost, workflow integration, and the potential for a more interactive and fluid translator-computer relationship.

Gabriel rejects the idea of comparing language models like GPT to human translators, viewing them as aids to improve the human experience rather than alternatives.

Subscribe on YouTube, Apple Podcasts, Spotify, Google Podcasts, and elsewhere

Gabriel Fairman explains the flexibility of Bureau Works’ UI, which aims to optimize productivity and a sense of authorship, in contrast to the repetitive and frustrating nature of traditional machine translation post-editing.

BWX is concentrating on simplification in 2024, introducing features like the learning terms tool, and aiming to integrate translation seamlessly into various tools and simplify project creation and management.

Transcript

Florian: Gabriel is the Founder and CEO of Bureau Works, a cloud-based translation management system. Hi, Gabriel, thanks so much for joining today. Give us an overview of Bureau Works, including key services and products. Just give the audience a bit of an overview of what you do, what the product does, and what the company does.

Gabriel: I really have a hard time with the alphabet soup that we use in our industry, TMS, CAT, all of the jargon that we use, but I like to explain it the way I would explain it to my kids. So we’re a tool that helps companies translate stuff with maximum leverage, whether from AI or other context, maximum quality and minimum cost. That’s really what we focus on, and maximum governance as well. That’s a big part of what we do. And a lot of that stuff ends up falling into different parts of that alphabet soup that we described. So is it a CAT tool? Yes. Is it a business management system? Yes. Is it a translation management system? Yes. Is it a quality management system? Yes. But the key thing I would say is that the way we approach it, everything works together towards a single goal, which is what I mentioned. So we try to be very abstract in how we think about tooling, because ultimately I think that tools that are very literal tend to be short-lived and bad software in general. That’s something I’ve been learning for a long time now working with software: learning how to think more abstractly.

Florian: I will vote for the abolishment of the term CAT tool. Even 15 years ago, what do you mean, CAT? What do you mean, computer-aided translation? It’s computer-aided everything, right? So the rest I can live with, but CAT, at Slator, we’ve tried to move it towards translation productivity technology, but there are a few too many syllables in it, so it didn’t really catch on.

Gabriel: I play around with that, like translation productivity environment, enhanced translation environment. It ends up getting wordy and I lose people in the second word, yeah.

Florian: How did you start the company? What’s your origin story? Where are you from? Language, from tech, and how did it evolve into what it is now?

Gabriel: I think the main genesis behind the company was getting fired from a corporate job in my early 20s, and I had no idea what I was going to do. I grew up speaking Portuguese, English and Spanish. While at college I studied a little bit of French, Italian and Chinese, and I like languages. And I actually studied how language and the awareness of language is perhaps one of the main enablers of human agency, which is our ability to actually be accountable for our decisions, so that’s what I majored in, actually. My major is called the Death and Rebirth of Human Agency. So it’s this correlation between linguistic awareness and decision making, and I had that in my background. I had no idea what I was going to do. I liked languages. My mother had been a translator and I figured, well, I was under a lot of heat from my parents for having gotten canned from that nice, prestigious corporate job, so I had to do something to make money, and I figured, well, I might as well start translating. At the time, my mother was an interpreter, so she had quite a few connections and clients, people who liked her. And I said, well, send me something, send me a translation job and see what happens. She’s like, no, you’re going to ruin my name, you’re going to be terrible at it, you’re just a 20-something-year-old. I’m like, let me do it, and I did it, and people actually liked it. It was like, oh, I guess I can translate, and that really picked up. I think we were at the right place at the right time. This was Brazil, 2005. It was right around when Brazil started being termed a BRIC. So there was a lot of external international interest, and the country was very underserved from a professional services perspective in all areas, whether it was M&A, whether it was accounting, whether it was translation. So honestly, at that point, I just think we were at the right place at the right time. The company grew very significantly; within a few years, we were the biggest in Brazil and continued growing. But to be perfectly honest, I would say even after my first year, I realized this isn’t about translation at all. This is about information technology and information management, and I was never going to be able to solve it any other way. I’ve always been very focused on solving things at a root level, at a more fundamental level, rather than just brushing things up or putting band-aids on them. This is an information management challenge, and I have to get good at information management. So I began to treat it as a software-oriented thing from the very beginning. And I would say we continued to evolve as a translation agency for a while, all the way to 2015, and that’s when we began to pivot to a fully tech-enabled business with our own tech powering all of the main components of the business. And then in 2021, we fully pivoted to a translation platform, a software-first company.

Florian: What about services? Do you guys also do services, even now after the pivot?

Gabriel: Yeah, and I think that has to do with the vision. This is just an opinion. We don’t have a firm business plan around it. But my opinion is that things typically evolve toward a marketplace, particularly in our area. I think business tends to be very fluid in the sense that people need to get things done, and like I said, the whole alphabet soup, right? We have an entire very rigid structure of how people think about translation. And it’s like you need a vendor and an MOV and an SOV and all of this alphabet soup, and the fact is that people just need things done. And it’s interesting when you talk about the CAT tool, because that’s how I feel, that clunkiness, that awkwardness of the wording. I feel that about most of the stuff that’s done in our industry. I feel like the answer to why we do things a certain way is just, that’s because we’ve always done it that way. And it’s not because people are rethinking, well, what makes sense over here. So again, when you ask a 25-year-old, just as an example, and try to explain what a CAT tool is, they’re going to be like, yeah, sure, it’s done on a computer, that’s great. It’s the same thing. Well, why do you need an agency, and why do you need a subcontracting agency, and then a freelancer, and then three workflow steps, and then this, and then that? And we’re trying to explore what is really needed from a more abstract perspective. And in that regard, working with people is part of what we do.

Florian: Now, one thing that’s kind of shaking things up, of course, is large language models, because I’m also seeing a fair number of new companies pop up that don’t know this kind of legacy workflow, that are very new to this and are building products without any kind of legacy in their head. But can you help us parse hype versus reality of large language models and actual production use today in the localization and translation industry?

Gabriel: First of all, in actual production, this is just an opinion, it’s not a fact. But I don’t think, percentage-wise, much of the industry has adopted anything that can be truly leveraged from large language models. I think people are exploring, dabbling with it, running tests, researching, that’s for sure. But I would say the industry as a whole, I don’t see it widely adopting this stuff. I do see people getting more and more heat from management. I see companies starting to switch over to an approach that is generative-AI compatible or generative-AI adoptive. But I think that there’s a big difference, first off, with regards to what it even means to be putting something like this in production, because sometimes a translator can be copying and pasting a translation into GPT and saying, hey, I’m leveraging generative AI. That’s true, they are leveraging generative AI from a definitional perspective. But even when people talk about what a large language model is and what it isn’t, I don’t see people being able to talk about the same things, really. So a good example is I see most people talking about generative AI as either a superior replacement for machine translation or an enhancer for machine translation. That’s typically where it falls. I don’t see it as either, personally. I see it as a contextual manager that can emulate linguistic inferential capability, and that makes it, in my opinion, a lot more powerful than, let’s say, treating it as this enhancement layer. So even how you define what it is and how people are using it is tricky. It feels like a little bit of quicksand to me, at least. In general, I would say most people that I’ve talked to so far have a very cautious approach to what this means and how it’s going to impact them. I would say for now, the overall predominant feeling I see is minimize and wait and see. That’s how I would characterize it in general.

Florian: You are working on this very actively. I know you guys won the Process Innovation Challenge at LocWorld recently. So what impressed those who voted? What was the solution that you presented?


Gabriel: I think what impressed people, A, is that it’s in production, it’s live, it’s working. We have thousands of people using it on a daily basis. B, it’s not like… Our approach to generative AI is working in the background, so the translator still feels like the translator. They’re still making their decisions. They’re being aided in a very granular way, so we’re trying to create something that still feels like authorship. It’s not just, oh, I’m getting these feeds and just mindlessly confirming; I’m teaching. And that’s how I see generative AI. It opens a portal to a different way of communicating with computers than the traditional way. A good analogy, our Head of Marketing brought this up, and I really think it’s great: it’s almost like we’re going from radio and TV communication with a computer from a translator perspective, like getting stuff from a machine translation engine, getting stuff from a translation memory, to a social media kind of interaction, where the engine that we’re working with, whether it’s Facebook or LinkedIn, whatever medium, is learning from our behavior and adjusting to it imperceptibly. And that’s how we see it. It’s the possibility of a very interactive flow between author, translator and computer. So I think that’s how I see it. I see it more as an enabler than a thing in itself. I don’t know if I’m being clear, but that’s how I see it.

Florian: In the UI, you basically get a few options to reword or manipulate the content, but there’s still an initial output, right?

Gabriel: Yeah, but the initial output is already contextualized based on how you’ve been rewording things. So just a very basic example, let’s say you reworded formal French to informal French. So it was vous, vous, vous, and you went tu, tu, tu. If you’ve done that, the future feeds are going to be informal, even if, let’s say, the translation memory is formal or the machine translation is formal; it’s going to be infusing your feed with your style, let’s say. Same thing with your terminology, same thing with… Whatever we teach a translation memory is only syntactic, right? Only if the sentence has a very similar syntax will it be able to, quote-unquote, remember that. And in this case, we can operate semantically. So if the meaning is similar, it can be inferred, it can be propagated, it can be suggested. So the relationship becomes a lot more fluid, a lot more bidirectional, let’s say.
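
To make the syntactic-versus-semantic distinction concrete, here is a minimal sketch of the two lookup styles: an exact-match translation memory only fires on an identical sentence, while an embedding-based lookup can retrieve a confirmed translation whose meaning is similar even when the wording differs. The sentence-transformers model, the tiny in-memory TM, and the similarity threshold are illustrative assumptions, not Bureau Works’ actual implementation.

```python
from sentence_transformers import SentenceTransformer, util

# Tiny in-memory "translation memory": source -> confirmed target.
tm = {
    "Please enter your password.": "Veuillez saisir votre mot de passe.",
    "Your session has expired.": "Votre session a expiré.",
}

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative embedding model
tm_sources = list(tm.keys())
tm_vectors = model.encode(tm_sources, convert_to_tensor=True)

def lookup(segment: str, threshold: float = 0.8):
    # Syntactic (exact) match: only fires when the sentence is literally identical.
    if segment in tm:
        return tm[segment], 1.0, "exact"
    # Semantic match: a confirmed translation with similar meaning can still be
    # retrieved and reused, even when the wording differs.
    seg_vector = model.encode(segment, convert_to_tensor=True)
    scores = util.cos_sim(seg_vector, tm_vectors)[0]
    best = int(scores.argmax())
    if float(scores[best]) >= threshold:
        return tm[tm_sources[best]], float(scores[best]), "semantic"
    return None, 0.0, "no match"

# No exact hit here, but the meaning is close enough for a semantic match.
print(lookup("Enter your password, please."))
```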

Florian: In terms of the UI, are you mostly post-editing, so you’re always presented a target version and you’re going in and fixing it, or are there some translators who read it, don’t like it, and say, let me start from scratch? What’s that process like?

Gabriel: You can configure it. We try to make the UI as flexible as we can, because everybody has different tastes and preferences. So you can turn off, let’s say, the suggestions and work in a different way. You can work with what we call translation smells. That’s just a guardrail that flags semantic problems: translation inaccuracies, wordiness, omissions. So stuff that can be pretty helpful for a translator. You can turn that off and on as well. You can turn alternative suggestions on and off, but the key thing is that whatever feed they’re getting is an initial starting point. They’re free to do whatever they want with it. So for us, what we’re still trying to learn is what puts people on the path of greatest productivity and, at the same time, the greatest feeling of authorship. So in your example, for instance, yeah, if someone wants to just delete the feed and type over something else, they can, but that will still teach the engine for future feeds. So that’s what’s cool, it’s not just… One of the things that always annoyed the hell out of me about machine translation post-editing was the fact that the feeds don’t get any better. You get this pretranslated file, and maybe in the background you can have some engineers training the model and making the machine translation model better, or tuning the model or doing something to the model. But me as a translator, I’m getting a document that has maybe good, good, gibberish, gibberish, gibberish, gibberish, good, good. But it’s not adapting, and I just have to make the same changes over and over and over and over, and it’s this mindless pain for me, honestly. Like, machine translation post-editing is translation hell.
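
As an illustration of what such a guardrail could look like, here is a minimal sketch of a zero-or-one smell check: either the segment has a smell or it doesn’t, and if it does, which kind. The prompt wording, the smell categories, and the use of the OpenAI chat completions API are assumptions for the sketch, not Bureau Works’ actual implementation.

```python
import json
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

SMELL_CATEGORIES = ["inaccuracy", "omission", "wordiness", "terminology"]

def check_smells(source: str, target: str) -> dict:
    prompt = (
        "Compare the source text and its translation. Reply with JSON only: "
        '{"has_smell": true or false, "category": one of '
        f"{SMELL_CATEGORIES} or null, "
        '"note": a short explanation or null}.\n'
        f"Source: {source}\n"
        f"Translation: {target}"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)

print(check_smells(
    "The warranty covers parts and labor for two years.",
    "La garantie couvre les pièces pendant deux ans.",  # likely omission: no "labor"
))
```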

Florian: I’m seeing a podcast episode title forming: Translation Post-Editing is Mindless Hell.

Gabriel: That’s a good title. I mean, I hadn’t thought about it before that way, but it is. I really, really don’t like it. And this just feels good. I love it. It’s fluid. I feel like, okay, I taught it one thing, it’s learning, it’s getting better, it’s getting closer to me, I feel closer to the engine. I feel aided and that’s been our guideline. It’s like, how do we feel? And how do translators feel when they use this? Because if it doesn’t feel good, then what’s the point? At least that’s kind of like how we’ve seen it. And sure it has to be more productive, it has to be more cost-effective, but there has to be a good feeling about it because in the end this is language and we are re-authoring things when we’re translating, so I think that has to be kept in mind.

Florian: You talk to a lot of linguists. I think we caught up a couple of months ago, and you mentioned you’ve talked to many, many, many since generative AI came out. We spoke about some of the features already, but what are the top two or three things they’re mentioning to you that frustrate them? Maybe it’s what you just mentioned, but maybe there’s something else that you’re hearing from them as well.

Gabriel: First, I think in general translators feel very divided about this stuff, and I would say, not that it’s a 50/50 split, I would just say there’s a big camp of translators that really resent the technology. I post very actively. Sometimes I get comments like, I find translators who use this stuff disgusting, shameful, really; people wanting to nail the adopters to the cross, let’s say. And then I see the adopters also kind of scrambling, not knowing what they’re doing, trying to use a plugin here or a plugin there and creating all these crazy workarounds. And there’s this idea that it has to be complicated; if it’s simple, then you’re not really adopting anything. It’s funny. And then our stuff is super simple and it’s accessible to anybody. My dad is like 75 years old and he’s not the most computer-savvy guy, super smart, but not the most computer savvy, and he uses our software and he translates with my mother a ton of words a day and he just uses it. He doesn’t think about what’s going on from a theoretical perspective. He just says, this is great, I can do a lot. But with translators, I would say that first there’s a layer of prejudice, whether good or bad, right, but there’s a layer of prejudice. Once they get past that, there’s a lot of curiosity. And once they start experiencing just how productive they can be and how they can still feel good about it, that’s when they take off. But from a raising-awareness perspective, it’s hard because you have to get over all of these hurdles. And yeah, we run weekly webinars with hundreds of translators and it’s always a learning experience for us.

Florian: Now, recently I watched a video where you gave a talk at the Middlebury Institute, and you mentioned something called an old paradigm versus a new paradigm, and there’s a slide there that people should go and check out. But can you just walk us through it verbally? What do you mean by old paradigm versus the new paradigm?

Gabriel: I think from a more abstract perspective, not referencing the slide per se, because to be perfectly honest, I have a hard time remembering the exact lines that I put down. But the key thing goes back to the beginning of the thread that we were discussing around the CAT tools. And that’s the old paradigm, this alphabet soup in which things have to get done. So like a translation, editing, proofreading framework, like quality frameworks, like LISA, compatible with like 55 different error types, each one with different severity scores, or DQF. There’s all this alphabet soup of how things have to get done, and I just think the new paradigm is that all of a sudden there’s a much, much simpler way to get things done. There’s a much simpler way to think about translations. In summary, you don’t need translation, editing, proofreading anymore to get to professional-level translations. You can get everything done in a single workflow step because it becomes what we call iterative and pulverized. So essentially you’re getting all these layers of added verification within a single workflow step instead of having to assign it to different people. And you have so much more performance data available on how people are performing. So you’re not guessing anymore about which translator a given project manager thinks will perform well for a given client, or working with a given translator just because they haven’t had a complaint, which is such a big decision-making driver in our space. So you start to get to better decisions, you start to get to simplified processes, you start to get to more streamlined situations. And that’s what I would call a new paradigm: a more abstract way of looking at the same things and focusing more on the result that’s produced rather than the methodology. So I would say results first, process second, as opposed to what I consider to be our current paradigm, which is process first, process second, and sometimes we consider results. We’ll present them in slides.


Florian: Even the process should be easier, more user-friendly, and just simpler, right? So people, meaning the linguists, can just focus more on, well, the linguistic component rather than all the bells and whistles that are built around it. I mean, I’ve been waiting for this forever. This is getting simplified, and people can focus on their core task rather than on, hey, there’s some term popping up on the left-hand side that’s in yellow, and yellow means it’s medium severity, and I’m trying to write a sentence here. All right, kind of a hard switch to cost. I mean, one of the things that I’m hearing is that, sure, these models can do all kinds of things, but if you plug in too many of those, you’re actually going to start having an issue with cost, especially if you’re running millions of words, tens of millions of words, through this. Can you give us a bit of a benchmark of what you’re thinking about here? And also, how do you price this? Is it an issue? Is it not an issue?

Gabriel: I think it can be an issue, but when you talk about cloud computing in general, cost is always an issue; it becomes a DevOps thing. Just as an example, we run BWX on AWS, for instance. And if we don’t have our DevOps really tip-top, our server costs alone could be 10x, because operations aren’t optimized for computational cost. So maybe you’re exchanging packets more than you have to, you have all these redundancies that aren’t really necessary, and computational costs can mushroom in any given scenario. I would say it’s analogous with language models. And it’s funny, just a parenthesis, whenever I say language models, my brain is always thinking numerical sequence models, because it’s just more accurate as to what they are in the end. The language gets transformed into numbers instantly, and to me, it actually becomes even more frightening, because when you think of them as mathematical as opposed to linguistic, then you have a better appreciation of how complex and sophisticated these things are. Language is actually a lot simpler to understand.

Florian: Hey, it’s the language moment now. Don’t introduce the math. We finally can do computational things using natural language. This is our time.

Gabriel: That’s why it’s so deceiving, because it feels so natural, and in the end that’s why I’m excited and at the same time frightened about everything that’s out there, because when I put it into a vector and I start thinking about the direction that we’re going, oh. Again, this is my opinion. Maybe I’m just crazy and spending too much time with machine learning people, but I don’t think people realize just how fast this thing is going to evolve. And just as an example, I see a lot of people reducing these language models to, oh, they’re just guessing the next word, they’re just predicting the next word. Yeah, that’s a very simplified way of looking at it. When you think about the fact that every single letter gets quantified, quantized, changed into numbers, and then you get vectorization, and you start seeing how these numbers are correlated at so many different levels, and you start seeing the number of associations that a model like this can produce, and that it can then transform this back into words and they still make sense, then you start seeing, okay, we’re at the early stages of this, and this is already good enough, in many cases, to fool people into believing that it’s actually thinking or actually producing something that makes sense. And I think that’s also another thing. A lot of people get caught up in treating these engines as sources of truth, and then it’s, oh, they’re hallucinating, for instance. Yeah, it’s hallucinating if you consider it to be a source of truth. If you accept that it’s not meant to be a source of truth, that it’s just meant to be an instrument, a different way of interacting with language through numbers, then you start to develop a much bigger appreciation of how sophisticated it already is, what a GPT-5 could mean, what’s possible to do using these structures. So I think a lot of people just spend too much time focusing on fighting this stuff off as opposed to appreciating it, and at the same time projecting forward what this could mean to our reality.
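
For readers who want to see the "language becomes numbers" step literally, here is a minimal sketch of tokenization, the first stage before any vectorization happens. It uses the tiktoken library; the encoding name is an illustrative choice, not tied to any particular model discussed here.

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # illustrative encoding choice
text = "Translation is an information management challenge."
ids = enc.encode(text)

print(ids)              # the sentence as a list of integer token IDs
print(enc.decode(ids))  # round-trips back to the original text
```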

Florian: It’s overwhelming, though. It’s so overwhelming when you think about what the next five, let alone 10 years of this could potentially mean, right? It’s very, very hard to go, okay, now we can write a text, so what could it possibly do in five years? But on a more practical note, let me push back a bit on the thought that it’s getting a lot right. I don’t know. I mean, from our particular example of translation, when I go to GPT-4 and I have it translate something, it still doesn’t do what I think a good translator does, which is kind of these leaps of imagination and these micro decisions, these great, interesting, expert-level micro decisions. It’s still stuck in this kind of formulaic thing, and even if I say, okay, translate like you’re a German financial translator with 20 years of experience, it’s still stuck, and it still makes some of these decisions, which then sometimes makes me think, well, it’s obviously not actually smart, right? That’s number one. But on the other hand, why is translation so incredibly hard that now these super powerful models running on the most powerful computers ever in history can’t do it very well?

Gabriel: I agree with you, but that’s the thing. I think people are too hung up on the idea of whether it can be as good as a translator or better, and I don’t look at it like that at all, because, in my opinion, I’m on board with you. I think translating, especially translating well, is incredibly hard.

Florian: Why is it so hard?

Gabriel: I would call it context. You’re referencing troves of context and learnings that aren’t reduced to the specific thing that you’re translating. You’re making all kinds of different associations. I mean, some people define intelligence as the ability to compress data. That’s definitely happening. I mean, think about it, right? The prompt that you just described, pretend that you are a 20-year veteran in financial translations in German, for instance: you can’t pretend that. That’s accessing decades’ worth of information in a way that’s not linear, it’s not simple to replicate, and that’s why I think I would never ask… Just for instance, at the very beginning, I tried to write using GPT as an aid in many different ways. I gave that up very quickly because writing, for me, is something very personal. I like to write my way, and maybe it sucks, but it still feels like mine, and that’s what it’s all about. Same thing with translations. That’s why we never look at it as an alternative to a human. We’re looking at, okay, how can we scrap this for components and think about increasing or improving the human experience in general, and not saying, okay, this is better than the translator. And again, it may be exactly like you said, maybe without added context, just throwing a text there and saying, translate this, it’s going to suck. But if you add translation memories and libraries and terminology to that, and you start to add a lot more context, and you’re able to search semantically through this stuff too, you’ll see the results are radically different. So that’s another thing, right? These engines, at least in my understanding, in my opinion, in my use, were never developed to be translation engines. And one of the first studies we ran very early on was comparing exactly that, GPT models to translation engines, and in general, the difference wasn’t that big. That’s not the flabbergasting part at all, and that’s why I think people don’t realize just how big this is and how much it’s going to impact the next five to 10 years that we have.
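
The "add context before you translate" point can be sketched very simply: rather than sending a bare "translate this" request, assemble the prompt from translation-memory matches and glossary terms retrieved for the segment. The prompt wording, the model name, and the use of the OpenAI chat completions API are assumptions for illustration, not a description of Bureau Works’ actual pipeline.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def contextual_translate(segment: str,
                         tm_matches: list[tuple[str, str]],
                         glossary: dict[str, str],
                         target_lang: str) -> str:
    # Fold retrieved context into the prompt instead of sending the bare segment.
    tm_block = "\n".join(f"- {src} => {tgt}" for src, tgt in tm_matches)
    term_block = "\n".join(f"- {src} => {tgt}" for src, tgt in glossary.items())
    prompt = (
        f"Translate the segment into {target_lang}.\n"
        f"Match the style of these approved past translations:\n{tm_block}\n"
        f"Use this terminology exactly:\n{term_block}\n"
        f"Segment: {segment}\n"
        "Return only the translation."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

print(contextual_translate(
    "Log in to review your statement.",
    tm_matches=[("Log in to your account.", "Connectez-vous à votre compte.")],
    glossary={"statement": "relevé de compte"},
    target_lang="French",
))
```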

Florian: What about you? You know how to code. If you still code, are you using it at all for code generation, as kind of a copilot?

Gabriel: Our developers use some level of copilot or other tools, but it hasn’t changed their productivity dramatically, either because we’re not using it amazingly yet, or because it’s not a matter of style, but I wouldn’t know. I don’t have hands on experience to comment on that.


Florian: Now, obviously there’s a lot of different layers to this, right, from the foundational platform to, I don’t know, front end, et cetera. In your business, what do you want to own yourself and what do you think you just want to plug into and maybe put a light customization layer on top of?

Gabriel: Kind of like in a very simple way, and again, going back to the 25-year-olds, because that’s the thing: I don’t think people realize that they are the people who will be making the decisions moving forward, right? We’re at the end of the torch. The world changes very, very quickly. And when I say decisions, not necessarily that they will be in the management positions, but they will be the people buying, the people influencing how products are shaped, how things are done, and again, the competition this time is very different. As an industry, as an example, you see tech like HeyGen, for instance, which produces these really uncanny videos where you can record and then have the same video lip-synced, translated and dubbed in, I think, currently around eight languages. But I don’t think people realize the impact that this has on someone who has no idea what translation is and has never seen the alphabet soup, and then they see this video. There was Florian, now Florian is speaking in Mandarin, and that happens seamlessly and it cost me 3 bucks. Why the hell are we spending millions a year on translation? That’s the natural decision process that any executive would go through looking at this stuff, and this stuff is going to become more and more pervasive. So there’s going to be this idea that translation is a solved problem. In my opinion, that’s almost going to start to permeate, like a virus, how a lot of people think about this stuff. And the challenge that I see is that, well, if it’s either that or spending 20 cents a word, well, that’s probably going to win, because it’s not an attractive value proposition to spend 20 cents a word to translate, then spend however many hundreds of dollars to dub, and then wait for a result that’s going to be slightly better than that thing you get for 3 bucks instantly, right? So what we’re trying to do is figure out a way where we can remain relevant for both worlds. Where we can still respect the idea that a brand has its own terminology, where we can still respect that a translator does add a lot of value, and still understand that it has to be done in a much more value-friendly way, given all the tech that we have. And very honestly, I think that’s going to create a very high-pressure situation in our space, because it means that we have to do the same stuff for less. And I don’t see our industry wired to support that, because it means basically supporting the same amount of people with a lot less money.

Florian: How about the idea that, because people consider it a solved problem, there’s going to be a ton more of it, a lot of people doing a lot more things in many more languages than they used to, because it became so accessible? That in turn makes it almost like you have to have a multilingual presence, no matter what business you run. And that in turn then drives demand for that last five, three, four percent sliver at the very top that the industry has historically provided. It’s like with design, right? I mean, now anybody with a Canva account can do something relatively okay.

Gabriel: I agree. I mean, I think that you have these two opposing pressures, right? One pressure for better, cheaper, faster, and then another pressure for more, whether it’s more scale of content, more languages, more breadth, more depth, so yeah, I see those happening for sure. This is just an opinion of mine, but I think our relationship to text is fundamentally going to change. I’m a big believer in something I call intertextuality, this idea that you can relate to the text dynamically as a reader, right? So basically, even as a reader, the way you’re reacting to a certain line could be shaping the outcome of the next line, and user behavior may create different versions of the same text for you and for me. So I see that possibility that you’re mentioning. I just don’t think that, again, all of that happens in a very high-tech framework, which means very low cost and low margins, which is not how our industry is built. Our industry is built in a very service-oriented framework, which means high cost and high margins to operate these very large businesses, so that’s the squeeze. I see in the end a potentially very positive outcome with a lot more for all of us to do, but it’s like we’re going to have to go through this portal, and I don’t think it’s going to be an easy one for us.

Florian: I agree with that, yeah. Now, where do you see, well, not the focus, but basically there are two things you need as a linguist, and then basically at scale, as an LSP, you need subject matter experts or expertise, and you need linguistic experts or expertise, right? Now, where do you see these new technologies shifting the balance? Do you think, well, the creative language part is now getting automated to a degree, so you basically just need the expert, the subject-matter expert on top, and maybe they don’t even need to know the source language? Or do you feel, well, you can also have the AI check a lot of the terminology, so you really want to have the language expert who deeply understands the source and is very creative in the target language?

Gabriel: I can see both scenarios. This is just an opinion, but you mentioned your example earlier about how hard it is to translate, and my opinion echoes yours. And I think it’s a lot simpler to emulate, I’m not saying replicate, but it’s a lot simpler to emulate subject domain expertise than it is to emulate literary finesse and creativity. So if I had to guess, that’s going to be the more highly valued thing. Not that you’re not going to need subject matter expertise, maybe. Particularly for liability purposes, people are going to want to say this translation was signed off by a financial expert in German, as in your example. I think that’s still going to be a thing, but I think the harder thing is how do you craft language in a superior way? And the reason for that is another trend: baselines get better, and I think that’s something that all of the studies I’ve read so far show. Because of a language model, let’s say, regardless of the area where it’s applied, lower performers tend to become average performers pretty quickly, because you’re aided by something that gives you at least some kind of framework. The challenge is that you get decreasing marginal utility for top performers. It’s not like you get Superman; if you’re already really good, you’re going to get slightly better, maybe, often not even impacted, at least the way the tech is right now. But my opinion is that the burden for these people when translating is going to be a lot bigger, and it should be more highly valued as well, because they’re not just committing, let’s say, a segment to a translation memory, they’re actually training an engine on the fly. The decisions they make have bigger repercussions, and we’re probably going to figure out ways to further propagate these things, too. So a good example: let’s say today we work with very static TMs and static segments, so you confirm a segment. Maybe let’s say it’s a website and you confirm a segment; it’s going to change that particular part of the website, it’s not going to change the entire website. We’re evolving into a world where if you change something here, it’s going to change maybe 1,000 pages, and it’s going to change everything dynamically. So the decisions are going to begin to carry a lot more weight, which means that they have to be made a lot more by people who really, really know what they’re doing and how they’re going to make these decisions and how they’re going to be aided. That’s also something important. But I see a lot more emphasis on that fine literary ability as the ability that’s going to be the most valued one. It’s also harder to find. Comparatively speaking, it’s easier for you to find subject matter expertise, and not only that, but you can go through credentials, right? If someone has a CPA in the US, for instance, okay, they’re a financial expert in the US, they have a CPA. But with this stuff, maybe they have an MFA, or maybe they have a degree in writing; it doesn’t mean that they’re necessarily an amazing writer. It’s harder to figure out who really has that fine literary ability, in my opinion.

Florian: Very interesting and also for all of the translation or linguistics degrees and MAs and university programs, right. They’re trying to cram a lot into their curriculum. I mean, you need to be able to code, you need to be a subject-matter expert in something when you graduate and then, yeah, maybe at the end they may actually neglect the linguistic aspect, right? Which is something that probably you only have time in your life to study once, really, when you’re at uni, right? After that you’re in business.

Gabriel: I agree, but I also think that the linguistic aspect is a function of age and experience as well for most people. Very few people are great writers in their 20s. Most people mature into great writers a lot later in life and I think it has to do with this level of experience and understanding the depth of words. And we’re not computers that you can just upload all this information, like we require living, breathing experience to become better at things. And I think that’s something that is going to be highly valued, in my opinion, and the ability to relate to others, I think that’s really the gap. It’s like, okay, this has been produced, but how relatable is it? How is it making other people feel? And not just from a data perspective, because you could do that from a data perspective. Are people clicking the CTA or are people not clicking the CTA? What’s behind it? How are people feeling when they click the CTA? Those are harder things to answer and you typically need people who are very linguistically, situationally, culturally aware to talk about these things and make decisions around these things.

Florian: I just guess that the only time, at least in my life, when you have that time and space to really get completely absorbed into writing and into a text is at uni, when you get the space for it, right? For example, I did one kind of commented translation like 20 years ago. I mean, I’ve never analyzed words and sentences and syntax to that level of depth since, ever. Why would you, right? And it’s almost like the model training, and then, yeah, sure, you layer 10, 20 years of life experience and reading on top, but that level and that depth, at least I only got at uni.

Gabriel: Yeah, and I agree, and I think those are the tools and then you use those tools as you move along, as you read and you continue to build on those tools, but that’s the basic framework. It’s literary analysis, semiotics, deconstructionism, understanding, developing a deeper appreciation of how meaning is produced, how words work together to produce meaning. I think, again, it’s not the theoretical ability, it’s being able to put this into practice and developing this level of finesse is, in my opinion, going to be very, very highly rewarded, comparatively speaking.

Florian: We just published an article about machines evaluating machines, machine translation quality estimation, all of these things; so basically, AI looking at how AI translated. What are your thoughts on that? Where do we stand? Useful, not useful? Where do you see that going?

Gabriel: I think MTQE was super useful before this stuff. Super useful. I could see it. I don’t think it’s useful anymore. Again, I think it’s a different paradigm that we’re working with. So just to explain all of our experiments in that regard: first we were trying to create numbers, quality scores that indicated how good or bad something was. That’s very hard for these models, for the most part, and then the second part is that it’s even hard for humans, right? If I tell you, Florian, this sentence is 93% right, this is 7% wrong, what does that mean, right? What is that 7%? Is it 7% that’s compromising, not compromising, compromising to whom? So translating how good a translation is into a number is always going to be hard. And we started to get a lot better results trying to describe how good or bad a translation was. And that’s what led us to translation smells, which is qualitative in nature, and it’s zero or one, really. Either the sentence has a smell or it doesn’t have a smell, and if it does have a smell, what’s the smell? So to me, that’s a much more organic way of looking at MTQE. Is there something wrong with it? Yes or no? And if so, what’s wrong with it? So a qualitative approach is definitely, in my opinion, stronger. What I see a lot of enterprises wanting to do is, okay, I want to process an entire file against these smells, and I want to preconfirm all the segments that don’t have smells. And that’s fine, that’s something that we can do, and that’s within the MTQE framework. But for me, again, it’s about making the human experience so integrated with the text that they’re translating that you don’t need MTQE to make people insanely productive and drive costs insanely down. So you can still have someone looking at every segment, but their speed is going to be super fast, their decisions are going to be replicated, carried out, rippled out in different ways. There’s going to be learning taking place, so that with very little effort, you get a lot of benefit from the human intervention.
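
The enterprise workflow described here, running a whole file against the smells and preconfirming the clean segments, can be sketched as a simple filter. The `has_smell` function below is a deliberately crude stand-in for a semantic check like the one sketched earlier, and the data shapes are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    source: str
    target: str
    confirmed: bool = False

def has_smell(source: str, target: str) -> bool:
    # Placeholder: in practice this would be a semantic check (accuracy,
    # omission, wordiness), not a crude length heuristic.
    return len(target.split()) < 0.5 * len(source.split())

def preconfirm(segments: list[Segment]) -> tuple[list[Segment], list[Segment]]:
    clean, needs_review = [], []
    for seg in segments:
        if has_smell(seg.source, seg.target):
            needs_review.append(seg)   # routed to a human reviewer
        else:
            seg.confirmed = True       # pre-confirmed, no human pass needed
            clean.append(seg)
    return clean, needs_review

segments = [
    Segment("The warranty covers parts and labor for two years.",
            "La garantie couvre les pièces et la main-d'œuvre pendant deux ans."),
    Segment("The warranty covers parts and labor for two years.",
            "La garantie."),  # suspiciously short: flagged for review
]
clean, review = preconfirm(segments)
print(len(clean), "pre-confirmed;", len(review), "sent to a human")
```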

Florian: Now, all of that takes investment to develop these types of solutions. There’s a bunch of companies in this space that have a lot of money from venture capital, private equity, maybe not so much this year, but the last couple of years. So what’s your experience competing with these kind of well-funded companies and building something from ongoing revenue?

Gabriel: Software fascinates me, and one of the reasons it fascinates me is that I think in certain things, scale makes such a big difference. You’re already a much better salesperson than I am, but if you have a 10-person sales team and I have a two-person sales team, you’re going to kick my ass. There’s no way I’m going to be able to compete with you. 10 versus two, no way. I could have amazing salespeople; we’re not going to be able to compete. With software developers, I would change the perspective completely, because if you go over a certain number of developers, it becomes very hard to manage those teams. It’s very difficult to create an innovation-focused environment at scale. Ask Google, ask any of these companies, super hard. And then there are all these layers of complication where you don’t get this one-to-one correlation between team size and output. In software development you often even see the opposite: take people away from the team and you actually start shipping more things. So I think it’s a particular area of business that, in my opinion, is not that impacted by scale. It’s much more impacted by vision, by clarity, by architecture. It’s more impacted by concepts than by the number of people. So in that respect, I would say it doesn’t play against us at all that we’re not throwing tons of money at it. I would say the opposite. I think that because we’re not financially driven, it really allows us to focus on the product. And if I were to pitch BWX, honestly, if I were to take truth serum and pitch BWX to a venture capital firm, they would say I was out of my mind. I would just say, I need your money and I’m going to develop something for 10 years, and maybe we’ll grow 100% a year after 10 years of having your money. They’re going to say no way. Anyone in private equity or VC putting their money somewhere wants to start seeing returns maybe two years down the road, three years down the road. And if they’re going to put a lot of money in, they’re going to want to see exponential returns, and we’re not after that. We’re after creating a great product with a great experience, and the company is already profitable, so we’re not out of our minds going after more revenue. We don’t have to focus on bringing in clients that don’t fit our model, and we don’t have to focus on things that just add to top-line revenue but don’t add to the product. A good example of that is connectors, right? Connectors are a great way to make revenue, and they’re also a great way to close deals. That’s what most people look for when they close a deal with a software platform. But in my opinion, they’re an outdated concept. For most use cases, it makes a lot more sense just to have someone on the client side use our API and plug stuff in. There’s no connector, there’s no maintenance, there are no fees around that, there’s no engineering devoted to that. So when you add all of that up, you add up the difficulty in managing large teams, the loss of focus around product, the loss of focus around core features and everything that I just described, I think it kind of levels the playing field, in my opinion.

Florian: What are you guys working on for 2024? Anything you can share?

Gabriel: Most of the stuff for this year has already come out or is coming out. So one cool thing, for instance, that we developed recently is the learning terms part. The platform was already learning from the segment commits, but now it’s also able to infer terminology, not through the traditional extraction of terms; it’s actually aligning the terminology that’s being produced at runtime. So, let’s say you translate word A as word B. In your translation process, you don’t flag it as a glossary term, but the tool is able to recognize that it’s a relevant term, and it’s able to feed your exact translation to the glossary as a learning term, and then you can later upgrade it into a full-fledged glossary term. So that’s a very productive tool, because a lot of companies really struggle to build glossaries and they have separate teams for terminology management and stuff like that, so that’s big. I would say what we’re focusing on, Florian, is really simplifying; if anything, the more we can take away, the better. We don’t like to add. So the more we can simplify project creation, the more we can simplify project management, the more we can simplify all these things that we do, that’s what we’re focusing on. And I think you did ask me this previously, but I didn’t really quite answer it. As far as the vision, what we see is really a lot of focus on interoperability between platforms. So, for instance, you mentioned the Canva example. To me, this is super hard to do, but a really cool example is, okay, you’re getting, let’s say, the Canva translation through AI. That’s step one, and Canva already offers that. But let’s say you want something that has been vetted by a human in quasi-runtime, and you would be able to plug into that, or you’d be able to open that string yourself in a string productivity environment that pops up, that’s integrated somehow, and that’s leveraging other sources of information. That’s somehow giving you some level of judgment on whether it’s good or bad, whether it has some kind of potential errors or not. But it’s really integrating the translation bit into the use of these tools. I would say that’s what we’re focused on.
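
To illustrate the learning-terms idea, here is a minimal sketch that watches the segments a translator confirms and infers source/target word pairings from how consistently they co-occur, surfacing them as unvetted candidates for the glossary. The crude co-occurrence count stands in for real runtime alignment, and the sample data and threshold are illustrative assumptions, not Bureau Works’ actual method.

```python
from collections import Counter
from itertools import product

# Segments a translator has already confirmed (source, target).
confirmed = [
    ("Open the dashboard to check usage.",
     "Ouvrez le tableau de bord pour vérifier l'utilisation."),
    ("The dashboard refreshes hourly.",
     "Le tableau de bord est actualisé toutes les heures."),
    ("Export the dashboard as a PDF.",
     "Exportez le tableau de bord au format PDF."),
]

def learning_terms(pairs, min_count=3):
    cooc = Counter()
    for src, tgt in pairs:
        # Skip short function words; lowercase and strip punctuation.
        src_words = {w.lower().strip(".,'") for w in src.split() if len(w) > 3}
        tgt_words = {w.lower().strip(".,'") for w in tgt.split() if len(w) > 3}
        cooc.update(product(src_words, tgt_words))
    # Keep pairings that recur across enough confirmed segments.
    return [(s, t) for (s, t), n in cooc.items() if n >= min_count]

# "dashboard" consistently co-occurs with "tableau" and "bord", so it surfaces
# as a candidate learning term without anyone flagging it in a glossary.
print(learning_terms(confirmed))
```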