Now there are various ways we could approach solving the problem, depending on the resources we have available to us. But for the sake of this talk, let's assume that we don't have much budget to dedicate to this feature, nor do we have a team that can translate content for us. So our solution must be low cost and require human interaction only to validate the results.
This is a fantastic use case for AI and automation. When my team started tackling this problem, for obvious reasons we looked at Azure first. Through that research, I found the Azure AI Translator service. This service not only allows for ad hoc text translation, where you send text directly to the translator and it returns the translated text as JSON, but it also offers document translation, which takes as input one or more files from a blob storage container, translates them, and outputs the translated files into a different blob storage container.
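To make the ad hoc flow concrete, here's a minimal sketch of building a request against the Translator v3 REST text-translation endpoint. The key, region, and target language values are placeholders you'd supply yourself; the request-building is pulled into a helper just to make the shape easy to see.

```typescript
// Sketch: ad hoc text translation against the Azure Translator v3 REST API.
const ENDPOINT = "https://api.cognitive.microsofttranslator.com";

interface TranslateRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the HTTP request for a batch of strings to translate.
function buildTranslateRequest(
  texts: string[],
  to: string,
  key: string,
  region: string
): TranslateRequest {
  return {
    url: `${ENDPOINT}/translate?api-version=3.0&to=${encodeURIComponent(to)}`,
    headers: {
      "Ocp-Apim-Subscription-Key": key,
      "Ocp-Apim-Subscription-Region": region,
      "Content-Type": "application/json",
    },
    // The API expects a JSON array of { Text } objects.
    body: JSON.stringify(texts.map((t) => ({ Text: t }))),
  };
}

// Usage (network call left commented for the sketch):
// const req = buildTranslateRequest(["Hello, world"], "fr", myKey, myRegion);
// const res = await fetch(req.url, { method: "POST", headers: req.headers, body: req.body });
// const data = await res.json(); // array of { translations: [{ text, to }] }
```

The response comes back as JSON, one result object per input string, which is what makes this so easy to wire into a build step or server component.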
Azure AI Translator also has a few key features that ultimately convinced us it was the right fit for our needs. The first was language support. My team at Microsoft runs an upskilling and reskilling program that operates all around the world, so it's important to us that any multilingual solution be able to accommodate any language or dialect we want to support. With over 100 languages and dialects supported, even including Klingon, Azure AI Translator was definitely the right choice for us.
The next key feature for us was cost. Azure AI Translator has a generous free tier of up to 2 million characters translated per month, so depending on the amount of content on your site and how often you translate it, you may not incur any cost at all. Another important feature is accuracy. Azure AI Translator uses what they call Neural Machine Translation, or NMT, which in their own words is an improvement on previous statistical machine translation (SMT) approaches because it uses far more dimensions to represent tokens, such as words, morphemes, and punctuation, of the source and target text. They go on to explain that the NMT approach takes the full sentence into context, versus the sliding window of only a few words that SMT uses, and so produces more fluid, human-sounding translations. For us, this means more contextually accurate translations, which means less tweaking, if any, is needed. This is what makes it possible for us to integrate translations into our CI/CD workflow.
The final key feature for us is the ability to use custom glossaries to tweak the translation process, which is useful for skipping translation of industry-specific terminology and/or brand-related text, as well as for ensuring that certain words or phrases are translated in a way that retains the original meaning. It's not all sunshine and rainbows, though, so I also want to take a moment to call out some of the limitations of Azure AI Translator. Although the service can handle many different file types, it currently can't handle MDX or JSX/TSX, so your content will need to be stored separately from those files in order to be translated. Another limitation is that all translation responses are returned as horizontal, left-to-right or right-to-left text, so you may need to add rendering logic if you want to display content vertically for applicable languages. Finally, as with any AI implementation, you'll still want to validate the results before deploying them to prod, and you may find that additional tweaking is needed, which would also require a redeployment and revalidation.
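On the glossary point above: for document translation, a custom glossary can be supplied as a simple tab-separated file of source/target pairs. The entries below are made-up examples, not from our actual glossary; repeating the source text in the target column is how you'd pin a term so it's left untranslated.

```tsv
Microsoft Learn	Microsoft Learn
pull request	pull request
upskilling	perfeccionamiento
```

The first two rows protect brand and industry terms from being translated; the last forces a specific Spanish rendering regardless of what the model would otherwise choose.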
Alright, enough talk about Azure AI Translator; let's get into how we're going to design this workflow. Depending on how the content is stored, there are essentially three options for how to translate it. Option 1: your content is stored as JSON. As you can see from this JSON snippet, we have an array of posts, which are objects containing the various data our frontend needs in order to render each post. If your content is stored like this, you would most likely want to translate it at runtime using the Azure Translator API, probably with a React server component or a separate custom API that caches the translated content in order to reduce how often the translation service runs, which is ultimately going to keep your costs down.