
Internationalization (i18n) With AI-Powered Language Model

AI chatbots come closest to a human conversation model, which makes communication effortless and accurate. They can be a powerful tool for translating smaller chunks of text in common languages. Learn how compelling chatbot prompts can revolutionize your communication and translate hundreds of documents enriched with HTML text formatting and code blocks in one chat.

This talk was presented at JSNation 2024.

FAQ

The speaker is Sintija, a software engineer and technical lead at InterTech company based in Berlin. She teaches web development basics at ReDI School of Digital Integration and is a member of several engineering communities and guilds, working closely with international teams.

The main topic of Sintija's talk is internationalization with an AI-powered language model, focusing on making AI work for localization and translation, and on customizing content locally without external content management system dependencies.

Localization is important in the development process because it makes products and content accessible globally, and it challenges the business and its technology to scale.

AI improves its ability and accuracy in localization tasks gradually as it learns from more data. The more data is fed into it, the higher the probability that it returns the correct answer.

The three streams of content types mentioned in the talk are JSON, Markdown, and YAML front matter.

When translating JSON, it is important to keep the keys of the object in the same language and translate only the values. For YAML front matter, the front matter block and the keys should remain in the original language while translating the values.

Natural language processing (NLP) performs linguistic analysis on text to make sentences and words understandable and comparable, helping to extract deeper context from individual words and sentences in translations.

ChatGPT can be used in the early stages of experimenting with translations by providing system-level instructions such as origin language, target language, and formatting of the text, making it ideal for experimentation without the higher costs associated with using an API.

Storybook is beneficial for automating translations because it has built-in tools that can be integrated with middleware on Node.js, allowing for the specification of target languages, triggering translation actions via APIs, and managing content translations efficiently.

'Hot deployable infrastructures' are significant in the translation process because they allow for scalability and flexibility without downtime, making it easier to handle volatile fluctuations and ensuring continuous deployment and version control of translation files.

internationalization

Sintija Birgele

14 min

17 Jun, 2024


Video Summary and Transcription

This talk covers internationalization with an AI-powered language model and how to leverage AI capabilities. It emphasizes the importance of planning AI roles and workflows, customizing content locally, and understanding different content types. The translation process involves linguistic analysis, accurate system instructions, and experimentation with different communication methods. The workflow includes using an Express server and Storybook for translations, connecting metadata with the user interface, and integrating AI technology responsibly for efficient and effective results.


1. Introduction to Internationalization and AI

Short description:

Today, I will talk about internationalization with an AI-powered language model and how to make AI work for you. We'll start by narrowing down the scope and understanding the importance of planning AI roles and workflows. I will also show you how to fully customize content locally with OpenAI. Additionally, we'll explore the differences between content types and how to achieve the same goal when working with different types of content.

Hello, everyone, my name is Sintija, and today I will talk about internationalization with an AI-powered language model. Briefly about me: I'm a software engineer and technical lead at InterTech company based in Berlin. I also work at ReDI School of Digital Integration, where I teach web development basics. I'm a member of several engineering communities and guilds, and I work closely with international teams.

So my motivation for working with translations and talking about them today is to challenge the scalability of business and technology through localization, to make products and content accessible globally, and to introduce localization into web basics. At the beginning of the development journey, it's very important to learn how to implement it properly within organizations. So today we will try to answer one question: how do you make AI work for you?

The start of this process is narrowing down the scope. Machines do not learn like human beings; they gradually improve their ability and accuracy, so the more data is fed into them, the higher the probability of receiving the right answer. It is therefore important to narrow down the scope to one problem and one task when planning the AI roles and workflows, as well as when modeling the process itself. Time-consuming and labor-intensive tasks that are standardized are particularly ripe for automation using AI.

Content management systems are usually limited in the integrations they offer for localization and in the number of locales. So today, I will show you how to fully customize the content locally with OpenAI, without any external content management system dependencies. It is important to distinguish three streams, one for each content type, to understand what the main differences are, what the common functions are, and how we can achieve the same goal when the application has different content types. In the end, we are not working with only one content type; we are probably going to come across more than one when building new products.

2. Translation Process and System Instructions

Short description:

When translating different files, we need to oversee the content level and understand the limits. Translation components differ between Markdown and JSON formats. Natural language processing performs linguistic analysis to understand the meaning of sentences and words. Precise system instructions are essential for accurate output. Communicating through API requires strict precision, while open chat allows for experimentation. In Node.js, we can use different dynamic keys for each content type, and tools like Storybook can be used for automating translations.

So the start of the process is overseeing the content level, trying to understand the limits and plan how to translate the different files and where to begin. When we look at the main content types, we see the common translation component, which is the text values. For Markdown, it is the entire text that we can just pass over. Let's say we start with ChatGPT: we would pass this entire text to the prompt and ask OpenAI to translate it. But when it comes to JSON, we would probably want to keep the keys of the object in the same language and translate only the values.
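The JSON rule described above (keep the keys, translate only the values) could be sketched in TypeScript roughly as follows. This is an illustration under assumptions, not the code from the talk: `collectStrings` and `applyStrings` are hypothetical helper names, and the upper-casing step only stands in for a real translation call.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Collect every string value in document order. Keys are never collected.
function collectStrings(node: Json, out: string[] = []): string[] {
  if (typeof node === "string") {
    out.push(node);
  } else if (Array.isArray(node)) {
    node.forEach((child) => collectStrings(child, out));
  } else if (node !== null && typeof node === "object") {
    Object.values(node).forEach((child) => collectStrings(child, out));
  }
  return out;
}

// Rebuild the same structure, replacing string values in the same order.
// Keys, numbers, booleans, and null are copied unchanged.
function applyStrings(node: Json, translated: string[], cursor = { i: 0 }): Json {
  if (typeof node === "string") {
    // Fall back to the original text if a translation is missing.
    return translated[cursor.i++] ?? node;
  }
  if (Array.isArray(node)) {
    return node.map((child) => applyStrings(child, translated, cursor));
  }
  if (node !== null && typeof node === "object") {
    const result: { [key: string]: Json } = {};
    for (const [key, child] of Object.entries(node)) {
      result[key] = applyStrings(child, translated, cursor);
    }
    return result;
  }
  return node;
}

// Usage: the upper-casing is a placeholder for the translated strings.
const exampleJson: Json = { title: "Hello", nav: { home: "Home", count: 3 } };
const exampleStrings = collectStrings(exampleJson);
const exampleResult = applyStrings(exampleJson, exampleStrings.map((text) => text.toUpperCase()));
console.log(JSON.stringify(exampleResult));
```

Collecting the values first means only the translatable text has to be sent to the model, so the keys cannot be altered by the translation step.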

So there are exceptions, both in the JSON format and in YAML front matter. In front matter, it would probably be the front matter block and its keys that we would also like to keep in the same language while translating the values. If we talked to ChatGPT, we would probably say: please do not translate the title, description, and sections keys, but translate the entire Markdown and the key values. When it comes to translations with natural language processing, there are differences between human translations and translations done by AI. In order to extract the most value from the incoming data, and for it to be useful for its purpose, we first need to analyze and verify it.
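The front matter rule described above can be illustrated with a small TypeScript sketch. It assumes a very simple front matter block made only of flat `key: value` lines; a library such as gray-matter would normally do the parsing. The names `splitFrontMatter`, `joinFrontMatter`, and `translatePage` are hypothetical, and the synchronous `translate` callback is a stand-in for a real (usually asynchronous) translation call.

```typescript
interface ParsedPage {
  data: Record<string, string>;
  body: string;
}

// Split "---\nkey: value\n---\nbody" into metadata and Markdown body.
// Assumption: only flat "key: value" lines, no nested YAML.
function splitFrontMatter(source: string): ParsedPage {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) return { data: {}, body: source };
  const data: Record<string, string> = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    data[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return { data, body: match[2] ?? "" };
}

// Put the block back together; the delimiters and keys are written unchanged.
function joinFrontMatter(page: ParsedPage): string {
  const lines = Object.entries(page.data).map(([key, value]) => `${key}: ${value}`);
  if (lines.length === 0) return page.body;
  return `---\n${lines.join("\n")}\n---\n${page.body}`;
}

// Translate only the values and the Markdown body.
function translatePage(source: string, translate: (text: string) => string): string {
  const { data, body } = splitFrontMatter(source);
  const translatedData: Record<string, string> = {};
  for (const [key, value] of Object.entries(data)) {
    translatedData[key] = translate(value);
  }
  return joinFrontMatter({ data: translatedData, body: translate(body) });
}

// Usage with a placeholder "translation".
const examplePage = "---\ntitle: Hello\ndescription: A page\n---\n# Heading\n";
console.log(translatePage(examplePage, (text) => text.toUpperCase()));
```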

So natural language processing comes into play and performs linguistic analysis of the text at different levels of increasing complexity. At the lowest level, natural language processing performs actions to make sentences and words understandable and comparable. Initially, this information is used to obtain a syntactic and semantic representation of the sentences and their meaning. The ultimate goal is for the system to gain deeper context from individual words and sentences. When working with OpenAI, whether through system-level instructions in the API or instructions given to ChatGPT, it is important to highlight that ChatGPT can be good for the early stages of experimenting with what the system-level instructions should be.

Let's say we have the common instructions, like the origin language, the target language, the formatting of the text, and making sure that the output comes without commentary or other additional details and that the text is extracted exactly. So there are different details that the system has to know. When we work with the OpenAI API, then building upon this, application-domain-dependent analysis can be performed through sentiment and target recognition, which allows natural language processing to detect the polarity of sentences (negative, positive, or neutral) and the respective target entity, based on the system-level instructions. So for us, it is important to define really clearly what the rules are based on the content level, so that the system returns exactly the output we are expecting and nothing more. When system instructions are written precisely, they enhance the analytical functions without overextending them. They also increase the efficiency of operations, because less time is spent acquiring the information in the end.
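As an illustration of the common instructions listed above (origin language, target language, formatting, no commentary), a system-level instruction could be assembled like this. The wording of the rules and the `buildSystemInstruction` name are assumptions made for this sketch, not the prompt used in the talk.

```typescript
interface TranslationSettings {
  originLanguage: string;
  targetLanguage: string;
}

// Build one system-level instruction from the common rules.
// Each rule is a separate line so it is easy to adjust while experimenting.
function buildSystemInstruction({ originLanguage, targetLanguage }: TranslationSettings): string {
  return [
    `You translate content from ${originLanguage} to ${targetLanguage}.`,
    "Keep the original formatting, including Markdown, HTML tags, and code blocks.",
    "Return only the translated text, with no commentary or explanations.",
    "Do not add, remove, or summarize any part of the text.",
  ].join("\n");
}

console.log(buildSystemInstruction({ originLanguage: "English", targetLanguage: "German" }));
```

The same text can first be tried out in a chat window and then moved into the API call once the output is stable.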

So it is very important to be precise at this level when communicating through the API. It can be less strict in an open chat, where we don't have the cost of using the API, so for experimenting, ChatGPT is ideal.

When working with Node.js, the process is very simple. We use the target and origin language and pass different dynamic keys specific to the content type. In this example, I'm using only one message for all three content types: for JSON, for translating pages with front matter, or for plain Markdown. So some general terms also work. The last part, about JSON formatting, can be excluded for this example, because I'm also parsing the data in the middle layer, the middleware where we actually retrieve the content, and making sure that it's parsable in the end.

When automating translations, first with a working markup, we need to use some kind of interface. For demonstration purposes, I'm using Storybook, which already has built-in tools that we can integrate with the middleware on Node.js.
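The Node.js step described above, one message for all three content types plus a parsing check in the middle layer, might look roughly like the following sketch. The payload follows the `model` and `messages` shape of the OpenAI Chat Completions API, but the model name is left to the caller, and `buildChatRequest`, `parseTranslatedJson`, and `sameKeys` are hypothetical helpers rather than the speaker's code.

```typescript
type ContentType = "json" | "markdown" | "frontmatter";

interface ChatMessage { role: "system" | "user"; content: string; }
interface ChatRequest { model: string; messages: ChatMessage[]; }

// One system message that covers all three content types.
function buildChatRequest(
  model: string, origin: string, target: string,
  contentType: ContentType, content: string,
): ChatRequest {
  const system = [
    `Translate the ${contentType} content from ${origin} to ${target}.`,
    "If the content has keys (JSON or front matter), keep the keys unchanged and translate only the values.",
    "Return only the translated content, without commentary.",
  ].join(" ");
  return { model, messages: [{ role: "system", content: system }, { role: "user", content }] };
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// True when both values have the same nested keys; leaf values may differ.
function sameKeys(a: unknown, b: unknown): boolean {
  if (Array.isArray(a)) {
    if (!Array.isArray(b) || a.length !== b.length) return false;
    const other: unknown[] = b;
    return a.every((item, i) => sameKeys(item, other[i]));
  }
  if (Array.isArray(b)) return false;
  if (isPlainObject(a)) {
    if (!isPlainObject(b)) return false;
    const left = a;
    const right = b;
    const keys = Object.keys(left);
    return keys.length === Object.keys(right).length &&
      keys.every((key) => key in right && sameKeys(left[key], right[key]));
  }
  return !isPlainObject(b);
}

// Middle-layer check: the model output must parse and keep the original keys.
function parseTranslatedJson(original: string, translated: string): unknown {
  const source: unknown = JSON.parse(original);
  let result: unknown;
  try {
    result = JSON.parse(translated);
  } catch {
    throw new Error("Model output is not valid JSON");
  }
  if (!sameKeys(source, result)) throw new Error("Translated JSON changed the keys");
  return result;
}
```

Checking the parsed output in the middleware is what allows the prompt itself to stay general across content types.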

3. Translation Workflow and AI Integration

Short description:

Express Server and middleware are used for working with translations. Storybook provides useful tools for specifying target languages. Translating markdown and components involves reading and validating content, translating it via API, and saving it back. Front matter connects metadata with the user interface. Intelligent automation can deliver cost benefits and improve efficiency, but human intervention is still necessary. Precise versioning and explainability strategies are essential. Responsible integration of AI technology can solve issues and avoid complexity.

So it uses an Express server, and we can work with the middleware directly. For the user interface, Storybook has useful tools like global types, where we can specify the target languages. Let's say we always have the origin language available. In any case where we have an application with components or pages in the origin language and we would like to translate it from language A to language B, we can list the languages in the navigation and select one, and that will trigger the action of actually translating the content via APIs.
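A locale switcher built on Storybook global types, as mentioned above, could be configured along these lines. This is a sketch of a `.storybook/preview.ts` fragment under assumptions: the locale list is made up, and the exact field names (for example `initialGlobals` versus a `defaultValue` on the global type) depend on the Storybook version in use.

```typescript
// Origin language plus the target languages offered in the toolbar (example values).
const originLocale = "en";
const localeItems = [
  { value: "en", title: "English" },
  { value: "de", title: "Deutsch" },
  { value: "es", title: "Español" },
];

// In a real preview file this object would be the default export.
const preview = {
  globalTypes: {
    locale: {
      description: "Locale used to render and translate content",
      toolbar: {
        title: "Locale",
        icon: "globe",
        items: localeItems,
        dynamicTitle: true,
      },
    },
  },
  // The origin language is always available and selected first.
  initialGlobals: { locale: originLocale },
};

console.log(preview.globalTypes.locale.toolbar.items.map((item) => item.value));
```

Selecting a different item in the toolbar changes the `locale` global, which a decorator or wrapper component can then use to trigger the translation request.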

So somewhere in the preview, if it's Storybook, or somewhere between the components, if it's an application, there will be this function, which reads the origin content and makes sure the content exists. If it exists, the function translates it and then saves it to a new file in the same folder structure. Translating Markdown is very straightforward: we just pass the content and the origin and target language, then translate it via the API. When it comes to component translations, the process is similar, except that this function has to validate the type: for Markdown it's a string, and for JSON it's an object. Otherwise, working with a Markdown file or with components is a very similar process.
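The read, check, translate, and save flow described above can be sketched as follows. To keep the example self-contained it uses a small in-memory store; `ContentStore`, `toTargetPath`, and `translateFile` are hypothetical names, the `content/<locale>/...` folder layout is an assumption, and in Node.js the store would typically be backed by the file system.

```typescript
// Markdown is handled as a string, JSON content as an object.
type Content = string | Record<string, unknown>;

interface ContentStore {
  read(path: string): Content | undefined;
  write(path: string, content: Content): void;
}

type Translate = (content: Content, origin: string, target: string) => Promise<Content>;

// Swap the locale folder, keeping the rest of the folder structure.
function toTargetPath(originPath: string, origin: string, target: string): string {
  const segments = originPath.split("/");
  const index = segments.indexOf(origin);
  if (index === -1) throw new Error(`No "${origin}" folder in ${originPath}`);
  segments[index] = target;
  return segments.join("/");
}

async function translateFile(
  store: ContentStore, translate: Translate,
  originPath: string, origin: string, target: string,
): Promise<string | null> {
  const content = store.read(originPath);
  if (content === undefined) return null; // origin content does not exist
  const translated = await translate(content, origin, target);
  // Markdown must stay a string and JSON must stay an object.
  if (typeof translated !== typeof content) {
    throw new Error("Translated content changed type");
  }
  const targetPath = toTargetPath(originPath, origin, target);
  store.write(targetPath, translated);
  return targetPath;
}

// Usage with an in-memory store and a placeholder translator.
const exampleFiles = new Map<string, Content>([["content/en/intro.md", "# Hello"]]);
const exampleStore: ContentStore = {
  read: (path) => exampleFiles.get(path),
  write: (path, content) => { exampleFiles.set(path, content); },
};
translateFile(exampleStore, async (content) => content, "content/en/intro.md", "en", "de")
  .then((savedPath) => console.log(savedPath));
```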

With front matter, it's a little bit different. We have a controller component wrapped around the main story or the main component, which extracts the metadata from the front matter and makes sure that it gets serialized into the HTML components. Let's say we have the title, description, and sections in the front matter, and we would like this data to be connected with the user interface. What we can do is trigger these main functions from the toolbar when switching between locales, then pass the main content from the origin language, retrieve the metadata from the front matter, and pass it to the UI. So front matter is just the connection between these two main parts. The functions would be similar: checking whether the docs are available, getting the file components, and using the API on the middleware, which reads the file with gray-matter or other similar libraries. You can parse the content and then return it to the component level or the Markdown level, and vice versa. It is flexible.
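To make the link between front matter metadata and the user interface concrete, here is a minimal sketch of the serialization step described above. It assumes the metadata has already been parsed into an object with `title`, `description`, and `sections` (treated here as a list of strings); the `PageMeta` shape, the chosen HTML elements, and the `renderMeta` name are illustrative assumptions.

```typescript
interface PageMeta {
  title: string;
  description: string;
  sections: string[];
}

// Escape text before placing it into HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Serialize the (already translated) metadata into HTML for the UI.
function renderMeta(meta: PageMeta): string {
  const sections = meta.sections.map((section) => `<li>${escapeHtml(section)}</li>`).join("");
  return (
    `<header><h1>${escapeHtml(meta.title)}</h1>` +
    `<p>${escapeHtml(meta.description)}</p>` +
    `<ul>${sections}</ul></header>`
  );
}

console.log(renderMeta({ title: "Hallo", description: "Eine Seite", sections: ["Eins", "Zwei"] }));
```

A wrapper component around the story could call a function like this whenever the selected locale changes.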

So intelligent automation can deliver huge cost benefits and make dealing with volatile fluctuations much easier, with scalability made possible through hot deployable infrastructures and stateless microservice architectures. What I mean by hot deployable infrastructures is that there is no downtime when working with the files locally. Establishing some kind of versioning mechanism can be very beneficial, but it has to be done precisely to make the process more effective and controlled. Even though 99% of this work can be automated, there will always be this 1% that needs to be handled by humans: a colleague, a translator, or the marketing team. So this last mile has to be thought through very carefully, in a manner that can be integrated into workflows and procedures without making them overcomplicated.

When it comes to explainability, build strategies to make sure that robust controls for the custom-built tools are in place to safeguard all the decisions around the development. The trade-off is to balance performance with explainability. The aim is also for AI to benefit society as much as possible. This means educating people and educating your teams: using this technology can solve the issues that are being created, but it can also mean creating more issues. So there is a fine line, and AI must be integrated into workflows responsibly if we are to avoid increasingly complex solutions and increasing levels of economic inequality, which arise from a larger dichotomy between the aspirations and the skills of the working specialists who, in the end, can do integrations with AI but can also make things over-enhanced.

So thank you so much for listening today. The code is available on GitHub, and you can contact me on LinkedIn. Transcribed by https://otter.ai

