Internationalization (i18n) With AI-Powered Language Model

Rate this content
Bookmark

AI chatbots are closest to a human conversation model which makes communication effortless and accurate. It can be powerful tool for translating smaller chunks of text presented in common languages. Learn how compelling chatbot prompts can revolutionize your communication and translate hundreds of documents enriched with HTML text formatting and code blocks in one chat.

This talk has been presented at JSNation 2024, check out the latest edition of this JavaScript Conference.

FAQ

The speaker is Cynthia, a software engineer and technical lead at InterTech company based in Berlin. She teaches web development basics at Ready School of Digital Integration and is a member of several engineering communities and guilds, working closely with international teams.

The main topic of Cynthia's talk is internationalization with the Powered Language Bundle, focusing on making AI work for localization and translation, and customizing content locally without external content management system dependencies.

Localization is important in the development process because it helps make products and content accessible globally, challenging the scalability of business and technology through localization.

AI improves its ability and accuracy in localization tasks by gradually learning from more data. The higher the amount of data fed into AI, the better its probabilities of providing the correct answer.

The three streams of content types mentioned in the talk are JSON, Markdown, and YAML front matter.

When translating JSON, it is important to keep the keys of the object in the same language and translate only the values. For YAML front matter, the front matter block and the keys should remain in the original language while translating the values.

Natural language processing (NLP) performs linguistic analysis on text to make sentences and words understandable and comparable, helping to extract deeper context from individual words and sentences in translations.

ChatGPT can be used in the early stages of experimenting with translations by providing system-level instructions such as origin language, target language, and formatting of the text, making it ideal for experimentation without the higher costs associated with using an API.

Storybook is beneficial for automating translations because it has built-in tools that can be integrated with middleware on Node.js, allowing for the specification of target languages, triggering translation actions via APIs, and managing content translations efficiently.

'Hot deployable infrastructures' are significant in the translation process because they allow for scalability and flexibility without downtime, making it easier to handle volatile fluctuations and ensuring continuous deployment and version control of translation files.

Sintija Birgele
Sintija Birgele
14 min
17 Jun, 2024

Comments

Sign in or register to post your comment.

Video Summary and Transcription

Today's Talk covers internationalization with the Powered Language Bundle and leveraging AI capabilities. It emphasizes the importance of planning AI roles and workflows, customizing content locally, and understanding different content types. The translation process involves linguistic analysis, accurate system instructions, and experimentation with different communication methods. The workflow includes using Express Server and Storybook for translations, connecting metadata with the user interface, and integrating AI technology responsibly for efficient and effective results.

1. Introduction to Internationalization and AI

Short description:

Today, I will talk about internationalization with the Powered Language Bundle and how to make AI work for you. We'll start by narrowing down the scope and understanding the importance of planning AI roles and workflows. I will also show you how to fully customize content locally with OpenAI. Additionally, we'll explore the differences between content types and how to achieve the same goal when working with different types of content.

Hello, everyone, my name is Cynthia, and today I will talk about internationalization with the Powered Language Bundle. Shortly about me, I'm a software engineer and technical lead at InterTech company based in Berlin. I'm working at Ready School of Digital Integration where I'm teaching web development basics. I'm also a member of several engineering communities and guilds, and I work closely with international teams.

So my motivation for working with translations and talking about it today is really trying to challenge the scalability of the business and technology through localization to make products and content accessible globally, as well as introducing localization into web basics. At the beginning of the development journey, it's very important to learn how to implement properly within organizations. So today we will try to answer one question, how to make AI work for you?

And the start of this process is narrowing down the scope. Machines do not learn like human beings, but rather gradually improve their ability and accuracy, so that the more data is fed into them, the higher probabilities receive the right answer. So it is therefore important to narrow down the scope to one problem and one task when planning the AI roles and workflows, as well as modeling the process itself. And the time-consuming and labor-intensive tasks that are standardized are particularly ripe for automation using AI. So the content management systems are usually limited with available integrations for localization and number of locales. So today, I will show you how to fully customize the content locally with OpenAI without any external content management system dependencies. So it is therefore important to distinguish three streams for each content type to understand what are the main differences, what are the common functions, and how can we achieve the same goal when having different content types in the application. Because in the end, we are not working only with one content type, we're probably crossing out more than one when building new products.

2. Translation Process and System Instructions

Short description:

When translating different files, we need to oversee the content level and understand the limits. Translation components differ between Markdown and JSON formats. Natural language processing performs linguistic analysis to understand the meaning of sentences and words. Precise system instructions are essential for accurate output. Communicating through API requires strict precision, while open chat allows for experimentation. In Node.js, we can use different dynamic keys for each content type, and tools like Storybook can be used for automating translations.

So the first, the start of the process is overseeing the content level when trying to understand the limits and plan how to translate the different files and where to begin basically. So when we look at the main content types, we see the common translation components, which is the text values. For the Markdown, it is entire text that we can just pass over. Let's say if we start using JGPT, we would pass this entire text to the prompt and ask the translator to ask the OpenAI to translate it. But when it comes to the JSON, we would probably want to keep the keys of the object, the same language, and translate only the values of it.

So therefore, there are exceptions, both in JSON format and YAML front matter. So in the front matter, it would be probably the front matter block and the keys of the front matter that we would like to also keep the same language and translating the values of it. Probably, if we talked to JGPT, we would say, please do not translate the title description sections, but translate entire Markdown and the key values. So when it comes to translations with the natural language processing, there are differences between the human translations or translations done by AI. And in order to extract the most value from the incoming data, and for it to be done useful for their purposes, we need to first analyze and make sure of it.

So natural language processing comes into the play and performs linguistic analysis to the text at different levels of increasing the complexity. So at the lowest level, natural language performs actions to make sentences and words understandable and comparable. So initially, information is used to obtain syntactic semantic representation of the sentences and their meaning. And the ultimate goal is for the system to gain deeper context from individual words and sentences. So when working with OpenAI, between the system level instructions or instructions by JGPT, important is to highlight that JGPT can be good for early stages of experimenting, what are the system level instructions.

Let's say we have the common instructions, like using the origin language, target language, formatting of the text, making sure that the output is without commentary, or other further details of the text is extracted exactly. So there are different details that the system has to know. And when we work with OpenAI API, then building upon this, the application domain dependent analysis can be performed through sentiment and this target recognition, which allows natural language processing to detect the polarity of the sentences for it to be negative, positive or neutral, and respective target entity on the system level instructions. So for us, it is important to really clearly define what are the rules based on the content level for the system to retrieve back to us the exact same output that we're expecting and nothing more. And when system instructions are done precisely, it will enhance analytical functions, but not over increase it. And as well increase the efficiency of the operations due to decreased time of spending, acquiring the information in the end.

So it is very important to be precise at this level when communicating through API. But it can be less strict via open chat where we don't have the over costs of the price, the cost of using API, so for experimenting, ChatGPT is ideal. So when working with Node.js, the process is very simple. We're using the target and origin language, and passing different dynamic keys specific to content type. And in this example, I'm using only one message for all three content types for JSON for translating pages with front matter or just a markdown. So some general terms also works. And also the last part about JSON, formatting can be excluded for this project for this example, because I'm also making sure of parsing the data in the middle layer, the middle where we're actually retrieving the content and making sure that it's parsable in the end. So when automating translations, first with a working markup, we need to use some kind of interface. So for the demonstration purpose, I'm using Storybook, which has already built in tools for that we can integrate with the middleware on Node.js.

Check out more articles and videos

We constantly think of articles and videos that might spark Git people interest / skill us up or help building a stellar career

How do Localise and Personalize Content with Sanity.io and Next.js
React Advanced Conference 2021React Advanced Conference 2021
8 min
How do Localise and Personalize Content with Sanity.io and Next.js
Sanity.io provides a content platform for structured content that replaces traditional CMS. Their solution allows businesses to structure and query content anywhere using the Sanity studio and open source React application. The talk focuses on solving the challenge of sending personalized data to users in a static website environment using Next.js Vercel for hosting and Sanity for content querying and delivery. The Sanity studio allows for modeling pages, articles, and banners, with banners being shown to visitors based on their country. The solution involves using Grok queries to fetch the right banner based on country information, demonstrating personalization based on localization and dynamic content querying.
End-to-end i18n
React Advanced Conference 2021React Advanced Conference 2021
26 min
End-to-end i18n
Thanks for joining my talk on end-to-end internationalization. I'll walk you through internationalizing a React app, covering translated strings, currency and date formatting, translator management, and injecting translated strings back into the app. The constants used throughout the app define localization settings and translations. The React Intel library is used for managing translations, and custom functions are created for consistent date and number formatting. The translation process involves extracting strings, using tools like PO Edit, and compiling the translated strings into JSON files for the React app.
Building JS Apps with Internationalization (i18n) in Mind
JSNation 2022JSNation 2022
21 min
Building JS Apps with Internationalization (i18n) in Mind
This Talk discusses building JavaScript apps with internationalization in mind, addressing issues such as handling different name formats, using Unicode for compatibility, character encoding bugs, localization and translation solutions, testing in different languages, accommodating translated text in layouts, cultural considerations, and the importance of enabling different languages for users. The speaker also mentions various open source tools for internationalization. The Talk concludes with a reminder to avoid assumptions and embrace diversity in the World Wide Web.
Internationalizing React
React Summit Remote Edition 2021React Summit Remote Edition 2021
29 min
Internationalizing React
The Talk discusses the challenges of adding and maintaining new languages in a React application and suggests ways to make the process easier. It defines internationalization as the process of architecting an application for localization and explains that localization involves adapting the content and experience for a specific locale. The speaker recommends using libraries for internationalization and provides resources for learning more about the topic. The Talk also addresses edge cases and difficulties with multiple dialects or languages, translation processes, and right-to-left CSS libraries.
Emoji Encoding, � Unicode, & Internationalization
JSNation Live 2020JSNation Live 2020
34 min
Emoji Encoding, � Unicode, & Internationalization
This Talk explores the UTF-8 encoding and its relationship with emojis. It discusses the history of encoding, the birth of Unicode, and the importance of considering global usage when building software products. The Talk also covers JavaScript's encoding issues with Unicode and the use of the string.prototype.normalize method. It highlights the addition of emoji support in Unicode, the variation and proposal process for emojis, and the importance of transparency in emoji encoding. The Talk concludes with the significance of diverse emojis, the recommendation of UTF-8 for web development, and the need to understand encoding and decoding in app architecture.
Localization for Real-World Use-Cases: Key Learnings from Onboarding Global Brands
React Summit 2022React Summit 2022
8 min
Localization for Real-World Use-Cases: Key Learnings from Onboarding Global Brands
I'm going to talk about localisation in the real world and how Sanity, a platform for structured content, focuses on content as data and flexible internationalization. Sanity allows for multiple languages in different markets, providing customization options for content visibility based on location. The platform offers a flexible approach to content creation and customization, allowing organizations to internationalize their content based on their specific needs. With Sanity's query language and the brand new version of Sanity Studio, developers have more flexibility than ever before.

Workshops on related topic

Localizing Your Remix Website
React Summit 2023React Summit 2023
154 min
Localizing Your Remix Website
WorkshopFree
Harshil Agrawal
Harshil Agrawal
Localized content helps you connect with your audience in their preferred language. It not only helps you grow your business but helps your audience understand your offerings better. In this workshop, you will get an introduction to localization and will learn how to implement localization to your Contentful-powered Remix website.
Table of contents:- Introduction to Localization- Introduction to Contentful- Localization in Contentful- Introduction to Remix- Setting up a new Remix project- Rendering content on the website- Implementing Localization in Remix Website- Recap- Next Steps