Video Summary and Transcription
Today's talk covers internationalization with the Powered Language Bundle and how to leverage AI capabilities. It emphasizes the importance of planning AI roles and workflows, customizing content locally, and understanding different content types. The translation process involves linguistic analysis, precise system instructions, and experimentation with different communication methods. The workflow uses an Express server and Storybook for translations, connects metadata with the user interface, and integrates AI technology responsibly for efficient and effective results.
1. Introduction to Internationalization and AI
Today, I will talk about internationalization with the Powered Language Bundle and how to make AI work for you. We'll start by narrowing down the scope and understanding the importance of planning AI roles and workflows. I will also show you how to fully customize content locally with OpenAI. Additionally, we'll explore the differences between content types and how to achieve the same goal when working with different types of content.
Hello, everyone, my name is Cynthia, and today I will talk about internationalization with the Powered Language Bundle. Briefly about me: I'm a software engineer and technical lead at InterTech, a company based in Berlin. I also teach web development basics at ReDI School of Digital Integration. I'm a member of several engineering communities and guilds, and I work closely with international teams.
My motivation for working with translations, and for talking about them today, is to push the scalability of business and technology through localization, making products and content accessible globally, and to introduce localization into web basics at the beginning of the development journey, because it is very important to learn how to implement it properly within organizations. So today we will try to answer one question: how do you make AI work for you?
The start of this process is narrowing down the scope. Machines do not learn like human beings; rather, they gradually improve their ability and accuracy, so the more data they are fed, the higher the probability of receiving the right answer. It is therefore important to narrow the scope down to one problem and one task when planning AI roles and workflows, and when modeling the process itself. Time-consuming, labor-intensive tasks that are standardized are particularly ripe for automation with AI. Content management systems are usually limited in the localization integrations and the number of locales they support, so today I will show you how to fully customize content locally with OpenAI, without any external content management system dependencies. It is also important to distinguish three streams, one for each content type, to understand what the main differences are, what the common functions are, and how we can achieve the same goal when an application contains different content types. In the end, we are not working with only one content type; we are probably crossing more than one when building new products.
2. Translation Process and System Instructions
When translating different files, we need to oversee the content level and understand the limits. The translation components differ between the Markdown and JSON formats. Natural language processing performs linguistic analysis to understand the meaning of sentences and words. Precise system instructions are essential for accurate output. Communicating through the API requires strict precision, while the open chat allows for experimentation. In Node.js, we can use different dynamic keys for each content type, and tools like Storybook can be used to automate translations.
The start of the process is overseeing the content level: understanding the limits, planning how to translate the different files, and deciding where to begin. When we look at the main content types, we see the common translation component, which is the text values. For Markdown, it is the entire text that we can simply pass over: if we were using ChatGPT, we would paste the entire text into the prompt and ask OpenAI to translate it. But when it comes to JSON, we would probably want to keep the keys of the object in the same language and translate only its values.
There are exceptions, both in the JSON format and in the YAML front matter. In the front matter, it is the front matter block and its keys that we would also like to keep in the same language, translating only the values. If we talked to ChatGPT, we would say: please do not translate the title, description, and sections keys, but translate the entire Markdown and the key values. When it comes to translations with natural language processing, there are differences between translations done by humans and translations done by AI, and in order to extract the most value from the incoming data, and for it to be useful for our purposes, we first need to analyze and verify it.
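As a minimal sketch of that rule (not code from the talk), here is one way to walk a JSON object and translate only its string values while leaving every key untouched; the `translate` callback is a placeholder for the actual API call.

```typescript
// Minimal sketch (not from the talk): translate only the values of a JSON
// object while keeping its keys untouched. `translate` is an assumed
// placeholder for whatever API call performs the actual translation.
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

async function translateValues(
  node: Json,
  translate: (text: string) => Promise<string>
): Promise<Json> {
  if (typeof node === "string") {
    return translate(node); // string values are translated
  }
  if (Array.isArray(node)) {
    return Promise.all(node.map((item) => translateValues(item, translate)));
  }
  if (node !== null && typeof node === "object") {
    const result: { [key: string]: Json } = {};
    for (const [key, value] of Object.entries(node)) {
      result[key] = await translateValues(value, translate); // keys stay as-is
    }
    return result;
  }
  return node; // numbers, booleans, and null stay as-is
}
```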
This is where natural language processing comes into play: it performs linguistic analysis of the text at different levels of increasing complexity. At the lowest level, natural language processing performs actions to make sentences and words understandable and comparable. Initially, this information is used to obtain a syntactic and semantic representation of the sentences and their meaning, and the ultimate goal is for the system to gain deeper context from individual words and sentences. When working with OpenAI, it is important to distinguish between system-level instructions and instructions given through ChatGPT; ChatGPT can be good in the early stages for experimenting with what the system-level instructions should be.
Let's say we have the common instructions: the origin language, the target language, the formatting of the text, making sure the output comes back without commentary, and other details of the text that must be extracted exactly. There are different details that the system has to know. When we work with the OpenAI API, then building upon this, application-domain-dependent analysis can be performed through sentiment and target recognition, which allows natural language processing to detect the polarity of the sentences, whether negative, positive, or neutral, and the respective target entity, based on the system-level instructions. For us, it is important to define very clearly, based on the content level, what the rules are, so that the system returns to us exactly the output we are expecting and nothing more. When the system instructions are done precisely, they will enhance the analytical functions without inflating them, and they will increase the efficiency of the operations because less time is spent acquiring the information in the end.
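A small illustrative sketch of what such system-level instructions could look like; the exact wording is not shown in the talk, so this phrasing is an assumption.

```typescript
// Illustrative system-level instructions; the wording is an assumption, not
// the exact prompt used in the talk. The rules pin down origin and target
// language, formatting, and "no commentary" so only the translation comes back.
function systemInstruction(origin: string, target: string): string {
  return [
    `You are a translation engine. Translate from ${origin} to ${target}.`,
    "Preserve the Markdown formatting of the input exactly.",
    "Do not translate JSON keys or YAML front matter keys; translate values only.",
    "Return only the translated content, with no commentary or explanations.",
  ].join("\n");
}
```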
So it is very important to be precise at this level when communicating through the API. It can be less strict via the open chat, where we don't have the cost of using the API, so for experimenting, ChatGPT is ideal. When working with Node.js, the process is very simple: we use the target and origin languages and pass different dynamic keys specific to each content type. In this example, I'm using only one message for all three content types, whether it is JSON, a page with front matter, or plain Markdown, so some general terms also work. The last part about JSON formatting can also be excluded for this example, because I'm parsing the data in the middleware, where we actually retrieve the content, to make sure it is parsable in the end. When automating translations, to work with Markdown we first need some kind of interface. For demonstration purposes, I'm using Storybook, which already has built-in tools that we can integrate with the middleware on Node.js.
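A minimal sketch of what that middleware-side call could look like with the official openai package; the model name, prompt wording, and function shape are assumptions rather than the talk's exact code.

```typescript
// Minimal sketch of the middleware-side OpenAI call (model name and prompt
// wording are assumptions, not taken verbatim from the talk). One prompt shape
// is reused for all three content types by passing the content and the
// languages dynamically.
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function translateContent(
  content: string,
  origin: string,
  target: string
): Promise<string> {
  const completion = await client.chat.completions.create({
    model: "gpt-4o-mini", // assumed model; any chat-completion model works here
    messages: [
      {
        role: "system",
        content:
          `Translate from ${origin} to ${target}. Preserve formatting, ` +
          `keep JSON and front matter keys untranslated, and return only ` +
          `the translated content without commentary.`,
      },
      { role: "user", content },
    ],
  });
  return completion.choices[0].message.content ?? "";
}
```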
3. Translation Workflow and AI Integration
An Express server and middleware are used for working with translations. Storybook provides useful tools for specifying target languages. Translating Markdown and components involves reading and validating the content, translating it via the API, and saving it back. Front matter connects the metadata with the user interface. Intelligent automation can deliver cost benefits and improve efficiency, but human intervention is still necessary. Precise versioning and explainability strategies are essential. Responsible integration of AI technology can solve issues and avoid complexity.
Storybook uses an Express server, so we can work with the middleware directly. For the user interface, Storybook has useful tools like global types, where we can specify the target languages. The origin language is always available; whenever we have an application with components or pages in the origin language and we would like to translate from language A to language B, we can list the languages in the navigation, select one, and that triggers the action of actually translating the content via the API.
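As a sketch, this is roughly how a locale switcher can be exposed in the Storybook toolbar through global types; the locale codes and labels here are placeholders, not necessarily the ones used in the talk.

```typescript
// .storybook/preview.ts — sketch of a locale switcher exposed in the Storybook
// toolbar via globalTypes. The listed locales are placeholder values.
const preview = {
  globalTypes: {
    locale: {
      description: "Target language for translation",
      defaultValue: "en", // the origin language is always available
      toolbar: {
        icon: "globe",
        items: [
          { value: "en", title: "English" },
          { value: "de", title: "German" },
          { value: "es", title: "Spanish" },
        ],
      },
    },
  },
};

export default preview;
```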
Somewhere in the preview, whether in Storybook or in an application, somewhere between the components, there is a function which reads the origin content and makes sure the content exists; if it exists, it translates it and then saves it back to a new file in the same folder structure. Translating the Markdown is very straightforward: we just pass the content with the origin and target languages and translate it via the API. When it comes to the component translations, the process is similar, except that this function has to validate whether the content is a string, in the case of Markdown, or an object, in the case of JSON. After that, working with a Markdown file or with components is a very similar process.
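A rough sketch of that read-translate-save step under a couple of assumptions of my own: files carry the locale in their names (for example `page.en.md`), and `translate` stands in for the API call shown earlier.

```typescript
// Sketch of the read -> translate -> save step; the file-naming convention
// (locale embedded in the file name) and the `translate` helper are assumptions.
import { promises as fs } from "node:fs";
import path from "node:path";

async function translateFile(
  originPath: string,
  origin: string,
  target: string,
  translate: (text: string, origin: string, target: string) => Promise<string>
): Promise<void> {
  // Make sure the origin content exists before doing anything else.
  const raw = await fs.readFile(originPath, "utf8").catch(() => null);
  if (raw === null) return;

  const translated = await translate(raw, origin, target);

  // For JSON components, re-parse the translated output in the middleware to
  // make sure it is still valid JSON; Markdown is saved as-is.
  const output =
    path.extname(originPath) === ".json"
      ? JSON.stringify(JSON.parse(translated), null, 2)
      : translated;

  // Save the result to the same folder structure under the target locale.
  const targetPath = originPath.replace(`.${origin}.`, `.${target}.`);
  await fs.writeFile(targetPath, output, "utf8");
}
```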
With the front matter, it is a little bit different. We have a controller component wrapped around the main story or the main component, which extracts the metadata from the front matter and makes sure it gets serialized into the HTML components. Let's say we had title, description, and sections in the front matter, and we would like this data to be connected with the user interface. What we can do is trigger these main functions from the toolbar when switching between the locales, pass the main content from the origin language, retrieve the metadata from the front matter, and parse it into the UI. The front matter is just the connection between these two main parts. The function is similar to the others: checking whether the docs are available, getting the file components, and using the API on the middleware, which reads the file; with gray-matter or other similar libraries you can parse the content and then return it to the component level or the Markdown level, and vice versa. It is flexible.
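A short sketch of that front matter step with the gray-matter library; the keys (title, description, sections) come from the example above, while the `translate` callback and the overall function shape are assumptions.

```typescript
// Sketch of the front matter step using the gray-matter library; the helper
// shape is an assumption. Metadata is split off, its values are translated
// while the keys stay untouched, and the document is serialized back so the
// UI can read fields like title, description, and sections.
import matter from "gray-matter";

async function translateFrontMatterDoc(
  markdown: string,
  translate: (text: string) => Promise<string>
): Promise<string> {
  const { data, content } = matter(markdown);

  // Keep the front matter keys, translate only the string values.
  const translatedData: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(data)) {
    translatedData[key] =
      typeof value === "string" ? await translate(value) : value;
  }

  // Translate the Markdown body and stitch the document back together.
  const translatedBody = await translate(content);
  return matter.stringify(translatedBody, translatedData);
}
```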
Intelligent automation can deliver huge cost benefits and can make dealing with volatile fluctuations much easier, with scalability made possible through hot-deployable infrastructures and stateless microservice architectures. What I mean by hot-deployable infrastructure is that there is no downtime when working with the files locally, and establishing some kind of mechanism for versioning can be very beneficial, but it has to be done precisely to make the process more effective and controlled. Even though 99% of this work can be automated, there will always be the 1% that needs to be handled by a colleague, a translator, or the marketing team, by humans. This last mile has to be thought through very carefully, in a way that can be integrated into workflows and procedures without making them overcomplicated.
When it comes to explainability, build strategies to make sure that robust controls over the custom-built tools are in place to safeguard all the decisions around the development; the trade-off is balancing performance with explainability, so that improving the AI also benefits society as much as possible. This means that educating people, educating your teams that use this technology, can solve the issues that are being created, but it can also mean creating more issues. There is a fine line here: AI must be responsibly integrated into workflows if we are to avoid increasingly complex solutions and increasing levels of economic inequality arising from a larger gap between the aspirations and the skills of the working specialists, who in the end can do integrations with AI but can also make things over-enhanced.
So thank you so much for listening today. The code is available on GitHub, and you can contact me on LinkedIn. Thank you so much for listening.