JS Character Encodings

Character encodings can be confusing for every developer, providing pitfalls even for the most experienced ones, so a lot of the time we want to end up with something that “just works” without an in-depth understanding of the involved concepts. In this talk, Anna will give an overview of what they are, what the JavaScript language provides to interact with them, and how to avoid the most common mistakes in Node.js and the Web.

Video Summary and Transcription
The video explores the importance of character encoding in software development, emphasizing the need to understand the difference between strings and byte sequences. It highlights that UTF-8 is a popular encoding due to its compatibility with ASCII, and mentions how JavaScript engines often use UTF-16 but can optimize for ASCII-only text to save space. The talk addresses common issues in Node.js, such as misalignment of character data across chunks, and explains the role of the TextDecoder API in decoding text from binary data. The discussion also covers the significance of the TextEncoder API in converting text to byte sequences, typically in UTF-8, and how Unicode assigns unique codes to characters for consistent representation across systems. It advises on measuring string length in JavaScript, especially with Unicode characters, using methods that account for code points rather than relying on the .length property.

FAQ

Strings are sequences of characters, like text, while sequences of bytes represent the digital encoding of these characters. In computing, these bytes can represent anything and are not exclusively tied to textual data.

UTF-8 is backwards compatible with ASCII because the first 128 code points (byte values 0 to 127) are encoded in UTF-8 as exactly the same single bytes as in ASCII, so ASCII-only text is already valid UTF-8 and can be read by systems that support ASCII without modification.

JavaScript engines can use various methods to store strings, often defaulting to UTF-16 encoding. However, engines are optimized to use different encodings based on the content of the string to conserve memory and enhance performance.

Using different encodings can impact how data is processed and stored. For instance, encoding mismatches can lead to data loss or misinterpretation, especially when interacting with external systems or networks.

The TextDecoder API in JavaScript allows decoding of text from binary data using different character encodings. It supports options like 'fatal' and 'stream' to handle errors and stream decoding, providing flexibility in data handling.

Understanding character encoding is crucial because it ensures that data is interpreted correctly across different systems and platforms. This is essential for maintaining data integrity and compatibility in global applications.

A common issue in Node.js is the misalignment of character data across data chunks when using certain encoding methods. This can lead to characters being incorrectly decoded if the data chunks do not align with character boundaries.

1. Introduction to Character Encodings

Short description:

I am working at MongoDB working on the Developer Tools team. So let's jump in. Why are character encodings important? Your program is typically run by an operating system that has no idea what a string is. The solution is to assign numbers to characters and convert them into bytes. Strings and sequences of bytes are different things. Historically, people came up with ways to assign numbers to characters, like ASCII and character encodings for different languages.

I am working at MongoDB working on the Developer Tools team so the Shell and the GUI and the VSCode extension for the database but this talk has absolutely nothing to do with that. So let's jump in.

So about a month ago or so I saw this tweet which got somewhat popular on Twitter and you know... Some people are laughing, you get the joke. Obviously, the easiest way to get the length of a string in JavaScript is to do object spread on it, then call Object.keys on that object, and then use Array.prototype.reduce to sum up the length of that array. So we all know what the joke is. But let's take a step back.

Why are character encodings sometimes something that we care about or have to deal with? The typical situation that you're in is you're a software developer and you're writing software. You're writing a program. That program does not exist in isolation. There is something else out there, literally anything but your program like the file system, network, other programs, other computers, anything like that. And obviously you want your software to be able to communicate with them. The default way to communicate anything is to use strings. You can put basically anything in a string. Any data you have you can serialize into a string. So it would be nice if we could talk with these other programs using strings. Unfortunately, that's not how it works.

Your program is typically run by an operating system that has no idea what a string is. If it's a Javascript program, which is going to be the case for many of you, a Javascript string is something that the Javascript engine understands, but your operating system has no idea what to do with that. You can't just pass it directly to that. That also means you can't pass it to other things. So the solution that people came up with is, you have your string, and for each character in that string you assign that character a number, and then you come up with some clever way to assign or convert these numbers into a sequence of bytes. And this feels like a very basic discussion to have, but I think it's important to have that distinction in mind.

When I say strings, I mean sequences of characters, like text. This intermediate representation, which for the most part you don't care about, I'm going to refer to as code points, because that is the language that Unicode uses for this, and then your output is a sequence of bytes. Obviously when you're decoding you go through these steps in reverse. If you take anything away from this talk, it's that strings and sequences of bytes are different things. Historically, how people have approached that: back in the 70s, when Americans had not yet discovered that there is something besides America in the world, they came up with a standardized way to assign numbers to characters, ASCII, and that was 128 characters, numbered 0 to 127, and that's enough space for lowercase and uppercase English alphabets and some special characters and, you know, who needs more than that? Then the next iteration, which was a little bit more popular around the 90s I would say, is, you know, you discover that there are other languages out there besides English, and you say, okay, well, ASCII is 128 characters, so 7 bits, bytes usually have 8 bits, so we have another 128 characters available. And the solution that people came up with was, you know, you're probably either going to have Greek text, or Slavic text, or Arabic text, you're probably not going to mix these. So, for each of these you create a character encoding.
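The three stages described here (string, code points, bytes) can be made visible with standard JavaScript APIs. This is only a minimal sketch: the sample text is made up for illustration, and UTF-8 is used as the encoding because that is what TextEncoder produces.

```javascript
// Stage 1: a string is a sequence of characters.
// Sample text (an assumption for illustration): 'A', 'é' and the euro sign.
const text = 'A\u00e9\u20ac';

// Stage 2: every character is assigned a number, its Unicode code point.
const codePoints = Array.from(text, (ch) => ch.codePointAt(0));
console.log(codePoints); // [ 65, 233, 8364 ]

// Stage 3: an encoding (UTF-8 here) turns those numbers into bytes.
const bytes = new TextEncoder().encode(text);
console.log(Array.from(bytes)); // [ 65, 195, 169, 226, 130, 172 ]

// Decoding goes through the same steps in reverse.
const roundTripped = new TextDecoder('utf-8').decode(bytes);
console.log(roundTripped === text); // true
```

Three characters became six bytes, which is a small reminder that a string and its byte sequence are different things with different lengths.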

2. Character Encodings and JavaScript

Short description:

There are about 16 different ISO-8859 character encodings, and in each of them the additional characters that are not ASCII have a different meaning. Unicode solves the problem by allowing as many code points as we want. UTF-8 is the most commonly used encoding, and it is backwards compatible with ASCII. UTF-16, on the other hand, uses two bytes per character but can require four bytes for certain characters. JavaScript lets you interact with strings as if they were stored using UTF-16.

And so these ISO-8859 character encodings, there are about 16 different character encodings, and in each of them the additional characters that are not ASCII have a different meaning. But you can't mix them, like you can't have a single byte sequence that can represent both, say, Greek and Arabic text, and sometimes you might want that. So something that got popular towards the end of the 90s is Unicode.

And so Unicode essentially solves that problem by saying, yeah, we're not going to stick to single-byte encodings, we're just going to have as many code points as we want. There is a limitation, around one million code points currently, but we're not close to hitting that. I don't think we're going to get that many emojis, so I think that's OK. What is sometimes relevant for JavaScript is that the first 256 code points match one of these prior encodings, namely ISO-8859-1. That doesn't mean by itself that Unicode is compatible with ASCII, because that's only the code points, not the actual transformation to byte sequences. But then you have multiple encodings to do that, and the one that we all know and use every day is UTF-8, and this one is backwards compatible with ASCII because, you know, the first 128 byte values match ASCII exactly, and it uses all the other bytes to represent the characters that don't fit into that range.
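The ASCII compatibility of UTF-8 is easy to check. The sketch below uses made-up sample strings: it encodes an ASCII-only word and a Greek letter, then compares the resulting bytes against the ASCII range.

```javascript
const encoder = new TextEncoder(); // always produces UTF-8

// ASCII-only text: each character becomes one byte equal to its ASCII value.
const asciiBytes = encoder.encode('Hello');
console.log(Array.from(asciiBytes)); // [ 72, 101, 108, 108, 111 ]
const sameAsAscii = Array.from('Hello').every(
  (ch, i) => ch.charCodeAt(0) === asciiBytes[i]
);
console.log(sameAsAscii); // true

// A non-ASCII character (Greek small alpha, U+03B1) uses only byte values
// from 128 upwards, so it can never be mistaken for ASCII text.
const greekBytes = encoder.encode('\u03b1');
console.log(Array.from(greekBytes)); // [ 206, 177 ]
console.log(greekBytes.every((b) => b >= 128)); // true
```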

And then there's UTF-16, which JavaScript people might also care about from time to time, where the idea is closer to, you know, two bytes per character. This made a lot of sense when Unicode was first introduced because back then, you know, nobody expected that there might be more than 65,000 characters to care about. So, you know, two bytes was a very natural choice for that. But with things like emoji being introduced, we've stepped outside that range. So some things have to be represented by pairs of two-byte units, so four bytes in total. So people sometimes say that JavaScript uses UTF-16, and like, well, there might be something to that. So I have here the output of the Unicode command line utility. If you've never used that, it is a very neat tool for finding out information about individual characters or looking up characters based on their code points, all that kind of stuff. Whoever wrote that, I am very thankful. There is an example of what this looks like in UTF-16. I've highlighted that. And then, what happens when you use Node to print out the length of a string that only contains this single hamster face character? It says two, even though it's one character. And then you can dig further and you see that this one character compares equal to a string comprised of two escape sequences. And these escape sequences happen to match exactly how UTF-16 serializes things. And so you might say, well, JavaScript uses UTF-16. I'm done. The reality is that UTF-16 is a character encoding. It's a way of transforming sequences of characters into sequences of bytes. There is no sequence of bytes in here. This is not an encoding thing. It just happens to have some similarities. So in some ways, JavaScript lets you interact with strings as if they were stored using UTF-16.
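The hamster face experiment can be reproduced in a few lines. This is a sketch rather than the exact code from the slide; the character is written as an escape sequence so the example does not depend on the encoding of the source file.

```javascript
// Hamster face, U+1F439, which lies outside the range that fits in 16 bits.
const hamster = '\u{1F439}';

// .length counts 16-bit string elements, not characters.
console.log(hamster.length); // 2

// The single character compares equal to two escape sequences,
// which are the values UTF-16 would use for it (a surrogate pair).
console.log(hamster === '\uD83D\uDC39'); // true

// codePointAt recovers the actual Unicode code point.
console.log(hamster.codePointAt(0).toString(16)); // 1f439
```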

3. Storage and String Length in JavaScript

Short description:

Sometimes JavaScript engines don't use UTF-16 for ASCII only text, which saves storage space. By emitting ASCII-only output, the overall executable size can be reduced. JavaScript provides different ways to get the length of a string, but there's no fast way to get the number of characters. Consider the purpose of getting the string length and explore npm packages if needed.

Sometimes they might be. But also JavaScript engines can use whatever storage they want to. And they're, practically speaking, not always going to use UTF-16 because if you have ASCII only text, you don't need that. If you have ASCII only text, it's wasting half the bytes in your storage. And JavaScript engines are made to be very efficient because people care about that.

So one thing that we did, and this is the only MongoDB work reference that I have here. We had a project last year to improve the startup performance of one of the tools that we maintain. We ship this tool by basically gluing Node together with a Webpack bundle of our CLI code. It sounds easy enough, right? And so Webpack has this flag for emitting ASCII-only output from its minifier. It does that by replacing non-ASCII characters with escape sequences. And so when we did that, the Webpack bundle got a bit larger, and that's to be expected. The escape sequences are longer than the characters that they represent. But the overall executable that we shipped got 15% smaller. And that is because we didn't need to store data as UTF-16 anymore. We could just pass it to the JavaScript engine as ASCII data. That actually sped things up by 3.5%, which was a pretty neat, very easy win for a single line change. So yeah, for example, V8 can use Latin-1 or UTF-16 as backends for JavaScript strings. I think JSCore can use UTF-8 backends. You don't get to see that. You don't get to interact with the underlying storage of strings. So, like, it doesn't necessarily use UTF-16.

Okay, so let's go back to the example from the beginning, from that slide from Twitter. Obviously .length is what you would use to get the length of a string, but, you know, this is right in some ways and not right in some other ways, because this is a single character and it shouldn't have a length of two, or maybe it should. Luckily, JavaScript is aware that these things happen, and so when you use anything that uses the JavaScript iterable protocol, like for...of or array spread, you can get the proper answer, where proper answer means you actually care about the number of Unicode characters. If you do this you're probably going to say, well, isn't this terribly inefficient, creating a temporary array just to get the length of a string, and the answer is obviously yes. You can improve on that a bit by actually using a loop and not allocating an array, but still, this is several orders of magnitude slower than just doing .length. And what's the story here? I mean, you're just going to have to pick one of these and think about why you want the length of a string and why that matters, and you're going to have to live with the fact that there's no fast way to get the number of characters from a string in JavaScript. One thing I wanted to mention. Really think about why you want to get the length of a string, like what do you want to do with that? Because you might care, for example, about the number of characters something takes up when printing it in a terminal, because you want to tab-align things or something. In that case, there's an npm package out there.
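The three ways of measuring mentioned here can be put side by side. This is a minimal sketch: the helper name countCodePoints and the sample string are made up for illustration, and "characters" is taken to mean Unicode code points, which is what string iteration yields.

```javascript
// Sample string (an assumption): 'a', a hamster face (U+1F439), 'b'.
const sample = 'a\u{1F439}b';

// 1. .length counts 16-bit string elements: fast, but the hamster counts twice.
console.log(sample.length); // 4

// 2. Spreading uses the string iterator, which walks code points,
//    at the cost of allocating a temporary array.
console.log([...sample].length); // 3

// 3. A for...of loop gives the same answer without the temporary array.
function countCodePoints(str) {
  let count = 0;
  for (const _ch of str) count += 1;
  return count;
}
console.log(countCodePoints(sample)); // 3
```

None of these gives the width of the string when printed in a terminal, which is a separate question again.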

4. Encoding and Decoding in JavaScript

Short description:

It does a lot of things that you would never think about because some characters are invisible so they don't take up any space at all. What we want to do in JavaScript is to get from strings to byte sequences. Buffer is very much a legacy API in Node, and there are Web API standard replacements. Encoding things is easy enough with TextEncoder instances, and decoding has interesting configurability options like fatal: false and the stream: true flag.

It does a lot of things that you would never think about because some characters are invisible so they don't take up any space at all, all that stuff. There's always an npm package for what you actually want.

All right, so let's go back to the basics here. What we want to do, and what we want to do in JavaScript, is we want to get from strings to byte sequences. If you're used to Node.js, you might say, I'm just using Buffer, that's how I do things. That's fine, but I'm not going to care about that because in my eyes, Buffer is very much a legacy API in Node. There's Web API standard replacements for a lot of things in the Buffer API, and so there's no real reason to use it anymore.

Encoding things is easy enough. You can create TextEncoder instances, which only allow UTF-8. That is a limitation to some degree, but also for the most part, you don't want to use anything else, so easy enough. Then, for decoding, things get a bit more tricky. If I pass the Uint8Array that I just got as output from the previous step, it decodes it again, works perfectly, but the API does have some interesting configurability that you might want to know about. So first of all, TextDecoder actually understands multiple character encodings. For the most part, you're not going to care about that, but it does, and that can be handy sometimes.
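As a rough illustration of the two classes, assuming made-up sample input: TextEncoder always emits UTF-8, while TextDecoder takes an encoding label, with UTF-8 as the default and UTF-16LE used below as an example of another supported encoding.

```javascript
// TextEncoder has no encoding option: it always produces UTF-8.
const textEncoder = new TextEncoder();
console.log(textEncoder.encoding); // utf-8

// 'héllo' is 5 characters but 6 bytes, because 'é' needs two bytes in UTF-8.
const encoded = textEncoder.encode('h\u00e9llo');
console.log(encoded instanceof Uint8Array, encoded.length); // true 6

// Decoding the Uint8Array with the default (UTF-8) decoder restores the text.
const decoded = new TextDecoder().decode(encoded);
console.log(decoded === 'h\u00e9llo'); // true

// TextDecoder understands other encodings too, for example UTF-16LE.
// These four bytes are the hamster face surrogate pair in little-endian order.
const utf16Decoder = new TextDecoder('utf-16le');
const fromUtf16 = utf16Decoder.decode(new Uint8Array([0x3d, 0xd8, 0x39, 0xdc]));
console.log(fromUtf16 === '\u{1F439}'); // true
```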

There's a fatal boolean option when creating one. The semantics of that are that you are decoding data, and that data may or may not be valid. And you have to handle errors somehow. You have to think about what you do. Two options that are pretty standard are presented here. One is that you do fatal: false, which means you're just getting replacement characters, like the one on the title slide of the talk, which unfortunately didn't make it into the schedule because somebody thought it was an encoding error. I think that's pretty funny. If you use fatal: true, then encoding errors will actually result in an exception when you call decode. Sometimes that's what you want because you actually want valid input and don't want to accept the fact that you're, well, losing data because it might be corrupted. And then there's the stream: true flag, which is best explained by an example. So I hope that's big enough on the screen. So you have two chunks of data that logically come from the same source and you want to decode them from UTF-8. And what happens is that you can't, because this happens to be a character that's split across two chunks. That happens sometimes, for example, when you're doing network I/O, you might not get data chunks from the network that are neatly aligned to your characters because it's just a byte stream, TCP doesn't care about where your chunk boundaries are. It just gives you bytes as they come in.
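Both behaviours described in this paragraph can be sketched in a few lines. The byte values are made up for illustration: 0xFF never occurs in valid UTF-8, and the hamster face is split in the middle of its four-byte sequence.

```javascript
// 'a', an invalid byte, 'b'.
const invalidInput = new Uint8Array([0x61, 0xff, 0x62]);

// fatal: false is the default: bad input becomes U+FFFD replacement characters.
const lenientDecoder = new TextDecoder('utf-8');
const lenientResult = lenientDecoder.decode(invalidInput);
console.log(lenientResult === 'a\uFFFDb'); // true

// fatal: true makes decode() throw instead of silently losing data.
const strictDecoder = new TextDecoder('utf-8', { fatal: true });
let strictThrew = false;
try {
  strictDecoder.decode(invalidInput);
} catch (err) {
  strictThrew = true;
}
console.log(strictThrew); // true

// The chunking problem: a 4-byte character split across two chunks.
const hamsterBytes = new TextEncoder().encode('\u{1F439}');
const firstChunk = hamsterBytes.subarray(0, 2);
const secondChunk = hamsterBytes.subarray(2);

// Decoding each chunk on its own treats both halves as invalid input.
const naiveResult =
  lenientDecoder.decode(firstChunk) + lenientDecoder.decode(secondChunk);
console.log(naiveResult === '\u{1F439}'); // false
```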

5. Text Decoder and Node.js Bugs

Short description:

And that is where this flag comes into play. You pass it to every call but the last one if you're decoding a stream of data, and the TextDecoder instance keeps in mind which partial characters it has already seen. So it has a window of which are the last bytes that it saw, and it just is smart and keeps in mind what you already passed to it. People get this wrong all the time in Node. There's a bug in the Node.js documentation where chunks might not be neatly aligned to character boundaries. Luckily, this is something that's pretty easy to fix. Node.js streams have the setEncoding method where you can just tell the stream to decode incoming data using this encoding. Another Node.js bug is a hash function that sometimes produces unexpected results.

And that is where this flag comes into play. You pass it to every call but the last one if you're decoding a stream of data, and the TextDecoder instance keeps in mind which partial characters it has already seen. So it has a window of which are the last bytes that it saw, and it just is smart and keeps in mind what you already passed to it.
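A minimal sketch of that pattern follows. The helper name decodeChunks and the way the chunks are produced are assumptions for illustration; the only API used is TextDecoder with its stream option.

```javascript
// Hypothetical helper: decodes an array of Uint8Array chunks that logically
// belong to one UTF-8 byte stream.
function decodeChunks(chunks) {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  chunks.forEach((chunk, index) => {
    const isLast = index === chunks.length - 1;
    // stream: true on every call but the last one, so the decoder holds on to
    // a trailing partial character instead of replacing it.
    text += decoder.decode(chunk, { stream: !isLast });
  });
  return text;
}

// 'a', a hamster face and 'b' encode to 6 bytes; the split after 3 bytes
// falls in the middle of the hamster face.
const allBytes = new TextEncoder().encode('a\u{1F439}b');
const splitChunks = [allBytes.subarray(0, 3), allBytes.subarray(3)];

console.log(decodeChunks(splitChunks) === 'a\u{1F439}b'); // true
```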

And so this is one of my very, very big pet peeves. People get this wrong all the time in Node. And I get why. So this is from the actual official Node.js documentation. And there's a bug in there, very much what I just described, which is that, you know, you have this common pattern where you define data to be a string. And then you have a stream, and you attach an on('data') listener. And that listener appends the chunk to that data string. And what that does under the hood, there's a lot of implicit detail here. Adding something to a string converts it to a string. Chunk in this case is a Node.js Buffer. Calling toString on a Node.js Buffer decodes it from UTF-8 by default. That's all implicitly happening here. But it suffers from the problem I just described, where, you know, chunks might not be neatly aligned to character boundaries.

Luckily, this is something that's pretty easy to fix. So, like, let's go to the Node.js documentation and open a pull request. So it's a pretty easy one-line fix. Node.js streams have the setEncoding method where you can just tell the stream, you know, hey, decode incoming data using this encoding. And then it's going to do the exact same thing that I just described using TextDecoder, where it keeps in mind which characters it has already seen. And that's a live pull request. All right. And it uses the same thing under the hood in Node.js, actually, TextDecoder and this setEncoding thing.
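The buggy pattern and the one-line fix can be compared with a small Node.js experiment. This is a sketch, not the code from the Node.js documentation: makeSplitStream and collect are hypothetical helpers, and the stream is built so that a hamster face is deliberately split across two chunks.

```javascript
const { Readable } = require('node:stream');

// Hypothetical source: delivers the 4 UTF-8 bytes of a hamster face (U+1F439)
// as two separate 2-byte chunks, the way a network stream might.
function makeSplitStream() {
  const bytes = Buffer.from('\u{1F439}', 'utf8');
  const chunks = [bytes.subarray(0, 2), bytes.subarray(2)];
  return new Readable({
    read() {
      // One chunk per read() call, then null to signal the end of the stream.
      this.push(chunks.length > 0 ? chunks.shift() : null);
    },
  });
}

// Hypothetical helper implementing the "append every chunk to a string" pattern.
function collect(stream, useSetEncoding) {
  // The fix: let the stream decode, so partial characters are carried over.
  if (useSetEncoding) stream.setEncoding('utf8');
  return new Promise((resolve, reject) => {
    let data = '';
    // Without setEncoding, chunk is a Buffer and += decodes each one separately.
    stream.on('data', (chunk) => { data += chunk; });
    stream.on('end', () => resolve(data));
    stream.on('error', reject);
  });
}

collect(makeSplitStream(), false).then((text) => {
  console.log(text.includes('\uFFFD')); // true: the character was mangled
});
collect(makeSplitStream(), true).then((text) => {
  console.log(text === '\u{1F439}'); // true: decoded correctly
});
```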

Another Node.js bug that I wanted to talk about, that you sometimes see out there and that always makes me want to go like, go on. So somebody wrote a hash function here. And that just does a simple SHA-256 of a string. It takes the string as an argument and returns a hexadecimal string as its output. And it does that by creating a crypto API hash object, calling update on that with the string, telling it to interpret that as binary data, and then calling digest to get the hex result. And obviously, that might not look that bad at a glance.

6. Binary Alias and Passing Binary to Node.js APIs

Short description:

You can pass different strings to this hash function and get the same result. That's bad. Binary is a legacy alias for ISO-8859-1 in Node.js. It's almost always a bug when you pass binary as a string to a Node.js API, and especially with crypto APIs, think about what happens. They always work on byte sequences. That's how all crypto things are designed.

What actually can happen is that you can pass different strings to this hash function and get the same result. And that's bad. That's the exact opposite of what hash functions are for. And so what happens here? binary is actually a legacy alias for ISO-8859-1 in Node.js. This is the case because long, long ago, before Uint8Array and buffers were a thing in JavaScript, you still wanted to deal with binary data sometimes. And one way that you could do that was you could use strings and just pretend that your 256 byte values correspond to your first 256 Unicode code points, which happens to be exactly ISO-8859-1. And so that was called a binary string. I haven't heard that term used in real-world production usage in 20 years or something. But yeah, that's why that alias is there. Sometimes people still pass binary to Node.js APIs because they think it tells Node to interpret something as binary data or whatever. It doesn't do that. It's almost always a bug when you pass binary as a string to a Node.js API, and especially with crypto APIs, think about what happens. They always work on byte sequences. That's how all crypto things are designed. So if you just omit that parameter, it actually does the right thing. It uses UTF-8 by default.
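The collision can be reproduced with a sketch like the following. The function names badHash and goodHash are made up for illustration and this is not the exact code from the slide; the two inputs are uppercase A and the Polish uppercase Ł (U+0141).

```javascript
const { createHash } = require('node:crypto');

// Buggy variant: 'binary' is an alias for latin1 (ISO-8859-1), so every
// character is cut down to a single byte before hashing.
function badHash(str) {
  return createHash('sha256').update(str, 'binary').digest('hex');
}

// Fixed variant: without an input encoding, strings are hashed as UTF-8.
function goodHash(str) {
  return createHash('sha256').update(str).digest('hex');
}

// Two different strings, one hash.
console.log(badHash('A') === badHash('\u0141')); // true

// With the default encoding the hashes differ, as they should.
console.log(goodHash('A') === goodHash('\u0141')); // false
```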

7. Final Thoughts and Considerations

Short description:

Keep in mind that character encodings are important, even if you're not directly working with them. UTF-8 is popular because it's ASCII compatible. Don't assume JavaScript is using UTF-16, but also don't ignore the possibility. Be cautious when copying code from the docs.

So I'm at the end of my talk. Some things to keep in mind: you are using encodings under the hood, whether you know it or not. Sometimes we have built some abstractions to make it work as seamlessly as possible, but that doesn't mean that you can forget about it. It's still something that happens when you convert between sequences of bytes and sequences of characters. You have to think about it. One lesson that I think is not that surprising, but, like, why is UTF-8 so popular? It's because it's ASCII compatible. That's the reason. So that's always something to keep in mind when you're building something new: if it's compatible with existing big players out there, then that is the best way to get your stuff adopted. Just going to skip that because I'm running out of time. But don't assume that JavaScript is using UTF-16. It might not be. You don't know what happens under the hood. But also don't pretend that it doesn't, because sometimes it acts like it does. And then one final thing, don't just copy code from the docs, they might be wrong.

QnA

String Length and Character Encoding

Short description:

The best way to find the length of a string in JS depends on what you need. If you care about individual characters or JavaScript string elements, the answers will be different. When using a for loop with array indexing notation, be aware of characters that are split into two. You can handle this situation using the codePointAt method. In the collision example, A and Ł have the same byte representation when using the ISO-8859-1 character encoding in Node.js.

All right, that was me. Thank you Anna for this great talk. Now we have one question but please ask more questions. So the question is, circling back to the trending question, what is the best way to find the length of a string in JS? Well, the best way is to first think, what does the length of a string mean for you? Like if you care about the number of individual characters, why do you care about that? If you care about the number of JavaScript string elements, which is like UTF-16 code units, why do you care about that? Or if you use string width, why do you want the width of a string when you print it to the terminal? Like different semantics, different answers. Cool. Cool question.

So the second question is, if .length returns the real length of a multi-byte character, how does it behave when used in a traditional for loop with array indexing notation? So if I'm understanding the question right, it's tricky, because you are going to have situations where you have a character that is split into two. Surrogate pairs is what they call it in UTF-16. Then if you iterate over a string using the standard for loop with an index, you're going to see these two things show up separately. This is not something that I included in my talk, but it's something that you generally want to think about. That can happen. If you want to know how to handle that well, you might know that there's the JavaScript charCodeAt API on strings. There's also something called codePointAt. And there's a subtle difference for these multi-byte characters, where codePointAt actually gives you the full Unicode code point of this character and the next one together in that case. That is a good way to handle that if you run into that situation. But, yeah. Nice.
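The difference between the two methods inside an index-based loop might look like this. It is a sketch on a made-up sample string; the values are collected as hexadecimal strings to make the surrogate pair easy to spot.

```javascript
// Sample string (an assumption): 'a' followed by a hamster face (U+1F439).
const input = 'a\u{1F439}';

// charCodeAt sees the two halves of the surrogate pair as separate elements.
const codeUnits = [];
for (let i = 0; i < input.length; i++) {
  codeUnits.push(input.charCodeAt(i).toString(16));
}
console.log(codeUnits); // [ '61', 'd83d', 'dc39' ]

// codePointAt combines a high surrogate with the low surrogate after it,
// so the loop has to skip the second half whenever that happens.
const codePointsSeen = [];
for (let i = 0; i < input.length; i++) {
  const codePoint = input.codePointAt(i);
  codePointsSeen.push(codePoint.toString(16));
  if (codePoint > 0xffff) i++; // this character took two string elements
}
console.log(codePointsSeen); // [ '61', '1f439' ]
```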

The next question is, the collision example is crazy. Can you explain what happens from a technical point of view? Do A and Ł have the same byte representation? I'm, can I turn around and ask the question? So yeah, no, that's right. So what happens is that A is 65 in ASCII, uppercase A. And the Polish uppercase Ł that I use is 65 plus 256. So what happens is that when you tell Node.js to use ISO-8859-1 to convert these to bytes, that second character is not representable using that character encoding. And Node.js doesn't throw on that or something. It just silently truncates the code point for that character. And so because truncating means truncating to a single byte, what you end up with is, you know, the plus 256 falls away and you end up with the same value for that byte. Is it true that JavaScript engines actually use ASCII (I never know how to pronounce that. I don't know either.) in the backend most of the time?
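The arithmetic behind the collision can be checked directly. This sketch uses Node.js Buffer only to make the truncation visible; the variable names are made up for illustration.

```javascript
const codeA = 'A'.codePointAt(0);      // 65
const codeL = '\u0141'.codePointAt(0); // 321, Polish uppercase Ł

console.log(codeL - codeA); // 256
// Keeping only the lowest byte of 321 drops the 256 and leaves 65.
console.log(codeL & 0xff); // 65

// latin1 (also known as 'binary') encoding in Node.js does that truncation.
const latin1A = Buffer.from('A', 'latin1');
const latin1L = Buffer.from('\u0141', 'latin1');
console.log(latin1A.equals(latin1L)); // true

// UTF-8 keeps the two characters distinct: Ł becomes the bytes C5 81.
const utf8L = Buffer.from('\u0141', 'utf8');
console.log(Array.from(utf8L)); // [ 197, 129 ]
```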

Character Encoding and JavaScript Engines

Short description:

JavaScript engines auto-convert characters outside of ASCII. They check if characters can be represented in the desired encoding. V8, for example, can have concatenated strings with different encodings. Don't try to outsmart the engine.

And then auto-convert as soon as you use a character outside of ASCII? Yeah. So obviously JavaScript engines do the work to check whether they can represent the characters in their input in the encoding that they want to store them in. Again, JavaScript engines are very, very smart about this kind of stuff. So there are a lot of internal string representations. I'm mostly familiar with V8, because I'm a Node person, different engines might do different things. But, for example, in V8, you might end up with situations where you have a string and you created that by concatenating two other strings. And it actually stores that as a concatenated representation of these two strings. And one of these might be ASCII, one of these might be not ASCII. Yeah, but don't try to outsmart the engine. That's always good advice for JavaScript.

WTF-8 Encoding and String Manipulation

Short description:

You want standard UTF-8 and the standard validation it provides. The most recommended encoding format is UTF-8. To safely truncate a string after 15 characters, use the codePointAt API to check for double-byte characters. You can force an encoding in JS by passing an explicit encoding parameter. The fastest way to handle long strings is to trust the JavaScript engine. The string length depends on how it is measured.

Good advice. So what do you think about WTF-8? Um, okay. I'm just gonna assume that people in here might not all be familiar with that. If you want to know what that is, then look it up. I think that, typically, you want standard UTF-8, and you want the standard validation that, for example, a TextDecoder gives you. And just stick with that because that's the most standardized thing you can get. Obviously, WTF-8 is a variant of UTF-8 that handles these code points outside the 65,000 range a bit differently. Not better, but differently. And I don't know. There are use cases for it, but if you don't have a good reason for using it, then don't.

And which one is the most recommended encoding format right now? UTF-8. That's very simple. Sorry. No, yeah. There's a good reason why it's the default encoding for, like, basically every JavaScript API that exists. Cool.

Um, what is the safest way to truncate a string after 15 characters, adding ... at the end? Yeah, the safest way is also the most laborious way of doing this, I guess. What I would do, what I have done in the past when running into this problem, is to use the codePointAt API that I mentioned in an earlier question to check whether the 14th character, in that case of the string, is a double-byte character, and then adjust the index where you cut off, depending on whether it is one of those, to 14 or 15. And then you can just use String.prototype.slice, or substring, or whatever API you want to use. But yeah, it's not pretty but it is correct and, you know, people might otherwise actually notice that you're just cutting off in the middle of an emoji or something. Cool.
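One way to write down the approach described in this answer is sketched below. The function name truncate and its limit parameter are hypothetical, and the limit is counted in JavaScript string elements; the only goal is to never cut between the two halves of a surrogate pair.

```javascript
// Hypothetical helper: cuts a string to at most `limit` string elements and
// appends '...', without splitting a surrogate pair in half.
function truncate(str, limit = 15) {
  if (str.length <= limit) return str;
  let end = limit;
  // codePointAt returns a value above 0xFFFF only when the element at that
  // index is a high surrogate followed by a low surrogate. If that is true for
  // the last element we would keep, the cut would land in the middle of a
  // character, so cut one element earlier instead.
  if (str.codePointAt(end - 1) > 0xffff) end -= 1;
  return str.slice(0, end) + '...';
}

// The hamster face would straddle the cut, so it is dropped entirely.
console.log(truncate('a'.repeat(14) + '\u{1F439}bcd') === 'a'.repeat(14) + '...'); // true

// Here the hamster face fits completely before the cut, so it is kept.
console.log(truncate('a'.repeat(13) + '\u{1F439}bcd') === 'a'.repeat(13) + '\u{1F439}...'); // true
```

This handles surrogate pairs only. Characters built from several code points, such as some emoji sequences, would need a different approach.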

In old Python versions, you would force UTF-8 encoding at the top of the file. Is there a way to force an encoding in JS? To forge? Force, force. I mean, you can pass an explicit encoding parameter to most JavaScript APIs that do encoding or decoding. TextEncoder, as I mentioned, is one of the exceptions to that because it only supports UTF-8, because you're only supposed to use UTF-8 unless you have a very good reason not to. But otherwise, I mean, Node.js APIs that do encoding or decoding take an explicit parameter, and other APIs do as well, yeah.

And what's the fastest way to handle long strings? To do what with? I mean, just do what you would usually do. And then if you're running into performance issues, you can take a look in more detail. But generally speaking, I mean, write idiomatic JavaScript and, you know, trust the engine that it's making smart decisions for you, for the most part. Cool. Nice.

And the last question will be what is the string length of this? And like, how do you define length? I think if you pass this to the string width packages that I mentioned earlier, it might just say for that is the width. And obviously, the other cases, I can't really answer. Well, thank you very much, Anna.

Anna Henningsen
33 min
14 Apr, 2023

