Things I learned while writing high-performance JavaScript applications

Passionate and experienced software engineer, Google GDE, and Microsoft MVP.

During the past months, I developed Lyra, an incredibly fast full-text search engine entirely written in TypeScript. It was surprising to me to see how it could compete with solutions written in Rust, Java, and Golang, all languages known for being typically "faster than JavaScript"... but is that even true? In this talk, I will share some lessons I learned while developing complex, performance-critical applications in JavaScript.

This talk was presented at Node Congress 2023.

FAQ

The talk aims to share insights and experiences related to full-text search, particularly focusing on how JavaScript can be used to innovate in this domain.

Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases. It is built on Apache Lucene and is known for its scalability and performance.

The speaker found Elasticsearch hard to deploy, upgrade, and manage. It has a big memory footprint, high CPU consumption with more data, and is costly to run. Additionally, the speaker dislikes Java, which Elasticsearch is built on.

The speaker chose JavaScript due to its familiarity, ease of use, and the fact that it can be very performant when data structures and algorithms are implemented properly.

Orama is an evolution of the Lyra full-text search engine. It is open-source, free to use, and designed to be highly performant and easy to extend. The rebranding was mainly due to naming conflicts.

Orama supports faceted search, filtering, typo tolerance, field boosting, stemming, support for 26 languages, stop words, plugins, components, and hooks for customization.

Orama is designed to run on a CDN, leveraging its distributed nature to provide fast and scalable search capabilities. It also utilizes efficient algorithms and data structures optimized for JavaScript.

Yes, Orama can handle large datasets, but it is recommended to use the commercial solution of Orama for such big data due to the complexities and requirements involved.

The quote "What I cannot create, I do not understand" signifies the speaker's approach to learning by doing, which led to the creation of a new full-text search engine as a way to deeply understand the subject.

The talk discussed techniques like avoiding map/reduce-style array methods in performance-critical algorithms, understanding monomorphism vs. polymorphism, and optimizing code for the runtime environment. Specific libraries like to-fast-properties were also mentioned for performance enhancements.

Michele Riva

31 min

14 Apr, 2023

Video Summary and Transcription

This talk explores the creation of a full-text search engine in JavaScript, highlighting the challenges with existing search engines like Algolia and the advantages of using JavaScript. The speaker emphasizes the importance of code optimization and performance enhancement techniques in JavaScript. The talk also discusses the evolution of the Lyra search engine into the open-source project Orama, which offers a feature-rich and highly performant full-text search engine for JavaScript. The speaker addresses questions about language choice, scalability, and deployment, and showcases the benefits of deploying an immutable database to a CDN.

1. Introduction to Full-Text Search

Short description:

Welcome to my talk on Disrupting Full-Text Search with JavaScript. I love Elasticsearch because of its performance and scalability. Elasticsearch is built on Apache Lucene, a powerful full-text search library. However, I also love other search engines like Algolia, Meilisearch, and MiniSearch. I decided to recreate a search engine with my team to learn more and address personal issues I had with existing software, such as deployment difficulties, upgrades, memory usage, and high costs.

Welcome everyone to my talk, Disrupting Full-Text Search with JavaScript. I've already been introduced, so I won't proceed any further with that.

And I'm here to talk about full-text search because it's a domain that I love and something that really keeps me awake at night because I love it so much that I can't just stop thinking about it. And there is a good reason why I love it so much, and it's mainly because of Elasticsearch.

How many of you know Elasticsearch? Everyone. How many of you have used Elasticsearch? Again, almost everyone. And I gotta say I was introduced to open source software mainly because of Elasticsearch. So I have a very passionate relationship with it, and I had the pleasure and the honor to work on Apache Unomi, which is a customer data platform that uses Elasticsearch as the main database in its infrastructure. And when I was a bit more junior, like, I don't know, almost 10 years ago now, I was impressed by the performance of such a complex and distributed system. I was impressed to see that I could throw millions and millions of records against it and it wouldn't degrade the performance that much. That was seriously impressive to me, and this is where I decided to go into open source software and try to understand how Elasticsearch works.

So my first question as a curious junior engineer was: how is that even possible? I mean, how can a piece of software maintain such good performance even with a billion records? I later discovered that Elasticsearch is not actually a full-text search engine, but Apache Lucene is. Apache Lucene is the full-text search library, which Elasticsearch wraps by providing a RESTful interface, distributed system capabilities, sharding, data consistency, monitoring, cluster management, and so on and so forth. So big shout-out to Elasticsearch.

And before proceeding any further, let me please clarify that, again, I love Elasticsearch and I love Algolia. I love Meilisearch. I love MiniSearch. I love every single search engine out there. And of course, I'll be talking about something that I recreated with my team. The reason why I did that in the first place is because I wanted to learn more, and of course I wanted to solve some very personal issues that I had with such software. So nothing personal. Please, if you're using Elasticsearch, just continue using it if you're comfortable with it. There's no problem with that, of course. I was talking about the fact that I had some personal issues with Elasticsearch. My first personal problem was that it's pretty hard to deploy, in my opinion. It could be simplified. It's hard to upgrade. It has a big memory footprint. CPU consumption becomes terrible as soon as you add more data. It's really costly to manage and run.

2. Challenges with Java and Algolia

Short description:

I don't like Java. I prefer JavaScript. Algolia is expensive and hard to extend. Making simple software is extremely hard, but as engineers, we have to give it a try.

Hard to extend and customize. But most importantly, Java. I knew that people would laugh at this one. But it's a real concern, actually. Like, I don't like Java. I've been coding in Java for a bit. I prefer JavaScript forever and always. Also, I tried different solutions, such as Algolia, which is, again, an extraordinary piece of software. And I'm not even exaggerating here. The problem I had with Algolia is that it's incredibly expensive at scale. It's a big black box, right? It's closed source. And therefore, it's hard to extend and to understand what's going on with it. But again, as I said, these are my personal problems with them. And maybe when I had these problems in the first place, I was a bit too inexperienced in that domain. Elasticsearch and Algolia were a bit too much for me. Maybe it's worth it to have such problems, right? Because people are using them. So there must be a reason why. And I also do understand, now that I'm a bit more experienced, that making simple software is extremely hard. But I feel like, as engineers, we have to give it a try.

3. Learning by Doing and Choosing a Language

Short description:

I wanted to learn by doing, so I set myself a goal to give a talk on how full-text search engines work and create a new kind. I had to study algorithms and data structures extensively. Choosing a programming language was a challenge, but I ended up reimplementing in JavaScript and found it to be performant.

So I set myself some goals because I wanted to learn more again. And the only way I can actually learn is by following a Richard Feynman quote. That is, what I cannot create, I do not understand. So I wanted to learn by doing. And I set myself a goal. I wanted to give a talk on how full-text search engines work. And I wanted to create a new kind of full-text search engine. And I wanted it to be easy, massively scalable, and easy to extend. So three easy goals, right?

And of course, whenever you start trying to understand how full-text search engines work, you have to deal with the theory behind full-text search. So trees, graphs, n-grams, cosine similarity, BM25, TF-IDF, tokenization, stemming, and more and more and more. So you find yourself in that situation typically, right? Yeah, everything is fine. I'm understanding on the keyboard and hopefully something good will come up. Spoiler, it doesn't. But still, the hard truth, whenever you decide that you want to learn something like that, is that you need to study algorithms and data structures, and you need to study them a lot.

And of course, at a certain point you have to choose a programming language to implement them, right? And I wanted to be a cool guy, so I chose... oh no, that's the wrong one. And I saw people in the audience being like that when I showed the Haskell logo, right? Of course I didn't choose Haskell, even though I love it. I tried to implement it in Rust in the first place. It was too complex, I gotta say. I started saying, oh yeah, I miss the garbage collector, so maybe I can try Golang, right? I tried Golang for a bit, and it was decent, but still pretty complex to implement. And then I remembered the law, Atwood's Law. You know who Atwood is? He's the co-founder of Stack Overflow. So: any application that can be written in JavaScript will eventually be written in JavaScript. And there is nothing more true than that. So JavaScript is the king of programming languages. We all know that. Please, big round of applause to JavaScript. Yeah, it's worth it. So, yeah, I started to reimplement stuff in JavaScript by translating the source code I had started writing in Rust and then Golang. And surprisingly, I discovered that, wow, it was very performant when I started to implement data structures in a proper way. And this is the first takeaway I'd like you to bring home from this talk.

4. Enhancing Performance and Code Optimization

Short description:

There is no slow programming language, just poorly designed algorithms and data structures. We will see benchmarks to prove this point. I want to give you examples on how to enhance your performance in JavaScript. When writing a full-text search engine, you have to deal with boolean queries and compute the intersections of arrays. Using map/reduce is handy but not the most performant. Optimizing the algorithm by removing map/reduce and using plain loops can result in a 13% performance enhancement. Knowing your language and runtime is crucial for code optimization. Understanding the difference between monomorphism and polymorphism is important.

In my opinion, and that's, again, a very personal opinion, there is no slow programming language, just poorly designed algorithms and data structures. We will see some benchmarks later to prove this point. But, please, I really want this to be a takeaway for this talk. We are at a Node conference, right? So, hopefully, this can give us hope.

So, yeah, I also want to give you some examples of how to actually enhance your performance in JavaScript, Node, Deno, whatever. So, when you're writing a full-text search engine, at a certain point, you have to deal with boolean queries, right? And, or, etc. And at a certain point, you have to compute the intersections of arrays. So, you have multiple arrays and you have to determine the elements that appear in every single array and return them. This is an example using the map/reduce paradigm. And as a former, and I'm highlighting, former functional programming guy, I used to love this a lot. And yeah, you basically, I see Matteo there, he's face palming and you're the reason why I don't code functionally anymore. I hope you know this. We will get there later. But anyway, using map/reduce is very handy, in my opinion, but it's not the most performant way to deal with that kind of algorithm.

So, this is, let's call it version one. We can optimize it, sweeping away all the map/reduce stuff and using plain iteration. And then we have to understand how JavaScript works and how algorithms work. And maybe we can have version two, optimize it even more, and go to version three. So, it's not a matter of parity, it's a matter of algorithms. Here, basically, we are just adding a single line that starts the intersection from the smallest array. It's a very simple performance tuning that you can apply to these kinds of algorithms. When you run the benchmarks, and there is the reference for the benchmarks, so you can run them yourselves, you will see that you get something like a 13% performance improvement just by removing the map/reduce stuff and using plain loops, because that's how, at the end of the day, computers tend to work.
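The slides are not part of this transcript, so the following is only a sketch of what the versions described above might look like, not the code shown in the talk. The names intersectFunctional and intersectLoops are made up for illustration, and using a Set for lookups is an assumption of this sketch.

```typescript
// Version 1: functional style. Each step allocates a new array and
// rescans `current` with indexOf, so the work grows quickly.
function intersectFunctional<T>(arrays: T[][]): T[] {
  if (arrays.length === 0) return [];
  return arrays.reduce((acc, current) =>
    acc.filter((item) => current.indexOf(item) !== -1)
  );
}

// Later versions: plain loops, starting from the smallest array.
// The intersection can never contain more elements than the smallest input,
// so iterating over it keeps the number of lookups as low as possible.
function intersectLoops<T>(arrays: T[][]): T[] {
  if (arrays.length === 0) return [];

  // Find the index of the smallest array.
  let smallest = 0;
  for (let i = 1; i < arrays.length; i++) {
    if (arrays[i].length < arrays[smallest].length) smallest = i;
  }

  // Build one lookup set for each of the other arrays (assumption of this sketch).
  const lookups: Set<T>[] = [];
  for (let i = 0; i < arrays.length; i++) {
    if (i !== smallest) lookups.push(new Set(arrays[i]));
  }

  const base = arrays[smallest];
  const emitted = new Set<T>();
  const result: T[] = [];
  for (let i = 0; i < base.length; i++) {
    const item = base[i];
    if (emitted.has(item)) continue; // skip duplicates in the base array
    let inAll = true;
    for (let j = 0; j < lookups.length; j++) {
      if (!lookups[j].has(item)) {
        inAll = false;
        break;
      }
    }
    if (inAll) {
      emitted.add(item);
      result.push(item);
    }
  }
  return result;
}
```

Both functions return the same elements for inputs without duplicates, although the order can differ because the loop version follows the order of the smallest array. How much faster the loop version is depends on the data and the runtime, so it is worth running your own benchmark.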

I want to give you other examples, but before doing that, as I said, I built a full-text search engine, and with this PR alone, we improved performance by 50%, five zero; 50% of the overall full-text search engine performance came just from taking care of how intersections are computed. That's why I wanted to bring this example to the table. But another thing that I'd like you to think of is, please, you have to know your language, your runtime, and how to optimize your code for the runtime you're executing it on. And I can give you a couple of examples of that. First of all, who knows the difference between monomorphism and polymorphism? I'm sure there are more people than that. So basically monomorphism is when you create a function that takes arguments, from one to infinitely many, and you always call that function with the same argument types. So if you have an add function where you compute the addition of two numbers, you will always pass numbers to that function. So it's monomorphic.

5. Polymorphic Functions and Performance Optimization

Short description:

In JavaScript, the same operator is used for concatenating strings and adding numbers, which makes it easy for functions to become polymorphic. Polymorphic functions inside loops decrease performance. Check out the Optimization Killers wiki on GitHub for more performant code. The to-fast-properties library ensures object shape modifications don't slow down performance. By calling the FastObject function inside a loop, the inline cache of the V8 engine is warmed up. JavaScript can be highly performant with optimization. I created a self-contained full-text search engine.

There is just one shape for the data that you're passing to the function arguments. But of course, in JavaScript, concatenating strings and computing the addition of numbers use the same operator, the plus. So you could easily use the add function to concatenate strings. This is what makes the function polymorphic. You will soon discover that if you use polymorphic functions inside loops, you are decreasing performance for many different reasons that we'll show in just a second.
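To make the distinction above concrete, here is a small sketch. The function names are made up for illustration, and addLoose is typed with any on purpose, to mimic untyped JavaScript where nothing stops a caller from mixing numbers and strings.

```typescript
// Monomorphic: every call site passes (number, number), so the engine
// only ever sees one combination of argument types.
function add(a: number, b: number): number {
  return a + b;
}

function sumMonomorphic(values: number[]): number {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    total = add(total, values[i]);
  }
  return total;
}

// Polymorphic: the same `+` silently switches between numeric addition
// and string concatenation depending on what it receives.
function addLoose(a: any, b: any): any {
  return a + b;
}

function accumulatePolymorphic(values: any[]): any {
  let acc: any = 0;
  for (let i = 0; i < values.length; i++) {
    acc = addLoose(acc, values[i]);
  }
  return acc;
}

console.log(sumMonomorphic([1, 2, 3, 4])); // 10
console.log(accumulatePolymorphic([1, 2, "3", 4])); // "334"
```

The second result shows the problem from two angles: the function now sees several type combinations inside a hot loop, and the output quietly turns into a string. Keeping hot functions monomorphic, for instance by letting TypeScript enforce the argument types, avoids both.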

There is an awesome repository on GitHub that has an awesome wiki called Optimization Killers. It's all about TurboFan. This is basically a list of stuff that I'd love you to explore if you want to learn more about how to actually create more performant code for Node.js, specifically. There is a link up there, so you're free to take a picture and look at it later on.

There is one specific library that exposes an API that is really crazy to me. When my colleague Paolo showed that to me, I was like, oh, my God, no, please, you've got to be kidding me. The library is called to-fast-properties. If you have an object and you're modifying the shape of the original object, this is slowing down everything. There is this library that basically makes sure that every time you edit the shape of an object, it keeps being performant, right? How does it do that? That must be some crazy metaprogramming, reflection kind of stuff going on, right? Actually, no. It's 31 lines of code with a bunch of comments. Can you see how it does it? What if I highlight lines 24 to 27? What's going on here? So, basically, we are just calling the FastObject function inside a loop, and that warms up the inline cache of the V8 engine. And that's crazy. I mean, this breaks on JavaScriptCore. So if you're doing something for Safari, for example, or for Bun, this is not working great. But for Node.js or Deno, that's exceptional. This is a serious performance improvement here. Once you know how to optimize it, JavaScript is very, very performant. And I'm bringing you some benchmarks now, in just a second. You can easily make your code work in the microseconds area. We will see that in just a minute. But before doing that, I was talking about how I wanted to create a full-text search engine right at the beginning. So let's go down here for a second. I had the honor to talk at the WeAreDevelopers World Congress in Berlin last year, and I gave this talk. And I made a very tiny self-contained full-text search engine.

6. Lyra: From Open Source Project to Company

Short description:

This was crappy as hell. It was very bad, but it has some potential. We cleaned it a bit and open sourced Lyra, a full-text search engine for JavaScript. The project gained popularity on GitHub, leading us to spin off a company called Orama, focused on open source full-text search. Orama is the next evolution of Lyra and is completely free for use.

This was crappy as hell. It was very bad, but it had some potential. So we cleaned it up a bit. I was working in a team at NearForm led by Matteo. I was very happy and proud to be part of this team. I had Paolo, who is a Node.js contributor, Rafael Silva, now on the Node.js TSC, and Cody Zuchlag, who, other than being an awesome developer, is also a professor in the NSU university.

And working with them, we decided to optimize it a bit, make it a bit prettier, and open source it. So we open sourced Lyra. So Lyra was a full-text search engine meant for running wherever JavaScript runs. And we will see how it does in just a second. One nice thing that happened, and I'd like this to be kind of an inspirational story, if you will. Some guy took the link for the Lyra project and put it on Hacker News, without telling me. And the day after they did that, I discovered that we had like 3,000 stars on GitHub, which was crazy for me. I never had any repository going that fast on GitHub. And we decided that maybe it was time to create something around it. So I talked with my former boss at NearForm, and we decided that, yeah, we had to spin off a company around that. And I'll get there in just a second.

I had the chance to take two amazing professionals from NearForm: Paolo, who spoke today on this very stage and who is now working with me as a principal architect, and Angela, who is the best designer you can ever dream of. And me, of course. But still, we needed someone with knowledge of the domain, with knowledge of how business works. And asking around, we were pointed to a person who has now joined the company as CEO and co-founder, an early Node.js person who created OpenShift, StrongLoop, Scalar, many beautiful startups, and has served IBM as a CTO. And now he's our, I'm proud to say, co-founder and CEO, Sir Isaac Roth. And I'm very proud to say that we founded Orama together. So Orama is the next evolution of Lyra. And this is where we want to bring full-text search, which is open source and free. I know that when I say companies, people may think, oh no, now you're commercializing something. Actually not. This is open source. You can use it.

7. Introduction to Orama

Short description:

Orama is an open source, free-to-use full-text search engine for JavaScript. It's licensed under Apache 2.0 and can be downloaded using npm. With Orama, you can create a strongly typed movie database, insert data, and perform searches based on selected properties or all properties. Orama stands out among other full-text search libraries like MiniSearch, Fuse.js, and FlexSearch, which are also excellent choices for developers.

It's free. And we want to bring a new paradigm to full-text search. And I'm going to show you how. So first of all, using Orama, it's pretty simple. It's open source. It's free for use. It's licensed under the Apache 2.0 license. And you can download it using NPM.

So once you import it, let's say you create a movie database. So you must have a schema, which is strongly typed. And you may say, OK, I will have my title, a director name, some metadata, such as rating, which is number, has won an Oscar, which is a Boolean. And then you have to insert stuff, like my favorite movie ever, The Prestige. Is there anyone else sharing this passion for this movie? Wow, amazing. No, no. Why are you saying no? We need to discuss this later, you know. OK. Anyway, anyway, this is the best movie ever, in my opinion. And yeah, anyway, that's not why we are here. We will discuss this later. We insert stuff, and then we search for stuff. So let's search for Prestige. And we select the properties we want to search in. Or we can use the star operator to search through all the properties. And that's really it. That's how it works.
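The create, insert, and search flow described above can be illustrated with a toy inverted index. To be clear, this is not Orama's API or implementation: createIndex, insertMovie, and searchMovies are hypothetical names, the index only does exact token matching, and the movie ratings are illustrative values.

```typescript
type Movie = {
  title: string;
  director: string;
  meta: { rating: number; hasWonOscar: boolean };
};

type TextProperty = "title" | "director";
const TEXT_PROPERTIES: TextProperty[] = ["title", "director"];

interface ToyIndex {
  docs: Movie[];
  // property -> token -> ids of the documents containing that token
  postings: Record<TextProperty, Map<string, Set<number>>>;
}

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

function createIndex(): ToyIndex {
  return { docs: [], postings: { title: new Map(), director: new Map() } };
}

function insertMovie(index: ToyIndex, movie: Movie): number {
  const id = index.docs.push(movie) - 1;
  for (const prop of TEXT_PROPERTIES) {
    for (const token of tokenize(movie[prop])) {
      const ids = index.postings[prop].get(token) ?? new Set<number>();
      ids.add(id);
      index.postings[prop].set(token, ids);
    }
  }
  return id;
}

// `properties` is either a list of properties to search in, or "*" for all of them.
function searchMovies(
  index: ToyIndex,
  term: string,
  properties: TextProperty[] | "*" = "*"
): Movie[] {
  const props = properties === "*" ? TEXT_PROPERTIES : properties;
  const hits = new Set<number>();
  for (const token of tokenize(term)) {
    for (const prop of props) {
      const ids = index.postings[prop].get(token);
      if (ids) ids.forEach((id) => hits.add(id));
    }
  }
  return Array.from(hits)
    .sort((a, b) => a - b)
    .map((id) => index.docs[id]);
}

const db = createIndex();
insertMovie(db, {
  title: "The Prestige",
  director: "Christopher Nolan",
  meta: { rating: 8.5, hasWonOscar: false }, // illustrative values
});
insertMovie(db, {
  title: "Big Fish",
  director: "Tim Burton",
  meta: { rating: 8.0, hasWonOscar: false }, // illustrative values
});

const byTitle = searchMovies(db, "prestige", ["title"]);
const everywhere = searchMovies(db, "nolan", "*");
```

A real engine adds what this sketch leaves out, such as stemming, typo tolerance, and relevance scoring, but the shape of the workflow is the same: define the fields, insert documents, then query selected properties or all of them.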

But now you may be thinking, what makes Orama so special, right? We have multiple full-text search libraries built in JavaScript. We have MiniSearch, we have Fuse.js, we have FlexSearch. And let me tell you, these are all exceptional libraries. If you're already using them right now, and they're working well for you, just keep on using them. They are fantastic.

8. Orama's Feature Set and Advantages

Short description:

We have a huge feature set in Orama, including support for facets, filtering, field boosting, and more. You can customize Orama using hooks and components. Orama works on any JavaScript runtime, except for Rhino. It's extendable in plain JavaScript and provides insanely fast search times measured in microseconds.

Really. And they gave me a lot of inspiration in building what we created today. So, nothing bad about them. I just want to learn more and create something different. That's really it.

But as I said, we wanted to create something different. So, we created a huge feature set. And this is not even everything we cover. But I think this is the most important stuff we cover. Like, we have support for facets, filtering, typo tolerance, field boosting, stemming, 26 languages out of the box, stop word support, plugins, components, and hooks, which are not related to React.

I mean, hooks means, for example: before I insert stuff, do something; after I insert stuff, do something; before I tokenize, after I tokenize. There is a lot that you can customize about Orama. And the same applies to components. Like, for example, if you index numbers, by convention, we did some benchmarks and we figured out that AVL trees are the data structure that is most optimized for numbers in our case. But you might have a different use case. So, we export an opaque interface, so you can bring your own data structure, which also means that if you want to test your skills and create new data structures, you can use our test suite to test your algorithms and data structures. That's just easy and fun to do, I guess.
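As a rough illustration of the hook idea described above, here is a minimal sketch of a store that runs callbacks before and after each insert. The HookedStore class and the event names are hypothetical and chosen for this example; they are not taken from Orama's actual hook API.

```typescript
type Doc = Record<string, unknown>;
type Hook = (doc: Doc) => void;

interface Hooks {
  beforeInsert: Hook[];
  afterInsert: Hook[];
}

class HookedStore {
  private docs: Doc[] = [];
  private hooks: Hooks = { beforeInsert: [], afterInsert: [] };

  // Register a callback for one of the supported events.
  on(event: keyof Hooks, hook: Hook): void {
    this.hooks[event].push(hook);
  }

  insert(doc: Doc): void {
    // Hooks run in registration order and may mutate the document.
    for (const hook of this.hooks.beforeInsert) hook(doc);
    this.docs.push(doc);
    for (const hook of this.hooks.afterInsert) hook(doc);
  }

  size(): number {
    return this.docs.length;
  }
}

const store = new HookedStore();
const events: string[] = [];

// Before insert: normalize the title.
store.on("beforeInsert", (doc) => {
  const title = doc["title"];
  if (typeof title === "string") doc["title"] = title.trim();
  events.push("before");
});

// After insert: record that the document was stored.
store.on("afterInsert", () => {
  events.push("after");
});

const movie: Doc = { title: "  The Prestige " };
store.insert(movie);
```

The same pattern extends to tokenization or search: the engine exposes named extension points and calls whatever the user registered, so behavior can be customized without forking the library.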

Nice thing about Orama: it works wherever JavaScript runs. It works on Cloudflare Workers. We don't have any dependencies and we made it compatible with literally every single runtime out there, except for Rhino. Is there anyone here who knows Rhino? Oh my god, I'm so sad for you. I've been working with it for five years. You have no idea. No, maybe you have. Maybe you have. Apart from that, I feel like the biggest advantage of using Orama, if you need full-text search in your application, is that it's extendable in plain JavaScript. We have the built-in hooks and component systems, but it's not just that. It is insanely fast and we measure time, search time, in microseconds. Don't you believe me? Maybe it's time for a little demo.

9. Database Indexing and Querying

Short description:

Let me give you an example. We have a database with 20,000 products that we will index in the browser and run queries on. We also deployed the same database on our own CDN for comparison. Query times are incredibly fast, with results returned in milliseconds and even microseconds. Running on a CDN means you're getting charged for each microsecond your CPU runs, unlike AWS where the minimum charge is 1 millisecond. Additionally, there are some missing images due to corrupted data, but you still pay for the data transfer from your CDN.

Let's see. Wow. It disappeared. Oh, here it is. Okay. Let me give you an example. So I have a dataset that I took from kaggle.com. It's basically 20,000 products full of titles, descriptions, prices, links to images, reviews, whatever. We will index 20,000 products in the browser right now and we will run some queries on them.

Just to be honest, I also deployed the exact same database on our own CDN. We will talk about that later so we can do some comparison. Right. So are you ready? Let's populate 20,000 products into the database. Should take around three seconds. Yeah, it did. And now we're ready to make some queries. So let's make a query. Okay. Wow. 17 milliseconds. That's good. Let's try with, I don't know, horse. 175 microseconds. Why am I highlighting this? Because you are running on a CDN, on a worker, and you're getting invoiced for every single microsecond that your CPU runs. So this is how much you're getting invoiced for running this, right? On AWS, for example, that's not true. The minimum is 1 millisecond. But still, we're running on a CDN.
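To see why the billing granularity mentioned above matters for a 175-microsecond query, here is a small arithmetic sketch. The billedMicroseconds function is a made-up helper, and rounding up to the next billing increment is an assumption of this sketch; actual pricing models vary by provider.

```typescript
// Round CPU time up to the next multiple of the billing granularity.
function billedMicroseconds(cpuMicros: number, granularityMicros: number): number {
  return Math.ceil(cpuMicros / granularityMicros) * granularityMicros;
}

const queryMicros = 175; // the "horse" query from the demo

// Billed per microsecond: you pay for exactly what you used.
const perMicrosecond = billedMicroseconds(queryMicros, 1); // 175

// Billed with a 1 ms minimum: the same query is rounded up to 1,000 microseconds.
const perMillisecond = billedMicroseconds(queryMicros, 1000); // 1000

console.log(perMicrosecond, perMillisecond);
```

Under these assumptions, the same query is billed for roughly 5.7 times more CPU time with a 1 ms minimum than with per-microsecond billing, which is why sub-millisecond search times translate directly into cost.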

One thing to notice is that, like, now we have 10 images on this page, right? Some images actually are missing because of the corrupted data. But you're paying for the egress from your CDN, right? So you're paying for like 1 megabyte. Sorry, 1.8 megabytes for your data transfer.

10. Orama: Cheaper, Better, Faster

Short description:

Orama is a cheaper, better, faster enterprise search. It runs on our own CDN, eliminating the need for cluster management, server provisioning, and performance degradation. It's as simple to use as a JavaScript library. You can integrate your own large language model with it and let Orama do its magic. We're hiring a staff engineer working mostly on Node.js and Rust. Thank you for being here!

And you're paying 12 kilobytes for Orama payload, which is basically free, right? But if your data set is small enough, like 20,000 products, in my opinion, is small enough, you can just click here. It takes 100 microseconds, and you're not paying anything because it runs on the browser, thank you.

And of course, as I said, we have a nice feature set. So just to give you an example, if horse, I don't know why I typed horse, appears in the description, I can multiply its score by, I don't know, 1.8. Zero nanoseconds, yeah, that's basically, we can't even measure it because of how fast it is. And if you don't believe me, that's open source. The way we measure performance is in our GitHub repo. Sometimes I don't trust myself and I go read this code, and it actually works. I asked multiple people, and so I have to believe it, I guess. I don't know, you can filter by price, you can create facets around that. Let me see. Okay, there's no data, but basically it's all between one and 50, whatever. But this is just to show you the capabilities of running something at the edge, basically for free. So that's what I wanted to share with you. Maybe now you can believe me that it can run in microseconds, right?
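The boosting step from the demo, multiplying a match by 1.8 when it appears in the description, can be sketched as follows. This is a deliberately naive scorer that counts term occurrences per field; it is an assumption made for illustration and not how Orama ranks results. The names scoreProduct and Boost are hypothetical.

```typescript
type Product = { title: string; description: string };
type Boost = Partial<Record<keyof Product, number>>;

const FIELDS: (keyof Product)[] = ["title", "description"];

function countOccurrences(text: string, term: string): number {
  const needle = term.toLowerCase();
  const tokens = text.toLowerCase().split(/[^a-z0-9]+/);
  let count = 0;
  for (const token of tokens) {
    if (token === needle) count++;
  }
  return count;
}

// Score = sum over fields of (occurrences in that field * field weight).
// Fields without an explicit boost get a weight of 1.
function scoreProduct(product: Product, term: string, boost: Boost = {}): number {
  let total = 0;
  for (const field of FIELDS) {
    const weight = boost[field] ?? 1;
    total += countOccurrences(product[field], term) * weight;
  }
  return total;
}

const saddle: Product = {
  title: "Horse saddle",
  description: "A saddle for your horse",
};

const plain = scoreProduct(saddle, "horse"); // one hit per field, weight 1 each
const boosted = scoreProduct(saddle, "horse", { description: 1.8 }); // description hit weighs 1.8
```

With the boost applied, a product whose description mentions the term ranks above an otherwise identical product that only mentions it in the title, which is the effect the demo is showing.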

So at its core, Orama is cheaper, better, faster enterprise search. And the nice thing about it is that given that it runs on our own CDN, maybe yours, we should discuss this, right? You have no cluster management, no server provisioning, no performance degradation, because the CDN scales it for you, right? And it's as simple to use as a JavaScript library. But that's not all. You can integrate your own large language model with it and let Orama do its magic for you. That's just GPT, for example, right? So that we are all on the same page. So I guess I'm finishing my time. So if you want to learn more, please give us a star on GitHub, subscribe to our newsletter on the oramasearch.com website, and you can find whatever you need here. And of course, I'll be around if you have a specific use case to share with us so we can collect feedback. I've got nothing to sell you right now. I just want to collect feedback. And we are also hiring. So if you're interested, we're hiring a staff engineer working mostly on Node.js and Rust. And if you're interested, we are full remote, unlimited PTO, generous stock plan, and that's it. So thank you so much for being here. It has been an honor. Thank you, Michele.

QnA

Questions and Language Choice

Short description:

Awesome talk. Thank you. The first part of the talk gave me serious JSPerf vibes. Java, that was the first one. Why didn't you choose Haskell? Serious question... I mean, serious answer? Up to you. If you want to run in a browser, you don't want to use PureScript. Just use either JavaScript itself or use TypeScript in a way that you can only strip types away and have pure JavaScript out of it. That's how I suggest writing JavaScript. Yeah, next question, how does it really scale? Oops. Yeah, and it's gone...

Awesome talk. Thank you. And we have a few questions for you. And by the way, the first part of the talk... Yeah, the first one was not really a question, but it was about Java. Alberto, I see you in the audience. I see the second one. Oh, I can't believe it. Sorry. Personal rant. Oh, no. The first part of the talk gave me serious JSPerf vibes. I don't know if anyone else feels like that. I guess it's a nice thing. Yeah.

Java, that was the first one. Why didn't you choose Haskell? Serious question... I mean, serious answer? Up to you. I actually used to code Haskell for a bit. I was not good at it. The problem is that if you want to run in a browser, you don't want to use PureScript. You know you can compile Haskell to JavaScript. It's not worth it, in my opinion. It's not worth compiling any language to JavaScript. Just use either JavaScript itself or use TypeScript in a way that you can only strip types away and have pure JavaScript out of it. That's how I suggest writing JavaScript. Don't use any other language for it. Yeah, got it.

Yeah, next question, how does it really scale? Oops. Yeah, and it's gone...

Full-Text Search on PubMed with Orama

Short description:

We can make full-text search on PubMed with Orama, an in-memory full-text search engine. It provides an ultra scalable solution for big data sets, allowing you to deploy it as you prefer. Deploying that much data on a CDN is dangerous and hard to achieve, but we found a way to make it work. Let's discuss it further, although I cannot disclose the details publicly yet.

Yeah, I will revert it in one second. How it really scales. Could we make full-text search on PubMed? 35 million articles downloadable for free. 50 gigabytes of free articles in the medical field. Sure you can. In the same way you can do it... Of course you are not doing that in a browser because you don't have that much RAM. This is an in-memory full-text search engine. The best way to do that, in my opinion, is to... I'm really sad to say that but this is... how we want to commercialize Orama, all right? By providing an ultra scalable solution for your big data sets. Because of course you can use the open source project and deploy it as you prefer. You can deploy it to a CDN, but that much data on a CDN, that's pretty dangerous and that's pretty hard to achieve. We found a way to achieve it. So we should definitely discuss this. I'm very sorry I cannot disclose this publicly yet. I'm sorry, that's a very bad answer.

Testing Text Search in Orama

Short description:

To ensure the correctness of the text search, we have several unit tests and generative tests. However, determining if the search is working properly can be challenging due to factors like context and stemming. While Algolia and Elastic may yield different results, Orama allows you to inject your code and customize the search to your preferences using hooks and components.

All right. How do you test that the text search works? Okay, we have several unit tests in the first place. We have several generative tests, and every time we implement a new feature we make sure to generate even more unit tests on real data to ensure its correctness. The biggest problem is determining whether search is working properly or not, because context matters, stemming matters, and if you run the same search on Algolia and Elastic it's very unlikely to have the exact same results, for multiple reasons. So how do you determine if it's working or not? A nice fact about Orama is that you can actually inject your code and make it work and search how you prefer. So I can only assure you that it's working fine because we have a community adoption that proves it, we have unit tests and integration tests that prove it, but if it's not working how you want it to work you can always use the hooks and components to make it work as you prefer.
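As a rough idea of what a generative test can look like, the sketch below inserts randomly generated documents into a tiny inverted index and checks one property: every token of every inserted document must be findable. `TinyIndex`, `makeRandom`, `randomDocument`, and the vocabulary are all hypothetical stand-ins written for this example; this is not Orama's test suite or API.

```typescript
// Hypothetical minimal inverted index, standing in for a real search engine.
class TinyIndex {
  private postings = new Map<string, Set<number>>();

  insert(id: number, text: string): void {
    for (const token of tokenize(text)) {
      let ids = this.postings.get(token);
      if (ids === undefined) {
        ids = new Set<number>();
        this.postings.set(token, ids);
      }
      ids.add(id);
    }
  }

  search(term: string): number[] {
    const ids = this.postings.get(term.toLowerCase());
    return ids === undefined ? [] : Array.from(ids);
  }
}

function tokenize(text: string): string[] {
  return text.toLowerCase().split(/\s+/).filter((t) => t.length > 0);
}

// Small seeded pseudo-random generator (linear congruential),
// so a failing run can be reproduced from its seed.
function makeRandom(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Builds a document of 1 to 8 words drawn from the vocabulary.
function randomDocument(random: () => number, vocabulary: string[]): string {
  const length = 1 + Math.floor(random() * 8);
  const words: string[] = [];
  for (let i = 0; i < length; i++) {
    const word = vocabulary[Math.floor(random() * vocabulary.length)];
    if (word !== undefined) words.push(word);
  }
  return words.join(" ");
}

// Property: searching for any token of an inserted document returns that document's id.
function checkInsertedTokensAreSearchable(seed: number, documentCount: number): boolean {
  const random = makeRandom(seed);
  const vocabulary = ["search", "engine", "index", "token", "query", "edge"];
  const index = new TinyIndex();
  const inserted: { id: number; text: string }[] = [];

  for (let id = 0; id < documentCount; id++) {
    const text = randomDocument(random, vocabulary);
    inserted.push({ id, text });
    index.insert(id, text);
  }

  for (const doc of inserted) {
    for (const token of tokenize(doc.text)) {
      if (index.search(token).indexOf(doc.id) === -1) return false;
    }
  }
  return true;
}
```

A property like this only checks recall of exact tokens. As noted above, judging ranking quality is harder, because stemming and context change what counts as a correct result.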

Orama's Rebranding and Name Meaning

Short description:

Orama is a rebranded version of Lyra due to naming conflicts. The name Orama means 'to see' in Greek and is a fun name to use in English. The decision to rebrand was primarily to avoid confusion and incorporate the company under a unique name.

Awesome, next one I think this one will go. Yeah, will Orama be different from Lyra? Speaking about features. Oh, not at all. We just had to rebrand the name basically because Lyra was a codec from Google and we had some problems with naming. Also there are many companies called Lyra where we incorporated the company. So we decided to go with Orama which in Greek means to see and I feel like Orama is also a fun name, right? To use in English. That's what they told me, I'm Italian. I feel like I don't know. I have good feelings about the name but that's just a name rebranding actually. It's nothing more than that.

Garbage Collector Penalty and Deployment

Short description:

The garbage collector penalty is significant for high volume workloads, but it's not a concern when running on a CDN. Large datasets in the browser can be problematic, so Algolia or Elasticsearch are better options for monolithic deployments on Node.js servers. Search engines in JavaScript should run at the edge or on the browser.

Yeah, question about garbage collector. How significant is a garbage collector penalty in your opinion for high volume workloads? So whenever you run on a CDN, you don't really care about that at all, because every time you make a query you're likely to hit a different node or a node that is already warmed by the previous request and it's always cached. So the fact is, garbage collection penalty is bad if you have a large data set in the browser, but you're not likely to have a large data set in the browser. I feel like having a monolithic deployment on a Node.js server, it's not good. If you want that, please use Algolia or Elasticsearch. They are good at this, and also because you would have to deal with consensus protocols for distributed data, data consistency. You have to ensure it. So my opinion is that search engines built in JavaScript should run at the edge or in the browser. That's really it.

Reasons for Choosing Imperative JavaScript

Short description:

I used to write functional JavaScript, but I found that imperative JavaScript is faster and better in many ways. Numbers show that imperative code outperforms object-oriented or functional code. Although I may not have been writing good object-oriented or functional JavaScript, my imperative code works better.

Okay, sounds cool. Yeah, there are a few questions. Why don't you use a functional paradigm anymore? Because of Matteo Collina, who is here smiling. No, because he's right. I mean, it's not the most efficient way to write JavaScript code. Before I joined NearForm and his team, I was working with mainly functional JavaScript. Then I discovered that this is not the best way to write JavaScript. And if you count the number... I mean, numbers are what matters at a certain point, and numbers show that imperative JavaScript is way faster and better in so many ways than object-oriented or functional JavaScript. That's probably because I was writing terrible object-oriented or functional programming JavaScript. But how I write it, my imperative code works better. So that's why it's imperative.
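As a small illustration of the difference being described, the sketch below computes the same value twice: once with a functional chain and once with an imperative loop. The function names and the task are hypothetical, and no timings are claimed here; actual results depend on the engine and the workload, so measure on your own data.

```typescript
// Hypothetical task: sum the lengths of all tokens longer than 3 characters.

// Functional style: filter and map each allocate an intermediate array.
function totalLengthFunctional(tokens: string[]): number {
  return tokens
    .filter((t) => t.length > 3)
    .map((t) => t.length)
    .reduce((acc, n) => acc + n, 0);
}

// Imperative style: a single pass with no intermediate arrays.
function totalLengthImperative(tokens: string[]): number {
  let total = 0;
  for (const token of tokens) {
    if (token.length > 3) {
      total += token.length;
    }
  }
  return total;
}
```

Both functions return the same result. The imperative version does less allocation per call, which is the kind of saving that can add up in a hot path such as indexing or scoring.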

Benefits of Deploying Immutable Database to CDN

Short description:

There is a question on benchmarks comparing to Elasticsearch. We are about to release them. Deploying an immutable database to a CDN offers advantages, such as personalized search experiences. Boosting documents instead of fields can increase conversion for users. Thank you for joining the Q&A room.

Yeah, makes sense. Yeah, there is also a question on benchmarks comparing to Elasticsearch. Do you have any references?

We are about to release them, actually, yes. And not yet. They're not public. We are about to release them.

All right, I think I need to choose a last one. Let me quickly... Oh, that's a nice one. Yeah, absolutely. This is nice because of one thing that I didn't mention. But as for now, I need to. If you deploy an immutable database to a CDN, you have some advantages. For example, if no one is searching, you're not paying anything. Which means that if you have 1000 users, you can build 1000 different indexes, deploy them to a CDN, and you're not paying anything until they start searching. Which means that every single user can have a different and customizable and personalized search experience. Right? So yes, you can boost documents instead of fields. And you can say like, Michele likes, I don't know, violet t-shirts. So whenever he searches for t-shirts, he will have boosted fields for violet and t-shirts together, right? Because I have maybe a search history or conversion history on e-commerce. Maybe other people like Paolo, I always joke with him because he likes yellow t-shirts. If I search t-shirts, I will have an index that will boost violet t-shirts. Paolo will have an index that will boost yellow t-shirts. As you can see, this is increasing conversion for your users, hopefully. And so yes, definitely, you can boost documents and you should boost documents instead of fields.
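The t-shirt example can be sketched as follows. This is a simplified illustration under stated assumptions, not Orama's API: the `Product` and `UserProfile` types, the `colorBoosts` map, and `searchWithUserBoost` are hypothetical, and the boost is applied at query time for brevity, whereas the answer above describes building a separate index per user and deploying it to a CDN.

```typescript
interface Product {
  id: string;
  title: string;
  color: string;
}

// Hypothetical per-user profile: maps a color to a score multiplier,
// for example derived from search or conversion history.
interface UserProfile {
  name: string;
  colorBoosts: Map<string, number>;
}

function searchWithUserBoost(products: Product[], term: string, profile: UserProfile): Product[] {
  const needle = term.toLowerCase();
  const scored: { product: Product; score: number }[] = [];

  for (const product of products) {
    // Naive substring match stands in for real full-text matching.
    if (product.title.toLowerCase().indexOf(needle) === -1) continue;
    // Documents matching the user's preferences get a higher score.
    const boost = profile.colorBoosts.get(product.color) ?? 1;
    scored.push({ product, score: boost });
  }

  scored.sort((a, b) => b.score - a.score);
  return scored.map((entry) => entry.product);
}

const catalog: Product[] = [
  { id: "1", title: "Yellow t-shirt", color: "yellow" },
  { id: "2", title: "Violet t-shirt", color: "violet" },
  { id: "3", title: "Violet hat", color: "violet" },
];

const michele: UserProfile = { name: "Michele", colorBoosts: new Map([["violet", 2]]) };
const paolo: UserProfile = { name: "Paolo", colorBoosts: new Map([["yellow", 2]]) };
```

With these profiles, the same query for "t-shirt" ranks the violet t-shirt first for Michele and the yellow one first for Paolo, while the violet hat is excluded for both because it does not match the term.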

All right, thank you very much. Thank you. There are much more questions. I just cannot, yeah. I don't know how it's called, but I'll be there. Yeah, exactly. It's a Q&A room. So everyone, please join in with the Q&A room. Thank you.
