So inside the demo we are rendering all of this on top of a canvas element. We draw the current webcam frame onto the canvas, and then we render our landmarks on top of it, so we can see exactly which landmarks the model actually gives us for each frame. This is also where we are primarily processing the actual footage, as sketched below.
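A minimal sketch of that rendering step, using the drawing helpers from @mediapipe/drawing_utils; the element ids "input_video" and "output_canvas" are assumptions for illustration, not taken from the demo code:

```js
import { drawConnectors, drawLandmarks } from '@mediapipe/drawing_utils';
import { HAND_CONNECTIONS } from '@mediapipe/hands';

const canvasElement = document.getElementById('output_canvas');
const canvasCtx = canvasElement.getContext('2d');

function onResults(results) {
  canvasCtx.save();
  canvasCtx.clearRect(0, 0, canvasElement.width, canvasElement.height);
  // Draw the current webcam frame onto the canvas first...
  canvasCtx.drawImage(results.image, 0, 0, canvasElement.width, canvasElement.height);
  // ...then superimpose the detected landmarks and their connections on top of it.
  if (results.multiHandLandmarks) {
    for (const landmarks of results.multiHandLandmarks) {
      drawConnectors(canvasCtx, landmarks, HAND_CONNECTIONS, { color: '#00FF00', lineWidth: 4 });
      drawLandmarks(canvasCtx, landmarks, { color: '#FF0000', lineWidth: 2 });
    }
  }
  canvasCtx.restore();
}
```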
If you take a look at code lines 36 to 48, this is where, when you bring your hand in front of the webcam, it fetches the landmarks and finds the coordinates for each of them. Since it's a 2D image, it fetches the X and Y coordinates of every landmark and stores them in an array, and then we draw a rectangle around them. So, as you can see, when I bring my hand up in the demonstration, it fetches the X and Y coordinate of every landmark it finds and renders them on top of the image that is being drawn onto the canvas, along the lines of the sketch below.
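A minimal sketch of that step, collecting each landmark's coordinates into an array and drawing a bounding rectangle; the helper name and styling values are assumptions. Hand landmark coordinates are normalized to [0, 1], so they are scaled by the canvas size before drawing:

```js
function drawHandBoundingBox(canvasCtx, landmarks, width, height) {
  // Store the pixel-space X and Y coordinate of every landmark in an array.
  const points = landmarks.map((lm) => ({ x: lm.x * width, y: lm.y * height }));

  const xs = points.map((p) => p.x);
  const ys = points.map((p) => p.y);
  const minX = Math.min(...xs);
  const maxX = Math.max(...xs);
  const minY = Math.min(...ys);
  const maxY = Math.max(...ys);

  // Draw a rectangle that encloses all 21 hand landmarks.
  canvasCtx.strokeStyle = '#00B5FF';
  canvasCtx.lineWidth = 3;
  canvasCtx.strokeRect(minX, minY, maxX - minX, maxY - minY);
}
```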
And this is where we are loading the actual model, the Hands model, and this is where we are initializing our camera. So when we run the demo, we initialize the camera and run a couple of functions, including loading the Hands model. Finally, we render the results using the async onResults function, which receives your footage, draws the landmarks, connects them, and makes sure they are superimposed on top of the footage; a sketch of that wiring follows. So this is one example of simply running the Hands demo, and the separate logic layered on top of it detects whether the hand is open or closed.
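A minimal sketch of loading the Hands model and initializing the camera with @mediapipe/camera_utils; the video element id, CDN URL, and option values are assumptions, not the demo's exact code:

```js
import { Hands } from '@mediapipe/hands';
import { Camera } from '@mediapipe/camera_utils';

const videoElement = document.getElementById('input_video');

// Load the Hands model; locateFile tells the library where to fetch its assets from.
const hands = new Hands({
  locateFile: (file) => `https://cdn.jsdelivr.net/npm/@mediapipe/hands/${file}`,
});
hands.setOptions({
  maxNumHands: 1,
  minDetectionConfidence: 0.5,
  minTrackingConfidence: 0.5,
});
// onResults is called with the detected landmarks for every processed frame
// (the same async onResults shown in the rendering sketch above).
hands.onResults(onResults);

// Initialize the camera and feed each webcam frame to the Hands model.
const camera = new Camera(videoElement, {
  onFrame: async () => {
    await hands.send({ image: videoElement });
  },
  width: 1280,
  height: 720,
});
camera.start();
```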
In this case, the logic I used is that when the landmarks are not overlapping with each other, the label is printed as open, and when they are overlapping, the label is printed as closed. Depending on your needs, or on how you want to use this particular model, you can write your own custom logic in JavaScript, because each of these landmarks has its own unique coordinates; a sketch of one such heuristic follows. You could do a lot more with this: for example, if you wanted to build something like an American Sign Language recognizer, you could train your model so that, depending on the positions and orientation of the landmarks, you create an entire end-to-end American Sign Language demonstration on top of the Hands solution. How you use it is, of course, entirely up to you.
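A minimal sketch of one possible open/closed heuristic, not necessarily the exact overlap check used in the demo: a finger counts as curled when its tip is closer to the wrist than its PIP joint is. The landmark indices follow the MediaPipe Hands topology (0 = wrist, 8 = index tip, 6 = index PIP, and so on); the threshold of three curled fingers is an assumption:

```js
function isHandClosed(landmarks) {
  const wrist = landmarks[0];
  const dist = (a, b) => Math.hypot(a.x - b.x, a.y - b.y);

  // [tip, PIP] landmark index pairs for the index, middle, ring and pinky fingers.
  const fingers = [
    [8, 6],
    [12, 10],
    [16, 14],
    [20, 18],
  ];

  const curled = fingers.filter(
    ([tip, pip]) => dist(landmarks[tip], wrist) < dist(landmarks[pip], wrist)
  ).length;

  // Treat the hand as closed when most fingers are curled in toward the wrist.
  return curled >= 3;
}

// Usage inside onResults: const label = isHandClosed(landmarks) ? 'closed' : 'open';
```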
Going back to our screen: that is the quick demonstration I wanted to showcase, and with that, I'd like to conclude my talk. If you have any questions about how to get started with MediaPipe in JavaScript, you can definitely reach out to me, and I recommend checking out the MediaPipe in JavaScript documentation, where you'll find a list of all the different solutions and their respective NPM modules, along with some working examples that are already out there. With that, I'd like to conclude. Thank you so much, and I hope to see you in person next year at React Dev Berlin.