Commit 51e236b: Create sapiens-node demo script

Parent: 66429c6

File tree: 8 files changed, +1313 −10 lines

README.md

Lines changed: 11 additions & 10 deletions (the demos table gains a row for the new Sapiens example)

A collection of [🤗 Transformers.js](https://huggingface.co/docs/transformers.js) demos and example applications.

| Name | Description | Links |
| ------------------------------------------------------- | ----------------------------------------------------------- | ------------------------------------------------------------------------------ |
| [Phi-3.5 WebGPU](./phi-3.5-webgpu/) | Conversational large language model | [Demo](https://huggingface.co/spaces/webml-community/phi-3.5-webgpu) |
| [SmolLM WebGPU](./smollm-webgpu/) | Conversational small language model | [Demo](https://huggingface.co/spaces/webml-community/smollm-webgpu) |
| [Segment Anything WebGPU](./segment-anything-webgpu/) | WebGPU image segmentation | [Demo](https://huggingface.co/spaces/webml-community/segment-anything-webgpu) |
| [Remove Background WebGPU](./remove-background-webgpu/) | WebGPU image background removal | [Demo](https://huggingface.co/spaces/webml-community/remove-background-webgpu) |
| [PGlite Semantic Search](./pglite-semantic-search/) | Semantic search | [Demo](https://huggingface.co/spaces/thorwebdev/pglite-semantic-search) |
| [Sapiens](./sapiens-node/) | Image segmentation, depth, and normal estimation in Node.js | n/a |
| [Bun](./bun/) | Compute text embeddings in [Bun](https://bun.sh/) | n/a |
| [Node.js (ESM)](./node-esm/) | Sentiment analysis in Node.js w/ ECMAScript modules | n/a |
| [Node.js (CJS)](./node-cjs/) | Sentiment analysis in Node.js w/ CommonJS | n/a |

Check out the Transformers.js [template](https://huggingface.co/new-space?template=static-templates%2Ftransformers.js) on Hugging Face to get started in one click!

sapiens-node/README.md

Lines changed: 30 additions & 0 deletions (new file)

# sapiens-node

This project demonstrates how to use [Sapiens](https://github.com/facebookresearch/sapiens), a foundation model for human tasks (segmentation, depth, and normal estimation), in a Node.js environment.

## Instructions

1. Clone the repository:
   ```sh
   git clone https://github.com/huggingface/transformers.js-examples.git
   ```
2. Change directory to the `sapiens-node` project:
   ```sh
   cd transformers.js-examples/sapiens-node
   ```
3. Install the dependencies:
   ```sh
   npm install
   ```
4. Run the example:
   ```sh
   node index.js
   ```

## Results

The following images illustrate the input image, its corresponding depth map, and the normal map generated by the model:

| Input Image | Depth Map | Normal Map |
| ---------------------------------- | -------------------------------- | ---------------------------------- |
| ![Input Image](./assets/image.jpg) | ![Depth Map](./assets/depth.png) | ![Normal Map](./assets/normal.png) |

sapiens-node/assets/depth.png (237 KB, binary)

sapiens-node/assets/image.jpg (177 KB, binary)

sapiens-node/assets/normal.png (889 KB, binary)

sapiens-node/index.js

Lines changed: 111 additions & 0 deletions (new file)

```js
import {
  AutoProcessor,
  SapiensForSemanticSegmentation,
  SapiensForDepthEstimation,
  SapiensForNormalEstimation,
  RawImage,
  interpolate_4d,
} from "@huggingface/transformers";

// Load segmentation, depth, and normal estimation models
const segment = await SapiensForSemanticSegmentation.from_pretrained(
  "onnx-community/sapiens-seg-0.3b",
  { dtype: "q8" },
);
const depth = await SapiensForDepthEstimation.from_pretrained(
  "onnx-community/sapiens-depth-0.3b",
  { dtype: "q4" },
);
const normal = await SapiensForNormalEstimation.from_pretrained(
  "onnx-community/sapiens-normal-0.3b",
  { dtype: "q4" },
);

// Load processor
const processor = await AutoProcessor.from_pretrained(
  "onnx-community/sapiens-seg-0.3b",
);

// Read and prepare image
const image = await RawImage.read("./assets/image.jpg");
const inputs = await processor(image);

// Run segmentation model
console.time("segmentation");
const segmentation_outputs = await segment(inputs); // [1, 28, 512, 384]
console.timeEnd("segmentation");
const { segmentation } =
  processor.feature_extractor.post_process_semantic_segmentation(
    segmentation_outputs,
    inputs.original_sizes,
  )[0];

// Run depth estimation model
console.time("depth");
const { predicted_depth } = await depth(inputs); // [1, 1, 1024, 768]
console.timeEnd("depth");

// Run normal estimation model
console.time("normal");
const { predicted_normal } = await normal(inputs); // [1, 3, 512, 384]
console.timeEnd("normal");

console.time("post-processing");

// Resize predicted depth and normal maps to the original image size
const size = [image.height, image.width];
const depth_map = await interpolate_4d(predicted_depth, { size });
const normal_map = await interpolate_4d(predicted_normal, { size });

// Use the segmentation mask to remove the background
const stride = size[0] * size[1];
const depth_map_data = depth_map.data;
const normal_map_data = normal_map.data;
let minDepth = Infinity;
let maxDepth = -Infinity;
let maxAbsNormal = -Infinity;
for (let i = 0; i < depth_map_data.length; ++i) {
  if (segmentation.data[i] === 0) {
    // Background
    depth_map_data[i] = -Infinity;
    for (let j = 0; j < 3; ++j) {
      normal_map_data[j * stride + i] = -Infinity;
    }
  } else {
    // Foreground
    minDepth = Math.min(minDepth, depth_map_data[i]);
    maxDepth = Math.max(maxDepth, depth_map_data[i]);
    for (let j = 0; j < 3; ++j) {
      maxAbsNormal = Math.max(
        maxAbsNormal,
        Math.abs(normal_map_data[j * stride + i]),
      );
    }
  }
}

// Normalize the depth map to [0, 1], then scale to [0, 255] for saving
const depth_tensor = depth_map
  .sub_(minDepth)
  .div_(maxDepth - minDepth)
  .clamp_(0, 1)
  .mul_(255)
  .round_()
  .to("uint8");

// Map normal components from [-1, 1] to [0, 255]
const normal_tensor = normal_map
  .div_(maxAbsNormal)
  .clamp_(-1, 1)
  .add_(1)
  .mul_(255 / 2)
  .round_()
  .to("uint8");

console.timeEnd("post-processing");

// Save outputs
const depth_image = RawImage.fromTensor(depth_tensor[0]);
depth_image.save("assets/depth.png");

const normal_image = RawImage.fromTensor(normal_tensor[0]);
normal_image.save("assets/normal.png");
```
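The post-processing above maps foreground depth values from [minDepth, maxDepth] to [0, 255] and normal components from [-maxAbsNormal, maxAbsNormal] to [0, 255] before writing uint8 images. A minimal standalone sketch of that per-value arithmetic (the function names here are illustrative helpers, not part of the library):

```javascript
// Map a foreground depth value from [minDepth, maxDepth] to a uint8 level,
// mirroring the sub_/div_/clamp_/mul_/round_ tensor chain in index.js.
function depthToUint8(value, minDepth, maxDepth) {
  const normalized = (value - minDepth) / (maxDepth - minDepth);
  const clamped = Math.min(Math.max(normalized, 0), 1);
  return Math.round(clamped * 255);
}

// Map a normal component from [-maxAbsNormal, maxAbsNormal] to a uint8 level,
// mirroring the div_/clamp_/add_/mul_/round_ tensor chain in index.js.
function normalToUint8(value, maxAbsNormal) {
  const scaled = value / maxAbsNormal;
  const clamped = Math.min(Math.max(scaled, -1), 1);
  return Math.round((clamped + 1) * (255 / 2));
}

console.log(depthToUint8(1.5, 1.0, 2.0)); // mid-range depth → 128
console.log(normalToUint8(0, 0.9)); // zero component lands at the midpoint → 128
```

Background pixels are set to `-Infinity` before this mapping, so they clamp to 0 and render as black in the saved maps.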
