Commit 13b671d

Author: ssjia
Update on "[ET-VK][AOT] Serialize constant tensors via NamedDataMap"
Summary: When exporting models to Vulkan backend, save constant tensors in the NamedDataMap instead of the constant data section of the delegate header.

## Motivation

Prevent screen blackout (Llama 3.2 1B) / device crash (Llama 3.2 3B) when running Llama 3.2 models on Samsung Galaxy S24. This behaviour is related to high peak memory usage when loading the model. For more information, see the top diff/PR in the stack.

## Context

This change is based on the equivalent change D70315207/#9153 in XNNPACK.

Test Plan:

## Memory Comparison with/without NamedDataMap

Measured VmRss using

```
uint64_t getVmRssInKB() {
  std::ifstream statusFile("/proc/self/status");
  std::string l, num;
  while (std::getline(statusFile, l)) {
    if (l.substr(0, 5) == "VmRSS") {
      size_t pos = l.find_first_of("0123456789");
      num = l.substr(pos);
      break;
    }
  }
  uint64_t vmRssInKB = std::stoi(num);
  return vmRssInKB;
}
```

P1908019767 (Meta only)

Excerpt:

```
Log 1                                             | Log 2
--------------------------------------------------|--------------------------------------------------
Memory usage before model compilation: 1115416 KB | Memory usage before model compilation: 1919228 KB
Memory usage after graph building: 1924340 KB     | Memory usage after graph building: 1924256 KB
Memory usage after graph preparation: 1798968 KB  | Memory usage after graph preparation: 1782464 KB
Memory usage prepack start: 1798968 KB            | Memory usage prepack start: 1781968 KB
Memory usage after prepack operations: 1271924 KB | Memory usage after prepack operations: 1653496 KB
```

Differential Revision: [D80460034](https://our.internmc.facebook.com/intern/diff/D80460034)

[ghstack-poisoned]
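For anyone who wants to take the same kind of reading from a Python process on Linux, the C++ helper in the test plan can be approximated as below. This is a minimal sketch, not part of the commit: the function names `parse_vm_rss_kb` and `get_vm_rss_kb` are hypothetical, and it assumes the usual `/proc/<pid>/status` layout in which the `VmRSS:` line carries a value in kB.

```python
import re
from typing import Optional


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Return the VmRSS value (in kB) found in /proc/<pid>/status text.

    Hypothetical helper for illustration; returns None when no VmRSS
    line (or no number on that line) is present.
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            match = re.search(r"\d+", line)
            return int(match.group()) if match else None
    return None


def get_vm_rss_kb(path: str = "/proc/self/status") -> Optional[int]:
    """Read the current process's resident set size in kB (Linux only)."""
    try:
        with open(path, "r", encoding="utf-8") as status_file:
            return parse_vm_rss_kb(status_file.read())
    except OSError:
        # /proc is unavailable (e.g. non-Linux platforms).
        return None
```

Unlike the C++ snippet, this version returns `None` rather than raising when the `VmRSS` line is missing, which keeps logging code from failing on platforms without `/proc`.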
2 parents 1be73c0 + 150afe4 commit 13b671d

3 files changed: +8 -6 lines changed

backends/vulkan/runtime/VulkanBackend.cpp

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@
 #include <executorch/runtime/core/event_tracer_hooks_delegate.h>
 #endif // ET_EVENT_TRACER_ENABLED
 #include <executorch/runtime/core/exec_aten/util/tensor_util.h>
-#include <executorch/runtime/executor/pte_data_map.h>
+#include <executorch/runtime/core/named_data_map.h>
 #include <executorch/runtime/platform/compiler.h>
 #include <executorch/runtime/platform/profiler.h>

backends/vulkan/serialization/vulkan_graph_serialize.py

Lines changed: 6 additions & 4 deletions
@@ -201,11 +201,11 @@ def serialize_constant_tensors(
                     named_key=named_key,
                 )
             )
-        elif tensor is None or tensor.numel() == 0:
-            assert isinstance(tensor, torch.Tensor)
+        elif tensor is None or (
+            isinstance(tensor, torch.Tensor) and tensor.numel() == 0
+        ):
             vk_graph.constants.append(VkBytes(current_offset, 0))
-        else:
-            assert isinstance(tensor, torch.Tensor)
+        elif isinstance(tensor, torch.Tensor):
             array_type = ctypes.c_char * tensor.untyped_storage().nbytes()
             array = ctypes.cast(
                 tensor.untyped_storage().data_ptr(),
@@ -219,6 +219,8 @@ def serialize_constant_tensors(

             vk_graph.constants.append(VkBytes(current_offset, len(tensor_bytes)))
             current_offset += aligned_size(len(tensor_bytes))
+        else:
+            raise ValueError(f"Unsupported constant tensor type: {type(tensor)}")


 def serialize_custom_shaders(
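The branch structure this hunk introduces can be tried in isolation with the sketch below. It is an illustration under stated assumptions, not the ExecuTorch code: `ConstRecord` is a hypothetical stand-in for `VkBytes`, plain `bytes` objects stand in for `torch.Tensor` storage, a `(named_key, size)` tuple is assumed as the marker for data kept in the NamedDataMap, the placeholder offset `0` for such entries is invented, and the 16-byte alignment is assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConstRecord:
    """Hypothetical stand-in for VkBytes: where a constant's bytes live."""
    offset: int
    length: int
    named_key: str = ""


def aligned_size(nbytes: int, alignment: int = 16) -> int:
    """Round nbytes up to the next multiple of alignment (assumed 16)."""
    return (nbytes + alignment - 1) // alignment * alignment


def serialize_constants_sketch(constants: list, raw: bytearray) -> List[ConstRecord]:
    """Mirror the if/elif/else dispatch of the diff using plain bytes."""
    records: List[ConstRecord] = []
    offset = len(raw)
    for const in constants:
        if isinstance(const, tuple):
            # Assumed form: data lives in the named data map, so only a
            # key and size are recorded. Offset 0 is a placeholder.
            named_key, size = const
            records.append(ConstRecord(0, size, named_key))
        elif const is None or (isinstance(const, bytes) and len(const) == 0):
            # Missing or empty constant: zero-length record, no bytes written.
            records.append(ConstRecord(offset, 0))
        elif isinstance(const, bytes):
            # Inline constant: append the bytes, then pad to the alignment.
            raw += const
            raw += b"\x00" * (aligned_size(len(const)) - len(const))
            records.append(ConstRecord(offset, len(const)))
            offset += aligned_size(len(const))
        else:
            # Anything else is rejected explicitly instead of via assert.
            raise ValueError(f"Unsupported constant tensor type: {type(const)}")
    return records
```

Two points carry over to the real change. Checking `isinstance(...)` before `numel()` means a `None` entry never reaches a method call, and the final `else` reports an unexpected type with a `ValueError` instead of relying on an `assert`, which Python drops when run with `-O`.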

backends/vulkan/targets.bzl

Lines changed: 1 addition & 1 deletion
@@ -305,7 +305,7 @@ def define_common_targets(is_fbcode = False):
                 "//executorch/backends/vulkan/serialization:vk_delegate_schema",
                 "//executorch/runtime/core:event_tracer",
                 "//executorch/runtime/core/exec_aten/util:tensor_util",
-                "//executorch/runtime/executor:pte_data_map",
+                "//executorch/runtime/core:named_data_map",
             ],
             define_static_target = True,
             # VulkanBackend.cpp needs to compile with executor as whole
