Skip to content

Commit 0c73dc6

Browse files
committed
[ET-VK][ez] Fix Vulkan Validation layer errors due to consecutive command buffer encoding
Pull Request resolved: #11401 ## Changes * In `VulkanBackend.cpp` do not call `encode_execute()` during model load if the model compile spec specifies `requires_dynamic_shapes` as true * `propagate_resize()` will not call `encode_execute()` if dynamic shapes are not expected. ## Motivation > In `VulkanBackend.cpp` do not call `encode_execute()` during model load if the model compile spec specifies `requires_dynamic_shapes` as true Recently, it was discovered that a command buffer re-encode was required to update push constant values. This means that for dynamic shapes to work correctly, `encode_execute()` must be called after updating tensor sizes. As a result, `propagate_resize()` now calls `encode_execute()` internally. This results in scenarios where `encode_execute()` is called once during model load, then again right before the first inference during `propagate_resize()`, without actually executing the command buffer in-between. This causes Validation layer errors like ``` UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout(ERROR / SPEC): msgNum: 1303270965 - Validation Error: [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object 0: handle = 0x24086224ec0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; Object 1: handle = 0x88d2b500000000e2, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0x4dae5635 | vkQueueSubmit(): pSubmits[0].pCommandBuffers[0] command buffer VkCommandBuffer 0x24086224ec0[] expects VkImage 0x88d2b500000000e2[] (subresource: aspectMask VK_IMAGE_ASPECT_COLOR_BIT array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_UNDEFINED. Objects: 2 [0] 0x24086224ec0, type: 6, name: NULL [1] 0x88d2b500000000e2, type: 10, name: NULL UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout(ERROR / SPEC): msgNum: 1303270965 - Validation Error: [ UNASSIGNED-CoreValidation-DrawState-InvalidImageLayout ] Object 0: handle = 0x24086224ec0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; Object 1: handle = 0x6caffc00000000e3, type = VK_OBJECT_TYPE_IMAGE; | MessageID = 0x4dae5635 | vkQueueSubmit(): pSubmits[0].pCommandBuffers[0] command buffer VkCommandBuffer 0x24086224ec0[] expects VkImage 0x6caffc00000000e3[] (subresource: aspectMask VK_IMAGE_ASPECT_COLOR_BIT array layer 0, mip level 0) to be in layout VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL--instead, current layout is VK_IMAGE_LAYOUT_UNDEFINED. Objects: 2 [0] 0x24086224ec0, type: 6, name: NULL [1] 0x6caffc00000000e3, type: 10, name: NULL ``` because the last access information of image/buffer resources are inaccurate during the second command buffer encoding, since the first command buffer never executed. > `propagate_resize()` will not call `encode_execute()` if dynamic shapes are not expected. Llama models will need to call `propagate_resize()`, to account for changes in the second input tensor (a scalar tensor which represents input token position). However, the cost of `encode_execute()` is costly, so we should avoid calling it if possible. ## Perf Impact * Performance improvement for first inference of dynamic shape models if actual tensor sizes are much smaller than maximum possible sizes * No impact for non-dynamic shape models ghstack-source-id: 289110008 Differential Revision: [D76047203](https://our.internmc.facebook.com/intern/diff/D76047203/)
1 parent 5100451 commit 0c73dc6

File tree

9 files changed

+41
-6
lines changed

9 files changed

+41
-6
lines changed

backends/vulkan/runtime/VulkanBackend.cpp

Lines changed: 17 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -140,6 +140,14 @@ GraphConfig get_graph_config(ArrayRef<CompileSpec>& compile_specs) {
140140

141141
config.set_memory_layout_override(memory_layout);
142142
}
143+
if (strcmp(spec.key, "require_dynamic_shapes") == 0) {
144+
ET_CHECK_MSG(value_size == sizeof(uint8_t), "Unexpected value size!");
145+
bool value = getBool(value_data);
146+
147+
if (value) {
148+
config.expect_dynamic_shapes = true;
149+
}
150+
}
143151
}
144152
#ifdef ET_EVENT_TRACER_ENABLED
145153
config.enable_querypool = true;
@@ -500,9 +508,12 @@ class VulkanBackend final : public ::executorch::runtime::BackendInterface {
500508
compute_graph->encode_prepack();
501509
compute_graph->prepack();
502510

503-
// TODO(ssjia): remove this once we can batch compile compute pipelines
504-
// during prepare().
505-
compute_graph->encode_execute();
511+
// If dynamic shapes are not expected, then the command buffer only needs to
512+
// be encoded once. Otherwise, wait until the first inference to encode the
513+
// the command buffer, when actual input shapes are known.
514+
if (!compute_graph->graphconfig().expect_dynamic_shapes) {
515+
compute_graph->encode_execute();
516+
}
506517

507518
return Error::Ok;
508519
}
@@ -574,7 +585,9 @@ class VulkanBackend final : public ::executorch::runtime::BackendInterface {
574585
// constants are updated and DynamicDispatchNode can update the compute
575586
// shader, global workgroup size, and local workgroup size to perform the
576587
// model inference.
577-
if (should_propagate_resize) {
588+
if (should_propagate_resize ||
589+
(compute_graph->graphconfig().expect_dynamic_shapes &&
590+
compute_graph->execute_count() == 0u)) {
578591
compute_graph->propagate_resize();
579592
}
580593

backends/vulkan/runtime/VulkanDelegateHeader.cpp

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -60,6 +60,10 @@ uint32_t getUInt16LE(const uint8_t* data) {
6060
return (uint32_t)data[0] | ((uint32_t)data[1] << 8);
6161
}
6262

63+
bool getBool(const uint8_t* data) {
64+
return data[0] != 0;
65+
}
66+
6367
bool VulkanDelegateHeader::is_valid() const {
6468
if (header_size < kExpectedSize) {
6569
return false;

backends/vulkan/runtime/VulkanDelegateHeader.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -19,6 +19,9 @@ uint64_t getUInt64LE(const uint8_t* data);
1919
uint32_t getUInt32LE(const uint8_t* data);
2020
uint32_t getUInt16LE(const uint8_t* data);
2121

22+
// Bool is serialized as a single byte
23+
bool getBool(const uint8_t* data);
24+
2225
struct VulkanDelegateHeader {
2326
bool is_valid() const;
2427

backends/vulkan/runtime/graph/ComputeGraph.cpp

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -761,7 +761,10 @@ void ComputeGraph::propagate_resize() {
761761
for (std::unique_ptr<ExecuteNode>& node : execute_nodes_) {
762762
node->trigger_resize(this);
763763
}
764-
encode_execute();
764+
// Only re-encode on resize if dynamic shapes are expected
765+
if (config_.expect_dynamic_shapes) {
766+
encode_execute();
767+
}
765768
}
766769

767770
} // namespace vkcompute

backends/vulkan/runtime/graph/GraphConfig.cpp

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -63,6 +63,8 @@ GraphConfig::GraphConfig() {
6363

6464
enable_local_wg_size_override = false;
6565
local_wg_size_override = {};
66+
67+
expect_dynamic_shapes = false;
6668
}
6769

6870
void GraphConfig::set_storage_type_override(utils::StorageType storage_type) {

backends/vulkan/runtime/graph/GraphConfig.h

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,9 @@ struct GraphConfig final {
3333
bool enable_local_wg_size_override;
3434
utils::uvec3 local_wg_size_override;
3535

36+
// Whether or not the ComputeGraph should expect input shapes to be dynamic
37+
bool expect_dynamic_shapes;
38+
3639
// Generate a default graph config with pre-configured settings
3740
explicit GraphConfig();
3841

backends/vulkan/test/test_vulkan_delegate.py

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,9 @@ def lower_module(
4343
model: torch.nn.Module, sample_inputs: Tuple[torch.Tensor], dynamic_shapes=None
4444
) -> EdgeProgramManager:
4545
compile_options = {}
46+
if dynamic_shapes is not None:
47+
compile_options["require_dynamic_shapes"] = True
48+
4649
edge_compile_config = EdgeCompileConfig(
4750
_skip_dim_order=False, # TODO(T182928844): Delegate dim order op to backend.
4851
)
@@ -70,6 +73,9 @@ def quantize_and_lower_module(
7073
dynamic_shapes=None,
7174
) -> EdgeProgramManager:
7275
compile_options = {}
76+
if dynamic_shapes is not None:
77+
compile_options["require_dynamic_shapes"] = True
78+
7379
edge_compile_config = EdgeCompileConfig(
7480
_skip_dim_order=False, # TODO(T182928844): Delegate dim order op to backend.
7581
)

backends/vulkan/test/utils/test_utils.cpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -512,6 +512,7 @@ vkcompute::ComputeGraph build_mm_graph(
512512
const bool prepack_mat2) {
513513
using namespace vkcompute;
514514
GraphConfig config;
515+
config.expect_dynamic_shapes = true;
515516
ComputeGraph graph(config);
516517

517518
std::vector<int64_t> mat1_size = {M, K};

backends/vulkan/test/vulkan_compute_api_test.cpp

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2937,6 +2937,7 @@ void test_transpose_view_mm(
29372937
const int N,
29382938
utils::StorageType storage_type) {
29392939
GraphConfig config;
2940+
config.expect_dynamic_shapes = true;
29402941
config.set_storage_type_override(storage_type);
29412942
ComputeGraph graph(config);
29422943

@@ -2993,7 +2994,6 @@ void test_transpose_view_mm(
29932994
graph.prepare();
29942995
graph.encode_prepack();
29952996
graph.prepack();
2996-
graph.encode_execute();
29972997

29982998
for (int i = 1; i < 4; i++) {
29992999
float val_mat1 = i;

0 commit comments

Comments
 (0)