IndexOutOfBoundsException on working code after integrating new working code that used ByteBuffer instead of MemorySegment #11294

Open
neocoretechs opened this issue May 31, 2025 · 7 comments
@neocoretechs

An IndexOutOfBoundsException occurs after integrating code from https://github.com/mukel/qwen2.svm.java into Llama3.java. Code that worked with ByteBuffer now fails with MemorySegment. The failure occurs on Java 25 (Oracle GraalVM) with the incubating Vector API module (jdk.incubator.vector), on both Windows 11 and Ubuntu 22.04.
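
One thing that may be worth ruling out (an assumption, not a confirmed cause): many of the tensor offsets printed in the log below are larger than Integer.MAX_VALUE, so any int-based index arithmetic carried over from a ByteBuffer code path could go out of range once it meets a long-indexed MemorySegment. The sketch below is only a sanity check on the numbers from the log. The class and method names (TensorRangeCheck, q8ByteSize, fitsIntIndex, checkRange) are hypothetical and not part of Llama3.java or qwen2.svm.java, and the offsets are taken as printed, which may be relative to the start of the tensor data rather than the file.

```java
import java.util.Objects;

public class TensorRangeCheck {
    // GGUF Q8_0 layout: blocks of 32 elements, each stored as a
    // 2-byte fp16 scale followed by 32 int8 values (34 bytes total).
    static final int Q8_0_BLOCK_ELEMS = 32;
    static final int Q8_0_BLOCK_BYTES = 34;

    /** Byte size of a Q8_0 tensor holding numElems elements. */
    static long q8ByteSize(long numElems) {
        if (numElems < 0 || numElems % Q8_0_BLOCK_ELEMS != 0) {
            throw new IllegalArgumentException("element count must be a non-negative multiple of 32");
        }
        return numElems / Q8_0_BLOCK_ELEMS * Q8_0_BLOCK_BYTES;
    }

    /** True if the byte range [offset, offset + size) is addressable with int indices. */
    static boolean fitsIntIndex(long offset, long size) {
        return offset >= 0 && size >= 0 && offset + size <= Integer.MAX_VALUE;
    }

    /** Throws IndexOutOfBoundsException if the range lies outside a region of regionSize bytes. */
    static void checkRange(long offset, long size, long regionSize) {
        Objects.checkFromIndexSize(offset, size, regionSize);
    }

    public static void main(String[] args) {
        // Values copied from the log line for blk.19.attn_v.weight.
        long offset = 5530279936L;
        long numElems = 1835008L;
        long size = q8ByteSize(numElems);

        System.out.println("Q8_0 byte size: " + size);
        System.out.println("fits int index: " + fitsIntIndex(offset, size));

        // Hypothetical region size, chosen only to show a failing bounds check.
        try {
            checkRange(offset, size, offset + size - 1);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out of bounds: " + e.getMessage());
        }
    }
}
```

The computed Q8_0 size matches the size:1949696 value printed for the [3584, 512] tensors, which suggests the sizes in the log are consistent. The point of the check is only that the offset plus size cannot be expressed as an int.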

Steps to reproduce the issue
git clone --depth 1 https://github.com/neocoretechs/Llama4jNoDeps
compile_rel.bat
jar_llama3.bat
chat_qwendb.bat

java version "25" 2025-09-16 LTS
Java(TM) SE Runtime Environment Oracle GraalVM 25-dev+20.1 (build 25+20-LTS-jvmci-b01)
Java HotSpot(TM) 64-Bit Server VM Oracle GraalVM 25-dev+20.1 (build 25+20-LTS-jvmci-b01, mixed mode, sharing)

  • OS: Windows 11 and Ubuntu 22.04
  • Architecture: AMD64 Threadripper and Ryzen 9 Hx370

NOTE: The model file qwen2-7b-instruct-q8_0.gguf is not included in the repository and must be downloaded separately (e.g. from HuggingFace) to reproduce the issue.

llama3runwithQwen2IndexOutOfBounds.txt

Windows 11, AMD Threadripper, 32 GB RAM. Same result under Ubuntu 22.04 on a Ryzen 9 Hx370 with 128 GB RAM, using -Xmn96g -Xms96g -Xmx96g.

RUN WITH STACK TRACE:

C:\Users\Jon Groff\Downloads\llama4jNoDeps>C:\Progra~1\Java\graalvm-jdk-25+20.1\bin\java -server -XX:+UseParallelGC -Xmn26g  -Xms26g -Xmx26g --enable-preview --add-modules jdk.incubator.vector -jar Llama3.jar --model qwen2-7b-instruct-q8_0.gguf --chat -n -1 
[0.006s][warning][gc,ergo] NewSize (27262976k) is equal to or greater than initial heap size (27262976k).  A new NewSize of 27262464k will be used to accomodate an old generation.
[0.006s][warning][gc,ergo] MaxNewSize (27262976k) is equal to or greater than the entire heap (27262976k).  A new max generation size of 27262464k will be used.
WARNING: Using incubator modules: jdk.incubator.vector
Parse qwen2-7b-instruct-q8_0.gguf: 1159 millis
GGUF metadata:
{qwen2.block_count=28, tokenizer.ggml.add_bos_token=false, qwen2.embedding_length=3584, tokenizer.ggml.padding_token_id=151643, quantize.imatrix.chunks_count=1937, qwen2.feed_forward_length=18944, quantize.imatrix.entries_count=196, qwen2.attention.layer_norm_rms_epsilon=1.0E-6, tokenizer.ggml.merges=[Ljava.lang.String;@2e5d6d97, tokenizer.ggml.pre=qwen2, qwen2.attention.head_count_kv=4, general.architecture=qwen2, tokenizer.chat_template={% for message in messages %}{% if loop.first and messages[0]['role'] != 'system' %}{{ '<|im_start|>system
You are a helpful assistant.<|im_end|>
' }}{% endif %}{{'<|im_start|>' + message['role'] + '
' + message['content'] + '<|im_end|>' + '
'}}{% endfor %}{% if add_generation_prompt %}{{ '<|im_start|>assistant
' }}{% endif %}, general.file_type=7, general.name=qwen2-7b-instruct, general.quantization_version=2, tokenizer.ggml.token_type=[I@238e0d81, tokenizer.ggml.eos_token_id=151645, tokenizer.ggml.bos_token_id=151643, quantize.imatrix.dataset=../sft_2406.txt, qwen2.rope.freq_base=1000000.0, tokenizer.ggml.tokens=[Ljava.lang.String;@31221be2, tokenizer.ggml.model=gpt2, qwen2.context_length=32768, quantize.imatrix.file=../Qwen2/gguf/qwen2-7b-imatrix/imatrix.dat, qwen2.attention.head_count=28}
Tensor:blk.11.attn_q.bias=blk.11.attn_q.bias offset:2575927296 dims:[3584] number elems:3584 size:14336
Tensor:blk.3.attn_norm.weight=blk.3.attn_norm.weight offset:1322035200 dims:[3584] number elems:3584 size:14336
Tensor:blk.19.attn_v.weight=blk.19.attn_v.weight offset:5530279936 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.10.attn_q.bias=blk.10.attn_q.bias offset:2328268800 dims:[3584] number elems:3584 size:14336
Tensor:blk.12.attn_q.bias=blk.12.attn_q.bias offset:2823585792 dims:[3584] number elems:3584 size:14336
Tensor:blk.13.attn_q.bias=blk.13.attn_q.bias offset:3071244288 dims:[3584] number elems:3584 size:14336
Tensor:blk.8.attn_output.weight=blk.8.attn_output.weight offset:3872710656 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.15.attn_output.weight=blk.15.attn_output.weight offset:4512333824 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.11.attn_norm.weight=blk.11.attn_norm.weight offset:2343882752 dims:[3584] number elems:3584 size:14336
Tensor:blk.19.attn_q.bias=blk.19.attn_q.bias offset:5516615680 dims:[3584] number elems:3584 size:14336
Tensor:blk.18.attn_q.bias=blk.18.attn_q.bias offset:5268957184 dims:[3584] number elems:3584 size:14336
Tensor:blk.17.attn_q.bias=blk.17.attn_q.bias offset:5021298688 dims:[3584] number elems:3584 size:14336
Tensor:blk.23.attn_v.weight=blk.23.attn_v.weight offset:7099973632 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.0.attn_output.weight=blk.0.attn_output.weight offset:797456384 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.10.ffn_norm.weight=blk.10.ffn_norm.weight offset:2312654848 dims:[3584] number elems:3584 size:14336
Tensor:blk.15.attn_q.bias=blk.15.attn_q.bias offset:4525981696 dims:[3584] number elems:3584 size:14336
Tensor:blk.14.attn_q.bias=blk.14.attn_q.bias offset:3174596608 dims:[3584] number elems:3584 size:14336
Tensor:blk.16.attn_q.bias=blk.16.attn_q.bias offset:4773640192 dims:[3584] number elems:3584 size:14336
Tensor:blk.11.attn_q.weight=blk.11.attn_q.weight offset:2575941632 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.16.ffn_norm.weight=blk.16.ffn_norm.weight offset:4758026240 dims:[3584] number elems:3584 size:14336
Tensor:blk.13.ffn_norm.weight=blk.13.ffn_norm.weight offset:3055630336 dims:[3584] number elems:3584 size:14336
Tensor:blk.26.attn_q.weight=blk.26.attn_q.weight offset:7829299200 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.11.attn_k.weight=blk.11.attn_k.weight offset:2560329728 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.16.attn_k.weight=blk.16.attn_k.weight offset:4758042624 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.14.attn_v.weight=blk.14.attn_v.weight offset:3188260864 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.16.attn_q.weight=blk.16.attn_q.weight offset:4773654528 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.17.attn_q.weight=blk.17.attn_q.weight offset:5021313024 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.22.attn_q.bias=blk.22.attn_q.bias offset:6187423744 dims:[3584] number elems:3584 size:14336
Tensor:blk.26.ffn_down.weight=blk.26.ffn_down.weight offset:7597254656 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.8.attn_norm.weight=blk.8.attn_norm.weight offset:3654313984 dims:[3584] number elems:3584 size:14336
Tensor:blk.9.attn_q.bias=blk.9.attn_q.bias offset:4134017024 dims:[3584] number elems:3584 size:14336
Tensor:blk.21.attn_q.bias=blk.21.attn_q.bias offset:6011932672 dims:[3584] number elems:3584 size:14336
Tensor:blk.23.attn_q.bias=blk.23.attn_q.bias offset:7086309376 dims:[3584] number elems:3584 size:14336
Tensor:blk.23.attn_k.bias=blk.23.attn_k.bias offset:7070709760 dims:[512] number elems:512 size:2048
Tensor:blk.25.attn_k.bias=blk.25.attn_k.bias offset:7566026752 dims:[512] number elems:512 size:2048
Tensor:blk.8.attn_q.bias=blk.8.attn_q.bias offset:3886358528 dims:[3584] number elems:3584 size:14336
Tensor:blk.20.attn_q.bias=blk.20.attn_q.bias offset:5764274176 dims:[3584] number elems:3584 size:14336
Tensor:output.weight=output.weight offset:6203037696 dims:[3584, 152064] number elems:544997376 size:579059712
Tensor:blk.24.attn_q.bias=blk.24.attn_q.bias offset:7333967872 dims:[3584] number elems:3584 size:14336
Tensor:blk.17.ffn_up.weight=blk.17.ffn_up.weight offset:4933545984 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.3.attn_q.weight=blk.3.attn_q.weight offset:1554094080 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.8.ffn_up.weight=blk.8.ffn_up.weight offset:3798605824 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.22.attn_k.bias=blk.22.attn_k.bias offset:6171824128 dims:[512] number elems:512 size:2048
Tensor:blk.26.attn_k.bias=blk.26.attn_k.bias offset:7813685248 dims:[512] number elems:512 size:2048
Tensor:blk.7.attn_k.bias=blk.7.attn_k.bias offset:3623100416 dims:[512] number elems:512 size:2048
Tensor:blk.9.attn_k.bias=blk.9.attn_k.bias offset:4118417408 dims:[512] number elems:512 size:2048
Tensor:blk.20.ffn_down.weight=blk.20.ffn_down.weight offset:5532243968 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.21.attn_k.bias=blk.21.attn_k.bias offset:5996333056 dims:[512] number elems:512 size:2048
Tensor:blk.23.ffn_down.weight=blk.23.ffn_down.weight offset:6854279168 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.27.attn_k.bias=blk.27.attn_k.bias offset:8061343744 dims:[512] number elems:512 size:2048
Tensor:blk.11.ffn_gate.weight=blk.11.ffn_gate.weight offset:2416035840 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.14.ffn_gate.weight=blk.14.ffn_gate.weight offset:3086858240 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.25.attn_q.weight=blk.25.attn_q.weight offset:7581640704 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.8.attn_k.bias=blk.8.attn_k.bias offset:3870758912 dims:[512] number elems:512 size:2048
Tensor:blk.17.ffn_gate.weight=blk.17.ffn_gate.weight offset:4861407232 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.20.attn_k.bias=blk.20.attn_k.bias offset:5748674560 dims:[512] number elems:512 size:2048
Tensor:blk.2.attn_q.bias=blk.2.attn_q.bias offset:1306421248 dims:[3584] number elems:3584 size:14336
Tensor:blk.1.attn_q.bias=blk.1.attn_q.bias offset:1058762752 dims:[3584] number elems:3584 size:14336
Tensor:blk.7.attn_v.weight=blk.7.attn_v.weight offset:3652364288 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.0.attn_q.bias=blk.0.attn_q.bias offset:811104256 dims:[3584] number elems:3584 size:14336
Tensor:blk.4.attn_q.weight=blk.4.attn_q.weight offset:1801752576 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.16.attn_norm.weight=blk.16.attn_norm.weight offset:4541595648 dims:[3584] number elems:3584 size:14336
Tensor:blk.19.ffn_norm.weight=blk.19.ffn_norm.weight offset:5501001728 dims:[3584] number elems:3584 size:14336
Tensor:blk.21.attn_k.weight=blk.21.attn_k.weight offset:5996335104 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.20.ffn_up.weight=blk.20.ffn_up.weight offset:5676521472 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.25.attn_q.bias=blk.25.attn_q.bias offset:7581626368 dims:[3584] number elems:3584 size:14336
Tensor:blk.27.attn_q.bias=blk.27.attn_q.bias offset:8076943360 dims:[3584] number elems:3584 size:14336
Tensor:blk.9.attn_k.weight=blk.9.attn_k.weight offset:4118419456 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.22.ffn_norm.weight=blk.22.ffn_norm.weight offset:6854250496 dims:[3584] number elems:3584 size:14336
Tensor:blk.24.attn_k.bias=blk.24.attn_k.bias offset:7318368256 dims:[512] number elems:512 size:2048
Tensor:blk.25.ffn_norm.weight=blk.25.ffn_norm.weight offset:7566012416 dims:[3584] number elems:3584 size:14336
Tensor:blk.26.attn_q.bias=blk.26.attn_q.bias offset:7829284864 dims:[3584] number elems:3584 size:14336
Tensor:blk.16.ffn_up.weight=blk.16.ffn_up.weight offset:4685887488 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.5.ffn_gate.weight=blk.5.ffn_gate.weight offset:1889505280 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.8.ffn_gate.weight=blk.8.ffn_gate.weight offset:3726467072 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.2.ffn_gate.weight=blk.2.ffn_gate.weight offset:1146529792 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.21.attn_norm.weight=blk.21.attn_norm.weight offset:5779888128 dims:[3584] number elems:3584 size:14336
Tensor:blk.1.ffn_norm.weight=blk.1.ffn_norm.weight offset:1043148800 dims:[3584] number elems:3584 size:14336
Tensor:blk.1.attn_k.bias=blk.1.attn_k.bias offset:1043163136 dims:[512] number elems:512 size:2048
Tensor:blk.6.attn_v.weight=blk.6.attn_v.weight offset:2094274560 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.0.attn_k.bias=blk.0.attn_k.bias offset:795504640 dims:[512] number elems:512 size:2048
Tensor:blk.8.attn_k.weight=blk.8.attn_k.weight offset:3870760960 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.21.ffn_up.weight=blk.21.ffn_up.weight offset:5924179968 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.4.ffn_norm.weight=blk.4.ffn_norm.weight offset:1786124288 dims:[3584] number elems:3584 size:14336
Tensor:blk.3.attn_q.bias=blk.3.attn_q.bias offset:1554079744 dims:[3584] number elems:3584 size:14336
Tensor:blk.3.attn_k.bias=blk.3.attn_k.bias offset:1538480128 dims:[512] number elems:512 size:2048
Tensor:blk.5.attn_k.bias=blk.5.attn_k.bias offset:2033797120 dims:[512] number elems:512 size:2048
Tensor:blk.9.ffn_up.weight=blk.9.ffn_up.weight offset:4046264320 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.4.attn_q.bias=blk.4.attn_q.bias offset:1801738240 dims:[3584] number elems:3584 size:14336
Tensor:blk.7.ffn_norm.weight=blk.7.ffn_norm.weight offset:3623086080 dims:[3584] number elems:3584 size:14336
Tensor:blk.2.attn_k.bias=blk.2.attn_k.bias offset:1290821632 dims:[512] number elems:512 size:2048
Tensor:blk.6.attn_k.bias=blk.6.attn_k.bias offset:2065010688 dims:[512] number elems:512 size:2048
Tensor:blk.6.attn_q.bias=blk.6.attn_q.bias offset:2080610304 dims:[3584] number elems:3584 size:14336
Tensor:blk.5.attn_q.bias=blk.5.attn_q.bias offset:2049396736 dims:[3584] number elems:3584 size:14336
Tensor:blk.7.attn_q.bias=blk.7.attn_q.bias offset:3638700032 dims:[3584] number elems:3584 size:14336
Tensor:blk.20.attn_k.weight=blk.20.attn_k.weight offset:5748676608 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.4.attn_k.bias=blk.4.attn_k.bias offset:1786138624 dims:[512] number elems:512 size:2048
Tensor:blk.23.ffn_gate.weight=blk.23.ffn_gate.weight offset:6926417920 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.11.ffn_down.weight=blk.11.ffn_down.weight offset:2343897088 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.10.ffn_up.weight=blk.10.ffn_up.weight offset:2240516096 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.12.attn_k.bias=blk.12.attn_k.bias offset:2807986176 dims:[512] number elems:512 size:2048
Tensor:blk.14.attn_k.bias=blk.14.attn_k.bias offset:3158996992 dims:[512] number elems:512 size:2048
Tensor:blk.14.ffn_down.weight=blk.14.ffn_down.weight offset:4149645312 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.26.ffn_gate.weight=blk.26.ffn_gate.weight offset:7669393408 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.27.attn_v.weight=blk.27.attn_v.weight offset:8090607616 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.5.attn_v.weight=blk.5.attn_v.weight offset:2063060992 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.10.attn_q.weight=blk.10.attn_q.weight offset:2328283136 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.19.attn_norm.weight=blk.19.attn_norm.weight offset:5284571136 dims:[3584] number elems:3584 size:14336
Tensor:blk.20.ffn_gate.weight=blk.20.ffn_gate.weight offset:5604382720 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.10.attn_k.bias=blk.10.attn_k.bias offset:2312669184 dims:[512] number elems:512 size:2048
Tensor:blk.13.attn_output.weight=blk.13.attn_output.weight offset:3057596416 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.16.attn_k.bias=blk.16.attn_k.bias offset:4758040576 dims:[512] number elems:512 size:2048
Tensor:blk.18.attn_k.bias=blk.18.attn_k.bias offset:5253357568 dims:[512] number elems:512 size:2048
Tensor:blk.27.attn_norm.weight=blk.27.attn_norm.weight offset:7844898816 dims:[3584] number elems:3584 size:14336
Tensor:blk.18.attn_v.bias=blk.18.attn_v.bias offset:5282619392 dims:[512] number elems:512 size:2048
Tensor:blk.2.attn_norm.weight=blk.2.attn_norm.weight offset:1074376704 dims:[3584] number elems:3584 size:14336
Tensor:blk.2.attn_k.weight=blk.2.attn_k.weight offset:1290823680 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.27.ffn_up.weight=blk.27.ffn_up.weight offset:7989190656 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.7.attn_k.weight=blk.7.attn_k.weight offset:3623102464 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.17.ffn_down.weight=blk.17.ffn_down.weight offset:4789268480 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.0.attn_q.weight=blk.0.attn_q.weight offset:811118592 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.22.ffn_up.weight=blk.22.ffn_up.weight offset:6099685376 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.0.attn_v.weight=blk.0.attn_v.weight offset:824768512 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.15.attn_k.weight=blk.15.attn_k.weight offset:4510384128 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.15.attn_v.weight=blk.15.attn_v.weight offset:4539645952 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.15.attn_q.weight=blk.15.attn_q.weight offset:4525996032 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.24.ffn_up.weight=blk.24.ffn_up.weight offset:7246215168 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.8.ffn_down.weight=blk.8.ffn_down.weight offset:3654328320 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.2.attn_q.weight=blk.2.attn_q.weight offset:1306435584 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.18.attn_v.weight=blk.18.attn_v.weight offset:5282621440 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.13.attn_v.weight=blk.13.attn_v.weight offset:3084908544 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.17.attn_norm.weight=blk.17.attn_norm.weight offset:4789254144 dims:[3584] number elems:3584 size:14336
Tensor:blk.0.attn_norm.weight=blk.0.attn_norm.weight offset:579059712 dims:[3584] number elems:3584 size:14336
Tensor:blk.6.attn_norm.weight=blk.6.attn_norm.weight offset:3190210560 dims:[3584] number elems:3584 size:14336
Tensor:blk.9.attn_v.bias=blk.9.attn_v.bias offset:4147679232 dims:[512] number elems:512 size:2048
Tensor:blk.21.attn_v.bias=blk.21.attn_v.bias offset:6025594880 dims:[512] number elems:512 size:2048
Tensor:blk.10.attn_v.bias=blk.10.attn_v.bias offset:2341931008 dims:[512] number elems:512 size:2048
Tensor:blk.25.attn_norm.weight=blk.25.attn_norm.weight offset:7349581824 dims:[3584] number elems:3584 size:14336
Tensor:blk.5.attn_k.weight=blk.5.attn_k.weight offset:2033799168 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.23.attn_v.bias=blk.23.attn_v.bias offset:7099971584 dims:[512] number elems:512 size:2048
Tensor:blk.12.attn_v.bias=blk.12.attn_v.bias offset:2837248000 dims:[512] number elems:512 size:2048
Tensor:blk.27.attn_q.weight=blk.27.attn_q.weight offset:8076957696 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.13.ffn_up.weight=blk.13.ffn_up.weight offset:2983491584 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.5.ffn_down.weight=blk.5.ffn_down.weight offset:1817366528 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.27.attn_v.bias=blk.27.attn_v.bias offset:8090605568 dims:[512] number elems:512 size:2048
Tensor:blk.16.attn_v.bias=blk.16.attn_v.bias offset:4787302400 dims:[512] number elems:512 size:2048
Tensor:blk.22.attn_norm.weight=blk.22.attn_norm.weight offset:6782097408 dims:[3584] number elems:3584 size:14336
Tensor:blk.25.attn_v.bias=blk.25.attn_v.bias offset:7595288576 dims:[512] number elems:512 size:2048
Tensor:blk.2.ffn_down.weight=blk.2.ffn_down.weight offset:1074391040 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.14.attn_v.bias=blk.14.attn_v.bias offset:3188258816 dims:[512] number elems:512 size:2048
Tensor:blk.10.attn_output.weight=blk.10.attn_output.weight offset:2314620928 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.3.attn_v.weight=blk.3.attn_v.weight offset:1567744000 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.12.attn_output.weight=blk.12.attn_output.weight offset:2809937920 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.12.attn_q.weight=blk.12.attn_q.weight offset:2823600128 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.14.attn_norm.weight=blk.14.attn_norm.weight offset:4149630976 dims:[3584] number elems:3584 size:14336
Tensor:blk.25.ffn_up.weight=blk.25.ffn_up.weight offset:7493873664 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.14.attn_output.weight=blk.14.attn_output.weight offset:3160948736 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.4.attn_k.weight=blk.4.attn_k.weight offset:1786140672 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.12.ffn_up.weight=blk.12.ffn_up.weight offset:2735833088 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.16.attn_output.weight=blk.16.attn_output.weight offset:4759992320 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.17.attn_v.weight=blk.17.attn_v.weight offset:5034962944 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.13.attn_q.weight=blk.13.attn_q.weight offset:3071258624 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.17.attn_k.weight=blk.17.attn_k.weight offset:5005701120 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.7.attn_v.bias=blk.7.attn_v.bias offset:3652362240 dims:[512] number elems:512 size:2048
Tensor:blk.2.attn_v.weight=blk.2.attn_v.weight offset:1320085504 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.5.attn_v.bias=blk.5.attn_v.bias offset:2063058944 dims:[512] number elems:512 size:2048
Tensor:blk.3.attn_v.bias=blk.3.attn_v.bias offset:1567741952 dims:[512] number elems:512 size:2048
Tensor:blk.23.attn_output.weight=blk.23.attn_output.weight offset:7072661504 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.22.attn_output.weight=blk.22.attn_output.weight offset:6173775872 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.24.attn_output.weight=blk.24.attn_output.weight offset:7320320000 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.25.attn_output.weight=blk.25.attn_output.weight offset:7567978496 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.19.attn_output.weight=blk.19.attn_output.weight offset:5502967808 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.26.attn_output.weight=blk.26.attn_output.weight offset:7815636992 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.26.ffn_up.weight=blk.26.ffn_up.weight offset:7741532160 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.1.attn_v.bias=blk.1.attn_v.bias offset:1072424960 dims:[512] number elems:512 size:2048
Tensor:blk.11.ffn_up.weight=blk.11.ffn_up.weight offset:2488174592 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.17.attn_output.weight=blk.17.attn_output.weight offset:5007650816 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.18.attn_output.weight=blk.18.attn_output.weight offset:5255309312 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.19.attn_k.weight=blk.19.attn_k.weight offset:5501018112 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.14.attn_q.weight=blk.14.attn_q.weight offset:3174610944 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.16.attn_v.weight=blk.16.attn_v.weight offset:4787304448 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.18.attn_k.weight=blk.18.attn_k.weight offset:5253359616 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.3.attn_k.weight=blk.3.attn_k.weight offset:1538482176 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.27.attn_output.weight=blk.27.attn_output.weight offset:8063295488 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.21.attn_output.weight=blk.21.attn_output.weight offset:5998284800 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.20.attn_output.weight=blk.20.attn_output.weight offset:5750626304 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.1.attn_v.weight=blk.1.attn_v.weight offset:1072427008 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.1.ffn_up.weight=blk.1.ffn_up.weight offset:971010048 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.24.ffn_norm.weight=blk.24.ffn_norm.weight offset:7318353920 dims:[3584] number elems:3584 size:14336
Tensor:blk.11.attn_output.weight=blk.11.attn_output.weight offset:2562279424 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.23.ffn_up.weight=blk.23.ffn_up.weight offset:6998556672 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.23.attn_k.weight=blk.23.attn_k.weight offset:7070711808 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.19.ffn_gate.weight=blk.19.ffn_gate.weight offset:5356724224 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.27.ffn_norm.weight=blk.27.ffn_norm.weight offset:8061329408 dims:[3584] number elems:3584 size:14336
Tensor:blk.4.attn_output.weight=blk.4.attn_output.weight offset:1788090368 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.6.attn_q.weight=blk.6.attn_q.weight offset:2080624640 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.6.attn_k.weight=blk.6.attn_k.weight offset:2065012736 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.21.ffn_norm.weight=blk.21.ffn_norm.weight offset:5996318720 dims:[3584] number elems:3584 size:14336
Tensor:blk.6.ffn_up.weight=blk.6.ffn_up.weight offset:3334502400 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.19.attn_q.weight=blk.19.attn_q.weight offset:5516630016 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.9.ffn_down.weight=blk.9.ffn_down.weight offset:3901986816 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.6.ffn_down.weight=blk.6.ffn_down.weight offset:3190224896 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.13.ffn_gate.weight=blk.13.ffn_gate.weight offset:2911352832 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.10.ffn_gate.weight=blk.10.ffn_gate.weight offset:2168377344 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.14.ffn_up.weight=blk.14.ffn_up.weight offset:4221784064 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.21.ffn_down.weight=blk.21.ffn_down.weight offset:5779902464 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.24.ffn_down.weight=blk.24.ffn_down.weight offset:7101937664 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.23.attn_q.weight=blk.23.attn_q.weight offset:7086323712 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.27.ffn_down.weight=blk.27.ffn_down.weight offset:7844913152 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.16.ffn_gate.weight=blk.16.ffn_gate.weight offset:4613748736 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.5.attn_norm.weight=blk.5.attn_norm.weight offset:1817352192 dims:[3584] number elems:3584 size:14336
Tensor:blk.1.attn_norm.weight=blk.1.attn_norm.weight offset:826718208 dims:[3584] number elems:3584 size:14336
Tensor:blk.4.attn_v.weight=blk.4.attn_v.weight offset:1815402496 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.18.ffn_down.weight=blk.18.ffn_down.weight offset:5036926976 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.6.ffn_norm.weight=blk.6.ffn_norm.weight offset:3406641152 dims:[3584] number elems:3584 size:14336
Tensor:blk.9.ffn_norm.weight=blk.9.ffn_norm.weight offset:4118403072 dims:[3584] number elems:3584 size:14336
Tensor:blk.3.ffn_norm.weight=blk.3.ffn_norm.weight offset:1538465792 dims:[3584] number elems:3584 size:14336
Tensor:blk.22.ffn_gate.weight=blk.22.ffn_gate.weight offset:6027546624 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.15.ffn_down.weight=blk.15.ffn_down.weight offset:4293951488 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.12.ffn_down.weight=blk.12.ffn_down.weight offset:2591555584 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.0.ffn_norm.weight=blk.0.ffn_norm.weight offset:795490304 dims:[3584] number elems:3584 size:14336
Tensor:blk.1.attn_k.weight=blk.1.attn_k.weight offset:1043165184 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.1.attn_q.weight=blk.1.attn_q.weight offset:1058777088 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.7.ffn_gate.weight=blk.7.ffn_gate.weight offset:3478808576 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.4.ffn_gate.weight=blk.4.ffn_gate.weight offset:1641846784 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.1.ffn_gate.weight=blk.1.ffn_gate.weight offset:898871296 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.25.attn_v.weight=blk.25.attn_v.weight offset:7595290624 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.12.attn_v.weight=blk.12.attn_v.weight offset:2837250048 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.12.ffn_norm.weight=blk.12.ffn_norm.weight offset:2807971840 dims:[3584] number elems:3584 size:14336
Tensor:blk.15.ffn_norm.weight=blk.15.ffn_norm.weight offset:4510367744 dims:[3584] number elems:3584 size:14336
Tensor:blk.18.ffn_norm.weight=blk.18.ffn_norm.weight offset:5253343232 dims:[3584] number elems:3584 size:14336
Tensor:blk.25.ffn_gate.weight=blk.25.ffn_gate.weight offset:7421734912 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.9.attn_norm.weight=blk.9.attn_norm.weight offset:3901972480 dims:[3584] number elems:3584 size:14336
Tensor:blk.13.attn_k.weight=blk.13.attn_k.weight offset:3055646720 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.0.attn_k.weight=blk.0.attn_k.weight offset:795506688 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.24.attn_norm.weight=blk.24.attn_norm.weight offset:7101923328 dims:[3584] number elems:3584 size:14336
Tensor:blk.24.attn_q.weight=blk.24.attn_q.weight offset:7333982208 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.3.ffn_down.weight=blk.3.ffn_down.weight offset:1322049536 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.18.attn_q.weight=blk.18.attn_q.weight offset:5268971520 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.0.ffn_down.weight=blk.0.ffn_down.weight offset:579074048 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.11.attn_v.weight=blk.11.attn_v.weight offset:2589591552 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.26.attn_v.weight=blk.26.attn_v.weight offset:7842949120 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.0.ffn_up.weight=blk.0.ffn_up.weight offset:723351552 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.13.attn_norm.weight=blk.13.attn_norm.weight offset:2839199744 dims:[3584] number elems:3584 size:14336
Tensor:blk.14.attn_k.weight=blk.14.attn_k.weight offset:3158999040 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.15.ffn_up.weight=blk.15.ffn_up.weight offset:4438228992 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.11.attn_k.bias=blk.11.attn_k.bias offset:2560327680 dims:[512] number elems:512 size:2048
Tensor:blk.15.attn_k.bias=blk.15.attn_k.bias offset:4510382080 dims:[512] number elems:512 size:2048
Tensor:blk.17.attn_v.bias=blk.17.attn_v.bias offset:5034960896 dims:[512] number elems:512 size:2048
Tensor:blk.4.attn_norm.weight=blk.4.attn_norm.weight offset:1569693696 dims:[3584] number elems:3584 size:14336
Tensor:blk.10.attn_norm.weight=blk.10.attn_norm.weight offset:2096224256 dims:[3584] number elems:3584 size:14336
Tensor:blk.19.attn_v.bias=blk.19.attn_v.bias offset:5530277888 dims:[512] number elems:512 size:2048
Tensor:blk.6.attn_output.weight=blk.6.attn_output.weight offset:2066962432 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.10.attn_k.weight=blk.10.attn_k.weight offset:2312671232 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.17.attn_k.bias=blk.17.attn_k.bias offset:5005699072 dims:[512] number elems:512 size:2048
Tensor:blk.10.attn_v.weight=blk.10.attn_v.weight offset:2341933056 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.2.attn_output.weight=blk.2.attn_output.weight offset:1292773376 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.20.attn_norm.weight=blk.20.attn_norm.weight offset:5532229632 dims:[3584] number elems:3584 size:14336
Tensor:blk.8.ffn_norm.weight=blk.8.ffn_norm.weight offset:3870744576 dims:[3584] number elems:3584 size:14336
Tensor:blk.2.ffn_norm.weight=blk.2.ffn_norm.weight offset:1290807296 dims:[3584] number elems:3584 size:14336
Tensor:blk.5.ffn_norm.weight=blk.5.ffn_norm.weight offset:2033782784 dims:[3584] number elems:3584 size:14336
Tensor:blk.13.attn_k.bias=blk.13.attn_k.bias offset:3055644672 dims:[512] number elems:512 size:2048
Tensor:blk.22.attn_k.weight=blk.22.attn_k.weight offset:6171826176 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.12.attn_norm.weight=blk.12.attn_norm.weight offset:2591541248 dims:[3584] number elems:3584 size:14336
Tensor:blk.7.ffn_up.weight=blk.7.ffn_up.weight offset:3550947328 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.27.attn_k.weight=blk.27.attn_k.weight offset:8061345792 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.8.attn_v.bias=blk.8.attn_v.bias offset:3900020736 dims:[512] number elems:512 size:2048
Tensor:blk.20.attn_v.bias=blk.20.attn_v.bias offset:5777936384 dims:[512] number elems:512 size:2048
Tensor:blk.11.attn_v.bias=blk.11.attn_v.bias offset:2589589504 dims:[512] number elems:512 size:2048
Tensor:blk.5.attn_q.weight=blk.5.attn_q.weight offset:2049411072 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.12.attn_k.weight=blk.12.attn_k.weight offset:2807988224 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.13.attn_v.bias=blk.13.attn_v.bias offset:3084906496 dims:[512] number elems:512 size:2048
Tensor:blk.22.attn_q.weight=blk.22.attn_q.weight offset:6187438080 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.22.attn_v.bias=blk.22.attn_v.bias offset:6201085952 dims:[512] number elems:512 size:2048
Tensor:blk.24.attn_v.bias=blk.24.attn_v.bias offset:7347630080 dims:[512] number elems:512 size:2048
Tensor:blk.24.attn_v.weight=blk.24.attn_v.weight offset:7347632128 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.19.attn_k.bias=blk.19.attn_k.bias offset:5501016064 dims:[512] number elems:512 size:2048
Tensor:blk.8.attn_v.weight=blk.8.attn_v.weight offset:3900022784 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.2.ffn_up.weight=blk.2.ffn_up.weight offset:1218668544 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.26.attn_v.bias=blk.26.attn_v.bias offset:7842947072 dims:[512] number elems:512 size:2048
Tensor:blk.20.attn_v.weight=blk.20.attn_v.weight offset:5777938432 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.15.attn_v.bias=blk.15.attn_v.bias offset:4539643904 dims:[512] number elems:512 size:2048
Tensor:blk.1.ffn_down.weight=blk.1.ffn_down.weight offset:826732544 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.8.attn_q.weight=blk.8.attn_q.weight offset:3886372864 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.15.attn_norm.weight=blk.15.attn_norm.weight offset:4293937152 dims:[3584] number elems:3584 size:14336
Tensor:blk.9.ffn_gate.weight=blk.9.ffn_gate.weight offset:3974125568 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.4.ffn_down.weight=blk.4.ffn_down.weight offset:1569708032 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.7.ffn_down.weight=blk.7.ffn_down.weight offset:3406669824 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:token_embd.weight=token_embd.weight offset:0 dims:[3584, 152064] number elems:544997376 size:579059712
Tensor:blk.3.attn_output.weight=blk.3.attn_output.weight offset:1540431872 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.26.attn_k.weight=blk.26.attn_k.weight offset:7813687296 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.5.attn_output.weight=blk.5.attn_output.weight offset:2035748864 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.9.attn_output.weight=blk.9.attn_output.weight offset:4120369152 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.7.attn_output.weight=blk.7.attn_output.weight offset:3625052160 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.23.attn_norm.weight=blk.23.attn_norm.weight offset:6854264832 dims:[3584] number elems:3584 size:14336
Tensor:blk.25.attn_k.weight=blk.25.attn_k.weight offset:7566028800 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.1.attn_output.weight=blk.1.attn_output.weight offset:1045114880 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.3.ffn_gate.weight=blk.3.ffn_gate.weight offset:1394188288 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.6.ffn_gate.weight=blk.6.ffn_gate.weight offset:3262363648 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.3.ffn_up.weight=blk.3.ffn_up.weight offset:1466327040 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.18.ffn_up.weight=blk.18.ffn_up.weight offset:5181204480 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.9.attn_q.weight=blk.9.attn_q.weight offset:4134031360 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.6.attn_v.bias=blk.6.attn_v.bias offset:2094272512 dims:[512] number elems:512 size:2048
Tensor:blk.21.attn_q.weight=blk.21.attn_q.weight offset:6011947008 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.21.attn_v.weight=blk.21.attn_v.weight offset:6025596928 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.15.ffn_gate.weight=blk.15.ffn_gate.weight offset:4366090240 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.4.attn_v.bias=blk.4.attn_v.bias offset:1815400448 dims:[512] number elems:512 size:2048
Tensor:blk.9.attn_v.weight=blk.9.attn_v.weight offset:4147681280 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.20.ffn_norm.weight=blk.20.ffn_norm.weight offset:5748660224 dims:[3584] number elems:3584 size:14336
Tensor:blk.12.ffn_gate.weight=blk.12.ffn_gate.weight offset:2663694336 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.7.attn_q.weight=blk.7.attn_q.weight offset:3638714368 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:output_norm.weight=output_norm.weight offset:8092557312 dims:[3584] number elems:3584 size:14336
Tensor:blk.0.attn_v.bias=blk.0.attn_v.bias offset:824766464 dims:[512] number elems:512 size:2048
Tensor:blk.23.ffn_norm.weight=blk.23.ffn_norm.weight offset:7070695424 dims:[3584] number elems:3584 size:14336
Tensor:blk.0.ffn_gate.weight=blk.0.ffn_gate.weight offset:651212800 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.5.ffn_up.weight=blk.5.ffn_up.weight offset:1961644032 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.24.attn_k.weight=blk.24.attn_k.weight offset:7318370304 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.2.attn_v.bias=blk.2.attn_v.bias offset:1320083456 dims:[512] number elems:512 size:2048
Tensor:blk.19.ffn_up.weight=blk.19.ffn_up.weight offset:5428862976 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.4.ffn_up.weight=blk.4.ffn_up.weight offset:1713985536 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.11.ffn_norm.weight=blk.11.ffn_norm.weight offset:2560313344 dims:[3584] number elems:3584 size:14336
Tensor:blk.26.ffn_norm.weight=blk.26.ffn_norm.weight offset:7813670912 dims:[3584] number elems:3584 size:14336
Tensor:blk.13.ffn_down.weight=blk.13.ffn_down.weight offset:2839214080 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.14.ffn_norm.weight=blk.14.ffn_norm.weight offset:4293922816 dims:[3584] number elems:3584 size:14336
Tensor:blk.16.ffn_down.weight=blk.16.ffn_down.weight offset:4541609984 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.7.attn_norm.weight=blk.7.attn_norm.weight offset:3406655488 dims:[3584] number elems:3584 size:14336
Tensor:blk.18.attn_norm.weight=blk.18.attn_norm.weight offset:5036912640 dims:[3584] number elems:3584 size:14336
Tensor:blk.10.ffn_down.weight=blk.10.ffn_down.weight offset:2096238592 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.17.ffn_norm.weight=blk.17.ffn_norm.weight offset:5005684736 dims:[3584] number elems:3584 size:14336
Tensor:blk.19.ffn_down.weight=blk.19.ffn_down.weight offset:5284585472 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.26.attn_norm.weight=blk.26.attn_norm.weight offset:7597240320 dims:[3584] number elems:3584 size:14336
Tensor:blk.25.ffn_down.weight=blk.25.ffn_down.weight offset:7349596160 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.18.ffn_gate.weight=blk.18.ffn_gate.weight offset:5109065728 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.20.attn_q.weight=blk.20.attn_q.weight offset:5764288512 dims:[3584, 3584] number elems:12845056 size:13647872
Tensor:blk.22.attn_v.weight=blk.22.attn_v.weight offset:6201088000 dims:[3584, 512] number elems:1835008 size:1949696
Tensor:blk.27.ffn_gate.weight=blk.27.ffn_gate.weight offset:7917051904 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.22.ffn_down.weight=blk.22.ffn_down.weight offset:6782111744 dims:[18944, 3584] number elems:67895296 size:72138752
Tensor:blk.21.ffn_gate.weight=blk.21.ffn_gate.weight offset:5852041216 dims:[3584, 18944] number elems:67895296 size:72138752
Tensor:blk.24.ffn_gate.weight=blk.24.ffn_gate.weight offset:7174076416 dims:[3584, 18944] number elems:67895296 size:72138752
Load model: 332 millis
>hi
setFloat:0 of size:3584
setFloat:1 of size:3584
setFloat:2 of size:3584
setFloat:...SNIP

Exception in thread "main" java.lang.IndexOutOfBoundsException: Out of bound access on segment MemorySegment{ kind: mapped, address: 0x217e96b3f20, byteSize: 2048 }; new offset = 2048; new length = 4
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.outOfBoundException(AbstractMemorySegmentImpl.java:433)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.apply(AbstractMemorySegmentImpl.java:414)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.apply(AbstractMemorySegmentImpl.java:70)
        at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:98)
        at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:124)
        at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:448)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.checkBounds(AbstractMemorySegmentImpl.java:403)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.checkAccess(AbstractMemorySegmentImpl.java:357)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.checkEnclosingLayout(AbstractMemorySegmentImpl.java:362)
        at java.base/java.lang.invoke.SegmentVarHandle.checkSegment(SegmentVarHandle.java:92)
        at java.base/java.lang.invoke.VarHandleSegmentAsFloats.get(VarHandleSegmentAsFloats.java:59)
        at java.base/java.lang.invoke.VarHandleSegmentAsFloats.get(VarHandleSegmentAsFloats.java:53)
        at java.base/jdk.internal.foreign.AbstractMemorySegmentImpl.get(AbstractMemorySegmentImpl.java:754)
        at com.llama4j.FloatTensor.readFloat(Llama3.java:2174)
        at com.llama4j.F32FloatTensor.getFloat(Llama3.java:2946)
        at com.llama4j.FloatTensor.lambda$addInPlace$0(Llama3.java:2320)
        at com.llama4j.FloatTensor.mapWithIndexInPlace(Llama3.java:2314)
        at com.llama4j.FloatTensor.addInPlace(Llama3.java:2320)
        at com.llama4j.FloatTensor.addInPlace(Llama3.java:2324)
        at com.llama4j.Llama.forwardQwen(Llama3.java:1392)
        at com.llama4j.Llama.generateTokensQwen(Llama3.java:1596)
        at com.llama4j.Llama3.runInteractive(Llama3.java:149)
        at com.llama4j.Llama3.main(Llama3.java:338)
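
The numbers in the exception message can be checked by hand: a segment with `byteSize: 2048` holds 512 four-byte floats, so valid element indices are 0 to 511, and `new offset = 2048; new length = 4` is a read of element 512. The `setFloat:N of size:3584` lines suggest the loop runs over 3584 elements, which would walk past a 512-element operand such as the `attn_k.bias` or `attn_v.bias` tensors listed above. That is one reading of the log, not a confirmed diagnosis. The sketch below reproduces only the bounds arithmetic; `SegmentBoundsDemo`, `readFloat` and `readFails` are hypothetical names, and a heap segment stands in for the mapped one.

```java
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class SegmentBoundsDemo {
    // Reads the float at element `index`; the byte offset is index * Float.BYTES.
    static float readFloat(MemorySegment segment, long index) {
        return segment.get(ValueLayout.JAVA_FLOAT, index * Float.BYTES);
    }

    // Returns true if the segment's bounds check rejects a read of element `index`.
    static boolean readFails(MemorySegment segment, long index) {
        try {
            readFloat(segment, index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Assumption: a heap segment of 512 floats (2048 bytes) stands in for the mapped segment.
        MemorySegment bias = MemorySegment.ofArray(new float[512]);
        System.out.println("byteSize = " + bias.byteSize());
        System.out.println("index 511 fails: " + readFails(bias, 511));
        System.out.println("index 512 fails: " + readFails(bias, 512));
    }
}
```

This needs a JDK where `java.lang.foreign` is final (22 or later). Index 511 is the last readable element, and index 512 is the first one rejected, at byte offset 2048.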
@fangerer
Member

fangerer commented Jun 2, 2025

I was able to reproduce the problem (on Linux) but I think it is not related to GraalVM because I see exactly the same issue when running the example on OpenJDK 25 (EA build).

@neocoretechs
Author

Should I log this under the JDK issue tracker here?
https://bugreport.java.com/bugreport/

@neocoretechs
Author

Just FYI, I was encountering the same problem trying to run Gemma1.1-2B and DevstralQ4, but it works fine with all the Llama models and Mistral-7b-Q8.

@wirthi
Member

wirthi commented Jun 3, 2025

It feels more like this is a problem in @mukel's code. Can you have a look? It might still be a Java problem, but this seems rather unlikely.

@neocoretechs
Author

Logged a bug on bugreport.java.com; internal review ID 9078583.

@neocoretechs
Author

> It feels more like this is a problem in @mukel's code. Can you have a look? Might still be a Java problem but this seems rather unlikely.

If you look at what it's doing, it's fairly straightforward: it's traversing an array backed by a MemorySegment. It gets through part of it and then fails for no logical reason.
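
One way to narrow down a failure like this is to check operand sizes before the elementwise loop, so that a mismatch fails with a descriptive message instead of a bounds error deep inside the segment access. This is a minimal sketch under assumptions: `SizeGuardDemo` and this `addInPlace` are hypothetical stand-ins that use plain `float[]` arrays, not the `FloatTensor` class from the stack trace.

```java
public class SizeGuardDemo {
    // Hypothetical helper: adds `other` into `target` element by element,
    // failing fast with both lengths in the message when they differ.
    static void addInPlace(float[] target, float[] other) {
        if (target.length != other.length) {
            throw new IllegalArgumentException(
                "size mismatch: target has " + target.length
                    + " elements, other has " + other.length);
        }
        for (int i = 0; i < target.length; i++) {
            target[i] += other[i];
        }
    }

    public static void main(String[] args) {
        float[] activations = new float[3584]; // size taken from the setFloat log lines
        float[] bias = new float[512];         // 2048 bytes / 4 bytes per float
        try {
            addInPlace(activations, bias);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If a guard like this fires in the real code, the mismatch is in how the tensors are sized or sliced rather than in the segment access itself.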

@neocoretechs
Author

neocoretechs commented Jun 13, 2025

I figured out the new Magistral model and integrated it, and what do ya know, same IndexOutOfBoundsException! But now it's in the vector dot product. BTW, the report says Oracle confirmed it's in JDK 21 as well.
