
Commit 10786e3

fixing DS yaml by adding gradient clipping: 0.3, and small update to README (#726)

1 parent cbb109d

2 files changed (+15, -11 lines)


llama2_70b_lora/README.md

Lines changed: 14 additions & 11 deletions
````diff
@@ -16,7 +16,7 @@ pip install -r requirements.txt
 
 You will also need to run the following to install flash attention:
 ```
-pip install flash-attn --no-build-isolation
+pip install flash-attn==2.1.0 --no-build-isolation
 ```
 
 > For flash attention, make sure that the following command returns 0:
````
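
The command referenced here sits outside this hunk, so the diff does not show it; a minimal stand-in, assuming the intent is simply that flash-attn imports cleanly after the pinned install (the repository's actual check may differ):

```bash
# Hypothetical sanity check -- the README's real command is not part of this hunk.
# Import flash-attn and echo the exit status; 0 means the install is importable.
python -c "import flash_attn" ; echo $?
```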
````diff
@@ -52,27 +52,30 @@ As defaults the scripts assume the model is under at ```./llama-v2-fused-qkv```
 Run:
 ```bash
 accelerate launch --config_file configs/default_config.yaml scripts/train.py \
---model_name meta-llama/Llama-2-70b-hf \
---dataset_name "tau/scrolls" --dataset_config_name "gov_report" \
+--dataset_path "./dataset" \
+--model_path "/software/users/ihubara/lora_clean/llama-v2-fused-qkv" \
 --max_seq_len 8192 \
 --bf16 True \
---logging_steps 1 \
---eval_steps 22 \
---output_dir "/tmp/llama-70b" \
+--logging_steps 24 \
+--eval_steps 48 \
+--output_dir "./results/llama-70b_scrolls_gov_report_r16_$1" \
 --per_device_train_batch_size 1 \
 --gradient_accumulation_steps 1 \
---dataset_text_field "input" \
 --lr_scheduler_type "cosine" \
---learning_rate 1e-3 \
---warmup_ratio 0.03 \
+--learning_rate 4e-4 \
+--weight_decay 0.0001 \
+--warmup_ratio 0 \
+--max_grad_norm 0.3 \
 --use_gradient_checkpointing True \
+--target_eval_loss 0.925 \
 --use_peft_lora True \
 --lora_r 16 \
 --lora_alpha 32 \
 --lora_dropout 0.1 \
---max_steps 440 \
+--max_steps 1024 \
 --use_flash_attn \
---lora_target_modules "q_proj,v_proj,k_proj,o_proj"
+--seed 1234 \
+--lora_target_modules "qkv_proj,o_proj"
 ```
 where the Accelerate config file is [this one](https://github.com/regisss/lora/blob/main/configs/default_config.yaml).
 
````
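
The new `--output_dir` interpolates `$1`, which suggests the command is meant to live in a wrapper script that takes a run tag as its first positional argument; a sketch under that assumption (the script name is illustrative and not part of this commit):

```bash
# Hypothetical wrapper invocation: "$1" inside --output_dir becomes the run tag.
bash run_llama70b_lora.sh run0
# results would then land in ./results/llama-70b_scrolls_gov_report_r16_run0
```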

llama2_70b_lora/configs/default_config.yaml

Lines changed: 1 addition & 0 deletions
```diff
@@ -1,6 +1,7 @@
 compute_environment: LOCAL_MACHINE
 debug: false
 deepspeed_config:
+  gradient_clipping: 0.3
   gradient_accumulation_steps: 1
   offload_optimizer_device: none
   offload_param_device: none
```
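
The new `gradient_clipping: 0.3` mirrors the `--max_grad_norm 0.3` flag added to the README command above, keeping the clip value consistent between the trainer arguments and the DeepSpeed config. A quick way to confirm the value landed before launching (a simple check run from the `llama2_70b_lora` directory, not something this commit adds):

```bash
# Verify the DeepSpeed gradient clipping value in the Accelerate config.
grep -n "gradient_clipping" configs/default_config.yaml
# expected: 4:  gradient_clipping: 0.3
```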
