Confusion about the 'warmup-decay' of the noise schedule #86

Hi, your work is outstanding, but I have a question.
**In the file gaussian_diffusion.py, line 36 reads as follows:**
```python
elif schedule_name == 'warmup-decay':
    warmup_steps = max(1, int(warmup_steps_ratio * num_diffusion_timesteps))
    sqrt_steps = get_named_beta_schedule('sqrt', num_diffusion_timesteps)
    beta_mid = sqrt_steps[-warmup_steps]
    warmup = np.linspace(beta_mid, 0.0001, warmup_steps)
    return np.concatenate([sqrt_steps[:-warmup_steps], warmup])
```
**Why are the beta values of the last steps reduced to 0.0001, and what is the benefit?**
**Shouldn't the warm-up start at the very beginning instead, namely increasing from 0.0001, like this?**
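```python
elif schedule_name == 'warmup-decay':
    warmup_steps = max(1, int(warmup_steps_ratio * num_diffusion_timesteps))
    # sqrt_steps must be defined here as well, as in the original branch
    sqrt_steps = get_named_beta_schedule('sqrt', num_diffusion_timesteps)
    # ramp up from 0.0001 to the sqrt schedule's value at index warmup_steps
    warmup = np.linspace(0.0001, sqrt_steps[warmup_steps], warmup_steps)
    return np.concatenate([warmup, sqrt_steps[warmup_steps:]])
```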
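
To make the comparison concrete, here is a minimal, self-contained sketch of both variants that can be run outside the repository. It assumes the 'sqrt' schedule is derived from alpha_bar(t) = 1 - sqrt(t + 0.0001) with betas clipped at 0.999; that is my assumption, not something I have checked against `get_named_beta_schedule`. The names `sqrt_betas`, `warmup_decay_tail`, and `warmup_head` are hypothetical stand-ins, and the ratio 0.1 is only an example value.

```python
import numpy as np

def sqrt_betas(num_steps, max_beta=0.999):
    # ASSUMPTION: stand-in for get_named_beta_schedule('sqrt', num_steps),
    # built from alpha_bar(t) = 1 - sqrt(t + 1e-4) and clipped at max_beta.
    t = np.arange(num_steps + 1) / num_steps
    alpha_bar = 1.0 - np.sqrt(t + 1e-4)
    return np.minimum(1.0 - alpha_bar[1:] / alpha_bar[:-1], max_beta)

def warmup_decay_tail(num_steps, ratio):
    # Variant in the file: the LAST warmup_steps betas are replaced by a
    # linear ramp from the sqrt value at that point down to 0.0001.
    warmup_steps = max(1, int(ratio * num_steps))
    base = sqrt_betas(num_steps)
    tail = np.linspace(base[-warmup_steps], 0.0001, warmup_steps)
    return np.concatenate([base[:-warmup_steps], tail])

def warmup_head(num_steps, ratio):
    # Variant proposed in this issue: the FIRST warmup_steps betas are
    # replaced by a linear ramp from 0.0001 up to the sqrt value.
    warmup_steps = max(1, int(ratio * num_steps))
    base = sqrt_betas(num_steps)
    head = np.linspace(0.0001, base[warmup_steps], warmup_steps)
    return np.concatenate([head, base[warmup_steps:]])

num_steps, ratio = 2000, 0.1  # example values only
tail_betas = warmup_decay_tail(num_steps, ratio)
head_betas = warmup_head(num_steps, ratio)

# Compare the first and last beta and the final cumulative alpha_bar,
# which indicates how close x_T is to pure noise under each variant.
for name, betas in [("tail", tail_betas), ("head", head_betas)]:
    final_alpha_bar = np.cumprod(1.0 - betas)[-1]
    print(name, betas[0], betas[-1], final_alpha_bar)
```

Under these assumptions, the two variants differ only in which end of the sqrt schedule is overwritten, so printing the final cumulative product should show how each choice affects the noise level at the last timestep.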
