Incorrect MD5 checksums being used for some files

### Description of the bug

In some studies, I've been seeing a subset of files fail the checksum step. Upon investigation, for at least some of the failing samples, the cause appears to be that the pipeline picks the wrong MD5 value from the metadata.

For example, manually downloading this file completes and yields a `7b7e0` checksum:

```
(aspera) jonsan@nf-head:~/fetchngs/EA_pharma/fetchngs_exec/test$ ascp     -QT -l 300m -P33001     -i $CONDA_PREFIX/etc/aspera/aspera_bypass_dsa.pem     era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/SRR170/000/SRR17001000/SRR17001000.fastq.gz     SRX13191258_SRR17001000_1.fastq.gz
SRX13191258_SRR17001000_1.fastq.gz                                                                                                                 100%   26MB 15.3Mb/s    00:10
Completed: 26745K bytes transferred in 11 seconds
 (19634K bits/sec), in 1 file.
(aspera) jonsan@nf-head:~/fetchngs/EA_pharma/fetchngs_exec/test$ md5sum SRX13191258_SRR17001000_1.fastq.gz
7b7e0af5429bcb54b2c232489ea8212b  SRX13191258_SRR17001000_1.fastq.gz
```

However, looking at the `command.sh` file for this operation, the pipeline is comparing with a `3fcee` checksum:

```
#!/bin/bash -euo pipefail
ascp \
    -QT -l 300m -P33001 \
    -i $CONDA_PREFIX/etc/aspera/aspera_bypass_dsa.pem \
    era-fasp@fasp.sra.ebi.ac.uk:/vol1/fastq/SRR170/000/SRR17001000/SRR17001000.fastq.gz \
    SRX13191258_SRR17001000_1.fastq.gz

echo "3fcee2e72a2ec6221cac142538aff092  SRX13191258_SRR17001000_1.fastq.gz" > SRX13191258_SRR17001000_1.fastq.gz.md5
md5sum -c SRX13191258_SRR17001000_1.fastq.gz.md5
```

If we look at the metadata downloaded for this run, we see both checksums: `7b7e0` is the first entry of `fastq_md5`, while `3fcee` is the second entry of `fastq_md5` and also the value of `md5_1`:

```
fastq_md5	7b7e0af5429bcb54b2c232489ea8212b;3fcee2e72a2ec6221cac142538aff092;383df08e03e1cd1ee071fd67c16b085b
fastq_bytes	27387589;1445187226;1481254395
fastq_ftp	ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_2.fastq.gz
fastq_galaxy	ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_2.fastq.gz
fastq_aspera	fasp.sra.ebi.ac.uk:/vol1/fastq/SRR170/000/SRR17001000/SRR17001000.fastq.gz;fasp.sra.ebi.ac.uk:/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_1.fastq.gz;fasp.sra.ebi.ac.uk:/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_2.fastq.gz
fastq_1	ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_1.fastq.gz
fastq_2	ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_2.fastq.gz
md5_1	3fcee2e72a2ec6221cac142538aff092
md5_2	383df08e03e1cd1ee071fd67c16b085b
```
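
To make the mismatch easier to see, here is a minimal sketch that pairs the `fastq_md5` and `fastq_ftp` entries by position and prints them in `md5sum -c` format. It assumes the two fields are always in the same order, which holds for this run but which I have not verified in general. The helper name `pair_md5_with_files` is made up for illustration and is not part of the pipeline.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not pipeline code). Takes the semicolon-separated
# fastq_md5 and fastq_ftp fields and prints one "md5  filename" line per
# entry, pairing the two fields by position.
pair_md5_with_files() {
    local md5_field=$1 ftp_field=$2
    local -a md5s urls
    IFS=';' read -r -a md5s <<< "$md5_field"
    IFS=';' read -r -a urls <<< "$ftp_field"
    if [ "${#md5s[@]}" -ne "${#urls[@]}" ]; then
        echo "fastq_md5 and fastq_ftp have different lengths" >&2
        return 1
    fi
    local i
    for i in "${!urls[@]}"; do
        # ${var##*/} strips everything up to the last slash (the file name).
        printf '%s  %s\n' "${md5s[$i]}" "${urls[$i]##*/}"
    done
}

# Values copied from the metadata for SRR17001000 shown above.
fastq_md5='7b7e0af5429bcb54b2c232489ea8212b;3fcee2e72a2ec6221cac142538aff092;383df08e03e1cd1ee071fd67c16b085b'
fastq_ftp='ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_2.fastq.gz'

pair_md5_with_files "$fastq_md5" "$fastq_ftp"
```

For this run it prints `7b7e0...` next to `SRR17001000.fastq.gz` and `3fcee...` next to `SRR17001000_1.fastq.gz`, which matches the checksum I got from the manual download of the first file.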

It appears that there are three FASTQ files, and the workflow is grabbing the first one (maybe an index read? it's much smaller than the other two), renaming it to `_1.fastq.gz`, and then comparing it against the MD5 of the real `_1.fastq.gz` file. I haven't looked in the code yet to find the logic that splits reads 1 and 2, but it might be making too liberal an assumption about the structure of the `fastq_ftp` field.
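
One possible way to avoid this, sketched here only as an illustration and not as what the pipeline currently does, would be to select each read by its file name suffix and take the URL and checksum from the same index. The function name `select_read` and the suffix convention `_<read>.fastq.gz` are assumptions on my part.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not pipeline code). Arguments: read number (1 or 2),
# then the semicolon-separated fastq_ftp and fastq_md5 fields.
# Prints "url<TAB>md5" for the entry whose file name ends in
# _<read>.fastq.gz, so the URL and checksum always come from the same index.
select_read() {
    local read_no=$1 ftp_field=$2 md5_field=$3
    local -a urls md5s
    IFS=';' read -r -a urls <<< "$ftp_field"
    IFS=';' read -r -a md5s <<< "$md5_field"
    if [ "${#urls[@]}" -ne "${#md5s[@]}" ]; then
        echo "fastq_ftp and fastq_md5 have different lengths" >&2
        return 1
    fi
    local i
    for i in "${!urls[@]}"; do
        if [[ ${urls[$i]} == *"_${read_no}.fastq.gz" ]]; then
            printf '%s\t%s\n' "${urls[$i]}" "${md5s[$i]}"
            return 0
        fi
    done
    echo "no entry found for read ${read_no}" >&2
    return 1
}

# Values copied from the metadata for SRR17001000 shown above.
fastq_ftp='ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR170/000/SRR17001000/SRR17001000_2.fastq.gz'
fastq_md5='7b7e0af5429bcb54b2c232489ea8212b;3fcee2e72a2ec6221cac142538aff092;383df08e03e1cd1ee071fd67c16b085b'

select_read 1 "$fastq_ftp" "$fastq_md5"
select_read 2 "$fastq_ftp" "$fastq_md5"
```

With the values from this run, read 1 resolves to `SRR17001000_1.fastq.gz` with `3fcee...` and read 2 to `SRR17001000_2.fastq.gz` with `383df...`; the unsuffixed `SRR17001000.fastq.gz` is skipped.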

Maybe related to issue #260?

Either way, this is leading to failed downloads, so it seems like it should properly be considered a bug.

### Command used and terminal output

```console

```

### Relevant files

_No response_

### System information

_No response_

Incorrect MD5 checksums being used for some files #331
