PBFGen Usage
PBFGen reads in instrument data files and converts them to PBF files, the format used elsewhere in the Informed-Proteomics suite.
- *.mzML and *.mzML.gz
Note: The architecture (x86 or x64) of any additional reader software, such as ProteoWizard, matters. If a correct version is not installed, PBFGen and other Informed-Proteomics tools that try to read these formats will report an error that states which version you need to install.
- *.raw (Thermo *.raw format)
- Assuming you run this program on a 64-bit computer, reading Thermo Finnigan .raw files is supported via the included RawFileReader DLLs (no additional software install required).
- The following formats (and others) can be read if 64-bit ProteoWizard is installed:
- *.mzXML, *.mzXML.gz
- *.raw (Thermo *.raw format, also can be read via Thermo MSFileReader)
- *.mzML, *.mzML.gz (supported internally, but ProteoWizard is used to read gzipped mzML files when it is available, for speed and memory usage reasons)
- *.mgf (Mascot Generic Format)
- *.d folders (Agilent and Bruker .d folder formats)
- *.wiff (AB Sciex *.wiff format)
- *.u2, FID (older Bruker formats)
- *.raw folders (Waters *.raw folder format)
- *.lcd (Shimadzu *.lcd format)
We have only fully tested PBFGen, ProMex and MSPathFinder on *.mzML, *.mzML.gz, *.raw (Thermo), *.mzXML, and *.mzXML.gz formats (with centroided data in the mzML and mzXML formats; with Thermo *.raw we use Thermo's centroiding for profile data). While the full list of files above can be read for conversion into PBF, their valid conversion into PBF or their utility for performing feature finding or database searches is not tested and cannot be guaranteed in any fashion. For best results, the file provided to PBFGen or other tools in the Informed-Proteomics suite should contain scan data with MS and MS/MS data, with centroided peaks (when not a vendor format).
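As a rough illustration of the distinction above, the following Python sketch checks whether a file name ends in one of the fully tested extensions. The function name and the constant are hypothetical helpers, not part of the Informed-Proteomics suite, and the check looks only at the name: it cannot tell a Thermo *.raw file from a Waters *.raw folder, and it does not verify that the peaks are centroided.

```python
# Extensions listed in the text as fully tested (compared case-insensitively).
# Hypothetical helper; not part of PBFGen or the Informed-Proteomics suite.
FULLY_TESTED_SUFFIXES = (".mzml", ".mzml.gz", ".raw", ".mzxml", ".mzxml.gz")


def is_fully_tested_format(file_name: str) -> bool:
    """Return True if the file name ends in a fully tested extension.

    Name-based only: a Waters *.raw folder also ends in ".raw", so a caller
    that needs to distinguish it from a Thermo *.raw file must check that
    separately (for example, by testing whether the path is a directory).
    """
    return file_name.lower().endswith(FULLY_TESTED_SUFFIXES)


if __name__ == "__main__":
    for name in ("MyDataset.raw", "MyDataset.mzML.gz", "MyDataset.mgf"):
        status = "fully tested" if is_fully_tested_format(name) else "readable, but untested"
        print(f"{name}: {status}")
```

A check like this could be run before a batch conversion to flag inputs whose conversion results should be inspected more carefully.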
PbfGen version 1.0.7569 (September 21, 2020)

Usage: PbfGen.exe

  -?, -help         Show this help screen

  -s, arg#1         Required. Raw file path: *.raw or directory. (also
                    supports other input formats, see documentation)

  -o                Output directory. Defaults to directory containing input
                    file.

  -start            Start scan number (use to limit scan range included in
                    .pbf file). (Default: -1, Min: -1)

  -end              End scan number (use to limit scan range included in .pbf
                    file). (Default: -1, Min: -1)

  -ParamFile        Path to a file containing program parameters. Additional
                    arguments on the command line can supplement or override
                    the arguments in the param file. Lines starting with '#'
                    or ';' will be treated as comments; blank lines are
                    ignored. Lines that start with text that does not match a
                    parameter will also be ignored.

  -CreateParamFile  Create an example parameter file. Can supply a path; if
                    path is not supplied, the example parameter file content
                    will output to the console.

  NOTE: arg#1, arg#2, etc. refer to positional arguments, used
        like "AppName.exe [arg#1] [arg#2] [other args]".
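The comment and blank-line rules described for -ParamFile can be sketched in Python as follows. This is not PbfGen's own parser. It assumes that leading whitespace is ignored before checking for '#' or ';', and the `Key=Value` layout of the sample lines is also an assumption; running PbfGen.exe with -CreateParamFile shows the real layout. The sketch does not model the third rule (lines that match no parameter are ignored), since that depends on the program's parameter list.

```python
from typing import List


def candidate_parameter_lines(text: str) -> List[str]:
    """Drop blank lines and comment lines, keeping everything else.

    Mirrors only the documented rules: lines starting with '#' or ';' are
    comments and blank lines are ignored. Stripping leading whitespace
    before the check is an assumption of this sketch.
    """
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank line
        if stripped.startswith(("#", ";")):
            continue  # comment line
        kept.append(stripped)
    return kept


if __name__ == "__main__":
    # Hypothetical file content; the Key=Value form is assumed, not documented here.
    sample = "# scan range\n\n; another comment\nStart=2000\nEnd=3000\n"
    print(candidate_parameter_lines(sample))
```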
These commands will create MyDataset.pbf in the same directory as MyDataset.raw or MyDataset.mzML:
PbfGen.exe -s MyDataset.raw
PbfGen.exe -s MyDataset.mzML
These commands will create MyDataset.pbf in the directory C:\WorkFolder:
PbfGen.exe -s MyDataset.raw -o C:\WorkFolder
PbfGen.exe -s MyDataset.mzML -o C:\WorkFolder
Optionally, use -start and -end to limit the scan range included in the .pbf file:
PbfGen.exe -s Dataset.raw -start 2000 -end 3000
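When many datasets need converting, the commands above can be assembled from a script. The Python sketch below builds the argument list using only the documented -s, -o, -start, and -end options. The function names are hypothetical, PbfGen.exe is assumed to be on the PATH, and the meaning of its exit codes is not described here, so the return code is passed back to the caller without interpretation.

```python
import subprocess
from typing import Iterable, List, Optional


def build_pbfgen_command(
    input_path: str,
    output_dir: Optional[str] = None,
    start_scan: Optional[int] = None,
    end_scan: Optional[int] = None,
    exe: str = "PbfGen.exe",  # assumed to be on the PATH
) -> List[str]:
    """Return the argument list for one PbfGen run."""
    cmd = [exe, "-s", input_path]
    if output_dir is not None:
        cmd += ["-o", output_dir]
    if start_scan is not None:
        cmd += ["-start", str(start_scan)]
    if end_scan is not None:
        cmd += ["-end", str(end_scan)]
    return cmd


def convert_all(inputs: Iterable[str], output_dir: Optional[str] = None) -> List[int]:
    """Run PbfGen once per input file and collect the process return codes."""
    return_codes = []
    for path in inputs:
        completed = subprocess.run(build_pbfgen_command(path, output_dir))
        return_codes.append(completed.returncode)
    return return_codes


if __name__ == "__main__":
    # Prints the command only; nothing is executed here.
    print(build_pbfgen_command("Dataset.raw", start_scan=2000, end_scan=3000))
```

Passing the arguments as a list rather than a single string avoids quoting problems with paths that contain spaces, such as output directories under a user profile.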
Minimum required:
- .NET 4.7.2
Minimum recommended:
- 2.4 GHz, quad-core CPU
- 16 GB RAM
- Windows 7 or newer
- 250 GB hard drive
PBFGen needs a substantial amount of free RAM when writing the MS1 and MSn full XICs, because it must hold a large number of peaks in memory and sort them before writing them to disk. It tries to scale the amount it holds in memory according to the amount of free memory when it starts, but it requires a lower limit that is based on the number of scans in the input file. The amount of memory available can drastically affect the time it takes to process and write the full XICs to disk; the more memory available, the less time it may take.
The output file is often larger than the input file, because the PBF format duplicates data to reduce processing time for other programs. The primary exception is data stored in profile mode, since that data is centroided (if possible) before it is written in the PBF format.