Matthew Monroe edited this page Sep 22, 2020 · 3 revisions

PBFGen reads in instrument data files and converts them to PBF files, the format used elsewhere in the Informed-Proteomics suite.

Input format support

Supported natively

  • *.mzML and *.mzML.gz

Supported with additional software

Note: The architecture (x86 vs. x64) of this additional software matters. If the correct version is not installed, PBFGen and the other Informed-Proteomics tools that read these formats will report an error that tells you which version to install.

  • *.raw (Thermo *.raw format)
    • Assuming you run this program on a 64-bit computer, reading Thermo Finnigan .raw files is supported via the included RawFileReader DLLs (no additional software install required).
  • The following formats (and others) can be read if 64-bit ProteoWizard is installed (Download here):
    • *.mzXML, *.mzXML.gz
    • *.raw (Thermo *.raw format, also can be read via Thermo MSFileReader)
    • *.mzML, *.mzML.gz (supported natively, but ProteoWizard is used to read gzipped mzML files when it is available, for better speed and lower memory usage)
    • *.mgf (Mascot Generic Format)
    • *.d folders (Agilent and Bruker .d folder formats)
    • *.wiff (AB Sciex *.wiff format)
    • *.u2, FID (older Bruker formats)
    • *.raw folders (Waters *.raw folder format)
    • *.lcd (Shimadzu *.lcd format)
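The format list above can be summarized as a small lookup. The following is a minimal Python sketch, not part of PBFGen: the function name classify_input and the category labels are hypothetical, and the mapping only mirrors the list on this page. Thermo *.raw files are treated as readable through the bundled RawFileReader DLLs (which assumes a 64-bit computer), while *.raw and *.d folders are treated as requiring ProteoWizard.

```python
from pathlib import PureWindowsPath

# Suffix tables transcribed from the format list on this page (lowercase).
NATIVE_SUFFIXES = {".mzml"}
BUNDLED_READER_SUFFIXES = {".raw"}          # Thermo .raw files (RawFileReader DLLs)
PROTEOWIZARD_FILE_SUFFIXES = {".mzxml", ".mgf", ".wiff", ".u2", ".lcd"}
PROTEOWIZARD_FOLDER_SUFFIXES = {".d", ".raw"}  # Agilent/Bruker .d, Waters .raw folders
GZIP_OK_SUFFIXES = {".mzml", ".mzxml"}      # only these are listed with a .gz variant


def classify_input(path: str, is_directory: bool = False) -> str:
    """Return 'native', 'bundled-reader', 'proteowizard', or 'unknown'.

    Hypothetical helper: the labels are illustrative, not PBFGen output.
    """
    name = PureWindowsPath(path).name.lower()
    gzipped = name.endswith(".gz")
    if gzipped:
        name = name[: -len(".gz")]
    suffix = PureWindowsPath(name).suffix

    if is_directory:
        return "proteowizard" if suffix in PROTEOWIZARD_FOLDER_SUFFIXES else "unknown"
    if gzipped and suffix not in GZIP_OK_SUFFIXES:
        return "unknown"
    if name == "fid":  # older Bruker FID files have no extension
        return "proteowizard"
    if suffix in NATIVE_SUFFIXES:
        return "native"
    if suffix in BUNDLED_READER_SUFFIXES:
        return "bundled-reader"
    if suffix in PROTEOWIZARD_FILE_SUFFIXES:
        return "proteowizard"
    return "unknown"
```

A check like this can flag inputs that need ProteoWizard before a long batch run starts. It says nothing about whether the installed ProteoWizard version is the one PBFGen expects.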

Input format warning

We have fully tested PBFGen, ProMex, and MSPathFinder only on the *.mzML, *.mzML.gz, *.raw (Thermo), *.mzXML, and *.mzXML.gz formats (with centroided data in the mzML and mzXML formats; for Thermo *.raw files we use Thermo's centroiding for profile data). Although every format listed above can be read for conversion into PBF, neither the validity of that conversion nor the usefulness of the resulting file for feature finding or database searches has been tested, and neither can be guaranteed. For best results, the file provided to PBFGen or other tools in the Informed-Proteomics suite should contain both MS and MS/MS scan data, with centroided peaks (when not a vendor format).

PBFGen Syntax

PbfGen version 1.0.7569 (September 21, 2020)

Usage: PbfGen.exe

  -?, -help            Show this help screen

  -s, arg#1            Required. Raw file path: *.raw or directory. (also
                       supports other input formats, see documentation)

  -o                   Output directory. Defaults to directory containing input
                       file.

  -start               Start scan number (use to limit scan range included in
                       .pbf file). (Default: -1, Min: -1)

  -end                 End scan number (use to limit scan range included in .pbf
                       file). (Default: -1, Min: -1)

  -ParamFile           Path to a file containing program parameters. Additional
                       arguments on the command line can supplement or override
                       the arguments in the param file. Lines starting with '#'
                       or ';' will be treated as comments; blank lines are
                       ignored. Lines that start with text that does not match a
                       parameter will also be ignored.

  -CreateParamFile     Create an example parameter file. Can supply a path; if
                       path is not supplied, the example parameter file content
                       will output to the console.

  NOTE:                arg#1, arg#2, etc. refer to positional arguments, used
                       like "AppName.exe [arg#1] [arg#2] [other args]".
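The comment and blank-line rules described for -ParamFile can be illustrated with a short Python sketch. It is not PBFGen's own parser. The function name active_param_lines is hypothetical, the Key=Value layout in the sample is an assumption (run PbfGen.exe with -CreateParamFile to see the real layout), and stripping leading whitespace before checking for '#' or ';' is also an assumption.

```python
from typing import List


def active_param_lines(text: str) -> List[str]:
    """Return the lines of a parameter file that are neither comments nor blank.

    Follows the rules in the help text: lines starting with '#' or ';' are
    comments, and blank lines are ignored. Hypothetical helper for illustration.
    """
    active = []
    for line in text.splitlines():
        stripped = line.strip()  # assumption: surrounding whitespace is not significant
        if not stripped:
            continue  # blank line
        if stripped.startswith(("#", ";")):
            continue  # comment line
        active.append(stripped)
    return active


# Hypothetical parameter file content; the Key=Value layout is an assumption.
SAMPLE = """\
# Input file
; alternate comment style

s=MyDataset.raw
start=2000
"""

remaining = active_param_lines(SAMPLE)
```

The sketch does not model the last rule in the help text, where lines whose leading text does not match a known parameter are also ignored.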

Examples

These commands will create MyDataset.pbf in the same directory as MyDataset.raw or MyDataset.mzML:

PbfGen.exe -s MyDataset.raw
PbfGen.exe -s MyDataset.mzML

These commands will create MyDataset.pbf in the directory C:\WorkFolder:

PbfGen.exe -s MyDataset.raw -o C:\WorkFolder
PbfGen.exe -s MyDataset.mzML -o C:\WorkFolder

Optionally, use -start and -end to limit the scan range included in the .pbf file:

PbfGen.exe -s Dataset.raw -start 2000 -end 3000
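When converting many datasets from a script, the commands above can be assembled programmatically. The following Python sketch uses only the options shown in the help text (-s, -o, -start, -end). The helper names build_pbfgen_args and expected_pbf_path are hypothetical, and the output-name rule covers only the single-extension cases documented in the examples (MyDataset.raw or MyDataset.mzML producing MyDataset.pbf).

```python
from pathlib import PureWindowsPath
from typing import List, Optional


def build_pbfgen_args(source: str,
                      output_dir: Optional[str] = None,
                      start: Optional[int] = None,
                      end: Optional[int] = None,
                      exe: str = "PbfGen.exe") -> List[str]:
    """Build an argument list for PbfGen.exe from the documented options."""
    for label, value in (("start", start), ("end", end)):
        # The help text lists "Min: -1" for both scan-range options.
        if value is not None and value < -1:
            raise ValueError(f"-{label} must be -1 or greater, got {value}")

    args = [exe, "-s", source]
    if output_dir is not None:
        args += ["-o", output_dir]
    if start is not None:
        args += ["-start", str(start)]
    if end is not None:
        args += ["-end", str(end)]
    return args


def expected_pbf_path(source: str, output_dir: Optional[str] = None) -> str:
    """Predict the .pbf path for a single-extension input such as MyDataset.raw.

    The default output directory is the one containing the input file.
    """
    src = PureWindowsPath(source)
    folder = PureWindowsPath(output_dir) if output_dir is not None else src.parent
    return str(folder / (src.stem + ".pbf"))
```

On a Windows machine where PbfGen.exe is on the PATH, the resulting list can be passed to subprocess.run(args, check=True). Passing a list rather than a single string avoids quoting problems with paths that contain spaces.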

System Requirements and Recommendations

Minimum required:

  • .NET 4.7.2

Minimum recommended:

  • 2.4 GHz, quad-core CPU
  • 16 GB RAM
  • Windows 7 or newer
  • 250 GB hard drive

PBFGen needs a substantial amount of free RAM when writing the MS1 and MSn Full XICs, because it must hold a large number of peaks in memory and sort them before writing them to disk. It tries to scale the amount it holds in memory according to the amount of free memory when it starts, but it has a lower limit that depends on the number of scans in the input file. The amount of memory available can drastically affect the time it takes to process and write the full XICs to disk; the more memory available, the less time it may take.

The output file will often be larger than the input file, because the PBF format duplicates data to reduce processing time in other programs. The primary exception is data stored in profile mode, since that data is centroided (if possible) before it is written in the PBF format.
