Usage: sirius [-hV] [--noCite] [--recompute] [--buffer=<initialInstanceBuffer>]
[--cores=<numOfCores>] [--log=<logLevel>] [--maxmz=<maxMz>]
[--workspace=<workspace>] [[-o=<outputProjectLocation>]
[--update-fingerprint-version]] [[-i=<inputPath>[,<inputPath>...]
[-i=<inputPath>[,<inputPath>...]]... [--ignore-formula]
[--allow-ms1-only]] [-z=<parentMz> [-1=<ms1File>[,<ms1File>...]]
[--adduct=<ionType>] -2=<ms2File>[,<ms2File>...]
[-f=<formula>]]...] [COMMAND]
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
--log, --loglevel=<logLevel>
Set logging level of the Jobs SIRIUS will execute.
Valid values: SEVERE, WARNING, INFO, FINER, ALL
Default: WARNING
--cores, --threads, --processors=<numOfCores>
Number of simultaneous worker threads to be used for
compute-intensive workloads. If not specified, SIRIUS
chooses a reasonable number based on your CPU specs.
--buffer, --instance-buffer=<initialInstanceBuffer>
Number of instances that will be loaded into the
Memory. A larger buffer ensures that there are
enough instances available to use all cores
efficiently during computation. A smaller buffer
saves Memory. To load all instances immediately
set it to -1. Default (numeric value 0): 3 x
--cores. Note that for <DATASET_TOOLS> the
compound buffer may have no effect because these
tools may have to load compounds simultaneously
into memory.
Default: 0
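The buffer rule described above can be sketched in Python. This is only an illustration of the documented defaults (0 means 3 x --cores, -1 means load everything); the function name and the use of os.cpu_count() as a stand-in for SIRIUS's own core detection are assumptions, not part of SIRIUS.

```python
import os
from typing import Optional


def effective_buffer(buffer: int, cores: Optional[int] = None) -> Optional[int]:
    """Return the number of instances held in memory, or None for 'load all'.

    Hypothetical helper mirroring the help text, not SIRIUS code.
    """
    if cores is None:
        # Assumption: stand-in for the "reasonable number" SIRIUS picks itself.
        cores = os.cpu_count() or 1
    if buffer == -1:
        return None          # -1: load all instances immediately
    if buffer == 0:
        return 3 * cores     # documented default: 3 x --cores
    return buffer            # any other value is used as given


print(effective_buffer(0, cores=8))
```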
--workspace=<workspace>
Specify sirius workspace location. This is the
directory for storing Property files, logs,
databases and caches. This is NOT for the
project-space that stores the results! Default is
$USER_HOME/.sirius-<MINOR_VERSION>
--recompute Recompute results of ALL tools where results are
already present. By default, results that are
already present are preserved and the instance is
skipped for the corresponding Task/Tool.
--maxmz=<maxMz> Only consider compounds with a precursor m/z lower
than or equal to [--maxmz]. All other compounds in
the input will be skipped.
Default: Infinity
--noCite, --noCitations, --no-citations
Do not write summary files to the project-space
Specify OUTPUT Project-Space:
-o, -p, --output, --project=<outputProjectLocation>
Specify the project-space to write into. If no
[--input] is specified it is also used as input.
For compression use the file ending .zip or
.sirius.
--update-fingerprint-version
Updates Fingerprint versions of the input project
to the one used by this SIRIUS version.
WARNING: All Fingerprint related results
(CSI:FingerID, CANOPUS) will be lost!
Specify multi-compound inputs (.ms, .mgf, .mzML/.mzXml, .sirius):
-i, --input=<inputPath>[,<inputPath>...]
Specify the input in multi-compound input formats:
Preprocessed mass spectra in .ms or .mgf file
format or LC/MS runs in .mzML/.mzXml format but
also any other file type e.g. to provide input
for STANDALONE tools.
--ignore-formula Ignore given molecular formula if present in .ms or
.mgf input files.
--allow-ms1-only Allow MS1 only data to be imported.
Specify generic inputs (CSV) on per compound level:
-1, --ms1=<ms1File>[,<ms1File>...]
MS1 spectra files
-2, --ms2=<ms2File>[,<ms2File>...]
MS2 spectra files
-z, --mz, --precursor, --parentmass=<parentMz>
The mass of the parent ion for the specified ms2
spectra
--adduct, --ionization=<ionType>
Specify the adduct for this compound
Default: [M+?]+
-f, --formula=<formula> Specify the neutralized formula of this compound.
This will be used for tree computation. If given
no mass decomposition will be performed.
Commands:
config
<CONFIGURATION> Override all
possible default configurations
of this toolbox from the command
line.
custom-db, DB <STANDALONE> Generate a custom
searchable structure/spectral
database. Import multiple files
with compounds into this DB.
similarity <STANDALONE> Computes the
similarity between all compounds
in the dataset and outputs a
matrix of similarities.
decomp, mass-decomposition <STANDALONE> Small tool to
decompose masses with given
deviation, ionization, chemical
alphabet and chemical filter.
mgf-export, MGF <STANDALONE> Exports the spectra of
a given input as mgf.
fingerprinter, FP <STANDALONE> Compute SIRIUS
compatible fingerprints from
PubChem standardized SMILES in
tsv format.
service, rest, REST <STANDALONE> Starts SIRIUS as a
background (REST) service that
can be requested via a REST-API.
login <STANDALONE> Allows a user to login
for SIRIUS Webservices (e.g.
CSI:FingerID or CANOPUS) and securely
store a personal access token.
settings <STANDALONE> Configure persistent
(technical) settings of SIRIUS
(e.g. ProxySettings or ILP Solver).
install-autocompletion <INSTALL> generates and installs an
Autocompletion-Script with all
subcommands. Default installation
is for the current user.
summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write
Summary files from a given
project-space into the given
project-space or a custom
location.
lcms-align, A <PREPROCESSING> Align and merge
compounds of multiple LCMS Runs.
Use this tool if you want to
import from mzML/mzXml.
denovo-structures, msnovelist <COMPOUND TOOL> Predict MsNovelist
compound candidates and compare
them against molecular
fingerprint using CSI:FingerID
scoring method.
spectra-search, library-search <COMPOUND TOOL> Computes the
similarity between all
compounds/features in the
project-space (queries) one vs
all spectra in the selected
databases.
formulas, trees, formula, sirius <COMPOUND TOOL> Identify molecular
formula for each compound
individually using fragmentation
trees and isotope patterns.
structures, structure-db-search, structure
<COMPOUND TOOL> Search in molecular
structure db for each compound
individually using CSI:FingerID
structure database search.
zodiac, rerank-formulas <DATASET TOOL> Identify Molecular
formulas of all compounds in a
dataset together using ZODIAC.
classes, canopus, compound-classes <COMPOUND TOOL> Predict compound
categories for each compound
individually based on its
predicted molecular fingerprint
(CSI:FingerID) using CANOPUS.
fingerprints, fingerprint <COMPOUND TOOL> Predict molecular
fingerprint from MS/MS and
fragmentation trees for each
compound individually using
CSI:FingerID fingerprint prediction.
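As a rough illustration of how the global options and commands above combine, the following Python sketch assembles an argument list that imports a file, writes to a project-space, and runs `formulas` followed by `fingerprints`. The file names and the helper name are hypothetical; only the option and command names are taken from this help text, and the sketch does not execute SIRIUS.

```python
import shlex
from typing import List


def build_sirius_command(input_path: str, project_path: str,
                         profile: str = "default",
                         candidates: int = 10) -> List[str]:
    """Assemble a SIRIUS argument list (hypothetical helper, nothing is run)."""
    return [
        "sirius",
        f"--input={input_path}",      # -i, --input
        f"--output={project_path}",   # -o, --output
        "formulas",                   # <COMPOUND TOOL>
        f"--profile={profile}",       # -p, --profile of 'formulas'
        f"--candidates={candidates}", # -c, --candidates of 'formulas'
        "fingerprints",               # listed as a sub-command of 'formulas'
    ]


# 'demo.mgf' and 'demo.sirius' are placeholder names for illustration only.
cmd = build_sirius_command("demo.mgf", "demo.sirius", profile="orbitrap")
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps paths containing spaces intact when the list is later handed to a process launcher.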
Usage: sirius login [-hV] [--clear] [--limits] [--request-token-only] [--show]
[--select-license=<sid>] [[-u=<username> -p] |
[--token=<token>] | [--password-env=<password>
--user-env=<username>]]
<STANDALONE> Allows a user to login for SIRIUS Webservices (e.g. CSI:FingerID
or CANOPUS) and securely store a personal access token.
--clear, --logout Logout. Deletes stored refresh and access token
(re-login required to use webservices again).
-h, --help Show this help message and exit.
--limits, --license-info
Show license information and compound limits.
-p, --pwd, --password Console password input.
--password-env=<password>
Environment variable with login password.
--request-token-only Requests and prints a new SECRET refresh token but
does not store the token as login.
This can be used to request a token to be used in
third party applications that wish to call
SIRIUS Web Services using your account.
Never store your username and password in third
party apps.
Do not store the output of this command in any
log. We recommend redirecting the output into a
file.
--select-license, --select-subscription=<sid>
Specify active subscription (sid) if multiple
licenses are available at your account.
Available subscriptions can be listed with
'--show'
--show Show profile information about the profile you are
logged in with.
--token=<token> Refresh token to use as login.
-u, --user, --email=<username>
Login username/email
--user-env=<username> Environment variable with login username.
-V, --version Print version information and exit.
Usage: sirius config [-hV] [--AdductSettings.detectable=[M+H]+,[M+K]+,[M+Na]+,
[M+H-H2O]+,[M+H-H4O2]+,[M+NH3+H]+,[M+FA+H]+,[M+ACN+H]+,
[2M+H]+,[2M+K]+,[2M+Na]+,[M-H]-,[M+Cl]-,[M+Br]-,[M-H2O-H]-,
[M+Na-2H]-,[M+CH2O2-H]-,[M+C2H4O2-H]-,[M+H2O-H]-,[M-H3N-H]
-,[M-CO2-H]-,[M-CH2O3-H]-,[M-CH3-H]-,[2M+H]-,[2M+Cl]-,
[2M+Br]-] [--AdductSettings.enforced=,] [--AdductSettings.
fallback=[M+H]+,[M-H]-,[M+Na]+,[M+K]+] [--AdductSettings.
ignoreDetectedAdducts=false] [--AdductSettings.
prioritizeInputFileAdducts=true]
[--AlgorithmProfile=default] [--CandidateFormulas=,]
[--CompoundQuality=UNKNOWN]
[--ConfidenceScoreApproximateDistance=2]
[--EnforceElGordoFormula=True]
[--ExpansiveSearchConfidenceMode.
confidenceScoreSimilarityMode=APPROXIMATE]
[--ExpansiveSearchConfidenceMode.confPubChemFactor=0.5]
[--ForbidRecalibration=ALLOWED]
[--FormulaResultThreshold=true] [--FormulaSearchDB=none]
[--FormulaSearchSettings.
applyFormulaConstraintsToBottomUp=false]
[--FormulaSearchSettings.
applyFormulaConstraintsToDatabaseCandidates=false]
[--FormulaSearchSettings.performBottomUpAboveMz=0]
[--FormulaSearchSettings.performDeNovoBelowMz=400]
[--FormulaSettings.detectable=S,Br,Cl,B,Se]
[--FormulaSettings.enforced=C,H,N,O,P] [--FormulaSettings.
fallback=S] [--InjectSpectralLibraryMatchFormulas.
alwaysPredict=true] [--InjectSpectralLibraryMatchFormulas.
injectFormulas=true] [--InjectSpectralLibraryMatchFormulas.
minPeakMatchesToInject=6]
[--InjectSpectralLibraryMatchFormulas.minScoreToInject=.7]
[--IsotopeMs2Settings=IGNORE] [--IsotopeSettings.
filter=True] [--IsotopeSettings.multiplier=1]
[--MedianNoiseIntensity=0.015] [--MotifDbFile=none] [--ms1.
absoluteIntensityError=0.02] [--ms1.
minimalIntensityToConsider=0.01] [--ms1.
relativeIntensityError=0.08] [--MS1MassDeviation.
allowedMassDeviation=10.0 ppm] [--MS1MassDeviation.
massDifferenceDeviation=5.0 ppm] [--MS1MassDeviation.
standardMassDeviation=10.0 ppm] [--MS2MassDeviation.
allowedMassDeviation=10.0 ppm] [--MS2MassDeviation.
standardMassDeviation=10.0 ppm] [--NoiseThresholdSettings.
absoluteThreshold=0] [--NoiseThresholdSettings.
basePeak=NOT_PRECURSOR] [--NoiseThresholdSettings.
intensityThreshold=0.005] [--NoiseThresholdSettings.
maximalNumberOfPeaks=60] [--NumberOfCandidates=10]
[--NumberOfCandidatesPerIonization=1]
[--NumberOfMsNovelistCandidates=128]
[--NumberOfStructureCandidates=10000]
[--PossibleAdductSwitches=[M+Na]+:[M+H]+,[M+K]+:[M+H]+,
[M+Cl]-:[M-H]-] [--PrintCitations=True]
[--RecomputeResults=False]
[--SpectralMatchingMassDeviation.allowedPeakDeviation=10.0
ppm] [--SpectralMatchingMassDeviation.
allowedPrecursorDeviation=10.0 ppm]
[--SpectralMatchingScorer=MODIFIED_COSINE]
[--SpectralSearchDB=ALL] [--SpectralSearchLog=10]
[--StructureSearchDB=BIO] [--TagStructuresByElGordo=True]
[--Timeout.secondsPerInstance=0] [--Timeout.
secondsPerTree=0] [--UseHeuristic.useHeuristicAboveMz=300]
[--UseHeuristic.useOnlyHeuristicAboveMz=650]
[--ZodiacClusterCompounds=false]
[--ZodiacEdgeFilterThresholds.minLocalCandidates=1]
[--ZodiacEdgeFilterThresholds.minLocalConnections=10]
[--ZodiacEdgeFilterThresholds.thresholdFilter=0.95]
[--ZodiacEpochs.burnInPeriod=2000] [--ZodiacEpochs.
iterations=20000] [--ZodiacEpochs.numberOfMarkovChains=10]
[--ZodiacLibraryScoring.lambda=1000]
[--ZodiacLibraryScoring.minCosine=0.5]
[--ZodiacNumberOfConsideredCandidatesAt300Mz=10]
[--ZodiacNumberOfConsideredCandidatesAt800Mz=50]
[--ZodiacRatioOfConsideredCandidatesPerIonization=0.2]
[--ZodiacRunInTwoSteps=true] [COMMAND]
<CONFIGURATION> Override all possible default configurations of this toolbox
from the command line.
--AdductSettings.detectable=[M+H]+,[M+K]+,[M+Na]+,[M+H-H2O]+,[M+H-H4O2]+,
[M+NH3+H]+,[M+FA+H]+,[M+ACN+H]+,[2M+H]+,[2M+K]+,[2M+Na]+,[M-H]-,[M+Cl]-,
[M+Br]-,[M-H2O-H]-,[M+Na-2H]-,[M+CH2O2-H]-,[M+C2H4O2-H]-,[M+H2O-H]-,
[M-H3N-H]-,[M-CO2-H]-,[M-CH2O3-H]-,[M-CH3-H]-,[2M+H]-,[2M+Cl]-,[2M+Br]-
Detectable ion modes which are only considered if
there is an indication in the MS1 scan (e.g.
correct mass delta).
--AdductSettings.enforced=,
Describes how to deal with Adducts:
Pos Examples: [M+H]+,[M]+,[M+K]+,[M+Na]+,[M+H-H2O]+,
[M+Na2-H]+,[M+2K-H]+,[M+NH4]+,[M+H3O]+,[M+MeOH+H]+,
[M+ACN+H]+,[M+2ACN+H]+,[M+IPA+H]+,[M+ACN+Na]+,
[M+DMSO+H]+
Neg Examples: [M-H]-,[M]-,[M+K-2H]-,[M+Cl]-,
[M-H2O-H]-,[M+Na-2H]-,[M+FA-H]-,[M+Br]-,[M+HAc-H]-,
[M+TFA-H]-,[M+ACN-H]-
Enforced ion modes that are always considered.
--AdductSettings.fallback=[M+H]+,[M-H]-,[M+Na]+,[M+K]+
Fallback ion modes which are considered if the auto
detection did not find any indication for an ion
mode.
--AdductSettings.ignoreDetectedAdducts=false
if true ignores detected adducts from all sources
(except input files) and uses fallback plus
enforced adducts instead
--AdductSettings.prioritizeInputFileAdducts=true
Adducts specified in the input file are used as is
independent of what enforced/detectable/fallback
adducts are set.
--AlgorithmProfile=default
Configuration profile to store instrument specific
algorithm properties.
Some of the default profiles are: 'qtof',
'orbitrap', 'fticr'.
--CandidateFormulas=,
This configuration holds a set of neutral formulas
to be used as candidates for SIRIUS.
The formulas may be provided by the user, from a
database or from the input file.
Note: This set might be merged with other sources
such as ElGordo predicted lipid.
Set of Molecular Formulas to be used as candidates
for molecular formula estimation with SIRIUS
--CompoundQuality=UNKNOWN
Keywords that can be assigned to an input spectrum to
judge its quality. Available keywords are: Good,
LowIntensity, NoMS1Peak, FewPeaks, Chimeric,
NotMonoisotopicPeak, PoorlyExplained
--ConfidenceScoreApproximateDistance=2
Allowed MCES distance between CSI:FingerID hit
(best-scoring candidate) and true structure that
should still count as correct identification.
Distance 0 corresponds to identical molecular
structures. The closest non-identical structures
have an MCES distance of 2 (cutting 2 bonds). It
continues with 4,6,8 and so on.
Currently only 0 (exact) and 2 for approximate are
supported.
--EnforceElGordoFormula=True
El Gordo may predict that an MS/MS spectrum is a
lipid spectrum.
The corresponding molecular formula may be enforced
or additionally added to the set of molecular
formula candidates.
--ExpansiveSearchConfidenceMode.confidenceScoreSimilarityMode=APPROXIMATE
Expansive search mode
OFF: No expansive search is performed
EXACT: Use confidence score in exact mode: Only
molecular structures identical to the true
structure should count as correct identification.
APPROXIMATE: Use confidence score in approximate
mode: Molecular structures hits that are close to
the true structure should count as correct
identification.
--ExpansiveSearchConfidenceMode.confPubChemFactor=0.5
Expansive search parameters.
Expansive search will expand the search space to
whole PubChem
in case no hit with reasonable confidence was found
in one of the user specified structure search
databases.
Factor that PubChem confidence scores gets
multiplied with as bias against it.
--ForbidRecalibration=ALLOWED
Enable/Disable the hypothesis-driven recalibration
of MS/MS spectra.
Must be either 'ALLOWED' or 'FORBIDDEN'.
--FormulaResultThreshold=true
Specifies if the list of Molecular Formula
Identifications is filtered by a soft threshold
(calculateThreshold) before CSI:FingerID
predictions are calculated.
--FormulaSearchDB=none
--FormulaSearchSettings.applyFormulaConstraintsToBottomUp=false
--FormulaSearchSettings.applyFormulaConstraintsToDatabaseCandidates=false
--FormulaSearchSettings.performBottomUpAboveMz=0
These settings define the behaviour of de novo and
bottom-up molecular formula generation.
Candidate formulas from database or user input are
handled independently via the CandidateFormulas
setting.
Candidate formulas from input files are always
prioritized. An (internal) parameter exists to
override this.
--FormulaSearchSettings.performDeNovoBelowMz=400
--FormulaSettings.detectable=S,Br,Cl,B,Se
Detectable elements are added to the chemical
alphabet, if there are indications for them (e.g.
in isotope pattern)
--FormulaSettings.enforced=C,H,N,O,P
These configurations hold the information on how to
autodetect elements based on the given formula
constraints.
Note: If the compound is already assigned to a
specific molecular formula, this annotation is
ignored.
Enforced elements are always considered
--FormulaSettings.fallback=S
Fallback elements are used, if the auto-detection
fails (e.g. no isotope pattern available)
-h, --help Show this help message and exit.
--InjectSpectralLibraryMatchFormulas.alwaysPredict=true
If true, fingerprints/classes/structures will be
predicted for formula candidates with reference
spectrum similarity > minScoreToEnforce, no matter
which soft threshold rules apply.
--InjectSpectralLibraryMatchFormulas.injectFormulas=true
If true, formula candidates with reference spectrum
similarity > minScoreToEnforce will be part of the
result list regardless of other filter settings or
their rank by SIRIUS score.
--InjectSpectralLibraryMatchFormulas.minPeakMatchesToInject=6
Matching peaks threshold to inject formula
candidates no matter which score they have or
which filter is applied.
--InjectSpectralLibraryMatchFormulas.minScoreToInject=.7
Specify settings to inject/preserve formula
candidates that belong to
high scoring reference spectra.
Similarity Threshold to inject formula candidates no
matter which score they have or which filter is
applied.
--IsotopeMs2Settings=IGNORE
--IsotopeSettings.filter=True
These configurations define how to deal with isotope
patterns in MS1.
When filtering is enabled, molecular formulas are
excluded if their theoretical isotope pattern does
not match the measured one, even if their MS/MS
pattern has a high score.
--IsotopeSettings.multiplier=1
Multiplier for the isotope score. Set to 0 to
disable isotope scoring. Otherwise, the score from
isotope pattern analysis is multiplied with this
coefficient. Set to a value larger than one if
your isotope pattern data is of much better
quality than your MS/MS data.
--MedianNoiseIntensity=0.015
--MotifDbFile=none
--ms1.absoluteIntensityError=0.02
The average absolute deviation between theoretical
and measured intensity of isotope peaks.
Do not change this parameter without a good reason!
--ms1.minimalIntensityToConsider=0.01
Ignore isotope peaks below this intensity.
This value should reflect the smallest relative
intensity which is still above noise level.
Obviously, this is hard to judge without having
absolute values. Keeping this value around 1
percent is fine for most settings. Set it to smaller
values if you trust your small intensities.
--ms1.relativeIntensityError=0.08
The average relative deviation between theoretical
and measured intensity of isotope peaks.
Do not change this parameter without a good reason!
--MS1MassDeviation.allowedMassDeviation=10.0 ppm
Mass accuracy setting for MS1 spectra. Mass
accuracies are always written as "X ppm (Y Da)"
where X and Y are numerical values. The ppm is a
relative measure (parts per million), Da is an
absolute measure. For each mass, the maximum of
relative and absolute is used.
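The "maximum of relative and absolute" rule can be illustrated with a short Python sketch. The parser below accepts the "X ppm (Y Da)" notation with an optional Da part; the exact syntax SIRIUS accepts may differ, and both function names are hypothetical.

```python
import re
from typing import Tuple

# Assumption: "10.0 ppm" or "10.0 ppm (0.002 Da)"; the real parser may be stricter or looser.
_DEVIATION = re.compile(r"^\s*([0-9.]+)\s*ppm(?:\s*\(\s*([0-9.]+)\s*Da\s*\))?\s*$")


def parse_deviation(text: str) -> Tuple[float, float]:
    """Split 'X ppm (Y Da)' into (ppm, da); a missing Da part counts as 0."""
    match = _DEVIATION.match(text)
    if match is None:
        raise ValueError(f"not a mass deviation: {text!r}")
    ppm = float(match.group(1))
    da = float(match.group(2)) if match.group(2) else 0.0
    return ppm, da


def allowed_window(mz: float, ppm: float, da: float = 0.0) -> float:
    """Allowed deviation in Da for a given m/z: max of relative and absolute."""
    return max(mz * ppm * 1e-6, da)


ppm, da = parse_deviation("10.0 ppm (0.002 Da)")
print(allowed_window(100.0, ppm, da))   # absolute part dominates at low m/z
print(allowed_window(500.0, ppm, da))   # relative part dominates at higher m/z
```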
--MS1MassDeviation.massDifferenceDeviation=5.0 ppm
--MS1MassDeviation.standardMassDeviation=10.0 ppm
--MS2MassDeviation.allowedMassDeviation=10.0 ppm
Mass accuracy setting for MS2 spectra. Mass
accuracies are always written as "X ppm (Y Da)"
where X and Y are numerical values.
The ppm is a relative measure (parts per million),
Da is an absolute measure. For each mass, the
maximum of relative and absolute is used.
--MS2MassDeviation.standardMassDeviation=10.0 ppm
--NoiseThresholdSettings.absoluteThreshold=0
--NoiseThresholdSettings.basePeak=NOT_PRECURSOR
--NoiseThresholdSettings.intensityThreshold=0.005
--NoiseThresholdSettings.maximalNumberOfPeaks=60
--NumberOfCandidates=10
--NumberOfCandidatesPerIonization=1
Use this parameter if you want to force to report at
least numberOfResultsToKeepPerIonization results
per ionization.
If set to 0, this parameter will have no effect and
just the top numberOfResultsToKeep results will be
reported.
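One possible reading of the interplay between NumberOfCandidates and NumberOfCandidatesPerIonization is sketched below in Python: keep the top N results, then add lower-ranked results until every ionization has the requested minimum. This is an interpretation of the help text, not the SIRIUS implementation; the function name and the tuple layout are assumptions.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[str, str, float]  # (formula, ionization, score), hypothetical layout


def select_candidates(ranked: List[Candidate], top_n: int,
                      per_ionization: int) -> List[Candidate]:
    """Keep the top_n candidates plus enough extras per ionization.

    'ranked' must already be sorted best-first.
    """
    kept = list(ranked[:top_n])
    if per_ionization <= 0:
        return kept  # 0 disables the per-ionization guarantee
    counts: Dict[str, int] = {}
    for _, ion, _ in kept:
        counts[ion] = counts.get(ion, 0) + 1
    for cand in ranked[top_n:]:
        ion = cand[1]
        if counts.get(ion, 0) < per_ionization:
            kept.append(cand)
            counts[ion] = counts.get(ion, 0) + 1
    return kept


ranked = [
    ("A", "[M+H]+", 9.0), ("B", "[M+H]+", 8.0), ("C", "[M+Na]+", 7.0),
    ("D", "[M+H]+", 6.0), ("E", "[M+K]+", 5.0),
]
print([c[0] for c in select_candidates(ranked, top_n=2, per_ionization=1)])
```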
--NumberOfMsNovelistCandidates=128
--NumberOfStructureCandidates=10000
--PossibleAdductSwitches=[M+Na]+:[M+H]+,[M+K]+:[M+H]+,[M+Cl]-:[M-H]-
An adduct switch is a switch of the ionization mode
within a spectrum, e.g. an ion replaces a sodium
adduct with a protonation during fragmentation. Such
adduct switches heavily increase the complexity of
the analysis, but for certain adducts they might
happen regularly. Adduct switches are written in the
form a -> b, a -> c, d -> c where a, b, c, and d are
adducts and a -> b denotes an allowed switch from
a to b within the MS/MS spectrum.
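The default value shown for --PossibleAdductSwitches uses `source:target` pairs separated by commas. A minimal Python sketch for reading that notation into a mapping follows; it assumes adduct names contain neither ',' nor ':' (true for the default value), and the function name is hypothetical.

```python
from typing import Dict, List


def parse_adduct_switches(value: str) -> Dict[str, List[str]]:
    """Map each source adduct to the adducts it may switch to."""
    switches: Dict[str, List[str]] = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate empty entries
        source, target = pair.split(":")
        switches.setdefault(source, []).append(target)
    return switches


default = "[M+Na]+:[M+H]+,[M+K]+:[M+H]+,[M+Cl]-:[M-H]-"
print(parse_adduct_switches(default))
```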
--PrintCitations=True
--RecomputeResults=False
--SpectralMatchingMassDeviation.allowedPeakDeviation=10.0 ppm
Maximum allowed mass deviation in ppm for matching
peaks.
--SpectralMatchingMassDeviation.allowedPrecursorDeviation=10.0 ppm
Maximum allowed mass deviation in ppm for matching
the precursor.
--SpectralMatchingScorer=MODIFIED_COSINE
--SpectralSearchDB=ALL
--SpectralSearchLog=10
--StructureSearchDB=BIO
--TagStructuresByElGordo=True
Molecular structure candidates matching the lipid
class estimated by El Gordo will be tagged.
The lipid class will only be available if El Gordo
predicts that the MS/MS is a lipid spectrum.
If this parameter is set to 'false' El Gordo will
still be executed and e.g. improve molecular
formula annotation, but the matching structure
candidates will not be tagged as lipid class.
--Timeout.secondsPerInstance=0
This configuration defines a timeout for the tree
computation.
As the underlying problem is NP-hard, it might take
forever to compute trees for very challenging
(e.g. large mass) compounds.
Setting a time constraint allows the program to
continue with other instances and just skip the
challenging ones.
Note that due to multithreading, these time
constraints are not absolutely accurate.
Set the maximum number of seconds for computing a
single compound. Set to 0 to disable the time
constraint.
--Timeout.secondsPerTree=0
Set the maximum number of seconds for a single
molecular formula check. Set to 0 to disable the
time constraint
--UseHeuristic.useHeuristicAboveMz=300
Set minimum m/z to enable heuristic preprocessing.
The heuristic will be used to initially rank the
formula candidates. The Top (NumberOfCandidates)
candidates will then be computed exactly by
solving the ILP.
--UseHeuristic.useOnlyHeuristicAboveMz=650
Set minimum m/z to only use heuristic tree
computation. No exact tree computation (ILP) will
be performed for these compounds.
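The two UseHeuristic thresholds split compounds into three regimes. The Python sketch below shows that split with the default values; the mode labels and the function name are made up for illustration, and the inclusive comparison follows the ">=" wording used for the corresponding `formulas` options.

```python
def tree_computation_mode(mz: float,
                          heuristic_above: float = 300.0,
                          heuristic_only_above: float = 650.0) -> str:
    """Classify a precursor m/z by how its trees would be computed."""
    if mz >= heuristic_only_above:
        return "heuristic-only"        # no exact (ILP) computation
    if mz >= heuristic_above:
        return "heuristic-then-exact"  # heuristic ranking, top candidates solved exactly
    return "exact"                     # exact computation for all candidates


for mz in (250.0, 300.0, 649.9, 650.0):
    print(mz, tree_computation_mode(mz))
```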
-V, --version Print version information and exit.
--ZodiacClusterCompounds=false
Cluster compounds before running ZODIAC
--ZodiacEdgeFilterThresholds.minLocalCandidates=1
Minimum number of candidates per compound which are
forced to have at least [minLocalConnections]
connections to other compounds.
E.g. 2 candidates per compound must have at least 10
connections to other compounds
--ZodiacEdgeFilterThresholds.minLocalConnections=10
Minimum number of connections per candidate which
are forced for at least [minLocalCandidates]
candidates to other compounds.
E.g. 2 candidates per compound must have at least 10
connections to other compounds
--ZodiacEdgeFilterThresholds.thresholdFilter=0.95
Defines the proportion of edges of the complete
network which will be ignored.
--ZodiacEpochs.burnInPeriod=2000
Number of epochs considered as 'burn-in period'.
Samples from the beginning of a Markov chain do not
accurately represent the desired distribution of
candidates and are not used to estimate the ZODIAC
score.
--ZodiacEpochs.iterations=20000
Number of epochs to run the Gibbs sampling. When
multiple Markov chains are computed, all chains'
iterations sum up to this value.
--ZodiacEpochs.numberOfMarkovChains=10
Number of separate Gibbs sampling runs.
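Because the iterations of all Markov chains sum up to --ZodiacEpochs.iterations, each chain runs only a share of the total. A small Python sketch of that split with the default values follows; how SIRIUS distributes a remainder is not stated here, so spreading it over the first chains is an assumption, as is the function name.

```python
from typing import List


def split_iterations(total_iterations: int = 20000, chains: int = 10) -> List[int]:
    """Distribute the total number of epochs over the Markov chains."""
    base, extra = divmod(total_iterations, chains)
    # Assumption: any remainder is spread over the first 'extra' chains.
    return [base + 1 if i < extra else base for i in range(chains)]


print(split_iterations())
```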
--ZodiacLibraryScoring.lambda=1000
Lambda used in the scoring function of spectral
library hits. The higher this value, the more
heavily library hits are weighted in ZODIAC scoring.
--ZodiacLibraryScoring.minCosine=0.5
Spectral library hits must have a cosine of at least
this value to be considered in scoring. Value must
be in [0,1].
--ZodiacNumberOfConsideredCandidatesAt300Mz=10
Maximum number of candidate molecular formulas
(fragmentation trees computed by SIRIUS) per
compound which are considered by ZODIAC.
This is the threshold used for all compounds with mz
below 300 m/z and is used to interpolate the
number of candidates for larger compounds.
If lower than 0, all available candidates are
considered.
--ZodiacNumberOfConsideredCandidatesAt800Mz=50
Maximum number of candidate molecular formulas
(fragmentation trees computed by SIRIUS) per
compound which are considered by ZODIAC.
This is the threshold used for all compounds with mz
above 800 m/z and is used to interpolate the
number of candidates for smaller compounds.
If lower than 0, all available candidates are
considered.
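The two thresholds at 300 and 800 m/z are described as anchors for an interpolation. The Python sketch below assumes a linear interpolation between them, which the help text does not state explicitly, and treats a negative value in either setting as "consider all candidates". The function name is hypothetical.

```python
from typing import Optional


def considered_candidates(mz: float, at_300: int = 10,
                          at_800: int = 50) -> Optional[int]:
    """Number of candidates ZODIAC would consider; None means 'all available'."""
    if at_300 < 0 or at_800 < 0:
        return None      # assumption: either negative value disables the limit
    if mz <= 300.0:
        return at_300
    if mz >= 800.0:
        return at_800
    # Assumption: linear interpolation between the two anchor points.
    fraction = (mz - 300.0) / 500.0
    return round(at_300 + fraction * (at_800 - at_300))


for mz in (200.0, 550.0, 900.0):
    print(mz, considered_candidates(mz))
```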
--ZodiacRatioOfConsideredCandidatesPerIonization=0.2
Ratio of candidate molecular formulas (fragmentation
trees computed by SIRIUS) per compound which are
forced for each ionization to be considered by
ZODIAC.
This depends on the number of candidates ZODIAC
considers. E.g. if 50 candidates are considered
and a ratio of 0.2 is set, at least 10 candidates
per ionization will be considered, which might
increase the number of candidates above 50.
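The example in the description (50 candidates, ratio 0.2, at least 10 per ionization) corresponds to the small calculation below. Rounding up for non-integer products is an assumption, and the function name is hypothetical.

```python
import math


def min_candidates_per_ionization(considered: int, ratio: float = 0.2) -> int:
    """Minimum number of candidates forced per ionization."""
    # The small epsilon guards against floating-point noise before rounding up.
    return math.ceil(considered * ratio - 1e-9)


print(min_candidates_per_ionization(50))
```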
--ZodiacRunInTwoSteps=true
As default ZODIAC runs a 2-step approach. First
running 'good quality compounds' only, and
afterwards including the remaining.
Commands:
lcms-align, A <PREPROCESSING> Align and merge
compounds of multiple LCMS Runs.
Use this tool if you want to
import from mzML/mzXml.
spectra-search, library-search <COMPOUND TOOL> Computes the
similarity between all
compounds/features in the
project-space (queries) one vs
all spectra in the selected
databases.
formulas, trees, formula, sirius <COMPOUND TOOL> Identify molecular
formula for each compound
individually using fragmentation
trees and isotope patterns.
structures, structure-db-search, structure
<COMPOUND TOOL> Search in molecular
structure db for each compound
individually using CSI:FingerID
structure database search.
zodiac, rerank-formulas <DATASET TOOL> Identify Molecular
formulas of all compounds in a
dataset together using ZODIAC.
classes, canopus, compound-classes <COMPOUND TOOL> Predict compound
categories for each compound
individually based on its
predicted molecular fingerprint
(CSI:FingerID) using CANOPUS.
fingerprints, fingerprint <COMPOUND TOOL> Predict molecular
fingerprint from MS/MS and
fragmentation trees for each
compound individually using
CSI:FingerID fingerprint prediction.
denovo-structures, msnovelist <COMPOUND TOOL> Predict MsNovelist
compound candidates and compare
them against molecular
fingerprint using CSI:FingerID
scoring method.
custom-db, DB <STANDALONE> Generate a custom
searchable structure/spectral
database. Import multiple files
with compounds into this DB.
similarity <STANDALONE> Computes the
similarity between all compounds
in the dataset and outputs a
matrix of similarities.
decomp, mass-decomposition <STANDALONE> Small tool to
decompose masses with given
deviation, ionization, chemical
alphabet and chemical filter.
mgf-export, MGF <STANDALONE> Exports the spectra of
a given input as mgf.
fingerprinter, FP <STANDALONE> Compute SIRIUS
compatible fingerprints from
PubChem standardized SMILES in
tsv format.
service, rest, REST <STANDALONE> Starts SIRIUS as a
background (REST) service that
can be requested via a REST-API.
login <STANDALONE> Allows a user to login
for SIRIUS Webservices (e.g.
CSI:FingerID or CANOPUS) and securely
store a personal access token.
settings <STANDALONE> Configure persistent
(technical) settings of SIRIUS
(e.g. ProxySettings or ILP Solver).
install-autocompletion <INSTALL> generates and installs an
Autocompletion-Script with all
subcommands. Default installation
is for the current user.
summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write
Summary files from a given
project-space into the given
project-space or a custom
location.
Usage: sirius formulas [-hV] [--elements-extended-organic]
[--no-isotope-filter] [--no-isotope-score]
[--no-recalibration]
[--bottom-up-search=<bottomUpSearchOptions>]
[-c=<numberOfCandidates>]
[--candidates-per-ionization=<numberOfCandidatesPerIonization>]
[--compound-timeout=<instanceTimeout>]
[-d=<dbName>[,<dbName>...]] [-e=<detectableElements>]
[-E=<enforcedElements>] [-f=<candidateFormulas>]
[--heuristic=<mzToUseHeuristic>]
[--heuristic-only=<mzToUseHeuristicOnly>]
[-i=<ionsConsidered>] [-I=<ionsEnforced>]
[-l=<injectElGordoCompounds>] [-p=<profile>]
[--ppm-max=<ppmMax>] [--ppm-max-ms2=<ppmMaxMs2>]
[--solver=<solver>] [--tree-timeout=<treeTimeout>] []
[COMMAND]
<COMPOUND TOOL> Identify molecular formula for each compound individually using
fragmentation trees and isotope patterns.
--solver, --ilp-solver=<solver>
Set ILP solver to be used for fragmentation tree
computation. Valid values: 'CLP' (included),
'CPLEX', 'GUROBI'.
For GUROBI and CPLEX, environment variables need to
be configured (see Manual).
--no-recalibration Disable Recalibration of input Spectra
--elements-extended-organic
Use extended set of elements for molecular formula
generation. DO NOT USE IN COMBINATION WITH DE
NOVO FORMULA GENERATION!
Enforced elements are: CHNOPFI
Detectable elements are: SBBrCl
--bottom-up-search=<bottomUpSearchOptions>
Valid values: CUSTOM, BOTTOM_UP_ONLY, DISABLED. Use
DISABLED to deactivate bottom up search. Use
BOTTOM_UP_ONLY to replace de novo computations
with bottom up search for every compound.
Default: CUSTOM, which uses the predefined values
from the config tool.
--no-isotope-filter Disable molecular formula filter. When filtering is
enabled, molecular formulas are excluded if their
theoretical isotope pattern does not match the
measured one, even if their MS/MS pattern has a
high score.
--no-isotope-score Disable isotope pattern score.
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
--ppm-max=<ppmMax> Maximum allowed mass deviation in ppm for
decomposing masses.
Default: 10.0 ppm
-p, --profile=<profile> Name of the configuration profile.
Predefined profiles are: 'default', 'qtof',
'orbitrap', 'fticr'.
Default: default
-e, --elements-considered=<detectableElements>
Set the allowed elements for rare element detection.
Example: `SBrClBSe` to allow the elements S,Br,Cl,B
and Se.
Default: S,Br,Cl,B,Se
-E, --elements-enforced=<enforcedElements>
Enforce elements for molecular formula
determination.
Example: CHNOPSCl to allow the elements C, H, N, O,
P, S and Cl. Add numbers in brackets to restrict
the minimal and maximal allowed occurrence of
these elements: CHNOP[5]S[8]Cl[1-2]. When one
number is given then it is interpreted as upper
bound.
Default: C,H,N,O,P
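The element notation described for --elements-enforced (for example CHNOP[5]S[8]Cl[1-2]) can be read with a short Python sketch. It follows the rules stated above: a single number is an upper bound and a range gives minimum and maximum. Treating unbracketed elements as "0 to unlimited" and accepting commas as separators are assumptions, and the function name is hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

# An element symbol (one upper-case letter, optional lower-case letter),
# optionally followed by [max] or [min-max].
_ELEMENT = re.compile(r"([A-Z][a-z]?)(?:\[(\d+)(?:-(\d+))?\])?")

Bounds = Tuple[int, Optional[int]]  # (minimum, maximum or None for unlimited)


def parse_elements(spec: str) -> Dict[str, Bounds]:
    """Parse e.g. 'CHNOP[5]S[8]Cl[1-2]' into per-element occurrence bounds."""
    spec = spec.replace(",", "").strip()  # also accept 'C,H,N,O,P'
    result: Dict[str, Bounds] = {}
    pos = 0
    while pos < len(spec):
        match = _ELEMENT.match(spec, pos)
        if match is None:
            raise ValueError(f"cannot parse element list at position {pos}: {spec!r}")
        element, first, second = match.groups()
        if first is None:
            result[element] = (0, None)                  # no restriction given
        elif second is None:
            result[element] = (0, int(first))            # single number: upper bound
        else:
            result[element] = (int(first), int(second))  # explicit min-max range
        pos = match.end()
    return result


print(parse_elements("CHNOP[5]S[8]Cl[1-2]"))
```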
-c, --candidates=<numberOfCandidates>
Number of precursor formula candidates in the
output - each can correspond to multiple adducts.
Default: 10
-f, --formulas=<candidateFormulas>
Specify a list of candidate formulas the method
should use. Omit this option if you want to
consider all possible molecular formulas
Default: null
-I, --adducts-enforced=<ionsEnforced>
Adducts that are always considered during the
analysis. Example: [M+H]+,[M-H]-,[M+Cl]-,[M+Na]+,
[M]+,[M-H2O+H]+.
Default: ,
--compound-timeout=<instanceTimeout>
Maximal computation time in seconds for a single
compound. 0 for an infinite amount of time.
Default: 0
--tree-timeout=<treeTimeout>
Timeout in seconds per fragmentation tree
computation. 0 for an infinite amount of time.
Default: 0
-i, --adducts-considered=<ionsConsidered>
Adducts which are considered during adduct
detection. They are only used for further
analyses if there is an indication in the MS1
scan. If none of them could be detected in MS1,
all of them will be used as a fallback. Example:
[M+H]+,[M-H]-,[M+Cl]-,[M+Na]+,[M]+,[M-H2O+H]+.
Default: [M+H]+,[M+K]+,[M+Na]+,[M+H-H2O]+,[M+H-H4O2]
+,[M+NH3+H]+,[M+FA+H]+,[M+ACN+H]+,[2M+H]+,[2M+K]+,
[2M+Na]+,[M-H]-,[M+Cl]-,[M+Br]-,[M-H2O-H]-,
[M+Na-2H]-,[M+CH2O2-H]-,[M+C2H4O2-H]-,[M+H2O-H]-,
[M-H3N-H]-,[M-CO2-H]-,[M-CH2O3-H]-,[M-CH3-H]-,
[2M+H]-,[2M+Cl]-,[2M+Br]-
--candidates-per-ionization=<numberOfCandidatesPerIonization>
Minimum number of candidates in the output for each
ionization. Set to force output of results for
each possible ionization, even if not part of
highest ranked results.
Default: 1
-d, --db, --database=<dbName>[,<dbName>...]
Search formulas in the union of the given
databases. If no database is given, all possible
molecular formulas are considered (no database
is used).
Example: possible DBs: 'ALL,,BIO,PUBCHEM,MESH,HMDB,
KNAPSACK,CHEBI,PUBMED,KEGG,HSDB,MACONDA,METACYC,
GNPS,ZINCBIO,YMDB,PLANTCYC,NORMAN,ADDITIONAL,
PUBCHEMANNOTATIONBIO,PUBCHEMANNOTATIONDRUG,
PUBCHEMANNOTATIONSAFETYANDTOXIC,
PUBCHEMANNOTATIONFOOD,KEGGMINE,ECOCYCMINE,
YMDBMINE'
Default: none
--ppm-max-ms2=<ppmMaxMs2>
Maximum allowed mass deviation in ppm for
decomposing masses in MS2. If not specified, the
same value as for the MS1 is used.
Default: 10.0 ppm
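To give a feeling for what a ppm tolerance means in absolute terms, the sketch below applies the standard parts-per-million definition to an m/z value. The helper `ppm_window` is hypothetical, and SIRIUS' internal deviation model may include further terms that are not described here.

```python
def ppm_window(mz, ppm):
    """Return (low, high) m/z bounds for a relative tolerance given in ppm."""
    delta = mz * ppm / 1e6  # absolute deviation in m/z units
    return mz - delta, mz + delta


low, high = ppm_window(500.0, 10.0)
print(f"{low:.4f} .. {high:.4f}")
```

At m/z 500, the default of 10 ppm corresponds to a window of plus or minus 0.005.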
-l, --elgordo, --fix-lipids=<injectElGordoCompounds>
Fix the single molecular formula determined by El
Gordo if a lipid class is detected.
Default: True
--heuristic-only=<mzToUseHeuristicOnly>
Use only heuristic tree computation for compounds
>= the specified m/z.
Default: 650
--heuristic=<mzToUseHeuristic>
Enable heuristic preprocessing for compounds >= the
specified m/z.
Default: 300
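How the two m/z thresholds partition compounds can be sketched as below. The function and the returned labels are hypothetical, and what exactly happens below the `--heuristic` threshold is an assumption (the help text only says the heuristic is enabled at or above it).

```python
def tree_computation_mode(mz, heuristic=300.0, heuristic_only=650.0):
    """Classify a compound by precursor m/z using the two thresholds."""
    if mz >= heuristic_only:
        return "heuristic only"           # --heuristic-only threshold reached
    if mz >= heuristic:
        return "heuristic preprocessing"  # --heuristic threshold reached
    return "no heuristic"                 # assumed: plain tree computation


for mz in (250.0, 300.0, 649.9, 650.0):
    print(mz, tree_computation_mode(mz))
```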
Commands:
zodiac, rerank-formulas <DATASET TOOL> Identify Molecular formulas of
all compounds in a dataset together using
ZODIAC.
fingerprints, fingerprint <COMPOUND TOOL> Predict molecular fingerprint
from MS/MS and fragmentation trees for each
compound individually using CSI:FingerID
fingerprint prediction.
summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write Summary
files from a given project-space into the
given project-space or a custom location.
Usage: sirius zodiac [-hV] [--ignore-spectra-quality] [--burn-in=<burnInSteps>]
[--considered-candidates-at-300=<numberOfConsideredCandidatesBelow300>]
[--considered-candidates-at-800=<numberOfConsideredCandidatesAbove800>]
[--iterations=<iterationSteps>]
[--minLocalConnections=<minLocalConnections>]
[--thresholdFilter=<thresholdFilter>] [COMMAND]
<DATASET TOOL> Identify Molecular formulas of all compounds in a dataset
together using ZODIAC.
--burn-in=<burnInSteps>
Number of epochs considered as 'burn-in period'.
Default: 2000
--considered-candidates-at-300=<numberOfConsideredCandidatesBelow300>
Maximum number of candidate molecular formulas (fragmentation
trees computed by SIRIUS) per compound which are considered
by ZODIAC for compounds below 300 m/z.
Default: 10
--considered-candidates-at-800=<numberOfConsideredCandidatesAbove800>
Maximum number of candidate molecular formulas (fragmentation
trees computed by SIRIUS) per compound which are considered
by ZODIAC for compounds above 800 m/z.
Default: 50
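The help text fixes the candidate limit below 300 m/z and above 800 m/z but does not say what happens in between. The sketch below assumes a linear interpolation between the two settings; that choice, the rounding and the function name are assumptions and not confirmed ZODIAC behaviour.

```python
def considered_candidates(mz, at_300=10, at_800=50):
    """Candidate limit per compound for a given precursor m/z.

    Assumption: values between 300 and 800 m/z are interpolated linearly.
    """
    if mz <= 300.0:
        return at_300
    if mz >= 800.0:
        return at_800
    fraction = (mz - 300.0) / (800.0 - 300.0)
    return round(at_300 + fraction * (at_800 - at_300))


for mz in (200.0, 550.0, 900.0):
    print(mz, considered_candidates(mz))
```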
-h, --help Show this help message and exit.
--ignore-spectra-quality
By default, ZODIAC runs a 2-step approach: first on 'good
quality compounds' only, then including the remaining
compounds. Set this flag to ignore spectra quality.
--iterations=<iterationSteps>
Number of epochs to run the Gibbs sampling. When multiple
Markov chains are computed, all chains' iterations sum up
to this value.
Default: 20000
--minLocalConnections=<minLocalConnections>
Minimum number of compounds to which at least one candidate
per compound must be connected.
Default: 10
--thresholdFilter=<thresholdFilter>
Defines the proportion of edges of the complete network which
will be ignored.
Default: 0.95
-V, --version Print version information and exit.
Commands:
fingerprints, fingerprint <COMPOUND TOOL> Predict molecular fingerprint
from MS/MS and fragmentation trees for each
compound individually using CSI:FingerID
fingerprint prediction.
summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write Summary
files from a given project-space into the
given project-space or a custom location.
Usage: sirius structures [-hV] [-d=<dbName>[,<dbName>...]]
[-e=<expansiveSearchConfMode>] [COMMAND]
<COMPOUND TOOL> Search a molecular structure database for each compound
individually using CSI:FingerID structure database search.
-d, --db, --database=<dbName>[,<dbName>...]
Search structure in the union of the given databases. If no
database is given the default database(s) are used.
Example: possible DBs: 'ALL,,BIO,PUBCHEM,MESH,HMDB,KNAPSACK,
CHEBI,PUBMED,KEGG,HSDB,MACONDA,METACYC,GNPS,ZINCBIO,YMDB,
PLANTCYC,NORMAN,ADDITIONAL,PUBCHEMANNOTATIONBIO,
PUBCHEMANNOTATIONDRUG,PUBCHEMANNOTATIONSAFETYANDTOXIC,
PUBCHEMANNOTATIONFOOD,KEGGMINE,ECOCYCMINE,YMDBMINE'
Default: BIO
-e, --exp=<expansiveSearchConfMode>
Confidence mode that is used for expansive search. OFF -> no
expansive search. EXACT -> Exact mode confidence score is
used for expansive search. APPROXIMATE -> Approximate mode
confidence score is used for expansive search
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
denovo-structures, msnovelist <COMPOUND TOOL> Predict MsNovelist compound
candidates and compare them against
molecular fingerprint using CSI:FingerID
scoring method.
summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write Summary
files from a given project-space into the
given project-space or a custom location.
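As an illustration of how the global options and the `structures` options combine on one command line, the sketch below assembles an argument list and prints it without running anything. The file names `demo.mgf` and `demo.sirius` and the helper function are hypothetical; whether `structures` can run without preceding formula and fingerprint steps is not covered by this help text, so treat the resulting line as a syntax example only.

```python
import shlex


def build_structures_command(input_path, project_path, databases, exp_mode=None):
    """Assemble (but do not execute) a 'sirius ... structures' argument list."""
    argv = [
        "sirius",
        f"-i={input_path}",         # multi-compound input
        f"-o={project_path}",       # output project-space
        "structures",
        "-d=" + ",".join(databases),  # union of the given databases
    ]
    if exp_mode is not None:
        argv.append(f"-e={exp_mode}")  # OFF, EXACT or APPROXIMATE
    return argv


argv = build_structures_command(
    "demo.mgf", "demo.sirius", ["BIO", "PUBCHEM"], exp_mode="APPROXIMATE"
)
command_line = shlex.join(argv)
print(command_line)
```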