バイオ情報計測学は舟でゆく: SIRIUS6 CLI

Usage: sirius [-hV] [--noCite] [--recompute] [--buffer=<initialInstanceBuffer>]

[--cores=<numOfCores>] [--log=<logLevel>] [--maxmz=<maxMz>]

[--workspace=<workspace>] [[-o=<outputProjectLocation>]

[--update-fingerprint-version]] [[-i=<inputPath>[,<inputPath>...]

[-i=<inputPath>[,<inputPath>...]]... [--ignore-formula]

[--allow-ms1-only]] [-z=<parentMz> [-1=<ms1File>[,<ms1File>...]]

[--adduct=<ionType>] -2=<ms2File>[,<ms2File>...]

[-f=<formula>]]...] [COMMAND]

-h, --help Show this help message and exit.

-V, --version Print version information and exit.

--log, --loglevel=<logLevel>

Set logging level of the Jobs SIRIUS will execute.

Valid values: SEVERE, WARNING, INFO, FINER, ALL

Default: WARNING

--cores, --threads, --processors=<numOfCores>

Number of simultaneous worker threads to be used for

compute-intensive workloads. If not specified, SIRIUS

chooses a reasonable number based on your CPU specs.

--buffer, --instance-buffer=<initialInstanceBuffer>

Number of instances that will be loaded into the

Memory. A larger buffer ensures that there are

enough instances available to use all cores

efficiently during computation. A smaller buffer

saves Memory. To load all instances immediately

set it to -1. Default (numeric value 0): 3 x

--cores. Note that for <DATASET_TOOLS> the

compound buffer may have no effect because these

tools may have to load compounds simultaneously

into the memory.

Default: 0
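The buffer rule described above can be sketched in Python. This is only an illustration of the documented default (0 means 3 x --cores, -1 means load everything), not SIRIUS code. The helper name `effective_buffer` is hypothetical, and treating any other positive value as "used as given" is an assumption.

```python
from typing import Optional


def effective_buffer(buffer: int, cores: int) -> Optional[int]:
    """Resolve a --buffer value to a number of instances (hypothetical helper).

    Returns None when all instances are loaded immediately (-1).
    """
    if buffer == -1:
        return None       # load all instances at once
    if buffer == 0:
        return 3 * cores  # documented default: 3 x --cores
    return buffer         # assumption: other values are used as given


print(effective_buffer(0, 8))  # 24
```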

--workspace=<workspace>

Specify sirius workspace location. This is the

directory for storing Property files, logs,

databases and caches. This is NOT for the

project-space that stores the results! Default is

$USER_HOME/.sirius-<MINOR_VERSION>

--recompute Recompute results of ALL tools where results are

already present. Per default already present

results will be preserved and the instance will

be skipped for the corresponding Task/Tool

--maxmz=<maxMz> Only considers compounds with a precursor m/z lower

than or equal to [--maxmz]. All other compounds in the

input will be skipped.

Default: Infinity

--noCite, --noCitations, --no-citations

Do not write summary files to the project-space

Specify OUTPUT Project-Space:

-o, -p, --output, --project=<outputProjectLocation>

Specify the project-space to write into. If no

[--input] is specified it is also used as input.

For compression use the file ending .zip or

.sirius.

--update-fingerprint-version

Updates Fingerprint versions of the input project

to the one used by this SIRIUS version.

WARNING: All Fingerprint related results (CSI:

FingerID, CANOPUS) will be lost!

Specify multi-compound inputs (.ms, .mgf, .mzML/.mzXml, .sirius):

-i, --input=<inputPath>[,<inputPath>...]

Specify the input in multi-compound input formats:

Preprocessed mass spectra in .ms or .mgf file

format or LC/MS runs in .mzML/.mzXml format but

also any other file type e.g. to provide input

for STANDALONE tools.

--ignore-formula Ignore given molecular formula if present in .ms or

.mgf input files.

--allow-ms1-only Allow MS1 only data to be imported.

Specify generic inputs (CSV) on per compound level:

-1, --ms1=<ms1File>[,<ms1File>...]

MS1 spectra files

-2, --ms2=<ms2File>[,<ms2File>...]

MS2 spectra files

-z, --mz, --precursor, --parentmass=<parentMz>

The mass of the parent ion for the specified ms2

spectra

--adduct, --ionization=<ionType>

Specify the adduct for this compound

Default: [M+?]+

-f, --formula=<formula> Specify the neutralized formula of this compound.

This will be used for tree computation. If given

no mass decomposition will be performed.
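As a rough sketch of how the global options above combine with a subcommand, the following Python helper assembles an argument list for a call such as `subprocess.run`. It only uses flags listed in this help text (`--cores`, `--maxmz`, `-i`, `-o`). The function name and the file names in the usage note are hypothetical.

```python
from typing import List, Optional, Sequence


def build_sirius_command(
    inputs: Sequence[str],
    project: str,
    cores: Optional[int] = None,
    max_mz: Optional[float] = None,
    tool: str = "formulas",
) -> List[str]:
    """Assemble an argv list: global options first, then the subcommand."""
    argv = ["sirius"]
    if cores is not None:
        argv.append(f"--cores={cores}")
    if max_mz is not None:
        argv.append(f"--maxmz={max_mz}")
    # -i accepts a comma-separated list of input paths.
    argv.append("-i=" + ",".join(inputs))
    # -o names the project-space that receives the results.
    argv.append(f"-o={project}")
    argv.append(tool)
    return argv
```

For example, `build_sirius_command(["run1.mgf", "run2.ms"], "results.sirius", cores=4)` yields a list that can be passed to `subprocess.run` without shell quoting issues.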

Commands:

config

<CONFIGURATION> Override all

possible default configurations

of this toolbox from the command

line.

custom-db, DB <STANDALONE> Generate a custom

searchable structure/spectral

database. Import multiple files

with compounds into this DB.

similarity <STANDALONE> Computes the

similarity between all compounds

in the dataset and outputs a

matrix of similarities.

decomp, mass-decomposition <STANDALONE> Small tool to

decompose masses with given

deviation, ionization, chemical

alphabet and chemical filter.

mgf-export, MGF <STANDALONE> Exports the spectra of

a given input as mgf.

fingerprinter, FP <STANDALONE> Compute SIRIUS

compatible fingerprints from

PubChem standardized SMILES in

tsv format.

service, rest, REST <STANDALONE> Starts SIRIUS as a

background (REST) service that

can be requested via a REST-API.

login <STANDALONE> Allows a user to login

for SIRIUS Webservices (e.g. CSI:

FingerID or CANOPUS) and securely

store a personal access token.

settings <STANDALONE> Configure persistent

(technical) settings of SIRIUS (e.g.

ProxySettings or ILP Solver).

install-autocompletion <INSTALL> generates and installs an

Autocompletion-Script with all

subcommands. Default installation

is for the current user.

summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write

Summary files from a given

project-space into the given

project-space or a custom

location.

lcms-align, A <PREPROCESSING> Align and merge

compounds of multiple LCMS Runs.

Use this tool if you want to

import from mzML/mzXml.

denovo-structures, msnovelist <COMPOUND TOOL> Predict MsNovelist

compound candidates and compare

them against molecular

fingerprint using CSI:FingerID

scoring method.

spectra-search, library-search <COMPOUND TOOL> Computes the

similarity between all

compounds/features in the

project-space (queries) one vs

all spectra in the selected

databases.

formulas, trees, formula, sirius <COMPOUND TOOL> Identify molecular

formula for each compound

individually using fragmentation

trees and isotope patterns.

structures, structure-db-search, structure

<COMPOUND TOOL> Search in molecular

structure db for each compound

Individually using CSI:FingerID

structure database search.

zodiac, rerank-formulas <DATASET TOOL> Identify Molecular

formulas of all compounds in a

dataset together using ZODIAC.

classes, canopus, compound-classes <COMPOUND TOOL> Predict compound

categories for each compound

individually based on its

predicted molecular fingerprint

(CSI:FingerID) using CANOPUS.

fingerprints, fingerprint <COMPOUND TOOL> Predict molecular

fingerprint from MS/MS and

fragmentation trees for each

compound individually using CSI:

FingerID fingerprint prediction.

Usage: sirius login [-hV] [--clear] [--limits] [--request-token-only] [--show]

[--select-license=<sid>] [[-u=<username> -p] |

[--token=<token>] | [--password-env=<password>

--user-env=<username>]]

<STANDALONE> Allows a user to login for SIRIUS Webservices (e.g. CSI:FingerID

or CANOPUS) and securely store a personal access token.

--clear, --logout Logout. Deletes stored refresh and access token

(re-login required to use webservices again).

-h, --help Show this help message and exit.

--limits, --license-info

Show license information and compound limits.

-p, --pwd, --password Console password input.

--password-env=<password>

Environment variable with login password.

--request-token-only Requests and prints a new SECRET refresh token but

does not store the token as login.

This can be used to request a token to be used in

third party applications that wish to call

SIRIUS Web Services using your account.

Never store your username and password in third

party apps.

Do not store the output of this command in any

log. We recommend redirecting the output into a

file.

--select-license, --select-subscription=<sid>

Specify active subscription (sid) if multiple

licenses are available at your account.

Available subscriptions can be listed with

'--show'

--show Show profile information about the profile you are

logged in with.

--token=<token> Refresh token to use as login.

-u, --user, --email=<username>

--user-env=<username> Environment variable with login username.

-V, --version Print version information and exit.

Usage: sirius config [-hV] [--AdductSettings.detectable=[M+H]+,[M+K]+,[M+Na]+,

[M+H-H2O]+,[M+H-H4O2]+,[M+NH3+H]+,[M+FA+H]+,[M+ACN+H]+,

[2M+H]+,[2M+K]+,[2M+Na]+,[M-H]-,[M+Cl]-,[M+Br]-,[M-H2O-H]-,

[M+Na-2H]-,[M+CH2O2-H]-,[M+C2H4O2-H]-,[M+H2O-H]-,[M-H3N-H]

-,[M-CO2-H]-,[M-CH2O3-H]-,[M-CH3-H]-,[2M+H]-,[2M+Cl]-,

[2M+Br]-] [--AdductSettings.enforced=,] [--AdductSettings.

fallback=[M+H]+,[M-H]-,[M+Na]+,[M+K]+] [--AdductSettings.

ignoreDetectedAdducts=false] [--AdductSettings.

prioritizeInputFileAdducts=true]

[--AlgorithmProfile=default] [--CandidateFormulas=,]

[--CompoundQuality=UNKNOWN]

[--ConfidenceScoreApproximateDistance=2]

[--EnforceElGordoFormula=True]

[--ExpansiveSearchConfidenceMode.

confidenceScoreSimilarityMode=APPROXIMATE]

[--ExpansiveSearchConfidenceMode.confPubChemFactor=0.5]

[--ForbidRecalibration=ALLOWED]

[--FormulaResultThreshold=true] [--FormulaSearchDB=none]

[--FormulaSearchSettings.

applyFormulaConstraintsToBottomUp=false]

[--FormulaSearchSettings.

applyFormulaConstraintsToDatabaseCandidates=false]

[--FormulaSearchSettings.performBottomUpAboveMz=0]

[--FormulaSearchSettings.performDeNovoBelowMz=400]

[--FormulaSettings.detectable=S,Br,Cl,B,Se]

[--FormulaSettings.enforced=C,H,N,O,P] [--FormulaSettings.

fallback=S] [--InjectSpectralLibraryMatchFormulas.

alwaysPredict=true] [--InjectSpectralLibraryMatchFormulas.

injectFormulas=true] [--InjectSpectralLibraryMatchFormulas.

minPeakMatchesToInject=6]

[--InjectSpectralLibraryMatchFormulas.minScoreToInject=.7]

[--IsotopeMs2Settings=IGNORE] [--IsotopeSettings.

filter=True] [--IsotopeSettings.multiplier=1]

[--MedianNoiseIntensity=0.015] [--MotifDbFile=none] [--ms1.

absoluteIntensityError=0.02] [--ms1.

minimalIntensityToConsider=0.01] [--ms1.

relativeIntensityError=0.08] [--MS1MassDeviation.

allowedMassDeviation=10.0 ppm] [--MS1MassDeviation.

massDifferenceDeviation=5.0 ppm] [--MS1MassDeviation.

standardMassDeviation=10.0 ppm] [--MS2MassDeviation.

allowedMassDeviation=10.0 ppm] [--MS2MassDeviation.

standardMassDeviation=10.0 ppm] [--NoiseThresholdSettings.

absoluteThreshold=0] [--NoiseThresholdSettings.

basePeak=NOT_PRECURSOR] [--NoiseThresholdSettings.

intensityThreshold=0.005] [--NoiseThresholdSettings.

maximalNumberOfPeaks=60] [--NumberOfCandidates=10]

[--NumberOfCandidatesPerIonization=1]

[--NumberOfMsNovelistCandidates=128]

[--NumberOfStructureCandidates=10000]

[--PossibleAdductSwitches=[M+Na]+:[M+H]+,[M+K]+:[M+H]+,

[M+Cl]-:[M-H]-] [--PrintCitations=True]

[--RecomputeResults=False]

[--SpectralMatchingMassDeviation.allowedPeakDeviation=10.0

ppm] [--SpectralMatchingMassDeviation.

allowedPrecursorDeviation=10.0 ppm]

[--SpectralMatchingScorer=MODIFIED_COSINE]

[--SpectralSearchDB=ALL] [--SpectralSearchLog=10]

[--StructureSearchDB=BIO] [--TagStructuresByElGordo=True]

[--Timeout.secondsPerInstance=0] [--Timeout.

secondsPerTree=0] [--UseHeuristic.useHeuristicAboveMz=300]

[--UseHeuristic.useOnlyHeuristicAboveMz=650]

[--ZodiacClusterCompounds=false]

[--ZodiacEdgeFilterThresholds.minLocalCandidates=1]

[--ZodiacEdgeFilterThresholds.minLocalConnections=10]

[--ZodiacEdgeFilterThresholds.thresholdFilter=0.95]

[--ZodiacEpochs.burnInPeriod=2000] [--ZodiacEpochs.

iterations=20000] [--ZodiacEpochs.numberOfMarkovChains=10]

[--ZodiacLibraryScoring.lambda=1000]

[--ZodiacLibraryScoring.minCosine=0.5]

[--ZodiacNumberOfConsideredCandidatesAt300Mz=10]

[--ZodiacNumberOfConsideredCandidatesAt800Mz=50]

[--ZodiacRatioOfConsideredCandidatesPerIonization=0.2]

[--ZodiacRunInTwoSteps=true] [COMMAND]

<CONFIGURATION> Override all possible default configurations of this toolbox

from the command line.

--AdductSettings.detectable=[M+H]+,[M+K]+,[M+Na]+,[M+H-H2O]+,[M+H-H4O2]+,

[M+NH3+H]+,[M+FA+H]+,[M+ACN+H]+,[2M+H]+,[2M+K]+,[2M+Na]+,[M-H]-,[M+Cl]-,

[M+Br]-,[M-H2O-H]-,[M+Na-2H]-,[M+CH2O2-H]-,[M+C2H4O2-H]-,[M+H2O-H]-,

[M-H3N-H]-,[M-CO2-H]-,[M-CH2O3-H]-,[M-CH3-H]-,[2M+H]-,[2M+Cl]-,[2M+Br]-

Detectable ion modes which are only considered if

there is an indication in the MS1 scan (e.g.

correct mass delta).

--AdductSettings.enforced=,

Describes how to deal with Adducts:

Pos Examples: [M+H]+,[M]+,[M+K]+,[M+Na]+,[M+H-H2O]+,

[M+Na2-H]+,[M+2K-H]+,[M+NH4]+,[M+H3O]+,[M+MeOH+H]+,

[M+ACN+H]+,[M+2ACN+H]+,[M+IPA+H]+,[M+ACN+Na]+,

[M+DMSO+H]+

Neg Examples: [M-H]-,[M]-,[M+K-2H]-,[M+Cl]-,[M-H2O-H]

-,[M+Na-2H]-,[M+FA-H]-,[M+Br]-,[M+HAc-H]-,[M+TFA-H]

-,[M+ACN-H]-

Enforced ion modes that are always considered.

--AdductSettings.fallback=[M+H]+,[M-H]-,[M+Na]+,[M+K]+

Fallback ion modes which are considered if the auto

detection did not find any indication for an ion

mode.

--AdductSettings.ignoreDetectedAdducts=false

if true ignores detected adducts from all sources

(except input files) and uses fallback plus

enforced adducts instead

--AdductSettings.prioritizeInputFileAdducts=true

Adducts specified in the input file are used as is

independent of what enforced/detectable/fallback

adducts are set.

--AlgorithmProfile=default

Configuration profile to store instrument specific

algorithm properties.

Some of the default profiles are: 'qtof',

'orbitrap', 'fticr'.

--CandidateFormulas=,

This configuration holds a set of neutral formulas

to be used as candidates for SIRIUS.

The formulas may be provided by the user, from a

database or from the input file.

Note: This set might be merged with other sources

such as ElGordo predicted lipid.

Set of Molecular Formulas to be used as candidates

for molecular formula estimation with SIRIUS

--CompoundQuality=UNKNOWN

Keywords that can be assigned to an input spectrum to

judge its quality. Available keywords are: Good,

LowIntensity, NoMS1Peak, FewPeaks, Chimeric,

NotMonoisotopicPeak, PoorlyExplained

--ConfidenceScoreApproximateDistance=2

Allowed MCES distance between CSI:FingerID hit

(best-scoring candidate) and true structure that

should still count as correct identification.

Distance 0 corresponds to identical molecular

structures. The closest non-identical structures

have an MCES distance of 2 (cutting 2 bonds). It

continues with 4,6,8 and so on.

Currently only 0 (exact) and 2 for approximate are

supported.

--EnforceElGordoFormula=True

El Gordo may predict that an MS/MS spectrum is a

lipid spectrum.

The corresponding molecular formula may be enforced

or additionally added to the set of molecular

formula candidates.

--ExpansiveSearchConfidenceMode.confidenceScoreSimilarityMode=APPROXIMATE

Expansive search mode

OFF: No expansive search is performed

EXACT: Use confidence score in exact mode: Only

molecular structures identical to the true

structure should count as correct identification.

APPROXIMATE: Use confidence score in approximate

mode: Molecular structures hits that are close to

the true structure should count as correct

identification.

--ExpansiveSearchConfidenceMode.confPubChemFactor=0.5

Expansive search parameters.

Expansive search will expand the search space to

whole PubChem

in case no hit with reasonable confidence was found

in one of the user specified structure search

databases.

Factor that PubChem confidence scores gets

multiplied with as bias against it.

--ForbidRecalibration=ALLOWED

Enable/Disable the hypothesis-driven recalibration

of MS/MS spectra.

Must be either 'ALLOWED' or 'FORBIDDEN'.

--FormulaResultThreshold=true

Specifies if the list of Molecular Formula

Identifications is filtered by a soft threshold

(calculateThreshold) before CSI:FingerID

predictions are calculated.

--FormulaSearchDB=none

--FormulaSearchSettings.applyFormulaConstraintsToBottomUp=false

--FormulaSearchSettings.applyFormulaConstraintsToDatabaseCandidates=false

--FormulaSearchSettings.performBottomUpAboveMz=0

These settings define the behaviour of de novo and

bottom-up molecular formula generation.

Candidate formulas from database or user input are

handled independently via CandidateFormulas.

Candidate formulas from input files are always

prioritized. An (internal) parameter exists to

override this.

--FormulaSearchSettings.performDeNovoBelowMz=400

--FormulaSettings.detectable=S,Br,Cl,B,Se

Detectable elements are added to the chemical

alphabet, if there are indications for them (e.g.

in isotope pattern)

--FormulaSettings.enforced=C,H,N,O,P

These configurations hold the information how to

autodetect elements based on the given formula

constraints.

Note: If the compound is already assigned to a

specific molecular formula, this annotation is

ignored.

Enforced elements are always considered

--FormulaSettings.fallback=S

Fallback elements are used, if the auto-detection

fails (e.g. no isotope pattern available)

-h, --help Show this help message and exit.

--InjectSpectralLibraryMatchFormulas.alwaysPredict=true

If true, Fingerprint/Classes/Structures will be

predicted for formula candidates with reference

spectrum similarity > minScoreToEnforce, no matter

which soft threshold rules apply.

--InjectSpectralLibraryMatchFormulas.injectFormulas=true

If true, formula candidates with reference spectrum

similarity > minScoreToEnforce will be part of the

result list regardless of other filter settings or

their rank regarding SIRIUS score.

--InjectSpectralLibraryMatchFormulas.minPeakMatchesToInject=6

Matching peaks threshold to inject formula

candidates no matter which score they have or

which filter is applied.

--InjectSpectralLibraryMatchFormulas.minScoreToInject=.7

Specify settings to inject/preserve formula

candidates that belong to

high scoring reference spectra.

Similarity Threshold to inject formula candidates no

matter which score they have or which filter is

applied.

--IsotopeMs2Settings=IGNORE

--IsotopeSettings.filter=True

These configurations define how to deal with isotope

patterns in MS1.

When filtering is enabled, molecular formulas are

excluded if their theoretical isotope pattern does

not match the measured one, even if their MS/MS

pattern has high score.

--IsotopeSettings.multiplier=1

multiplier for the isotope score. Set to 0 to

disable isotope scoring. Otherwise, the score from

isotope pattern analysis is multiplied with this

coefficient. Set to a value larger than one if

your isotope pattern data is of much better

quality than your MS/MS data.

--MedianNoiseIntensity=0.015

--MotifDbFile=none

--ms1.absoluteIntensityError=0.02

The average absolute deviation between theoretical

and measured intensity of isotope peaks.

Do not change this parameter without a good reason!

--ms1.minimalIntensityToConsider=0.01

Ignore isotope peaks below this intensity.

This value should reflect the smallest relative

intensity which is still above noise level.

Obviously, this is hard to judge without having

absolute values. Keeping this value around 1

percent is fine for most settings. Set it to smaller

values if you trust your small intensities.

--ms1.relativeIntensityError=0.08

The average relative deviation between theoretical

and measured intensity of isotope peaks.

Do not change this parameter without a good reason!

--MS1MassDeviation.allowedMassDeviation=10.0 ppm

Mass accuracy setting for MS1 spectra. Mass

accuracies are always written as "X ppm (Y Da)"

where X and Y

are numerical values. The ppm is a relative measure

(parts per million), Da is an absolute measure.

For each mass, the

maximum of relative and absolute is used.
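The "X ppm (Y Da)" notation and the "maximum of relative and absolute" rule can be illustrated with a short Python sketch. Both helper names are hypothetical, the parser is not the one SIRIUS uses, and treating a missing "(Y Da)" part as 0 Da is an assumption made for the example.

```python
import re
from typing import Tuple

_DEVIATION = re.compile(
    r"^\s*([0-9]*\.?[0-9]+)\s*ppm(?:\s*\(\s*([0-9]*\.?[0-9]+)\s*Da\s*\))?\s*$"
)


def parse_deviation(text: str) -> Tuple[float, float]:
    """Parse 'X ppm' or 'X ppm (Y Da)' into (ppm, absolute Da)."""
    match = _DEVIATION.match(text)
    if match is None:
        raise ValueError(f"not a mass deviation: {text!r}")
    ppm = float(match.group(1))
    # Assumption: a missing absolute part is treated as 0 Da.
    absolute = float(match.group(2)) if match.group(2) else 0.0
    return ppm, absolute


def allowed_window_da(mz: float, ppm: float, absolute: float) -> float:
    """Return the larger of the relative (ppm) and absolute (Da) deviation."""
    return max(mz * ppm * 1e-6, absolute)
```

With a hypothetical setting of "10.0 ppm (0.002 Da)", the absolute part dominates below m/z 200 and the relative part dominates above it.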

--MS1MassDeviation.massDifferenceDeviation=5.0 ppm

--MS1MassDeviation.standardMassDeviation=10.0 ppm

--MS2MassDeviation.allowedMassDeviation=10.0 ppm

Mass accuracy setting for MS2 spectra. Mass

accuracies are always written as "X ppm (Y Da)"

where X and Y are numerical values.

The ppm is a relative measure (parts per million),

Da is an absolute measure. For each mass, the

maximum of relative and absolute is used.

--MS2MassDeviation.standardMassDeviation=10.0 ppm

--NoiseThresholdSettings.absoluteThreshold=0

--NoiseThresholdSettings.basePeak=NOT_PRECURSOR

--NoiseThresholdSettings.intensityThreshold=0.005

--NoiseThresholdSettings.maximalNumberOfPeaks=60

--NumberOfCandidates=10

--NumberOfCandidatesPerIonization=1

Use this parameter if you want to force to report at

least numberOfResultsToKeepPerIonization results

per ionization.

If set to 0, this parameter will have no effect and

just the top numberOfResultsToKeep results will be

reported.

--NumberOfMsNovelistCandidates=128

--NumberOfStructureCandidates=10000

--PossibleAdductSwitches=[M+Na]+:[M+H]+,[M+K]+:[M+H]+,[M+Cl]-:[M-H]-

An adduct switch is a switch of the ionization mode

within a spectrum, e.g. an ion replaces a sodium

adduct with a protonation during fragmentation. Such

adduct switches heavily increase the complexity of

the analysis, but for certain adducts they might

happen regularly. Adduct switches are written in the

form a -> b, a -> c, d -> c where a, b,

c, and d are adducts and a -> b

denotes an allowed switch from

a to b within the MS/MS spectrum.
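The default value shown for this option uses comma-separated pairs with a colon between the two adducts. A minimal Python sketch for splitting such a value into pairs follows. The function name is hypothetical, and reading `a:b` as "a may switch to b" is an assumption based on the description above.

```python
from typing import List, Tuple


def parse_adduct_switches(value: str) -> List[Tuple[str, str]]:
    """Split 'a:b,c:d' into [(a, b), (c, d)] pairs."""
    switches = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate empty entries such as a trailing comma
        source, sep, target = entry.partition(":")
        if not sep or not source or not target:
            raise ValueError(f"malformed adduct switch: {entry!r}")
        switches.append((source, target))
    return switches


default = "[M+Na]+:[M+H]+,[M+K]+:[M+H]+,[M+Cl]-:[M-H]-"
for source, target in parse_adduct_switches(default):
    print(source, "->", target)
```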

--PrintCitations=True

--RecomputeResults=False

--SpectralMatchingMassDeviation.allowedPeakDeviation=10.0 ppm

Maximum allowed mass deviation in ppm for matching

peaks.

--SpectralMatchingMassDeviation.allowedPrecursorDeviation=10.0 ppm

Maximum allowed mass deviation in ppm for matching

the precursor.

--SpectralMatchingScorer=MODIFIED_COSINE

--SpectralSearchDB=ALL

--SpectralSearchLog=10

--StructureSearchDB=BIO

--TagStructuresByElGordo=True

Molecular structure candidates matching the lipid

class estimated by El Gordo will be tagged.

The lipid class will only be available if El Gordo

predicts that the MS/MS is a lipid spectrum.

If this parameter is set to 'false' El Gordo will

still be executed and e.g. improve molecular

formula annotation, but the matching structure

candidates will not be tagged as lipid class.

--Timeout.secondsPerInstance=0

This configuration defines a timeout for the tree

computation.

As the underlying problem is NP-hard, it might take

forever to compute trees for very challenging (e.

g. large mass) compounds.

Setting a time constraint allows the program to

continue with other instances and just skip the

challenging ones.

Note that due to multithreading, these time

constraints are not absolutely accurate.

Set the maximum number of seconds for computing a

single compound. Set to 0 to disable the time

constraint.

--Timeout.secondsPerTree=0

Set the maximum number of seconds for a single

molecular formula check. Set to 0 to disable the

time constraint

--UseHeuristic.useHeuristicAboveMz=300

Set minimum m/z to enable heuristic preprocessing.

The heuristic will be used to initially rank the

formula candidates. The Top (NumberOfCandidates)

candidates will then be computed exactly by

solving the ILP.

--UseHeuristic.useOnlyHeuristicAboveMz=650

Set minimum m/z to only use heuristic tree

computation. No exact tree computation (ILP) will

be performed for these compounds.
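The two heuristic thresholds split compounds into three m/z ranges. The Python sketch below shows that decision with the default values 300 and 650. The function name and the returned labels are hypothetical, and treating both thresholds as inclusive lower bounds is an assumption.

```python
def tree_computation_mode(
    mz: float,
    use_heuristic_above: float = 300.0,
    only_heuristic_above: float = 650.0,
) -> str:
    """Classify a compound by precursor m/z (illustrative labels only)."""
    if mz >= only_heuristic_above:
        return "heuristic-only"        # no exact (ILP) tree computation
    if mz >= use_heuristic_above:
        return "heuristic-then-exact"  # heuristic ranking, then ILP for the top candidates
    return "exact"                     # exact tree computation without preprocessing


for mz in (250.0, 450.0, 700.0):
    print(mz, tree_computation_mode(mz))
```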

-V, --version Print version information and exit.

--ZodiacClusterCompounds=false

cluster compounds before running ZODIAC

--ZodiacEdgeFilterThresholds.minLocalCandidates=1

Minimum number of candidates per compound which are

forced to have at least [minLocalConnections]

connections to other compounds.

E.g. 2 candidates per compound must have at least 10

connections to other compounds

--ZodiacEdgeFilterThresholds.minLocalConnections=10

Minimum number of connections per candidate which

are forced for at least [minLocalCandidates]

candidates to other compounds.

E.g. 2 candidates per compound must have at least 10

connections to other compounds

--ZodiacEdgeFilterThresholds.thresholdFilter=0.95

Defines the proportion of edges of the complete

network which will be ignored.

--ZodiacEpochs.burnInPeriod=2000

Number of epochs considered as 'burn-in period'.

Samples from the beginning of a Markov chain do not

accurately represent the desired distribution of

candidates and are not used to estimate the ZODIAC

score.

--ZodiacEpochs.iterations=20000

Number of epochs to run the Gibbs sampling. When

multiple Markov chains are computed, all chains'

iterations sum up to this value.

--ZodiacEpochs.numberOfMarkovChains=10

Number of separate Gibbs sampling runs.

--ZodiacLibraryScoring.lambda=1000

Lambda used in the scoring function of spectral

library hits. The higher this value, the more strongly

library hits are weighted in ZODIAC scoring.

--ZodiacLibraryScoring.minCosine=0.5

Spectral library hits must have at least this cosine

or higher to be considered in scoring. Value must

be in [0,1].

--ZodiacNumberOfConsideredCandidatesAt300Mz=10

Maximum number of candidate molecular formulas

(fragmentation trees computed by SIRIUS) per

compound which are considered by ZODIAC.

This is the threshold used for all compounds with mz

below 300 m/z and is used to interpolate the

number of candidates for larger compounds.

If lower than 0, all available candidates are

considered.

--ZodiacNumberOfConsideredCandidatesAt800Mz=50

Maximum number of candidate molecular formulas

(fragmentation trees computed by SIRIUS) per

compound which are considered by ZODIAC.

This is the threshold used for all compounds with mz

above 800 m/z and is used to interpolate the

number of candidates for smaller compounds.

If lower than 0, all available candidates are

considered.
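The help text only says that the limits at 300 m/z and 800 m/z are used to interpolate the number of candidates in between. The Python sketch below assumes plain linear interpolation with rounding, which may differ from what ZODIAC actually does. The function name is hypothetical, and negative limits ("all candidates") are not handled.

```python
def zodiac_candidate_limit(mz: float, at_300: int = 10, at_800: int = 50) -> int:
    """Candidate limit for a compound, assuming linear interpolation."""
    if mz <= 300.0:
        return at_300
    if mz >= 800.0:
        return at_800
    fraction = (mz - 300.0) / (800.0 - 300.0)
    return round(at_300 + fraction * (at_800 - at_300))


print(zodiac_candidate_limit(550.0))  # 30 under the linear assumption
```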

--ZodiacRatioOfConsideredCandidatesPerIonization=0.2

Ratio of candidate molecular formulas (fragmentation

trees computed by SIRIUS) per compound which are

forced for each ionization to be considered by

ZODIAC.

This depends on the number of candidates ZODIAC

considers. E.g. if 50 candidates are considered

and a ratio of 0.2 is set, at least 10 candidates

per ionization will be considered, which might

increase the number of candidates above 50.
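The arithmetic in the example above (50 candidates, ratio 0.2, at least 10 per ionization) can be written as a small Python sketch. The function name is hypothetical, and rounding up for non-integer products is an assumption. The small epsilon only guards against floating-point noise.

```python
import math


def min_candidates_per_ionization(considered: int, ratio: float = 0.2) -> int:
    """Minimum number of candidates kept per ionization (rounding up is assumed)."""
    return math.ceil(considered * ratio - 1e-9)


print(min_candidates_per_ionization(50))  # 10
```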

--ZodiacRunInTwoSteps=true

As default ZODIAC runs a 2-step approach. First

running 'good quality compounds' only, and

afterwards including the remaining.

Commands:

lcms-align, A <PREPROCESSING> Align and merge

compounds of multiple LCMS Runs.

Use this tool if you want to

import from mzML/mzXml.

spectra-search, library-search <COMPOUND TOOL> Computes the

similarity between all

compounds/features in the

project-space (queries) one vs

all spectra in the selected

databases.

formulas, trees, formula, sirius <COMPOUND TOOL> Identify molecular

formula for each compound

individually using fragmentation

trees and isotope patterns.

structures, structure-db-search, structure

<COMPOUND TOOL> Search in molecular

structure db for each compound

Individually using CSI:FingerID

structure database search.

zodiac, rerank-formulas <DATASET TOOL> Identify Molecular

formulas of all compounds in a

dataset together using ZODIAC.

classes, canopus, compound-classes <COMPOUND TOOL> Predict compound

categories for each compound

individually based on its

predicted molecular fingerprint

(CSI:FingerID) using CANOPUS.

fingerprints, fingerprint <COMPOUND TOOL> Predict molecular

fingerprint from MS/MS and

fragmentation trees for each

compound individually using CSI:

FingerID fingerprint prediction.

denovo-structures, msnovelist <COMPOUND TOOL> Predict MsNovelist

compound candidates and compare

them against molecular

fingerprint using CSI:FingerID

scoring method.

custom-db, DB <STANDALONE> Generate a custom

searchable structure/spectral

database. Import multiple files

with compounds into this DB.

similarity <STANDALONE> Computes the

similarity between all compounds

in the dataset and outputs a

matrix of similarities.

decomp, mass-decomposition <STANDALONE> Small tool to

decompose masses with given

deviation, ionization, chemical

alphabet and chemical filter.

mgf-export, MGF <STANDALONE> Exports the spectra of

a given input as mgf.

fingerprinter, FP <STANDALONE> Compute SIRIUS

compatible fingerprints from

PubChem standardized SMILES in

tsv format.

service, rest, REST <STANDALONE> Starts SIRIUS as a

background (REST) service that

can be requested via a REST-API.

login <STANDALONE> Allows a user to login

for SIRIUS Webservices (e.g. CSI:

FingerID or CANOPUS) and securely

store a personal access token.

settings <STANDALONE> Configure persistent

(technical) settings of SIRIUS (e.g.

ProxySettings or ILP Solver).

install-autocompletion <INSTALL> generates and installs an

Autocompletion-Script with all

subcommands. Default installation

is for the current user.

summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write

Summary files from a given

project-space into the given

project-space or a custom

location.

Usage: sirius formulas [-hV] [--elements-extended-organic]

[--no-isotope-filter] [--no-isotope-score]

[--no-recalibration]

[--bottom-up-search=<bottomUpSearchOptions>]

[-c=<numberOfCandidates>]

[--candidates-per-ionization=<numberOfCandidatesPerIonization>]

[--compound-timeout=<instanceTimeout>]

[-d=<dbName>[,<dbName>...]] [-e=<detectableElements>]

[-E=<enforcedElements>] [-f=<candidateFormulas>]

[--heuristic=<mzToUseHeuristic>]

[--heuristic-only=<mzToUseHeuristicOnly>]

[-i=<ionsConsidered>] [-I=<ionsEnforced>]

[-l=<injectElGordoCompounds>] [-p=<profile>]

[--ppm-max=<ppmMax>] [--ppm-max-ms2=<ppmMaxMs2>]

[--solver=<solver>] [--tree-timeout=<treeTimeout>] []

[COMMAND]

<COMPOUND TOOL> Identify molecular formula for each compound individually using

fragmentation trees and isotope patterns.

--solver, --ilp-solver=<solver>

Set ILP solver to be used for fragmentation

computation. Valid values: 'CLP' (included),

'CPLEX', 'GUROBI'.

For GUROBI and CPLEX, environment variables need to

be configured (see Manual).

--no-recalibration Disable Recalibration of input Spectra

--elements-extended-organic

Use extended set of elements for molecular formula

generation. DO NOT USE IN COMBINATION WITH DE

NOVO FORMULA GENERATION!

Enforced elements are: CHNOPFI

Detectable elements are: SBBrCl

--bottom-up-search=<bottomUpSearchOptions>

Valid values: CUSTOM, BOTTOM_UP_ONLY, DISABLED. Use

DISABLED to deactivate bottom up search. Use

BOTTOM_UP_ONLY to replace de novo computations

with bottom up search for every compound.

Default: CUSTOM, which uses the predefined values

from the config tool.

--no-isotope-filter Disable molecular formula filter. When filtering is

enabled, molecular formulas are excluded if their

theoretical isotope pattern does not match the

measured one, even if their MS/MS pattern has

high score.

--no-isotope-score Disable isotope pattern score.

-h, --help Show this help message and exit.

-V, --version Print version information and exit.

--ppm-max=<ppmMax> Maximum allowed mass deviation in ppm for

decomposing masses.

Default: 10.0 ppm

-p, --profile=<profile> Name of the configuration profile.

Predefined profiles are: 'default', 'qtof',

'orbitrap', 'fticr'.

Default: default

-e, --elements-considered=<detectableElements>

Set the allowed elements for rare element detection.

Example: `SBrClBSe` to allow the elements S,Br,Cl,B

and Se.

Default: S,Br,Cl,B,Se

-E, --elements-enforced=<enforcedElements>

Enforce elements for molecular formula

determination.

Example: CHNOPSCl to allow the elements C, H, N, O,

P, S and Cl. Add numbers in brackets to restrict

the minimal and maximal allowed occurrence of

these elements: CHNOP[5]S[8]Cl[1-2]. When one

number is given then it is interpreted as upper

bound.

Default: C,H,N,O,P
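The bracket notation for element bounds can be illustrated with a small Python parser. This is a sketch, not the parser SIRIUS uses. The function name is hypothetical, and using 0 as the lower bound when none is given is an assumption. A single number is read as an upper bound, as stated above.

```python
import re
from typing import Dict, Optional, Tuple

# Element symbol, optionally followed by [max] or [min-max].
_ELEMENT = re.compile(r"([A-Z][a-z]?)(?:\[(\d+)(?:-(\d+))?\])?")


def parse_element_constraints(spec: str) -> Dict[str, Tuple[int, Optional[int]]]:
    """Map each element to (min, max); max is None when unrestricted."""
    bounds: Dict[str, Tuple[int, Optional[int]]] = {}
    for match in _ELEMENT.finditer(spec):
        symbol, first, second = match.groups()
        if first is None:
            bounds[symbol] = (0, None)        # no restriction given
        elif second is None:
            bounds[symbol] = (0, int(first))  # single number: upper bound
        else:
            bounds[symbol] = (int(first), int(second))
    return bounds


print(parse_element_constraints("CHNOP[5]S[8]Cl[1-2]"))
```

The same function also accepts the comma-separated form used in the defaults, such as `C,H,N,O,P`, because separators between symbols are simply skipped.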

-c, --candidates=<numberOfCandidates>

Number of precursor formula candidates in the

output - each can correspond to multiple adducts.

Default: 10

-f, --formulas=<candidateFormulas>

Specify a list of candidate formulas the method

should use. Omit this option if you want to

consider all possible molecular formulas

Default: null

-I, --adducts-enforced=<ionsEnforced>

Adducts that are always considered during the

analysis. Example: [M+H]+,[M-H]-,[M+Cl]-,[M+Na]+,

[M]+,[M-H2O+H]+.

Default: ,

--compound-timeout=<instanceTimeout>

Maximal computation time in seconds for a single

compound. 0 for an infinite amount of time.

Default: 0

--tree-timeout=<treeTimeout>

Time out in seconds per fragmentation tree

computations. 0 for an infinite amount of time.

Default: 0

-i, --adducts-considered=<ionsConsidered>

Adducts which are considered during adduct

detection. They are only used for further

analyses if there is an indication in the MS1

scan. If none of them could be detected in MS1,

all of them will be used as a fallback. Example:

[M+H]+,[M-H]-,[M+Cl]-,[M+Na]+,[M]+,[M-H2O+H]+.

Default: [M+H]+,[M+K]+,[M+Na]+,[M+H-H2O]+,[M+H-H4O2]+,

[M+NH3+H]+,[M+FA+H]+,[M+ACN+H]+,[2M+H]+,[2M+K]+,

[2M+Na]+,[M-H]-,[M+Cl]-,[M+Br]-,[M-H2O-H]-,

[M+Na-2H]-,[M+CH2O2-H]-,[M+C2H4O2-H]-,[M+H2O-H]-,

[M-H3N-H]-,[M-CO2-H]-,[M-CH2O3-H]-,[M-CH3-H]-,

[2M+H]-,[2M+Cl]-,[2M+Br]-

--candidates-per-ionization=<numberOfCandidatesPerIonization>

Minimum number of candidates in the output for each

ionization. Set to force output of results for

each possible ionization, even if not part of

highest ranked results.

Default: 1

-d, --db, --database=<dbName>[,<dbName>...]

Search formulas in the Union of the given

databases. If no database is given all possible

molecular formulas will be respected (no database

is used).

Example: possible DBs: 'ALL,,BIO,PUBCHEM,MESH,HMDB,

KNAPSACK,CHEBI,PUBMED,KEGG,HSDB,MACONDA,METACYC,

GNPS,ZINCBIO,YMDB,PLANTCYC,NORMAN,ADDITIONAL,

PUBCHEMANNOTATIONBIO,PUBCHEMANNOTATIONDRUG,

PUBCHEMANNOTATIONSAFETYANDTOXIC,

PUBCHEMANNOTATIONFOOD,KEGGMINE,ECOCYCMINE,

YMDBMINE'

Default: none

--ppm-max-ms2=<ppmMaxMs2>

Maximum allowed mass deviation in ppm for

decomposing masses in MS2. If not specified, the

same value as for the MS1 is used.

Default: 10.0 ppm

-l, --elgordo, --fix-lipids=<injectElGordoCompounds>

Fix the single molecular formula determined by El

Gordo if a lipid class is detected.

Default: True

--heuristic-only=<mzToUseHeuristicOnly>

Use only heuristic tree computation for compounds >=

the specified m/z.

Default: 650

--heuristic=<mzToUseHeuristic>

Enable heuristic preprocessing for compounds >= the

specified m/z.

Default: 300
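The two m/z thresholds above can be read as a three-way decision per compound. The following is a minimal sketch of that reading, assuming the default values 300 and 650 from the help text. The function name `tree_mode` and the returned labels are hypothetical and not part of SIRIUS.

```python
def tree_mode(precursor_mz: float,
              heuristic: float = 300.0,
              heuristic_only: float = 650.0) -> str:
    """Classify a compound by the --heuristic / --heuristic-only thresholds.

    The returned labels are illustrative names, not SIRIUS identifiers.
    """
    if precursor_mz >= heuristic_only:
        return "heuristic-only"           # >= --heuristic-only
    if precursor_mz >= heuristic:
        return "heuristic-preprocessing"  # >= --heuristic, < --heuristic-only
    return "exact"                        # below both thresholds


for mz in (250.0, 300.0, 649.9, 650.0):
    print(mz, tree_mode(mz))
```

Both comparisons are inclusive because the help text states ">= the specified m/z" for each option.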

Commands:

zodiac, rerank-formulas <DATASET TOOL> Identify Molecular formulas of

all compounds in a dataset together using

ZODIAC.

fingerprints, fingerprint <COMPOUND TOOL> Predict molecular fingerprint

from MS/MS and fragmentation trees for each

compound individually using CSI:FingerID

fingerprint prediction.

summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write Summary

files from a given project-space into the

given project-space or a custom location.
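The bracket syntax described for `--elements-enforced` (for example `CHNOP[5]S[8]Cl[1-2]`) is easy to misread, so here is a minimal sketch of a parser for it. It is not SIRIUS code: the function name `parse_element_constraints` is hypothetical, and it assumes that an element without brackets is unrestricted and that a single number implies a lower bound of 0.

```python
import re
from typing import Dict, Optional, Tuple

# One element symbol, optionally followed by [max] or [min-max].
_TOKEN = re.compile(r"([A-Z][a-z]?)(?:\[(\d+)(?:-(\d+))?\])?")

Bounds = Tuple[int, Optional[int]]  # (minimum, maximum); None = no upper bound


def parse_element_constraints(spec: str) -> Dict[str, Bounds]:
    """Parse a string such as 'CHNOP[5]S[8]Cl[1-2]' into per-element bounds."""
    bounds: Dict[str, Bounds] = {}
    pos = 0
    while pos < len(spec):
        match = _TOKEN.match(spec, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos}: {spec[pos:]!r}")
        symbol, first, second = match.groups()
        if first is None:
            bounds[symbol] = (0, None)                  # no bracket: assumed unrestricted
        elif second is None:
            bounds[symbol] = (0, int(first))            # one number: upper bound
        else:
            bounds[symbol] = (int(first), int(second))  # range: min-max
        pos = match.end()
    return bounds


print(parse_element_constraints("CHNOP[5]S[8]Cl[1-2]"))
```

The same tokenizer splits the `--elements-considered` example `SBrClBSe` into S, Br, Cl, B and Se, because a lowercase letter is always attached to the preceding uppercase letter.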

Usage: sirius zodiac [-hV] [--ignore-spectra-quality] [--burn-in=<burnInSteps>]

[--considered-candidates-at-300=<numberOfConsideredCandidatesBelow300>]

[--considered-candidates-at-800=<numberOfConsideredCandidatesAbove800>]

[--iterations=<iterationSteps>]

[--minLocalConnections=<minLocalConnections>]

[--thresholdFilter=<thresholdFilter>] [COMMAND]

<DATASET TOOL> Identify Molecular formulas of all compounds in a dataset

together using ZODIAC.

--burn-in=<burnInSteps>

Number of epochs considered as 'burn-in period'.

Default: 2000

--considered-candidates-at-300=<numberOfConsideredCandidatesBelow300>

Maximum number of candidate molecular formulas (fragmentation

trees computed by SIRIUS) per compound which are considered

by ZODIAC for compounds below 300 m/z.

Default: 10

--considered-candidates-at-800=<numberOfConsideredCandidatesAbove800>

Maximum number of candidate molecular formulas (fragmentation

trees computed by SIRIUS) per compound which are considered

by ZODIAC for compounds above 800 m/z.

Default: 50

-h, --help Show this help message and exit.

--ignore-spectra-quality

By default, ZODIAC runs a 2-step approach: first on 'good

quality compounds' only, and afterwards including the

remaining compounds.

--iterations=<iterationSteps>

Number of epochs to run the Gibbs sampling. When multiple

Markov chains are computed, all chains' iterations sum up

to this value.

Default: 20000

--minLocalConnections=<minLocalConnections>

Minimum number of compounds to which at least one candidate

per compound must be connected.

Default: 10

--thresholdFilter=<thresholdFilter>

Defines the proportion of edges of the complete network which

will be ignored.

Default: 0.95

-V, --version Print version information and exit.

Commands:

fingerprints, fingerprint <COMPOUND TOOL> Predict molecular fingerprint

from MS/MS and fragmentation trees for each

compound individually using CSI:FingerID

fingerprint prediction.

summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write Summary

files from a given project-space into the

given project-space or a custom location.
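The help text defines the candidate limit only for compounds below 300 m/z (default 10) and above 800 m/z (default 50). The sketch below shows one plausible way to fill the range in between, assuming linear interpolation. That interpolation is an assumption for illustration and is not confirmed by the help output; the function name `considered_candidates` is hypothetical.

```python
def considered_candidates(precursor_mz: float,
                          at_300: int = 10,
                          at_800: int = 50) -> int:
    """Candidate limit per compound as a function of precursor m/z."""
    if precursor_mz <= 300.0:
        return at_300  # --considered-candidates-at-300
    if precursor_mz >= 800.0:
        return at_800  # --considered-candidates-at-800
    # ASSUMPTION: linear interpolation between the two documented values.
    fraction = (precursor_mz - 300.0) / (800.0 - 300.0)
    return round(at_300 + fraction * (at_800 - at_300))


for mz in (200.0, 550.0, 900.0):
    print(mz, considered_candidates(mz))
```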

Usage: sirius [-hV] [--noCite] [--recompute] [--buffer=<initialInstanceBuffer>]

[--cores=<numOfCores>] [--log=<logLevel>] [--maxmz=<maxMz>]

[--workspace=<workspace>] [[-o=<outputProjectLocation>]

[--update-fingerprint-version]] [[-i=<inputPath>[,<inputPath>...]

[-i=<inputPath>[,<inputPath>...]]... [--ignore-formula]

[--allow-ms1-only]] [-z=<parentMz> [-1=<ms1File>[,<ms1File>...]]

[--adduct=<ionType>] -2=<ms2File>[,<ms2File>...]

[-f=<formula>]]...] [COMMAND]

-h, --help Show this help message and exit.

-V, --version Print version information and exit.

--log, --loglevel=<logLevel>

Set logging level of the Jobs SIRIUS will execute.

Valid values: SEVERE, WARNING, INFO, FINER, ALL

Default: WARNING

--cores, --threads, --processors=<numOfCores>

Number of simultaneous worker threads to be used for

compute-intensive workloads. If not specified, SIRIUS

chooses a reasonable number based on your CPU specs.

--buffer, --instance-buffer=<initialInstanceBuffer>

Number of instances that will be loaded into the

Memory. A larger buffer ensures that there are

enough instances available to use all cores

efficiently during computation. A smaller buffer

saves Memory. To load all instances immediately

set it to -1. Default (numeric value 0): 3 x

--cores. Note that for <DATASET_TOOLS> the

compound buffer may have no effect because these

tools may have to load compounds simultaneously

into the memory.

Default: 0

--workspace=<workspace>

Specify sirius workspace location. This is the

directory for storing Property files, logs,

databases and caches. This is NOT for the

project-space that stores the results! Default is

$USER_HOME/.sirius-<MINOR_VERSION>

--recompute Recompute results of ALL tools where results are

already present. By default, existing results are

preserved and the instance is skipped for the

corresponding task/tool.

--maxmz=<maxMz> Only consider compounds with a precursor m/z lower than

or equal to [--maxmz]. All other compounds in the

input will be skipped.

Default: Infinity

--noCite, --noCitations, --no-citations

Do not print citation information.

Specify OUTPUT Project-Space:

-o, -p, --output, --project=<outputProjectLocation>

Specify the project-space to write into. If no

[--input] is specified it is also used as input.

For compression use the file ending .zip or .sirius.

--update-fingerprint-version

Updates Fingerprint versions of the input project

to the one used by this SIRIUS version.

WARNING: All Fingerprint related results

(CSI:FingerID, CANOPUS) will be lost!

Specify multi-compound inputs (.ms, .mgf, .mzML/.mzXml, .sirius):

-i, --input=<inputPath>[,<inputPath>...]

Specify the input in multi-compound input formats:

Preprocessed mass spectra in .ms or .mgf file

format or LC/MS runs in .mzML/.mzXml format but

also any other file type e.g. to provide input

for STANDALONE tools.

--ignore-formula Ignore given molecular formula if present in .ms or

.mgf input files.

--allow-ms1-only Allow MS1 only data to be imported.

Specify generic inputs (CSV) on per compound level:

-1, --ms1=<ms1File>[,<ms1File>...]

MS1 spectra files

-2, --ms2=<ms2File>[,<ms2File>...]

MS2 spectra files

-z, --mz, --precursor, --parentmass=<parentMz>

The mass of the parent ion for the specified ms2

spectra

--adduct, --ionization=<ionType>

Specify the adduct for this compound

Default: [M+?]+

-f, --formula=<formula> Specify the neutralized formula of this compound.

This will be used for tree computation. If given

no mass decomposition will be performed.
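As the trailing `[COMMAND]` in each Usage line suggests, tools are chained by writing the global options first and then each tool name followed by its own options. The following is a minimal sketch that only assembles such an argument list and does not run anything. The helper `build_sirius_command` and the file names are hypothetical, and the subcommand name `formulas` is assumed to be the name of the formula identification tool whose options are listed earlier.

```python
import shlex
from typing import List, Optional


def build_sirius_command(input_path: str,
                         project: str,
                         profile: str = "default",
                         ppm_max: Optional[float] = None,
                         cores: Optional[int] = None,
                         run_zodiac: bool = False) -> List[str]:
    """Assemble a SIRIUS argument list: global options, then chained tools."""
    cmd = ["sirius"]
    if cores is not None:
        cmd.append(f"--cores={cores}")
    # Global input/output options come before any tool name.
    cmd += [f"--input={input_path}", f"--output={project}"]
    # ASSUMPTION: 'formulas' is the tool name for formula identification.
    cmd += ["formulas", f"--profile={profile}"]
    if ppm_max is not None:
        cmd.append(f"--ppm-max={ppm_max}")
    if run_zodiac:
        cmd.append("zodiac")  # listed under "Commands:" of the formula tool
    return cmd


cmd = build_sirius_command("spectra.mgf", "results.sirius",
                           profile="orbitrap", ppm_max=5.0, run_zodiac=True)
print(shlex.join(cmd))
```

Returning a list rather than a string keeps the result safe to pass to `subprocess.run` without shell quoting issues.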

Usage: sirius structures [-hV] [-d=<dbName>[,<dbName>...]]

[-e=<expansiveSearchConfMode>] [COMMAND]

<COMPOUND TOOL> Search in molecular structure database for each compound individually

using CSI:FingerID structure database search.

-d, --db, --database=<dbName>[,<dbName>...]

Search structure in the union of the given databases. If no

database is given the default database(s) are used.

Example: possible DBs: 'ALL,,BIO,PUBCHEM,MESH,HMDB,KNAPSACK,

CHEBI,PUBMED,KEGG,HSDB,MACONDA,METACYC,GNPS,ZINCBIO,YMDB,

PLANTCYC,NORMAN,ADDITIONAL,PUBCHEMANNOTATIONBIO,

PUBCHEMANNOTATIONDRUG,PUBCHEMANNOTATIONSAFETYANDTOXIC,

PUBCHEMANNOTATIONFOOD,KEGGMINE,ECOCYCMINE,YMDBMINE'

Default: BIO

-e, --exp=<expansiveSearchConfMode>

Confidence mode that is used for expansive search. OFF -> no

expansive search. EXACT -> Exact mode confidence score is

used for expansive search. APPROXIMATE -> Approximate mode

confidence score is used for expansive search

-h, --help Show this help message and exit.

-V, --version Print version information and exit.

Commands:

denovo-structures, msnovelist <COMPOUND TOOL> Predict MsNovelist compound

candidates and compare them against

molecular fingerprint using CSI:FingerID

scoring method.

summaries, write-summaries, W <STANDALONE, POSTPROCESSING> Write Summary

files from a given project-space into the

given project-space or a custom location.
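The `--database` option takes a comma-separated list, and a misspelled name is easy to miss in a long command line. Here is a minimal sketch that checks names against the list printed in the help text before building the option string. The helper `database_option` is hypothetical, and the check would reject custom database names, which this sketch does not try to handle.

```python
from typing import Iterable

# Database names copied from the help text above.
KNOWN_DATABASES = frozenset("""
ALL BIO PUBCHEM MESH HMDB KNAPSACK CHEBI PUBMED KEGG HSDB MACONDA METACYC
GNPS ZINCBIO YMDB PLANTCYC NORMAN ADDITIONAL PUBCHEMANNOTATIONBIO
PUBCHEMANNOTATIONDRUG PUBCHEMANNOTATIONSAFETYANDTOXIC PUBCHEMANNOTATIONFOOD
KEGGMINE ECOCYCMINE YMDBMINE
""".split())


def database_option(names: Iterable[str]) -> str:
    """Return a '--database=A,B,...' string after validating each name."""
    selected = list(names)
    if not selected:
        raise ValueError("at least one database name is required")
    unknown = [name for name in selected if name not in KNOWN_DATABASES]
    if unknown:
        raise ValueError(f"unknown database name(s): {', '.join(unknown)}")
    return "--database=" + ",".join(selected)


print(database_option(["BIO", "PUBCHEM"]))
```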

Friday, December 13, 2024
