Database
If you are searching a single organism database, or using a taxonomy filter, it's a good
idea to include a contaminants database in the search.
You can never guarantee the absence of contaminant proteins, such as the enzyme used for the digest,
BSA, keratins, etc. If you don't include a contaminants database, then the result
report will show matches to similar proteins from your organism of interest, which
may be misleading.
Taxonomy
If you want to use a taxonomy filter for a category that is not on the default
list, you can easily configure this. Refer to Chapter 9 of the Setup & Installation manual.
Categories can be any organism with a TaxID and can easily be combinations of organisms, e.g.
human+mouse.
Enzyme
An enzyme of low specificity, which digests proteins to a mixture
of very short peptides, is not a good choice, because almost any given 3 or 4 residue
peptide will be found in many database entries. The longer the peptide, the
greater the specificity. In most cases, it is best to use an enzyme of specificity equal
to or greater than trypsin.
Setting the number of allowed missed cleavage sites to zero simulates a limit digest.
If you are confident that your digest is perfect, with no partial fragments present,
this will give maximum discrimination and the highest score. If experience shows
that your digest mixtures usually include some partials,
that is, peptides with missed cleavage sites, you should choose a setting of 1, or
maybe 2 missed cleavage sites. Don't specify a higher number without good reason,
because each additional level of missed cleavages increases the number
of calculated peptide masses to be matched against the experimental data.
If the actual digest does not contain extended
partials, this simply increases the number of random matches,
and so reduces discrimination.
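
The effect of the missed cleavage setting on the number of calculated peptides can be illustrated with a small in-silico digest. This is only a sketch, not Mascot's own code: the function names and the toy sequence are invented for illustration, and the cleavage rule is assumed to be the usual simplified trypsin rule (cleave after K or R, except before P).

```python
def limit_peptides(sequence):
    """Split a protein into limit peptides (assumed rule: after K/R, not before P)."""
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])  # C-terminal peptide
    return pieces


def tryptic_digest(sequence, missed_cleavages=0):
    """Return all peptides containing up to `missed_cleavages` missed sites."""
    pieces = limit_peptides(sequence)
    peptides = []
    for start in range(len(pieces)):
        for extra in range(missed_cleavages + 1):
            end = start + extra + 1
            if end > len(pieces):
                break
            peptides.append("".join(pieces[start:end]))
    return peptides


toy_protein = "AGAKLLERPVVKDDDRSSS"  # hypothetical sequence
for allowed in (0, 1, 2):
    print(allowed, len(tryptic_digest(toy_protein, allowed)))
```

Each extra level of missed cleavages adds more calculated peptides for the same protein, which is why a higher setting only helps if the partials are really present in the sample.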
The high level of specificity of an MS/MS ions search means
that it is not essential to choose an
enzyme. However,
a search in which an enzyme has been specified will be considerably
faster and more sensitive. This is because the "no enzyme" search must
test all possible subsequences of each protein, rather than just
(say) tryptic peptides. For a protein of N residues, there are
approximately N/10 tryptic peptides compared with N(N+1)/2 possible
subsequences. For a modest size protein of 250 residues, this
is an increase of three orders of magnitude in the number of peptides
which must be considered.
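
The arithmetic behind this estimate can be checked in a few lines. The sketch below simply evaluates the two approximations quoted above (N/10 tryptic peptides, N(N+1)/2 subsequences); the function names are illustrative only.

```python
def approx_tryptic_count(n_residues):
    """Rough number of tryptic peptides, using the N/10 approximation."""
    return n_residues // 10


def subsequence_count(n_residues):
    """Number of contiguous subsequences of a protein: N(N+1)/2."""
    return n_residues * (n_residues + 1) // 2


n = 250
tryptic = approx_tryptic_count(n)
unrestricted = subsequence_count(n)
print(tryptic, unrestricted, unrestricted / tryptic)
```

For N = 250 this gives 25 tryptic peptides against 31,375 subsequences, a ratio of about 1,255, which is the three orders of magnitude mentioned above.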
Sometimes, you have no choice but to search with no enzyme, for example,
when the peptides do not originate from
a formal digest, as is the case for endogenous peptides. In all cases where the
sample has been digested with a known enzyme, it is advisable to specify that
enzyme for the search, even though there may be some non-specific cleavage products.
There are two ways to pick up non-specific cleavage. If there is a very high level
of non-specific cleavage, use a semi-specific version
of the enzyme. This will match peptides that are non-specific at one terminus, which
minimises the increase in the search space and avoids too great a
loss of sensitivity. For typical levels of non-specificity (< 5%) an
error tolerant search is the preferred option.
This is a second pass search that combines a semi-specific version of the selected enzyme
with a search for unusual modifications.
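
To see why a semi-specific enzyme sits between a fully specific enzyme and no enzyme in terms of search space, the candidate peptides for a toy sequence can be counted directly. This is a simplified sketch under stated assumptions: the trypsin rule (after K or R, not before P) is the usual approximation, no limits on missed cleavages or peptide length are applied, and the function names and sequence are hypothetical.

```python
def cleavage_boundaries(sequence):
    """Positions where a peptide may start or end under the assumed trypsin rule,
    including the protein N- and C-termini."""
    boundaries = {0, len(sequence)}
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            boundaries.add(i + 1)
    return boundaries


def count_peptides(sequence, mode):
    """Count candidate peptides for mode 'specific', 'semi' or 'none'."""
    bounds = cleavage_boundaries(sequence)
    required = {"specific": 2, "semi": 1, "none": 0}[mode]
    count = 0
    for start in range(len(sequence)):
        for end in range(start + 1, len(sequence) + 1):
            # Number of termini that coincide with an enzyme cleavage site
            specific_termini = (start in bounds) + (end in bounds)
            if specific_termini >= required:
                count += 1
    return count


toy_protein = "AGAKLLERPVVKDDDRSSS"  # hypothetical sequence
for mode in ("specific", "semi", "none"):
    print(mode, count_peptides(toy_protein, mode))
```

Requiring one specific terminus removes a large share of the subsequences that a "no enzyme" search would have to test, while still admitting peptides with one non-specific end.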
Modifications
For a fast and sensitive search, use a minimum of variable modifications. An
error tolerant search is a much better way
to find rare modifications than selecting them as variable modifications.
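
The cost of variable modifications can be illustrated by counting how many modified forms of one peptide would need to be considered. This is a rough upper bound under simple assumptions (each candidate site is independently modified or not, one residue per modification), not a description of how Mascot enumerates arrangements; the modification labels and peptide are hypothetical.

```python
def count_modified_forms(peptide, variable_mods):
    """Upper bound on distinct modification arrangements for a peptide.

    variable_mods maps an illustrative modification label to the single
    residue it targets. Each site can be unmodified or carry any one of
    the modifications that target its residue.
    """
    forms = 1
    for residue in set(peptide):
        options = 1 + sum(1 for target in variable_mods.values() if target == residue)
        forms *= options ** peptide.count(residue)
    return forms


peptide = "MCSTMK"  # hypothetical peptide
print(count_modified_forms(peptide, {}))
print(count_modified_forms(peptide, {"Oxidation": "M"}))
print(count_modified_forms(peptide, {"Oxidation": "M", "Phospho-S": "S", "Phospho-T": "T"}))
```

Even for this short peptide, the count grows multiplicatively with every variable modification added, so each extra modification enlarges the search and increases the chance of random matches.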
You cannot select two fixed modifications with the same specificity. If you select
variable modifications with the same specificity as a fixed modification, this excludes
the possibility of an unmodified site. For example, if you choose Carbamidomethyl (C) as
fixed and Propionamide (C) as variable, you can get matches to either of these but never
to a peptide with free cysteine. Also, you will not get matches to peptides containing both
carbamidomethyl and propionamide.
Mass Tolerances
In an MS/MS search, the peptide
mass tolerance determines the number of candidate peptides tested for a match.
This affects the significance threshold score, but it does not affect the ions score.
The match is the same match whether there was 1 candidate or 10,000 candidates. The number of
candidates simply determines whether the match is significant or not.
Sometimes, peak detection chooses the 13C peak rather than the 12C.
In extreme cases, it may pick the 13C2 peak. If this is happening, use the
#13C setting to match these
peaks without using a very large peptide mass tolerance.
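
The idea behind allowing for 13C peaks can be sketched as a tolerance test that is repeated at offsets of one or two 13C-12C mass differences. How Mascot implements the #13C setting internally is not described here, so the function below, its name, and its parameters are assumptions for illustration only.

```python
C13_C12_DIFF = 1.003355  # Da, mass difference between 13C and 12C


def matches_precursor(experimental_mass, calculated_mass, tol_da, max_13c=0):
    """Return True if the experimental mass matches the calculated mass within
    tol_da, allowing the picked peak to be up to max_13c isotope peaks too heavy."""
    for n in range(max_13c + 1):
        corrected = experimental_mass - n * C13_C12_DIFF
        if abs(corrected - calculated_mass) <= tol_da:
            return True
    return False


calculated = 1500.0
picked_13c = 1501.0034  # peak detection chose the 13C peak (hypothetical values)
print(matches_precursor(picked_13c, calculated, tol_da=0.01, max_13c=0))
print(matches_precursor(picked_13c, calculated, tol_da=0.01, max_13c=1))
```

With the offset allowed, the peptide is matched at a tight tolerance. Without it, the tolerance would have to be opened to more than 1 Da, which admits many more candidate peptides.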
There is a common misunderstanding that ions scores should continue to
increase indefinitely as fragment mass accuracy improves. This is not the case because
we are not simply scoring the quality of the spectrum, we are scoring
whether the reported match could occur by chance.
Scores for good matches approach an asymptotic limit as the
fragment mass tolerance
approaches zero. In the limit, with perfect mass accuracy and infinite
S/N, the Mascot score for a spectrum with complete sequence coverage
should approach the score for an identity match in a Blast
search, because the MS/MS spectrum can only correspond to a single
sequence. In practice, the main benefit of high fragment accuracy is that the score
distribution for the random matches drops away. That is, discrimination
and sensitivity are greatly improved even though the score for the correct match
may have increased only slightly.
Charge
The charge control in the search form sets the default charge(s), to be used if a
spectrum in the peak list does not include charge information. In practice, peak lists
almost always include charge information, so the default is almost never used.
Instrument
The Instrument setting determines
the set of ion series that are considered when trying to find a match. Mascot looks
for evidence in the spectrum for each series. If no evidence is found, the series
is discarded. Many of the instruments are very similar. It only becomes critical
to select the correct Instrument when the data come from an ECD or ETD experiment,
which produces c and z ions, not found in most of the other Instrument types.
If you are doing alternate CID/ETD, you may need to define an Instrument that is a super-set
of both settings. Use the Configuration Editor (link from your local Mascot home page).