>RL228e1
AGAAGAAGAGGTAGTAATTAGATCTGACAATTTCACGGACAATGCTAAAACTATAATAGTACAGCTGAAAGAACCTGT
AGAAATTAATTGTACAAGACCCCACAACAATACAAGAAGAAGGATAAGTATAGGACCAGGGAGAGCATTTTATGCAAC
>RL228e2
AGAAATTAATTGTACAAGACCCCACAACAATACAAGAAGAAGGATAAGTATAGGACCAGGGAGAGCATTTTATGCAAC
AGAAGAAGAGGTAGTAATTAGATCTGACAATTTCACGGACAATGCTAAAACTATAATAGTACAGCTGAAAGAACCTGT
GAAGAAGAGGTAGTAATTAGATCTGACAATTTCACGGACAATGCTAAAACTATAATAGTACAGCTGAAAG
Mask off segments of the query sequence that have low compositional complexity, as determined by the SEG program of Wootton & Federhen (Computers and Chemistry, 1993) or, for BLASTN, by the DUST program of Tatusov and Lipman (in preparation). Filtering can eliminate statistically significant but biologically uninteresting reports from the BLAST output (e.g., hits against common acidic-, basic- or proline-rich regions), leaving the more biologically interesting regions of the query sequence available for specific matching against database sequences.
Filtering is only applied to the query sequence (or its translation products), not to database sequences. Default filtering is DUST for BLASTN, SEG for other programs.
It is not unusual for SEG to mask nothing at all when applied to sequences in SWISS-PROT, so filtering should not be expected to always have an effect. In some cases, however, sequences are masked in their entirety, indicating that the statistical significance of any matches reported against the unfiltered query sequence should be regarded as suspect.
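The idea of masking low-complexity segments can be illustrated with a small Python sketch. This is not the SEG or DUST algorithm; it is a simplified sliding-window Shannon entropy filter, and the names window_entropy, mask_low_complexity and masked_fraction, as well as the window length and threshold values, are assumptions chosen for illustration. The default threshold is only sensible for protein queries; a nucleotide query would need a lower threshold (its maximum entropy is 2 bits) and would typically use "N" as the mask character.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy, in bits per residue, of the composition of a window."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 2.2,
                        mask_char: str = "X") -> str:
    """Replace every residue covered by a low-entropy window with mask_char.

    Illustrative only: the window and threshold are assumed values, and the
    procedure is a plain entropy filter, not SEG or DUST.
    """
    seq = seq.upper()
    masked = [False] * len(seq)
    # Slide a fixed-length window along the query; a query shorter than the
    # window yields no windows and is returned unmasked.
    for start in range(len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = True
    return "".join(mask_char if m else c for c, m in zip(seq, masked))


def masked_fraction(masked_seq: str, mask_char: str = "X") -> float:
    """Fraction of the query that was masked (0.0 = nothing, 1.0 = everything)."""
    if not masked_seq:
        return 0.0
    return masked_seq.count(mask_char) / len(masked_seq)


if __name__ == "__main__":
    # Hypothetical protein query with a proline-rich stretch in the middle.
    query = "MKTAYIAKQRQISFVKSHFSRQ" + "P" * 16 + "LEEKGDAVWTHNQRSFLY"
    filtered = mask_low_complexity(query)
    print(filtered)
    print(f"masked fraction: {masked_fraction(filtered):.2f}")
```

A masked fraction of 0.0 corresponds to the common case in which filtering has no effect, while a value of 1.0 corresponds to a query masked in its entirety, the situation in which matches against the unfiltered query deserve suspicion.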