scautoloc

scautoloc is the SeisComP3 program responsible for automatically locating seismic events in near-real time. It normally runs as a daemon, continuously reading automatic picks and several associated amplitudes and processing them in real time. An offline mode is available as well. On the basis of the picks and amplitudes, scautoloc tries to identify combinations of picks that correspond to a common seismic event. If the resulting location meets certain consistency criteria, it is reported, i.e. passed on to other programs that take origins as input.

Location procedure

The procedure by which scautoloc identifies and locates seismic events consists of the following steps:

Pick preparation
In scautoloc each incoming pick needs to be accompanied by a specific set of amplitudes. Since in the SeisComP3 data model amplitudes and picks are independent objects, the amplitudes are added as attributes to their corresponding picks upon reception by scautoloc.

Pick filtering
Each incoming pick is filtered, i.e. it is checked whether the pick is outdated and whether the complete set of associated amplitudes is already present. If a station produces picks extremely often, these picks are considered more likely to be glitches and result in an increased SNR threshold.
Association
scautoloc first attempts to associate an incoming pick with the known origins. Especially for large events with stable locations based on many already associated picks, this is the preferred way to handle the pick. If the association succeeds, the nucleation process can be bypassed. Under certain circumstances picks are both associated and fed into the nucleator.
Nucleation
If direct association fails, scautoloc tries to make a new origin out of this and other unassociated, previously received picks. This process is called “nucleation”. scautoloc performs a grid search over space and time, which is a rather expensive procedure as it requires a lot of resources in terms of both CPU and RAM. Additional nucleation algorithms will become available in the future.
The grid is a discrete set of, in principle, arbitrary points that sample the area of interest sufficiently densely. In the grid search, each of the grid points is taken as a hypothetical hypocenter for all incoming picks. Each incoming pick is back-projected in time for each of the grid points, on the assumption that it is a first-arrival “P” onset. If the pick indeed corresponds to a “P” arrival of a seismic event, and if this event was recorded at a sufficient number of stations, the back-projected new pick will cluster with previous picks from the same event. The cluster will be densest around the origin time at the grid point closest to the hypocenter.
In principle, the grid could be so dense that the location obtained from the grid search could be used directly. However, as both RAM and CPU speed are limited, this is not possible. Therefore, if a cluster is identified as a potential origin, it does not necessarily mean that all contributing picks actually correspond to “P” arrivals. It may as well be a coincidental match caused by the coarseness of the grid or by contamination with picked noise. For this reason, a location program (LocSAT) is run to attempt a location and to test whether the set of picks indeed forms a consistent hypocenter. If the pick residual RMS is too large, an improvement is attempted by excluding each of the contributing picks once to test whether a reduction in RMS can be achieved. If the new origin meets all requirements, it is accepted as a new seismic event location.
The grid points are specified in a text file “grid.txt”. The default file shipped with scautoloc defines a grid with globally evenly distributed points at the surface, and depth points confined to regions of known deep seismicity. It may be modified, but should not comprise too many grid points (i.e. more than 3000, depending on CPU speed and RAM). See below for more details about the grid file.
Origin refinement
An origin produced or updated through association and/or nucleation may still be contaminated by phases wrongly interpreted as “P” arrivals. scautoloc tries to improve these origins based on e.g. pick SNR and amplitude. In this processing step, it is also attempted to associate phases which slipped through during the first association attempt, e.g. because the initial location was incorrect. If the origin contains a sufficient number of arrivals to assume a reasonably good location result, scautoloc additionally tries to associate picks as secondary phases such as “pP”. Such secondary phases are only “weakly associated”, i.e. these phases are not used for the location. For the analyst, however, it is useful to have possible “pP” phases predefined.
Origin filtering
This process involves final consistency checks of new/updated origins etc. During this procedure, the origins are not modified any more.
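
The back-projection idea behind the nucleation step can be illustrated with a minimal Python sketch. It is not the scautoloc implementation: the constant apparent P velocity, the surface-only grid, the clustering window and all function names are assumptions made purely for illustration, whereas a real locator would use proper travel-time tables and the per-grid-point settings from the grid file.

```python
import math

# Assumption for illustration only: a constant apparent P velocity in
# degrees per second. A real locator uses travel-time tables instead.
P_VELOCITY_DEG_PER_S = 0.09


def distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    c = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def travel_time(delta_deg):
    """Hypothetical P travel time in seconds for an epicentral distance."""
    return delta_deg / P_VELOCITY_DEG_PER_S


def nucleate(picks, grid, window=5.0, min_picks=4):
    """Find the grid point at which back-projected picks cluster best.

    picks: list of (station_lat, station_lon, pick_time_in_seconds)
    grid:  list of (lat, lon) candidate epicenters
    Returns (grid_point, origin_time, pick_count) or None.
    """
    best = None  # (count, -spread, grid_point, origin_time)
    for point in grid:
        glat, glon = point
        # Back-project every pick to a candidate origin time at this point.
        t0s = sorted(
            t - travel_time(distance_deg(glat, glon, slat, slon))
            for slat, slon, t in picks
        )
        # Slide a time window over the sorted candidate origin times.
        start = 0
        for end in range(len(t0s)):
            while t0s[end] - t0s[start] > window:
                start += 1
            count = end - start + 1
            if count < min_picks:
                continue
            spread = t0s[end] - t0s[start]
            origin_time = sum(t0s[start:end + 1]) / count
            candidate = (count, -spread, point, origin_time)
            # Prefer more picks, then the tighter (denser) cluster.
            if best is None or candidate[:2] > best[:2]:
                best = candidate
    if best is None:
        return None
    count, _, point, origin_time = best
    return point, origin_time, count
```

A cluster found this way is only a candidate. As described above, it still has to be confirmed by an actual location run before it is accepted as an origin.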

In the course of nucleation and association, as well as in the origin refinement and filtering, certain heuristic criteria are applied to compare the “qualities” of competing origins. These criteria are combined in an internal origin score, which is based on properties of the picks themselves in the context of the respective origin (residuals, RMS, azimuthal gaps). In addition, the amplitudes provide a valuable means of comparing origin qualities. A pick with a high SNR is less likely to be a transient burst of noise than a pick that merely exceeds the SNR threshold. A high-SNR pick thus increases the origin score. Similarly, a pick associated with a large absolute amplitude is more likely to correspond to a real seismic onset, especially in the case of simultaneous, large-amplitude observations at neighboring stations. A special case arises when several nearby stations report amplitudes above a certain “XXL threshold”. For details see the section “Preliminary origins”. The amplitudes used by scautoloc are of type “snr” and “mb”, corresponding to the (relative, unit-less) SNR amplitude and the (absolute) “mb” amplitude, respectively. These two amplitudes are provided by scautopick. In a setup in which scautopick is replaced by a different automatic picker, these two amplitudes must nevertheless be provided to scautoloc. Otherwise, the picks are not used. At the moment this is a strict requirement; it may be changed in the future.

The grid file
The grid configuration file consists of one line per grid point, each grid point specified by 6 columns, e.g.:

-10.00 105.00 20.0 5.0 180.0 8
The columns are grid point coordinates (latitude, longitude, depth), diameter, maximum station distance and minimum pick count, respectively. The above line sets a grid point centered at 10° S / 105° E at a depth of 20 km. It is sensitive to events within 5° of the center. Stations at a distance of up to 180° may be used to nucleate an event. At least 8 picks have to contribute to an origin at this location. The diameter should be chosen large enough to allow grid cells to overlap, but not too large. The size also determines the time windows for grouping the picks in the grid search. If the time windows are too long, the risk of contamination with wrong picks increases. The maximum station distance makes it possible to restrict the stations used for the respective grid point. E.g. stations from Australia are normally not required to create an event in Europe. If in doubt, set the value to 180. The minimum pick count specifies how many picks are required for a given grid point to allow the creation of a new origin. The default grid file contains a global grid with even spacing of ~5° with additional points at greater depths where deep-focus events are known to occur.
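
For checking a modified grid file before use, the six columns can be read with a few lines of Python. This is a minimal sketch: the names GridPoint and parse_grid are hypothetical, and it assumes that the file contains only whitespace-separated data lines and blank lines, as in the example above.

```python
from typing import List, NamedTuple


class GridPoint(NamedTuple):
    lat: float                       # latitude in degrees
    lon: float                       # longitude in degrees
    depth_km: float                  # depth in km
    diameter: float                  # diameter, 4th column
    max_station_distance: float      # maximum station distance in degrees
    min_pick_count: int              # minimum pick count


def parse_grid(text: str) -> List[GridPoint]:
    """Parse grid file content, one grid point per non-empty line."""
    points = []
    for number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(f"line {number}: expected 6 columns, got {len(fields)}")
        lat, lon, depth, diameter, max_dist = (float(f) for f in fields[:5])
        points.append(GridPoint(lat, lon, depth, diameter, max_dist, int(fields[5])))
    return points
```

Counting the returned points is a quick way to verify that a modified grid does not become too large.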
The station configuration file
The station configuration file contains lines consisting of network code, station code, usage flag (0 or 1) and maximum nucleation distance. A usage flag of 1 indicates the station shall be used by scautoloc. If it shall not be used, 0 must be specified here. The maximum nucleation distance is the distance (in degrees) from the station up to which this station may contribute to a new origin. If this distance is 180°, this station may contribute to new origins world-wide. However, if the distance is only 10°, the range of this station is limited. This is a helpful setting in case of mediocre stations in a region where there are numerous good and reliable stations nearby. The station will then not pose a risk for locations generated outside the maximum nucleation distance. Network and station code may be wildcards (*) for convenience. E.g.:
* * 1 90 
GE * 1 180
GE HLG 1 10 
TE RGN 0 10
The example above means that all stations from all networks by default can create new events within 90°. The GE stations can create events at any distance, except for the rather noisy station HLG in the network GE, which is restricted to 10°. By setting the 3rd column to 0, TE RGN is ignored by scautoloc.
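
The way wildcard lines and specific lines combine can be sketched in Python. The sketch assumes that the last matching line wins, which reproduces the interpretation of the example above but is not confirmed as the rule scautoloc applies internally. The function names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_station_config(text):
    """Parse lines of: network code, station code, usage flag, max. nucleation distance."""
    rules = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        net, sta, usage, max_distance = fields
        rules.append((net, sta, usage == "1", float(max_distance)))
    return rules


def lookup(rules, net, sta):
    """Return (used, max_nucleation_distance) for a station, or None.

    Assumption: later lines override earlier ones, so specific entries
    should follow the wildcard entries, as in the example file.
    """
    result = None
    for net_pattern, sta_pattern, used, max_distance in rules:
        if fnmatchcase(net, net_pattern) and fnmatchcase(sta, sta_pattern):
            result = (used, max_distance)
    return result


EXAMPLE = """\
* * 1 90
GE * 1 180
GE HLG 1 10
TE RGN 0 10
"""

rules = parse_station_config(EXAMPLE)
for code in (("XX", "ABC"), ("GE", "APE"), ("GE", "HLG"), ("TE", "RGN")):
    print(code, lookup(rules, *code))
```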
Preliminary origins
Usually, scautoloc will not report origins with fewer than a certain number of defining phases (specified by autoloc.minPhaseCount), typically 6-8 phases. In the case of events that result in very large amplitudes at a sufficient number of stations (hereafter called “XXL events”), it is possible to produce preliminary origins based on fewer picks.
The prerequisite is that all these picks have extraordinarily large amplitudes and SNR and lie within a relatively small region. Such picks are hereafter called “XXL picks”. A pick is internally tagged as “XXL pick” if its amplitude exceeds a certain threshold (specified by autoloc.thresholdXXL) and it has an SNR > 8. For larger SNR, picks with a smaller amplitude can also receive the XXL tag, because it is justified to treat a large-SNR pick as an XXL pick even if its amplitude is somewhat below the XXL amplitude threshold. The XXL criterion should be regarded as a workaround to identify picks which justify the nucleation of preliminary origins.
Logging
scautoloc produces two kinds of log files: a normal application log file containing the processing and location history and an optional pick log. The pick log contains all received picks with associated amplitudes in a simple text file, one entry per line. This pick log should always be active as it allows pick playback for troubleshooting and optimization of scautoloc. If something did not work as expected, playing back the pick log provides a useful way to find the source of the problem without the need to process the raw waveforms again. The application log file contains miscellaneous information in variable format. The format of the entries may change at any time, so no downstream application should ever depend on it. There are some special lines, however. These contain certain keywords that allow convenient filtering of the most important information using grep. These keywords are NEW, UPD and OUT, for a new, updated and output origin, respectively. They can be used e.g. as follows:
grep '\(NEW\|UPD\|OUT\)' ~/.seiscomp3/log/scautoloc.log
This will extract all lines containing the above keywords, providing a very simple (and primitive) origin history.
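
Where grep is not available, the same keyword filter can be written in Python. This is a minimal sketch: the function name is hypothetical, and it deliberately does not parse the rest of each line because the log format may change. Unlike a plain substring search, it matches the keywords only as whole words.

```python
import re

# The three keywords named above, matched as whole words only.
KEYWORDS = re.compile(r"\b(NEW|UPD|OUT)\b")


def origin_history(lines):
    """Return the log lines that contain NEW, UPD or OUT as a whole word."""
    return [line.rstrip("\n") for line in lines if KEYWORDS.search(line)]
```

The function accepts any iterable of lines, e.g. an open file object for the application log.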
Publication interval
In principle, scautoloc produces a new solution (origin) after each processed pick. This is desirable at an early stage of an event, when every additional piece of information may lead to significant improvements. A consolidated solution consisting of many (dozens of) picks, on the other hand, may not always benefit greatly from additional picks, which usually originate from large distances. Updates after each pick are therefore unnecessary. It is possible to control the time interval between subsequent origins reported by scautoloc. The time interval ∆t is a linear function of the number of picks N:

∆t = aN + b

With a = b = 0, ∆t is always zero, meaning there is never a delay in sending new solutions. This is not desirable. With a = 0.5 and b = 0, each pick increases the time interval until the next solution is sent by 0.5 s. This means that scautoloc will wait 10 seconds after an origin with 20 picks is sent.
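
The effect of the two parameters can be tried out with a small Python sketch. The function names are hypothetical. The way the interval is combined with the pick-count override (cf. autoloc.publicationIntervalPickCount) is an assumption based on the parameter descriptions, not the actual scautoloc logic.

```python
def publication_interval(n_picks, slope, intercept=0.0):
    """Time in seconds to wait after sending an origin with n_picks picks."""
    return slope * n_picks + intercept


def should_publish(seconds_since_last, n_picks_last_sent, n_new_picks,
                   slope, intercept=0.0, pick_count=20):
    """Decide whether an updated origin may be sent now.

    Assumption: an update is sent if more than pick_count new picks were
    added, or if the interval computed from the last sent origin has passed.
    """
    if n_new_picks > pick_count:
        return True
    return seconds_since_last >= publication_interval(
        n_picks_last_sent, slope, intercept)


# Interval after an origin with 20 picks, for a = 0.5 and b = 0
print(publication_interval(20, 0.5))
```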
Housekeeping
scautoloc keeps objects in memory only for a certain amount of time. This time span is specified in seconds in autoloc.maxAge. The default value is 21600 seconds (6 hours). After this time, unassociated picks expire. Newly arriving picks older than that (e.g. in the case of high data latencies) are ignored. Origins, including the picks associated with them, live slightly longer. In a setup where many stations have considerable latencies, e.g. dialup stations, the expiration time should be chosen long enough to accommodate late picks. On the other hand, the memory usage for large networks may be a concern as well. scautoloc periodically removes expired objects from memory. The time interval between subsequent housekeeping runs is specified in seconds in autoloc.cleanupInterval.
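
The expiry rule for unassociated picks can be sketched as follows. The function names are hypothetical and times are plain seconds. The treatment of a maximum age of 0 as "cleanup disabled" follows the description of the --max-age option. Whether a pick of exactly the maximum age is kept is an assumption of this sketch.

```python
DEFAULT_MAX_AGE = 21600.0  # seconds (6 hours), the documented default


def is_expired(pick_time, now, max_age=DEFAULT_MAX_AGE):
    """True if a pick is older than max_age seconds; 0 disables expiry."""
    if max_age == 0:
        return False
    return now - pick_time > max_age


def cleanup(pick_times, now, max_age=DEFAULT_MAX_AGE):
    """Return only the pick times that have not yet expired."""
    return [t for t in pick_times if not is_expired(t, now, max_age)]
```

The same check applies to a newly arriving pick: if it is already older than the maximum age, it is ignored.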
Test mode
In the test mode, scautoloc connects to a messaging server as usual and receives picks and amplitudes from there, but no results are sent back to the server. Log files are written as usual. This mode can be used to test new parameter settings before implementation in the real-time system. It also provides a simple way to log picks from a real-time system to file.
Offline mode
scautoloc normally runs as a daemon in the background, continuously reading picks and amplitudes and processing them in real time. However, scautoloc may also be operated in offline mode. This is useful for debugging. Offline mode is activated by setting autoloc.offline to true or by adding the parameter --offline to the command line. When operated in offline mode, scautoloc will connect neither to the messaging system nor to the database. Instead, it reads picks in the pick file format from standard input. Example entries in a pick file:
2008-09-25 00:20:16.6 SK LIKS EH __ 4.6 196.953 1.1 A 20080925.002016.68-SK.LIKS..EH
2008-09-25 00:20:33.5 SJ BEO BH __ 3.0 479.042 0.9 A 20080925.002033.52-SJ.BEO..BH
2008-09-25 00:21:00.1 CX MNMCX BH __ 21.0 407.358 0.7 A 20080925.002100.15-CX.MNMCX..BH
2008-09-25 00:21:02.7 CX HMBCX BH __ 14.7 495.533 0.5 A 20080925.002102.80-CX.HMBCX..BH
2008-09-24 20:53:59.9 IA KLI BH __ 3.2 143.752 0.6 A 20080924.205359.99-IA.KLI..BH
2008-09-25 00:21:04.5 CX PSGCX BH __ 7.1 258.407 0.6 A 20080925.002104.60-CX.PSGCX..BH
2008-09-25 00:21:09.5 CX PB01 BH __ 10.1 139.058 0.6 A 20080925.002109.52-CX.PB01..BH
2008-09-25 00:21:24.0 NU ACON SH __ 4.9 152.910 0.6 A 20080925.002124.06-NU.ACON..SH
2008-09-25 00:22:09.0 CX PB04 BH __ 9.0 305.960 0.6 A 20080925.002209.07-CX.PB04..BH
2008-09-25 00:19:13.1 GE BKNI BH __ 3.3 100.523 0.5 A 20080925.001913.17-GE.BKNI..BH
2008-09-25 00:23:47.6 RO IAS BH __ 3.1 206.656 0.3 A 20080925.002347.67-RO.IAS..BH
2008-09-25 00:09:12.8 GE JAGI BH __ 31.9 1015.304 0.8 A 20080925.000912.82-GE.JAGI..BH
2008-09-25 00:25:10.7 SJ BEO BH __ 3.4 546.364 1.1 A 20080925.002510.77-SJ.BEO..BH
Note that in the above example some of the picks are not in chronological order because of data latencies. Since scautoloc does not connect to the database in offline mode, the station coordinates cannot be read from the database and have to be supplied via a file. The station coordinates file has a simple format with one line per entry, consisting of 5 columns: network code, station code, latitude, longitude, elevation (in meters), e.g.
GE APE 37.0689 25.5306 620.0 
GE BANI -4.5330 129.9000 0.0 
GE BKB -1.2558 116.9155 0.0 
GE BKNI 0.3500 101.0333 0.0 
GE BOAB 12.4493 -85.6659 381.0 
GE CART 37.5868 -1.0012 65.0 
GE CEU 35.8987 -5.3731 320.0 
GE CISI -7.5557 107.8153 0.0
The location of this file is specified in autoloc.stationLocations or on the command line using --station-locations.
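
Before starting an offline run, it can be useful to check both input files with a short script. The following Python sketch reads the station coordinates file and extracts time, network code and station code from pick file lines. The function names are hypothetical. The remaining pick file columns are deliberately not interpreted, since their meaning is not specified here.

```python
from datetime import datetime


def parse_station_locations(text):
    """Map (network, station) to (latitude, longitude, elevation in meters)."""
    stations = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        net, sta, lat, lon, elevation = fields
        stations[(net, sta)] = (float(lat), float(lon), float(elevation))
    return stations


def parse_pick_header(line):
    """Return (pick time, network code, station code) from a pick file line."""
    date, time, net, sta = line.split()[:4]
    when = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S.%f")
    return when, net, sta


def picks_without_coordinates(pick_lines, stations):
    """List the (network, station) codes of picks that lack station coordinates."""
    missing = []
    for line in pick_lines:
        if not line.strip():
            continue
        _, net, sta = parse_pick_header(line)
        if (net, sta) not in stations and (net, sta) not in missing:
            missing.append((net, sta))
    return missing
```

Sorting the pick lines by the parsed time also shows which picks arrived out of order.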

Options

scautoloc supports command line options and configuration by file (scautoloc.cfg).

Commandline

--test <flag>
scautoloc will run in test mode. (default: false)
--offline <flag>
scautoloc will run in offline mode. (default: false)
--station-locations <file>
The station location file used in offline mode. (Default: trunk/share/autoloc/station-locations.conf)
--station-config <file>
This is the station configuration file. (Default: trunk/share/autoloc/station.conf)
--pick-log <file>
Write all picks to a plain text file immediately after reception. This feature is for debugging and error analysis and is not activated by default. This file can be played back in offline mode. Recommended location: $HOME/.seiscomp3/log/autoloc-picks.log. It is highly advisable to always log picks to a file.
--grid <file>
This is the grid file used for the nucleation process. (Default: trunk/share/autoloc/grid.conf)
--default-depth <double> [10]
This value (in km) is used as source depth if scautoloc cannot resolve the source depth, e.g. due to unfavorable station distribution.
--max-sgap <double> [360]
Maximum secondary azimuth gap in degrees for an origin to be reported.
--max-rms <double> [3.5]
Maximum allowed RMS in seconds for a location to be reported. scautoloc actively tries to keep the RMS below this value by excluding large-residual picks from the location.
--max-residual <double> [7]
Maximum individual residual (unweighted) in seconds for a pick to be used in the location. Note that in order to keep RMS below autoloc.maxRMS, scautoloc may exclude picks with residual less than this.
--max-station-distance <double> [180]
Maximum distance of stations to be used.
--max-nucleation-distance-default <double>
Default maximum distance of stations to be used for nucleating new origins.
--min-pick-affinity <double> [0.05]
To be done
--min-phase-count <integer> [6]
Minimum phase count for an origin to be reported (currently equivalent to station count). The absolute, currently hard-coded minimum is 5, except for “XXL events” where it is 4. Usually, 6 or 7 is a recommended minimum, which may still result in a considerable number of fake events.
--min-score <integer> [8]
Minimum score for an origin to be reported. Note that the score is not (yet) a well-defined property. It is roughly proportional to the number of P/PKP picks, though large-amplitude and large-SNR picks are counted higher, whereas low-SNR and high-residual picks are counted less. For most setups the default value will be appropriate.
--min-pick-snr <double>
Ignore picks with SNR amplitude less than this (unit-less) threshold. The ignored picks are nevertheless logged to the pick log if active.
--threshold-xxl <double>
Arrivals with exceptionally large amplitudes may be tagged as XXL. The threshold must be specified in the same units as the mb amplitude, i.e. usually nm/s.
--min-phase-count-xxl <integer>
Minimum number of picks for an XXL origin to be reported.
--max-distance-xxl <double>
In order to quickly declare XXL origins, only stations within this distance from each other are considered. If the location using the XXL picks succeeds, the origin is only reported if all stations are within this epicentral distance.
--min-sta-count-ignore-pkp <integer> [15]
Ignore PKP phases in the location, if more than the given number of direct “P” phases are observed.
--min-score-bypass-nucleator <integer>
Minimum score at which the nucleator is bypassed.
--keep-events-timespan <integer> [86400]
The timespan to keep historical events. (default: 86400s)
--cleanup-interval <integer>
Interval in seconds after which a cleanup is triggered. Best is not to modify the default.
--max-age <double>
During cleanup all objects older than maxAge (in seconds) are removed. Setting maxAge to 0 disables the cleanup.
--wakeup-interval <integer>
Interval in seconds after which the buffer of outgoing origins is flushed in case there are any pending origins. This ensures that the last origin of an event is sent without too large a delay.
--dynamic-pick-threshold-interval <integer>
The interval in seconds in which to check for extraordinarily high pick activity, resulting in a dynamically increased pick threshold.

Configuration

autoloc.adoptManualDepth <boolean> [false]
If set to true, autoloc adopts a depth from a manual origin. If false, autoloc may set a default depth (cf. autoloc.defaultDepth).
autoloc.amplTypeAbs <string> No default value?
If this string is non-empty, an amplitude obtained from an amplitude object is used by ... . If this string is "mb", a period obtained from the amplitude object is also used; if it has some other value, then 1 [units?] is used. If this string is empty, then the amplitude is set to 0.5 * thresholdXXL, and 1 [units?] is used for the period.
autoloc.amplTypeSNR <string> No default value?
If this string is non-empty (e.g. "snr"), it is used to obtain a pick SNR from an amplitude object. If it is empty, the pick SNR is 10.
autoloc.publicationIntervalTimeSlope
This is the parameter ‘a’ in the equation ∆t = aN + b for the time interval between origin updates.
autoloc.publicationIntervalTimeIntercept
This is the parameter ‘b’ in the above mentioned equation for the update interval ∆t.
autoloc.publicationIntervalPickCount [20]
If more than this number of new picks are added to the origin, publish the new origin no matter when the last publishing happened.
autoloc.grid
See commandline options
autoloc.stationConfig
See commandline options
autoloc.defaultDepth
See commandline options
autoloc.maxDepth [700]
No origin deeper than this value (in km) is reported. scautoloc tries to find locations shallower than this.
autoloc.thresholdXXL
See commandline options
autoloc.minPhaseCount
See commandline options
autoloc.minScore
See commandline options
autoloc.minStaCountIgnorePKP
See commandline options
autoloc.maxRMS
See commandline options
autoloc.maxResidual
See commandline options
autoloc.maxSGAP
See commandline options
autoloc.pickLog
See commandline options
autoloc.minPickSNR
See commandline options
autoloc.offline
See commandline options
autoloc.test
See commandline options
autoloc.stationLocations
See commandline options
autoloc.wakeupInterval
See commandline options
autoloc.cleanupInterval
See commandline options
autoloc.maxStationDistanceXXL
See commandline options
autoloc.maxStationDistance
Maximum station distance for a pick to be associated with an origin.
autoloc.dynamicPickThresholdInterval
See commandline options
autoloc.useManualOrigins <boolean> [false]
TODO