Proposed Outline of CISN Northern CA Management Center

Outlined: 2003/06/02 - DN and LG
Updated: 2003/08/08 - Additions from DO
Updated: 2003/10/09 - LG and DO
Updated: 2004/09/23 - All

Introduction:

This outline was prepared by Doug Neuhauser, based on a meeting between Dave Oppenheimer, Lind Gee, and Doug Neuhauser in June 2003 to discuss how to move forward. Revisions/additions following 9/18/2003 joint meeting. Revisions/additions following 9/23/2004 to reflect discussions over the past year.

Summary:

The CISN Northern CA Management Center will consist of two redundant real-time seismic processing systems (RTSP), one each located at UC Berkeley and USGS/Menlo Park, and will store earthquake parameters in real-time into a replicated database system. Network Services (NS) for the RTSP, such as wave servers, pick/coda servers, trigger servers, and amp servers, will be located at the data acquisition hubs of the BDSN and NCSN, and will provide data resources from the respective networks (and other acquired data) to both of the RTSPs.

An RTSP will perform the following functions:
a. Earthquake detection and location.
b. Determination of amplitude and ground motion parameters.
c. Earthquake magnitude determination.
d. Creation of ShakeMap.
e. Determination of seismic moment tensor.
f. Determination of finite fault parameters.
g. Rapid distribution of earthquake parameters to appropriate CISN agencies and CISN earthquake notification systems.
h. Real-time update of earthquake parameters in unified NCAL database in CISN schema.

Ideally, the functional core software components of the two RTSPs will be identical in implementation, with differences in communication/interconnection between modules. However, the details of the actual implementation remain to be determined and it is possible that even some core modules may differ (at least initially) between Menlo Park and Berkeley.

I. NETWORK SERVICES (NS):

The Northern CA Management Center will utilize real-time data from two primary networks: the NCSN and the BDSN. The NCSN data acquisition is centered at the USGS in Menlo Park, and the BDSN data acquisition is centered at UC Berkeley. There are currently insufficient funds to ensure reliable and robust distribution of all waveforms to both locations. Therefore, a series of base level services will be provided at each location and will serve all RTSPs for the Northern CA Management Center (Network Services Figure). The services will be implemented with redundant servers at each center to avoid single points of failure.

1. Waveserver

All waveforms (continuous and triggered) acquired by USGS/MP and UC Berkeley that may be used by the RTSP or archived by the NCEDC will be made available by one or more waveform request services. The waveform servers must provide the ability to retrieve waveforms based on requests made by channel and time interval.

This will be implemented via a "Proxy Wave Server". The Proxy Wave Server will have a front end similar to the TriNet wave servers. Its back end will speak to the various northern California wave servers, which currently include Earthworm, Nanometrics, and Simple Wave Server (SWS).

2. Pick/Coda server

Each network acquisition system will provide a source of picks and codas that will be available to the RTSPs. Picks and codas must be marked by the creator so that they are strongly associated with each other.

Pick filtering: This design appears to require pick filtering both at the pick/coda server (due to multiple channels from the same station) and at the recipient (since the recipient may also be picking some of the same channels).

Multiple pickers: For redundancy, USGS/MP and UCB will probably elect to run 2 pickers. The two pickers will feed multiple pick/coda servers.

We will start with the Earthworm picker since it provides the required functionality.

TBD:
a. Server distribution service: Earthworm import/export? FreeORB? See discussion at end of document.
?? How important is non-volatility for this service? Concern about the tradeoffs between system performance and non-volatility. ??

3. Amp Server

The Amp server will perform continuous real-time processing of waveforms that will be used for magnitude, peak ground motion, spectral amplitude measurements and amp triggers. The output of the Amp Server will be a multi-valued low frequency timeseries (5 second sample interval) that can be distributed to all RTSP systems. The multi-valued timeseries will also contain state-of-health (SOH) info, such as whether there were gaps in the input timeseries, and whether the input timeseries is outside of normal useful range (either high or low).

We will start with the TriNet rad software and the WDA and ADA Solaris shared memory implementation of the amplitude processing software. At present, we have an implementation that uses four programs: ada2ring, export, import, and ring2ada. It probably makes sense to merge the first two into "ada_export" and the last two into "ada_import"; Berkeley has done this sort of merging with several other modules. Later, we may want to go with a FreeORB implementation.

TBD: The service should allow for requests for continuous amps as well as for time-window based requests.

4. StaTrig server

The StaTrig server will compute STA/LTA ratios of waveform channel(s) at a site and generate station trigger messages for that site. This data will be provided to a StaTrig server that will make the data available to the RTSP systems.

We will initially use Earthworm module CarlStaTrig (or a variant of this code) to perform the STA/LTA computation.

TBD:
a. Server distribution service: Earthworm import/export? FreeORB? See discussion at end of document.

II. RTSP SYSTEM COMPONENTS:

At each RTSP, there will be multiple modules that examine waveforms or derived parameters to detect, locate, and parameterize an earthquake.

1. Hypocenter Location System:

The purpose of the Hypocenter Location System is to detect and locate earthquakes using a phase association system, and to rapidly determine an earthquake hypocenter and optionally an initial (coda) magnitude. Picks and codas from the pick/coda services will be fed into a pick filter, which will feed an Earthworm binder system. A process such as eqproc, which can harvest both "quicklook" and "finalized" event views from the binder output, will feed a program such as hypoinverse to produce standardized event locations. The outputs from the location program will be fed, along with the event view type ("quicklook" or "finalized"), into the event coordinator. [See Lombard's description of event coordinator.]

The Hypocenter Location System will be initially implemented with the standard Earthworm rings, pick filter, binder, and an appropriately configured event eqproc module and hypoinverse "sausage".

Note: other location programs may be added (in time) to replace or supplement hypoinverse.

2. Subnet Trigger System:

The purpose of the Subnet Trigger System is to detect earthquakes that may be missed by the Hypocenter Location System. By creating one or more subnets of stations, the coincidence of station triggers within a time window can be used to detect an event. The Subnet Trigger System receives station trigger messages from the Network Services servers, and feeds them into a subnet trigger program. The output from the subnet trigger program will be fed into the event coordinator (? and possibly the trigger coordinator?).

The initial implementation will be based on the Earthworm program CarlSubTrig.

TBD:
a. Output distribution service: Earthworm ring? FreeORB? Database? See discussion at end of document.

3. Amp Trigger System:

The purpose of the Amp Trigger System is to rapidly detect a large earthquake that may be missed by the other earthquake detection systems. The Amp Trigger System will examine the reduced amplitudes acquired from the Network Services Amp Server to find amplitude values that exceed a specified threshold. A coincidence of large amplitudes at a specified number of stations will generate an amplitude event.

The initial implementation will be based on the TriNet evtdetect and evt2ps programs. The evtdetect program reads amp data from an ADA (Amp Data Area shared memory region), and places the output event detection messages in an EDA (Event Data Area shared memory region). The TriNet program evt2ps distributes the event detection messages into a publish/subscribe system.

TBD:
a. Output distribution service: Earthworm ring? FreeORB? Database? See discussion at end of document.

4. Regional/Teleseismic Trigger System:

The purpose of the Regional/Teleseismic Trigger System is to save waveforms for teleseismic/regional events. The system could be triggered by messages from NEIC/other nets or by a (not yet designed) local system that would analyze incoming data to determine whether a teleseism was in progress. Regional/Teleseismic triggers could be fed into the event coordinator.

Will Kohler has a triggering system based on NEIC messages received through QDDS which may be used as a base-level implementation.

TBD:
a. Output distribution service: Earthworm ring? FreeORB? Database? See discussion at end of document.
b. Where should output from the regional/teleseismic trigger system go? If it goes to the event coordinator, we need to modify the event coordinator so that it can distinguish between events for which amplitudes should be generated (say) and others. Alternatively, the information could be sent to the trigger coordinator or to the request card generator. This requires more discussion.

5. Event Coordinator:

The Event Coordinator collects event information from multiple sources, associates these into a single event, writes the event information to the database, and notifies downstream modules of the event.
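As a rough illustration of the association step, the sketch below groups incoming detections into events using a simple time-window rule. The 30-second default window, the class names Detection and CoordinatedEvent, and the function associate are all hypothetical and chosen for illustration only; this is not the actual event coordinator design.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Detection:
    source: str   # hypothetical labels, e.g. "hypocenter", "subnet_trigger"
    time: float   # detection time in epoch seconds


@dataclass
class CoordinatedEvent:
    detections: List[Detection] = field(default_factory=list)

    @property
    def ref_time(self) -> float:
        # Use the earliest detection as the event reference time.
        return min(d.time for d in self.detections)


def associate(events: List[CoordinatedEvent], det: Detection,
              window: float = 30.0) -> CoordinatedEvent:
    """Attach det to the first event within `window` seconds, else start a new event.

    The window length is an assumed, configurable value.
    """
    for ev in events:
        if abs(det.time - ev.ref_time) <= window:
            ev.detections.append(det)
            return ev
    ev = CoordinatedEvent([det])
    events.append(ev)
    return ev
```

In this sketch a subnet trigger at t=100 s and a hypocenter at t=104 s would be merged into one event, while an amp trigger at t=500 s would start a second one.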
The event sources can include:
Hypocenter Location System
Subnet Trigger System
Amp Trigger System

When events from different sources are associated together (currently just by a specified time window), the coordinator will consolidate the event info from the multiple sources into a single event. When an event has been finalized by the hypocenter location module, or has reached the finalization threshold time, it is written into the database, and downstream processes are notified. The event coordinator keeps event information in its memory for a specified period of time. When an event's timeout age is exceeded, the event info is purged from the event coordinator's memory.

See Lombard writeup on event coordinator.

TBD:
a. Output distribution service: Earthworm ring? FreeORB? Database? See discussion at end of document.
b. Downstream processing: REDI scheduler or separate scheduler for each process? This needs to be discussed further. It may be simplest to begin with each module having separate scheduling (the default in some sense) and visit the question of a master scheduler a la REDI later. If each module has a separate scheduler, it will be important to ensure the schedulers have sufficient storage so that no events are lost.

6. Trigger Coordinator:

The Trigger Coordinator uses trigger information (and as a fallback, event notifications) to drive the waveform archiving system through the Request Card Generators. A subnet trigger and any events that occur within the duration of that trigger are bundled into a single "associated trigger" that is sent to the Network Trigger Request Card Generator. Events that do not fall within a subnet trigger are sent as "unassociated triggers" to a separate Request Card Generator.

The Network Trigger Request Card Generator uses the triggered stations and duration to decide the channels and durations for archiving. It writes request "cards" to the database.
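The bundling rule described above for the Trigger Coordinator can be sketched as follows. This is a minimal illustration under stated assumptions: the SubnetTrigger and Event classes, their fields, and the bundle function are hypothetical names, and the test for "within the duration of that trigger" is assumed to be a closed interval on the event origin time.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SubnetTrigger:
    trigger_id: int
    start: float      # trigger on-time, epoch seconds
    duration: float   # trigger duration, seconds


@dataclass(frozen=True)
class Event:
    event_id: int
    origin_time: float  # epoch seconds


def bundle(triggers: List[SubnetTrigger], events: List[Event]
           ) -> Tuple[List[Tuple[SubnetTrigger, List[Event]]], List[Event]]:
    """Split input into associated triggers and unassociated events."""
    associated = []
    claimed = set()
    for trig in triggers:
        end = trig.start + trig.duration
        # Events whose origin time falls inside the trigger window (assumed rule).
        inside = [ev for ev in events if trig.start <= ev.origin_time <= end]
        claimed.update(ev.event_id for ev in inside)
        associated.append((trig, inside))
    # Events not covered by any subnet trigger go to the separate generator.
    unassociated = [ev for ev in events if ev.event_id not in claimed]
    return associated, unassociated
```

Each (trigger, events) pair would go to the Network Trigger Request Card Generator, and each leftover event to the separate Request Card Generator.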
The Waveform Archiver continuously monitors the database for new request cards. It queries the wave servers to satisfy these requests. The waveforms are written to the waveform archive area (NCEDC? local data center?) and the database is updated with the stored waveform information.

The Request Card Generator receives the "unassociated trigger" messages from the trigger coordinator. RCG queries the database for the event information for this trigger and determines which stations should be archived and for how long. RCG writes these request cards to the database. The requests get serviced by the Waveform Archiver as above.

At Caltech, another Request Card Generator is driven by teleseism notifications from NEIC. Do we need this? Or can the teleseism triggers be handled by the trigger coordinator? Ideally our hypocenter location system will not be activated by teleseisms, so there would be no "event" information to coordinate with the teleseism trigger. In reality, both the hypocenter system and the subnet trigger system may detect parts of teleseism events.

TBD:
a. Output distribution service: Earthworm ring? FreeORB? Database? See discussion at end of document.
b. Other? We haven't discussed this as a group. Is this where we want to fit the teleseism triggers?

7. First Motion mechanism

The FM program will receive notification that an event has occurred and will compute a first motion solution if the event satisfies the appropriate criteria. The output first motion solutions will be written to the database. This does not require writing new origin tables. Downstream processing will be notified.

Currently, the NC uses fpfit within the Earlybird system. There are two follow-up items: 1) investigate how easy it will be to "wrap" fpfit so that it operates within this environment and 2) talk with Jeanne Hardebeck about replacing fpfit with the codes she and Peter Shearer developed. DO will talk with Fred and Lynn.

TBD:
a. Output distribution service: Earthworm ring? FreeORB? Database? See discussion at end of document.
b. Review issue of fpfit vis a vis Jeanne's codes - DO to do
c. Table revision required - ?? - depends on (b)
d. Wrapping of the FM code?

8. CISN Magnitude

The CISN Magnitude program will compute an Ml and/or an Me magnitude for the event. The program will read the hypocenter info from the database, determine the appropriate stations and time windows for the event, read data from the reduced amplitude time series ADA shared memory region, and compute the magnitude(s).

The initial code will be based on the TriNet Trimag program. Trimag currently computes an initial magnitude based on a subset of stations near the event, and then recomputes the magnitude based on a refined station list and time window based on the initial magnitude. The CISN Mag program will notify downstream processes that it has computed a magnitude, and will enter the info into the database.

TBD:
a. If we have an initial coda magnitude, we may be able to skip the initial magnitude computation. We will probably need a hierarchical approach. We could experiment with computing Ml/Me based on the quicklook hypocenter as well as on the final hypocenter and do some comparisons with the coda mags. We suspect that some geographical criteria might be required in the end: we won't need to wait for codas in the Bay Area, but might want to wait outside of the Bay Area.
b. What do we do if the event is older than the data contained in the ADA? We think it would be worthwhile to have a fallback capability to use waveforms as a configuration option. This would apply to both the magnitude and the ground motion estimation. It does not need to be a priority short term development but will be important for testing purposes.
c. Rules for determining preferred magnitude. Part of an ongoing CISN discussion.
d. Output distribution service: Earthworm ring? FreeORB? Database? See discussion at end of document.

9. Amplitude Readings

The Ampgen process will receive event notification from the magnitude process, and will generate amplitude readings for the required data channels based on event size, station distance to the event, and sensor type.

The initial code will be based on the TriNet Ampgen process. Ampgen receives event/magnitude notification, and based on the channel selection process, retrieves amplitude readings from the ADA shared memory area. It will write the amplitudes to the real-time database AND the Earthworm amplitude exchange database. If it is the master system, it will generate and distribute strong motion records to CISN partners. The program will notify downstream processing systems that it has computed amplitudes.

TBD:
a. Output distribution service: Earthworm ring? FreeORB? Database? See discussion at end of document.
b. What do we do if the event is older than the data contained in the ADA? See discussion above.

10. ShakeMap

There are several options for incorporating ShakeMap into the RTSP. The TriNet model is that ShakeMap is very loosely coupled to the RT system. It gets notified of an event (by ID); then ShakeMap queries the database to decide what to do.

We would like to pursue the approach where ShakeMap is more tightly integrated in the real-time system. We believe that it should be triggered in the same way that Ampgen or the MT codes will be.

TBD:
a. When do we want to trigger ShakeMap to run?
b. How do we notify ShakeMap feeder / startup? Earthworm ring? FreeORB? Database?
c. Need to monitor ShakeMap more than in the current system.

11. MT/Mw

The MT/Mw program will receive event notification that an ML/Me magnitude has been computed (or is not needed), and will compute an MT solution and Mw if the event satisfies the size criteria. The MT code will retrieve the required waveforms from the waveform server(s). The output Mw and MT solutions will be written to the database. A new input and output origin will be written for the MT solution. Downstream processing will be notified.

TBD:
a. One or multiple MT algorithms?
b. Rules for determining preferred magnitude.
c. Output distribution service: Earthworm ring? FreeORB? Database?
d. Table revision required for full MT solution.

12. Finite Fault:

The Finite Fault program will receive event notification from the MT/Mw process and will run the Finite Fault (FF) software if the event satisfies the size criteria. The FF code will retrieve the required waveforms from the waveform server(s). Due to the computational requirements of the FF code, it will most likely be run on a separate system. Appropriate inter-system notification is required to schedule the FF code on the remote system. The output of the FF solutions should be written to the database, but we currently have no tables for this info. Downstream processing will be notified.

TBD:
a. Output distribution service: Earthworm ring? FreeORB? Database?
b. New CISN database tables required for FF solutions.

13. Waveform archive:

The Request Card Generator (RCG) on the master RTSP system will receive event notification from the event coordinator and the trigger coordinator, and depending on whether the event is a hypocenter event or a trigger event, will invoke either the event RCG or the trigger RCG. The request card generators will generate a request card with the SNCL, time period, and eventid for each channel to be archived for this event, and will enter the request card into the database. [See Lombard writeup on RCG.]

14. GM Import

The GM Import system will import CISN strong ground motion messages from other CISN sources and will enter them as unassociated entries into the database. Currently this is an EW schema.

TBD:

15. GM Harvester

The GM Harvester process will periodically harvest strong ground motion records from the unassociated ground motion database, and associate them with known events in the RTSP system database.

TBD:
a. Can we perform this function at the NCEDC by simply putting all imported ground motion values into an unassoc_amp table in the RTSP CISN schema?

III. RTSP SYSTEM DATABASE

The RTSP database on each of the RTSP systems will initially use the CISN (aka NCEDC/TriNet) schema and will be implemented on an Oracle database server. The event-related tables will be a snapshot of tables of an NCEDC database. We will use snapshot replication to replicate the data from the RTSP snapshot to the NCEDC database.

We will have multiple NCEDC databases, with at least one located at UC Berkeley, and one at USGS/MP. We will use multi-master replication to synchronize the NCEDC databases.

To ensure that no table row from any database gets overwritten with data from another database instance or snapshot, all IDs (such as origin ids, phase ids, etc.) should be requested from the database. It will be the responsibility of the database administrators to assign ids in a way that enforces this. If an id (such as an external event id from binder) is used as a unique key within the database, we must ENSURE that the ids do not overlap between systems. However, external ids will be STRONGLY DISCOURAGED.

IV. NCEDC functions:

The NCEDC Archive is responsible for archiving and non-real-time data distribution of event information and waveforms. The event data from the RTSP will be pushed to the NCEDC Archive as soon as available. The NCEDC Archive will acquire both event and continuous waveforms from the RTSP.

The NCEDC will store:
i. Event parametric data and selected processing history.
ii. Non-real-time analyst event processing state information.
iii. Event-associated waveforms.
iv. Continuous waveforms.
v. Waveform instrument responses.

1. Waveform archiver:

The waveform archiver at the NCEDC will use the request cards in the NCEDC master database to collect waveforms from the waveservers. When a waveform has been successfully retrieved, it will be written to an event directory, and the waveform description and waveform association for the event will be entered into the database.

2. Event analysis:

The event analysis tool jiggle will be used to review the events in the NCEDC database that were generated by the master RTSP system. The event review can be performed on any of the NCEDC multi-master replicated databases, but only one database should be designated as the review database at any time. Waveforms for event review will be retrieved from the NCEDC master waveform archive at UCB.

V. ALARMING/RESPONSE/REVIEW procedures

We haven't talked about this yet. The CISN has an effort going toward alarming and notification which will address some of these issues. However, we also need to look at the response and review procedures - What is out there? What is SoCal using? What is the functionality we want to carry forward to the new system?

VI. FUTURE THOUGHTS

Some thoughts for future (longer term) development:

1. We should address how to dynamically change configuration of the hypocenter location system to allow for updates in channel usage and channel parameters without having to restart the entire pipeline.

2. Should we eventually migrate away from the EW amp exchange database to a single database with the CISN schema per RTSP system?

OUTPUT DISTRIBUTION SYSTEMS

Many of the "TBD"s are related to communication between modules of the RTSP and between modules and the CISN dbms. We are intentionally looking at alternatives to the TriNet third-party software solution.

a. Inter-module communication:

We are considering 4 options: EW, FreeORB, dbms, and file based.

i) EW
   embedded server
   + fast
   + exists
   + more than one platform
   - volatile
   - non-guaranteed
   - single machine (req. imp/exp or sendfile/getfile for broadcast)

ii) FreeORB
   client/server access to ring buffer - ring buffer is disk based & memory mapped
   + non-volatile
   + network based
   - solaris/linux based (windows would require development)
   - smart reconnect/bookmarking not implemented yet
   - potential performance issues w/ high volume/long messages
   - single point of failure?

iii) DBMS "Advanced Queuing"
   + non-volatile
   + network based
   + publish/subscribe
   + event notification
   + robust
   + flexible
   - unknown
   - commercial
   - single point of failure
   ? performance

iv) flat files
   + non-volatile
   + ease of use
   - intermachine transfer kludgey
   - highly serial

This choice of approach is one that we wish to discuss with SoCal.
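To make the trade-offs of option (iv) concrete, here is a minimal sketch of flat-file message passing between two modules on one machine. It is only an illustration under assumptions: the spool directory layout, the sequence-numbered file names, the JSON message format, and the publish and consume functions are all hypothetical and do not correspond to sendfile/getfile or any existing module. The write-then-rename step is what provides non-volatility without exposing partial files; the sorted, one-at-a-time read loop shows why the approach is highly serial.

```python
import json
import os
import tempfile


def publish(spool_dir: str, seq: int, message: dict) -> str:
    """Write one message as a file, using an atomic rename so that
    a reader never sees a partially written file."""
    os.makedirs(spool_dir, exist_ok=True)
    final_path = os.path.join(spool_dir, f"{seq:010d}.json")
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        json.dump(message, fh)
        fh.flush()
        os.fsync(fh.fileno())          # force the data to disk before renaming
    os.replace(tmp_path, final_path)   # atomic within a single filesystem
    return final_path


def consume(spool_dir: str) -> list:
    """Read and remove all complete messages in sequence order."""
    messages = []
    for name in sorted(os.listdir(spool_dir)):
        if not name.endswith(".json"):
            continue                   # skip in-progress .tmp files
        path = os.path.join(spool_dir, name)
        with open(path) as fh:
            messages.append(json.load(fh))
        os.remove(path)
    return messages
```

Moving such files between machines would still need a separate transfer step, which is the "kludgey" part noted above.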