Doug Neuhauser, doug@seismo.berkeley.edu
Lind Gee, lind@seismo.berkeley.edu
DRAFT - 2001/10/02
Archive Database Parameters
Information to be stored in the archive (data center) database and made
available to the public.
- Associated information for each event:
- Earthquake hypocenter(s), with associated observations, such as:
- phase arrivals.
- azimuth info.
- error information.
- Earthquake magnitudes.
- associated amplitude readings, or other associated information.
- Earthquake mechanisms.
- associated information.
- Ground motion observations (PGA, PGV, spectral info) (*).
- Shakemaps (*).
- Other earthquake products (finite fault parameters, etc) (*).
The system should be able to store for each event multiple versions of derived
products (such as hypocenters, magnitudes, mechanisms, and shakemaps), and to
provide an indication of which version is currently the "preferred" version.
- Operational parameters for the data center.
- Channel meta-data over time.
- Channel state-of-health and/or usage information (?) (*).
- System control and configuration parameters (?) (*).
- Event review status and history (*).
- Waveforms and/or pointers and descriptors for waveforms.
(*) = Functionality not currently in NCEDC schema.
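The requirement to keep multiple versions of a derived product and flag one as "preferred" can be sketched with a small relational example. This is a minimal illustration using SQLite, not the NCEDC/TriNet schema: the table names, column names, and helper functions (`event`, `origin`, `add_origin`, `set_preferred`, `preferred_origin`) are all hypothetical.

```python
import sqlite3

# Hypothetical, simplified tables for illustration only.
SCHEMA = """
CREATE TABLE event (
    evid INTEGER PRIMARY KEY
);
CREATE TABLE origin (
    orid      INTEGER PRIMARY KEY,
    evid      INTEGER NOT NULL REFERENCES event(evid),
    lat       REAL,
    lon       REAL,
    depth_km  REAL,
    source    TEXT,                       -- which system produced this version
    preferred INTEGER NOT NULL DEFAULT 0  -- 1 marks the preferred version
);
-- Enforce at most one preferred origin per event.
CREATE UNIQUE INDEX one_preferred_origin
    ON origin(evid) WHERE preferred = 1;
"""

def add_origin(conn, evid, lat, lon, depth_km, source):
    """Store a new hypocenter version; earlier versions are kept."""
    cur = conn.execute(
        "INSERT INTO origin (evid, lat, lon, depth_km, source) "
        "VALUES (?, ?, ?, ?, ?)",
        (evid, lat, lon, depth_km, source),
    )
    return cur.lastrowid

def set_preferred(conn, evid, orid):
    """Switch the preferred version inside a single transaction."""
    with conn:
        conn.execute("UPDATE origin SET preferred = 0 WHERE evid = ?", (evid,))
        conn.execute(
            "UPDATE origin SET preferred = 1 WHERE evid = ? AND orid = ?",
            (evid, orid),
        )

def preferred_origin(conn, evid):
    """Return (orid, source) of the preferred version, or None."""
    return conn.execute(
        "SELECT orid, source FROM origin WHERE evid = ? AND preferred = 1",
        (evid,),
    ).fetchone()
```

The same pattern would apply to magnitudes, mechanisms, and shakemaps. An alternative design is to keep a "preferred" pointer column on the event row for each product type rather than a flag on each version.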
NCSS Real-time and Event Collator Database Needs
The NCSS Real-Time Processing and Event Collator would use a database
for the following operations:
- Provide a reliable and consistent storage, retrieval system, and communication
area for earthquake parameters and data from real-time earthquake system(s).
- Provide a reliable and consistent storage, retrieval system, and communication area
for network and operational parameters for the real-time earthquake processing system.
- Provide a source for near-real-time data flow from the real-time system(s)
to the analysis systems and NCEDC, or any other downstream or cross-stream
processing system.
- Provide a mechanism to control and log near real time notification.
(Note: Collator database provides the mechanism to control and log
offline event analysis and review).
Real-time Database Parameters
Information to be stored in the real-time database and made available in
near real time to the Event Collator and the data center.
- Associated information for each event:
- Earthquake hypocenter(s), with associated observations, such as:
- phase arrivals.
- azimuth info.
- error information.
- Earthquake magnitudes.
- associated amplitude readings, or other associated information.
- Earthquake mechanisms.
- associated information.
- Ground motion observations (PGA, PGV, spectral info).
- Shakemaps.
- Other earthquake products (finite fault parameters, etc).
The system should be able to store for each event multiple versions of derived
products (such as hypocenters, magnitudes, mechanisms, and shakemaps), and to
provide an indication of which version is currently the "preferred" version.
- Unassociated parameters (?)
- unassociated phase arrivals (?)
- unassociated ground motion observations (?)
- Operational parameters for the real-time system.
- Channel meta-data (current and recent past).
- Channel state-of-health and/or usage information.
- System control and configuration parameters.
- Event review status and history.
- Waveforms and/or pointers and descriptors for waveforms.
- Notification rules (relatively static).
- Notification status.
- Notification history.
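The last three items (notification rules, status, and history) could interact along the following lines. This is only a sketch under assumed names: `Rule`, `NotificationLog`, `notify`, the magnitude threshold, and the event dictionary keys are all hypothetical, and a real history would also carry timestamps and be stored in the database rather than in memory.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Rule:
    """A relatively static notification rule (hypothetical fields)."""
    name: str
    min_magnitude: float
    recipient: str

@dataclass
class NotificationLog:
    """Append-only history; current status is derived from the latest entry."""
    history: list = field(default_factory=list)

    def record(self, evid, rule_name, outcome):
        self.history.append((evid, rule_name, outcome))

    def status(self, evid, rule_name):
        for e, r, outcome in reversed(self.history):
            if (e, r) == (evid, rule_name):
                return outcome
        return None  # this rule has never fired for this event

def notify(event, rules, log, send):
    """Apply each rule once per event, logging every attempt."""
    for rule in rules:
        if event["magnitude"] < rule.min_magnitude:
            continue
        if log.status(event["evid"], rule.name) == "sent":
            continue  # already delivered; do not notify twice
        ok = send(rule.recipient, event)
        log.record(event["evid"], rule.name, "sent" if ok else "failed")
```

Because a failed attempt is logged but not treated as final, calling `notify` again retries only the rules that have not yet succeeded.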
NCSS design questions
- The current proposal is to use the NCEDC/TriNet DBMS schema for the
data archive.
- Should the same schema be used for the Event Collator? Why?
- Should the same schema be used for the Real-Time Processing system? Why?
- Do the different components such as the Real-Time Processing
and the Event Collator require separate databases? Why?
- If different DBMS schemas are used in the Real-Time Processing and
Event Collator components, what type of data should be sent from the
Real-Time Processing system to the Event Collator?
- What is the granularity of information flow between these components?
- at the single observation level or database table row,
- at the earthquake component level (hypocenter and associated phases),
- at the earthquake level (all info for a single event).
- What mechanism should be used to transfer data between the RT and Collator DBMS?
- Database replication tools?
- Application program?
- Combination of the above?
- How do we ensure reliable data transfer between the RT and Collator DBMS?
- Where should rapid event review take place - within the RT or the Event Collator?
- Where should event revision take place - within the RT or the Event Collator?
- Should any DBMS in the system store information from other sites within
the NCSS or other data sources such as SCSN/TriNet, UNR, or NEIC?
- other real-time system(s) covering the same reporting area?
- other real-time system(s) covering adjoining areas?
If so, which DBMS?
If so, how should we use this information?
How do we associate this information with our own information, and
how does this affect data migration to other adjacent systems or archive systems
downstream?
- What data (if any) should flow from the archive to the Event Collator, and
from the Event Collator to the Real Time Processing component?
- How should parametric data from the Event Collator be sent to the archive in near real time?
Should it come from only the master Event Collator, or from all Event Collators?
- How should waveforms be delivered to the archive in near real time?
- Should event waveforms be available at the archive in near real time?
- Should continuous waveforms be available at the archive in near real time?
- Should waveforms be pushed to the archive, pulled by the archive, or
a combination (push waveform requests to archive, archive pulls waveforms)?
- How do we incorporate data quality control (QC) if data is sent in
near real time to the archive?
- What applications are needed to interact with:
- The Real-Time Processing component?
- The Event Collator component?
- The Data Archive?
- How does the design of the real-time system and the underlying DBMS and APIs
facilitate participation in:
- CISN
- ANSS
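As one possible answer to the questions above about an application-program transfer mechanism and reliable delivery between the RT and Collator DBMS, the sketch below moves data at the "earthquake component" granularity (a hypocenter with its phases) through an outbox table. It is an assumption-laden illustration, not a proposal for the actual schema: the tables (`outbox`, `origin`), the functions, and the `fail_before_ack` switch used to simulate a crash are all hypothetical, and two in-memory SQLite databases stand in for the two systems.

```python
import json
import sqlite3

def make_rt():
    """Stand-in for the real-time DBMS, with a durable outbox."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE outbox ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " acked INTEGER NOT NULL DEFAULT 0)"
    )
    return db

def make_collator():
    """Stand-in for the Event Collator DBMS."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE origin ("
        " evid INTEGER, orid INTEGER, body TEXT,"
        " PRIMARY KEY (evid, orid))"
    )
    return db

def enqueue(rt, evid, orid, hypocenter, phases):
    """Queue one hypocenter plus its phases as a single unit."""
    payload = json.dumps(
        {"evid": evid, "orid": orid, "hypocenter": hypocenter, "phases": phases}
    )
    with rt:
        rt.execute("INSERT INTO outbox (payload) VALUES (?)", (payload,))

def transfer(rt, collator, fail_before_ack=False):
    """Deliver unacknowledged rows in order; return how many were acked."""
    rows = rt.execute(
        "SELECT seq, payload FROM outbox WHERE acked = 0 ORDER BY seq"
    ).fetchall()
    acked = 0
    for seq, payload in rows:
        msg = json.loads(payload)
        with collator:
            # Keyed on (evid, orid), so a repeated delivery overwrites
            # the same row instead of creating a duplicate.
            collator.execute(
                "INSERT OR REPLACE INTO origin (evid, orid, body) "
                "VALUES (?, ?, ?)",
                (msg["evid"], msg["orid"], payload),
            )
        if fail_before_ack:
            break  # simulate a crash after apply but before acknowledgment
        with rt:
            rt.execute("UPDATE outbox SET acked = 1 WHERE seq = ?", (seq,))
        acked += 1
    return acked
```

The sender marks a row acknowledged only after the receiver has committed it, so a failure at any point leads to a retry rather than a loss. The idempotent write on the receiving side makes that retry harmless.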
Software design issues
There are a number of fundamental design philosophies that differ between the
existing Earthworm software used by Menlo Park and the existing REDI software
used by the BSL. Some of these arise from fundamental differences in the
original design goals of the systems, and some arise from conditions imposed
outside of the ideal design goal, such as the physical
separation of USGS/MP and UCB, and the CPU time required for certain computations
(such as moment tensors).
If we want to work towards common software for our monitoring networks, we
will need to address these issues. I bring them up now in order that we can
acknowledge these issues, and if possible, decouple them from
decisions made about database and schema issues.
My concern is that our discussions and decisions about database schemas and
APIs do not implicitly (or explicitly) dictate decisions about the use of
existing software or existing software designs. For example, a decision to use
the EW database schema and/or EW API should NOT imply that EW software (in its
current configuration) be used throughout the new real-time system. Nor
should a decision to use a different API or database schema than the EW system
imply that EW software would NOT be used in places within the system.
There are several major design areas that I think we need to discuss.
- Data transport and communication reliability between system components:
Both the EW and REDI systems use some volatile and non-reliable mechanisms
for communication between some system components, such as broadcast,
multicast, and memory-based transport rings. These mechanisms are not
reliable, in the sense that data is not reliably delivered from sender
to receiver if the receiver is not always available to receive the data
when it is sent. The initial design of these systems was strictly for
"real-time notification", but we are now looking at integrating notification
functionality with permanent and authoritative event processing and
data archiving.
- Reconfiguration capabilities:
Much of the software currently in use does not provide means for dynamic (or
runtime) reconfiguration or restart capability without loss of data. This
limits the ability of the system to incorporate new data, change data or
processing characteristics, or fix and install updated software without loss
of data and information.
- Scheduling and computer resource allocation:
Due to the different characteristics of the processing currently performed by
the EW and REDI software, the EW and REDI systems have very different
approaches to scheduling and computer resource allocation. The EW system by
and large processes data in a time-ordered (first-come,
first-served) manner. The REDI system provides an overall schedule to
determine what processing should be performed, at what time, and in what
order. As we integrate the full range of data processing at both sites and
implement new flows of data between the sites, we need to evaluate the entire
system scheduling and resource requirements, and to perform Q/A testing
to ensure that we can handle the system load.
- Joint design and development:
The development of a unified software system for earthquake monitoring in
northern California requires that both institutions participate equally in the
design and implementation of the software. This effort must be coordinated
with CISN partners and the ANSS to ensure that we can accomplish our goals of
unified state and national earthquake monitoring. This process may require
careful negotiation and compromise for all parties in order to reach
consensus.
- Control of our destiny...
What is the process by which we, the users of the DBMS system, can
influence or control the direction of the database schema and application
software design?
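To make the transport reliability concern raised in the first design area above concrete, the toy model below contrasts a volatile memory ring with a durable log that tracks each consumer's position. It does not reproduce the Earthworm or REDI implementations; the classes `VolatileRing` and `DurableLog` are hypothetical, and a Python list stands in for persistent storage.

```python
from collections import deque

class VolatileRing:
    """Fixed-size in-memory ring: new messages overwrite the oldest."""

    def __init__(self, size):
        self.buf = deque(maxlen=size)

    def put(self, msg):
        self.buf.append(msg)

    def drain(self):
        """Return whatever is still in the ring and empty it."""
        out = list(self.buf)
        self.buf.clear()
        return out

class DurableLog:
    """Append-only log; each consumer resumes from its own offset."""

    def __init__(self):
        self.entries = []   # stands in for disk or a database table
        self.offsets = {}   # consumer name -> next unread index

    def put(self, msg):
        self.entries.append(msg)

    def read(self, consumer):
        """Return all messages this consumer has not yet acknowledged."""
        return self.entries[self.offsets.get(consumer, 0):]

    def ack(self, consumer, count):
        """Advance the consumer's offset after it has processed `count` messages."""
        self.offsets[consumer] = self.offsets.get(consumer, 0) + count

# The receiver is down while five messages are produced.
ring, log = VolatileRing(size=3), DurableLog()
for n in range(5):
    ring.put(n)
    log.put(n)

ring_received = ring.drain()          # [2, 3, 4]: messages 0 and 1 were overwritten
log_received = log.read("collator")   # [0, 1, 2, 3, 4]: nothing lost
log.ack("collator", len(log_received))
```

The ring is adequate when only the most recent data matter, as in pure real-time notification. The log model is closer to what permanent, authoritative event processing and archiving would require.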