HP Integrity rx8620 Base System A7026A

Product code

A7026A

Using RDBMSs to retain and analyze event data

Through more than twenty-five years of Relational Database Management System (RDBMS) technology
development, RDBMSs have become the preferred and ‘safe’ choice for almost all data management
problems. Through IT’s broad-based adoption of RDBMS technology, data management problem solving
has become primarily relational-model driven. In other words, no matter what form the data
originally takes, the first step in creating a data management solution is to evaluate the data
using the relational model. This process has been successful in the vast majority of cases, so the
relational model’s acceptance has become self-perpetuating.

As Security Information Management (SIM) vendors discovered the need to manage event data beyond
realtime requirements, they followed the well-established practice of fitting event data into the
relational model and using RDBMS technology to store and analyze it. As a result, SIM vendors could
focus on other areas of concern, such as realtime event correlation, mitigation, and user-friendly
presentation.

Initially, using RDBMSs to manage event data met enterprise requirements. But as the demand for
managing greater volumes of event data emerged, the limitations of RDBMSs became apparent.

To understand why RDBMS technology now presents obstacles to event data management, one needs to
understand the fundamental requirements that drive the technology.

Transactional data
RDBMSs are designed to support the commit/rollback protocol, which dictates that only complete
transactions (data changes) can be permanently stored and made visible. Well-designed applications
have very small transactions that take microseconds to complete. Any data stored in an RDBMS can be
changed within a transaction. To support this, RDBMSs have elaborate logging subsystems that log
every change so that the original data can be restored if a transaction is rolled back.
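
The commit/rollback behavior described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the `accounts` table, the `transfer` helper, and the sample balances are hypothetical and do not come from any SIM product, and SQLite stands in for a full RDBMS.

```python
import sqlite3

# Hypothetical two-row table used only to demonstrate commit and rollback.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates persist or neither does."""
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE id = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.commit()      # the complete transaction becomes permanent
        return True
    except ValueError:
        conn.rollback()    # the partial change is undone
        return False


ok = transfer(conn, 1, 2, 60)       # completes and is committed
failed = transfer(conn, 1, 2, 60)   # would overdraw account 1, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After the second call, the debit that was already applied to account 1 is no longer present, which is the "only complete transactions are stored" rule in miniature.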

Isolated concurrent access
RDBMSs present a virtual view of the data, so a given user sees only committed data or data that
the user has changed. Although other isolation levels are supported, this concurrent isolation
paradigm requires synchronization and locking subsystems at the row level.
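
The visibility rule above can be observed with two connections to the same database. The following is a minimal sketch using Python's sqlite3 module; the `events` table and the `count` helper are hypothetical. Note that SQLite locks the whole database file rather than individual rows, so this shows only what each session sees, not the row-level locking of a full RDBMS.

```python
import os
import sqlite3
import tempfile

# Two independent sessions on one database file (hypothetical name).
path = os.path.join(tempfile.mkdtemp(), "events.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, message TEXT)")
writer.commit()


def count(conn):
    """Number of event rows visible to this connection."""
    rows = conn.execute("SELECT COUNT(*) FROM events").fetchall()
    return rows[0][0]


# The INSERT implicitly opens a transaction on the writer connection.
writer.execute("INSERT INTO events (message) VALUES ('login failed')")

seen_by_writer = count(writer)          # the writer sees its own uncommitted change
seen_by_reader_before = count(reader)   # the reader sees committed data only

writer.commit()
seen_by_reader_after = count(reader)    # the row is visible once committed

writer.close()
reader.close()
```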

Infrequent schema changes
Before data can be loaded into an RDBMS, a schema must be in place that defines the semantics of
the data and its relationships. This requires that the data model be complete before an application
can be created. Relational schemas are generally static or evolve slowly. Schema changes are driven
by changes in application requirements, which are infrequent.
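
To illustrate the schema-first requirement, the sketch below loads events into a fixed table and then encounters an event that carries a field the schema does not define. The table layout, the `store` helper, and the sample events are assumptions made up for this example; it uses Python's sqlite3 module rather than any particular SIM product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The schema must exist before any event can be loaded.
conn.execute(
    "CREATE TABLE events (ts TEXT NOT NULL, source TEXT NOT NULL, message TEXT NOT NULL)"
)


def store(conn, event):
    """Insert one event dict; its keys must match existing column names.

    Column names are interpolated into the SQL text, so they are assumed
    to be trusted here. Values are passed as bound parameters.
    """
    columns = ", ".join(event)
    placeholders = ", ".join("?" for _ in event)
    conn.execute(
        f"INSERT INTO events ({columns}) VALUES ({placeholders})",
        tuple(event.values()),
    )


store(conn, {"ts": "2004-01-01T00:00:00", "source": "firewall", "message": "denied"})

# A new device type reports an extra attribute that the schema does not know.
new_event = {
    "ts": "2004-01-01T00:00:05",
    "source": "ids",
    "message": "port scan",
    "severity": "high",
}
try:
    store(conn, new_event)
    rejected = False
except sqlite3.OperationalError:
    rejected = True   # the table has no column named severity

# Only after the schema is changed can the new event be stored.
conn.execute("ALTER TABLE events ADD COLUMN severity TEXT")
store(conn, new_event)
stored = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Every new event attribute forces a schema change of this kind, which is manageable when requirements change rarely and becomes a burden when event formats vary widely.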

Precision queries
RDBMSs are designed to optimize precision queries on structured data. This means that precise
information is known about the data before a query is formulated, and the data itself has been
structured to fit a predetermined model or schema. Often, queries are statically stored and
optimized, with only the query data varying. RDBMSs are best optimized for unique key queries, such
as lookups by customer number or invoice number. RDBMSs are not optimized for range or
pattern-matching queries, such as ‘list all invoices over $100,000’ or ‘list all companies with
“.com” in their names’. These query types usually involve a complete table scan. RDBMS databases
must be tuned to support a specific set of applications. The tuning is accomplished through
indirect references to data, such as the creation of indices, or through data organization, such as
clustering the data for performance. Therefore, once a database is tuned for a specific set of
applications, it can easily become less optimal for other applications.
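
The difference between a key lookup and a range or pattern-matching query can be seen by asking the database for its query plan. The following sketch uses SQLite's `EXPLAIN QUERY PLAN` statement through Python's sqlite3 module; the `invoices` table, the `plan` helper, and the index name are hypothetical, and other RDBMSs report their plans in different formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (invoice_no INTEGER PRIMARY KEY, company TEXT, amount REAL)"
)


def plan(conn, sql, params=()):
    """Return the human-readable plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]   # the last column holds the description


# Unique key query: reported as a SEARCH on the primary key.
key_plan = plan(conn, "SELECT * FROM invoices WHERE invoice_no = ?", (42,))

# Range query on a column without an index: reported as a full SCAN.
range_plan = plan(conn, "SELECT * FROM invoices WHERE amount > ?", (100000,))

# Pattern match with a leading wildcard: also reported as a full SCAN.
pattern_plan = plan(conn, "SELECT * FROM invoices WHERE company LIKE ?", ("%.com%",))

# Tuning for one application: an index on amount gives the planner an
# alternative for the range query, but it does not help the pattern match.
conn.execute("CREATE INDEX idx_invoices_amount ON invoices (amount)")
tuned_range_plan = plan(conn, "SELECT * FROM invoices WHERE amount > ?", (100000,))

for name, steps in [
    ("key", key_plan),
    ("range", range_plan),
    ("pattern", pattern_plan),
    ("range after index", tuned_range_plan),
]:
    print(name, steps)
```

Each index added this way speeds up one class of query while adding cost to every insert, which is one reason a database tuned for one set of applications can be less suitable for another.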