With a high volume of data, this sounds like a table partitioned on day boundaries (in MySQL 5.1). See Sarah’s blog and her links for a quick ramp-up on time-based table partitioning (http://everythingmysql.ning.com/profiles/blogs/partitioning-by-dates-the). One great benefit of table partitioning is that you can drop a partition to lose millions of rows in one quick statement, much faster than deleting millions of rows. Sort of like a partial TRUNCATE TABLE.
In the “Time and Time Again: Scenario 1” installment, and continuing through Transaction 2, 3, and 4, we developed Scenario 1. This is the “plain vanilla” scenario for asserted versioning. It consists of an original insert, two original updates and an original delete. In our experience, 75 percent or more of the transactions against asserted version tables are covered by Scenario 1. This is because the Scenario 1 transactions accept default values for all three bi-temporal dates that may be specified on original transactions. These dates are, in the positional sequence in which they appear on those transactions – the effective begin date, effective end date and the assertion begin date. The default value for the begin dates is the date the transaction is applied. The default value for the effective end date is 12/31/9999.
The past six columns illustrated an asserted versioning original insert, update and delete and, in the last column, introduced a taxonomy to guide the remainder of our work. Figure 1 shows that taxonomy. The transformations we have already covered in scenario 1 are shaded. These three transformations are the most basic ones in the sense that they correspond to inserting, updating and deleting in non-temporal tables.
In this column we turn to a classification of the state transformations that are valid for asserted version tables. We will then develop a scenario for each transformation. In the process, we will be using original transactions on which various combinations of bi-temporal dates have been explicitly supplied, and we will also illustrate how and why relevant integrity constraints are applied.
In our last column, we completed our second update to policy P861. In this column, we will apply a delete transaction. As Figure 1 shows, all original deletes are translated into two physical transactions against the target table, which we call temporal transactions. The first of the two is always a physical update; the second is always a physical insert.
Note: Beginning with this column, it will be increasingly important for readers to refer to our glossary, which can be found on the author’s Web sites (MindfulData.com and InbaseInc.com) with a listing of previous columns and articles. Our glossary-defined terms form a closely-knit conceptual network, and none of these terms can be fully understood apart from their semantic connections with other terms in the Glossary. And a note on the use of italics: we use italics for emphasis, but we also italicize the initial occurrence, from this column forward, of glossary-defined terms.
An original transaction is an insert, update or delete, whose target is an asserted version bi-temporal table. It is the transaction as written by its author and as submitted by the database administrator (DBA) to the database management system (DBMS). A temporal transaction is a physical transaction against an asserted version table. These physical transactions are either physical updates (updates in place) or physical inserts; there are no temporal transactions that are physical deletes.
Versioning is a widely used technique within the IT community. It is used to keep track of a chronological series of changes to objects of interest – in our example, insurance policies. In its simplest form, a non-temporal table is converted into a version table by adding a versioning date or timestamp to the primary key of that table. It is usually added as the low-order part of the primary key, in order to keep all versions for the same object together. In the version table, all rows with primary keys that are identical except for the versioning date or timestamp are rows that represent what was the current state of the object beginning on the versioning date and, usually, continuing until a newer version is added to the table.
There are many variations on version tables, but the variations make no difference to this discussion. In this particular case, version date is also the date the row was inserted, although in other variations, that is not always the case. There is no last update date, because versions are not physically updated. Instead, each logical update is carried out by copying the current version, applying the update to the copy and inserting the result as the new most current version. Finally, this particular variation on version tables also contains a delete flag, although this too is not present in all variations. When a delete is requested, the most current row is copied, its delete flag is changed from “N” to “Y,” and the new row inserted.
The major, but by no means exclusive, focus of future columns in this series will be on what I have been calling the “ultimate versioning pattern” but from now on will be calling the “asserted version pattern,” or simply “asserted versioning.” Following is a brief discussion of the major themes of the asserted versioning approach to managing temporal data.
In this installment, we will begin a discussion of how temporal referential integrity (RI) may be enforced with today’s relational database management systems (RDBMSs). In addition, we will expand the glossary which we began last time, in Part 19. Indeed, the glossary work has proven to be the major portion of the work done here in Part 20. This seems to be the right point in our series to begin formalizing the definitions of the terms we have introduced.