SCADA and MES systems work with data that have a real-time nature (as well as with data related to their own configuration, with external data, etc.). A logical request is to store the data somewhere.
Some systems develop their own repositories - text or structured files (XML). Others rely on standard SQL (or NoSQL) databases. What types of databases does the D2000 real-time application server use and how does it handle them?
It contains the configuration of the entire system. It is accessed by a single process - the D2000 Server. At startup, it reads its contents into memory and creates a dynamic data model (DODM) based on it, containing the configuration of all objects in the system, the relationships between objects, and their values. Subsequently, it only writes to the database (the only exception is finding out object history information that is not kept in memory), so the speed of information processing in the D2000 system is not affected by the speed (or sluggishness :) of the configuration database.
Again, only the D2000 Server can perform writes to the configuration database. They occur based on various events in the system - usually a change in the configuration of the object or a change in the start values.
In redundant systems, the configuration database is synchronized at the start of the kernel - the active D2000 server (HOT) ensures that the configuration database of the starting D2000 Server is identical to its own.
The configuration database is typically in the size of MB to tens of MB (exceptionally hundreds) depending on the scope of the application. Since version 7.2, when object history logging was added, the size of the configuration database has increased (by default, information about the last 7 changes is logged).
A further increase is expected when configurators start using the new Version 11 feature - Resource object. This allows the storage of any file or directory structure in the configuration database and its subsequent application use.
It contains records of various events in the D2000 system. Examples are user login and logout, change of object value (if monitoring of value changes is enabled for a specific object), process start or end, an error message from process or script, alarm occurrence or termination. The monitoring database has a defined time depth (in days) for each event category. Older events are deleted from the monitoring database. Unlike the configuration database, the contents of the monitoring database are not read at system startup.
As with the configuration database, only the D2000 Server accesses the log database. If a user or script wants to access the log monitoring database, it does so by querying the D2000 server.
The log database is in the order of MB to tens of GB in size (depending on the configured depth and the extent of monitored changes).
The configuration and log database in the D2000 system is supported on the Microsoft SQL, Sybase SQL Anywhere, Oracle, and PostgreSQL platforms. A hybrid configuration is also supported in a redundant system (each D2000 Server can use a different platform).
(*Update 2021: support for Oracle database as a configuration and log database has been dropped).
Archive and depository databases
The archive database is used for archiving real-time data of a numerical nature. Thus, a timestamp (in the D2000 system with an accuracy of ms) and various system value flags (validity, process alarms, limit overruns, archive flags) and user flags are stored with each value.
The properties of the archive are described in detail in several articles. Here, from the point of view of databases, let us state that the same platforms are supported as in the configuration and log database, and the sizes of archives can range from tens of MB to tens of TB (large applications with depositories enabled).
The archive database can be on the platforms Sybase SQL Anywhere (in the past the most widely used), Microsoft SQL Server (in the past a freely available MSDE variant was often used by our OEM partners), Oracle (in the past the majority platform for enterprise deployments) and PostgreSQL (currently the most widely used and preferred database). In addition, SQLite (for embedded systems) and MariaDB are experimentally supported.
In a single application, there can be several redundant archives. They may even use different database platforms. In the past, the combination of Sybase SQL Anywhere + Oracle has been used several times, today it is mostly redundant PostgreSQL archives. In the case of redundant archives, each archive writes to its own database and thus the databases are not connected in any way.
For redundant archives, there is a mechanism (the arcsynchro utility) to patch the "hole" that arose during the shutdown of the D2000 Archive. The arcsynchro utility can also work with databases on various platforms.
EDA (Energy Data Archive) is a special archive with optimized properties and functions for working with time series used in energy. Unlike the D2000 archive, it works with vectors - data structures that are used to store values that change over time. Vectors can be with a constant time step (15 minutes, hour, day, and their calculated variants) and with an undefined time step (change, interval, descriptive, and calculated).
The EDA database is accessed by one or more EDA Server processes operating in load-balancing mode. Unlike the D2000 Archive, all EDA Servers share one EDA database. Originally, only the Oracle platform was supported and due to redundancy, Oracle RAC (cluster) and possibly Hot/Standby technologies such as Dataguard or Dbvisit Standby for Disaster Recovery solutions could also be used. In 2020, support for PostgreSQL database has been added and EDA on PostgreSQL is deployed by several customers (new ones or migrated from Oracle).
Other differences from the D2000 Archive:
· Timestamps in vectors have a second accuracy - the archive has millisecond accuracy.
· The calculated vectors are calculated only during reading (with the exception of calculations, which are performed on demand and for a specific time interval) - the archive calculates continuously, on request, or during reading, depending on the configuration.
· The history of the vector is optionally stored, i.e. it is possible to trace back the values of the vector before the change - the archive does not have such functionality.
· Multiple values are written at once, writes are controlled from a script - the archive usually writes the values of D2000 objects in real-time - as they change, although it also supports writing data from a script.
· Computed vectors are supported, which can work with source vectors with high dynamics (selection by the filter, membership in vectors groups, and others) - the archive offers only a calculated archive from source objects defined in a mathematical expression (so the relation between source archives and calculated, or statistical archive is static).
Specialty: EDA Server can also work with so-called archive vectors - data that are actually located in an archive database. If necessary, it reads/writes archive data via the D2000 Archive (timestamps are then aligned to seconds).
General SQL database
The D2000 DbManager process can work with any ODBC database via the ODBC interface (or its variant optimized for Oracle with any Oracle database via the OCI interface). General SQL databases are used either as interfaces to external systems or as an integrated part of an application for storing structured data and data that are not just numbers. The D2000 DbManager can read and write values (line by line and page by page), can use parameterized SQL statements, work with BLOBs, and has extensive support for scaling and increasing performance (multiple database connections, connection sharing in non-transactional mode, connection recycling).
The Ipesoft D2000 real-time application server relies on standard SQL databases to store data. Different D2000 processes use SQL databases in different ways to provide different specialized services. The flexibility of Ipesoft D2000 can be seen in the growing support for different SQL databases. The current trend is to maximize the use of the most advanced open source database, PostgreSQL, for all types of databases - configuration, monitoring, archive, EDA and general databases. In doing so, the enhancements that PostgreSQL development brings are also used (e.g. UPSERT command in D2000 Archive or partitioning in EDA database).
Ing. Peter Humaj, www.ipesoft.com