Some time ago I saw a video in which the authors of a SCADA system demonstrated the functionality of their historian (archive) on a Raspberry Pi platform. They were showing how it could store 100 thousand values per second to an MS SQL database running on a remote Windows server. The data was stored in 200 tables, each holding 500 data objects (a single time column and 500 data columns per table).
At first glance, this demonstration was impressive. A historian running on a Raspberry Pi and inserting 100 thousand values per second... awesome, right?
It was only after a while that the broader implications started to dawn on me.
First - the Raspberry Pi in this example writes to a remote MS SQL database. So, in effect, it performs just one (precompiled) insert per second into each of the 200 tables. And of course, before each insert, it has to set the 500 values and the timestamp for that table. But all other operations are performed by the remote MS SQL server - unlike in our performance tests in the "What load can Raspberry Pi handle" blog, where the total flow into a PostgreSQL archive database running on the same Raspberry Pi as the D2000 reached 1,250-1,450 requests per second.
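To make these mechanics concrete, here is a minimal JDBC sketch of what such a per-second loop might look like. The table and column names (tab_N, vN), the connection string, and the value source are my assumptions, not the demo's actual code:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Timestamp;

public class WideInsertSketch {

    // Builds "INSERT INTO tab_N (ts, v1, ..., v500) VALUES (?, ?, ..., ?)"
    static String buildInsert(int tableNo, int cols) {
        StringBuilder sql = new StringBuilder("INSERT INTO tab_" + tableNo + " (ts");
        for (int i = 1; i <= cols; i++) sql.append(", v").append(i);
        sql.append(") VALUES (?");
        for (int i = 1; i <= cols; i++) sql.append(", ?");
        return sql.append(")").toString();
    }

    // Stand-in for the demo's simulated data source.
    static double simulatedValue(int table, int col) {
        return Math.random();
    }

    public static void main(String[] args) throws Exception {
        final int TABLES = 200, COLS = 500;
        try (Connection con = DriverManager.getConnection(
                "jdbc:sqlserver://winserver;databaseName=historian", "user", "password")) {
            // Prepare each insert once; from then on, only bind values and execute.
            PreparedStatement[] inserts = new PreparedStatement[TABLES];
            for (int t = 0; t < TABLES; t++) {
                inserts[t] = con.prepareStatement(buildInsert(t + 1, COLS));
            }
            // One round of the per-second loop: 200 executions x 500 values
            // = 100,000 values shipped off to the remote server.
            Timestamp now = new Timestamp(System.currentTimeMillis());
            for (int t = 0; t < TABLES; t++) {
                inserts[t].setTimestamp(1, now);
                for (int c = 1; c <= COLS; c++) {
                    inserts[t].setDouble(c + 1, simulatedValue(t, c));
                }
                inserts[t].executeUpdate();
            }
        }
    }
}
```

As the sketch shows, the Raspberry Pi's share of the work is little more than binding parameters; the heavy lifting happens on the remote server.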
Additionally, the values were obtained by simulation. If they came from communication (as in our performance tests), the question is how much CPU power would be consumed by reading a hundred thousand measured points per second...
Second - only a common timestamp and the values are stored. None of the flags (validity, alarms, limits, user flags) that the D2000 Archive stores along with each value by default. Moreover, the storage is periodic, so the accuracy of the timestamp is one second.
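For illustration, this is roughly the shape of a "full" archived sample - a per-value timestamp plus flags - as opposed to a bare number in a shared-timestamp row. The flag names below are examples of mine, not D2000's actual internal representation:

```java
import java.time.Instant;
import java.util.EnumSet;

// Illustrative shape of one archived sample carrying its own timestamp
// and flags. The flag names are examples only.
public class ArchivedSample {
    enum Flag { INVALID, ALARM, LIMIT_VIOLATION, USER_FLAG }

    final Instant timestamp;        // per-value timestamp, not a shared row time
    final double value;
    final EnumSet<Flag> flags;      // validity, alarm, limit and user flags

    ArchivedSample(Instant timestamp, double value, EnumSet<Flag> flags) {
        this.timestamp = timestamp;
        this.value = value;
        this.flags = flags;
    }
}
```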
Third - every second, the values of all 500 objects within a single value table are stored, regardless of whether they have changed or not. So every second, more data is added to the archive database. At one row per second, each table grows by 86,400 rows a day - over 31 million rows a year - and there are 200 such tables. What will be the size of such a table and how convenient will it be to work with? Moreover, since the values of 500 objects share one table, they must have a common archiving depth. If I want to increase the archiving depth of one object, it changes for all of them.
The D2000 Archive approaches archiving from a completely different angle. Firstly, each object has its own table in the archive database, so it can have an independent archiving depth. Secondly, the archive tries to minimize the number of writes to the table, saving both the performance of the database server and its disks. For change-based archives, a filter can be set - it is even possible to define up to 3 bands and set a different filter in each of them. For periodic archives, it implements change-based storing of periodic values: a value is stored only if it changes. So if, for example, the value of an object is constant for a whole day and is archived with a 1-minute period, that constant is stored in the database only once. Naturally, when reading, the D2000 Archive must "decompress" the stored value (all "decompressed" values have the archive flag 'K' set).
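Here is a minimal sketch of the change-based storing idea - write a periodic value only when it differs from the last stored one, and repeat the last stored value on read. This mimics the principle only; it is not D2000's actual implementation, and it omits the configurable filter bands mentioned above:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Change-based storing of periodic values: a period's value is written
// only if it differs from the last stored one; reads repeat the last
// stored value for the skipped periods ("decompression").
public class ChangeBasedPeriodicArchive {
    private final NavigableMap<Long, Double> rows = new TreeMap<>(); // periodStart -> value
    private Double lastStored = null;

    void store(long periodStartMs, double value) {
        if (lastStored == null || value != lastStored) {
            rows.put(periodStartMs, value);  // value changed: one new row
            lastStored = value;
        }
        // unchanged: nothing written - a day-long constant costs a single row
    }

    // Returns the value valid at the given period. A value taken from an
    // earlier row corresponds to a "decompressed" value (archive flag 'K').
    Double read(long periodStartMs) {
        Map.Entry<Long, Double> e = rows.floorEntry(periodStartMs);
        return e == null ? null : e.getValue();
    }
}
```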
Fourth - if 500 objects are stored in a single table, how are delayed values handled? It is common for values to arrive from communication with their own timestamps. If these are delayed by a few seconds, the historian has to update rows that were already inserted. Well, it has to... it doesn't have to if it declares that it does not deal with delayed values. The D2000 Archive, of course, does make such corrections. For periodic archives, thanks to the change-based storing mentioned above, it is not a correction (an update) but a simple insertion. For change-based archives, the value is inserted into the database whenever it arrives. In addition, the D2000 Archive can also automatically recalculate the values of statistical and calculated archives that depend on an archive object with delayed values.
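To contrast the two layouts, here is a hedged JDBC sketch; the table and column names (tab_7, v42, obj_42) are assumptions of mine. In the wide shared-timestamp layout, a delayed sample forces an update of an existing row; in the per-object layout, it is simply one more insert:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

public class DelayedValueSketch {

    // Wide shared-timestamp layout: the row for this second already exists,
    // so a late value for object #42 means updating the row in place.
    static void correctWideRow(Connection con, Timestamp ts, double v) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "UPDATE tab_7 SET v42 = ? WHERE ts = ?")) {
            ps.setDouble(1, v);
            ps.setTimestamp(2, ts);
            ps.executeUpdate();
        }
    }

    // Per-object layout: the late sample is simply one more row,
    // inserted with its own (older) timestamp.
    static void insertDelayed(Connection con, Timestamp ts, double v) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO obj_42 (ts, value, flags) VALUES (?, ?, ?)")) {
            ps.setTimestamp(1, ts);
            ps.setDouble(2, v);
            ps.setInt(3, 0); // flags placeholder
            ps.executeUpdate();
        }
    }
}
```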
So let me sum it up: I think the demonstrated configuration is pretty far from reality. I don't know of a customer or project that needs to store hundreds or thousands of values in a database every second or every few seconds. On the other hand, several applications built on D2000 contain well over a hundred thousand archive objects - yet, thanks to the optimizations described above, only a few hundred to a few thousand values per second are written to the database.
You might wonder whether the D2000 Archive could store data in a similar "wide" format as the historian described above. The answer is no, it cannot. But the D2000 DbManager can (with a bit of programming in ESL or Java, of course). Rows can be inserted one at a time or several at a time, as the sketch below shows.
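The following JDBC sketch only shows the shape of "several rows at a time" via batching; in a real D2000 application such inserts would go through DbManager (called from ESL or Java) rather than raw JDBC, and the table name here is a stand-in:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Timestamp;

public class BatchInsertSketch {

    // Inserts several rows per round trip using JDBC batching.
    static void insertRows(Connection con, Timestamp[] ts, double[] values) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO wide_history (ts, v1) VALUES (?, ?)")) {
            for (int i = 0; i < ts.length; i++) {
                ps.setTimestamp(1, ts[i]);
                ps.setDouble(2, values[i]);
                ps.addBatch();              // queue the row locally
            }
            ps.executeBatch();              // send the whole batch at once
        }
    }
}
```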
Why does this blog have the provocative title "the fastest car in the world"? If someone were offering such a car for sale, it would be very appealing at first glance. Only after studying the documentation and thinking about what the car is actually needed for would the prospective buyer realize that such a car is probably not very practical, has high fuel consumption, and costs significantly more than a normal production car. Children probably wouldn't even fit in it, and it wouldn't be the best for shopping... Similarly with archive subsystems and historians - in practice, users appreciate features other than raw speed: functionality (in the case of the D2000 Archive, e.g. statistical, calculated and script-filled archives), stability, debugging and diagnostic options, high availability (redundant archiving), and zero maintenance (the D2000 Archive also performs data deletion, cleaning, and reorganization of archive databases, so in practice it is only necessary to give the database enough disk space). And speaking of disk space, the compression of depository databases (available in the latest D2000 version), which shrinks the space needed to store long-term archives several times over, is also a feature to consider.
So speed is just one of several parameters to consider when choosing an archive system. And in my opinion, not even the most important one.
Ing. Peter Humaj, www.ipesoft.com