SCADA versus relational databases

Today I don't want to write about the diverse uses of relational (SQL) databases in SCADA and MES systems, that was the subject of another blog. Today I want to think about one absolutely fundamental feature, without which it is difficult for anyone to imagine a database application today. It is referential integrity.

What does this term mean? Simply put, if a record in table A contains an index into table B, then that index must either be empty (null) or point to a specific (existing) row in table B.

To define referential integrity, SQL databases use foreign keys, which are used to create relationships between tables.

This image from Wikipedia shows what would happen if the database system did not enforce referential integrity. There could be a row in a list of albums whose artist_id field pointed to a non-existent record in the authors table. The data would then be “broken” and incorrect.

Figure 1 - Unwanted state in the database without defining foreign keys.

If foreign keys are defined, the relational database ensures that the above state can never occur. It is not possible to change the key identifier of a referenced record, or delete it entirely - with the exception of the defined cascade deletion, in which the corresponding records from the tables that reference this record are also deleted (the ON DELETE CASCADE clause in SQL syntax).

Therefore, integrity does not need to be guarded in a complicated way at the application level, but is ensured directly by the SQL database.

And now let's move on to SCADA (or MES) systems. They contain various types of objects, sometimes the terminology tags is also used (e.g. I/O tags, calculated tags, schemes, graphs, communication stations, bitmaps, scripts ...), between which there are also various relations. For example, a scheme uses an I/O tag, a bitmap and a graph. An output I/O tag can use a calculated tag and publish its value (write it to communication). A calculated tag can use other I/O tags, memory variables (user variables, structured variables) and other objects to calculate the value.

And now the fundamental question: Can a SCADA system guard referential integrity similarly to a SQL database? Can it prevent the deletion of an object that is in use by another object? Can it prevent the renaming of an object (for the same reason)?

For most systems, the answer is: Their authors are aware of the fundamental nature of the problem and are trying to solve it … and for specific situations they are successful :-).

For example, they can prevent the deletion of an object (tag) that is displayed in a diagram or alarm, or at least help the user find references if they want to find out if the object is used.

For example, WinCC has a consistency check that identifies whether and where the object is used. The user can explicitly check before deleting whether the deletion is ok. Similarly, after renaming a tag, it is necessary to compile the project and then perform a consistency check.

GE iFIX does it similarly. It does not always block deletion, but it can generate a dependency report before deleting the tag.

Ignition, for example, when displaying values, has the option to define a link between a tag and a display element as Direct Tag Binding (by directly entering the tag name, e.g. Valves/Valve4/FlowRate), in which case it can also visually indicate during editing that the tag no longer exists. Or, Indirect Tag Binding is used (e.g. Valves/Valve{intParameter}/FlowRate), which can contain one or more parameters (in the example {intParameter}), which are bound to some dynamic parameter. Generalization of this resulted in Expression Tag Binding, in which the tag name is composed using an expression. Of course, consistency checking is not possible for Indirect Tag Binding or Expression Tag Binding.

However, most SCADA systems fail in scripting. The fundamental problem is that if the object identifier in the script is textual (and therefore can be composed of multiple text strings), it is technically difficult or impossible to analyze the script and create a list of all used tags when saving the script. This needs to be handled while the script is running.

For example, WinCC, which uses the SmartTags function in VBScript, can access the second element of the DB10_HMI_Data array as follows:
SmartTags("DB10_HMI_Data")(2)

Of course, it is possible to search for the string "DB10_HMI_Data" in scripts and if it is found, then the given object is probably used (if the name is not in the commented part of the script, or in some unused parts of the code). If it is not found, it can still be used if its name is composed of multiple strings.

Ignition is similar. According to the documentation, it is read in the script from the object (tag) like this:

system.tag.readBlocking(["My/Tag/Path"]) #without specifying a provider
system.tag.readBlocking(["[default]My/Tag/Path"]) #with specifying a provider

The training video about tags says that moving tags between providers (which also changes the tag identification) can break bindings, scripts, and other objects in the project.

GE iFIX, which uses VBA, is a more interesting case. According to the documentation, it supports direct writing (e.g. to a database tag) with the syntax
FIX32.NODE.AI1.F_CV = 50#
but this (due to VBA naming conventions and possible collision with iFIX object names) is not recommended - the use of the WriteValue and ReadValue functions, which have text parameters, is preferred, e.g.
WriteValue "50", "AI-1"

So we're back to text identifiers and referential integrity issues.

The AVEVA Plant SCADA system (formerly Citect SCADA) allows referencing via tag names (direct references) and via string identifiers (indirect references) in the proprietary Cicode scripting language. Consistency checks (for direct references) are only performed at project compilation. For string identifiers, an error occurs only during script execution.

Here is the output from ChatGPT, which prepared a comparison for several SCADA systems.

Figure 2 - Comparison of referential integrity in multiple SCADA systems. Source: ChatGPT.

The only system that provides consistency checking in scripts is ABB 800xA, which is not SCADA but DCS (Distributed Control System) ... which is also at a completely different price level.

We can also notice how the choice of scripting language and the chosen approach to objects also affects the consistency checking.

So it seems to me that most SCADA systems only partially implement referential integrity, or offer tools to determine whether an object is in use - but the problem is the scripts. (If I'm wrong, please let me know).

Another (related) problem is object renaming. With the lifespan of SCADA systems being measured in decades (especially in the case of large and investment-intensive SCADA systems), it is logical that there are changes in technology, new devices are added, old ones are retired, reorganizations within the enterprise ... in short, the surrounding environment is changing and SCADA should also react - by expanding the configuration, deleting obsolete parts of the system and possibly renaming existing ones, if useful. And here we encounter the same problem with referential integrity. If text identifiers are used, how do we rename them without affecting the application, if we do not know where they are used?

In the Ipesoft D2000 real-time application server, which can be used to create SCADA/MES/EMS and other systems, referential integrity is the basis of the design. The core of the system (the D2000 Server process) constantly maps and maintains the referential integrity of the entire configuration, including scripts - these can be in ESL language (part of the D2000 design) or in Java. When storing objects and scripts, all references are analyzed and mapped. Object identifiers are used directly in scripts (e.g., the S.MyScheme scheme in ESL is referenced directly as S.MyScheme, in Java the dots are replaced by the dollar sign, and the names are converted to lowercase, so the scheme is s$myscheme).

However, internally, references are stored as numeric identifiers (which the user cannot change). Therefore, renaming an object will not cause a change in the configuration of other objects that use it, because their configuration contains a numeric object identifier and the conversion to a text name is only performed when the objects are opened for editing.

Therefore, if – as a result of editing - the script (or other D2000 object) contains a reference to a non-existent object, the saving will fail, but the system will generate an error message. This is equivalent to a failure to insert data into an SQL table due to a violation of referential integrity defined by a foreign key. Similarly, during XML import of objects (I/O tags, diagrams, graphs, scripts, etc.), it is verified whether referential integrity will be maintained after the import. If not, the import will not take place. If multiple objects are being imported, either all are imported or none of the objects are imported (equivalent to transactional data insertion into an SQL database). Moreover, both object editing and XML import take place online, on a live system. In the case of a redundant system, changes are propagated from the active D2000 Server to all passive D2000 Servers in real time, so most of the time the user is not even aware that he is working on a redundant system.

At the same time, information about referential integrity is also available to the user. In the graphical tool environment (D2000 GrEditor and D2000 CNF), the user can view for each object a list of objects that it uses and a list of objects that use (reference) it. This is also a tool for monitoring the data flow, especially useful in larger applications.

The following figure shows the objects that use the I/O tag M.StA.Temp1. They are:

The D.StA.Temps graph, which in turn is used by the S.StA.Temperatures scheme.
The archived value H.StA.Temp1 (which is used in the graph and also enters the calculation of the 5-minute statistics H.StA.Temp1_5minAvg)
The calculated tag P.StA.TempMax, displayed by the scheme.
The scheme itself, on which the I/O tag is displayed.

Figure 3 - Example of tracking dependencies between D2000 objects.

Using referential integrity, it is therefore possible to monitor the data flow (values of I/O tags entering calculations, or output I/O tags), which can significantly help in debugging and solving problems.

Conclusion

To sum it up, referential integrity in SCADA systems is, in my opinion, similarly important as in SQL databases. It allows you to build, maintain and develop a functional SCADA system in which errors and inconsistencies do not occur as a result of deleting or renaming objects. At the same time, it is a necessary condition if we want to make modifications to a live, production system without jeopardizing its functionality. It is also a useful tool when mapping data flows in large applications.

The Ipesoft D2000 real-time application server is one of the few that offers its users referential integrity. This is thanks to its creators, who very presciently included this useful feature in the original system design.

September 25, 2025, Ing. Peter Humaj, www.ipesoft.com