Quick tutorial: How to test before upgrade

Peter Humaj

December 10 2018, 6 min read

The title of this article intentionally omits what upgrade is being considered. Do you want to deploy a new D2000 version onto existing hardware? Or are you preparing a generational change of hardware or switching to virtualized servers in the ICT environment after four (six, eight ... sometimes ten) years? There are multiple possible scenarios – let's look at the complications with testing.

Why

First point – why do we need to test? Isn't it just a waste of man-days? There is a single reason: to make the transition to the new system as smooth as possible, so that we are not stopped by insufficient CPU or disk performance or memory capacity (caused by deploying a new version or by replacing hardware), or by an incompatibility introduced by the new version of D2000, by the application scripts when switching to another platform and operating system (OpenVMS -> HP-UX or HP-UX -> Linux), or by migrating to a different database.

At the same time, when planning a scenario, it is necessary to have a backup plan - every step should be reversible, including the possibility of data synchronization (e.g. archives or MES database).

What

Finding your way around a large system that has been built up over perhaps more than ten years is not easy. In practice, it has always helped us to keep thorough documentation of data flows and, when planning an upgrade, to review it thoroughly and make sure it is complete.

Data flow documentation covers all links between the system and its environment:

  • Communications implemented by the KOM process
  • Database connections – third-party databases accessible via DbManager
  • Data downloaded from the web (http, ftp) - weather, stock information, communication with partners
  • Data collected and sent via email - exchange of messages with the environment, confirmation of transactions and others
  • Communication between D2000 systems (via Gateway Server + Client)
  • Other communications - e.g. web services, real-time interface to stock systems like Trayport

Before testing, all communication flows must be reviewed to ensure that the new system (a new version of D2000 on temporary hardware or on new hardware) will run in a sufficiently isolated environment (e.g., on a dedicated VLAN separated from production and from its surroundings by a firewall) to avoid undesired connections to the outside. If necessary, disable the automatic startup of the processes that are responsible for individual communications (KOM, DBM or EVH processes) and start them individually when testing specific data flows.
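
The review can be kept as a small machine-readable inventory. The following Python sketch is only an illustration: the `DataFlow` record, its fields, and the example entries are hypothetical and are not part of D2000. It lists the processes whose automatic startup should stay disabled because at least one of their data flows is not yet isolated from production.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    """One documented data flow (hypothetical structure, not a D2000 object)."""
    name: str       # human-readable description of the flow
    process: str    # process type that handles it, e.g. "KOM", "DBM", "EVH"
    isolated: bool  # True once a test endpoint or firewall rule is in place


def processes_to_hold(flows):
    """Return the process types that still have a non-isolated flow."""
    return sorted({flow.process for flow in flows if not flow.isolated})


flows = [
    DataFlow("IEC104 link to substation", "KOM", isolated=False),
    DataFlow("MES database (test copy)", "DBM", isolated=True),
    DataFlow("Emails to partners", "EVH", isolated=False),
]
print(processes_to_hold(flows))
```

Processes returned by `processes_to_hold` would be started by hand only while their specific data flow is being tested.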

The more data flows are tested, the smaller the chance of unpleasant surprises during the upgrade. Let's look at the listed items one by one.

Testing communications performed by the KOM process

In most cases, thorough testing of communications is a problem. Why? Simply because the customer does not have a spare test unit for every type of communication. In some cases, however, testing is possible, e.g.

  • Servers capable of serving multiple clients (e.g. IEC104 server, OPC server, and others) – testing directly by connecting a new KOM process to production. Risks and problems may arise in this case too. We recommend, for example, turning off or de-configuring the output I/O tags of the new system so that it doesn't start writing (i.e., we only test reading), or creating a few output I/O tags that can be used for testing writes but do not affect the controlled system (e.g. they are only looped back). Additionally, there may be a problem with hardware performance (can it handle the doubled communication workload?).
  • Short-term interruption of communication with the production environment (in agreement with application administrators and technologists) and establishing communication with the new environment.
  • File-based communications – files processed by the production environment can be duplicated into a new environment.
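
The file duplication from the last item can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name and directory layout are hypothetical, and it assumes the files can be read from a directory without disturbing the production system (for example before production picks them up, or from an archive folder).

```python
import shutil
from pathlib import Path


def mirror_new_files(production_dir, test_dir):
    """Copy files present in production_dir but not yet in test_dir.

    Returns the names of the files copied in this run.
    """
    src, dst = Path(production_dir), Path(test_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        target = dst / path.name
        if path.is_file() and not target.exists():
            shutil.copy2(path, target)  # the source file is left untouched
            copied.append(path.name)
    return copied
```

Run periodically, such a script keeps the input directory of the new environment supplied with the same files that production receives.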

Testing database connections

This is usually less challenging than testing communications. A copy of the database or database schema is created and made available to the new system. After that, data added to the production database by an external system must be continuously duplicated into the copy (using triggers, scripts, or data pump).
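
As an illustration of the "scripts" option, the sketch below copies only the rows that the test copy does not have yet. It is a simplified example, not the method used in any particular project: it uses SQLite from the Python standard library as a stand-in for the real database, and the table `external_data` with columns `id` and `payload` is hypothetical. It also assumes that the external system only appends rows with increasing `id` values.

```python
import sqlite3


def copy_new_rows(src, dst, table="external_data"):
    """Copy rows with an id greater than the highest id already in dst.

    src and dst are open sqlite3 connections; returns the number of rows copied.
    """
    last_id = dst.execute(
        f"SELECT COALESCE(MAX(id), 0) FROM {table}"
    ).fetchone()[0]
    rows = src.execute(
        f"SELECT id, payload FROM {table} WHERE id > ? ORDER BY id",
        (last_id,),
    ).fetchall()
    dst.executemany(f"INSERT INTO {table} (id, payload) VALUES (?, ?)", rows)
    dst.commit()
    return len(rows)
```

Rows that are updated or deleted in production after they were copied are not handled here; for such tables, triggers or a full refresh would be more appropriate.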

Testing data from the web

With one-way communication (data download), problems do not arise. With bidirectional communication, one solution is to configure a test web/ftp server that simulates the production server.
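
A test web server of this kind does not have to be elaborate. The following sketch uses only the Python standard library; the returned payload, the status codes, and the names `StubHandler` and `start_stub` are assumptions made for the example, since the real server's behaviour depends on the partner's interface. It answers every GET with a fixed response and stores every POSTed body so that the uploads of the new system can be inspected.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Simulates a production web server for the system under test."""

    received = []  # bodies uploaded by the system under test

    def do_GET(self):
        body = b"temperature;12.5\n"  # made-up sample payload
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        StubHandler.received.append(self.rfile.read(length))
        self.send_response(204)  # accepted, no response body
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep the console quiet during tests


def start_stub(port=0):
    """Start the stub in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The new system is then pointed at the stub's address instead of the production URL, and `StubHandler.received` shows what it would have sent.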

Testing communication via email

Duplicating emails into a new mailbox used for testing is not usually a problem. It is more important to restrict the sending of messages, or to configure their redirection, so that emails from the new system cannot reach the other party by accident.
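
Where redirection is done in a script or relay in front of the mail server, the idea can be sketched as follows. This is only an illustration using Python's standard `email` package; the test mailbox address, the header name `X-Original-Recipients`, and the function `redirect_to_test` are hypothetical.

```python
from email.message import EmailMessage

TEST_MAILBOX = "d2000-test@example.invalid"  # hypothetical test mailbox


def redirect_to_test(msg, test_mailbox=TEST_MAILBOX):
    """Replace all recipients with the test mailbox.

    The original recipients are preserved in an extra header so that
    testers can still verify where the message would have gone.
    """
    headers = ("To", "Cc", "Bcc")
    original = ", ".join(
        str(value) for header in headers for value in msg.get_all(header, [])
    )
    for header in headers:
        del msg[header]  # removes every occurrence, no error if absent
    msg["To"] = test_mailbox
    msg["X-Original-Recipients"] = original
    return msg
```

The message envelope (the recipient list passed to the SMTP server) must be rewritten in the same way; changing headers alone does not change where a message is delivered.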

Testing communications between D2000 systems using gateways

Rules similar to those for testing KOM communications apply. The D2000 Gateway Server is capable of handling multiple clients (in production, no performance problems have been experienced yet). Obviously, the new system must be restricted so that it does not write values to the D2000 Gateway Server.

Testing other communications

Here the situation varies from case to case. For example, the Trayport stock exchange system provides access to its test environment, so thorough testing can be done. For other web services, it depends on the particular vendor or customer whether a test environment has been built, or at least whether testing within the production environment is supported.

Performance and functionality

Testing the performance of the new system, as well as having users test its functionality, is also important. Ideally, we are able to get as much of the production data as possible into the new system and generate the most realistic workload possible.

In last year's blog post "Communication in test environments", I described several ways to get values from communications into a test environment. These can also be used to test the new system:

Simulation

The simplest to configure, and at the same time, the least similar to reality :)

KOM replay

Replaying values recorded by the production KOM process. Here I would like to point out a limitation that did not matter in the test environment: the replay log format may vary between versions – so it is not guaranteed that the replay logs from production will be usable in the new system.

What can we do about that? Can this restriction be circumvented? Yes, with a little effort it can. In the latest releases of D2000, not only the KOM process can record data, but also the Gateway Client. So the Gateway Client (of the new version) can be connected to the Gateway Server of the production system. This Gateway Client can run in the standard configuration and record the standard communication of a particular gateway. Or, it can run in transparent mode and record the data of the selected production KOM process. Such replay logs can then be used in the new system, because they were produced by a Gateway Client of the same version as the new environment.

Obviously, the disadvantage of the KOM replay in comparison with the transparent gateway is that it does not provide live data, just repeated recordings. On the other hand, the KOM replay is usable already in the FAT tests at the supplier – and even earlier, at the hardware sizing stage. By running the application on the supplier's development servers and creating workload from the replay logs, it is possible to measure the application's demands (CPU, RAM, I/O operations) and then size the hardware with a sufficient reserve (or specify the requirements for a virtual server in the ICT environment).
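
The sizing step is simple arithmetic: take the peak demand measured during the replay and add a reserve. The sketch below shows the calculation; the 50 % default reserve and the sample values are assumptions chosen for the example, not recommendations from practice.

```python
import math


def size_with_reserve(samples, reserve=0.5):
    """Return the measured peak increased by the given reserve (0.5 = 50 %)."""
    if not samples:
        raise ValueError("no measurements")
    return max(samples) * (1.0 + reserve)


def cores_needed(cpu_core_samples, reserve=0.5):
    """Round the required CPU capacity up to whole cores."""
    return math.ceil(size_with_reserve(cpu_core_samples, reserve))


# CPU load in "cores busy" sampled during a replay run (made-up values)
print(cores_needed([2.1, 3.4, 2.9]))
# RAM in GB with a 25 % reserve (made-up values)
print(size_with_reserve([8.0, 10.0], reserve=0.25))
```

The same function can be applied to RAM or I/O measurements; only the rounding to purchasable units differs.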

Transparent gateway

This is the most convenient connection between the new system and production. The ideal scenario is to configure transparent gateways and give users enough time to test a new system that is 'almost alive': it collects data, archives them, and provides them to users; it just doesn't control anything.

In November last year, we configured transparent gateways while migrating and upgrading a relatively large customer system. Since there were several dozen communication processes, there were also multiple gateways. Then, in January of this year, when switching to the new system, the KOM processes of the old system and the transparent gateways were shut down, and the KOM processes of the new system were started within minutes.

A necessary requirement for deploying transparent gateways is a direct network connection – so if this is not possible (or we want to do the performance and FAT tests mentioned above), the KOM replay mode also comes in handy.

Conclusion

This blog does not aim to cover the topic of testing exhaustively (that would require a monograph and the involvement of colleagues from other units). It is meant to show some aspects of preparing for the replacement of control systems that we have had to deal with in practice, and to outline the methods we have been using.
