Automation systems of the future will have safety fully integrated and networked

Up until the late 1990s, all safety systems were based on relay logic and had to be hardwired. This was due to the lack of safety standards and regulations that would allow the use of microprocessors and software.

Safety relies on trusted technology, materials and methods, and at that time insufficient confidence had been gained in microprocessor-based systems.

This is why traditional safety systems were implemented separately from the rest of the automation system. They operated independently, often in parallel with the control system, for good reason – safety systems must be highly available.

A fault or unexpected occurrence in the "normal" operation of the machine must not degrade or compromise protective or safety processes. 

It was inevitable that as automation systems became more intelligent, safety systems would have to follow. Put simply, safety functionality increasingly depended on knowing what the rest of the machine was doing. However, trying to add classical safety to an automation system (which was heavily computerised even then) always led to substantial costs due to the additional wiring and engineering required.

Fortunately, the release of the international safety standard IEC 61508 in the late 1990s allowed microprocessors and software to be used in safety systems.

This brought many advantages (see Table 1 below), which in turn led to a proliferation of programmable safety controllers at the expense of traditional hardwired systems.

Table 1: Some of the advantages of safety controllers over discrete components

  • Flexibility of configuration, allowing for possible future expansion
  • Ease of modification of program, for future enhancement of system 
  • Reusability of code, including user defined function blocks 
  • Much improved diagnostics for sensors, actuators and their wiring, including possible preventative maintenance 
  • Able to support different I/O types, for different safety devices, like e-stops, safety mats, light curtains
  • Use of graphical development tools speeds up programming and aids documentation
  • Program simulation during development, for testing without hardware

Safety controllers (also called 'logic solvers') have much in common with standard PLCs. Both have inputs, outputs and programmable memory to hold a program, which is executed cyclically (i.e. scanned). However, safety controllers provide two additional features.

Firstly, they are fault tolerant: they can reliably detect faults and raise appropriate alarms. A single fault cannot cause erroneous operation, nor can it prevent the system from functioning as designed. Such a fault should also be repairable without interrupting operation.

Secondly, safety controllers use a state machine mechanism to select a series of pre-defined states. These ensure correct operation during shutdowns and power cycling.

The "fail-safe" or reset state ensures that if a failure does occur, it happens in a predictable way, with all outputs turned off. This renders the protected machinery inoperable and therefore safe.
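The behaviour described above can be pictured with a small sketch. It is only a toy model under stated assumptions: the class name `SafetyController`, the two states and the `scan` signature are hypothetical and are not taken from any real controller or standard.

```python
from enum import Enum, auto


class State(Enum):
    RESET = auto()  # fail-safe state: every output is off
    RUN = auto()    # outputs follow the safety logic


class SafetyController:
    """Toy model of a safety controller's state handling (illustrative only)."""

    def __init__(self, n_outputs: int) -> None:
        self.state = State.RESET          # power-up always begins in the safe state
        self.outputs = [False] * n_outputs

    def start(self) -> None:
        # Leaving the reset state requires a deliberate start request.
        if self.state is State.RESET:
            self.state = State.RUN

    def scan(self, inputs_ok: bool, fault_detected: bool) -> None:
        """One cyclic scan: any detected fault forces the fail-safe state."""
        if fault_detected:
            self.state = State.RESET
        if self.state is State.RUN:
            self.outputs = [inputs_ok] * len(self.outputs)
        else:
            # Predictable failure behaviour: all outputs are de-energised.
            self.outputs = [False] * len(self.outputs)
```

Note that in this sketch the controller does not return to `RUN` by itself once the fault clears; a fresh `start()` is needed, which mirrors the usual expectation that a machine does not restart unexpectedly.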

In order to support these additional features, several techniques are used in the design of safety controllers. They have special hardware and software, specifically designed to self-diagnose improper internal operation and ensure reliability.

Redundancy is used to maintain operation, even with partial failures. There is also extra security for transfers via the communications port. 

Alleviate manufacturer-specific errors

Dual processors manufactured by different chip makers are often used, with the program compiled separately by different compilers, to prevent manufacturer-specific errors from being repeated in both channels. The two CPUs constantly monitor each other at runtime, and operation is stopped when a discrepancy between them is detected.
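A rough software analogy of this cross-monitoring is sketched below. Two deliberately different implementations of the same safety function stand in for the two diverse CPUs; the names `channel_a`, `channel_b`, `cross_checked_output` and `ChannelMismatch` are invented for illustration, and real controllers perform this comparison in certified hardware and firmware rather than in application code.

```python
from typing import Callable, Sequence


class ChannelMismatch(Exception):
    """Raised when the two channels disagree; the caller must go to the safe state."""


def channel_a(inputs: Sequence[bool]) -> bool:
    # Channel A: enable the output only if every safety input is healthy.
    return all(inputs)


def channel_b(inputs: Sequence[bool]) -> bool:
    # Channel B: the same function, written differently on purpose (diversity).
    for healthy in inputs:
        if not healthy:
            return False
    return True


def cross_checked_output(
    inputs: Sequence[bool],
    a: Callable[[Sequence[bool]], bool] = channel_a,
    b: Callable[[Sequence[bool]], bool] = channel_b,
) -> bool:
    """Return the agreed result, or raise if the channels differ."""
    result_a = a(inputs)
    result_b = b(inputs)
    if result_a != result_b:
        raise ChannelMismatch(f"channel A={result_a}, channel B={result_b}")
    return result_a
```

A fault that affects only one channel (for example, a channel stuck at "enabled") shows up as a mismatch rather than as a wrong output.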

Safety controllers provide special I/O for safety devices, to monitor both the device and its wiring. Devices with mechanical contact outputs, like e-stop switches, need pulsed test outputs. Light curtains use Output Signal Switching Device (OSSD) outputs, while other devices use a continual small current from the controller.

Wiring is sometimes paired to ensure redundancy and concurrent operation, and safety-rated relays always use dual, forcibly guided contacts to guard against contact welding. Finally, a series of internationally recognised standards exists specifically for safety controllers, and compliance with them requires stringent testing by specialised third-party laboratories.

Table 2 shows some of the more commonly quoted safety standards for machinery and process applications. These standards are regularly reviewed, and updated with technology advancements. 

Users can therefore be confident that any component meeting the relevant safety standards will work reliably. While the risk of failure can never be completely eliminated, it can however be managed by the implementation of tried and proven methods and technologies. 

Almost immediately after the introduction of safety controllers and software, the idea of using a network to link safety components that are distributed over some distance was considered. Ideally, existing I/O fieldbus infrastructure could be used as this would present considerable cost savings. 

Nearly all safety networks are "single channel", meaning they use a single cable line only; often that which is already installed. Most networks running safety still achieve a SIL3 rating. For many sites, running redundant or dedicated cabling for safety would be cost prohibitive. 

Data in any network is subject to corruption due to noise, cross-talk, network malfunctions, bugs, equipment failures and cable severing. Other issues include traffic overload, latency and even illegitimate packets! These factors may appear to make networks unsuitable for safety applications. 

However, like safety controllers, safety networks are heavily regulated by standards; IEC 61508-2 clause 7.4.8 is one example. Generally speaking, safety protocols are required to manage the following packet-related issues: repetition, loss, insertion, resequencing, data corruption, delay, masquerading (mimicking of packets) and FIFO errors created in intermediate routers.

To overcome these issues, safety protocols use techniques such as time stamps, sequence numbers, CRCs (cyclic redundancy checks) and connection identifiers in each transmitted packet. Safety slaves also autonomously detect communication failures, including timeouts, and revert to their fail-safe mode if any error occurs.
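How these measures combine at the receiving end can be sketched as follows. The frame layout (a 16-bit connection identifier, a 16-bit sequence number, the payload and a trailing CRC-32 from Python's `zlib`) is a hypothetical one chosen for brevity; it does not reproduce any real safety protocol, and the names `build_frame` and `SafetyReceiver` are assumptions.

```python
import struct
import zlib
from dataclasses import dataclass
from typing import Optional

HEADER = struct.Struct("<HH")  # connection id, sequence number (hypothetical layout)
CRC = struct.Struct("<I")      # trailing CRC-32 over header + payload


def build_frame(conn_id: int, seq: int, payload: bytes) -> bytes:
    body = HEADER.pack(conn_id, seq) + payload
    return body + CRC.pack(zlib.crc32(body))


@dataclass
class SafetyReceiver:
    conn_id: int
    expected_seq: int = 0
    fail_safe: bool = False

    def receive(self, frame: bytes) -> Optional[bytes]:
        """Return the payload of a valid frame; otherwise latch the fail-safe state."""
        if self.fail_safe or len(frame) < HEADER.size + CRC.size:
            self.fail_safe = True
            return None
        body = frame[:-CRC.size]
        (crc,) = CRC.unpack(frame[-CRC.size:])
        conn_id, seq = HEADER.unpack(body[:HEADER.size])
        if (zlib.crc32(body) != crc            # data corruption
                or conn_id != self.conn_id     # masquerading or misrouted frame
                or seq != self.expected_seq):  # loss, repetition, resequencing
            self.fail_safe = True
            return None
        self.expected_seq = (seq + 1) & 0xFFFF
        return body[HEADER.size:]
```

In this sketch a single bad frame is enough to latch the fail-safe state, after which every further frame is rejected until the receiver is replaced or reset by some separate, deliberate action.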

White and black

There are two approaches to safety in networks. Firstly, "white channel" networks are those in which all components are designed from the ground up with a minimum safety rating. This includes all networking adapters, switches, couplers and the like, which do not generally carry safety approvals.

Using only safety certified components adds significantly to costs and is often unnecessary as most network traffic is unrelated to safety. 

The opposite methodology is "black channel", where the network is treated like a "black box" and safety communication is encapsulated inside normal data packets, like PDOs. The underlying network makes no attempt to interpret or modify this data; it merely provides a means to transport process data between safety nodes.

Specific network constraints, such as update rates, aren't usually a direct consideration as safety systems already require intrinsic watchdog timers to safeguard reaction times. 
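A watchdog of the kind mentioned here might look like the sketch below. The class name `Watchdog`, the `kick` and `check` methods and the injectable `clock` parameter are all illustrative assumptions; a real safety node implements this in certified firmware.

```python
import time
from typing import Callable


class Watchdog:
    """Trips, and stays tripped, if no valid frame arrives within timeout_s."""

    def __init__(self, timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_kick = clock()
        self.tripped = False

    def kick(self) -> None:
        """Call whenever a valid safety frame has been received."""
        if not self.tripped:
            self._last_kick = self._clock()

    def check(self) -> bool:
        """Return True while healthy; latch the trip once the timeout elapses."""
        if self._clock() - self._last_kick > self.timeout_s:
            self.tripped = True
        return not self.tripped
```

Because the timeout is enforced by the safety node itself, a slow or congested network delays data but cannot silently stretch the reaction time: the node simply falls back to its safe state.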

A black channel can therefore be created within most networks, as has been done in many open vendor networks like DeviceNet and EtherCAT. Safety data can even be routed or ported between networks, as has been implemented by network bridges and gateways. 

Most fieldbus systems use only layers 1, 2 and 7 of the ISO Open Systems Interconnection (OSI) model, and layers 1 and 2 cannot be altered if compatibility is to be maintained.

Black channels require an additional safety layer between the communication stack and the application, as per IEC 62280-1, which originated in railway signalling.

Safety is therefore implemented in just the application layer (7); refer to Figure 1 below. 

Figure 1: FSoE (Fail Safe over EtherCAT), shown as a black channel.

The black channel approach is far more common than white channel as it utilises existing networks without modification. It can also work concurrently with standard communications and does not degrade network performance – both of which are very important in industrial networks.

Risk assessment

To ensure a safety system is not compromised by the introduction of a network, a risk assessment needs to be carried out, as per the principles set out in EN ISO 14121-1. This will evaluate the performance of the overall safety system, with the aim of identifying potential hazards and taking measures to reduce them.

Functional safety standards such as IEC 61508, IEC/EN 62061 and EN ISO 13849-1/-2 are used to satisfy machinery safety legislation (the European Machinery Directive). They analyse safety levels by using data such as SIL (Safety Integrity Level) ratings, PFHd (Probability of dangerous Failure per Hour) and required Performance Levels (PLr).

The same standards and methods can also be applied to networks to determine their suitability. For a safety network to comply with SIL3 (equivalent to PLe), a PFHd of between 10⁻⁸ and 10⁻⁷ must be achieved. The IEC 61784-3 safety standard recommends that safety networks occupy <1% of the available PFHd.

Therefore the PFHd (residual error rate) for safety networks is <10⁻⁹ per hour, that is, 1% of the SIL3 upper limit of 10⁻⁷. This equates to more than 100,000 years of continuous operation without an undetected error.
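The arithmetic behind these figures is easy to check. The short sketch below assumes the upper SIL3 limit of 10⁻⁷ per hour and a 1% share for the network, as quoted above; the function and constant names are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365


def network_budget(sil_limit_per_hour: float = 1e-7,
                   network_share: float = 0.01) -> tuple:
    """Return (network PFHd budget per hour, mean years between undetected errors)."""
    network_pfhd = sil_limit_per_hour * network_share
    mean_hours = 1 / network_pfhd
    return network_pfhd, mean_hours / HOURS_PER_YEAR


pfhd, years = network_budget()
print(f"Network budget: {pfhd:.0e} per hour, about {years:,.0f} years between errors")
```

With these inputs the budget works out at 10⁻⁹ per hour, or one undetected error in 10⁹ hours on average, which is roughly 114,000 years.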

To achieve such outstanding data integrity, safety protocols cannot rely on the error detection mechanisms provided by the underlying network. They instead include their own CRCs, which are generated with a carefully chosen polynomial for a set packet length.

Safety protocols must use "proper" polynomials, as the quality of the polynomial has a major impact on error detection effectiveness, particularly for heavily corrupted data.

In the case of EtherCAT, its black channel appends a two-byte CRC to each two bytes of safety data; refer to Figure 1. This CRC is constructed so that it also aids in detecting missing, duplicated, repeated or out-of-sequence packets.
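The idea of interleaving a CRC with every two bytes of safety data can be illustrated as follows. This is a sketch, not the FSoE algorithm: the polynomial (the common CRC-16/CCITT value 0x1021), the byte order and the way the connection identifier, sequence number and block index are folded into the checksum are all assumptions made for the example, as are the function names.

```python
def crc16(data: bytes, crc: int = 0xFFFF) -> int:
    """Bitwise CRC-16 with polynomial 0x1021 (a stand-in, not the FSoE polynomial)."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def _context(conn_id: int, seq: int, index: int) -> bytes:
    # Folding these values into the CRC means a frame from the wrong connection,
    # or with the wrong sequence number, fails the check even if it is undamaged.
    return b"".join(v.to_bytes(2, "little") for v in (conn_id, seq, index))


def pack_safe_data(data: bytes, conn_id: int, seq: int) -> bytes:
    """Follow each two-byte block of safety data with its own two-byte CRC."""
    if len(data) % 2:
        raise ValueError("safety data must be a whole number of two-byte blocks")
    out = bytearray()
    for i in range(0, len(data), 2):
        block = data[i:i + 2]
        out += block + crc16(_context(conn_id, seq, i // 2) + block).to_bytes(2, "little")
    return bytes(out)


def unpack_safe_data(frame: bytes, conn_id: int, seq: int) -> bytes:
    """Verify every block; raise ValueError on any mismatch."""
    if len(frame) % 4:
        raise ValueError("frame length must be a multiple of four bytes")
    data = bytearray()
    for i in range(0, len(frame), 4):
        block = frame[i:i + 2]
        received = int.from_bytes(frame[i + 2:i + 4], "little")
        if crc16(_context(conn_id, seq, i // 4) + block) != received:
            raise ValueError(f"CRC mismatch in block {i // 4}")
        data += block
    return bytes(data)
```

Keeping each CRC responsible for only two bytes of data keeps the protected block short, which is one way of holding the residual error rate low for a given polynomial.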

Figure 2: A truly distributed safety system, including non-safety-rated devices, using EtherCAT.

The complete control system is shown in Figure 2 above. It shows a central controller which handles most of the intelligence through its program. It has local I/O and separate communication links to its control and data networks. 

A single control network is used to connect a series of distributed nodes, possibly from multiple vendors. It supports the mixing and matching of standard and safety-rated devices, but also needs to be high performance to allow, say, coordinated moves between servo systems.

The network master is in the main controller. It is not safety rated, as this is unnecessary and would add significantly to its cost. It therefore cannot control safety outputs, only standard outputs. However, its master function means it can monitor the status of all I/O, including safety I/O, which in turn allows that status to be used in the program.

Safety functionality is handled by a decentralised safety-rated controller, residing in a node on the network. It often incorporates local safety I/O for convenience, although its black channel allows additional safety I/O to be distributed across the network.

Modification of the architecture does not require re-certification for safety. Using a discrete safety controller separate from the main controller also allows safety to be added to existing networks, where safety may not have been considered.

Modern day software tools make programming transparent, meaning the final destination of the safety aspects of a program may not even be known to the programmer.

Synergy in programming 

One of the biggest advantages of the recent integration of safety with the existing control (logic) and motion systems has been the commonality achieved in programming. Complete integration means all variables and program structures are accessible, whether it is safe data or not.

Also, modern Integrated Development Environments ("IDEs") allow all programming to be done together, in one harmonised software package.

PLCopen is an international vendor-neutral and product-independent association, which actively promotes the improvement of software quality in the industrial domain. In doing so, it hopes that higher efficiencies will be achieved and life-cycle costs lowered.

It endorses a suite of universally accepted standards, including IEC 61131-3 for logic and its own complementary standards for motion control and safety.

So the current state-of-the-art automation system has a high-performance controller that complies with the three main streams of PLCopen standards. It utilises the latest programming software to tightly integrate all programming functionality.

It supports both a data network for data communications and a single, open vendor, control network for all remote devices. This network supports standard I/O and distributed safety via a decentralised controller; while still providing the performance needed for high speed devices like servo drives. 

Harry Mulder is Engineering Manager at Omron Electronics Australia. He has been involved in the industrial control industry for around 25 years, the last 22 of which have been with Omron Electronics. With a degree in computer science, his experience includes sales, engineering and product management of industrial programmable controllers, HMIs, networking and software. He currently manages an engineering team across four states but still likes to get involved with day-to-day problem solving. He is happily married and has three girls who continually keep him on his toes.