A big change has taken place in data acquisition systems over the last several years, as the growing use of video data in imaging systems has increased the size and speed of the data streams these systems need to deliver and process.
The challenge for design engineers is to support the low-latency requirements of real-time data acquisition and distribution while also providing the high throughput required to handle large video streams without dropping frames or degrading data quality. A distributed shared memory network can deliver both the low latency and the high throughput these more demanding imaging systems need, while at the same time enabling the processing system to be placed away from the frequently harsh factory floor environment.
Traditional data acquisition systems (Figure 1) were typically tasked with relatively slow data rates, perhaps 10 to 100 measurements per second at the high end, because they made simple measurements such as an object's displacement, acceleration or temperature.
Because throughput requirements were fairly low, the data could be brought into the processing system via normal I/O channels using analog or discrete signals. The processing system itself could be built using simple A/D converters and discrete I/Os. A single processor could easily handle and operate on the low-level data rates required to measure the object’s position, movement or size.
Increasingly, however, the trend in today's data acquisition systems is to add video cameras to capture additional data and monitor operations. This trend is, in turn, driving a need for greater I/O throughput. On a typical manufacturing line today, video images are taken of the objects being produced. A computer compares each image, delivered over a video link, against a known good image and looks for specific characteristics to determine the object's quality.
This use of video means a huge increase in data: instead of hundreds of samples per second, the equivalent of 20 Mbytes/s now streams from the video source to the computer. When the image comparison produces a negative result, the computer responds by sending control data to adjust the manufacturing process; for example, the system may divert the bad object to a dump bin.
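As a rough illustration of how a single camera reaches this magnitude, the short sketch below computes a raw video data rate. The resolution, color depth and frame rate are assumed for the example; they are not taken from the article.

```python
# Assumed camera parameters (illustrative only, not from the article).
width, height = 640, 480     # pixels per frame
bytes_per_pixel = 3          # uncompressed 24-bit color
frames_per_second = 25

# Raw, uncompressed video data rate.
bytes_per_second = width * height * bytes_per_pixel * frames_per_second
print(bytes_per_second / 1_000_000)  # ~23 Mbytes/s, the same order as the 20 Mbytes/s figure
```

Even a modest uncompressed stream lands in the tens of Mbytes/s, far beyond the hundreds of samples per second of a traditional sensor channel.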
Unfortunately, factory floors can be hot, noisy, dirty and prone to large amounts of shock and vibration. This is a less than ideal environment for video-based processing systems. Another problem is the amount of cabling often required by video imaging systems. An imaging system that monitors multiple stages of a process with multiple cameras, distributed over tens or even hundreds of feet, can require significant amounts of physical cabling, which can create a potential hazard on a factory floor.
The Standard Network Approach
What’s needed is a high-speed I/O network that enables remote processing. Several attempts have been made to address the need for the high data throughput associated with video imaging, using standard networks such as Ethernet. Unfortunately, it is not possible to adequately address these challenges with point-to-point message-based networks such as Ethernet. Although a Gigabit Ethernet network has sufficient bandwidth to handle one video data stream, it lacks the low latency required to handle closed-loop process control (Figure 2).
An Ethernet network also demands substantial processing power just to run the communications protocol. The source node must know all of the destination nodes and explicitly send a message to each of them. Any time a new destination node is added, or a task using the data moves to another node, the source node's communications must be modified. These interdependencies produce a network that is not easily scalable and does not readily accommodate growth. Although some newer Ethernet switches do support multicasting, performance and complexity would still be compromised.
Ethernet uses a source-controlled protocol: the source node controls which destination nodes have access to the data. If multiple nodes require such access, the source node simply sends the data to each of them, and this repeated sending quickly consumes network bandwidth. A single 20 Mbyte/s video source would require 40 Mbytes/s of bandwidth if two nodes needed the data, and 60 Mbytes/s if a third destination node were added.
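The bandwidth arithmetic above can be sketched directly. The figures are the ones quoted in the article (a 20 Mbyte/s video source); the function names are invented for the example.

```python
STREAM_MBYTES_PER_S = 20  # one video source, as quoted in the article

def source_controlled_load(n_destinations: int) -> int:
    """Point-to-point (Ethernet-style): the source re-sends the stream per destination."""
    return STREAM_MBYTES_PER_S * n_destinations

def destination_controlled_load(n_destinations: int) -> int:
    """Shared memory ring: the data traverses the ring once, regardless of readers."""
    # n_destinations is accepted only to mirror the signature above; the
    # load does not depend on how many nodes choose to use the data.
    return STREAM_MBYTES_PER_S

for n in (1, 2, 3):
    print(n, source_controlled_load(n), destination_controlled_load(n))
# 1 destination: 20 vs 20; 2: 40 vs 20; 3: 60 vs 20 (Mbytes/s)
```

The source-controlled load grows linearly with destinations, while the destination-controlled load stays flat, which is the scalability argument made above.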
The Shared Memory Network with Ring Topology
To resolve these problems, a network is needed that requires very little processor overhead, supports true data broadcasting and is destination-controlled. A shared memory network utilizing a ring topology (Figure 3) fulfills all of these requirements. It ensures that every network node has access to all data while minimizing network latency and maximizing the throughput available to capture and distribute the video imaging data along with the low-speed sensor data and commands. A shared memory network also lets the data processing be handled remotely, away from the process being monitored and controlled, and therefore eliminates much of the cabling.
Using a shared memory architecture, data captured at multiple inspection stations can be fed to a number of processors, which can take the parts of the information they need and work on that data simultaneously. Another advantage of shared memory systems is that more computers can be easily added to the network as the system’s requirements grow. A shared memory network also allows a heterogeneous mix of computers, so that a specialized high-speed video processing computer can be integrated into a network of commercial, low-cost PCs.
A shared memory network, in which each node, or computer, has an exact copy of the same data, enables all of the imaging system data to be distributed. Each computer on the network can dedicate its full processing capacity to a single discrete task while working on the same set of data. Changes to the test/manufacturing process—such as adding more sensors or changing the processing of the data—may be easily done by simply adding another computer or changing some routines in an existing computer. The data becomes available to all nodes without the need to change any of the network wiring.
The SCRAMNet GT Shared Memory Architecture
One example of a shared memory network architecture is Curtiss-Wright Controls' Shared Common RAM Network, Greater Throughput (SCRAMNet GT). SCRAMNet GT is a high-throughput technology for connecting multiple processors to form a single, real-time, distributed processing system in which memory is shared among the processors. It supports up to 255 nodes on a network ring with a data throughput of up to 210 Mbytes/s.
In a shared memory system the video data is sent out on the network ring only once. Because each node has access to all the available data, if a node selects to use and display that data it does so without affecting any other node. Conversely, if a node decides to drop off and not display any data, no other node is affected. The receiving station does not have to go back to the source and request that the desired data be sent again.
This shared network system architecture can be compared to a television station broadcast, which is unaffected by how many viewers are watching it at any given time. It is sent only once, and additional viewers do not affect the source. On the other hand, when a point-to-point network such as Ethernet distributes a Webcast, it must send the data individually to every subscriber, significantly burdening network throughput. SCRAMNet GT supports a sustainable throughput rate of 210 Mbytes/s, which is comparable to Ethernet in a one-time, point-to-point connection.
With Ethernet, however, if the data needs to be displayed in several different places, for example in three locations, then the entire data set has to be sent out three times, tripling the required bandwidth. As the load on the network increases it can become overburdened, resulting in delays and dropped frames. With video, this reduced information quality can result in unacceptably jerky images and lost data. Images that might be adequate for normal viewing may not be acceptable for a quality inspection system using computer vision.
Remotely Located Data Processing
To address the harsh factory floor environment, shared memory enables the data processing to be located remotely from the data acquisition. After a computer node collects and pre-processes the data, it places the data in shared memory. With SCRAMNet GT, the distance between nodes can be considerable: with standard shortwave laser transceivers it can reach 200 to 300 meters, and with longwave transceivers up to 10 kilometers. Cabling is also reduced, because only a single fiber optic cable runs between each computer, whereas in a point-to-point network every camera and sensor must be wired individually back to a panel connected to the processing computers.
Because the processing can be handled remotely, the transceiver nodes on the factory floor do not need high-speed processors, since all that is required of them is the simple task of pulling in the data and placing it in shared memory. The high-performance processors can be located in a safe lab or control room environment.
SCRAMNet GT is the latest version of the popular shared memory architecture that was first introduced about 15 years ago. It is the highest bandwidth shared memory system available.
Although the original SCRAMNet had a 150 Mbit/s data rate and could transfer data around the network ring at 20 Mbytes/s, SCRAMNet GT supports 2.5 Gbit/s data rates and a throughput of 210 Mbytes/s with a latency of less than 0.5 microseconds per node. It features a one-to-many and many-to-many built-in broadcast capability and ensures that all nodes receive updated information without intervention from either host or user. The original system supported a maximum of 8 Mbytes of memory. Each SCRAMNet GT board, whether VME, PCI or PMC, comes with 128 Mbytes of memory (Figure 4).
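Given the quoted per-node latency and maximum ring size, a worst-case propagation delay around the ring can be estimated. The traversal model below (a write passes node to node until it reaches the last reader, so the worst case is one fewer hop than the node count) is an assumption made for the sketch, not a figure from the article.

```python
NODE_LATENCY_US = 0.5   # per-node latency quoted for SCRAMNet GT
MAX_NODES = 255         # maximum ring size quoted for SCRAMNet GT

def worst_case_ring_latency_us(n_nodes: int) -> float:
    # Assumed model: data propagates node to node around the ring, so the
    # last node to see a write is n_nodes - 1 hops from the writer.
    return (n_nodes - 1) * NODE_LATENCY_US

print(worst_case_ring_latency_us(MAX_NODES))  # 127.0 microseconds
```

Even at the full 255-node limit, the estimated worst-case delivery time stays in the low hundreds of microseconds, which is what makes closed-loop control over the ring practical.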
Another advantage of using shared memory is the low programming cost associated with application programs. The system designer must assign data only to specific areas of shared memory. Application writers then use the data variables corresponding to these addresses and use the variable names as they would normally. Tasks can be moved to other processors without any changes to the application itself. In actual practice, a task could be talking to another task within the same computer, or to a task on the far side of the ring. With shared memory, the sending task doesn’t need to know where the receiving task is located.
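As a minimal sketch of this programming model, the fragment below overlays agreed-upon offsets on a shared region. The field names, offsets and the anonymous memory map standing in for the board are all invented for illustration; a real SCRAMNet GT application would map the board's replicated memory window through the vendor's driver rather than calling mmap directly.

```python
import mmap
import struct

# Hypothetical layout chosen by the system designer: fixed offsets into the
# replicated shared-memory region. Names and offsets are illustrative only.
OFFSET_PART_COUNT = 0x0000   # uint32: parts inspected so far
OFFSET_REJECT_FLAG = 0x0004  # uint32: 1 if the current part failed inspection

# Stand-in for the board's shared window; every node would see the same bytes.
shm = mmap.mmap(-1, 4096)

def write_u32(offset: int, value: int) -> None:
    """Write a little-endian 32-bit value at a shared-memory offset."""
    struct.pack_into("<I", shm, offset, value)

def read_u32(offset: int) -> int:
    """Read a little-endian 32-bit value from a shared-memory offset."""
    return struct.unpack_from("<I", shm, offset)[0]

# A writer task anywhere on the ring updates the agreed addresses.
write_u32(OFFSET_PART_COUNT, 1001)
write_u32(OFFSET_REJECT_FLAG, 1)

# A reader task, on the same computer or the far side of the ring, uses the
# same offsets without knowing where the writer runs.
print(read_u32(OFFSET_PART_COUNT))  # 1001
```

Because tasks address variables by their agreed shared-memory locations rather than by peer identity, moving a task to another processor needs no change to the application, which is the low programming cost described above.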
As video-based imaging systems are being more widely deployed, and video resolution and speed increase, it is essential to ensure that enough bandwidth is available. In December 2005, at the I/ITSEC Conference in Orlando, Florida, Curtiss-Wright exhibited a SCRAMNet GT system with a total throughput load of 190 Mbytes/s. Four video sources—two DVD players and two video cameras—were run from four nodes generating 50 Mbytes/s of streaming video data. Another task generating 120 Mbytes/s of additional data throughput was added to burden the system. The result was no video data degradation or lost frames.
Curtiss-Wright Controls Embedded