Hardware aspects

The hardware aspects concern the complete integrated system delivered to the customer. Several components can be distinguished, each of which may be supplied by the technology provider or by the application developer. This chapter decomposes such a system into its basic hardware components: platforms, speech processing boards, speech input/output interfaces, etc.

Platforms

The platform is the "black box" that will be installed at the customer site. It can be a PC (or compatible) or a proprietary system.

The major requirements concern its capabilities in terms of CPU (386, 486, Pentium, PowerPC, Motorola), memory (RAM, hard disk), data transfer rate over the PC bus (ISA, EISA, PCI) or over dedicated buses, data transfer rate from the memory cache to the disk (when writing files), and the current, in amperes, that the power supply can deliver to the expansion boards. Dedicated boards with DSPs occupy the free slots available on the platform; these may be half-length or full-length slots of the backplane.

The application developer has to determine which hardware configuration meets the needs of the application and then state the requirements in the terms listed above. The developer also has to know whether several platforms can be used together over a LAN.
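Stating the requirements in this way lends itself to a simple mechanical check. The following is a minimal sketch only: the class names, fields and all figures are hypothetical and would have to be replaced by values from the vendors' data sheets. It covers slots and power and leaves CPU, memory and bus throughput aside.

```python
from dataclasses import dataclass


@dataclass
class Board:
    name: str
    current_amps: float  # hypothetical figure; take the real one from the data sheet
    full_length: bool    # True if the board needs a full-length slot


@dataclass
class Platform:
    free_full_slots: int
    free_half_slots: int
    supply_amps: float   # current the power supply can deliver to expansion boards


def unmet_requirements(platform, boards):
    """Return a list of the requirements that the platform fails to meet."""
    problems = []
    full = sum(1 for b in boards if b.full_length)
    half = len(boards) - full
    if full > platform.free_full_slots:
        problems.append("not enough full-length slots")
    # Assumption: a half-length board can also sit in a spare full-length slot.
    spare_full = max(platform.free_full_slots - full, 0)
    if half > platform.free_half_slots + spare_full:
        problems.append("not enough slots for half-length boards")
    if sum(b.current_amps for b in boards) > platform.supply_amps:
        problems.append("power supply too small")
    return problems
```

An empty list means that, for the criteria modelled here, the candidate platform can host the chosen boards.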

Speech processing boards

Speech processing can run on specific boards (a dedicated board, or an off-the-shelf board from Dialogic, Rhetorex, LSI, NMS or other vendors) or on the local CPU. The application developer has to know how to install and configure the boards, and which capabilities a board offers with respect to the application. For example, if an application has to recognise 10 words, the developer may be able to use a single speech processing board to process two calls simultaneously. The developer therefore has to know how many simultaneous sessions or calls can be handled in real time (that is, how many telephone lines, if the system is telephone-network oriented). In some applications this number depends on the language, which has to be taken into account: a TTS board may handle three calls for Spanish synthesis but only one for French.

The technology provider may also offer different boards with multi-channel configurations (for example, Board A with 2 recognisers and Board B with 4 recognisers). The application developer has to know whether either board can be plugged in without changing the application.

There are also hardware constraints to check, such as the number of free slots and the power and memory that the boards require.
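The sizing reasoning in this section can be sketched as a small calculation. The per-board capacities below reuse the TTS example from the text (three calls for Spanish, one for French); the function names and the figure of six lines and four free slots are illustrative assumptions, not properties of any real board.

```python
import math

# Calls handled simultaneously by one TTS board, per language.
# Values taken from the example in the text; real figures come from the vendor.
CALLS_PER_BOARD = {"spanish": 3, "french": 1}


def boards_needed(lines, language):
    """Number of boards needed to serve `lines` simultaneous calls."""
    return math.ceil(lines / CALLS_PER_BOARD[language])


def fits_in_slots(boards, free_slots):
    """Slot constraint only; power and memory have to be checked separately."""
    return boards <= free_slots


if __name__ == "__main__":
    for language in ("spanish", "french"):
        n = boards_needed(6, language)
        print(language, n, fits_in_slots(n, free_slots=4))
```

With these assumed figures, the same six-line application fits in one platform for Spanish but not for French, which is why the language dependency has to be known before the hardware is chosen.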

Speech input/output interfaces

If the application runs on a desktop, speech input/output may use a sound board with an integrated microphone. If it is used within a telecommunication application, an interface to the PTT network is needed; such interfaces are available from many vendors. As with the speech processing boards, the application developer has to know the requirements that the input/output interfaces must meet.

Connectivity

In many configurations one needs at least two boards: one to deal with telephone signalling (the telephone interface) and a second one that implements the speech processing. The two boards exchange speech data over a dedicated bus. The objective of such a bus is to allow boards that implement different applications, from different technology providers, to interact on the same platform in an open environment. Such buses are specified at both the hardware and the software level. The best known ones are:

PEB (Pulse Code Modulation Expansion Bus)

The PEB can be seen as an internal switching matrix capable of routing any time slot to the appropriate audio port of the speech recogniser.

MVIP (Multi-Vendor Integration Protocol)

MVIP is a multiplexed digital telephony highway for use within one computer chassis. It provides a standard connection for digital telephone traffic between individual circuit boards. It supports telephone circuit-switching under direct computer control, using digital switch elements distributed amongst the circuit boards in a standard computer. MVIP software standards allow system integrators to combine MVIP-compatible products from different vendors. The communication technologies supported include call management, voice store and forward, speech recognition, text-to-speech, fax, data communications, and digital circuit-switching. The objective of an MVIP bus is to carry telephone traffic. It allows the interface to the telephone network to be separate from the voice processing resources, so the telephone interface may be obtained from one vendor while the voice processing resources are obtained from others. A single MVIP bus has a capacity of 256 full-duplex telephone channels (Mitel, MT90810 Manual).

SCSA (Signal Computing System Architecture)

According to Dialogic, SCSA represents the next generation of call processing architecture, opening up new means of delivering and communicating information. SCSA is a comprehensive multi-layered hardware and software architecture for building call processing systems with multiple technologies and standard interfaces. The objective is to provide standards that allow portability, scalability and interoperability across applications from different developers. The SCSA bus, called the Signal Computing Bus, has a capacity of 1024 time slots (for a PC).

When the technology provided offers such connections (in hardware as well as in its APIs), the application can be ported easily, provided that this is planned for from the start.
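The capacity figures quoted for these buses can be compared with the number of calls an application has to carry. The sketch below is illustrative only: the two capacities are those given in the text, while the figure of two time slots per full-duplex call (one per direction) is an assumption that should be checked against the vendor's documentation, as should any slots consumed by additional resources.

```python
MVIP_FULL_DUPLEX_CHANNELS = 256  # from the text
SCBUS_TIME_SLOTS = 1024          # from the text (PC configuration)
SLOTS_PER_CALL = 2               # assumption: one time slot per direction


def calls_from_time_slots(time_slots=SCBUS_TIME_SLOTS, slots_per_call=SLOTS_PER_CALL):
    """Full-duplex calls that fit in a given number of time slots."""
    return time_slots // slots_per_call


def bus_can_carry(calls, capacity_calls):
    """True if the bus capacity (in full-duplex calls) covers the load."""
    return calls <= capacity_calls


if __name__ == "__main__":
    wanted = 300  # hypothetical number of simultaneous calls
    print(bus_can_carry(wanted, MVIP_FULL_DUPLEX_CHANNELS))
    print(bus_can_carry(wanted, calls_from_time_slots()))
```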

Real-time aspects

As mentioned above, the application developer has to know how to manage the CPU load when using a multi-channel system and should require a uniform and coherent response time on each channel. The technology provider should guarantee a maximum response time under worst-case conditions. Real-time behaviour concerns the complete application and should be estimated with all the lines active. For example, if the system plays a beep before starting speech recognition, the application developer has to compute the delay: the duration of the beep prompt plus the time needed to start recognition. This delay is crucial because people may speak before the beep, which leads to a gap error.
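The delay computation described above can be sketched as follows. All times are in milliseconds, measured from the start of the beep. The function names and figures are hypothetical, and the sketch assumes that the gap error corresponds to the portion of speech uttered before the recogniser is actually listening.

```python
def recognition_ready_ms(beep_ms, start_latency_ms):
    """Time after the start of the beep at which the recogniser is listening."""
    return beep_ms + start_latency_ms


def worst_case_ready_ms(beep_ms, per_channel_latencies_ms):
    """Worst case over all channels, to be measured with every line active."""
    return beep_ms + max(per_channel_latencies_ms)


def lost_speech_ms(speech_onset_ms, ready_ms):
    """Speech missed when the caller starts talking before the recogniser is ready."""
    return max(0, ready_ms - speech_onset_ms)


if __name__ == "__main__":
    ready = worst_case_ready_ms(200, [120, 150, 135])  # hypothetical measurements
    print(ready, lost_speech_ms(250, ready), lost_speech_ms(400, ready))
```

Using the worst-case latency rather than the average reflects the requirement that the response time be guaranteed on every channel.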

EAGLES SWLG SoftEdition, May 1997.