Northwestern University
  Videoconferencing H.323 Basics
Introduction
Point-to-Point Videoconferencing
The Gatekeeper
Multipoint Videoconferences
Streaming
Gateways
Bandwidth Considerations

Introduction
H.323 is an international standard protocol for videoconferencing. It uses the Internet for connectivity between endpoints. Endpoints can be client videoconferencing terminals, Multipoint Control Units (MCUs), or gateways. This presentation describes the various endpoints and how they interoperate.

   
   

Point-to-Point Videoconferencing
Consider two client terminals that are connected to the Internet (see Figure 1). An example of a client terminal, or endpoint, is a Polycom Viewstation. The Viewstation and its associated peripherals allow the user to call another client, send the local audio/video stream to the remote client, and hear and view the received audio/video stream on a local speaker and monitor connected to the Viewstation.

Assume one user (the local user) uses a Viewstation to call a user at a remote Viewstation (client terminal) by entering the IP address of the remote Viewstation. The clients set up a call between the stations following the specifications of the H.323 protocol. Once the call is set up, the clients exchange audio/video streams over the Internet. The point-to-point videoconference continues until one of the users "hangs up" the call.

One of the problems with this type of video call is that IP addresses are used to place the call. IP addresses are difficult to remember; some users have dynamically assigned (DHCP) IP addresses that can change every time they boot their system; and we have noted problems with IP dialing between systems from different vendors. We thus do not recommend IP dialing, although it is occasionally used.

 

   

The Gatekeeper
To alleviate the problems of IP dialing, the H.323 standard defines the use of a gatekeeper (see Figure 2). The gatekeeper is a system that connects to the Internet just like the client terminals. The IP address of the gatekeeper is configured into the client terminals. When the clients "power up", they contact the gatekeeper and send it certain information that describes the client. This process is known as registration; the client registers with the gatekeeper.

Two identifiers are assigned to and configured in each client terminal. One is an H.323 alias. It is usually descriptive of the particular client terminal and usually contains alphanumeric characters. The other identifier is the H.323 extension. It usually consists of several digits and can be thought of as the video telephone number of the client. While it is possible to dial using either the H.323 alias or the H.323 extension, it is difficult to enter alphanumeric characters on most clients, so the H.323 extension is normally used for dialing. Refer to the section "Understanding Videoconferencing", "Addressing Issues" for a better understanding of the addressing standards used at Northwestern University.

When the clients register with the gatekeeper, they pass their IP address, H.323 alias, and H.323 extension to the gatekeeper, where they are stored. This allows a local user to dial a remote user by entering the remote user's H.323 extension (video telephone number) rather than an IP address. The local client terminal communicates the H.323 extension to the gatekeeper. The gatekeeper then checks whether the remote client is registered with it. If it is, the gatekeeper sets up the call between the two clients; if it is not registered, the call is rejected. Once the call has been set up, the audio/video streams flow directly between the clients over the Internet. The gatekeeper can perform a number of other management functions as well. For a description of these, see "Understanding Videoconferencing", "Advanced Issues".
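The registration and dialing steps described above can be sketched as a simple lookup table. This is a minimal illustration, not how any particular gatekeeper product is implemented: the class and method names, the example addresses, and the example extensions are all hypothetical, and a real gatekeeper exchanges H.323 signaling messages rather than Python method calls.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Registration:
    """What a client passes to the gatekeeper when it registers."""
    ip_address: str
    alias: str      # descriptive, alphanumeric H.323 alias
    extension: str  # numeric H.323 extension ("video telephone number")


class Gatekeeper:
    """Toy registry (hypothetical); it only models the address lookup."""

    def __init__(self) -> None:
        self._by_extension: Dict[str, Registration] = {}

    def register(self, reg: Registration) -> None:
        # Registering again (for example after a DHCP address change)
        # replaces the stored entry for that extension.
        self._by_extension[reg.extension] = reg

    def resolve(self, extension: str) -> Optional[str]:
        """Return the IP address for a dialed extension, or None to reject."""
        reg = self._by_extension.get(extension)
        return reg.ip_address if reg else None


gk = Gatekeeper()
gk.register(Registration("192.0.2.10", "room-a-viewstation", "5551001"))
gk.register(Registration("192.0.2.20", "room-b-viewstation", "5551002"))

print(gk.resolve("5551002"))  # registered: the call can be set up
print(gk.resolve("5559999"))  # not registered: None, the call is rejected
```

Because the caller dials the extension and the gatekeeper supplies the current IP address, a client whose address changes only needs to register again; callers keep dialing the same number.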

 

   

Multipoint Videoconferences
To this point we have only considered point-to-point videoconferences. These are conferences between two client terminals. The question can then be raised: what if users at three or more clients want to hold a videoconference? To handle this situation, the H.323 standard introduces the concept of a Multipoint Control Unit (MCU). The MCU (see Figure 3) is an endpoint that can be thought of as a "video bridge". Like any other endpoint, the MCU connects to the Internet and registers with the gatekeeper.

An MCU, depending on its design capacity, can handle a certain number of simultaneous videoconferences, each logically separate from the others and each with a specified number of users. System administrators define "services" on the MCU, where each service has certain characteristics that distinguish it from the other services defined on the MCU. As an example, a service 75 might be defined that allows several simultaneous videoconferences to be created, where each has a maximum size of, say, five sites (clients) and where all sites must encode their audio/video streams at 384 Kbps. A specific videoconference on service 75 is then identified by the service number followed by a conference "password" (e.g. 751234). Each of the simultaneous videoconferences held on service 75 is identified by the service number (75) and a different password.

When users want to join a particular videoconferencing session, they dial the service number/password combination. The gatekeeper checks whether that service has been registered by an MCU. If it has, the gatekeeper completes the call by connecting the client to the specified videoconference on the MCU; if the service has not been registered, the call is rejected. Once the call has been connected, the client's audio/video stream is sent over the Internet from the client to the MCU. Similarly, other clients connect to the session and send their audio/video streams to the MCU. The MCU selects one of the audio/video streams in the videoconference and returns that stream to all of the clients except the client whose stream was selected. There are several methods for selecting an audio/video stream; audio switching and chairman control are two alternatives. Typically, the method chosen is audio switching, where the MCU selects the stream that currently has active audio (someone is talking, or is talking the loudest). We frequently refer to this selection process by saying that this particular stream (client) has "captured" the MCU.
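As a rough illustration of the service/password dialing described above, the sketch below splits a dialed string such as 751234 into a service number and a conference password. The prefix-matching rule, the function name, and the MCU address are assumptions made for this example; the source does not specify how a gatekeeper separates the two parts.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: service number -> address of the MCU that registered it.
registered_services: Dict[str, str] = {"75": "192.0.2.50"}


def route_conference_call(dialed: str) -> Optional[Tuple[str, str, str]]:
    """Split a dialed string into (mcu_address, service, password).

    Assumes the service number is a prefix of the dialed digits and that
    the longest registered prefix wins. Returns None when no registered
    service matches, which corresponds to the call being rejected.
    """
    for length in range(len(dialed) - 1, 0, -1):
        service, password = dialed[:length], dialed[length:]
        if service in registered_services:
            return registered_services[service], service, password
    return None


print(route_conference_call("751234"))  # ('192.0.2.50', '75', '1234')
print(route_conference_call("801234"))  # None: service 80 is not registered
```

Two conferences on the same service differ only in the password part, so 751234 and 755678 would reach the same MCU but join separate sessions.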

Let's assume that we have several clients connected to a single videoconferencing session on an MCU. The assumption is that no users want the MCU to send them back video of themselves and no site wants to receive an audio stream that contains its own audio. So the MCU sends the selected video stream to all the clients except the client whose stream was selected; to the currently selected site, the MCU sends the video from the site that was selected previously. The audio streams from all sites are mixed together and sent back to each site with that site's own audio removed. Thus each site gets a unique audio stream that contains only the audio from the other sites.

As the user(s) at one site stop talking and the user(s) at another site start to talk, the second site captures the MCU. The process is repeated, with the video from the newly selected site now being sent to all the other sites and the newly selected site receiving the video from the previously selected site.
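The switching behavior described above can be sketched in a few lines. This is a simplified model under stated assumptions: the function names and site labels are hypothetical, audio levels are treated as plain numbers, and a real MCU works on encoded media streams rather than on dictionaries.

```python
from typing import Dict, List, Optional


def select_speaker(levels: Dict[str, float]) -> str:
    """Audio switching: pick the site whose audio is currently loudest."""
    return max(levels, key=lambda site: levels[site])


def video_routing(sites: List[str], current: str,
                  previous: Optional[str]) -> Dict[str, Optional[str]]:
    """Map each receiving site to the site whose video it is sent.

    Every site sees the current speaker, except the current speaker,
    who sees the previously selected site (None if there was none).
    """
    return {site: (previous if site == current else current)
            for site in sites}


def audio_sources(sites: List[str], receiver: str) -> List[str]:
    """Sites mixed into the audio sent to `receiver` (its own is left out)."""
    return [site for site in sites if site != receiver]


sites = ["A", "B", "C"]
previous = "A"  # assume site A was the last site to capture the MCU
current = select_speaker({"A": 0.1, "B": 0.8, "C": 0.2})

print(current)                                  # B
print(video_routing(sites, current, previous))  # {'A': 'B', 'B': 'A', 'C': 'B'}
print(audio_sources(sites, "B"))                # ['A', 'C']
```

When site C later becomes the loudest, calling `video_routing(sites, "C", "B")` sends C's video to A and B, while C receives B's video.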

 

   

Streaming
To participate in an H.323 videoconference, users must have appropriate videoconferencing client terminals and Internet connectivity with sufficient bandwidth to support the videoconference. Some users may not have these capabilities but would still like to participate, even if that means they can only see and hear the conference participants and cannot interact with them. This can be accomplished if the videoconference session is captured, encoded in an appropriate format, and streamed over the Internet (see Figure 4), although this capability is not part of the H.323 standard.

To accomplish the streaming, an H.323 client must be connected to the conference session to be streamed. This station captures and decodes the audio/video that the MCU has currently selected. The decoded audio/video can then be re-encoded and streamed over the Internet. Two popular encoding formats are currently in use: RealVideo and Microsoft Windows Media. The encoded audio/video can be streamed on the Internet by a server, archived in a disk file for later viewing, or both. The system consists of an H.323 client, an encoder, a server, and an archive storage system.

Users can receive the stream using a browser on a computer. They enter the URL of the server, and the server starts sending the encoded audio/video stream over the Internet to the computer. Browser plug-ins exist that can decode both RealVideo and Windows Media streams. The user can thus see and hear the participants in the streamed videoconference in near real time. Alternatively, a user can connect to the server at a later date and view the archived version of the videoconference.

 

   

Gateways
So far we have discussed H.323 videoconferencing capabilities. However, many sites have videoconferencing rooms that implement the H.320 standard, which uses telecommunication lines (e.g. dial-up or dedicated ISDN lines). The H.323 standard was developed after the H.320 standard and uses many of the encoding/decoding protocols originally developed for H.320. H.320 systems can be considered legacy systems, but since many of them still exist, it is important that we continue to support H.320.

In addition to supporting pure H.320 videoconferences using H.320 MCUs, gateways between the two protocols can be provided (see Figure 5). A gateway provides a path between H.320 and H.323 systems. It translates H.320 commands and audio/video streams to their H.323 equivalents and vice versa. Users with H.320 client terminals dial the gateway over ISDN lines. The H.320 client then needs to input a service/password combination for the selected session, and the gateway connects the H.320 terminal to that session. All H.323-based users can see and hear the H.320-based users as if they were on H.323 terminals, and similarly the H.320-based users can see and hear the H.323 users as if they were on H.320-based terminals. Multiple H.320 connections can be made to the gateway, up to the capacity of the gateway.

 

   

One other benefit of the gateway is that it can accept calls from standard telephones (see Figure 6). A user with a standard telephone dials the ISDN telephone number and is connected to the gateway. The telephone user then enters a series of digits to indicate the service/password combination of the desired videoconferencing session. The user can then hear all of the audio from the videoconference and can also interact with others in the conference. The gateway can connect multiple telephone calls simultaneously and can even connect to a telephone bridge, which could allow participation by a large number of audio-only users.

 

   

Bandwidth Considerations
The H.323 client terminals encode the selected audio (usually from a microphone) and video (usually from a camera) inputs. The encoded audio and video are then compressed into a single audio/video stream and sent to the remote endpoint (another client terminal or an MCU). Different rates can be selected for the encoding process. As an example, an encoding rate of 384 Kbps might be selected. Of this, 64 Kbps is reserved for the audio and 320 Kbps for the video. The 384 Kbps stream is compressed (redundancy is removed) and sent to the remote endpoint. Similarly, a 384 Kbps stream is received from the remote endpoint. Thus approximately twice 384 Kbps of bandwidth (less any bandwidth saved by compression) is required to support the videoconference for this endpoint. If there is a lot of motion in the video, very little compression is achieved. If there is almost no motion in the video, the savings approach about 50%. Since we must design for the worst case, assume a bandwidth requirement of twice 384 Kbps.
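The arithmetic above can be written out as a small calculation. This is only a sketch of the rule of thumb given in the text (one stream sent plus one stream received, each reduced by whatever compression saves); the function and parameter names are hypothetical, and it ignores any network protocol overhead.

```python
AUDIO_KBPS = 64
VIDEO_KBPS = 320
ENCODING_RATE_KBPS = AUDIO_KBPS + VIDEO_KBPS  # 384 Kbps in the example


def endpoint_bandwidth_kbps(encoding_rate_kbps: float,
                            compression_saving: float = 0.0) -> float:
    """Bandwidth one endpoint needs: one stream sent plus one received.

    compression_saving is the fraction saved by compression, from 0.0
    (lots of motion, the worst case) to about 0.5 (almost no motion).
    """
    if not 0.0 <= compression_saving <= 1.0:
        raise ValueError("compression_saving must be between 0 and 1")
    per_direction = encoding_rate_kbps * (1.0 - compression_saving)
    return 2 * per_direction


print(endpoint_bandwidth_kbps(ENCODING_RATE_KBPS))       # 768.0 (worst case)
print(endpoint_bandwidth_kbps(ENCODING_RATE_KBPS, 0.5))  # 384.0 (little motion)
```

Planning with the default of no savings gives the worst-case figure of twice 384 Kbps, or 768 Kbps, per endpoint.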

Higher encoding rates can be selected. Most client terminals support rates up to 768 Kbps. Some proprietary implementations can encode at rates up to 2 Mbps. The higher the encoding rate, the better the quality of the video. However, higher encoding rates also mean higher bandwidth requirements, greater impact on the network, and greater impact on MCU capacity. Lower encoding rates, down to about 128 Kbps, can also be selected. This of course means lower video quality: lower encoding rates yield lower frame rates and choppy video. 384 Kbps is a good compromise between quality on one hand and resource impact on the other, and it will support 30 frames per second video. There is a discernible but small improvement in quality between 384 Kbps and 768 Kbps.