Video on the Internet
John V. Pavlik* and Andrew Lih**
©1996
For presentation at:
The Impact of Cybercommunications
on Telecommunications
CITI
The Casa Italiana, Columbia University
The City of New York
September 27, 1996
The Graduate School of Journalism
Columbia University
2950 Broadway
New York, NY 10027
http://www.cnm.columbia.edu/
On the Origin of Media Species
"There is no reason for any individuals to have a computer in their home."
Ken Olsen, president, chairman and
founder of Digital Equipment Corp., 1977
Nearly a half century ago television pioneer Ralph Baruch walked down a Manhattan street on his way to a restaurant for dinner when he noticed a gathering outside a storefront window. The people were watching a new device called "television," which the store had on display. Several hours later, when returning from the restaurant, Baruch noticed the gathering was still there, despite the late hour. Curious, he walked up to the storefront to see what programming on the television set was so compelling that it would keep a crowd on hand, even late in the evening. It was a test pattern.
Baruch, who would later recall this story while making a presentation during his year as a senior fellow at the Freedom Forum Media Studies Center, admitted that this was the moment he knew television was destined to become something big. The nature of television content and the business of this new medium, however, were completely unsettled and uncertain, and it took people such as Ralph Baruch to build, shape and even continue to invent this new visual medium (among his accomplishments, Baruch founded Viacom, Inc., one of the world's largest and most important media companies, and set the standard for quality in cable television programming).
Video on Cybernetworks
Video on the Internet today is in the same state that television found itself in nearly a half century ago. The content, technology and business are still evolving and are at best months, and probably years, from their settled form.
What do we know today and what can we anticipate for tomorrow? This paper/presentation will examine four questions regarding the state and future of video on the Internet.
• How will cyber-networked video delivery differ from traditional media?
• How will capacity for video be allocated during peak periods?
• What will be the funding mechanisms?
• What will be the effect on traditional carriers?
How will cyber-networked video delivery differ from traditional media?
Cyber-networked video delivery differs from traditional media in at least four important ways. First, video on a cybernetwork is digital. This difference, however, will rapidly disappear with the development and diffusion of High Definition Television, which will be digital as well, and delivered via over-the-air broadcasting and via hybrid fiber-coaxial cable television systems. Second, video on a cybernetwork is transmitted on a switched basis, thereby making it available on-demand and on a targeted basis. This difference is rapidly disappearing as well, as cable systems, direct-broadcast satellite and other carriers introduce these same capabilities, sometimes in tandem with telephone-based services. Third, cybernetworks, partly because of their decentralized nature and partly because of the radically falling price of video production technology (e.g., CPUs, memory, disks and video cards all continue falling in price), permit anyone to become a video provider. For example, several broadcast-quality digital video editing systems now sell for less than $10,000. This difference between cybernetworks and traditional media may also disappear, although the cable modem may offer more limited upstream than downstream capability (e.g., one megabit per second upstream versus ten megabits per second downstream). This difference represents one of the great uncertainties and opportunities of video delivery on the Internet.
Finally, cybernetworks offer substantially less bandwidth for video delivery than traditional carriers. Bandwidth is the amount of electromagnetic spectrum available for transmission of information, expressed in terms of the transmission speed of the medium.
Figure 1 outlines the bandwidth of different "cybernetworks" or media and their capacity for carrying video. As this figure shows, even a phone modem operating at 28.8 kbps can manage a low level of video via the Internet using a free or low-cost product such as CU-SeeMe. This video is likely to be at a fairly low quality of service, with low audio quality as well, and it may be subject to substantial break-up and interference. Video quality increases significantly with increases in bandwidth. However, few people with Internet access have such high-bandwidth connections as ISDN, much less T1, cable modem or T3, except for those of us lucky enough to work for a university such as Columbia, where T3 has become the standard for Internet service. Most businesses and almost all home consumers still rely on a phone modem. This makes accessing high-quality Internet-delivered video a slow and frustrating process at best, and an impossibility much of the time. Even low-quality video is difficult over a 28.8 kbps modem.
Bandwidth is rapidly increasing for cybernetworks, as ISDN deploys, cable modem trials expand and optical fiber moves ever closer to the home. Nynex recently announced, for example, its purchase of 1 million ISDN lines for the New York region, and its intention to rapidly expand ISDN service. It is introducing attractive pricing packages to spur adoption. Columbia University, for example, has struck an agreement with Nynex to offer any faculty or staff residing near the Morningside Heights campus (between 96th and 137th streets, and west of Morningside Drive and Central Park West) unlimited ISDN service for just $60 a month.
Cable providers such as Cablevision are working in collaboration with chip manufacturers such as Intel to rapidly develop and deploy cable modem trials, including set-top boxes that will enable Internet access via a television set. TCI's @Home cable modem service similarly promises to connect millions of users to the Internet "at speeds never before available to consumers using high-speed hybrid cable infrastructure" (@home, 1996).
Figure 1: Bandwidth Barriers
| Medium | Transmission Speed | Text Capability Per Second | Multimedia |
|---|---|---|---|
| Phone Modem | 14.4/28.8 Kbps | 1 story in NYT | Internet/online; no video; slow still pix/graphics; CU-SeeMe/VDOLive/RA |
| ISDN | 128 Kbps | 1 page of NYT | |
| T1 (4-wire copper) | 1.5 Mbps | 4 pages of NYT | VHS video |
| Cable Modem | 4 Mbps | 1 section of NYT | MPEG2 |
| ADSL | 6 Mbps | 2 sections of NYT | Multichannel MPEG2 |
| Ethernet | 10 Mbps | 1 day of NYT | Multichannel MPEG2 |
| Low-Power Radio* (entry level ATM) | 25 Mbps | 2 days of NYT | Near Broadcast Quality Video |
| T3 (optical, copper, coaxial cable) | 45 Mbps | 4 days of NYT | Near BQV |
| Optical Fiber: OC3 | 155 Mbps | 1 week of NYT | Broadcast Quality VOD |
| Optical Fiber: OC12 | 622 Mbps | 4 weeks of NYT | Broadcast Quality VOD |
| Optical Fiber: Commercial | 2.5 Gbps | 26 weeks of NYT | Multichannel BQVOD |
| Optical Fiber: Laboratory (Fujitsu Ltd.; Nippon Telegraph and Telephone; AT&T Research and Lucent Technologies) | 1 Tbps | 300 years of New York Times in one second | Virtual Reality (immersive, surround; quality beyond VRML) |

Key: ADSL: Asymmetric Digital Subscriber Line; ATM: Asynchronous Transfer Mode; bps: bits per second; K: 1,000; M: 1,000,000; G: 1,000,000,000; T: 1,000,000,000,000; CU-SeeMe: Cornell University's Internet video conferencing software
Advances in bandwidth also present a variety of economic questions and opportunities. Professor Eli Noam, director of the Columbia Institute for Tele-Information, has developed a computerized model for a spot market for bandwidth, using an approach he calls open spectrum access (Noam, 1995). "In practical terms, it would be a computer that sets access prices based on demand."
How will capacity for video on cybernetworks be allocated during peak periods?
Depending on the approach to video delivery on cybernetworks, the means of allocating capacity during peak periods varies. There are two primary approaches to delivering video on the Internet: downloading stored files vs. "streaming" video (and audio). Each presents special challenges and opportunities for bandwidth allocation. Downloading stored files is the older and simpler of the two. Here, digital video files are stored on a server (typically in MPEG1 or QuickTime Movie format) and when someone wants to access the video, they must download it in full before viewing it. The time it takes to download the file largely depends on two things: first, how large the file is (i.e., the larger the file, the longer it takes to download), and second, how much bandwidth is available, both on the server and client ends and in-between.
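As a rough illustration of these two factors, the ideal download time is simply the file size in bits divided by the speed of the slowest link on the path. The sketch below uses nominal link speeds from Figure 1; the 5-megabyte clip size is an assumed example, and real transfers would be slower because of protocol overhead and congestion.

```python
# Nominal link speeds in bits per second (taken from Figure 1).
LINK_SPEEDS_BPS = {
    "28.8 Kbps modem": 28_800,
    "ISDN": 128_000,
    "T1": 1_500_000,
    "Cable modem": 4_000_000,
    "T3": 45_000_000,
}


def download_seconds(file_bytes: int, link_bps: float) -> float:
    """Ideal transfer time: file size in bits divided by link speed."""
    return file_bytes * 8 / link_bps


CLIP_BYTES = 5_000_000  # assumed size of a short video clip (hypothetical)

for name, bps in LINK_SPEEDS_BPS.items():
    print(f"{name:>16}: {download_seconds(CLIP_BYTES, bps):8.1f} s")
```

Under these assumptions the same clip that takes roughly 23 minutes over a 28.8 Kbps modem arrives in under a second over T3.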
Techniques for accessing stored digital video files are improving and will continue to evolve. One important area of advancement is the development of tools to search the content of the video directly, rather than through traditional techniques based on searching textual descriptions of the content. These new approaches search on features inherent in the images themselves, such as color, texture, composition, and motion. Researchers at Columbia University are developing such tools. One publicly accessible example has been developed by a research team led by Prof. Shih-Fu Chang of the School of Engineering. Working on this team, doctoral student Jon Smith has developed content-based search engines called VisualSEEK and WebSEEK, which search the content of images, either still or moving, based on low-level features such as color histograms and texture mapping. This effort is available for public review on the World Wide Web at: http://www.vii.org/webseek/columbia.html
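To make the idea of searching by color histogram concrete, here is a minimal sketch of one common similarity measure, histogram intersection. It is an illustration only and is not taken from VisualSEEK or WebSEEK; the function names, the coarse 4-bins-per-channel quantization, and the tiny made-up "images" are all assumptions.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram over coarsely quantized RGB colors."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = len(pixels)
    return {color: n / total for color, n in counts.items()}


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(weight, h2.get(color, 0.0)) for color, weight in h1.items())


# Tiny made-up "images", each given as a list of (R, G, B) pixels.
sunset = [(250, 120, 30)] * 6 + [(40, 20, 90)] * 4
beach = [(245, 110, 20)] * 5 + [(30, 90, 200)] * 5
forest = [(20, 120, 40)] * 10

# Rank the candidate images by color similarity to the query image.
query = color_histogram(sunset)
ranked = sorted(
    {"beach": beach, "forest": forest}.items(),
    key=lambda item: histogram_intersection(query, color_histogram(item[1])),
    reverse=True,
)
print([name for name, _ in ranked])  # ['beach', 'forest']
```

The beach image ranks first because it shares the orange tones of the query, even though no textual description of either image was consulted.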
Content-based searching, indexing and cataloging will transform the ways people access video on cybernetworks, and will likely be an engine for transforming both the content and business of video on the Internet. They may transform access to the second form of video delivery on cybernetworks, streaming delivery, as well.
Streaming video/audio represents a second and more complex area of video on cybernetworks. Streaming involves the delivery of video or audio "in real time" through a number of techniques, including some that place several frames of video into a buffer on the client's hard drive and then begin playing the video while more frames are placed into the buffer. To the viewer, the video plays in approximately real time, without having to wait for an entire large video file to download.
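The buffering idea can be sketched with a small simulation. This is a toy model under stated assumptions, not the behavior of any particular product: time advances in ticks, the player consumes one frame per tick once playback starts, and the arrival pattern is invented to mimic bursty network delivery.

```python
def simulate_playback(arrivals_per_tick, prebuffer_frames, total_ticks):
    """Count stalled ticks when one frame is played per tick.

    arrivals_per_tick: frames received on each tick (the pattern repeats).
    prebuffer_frames: frames that must accumulate before playback starts.
    """
    buffered = 0
    playing = False
    stalls = 0
    for tick in range(total_ticks):
        buffered += arrivals_per_tick[tick % len(arrivals_per_tick)]
        if not playing:
            playing = buffered >= prebuffer_frames
            continue  # still filling the buffer; playback starts next tick
        if buffered > 0:
            buffered -= 1  # play one frame
        else:
            stalls += 1  # buffer ran dry: the picture freezes
    return stalls


BURSTY = [2, 0, 0, 0, 0, 4]  # averages one frame per tick, but unevenly

print(simulate_playback(BURSTY, prebuffer_frames=1, total_ticks=18))  # 2
print(simulate_playback(BURSTY, prebuffer_frames=4, total_ticks=18))  # 0
```

With the same average delivery rate, a deeper initial buffer trades a longer start-up wait for fewer freezes during playback.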
Streaming video/audio works particularly well for delivering live Internet "broadcasts." Some of the software products for delivering streaming video/audio are outlined in Figure 2.
As Figure 2 shows, most of the leading applications for video on the Internet today use streaming technologies. This is largely because streaming permits users/viewers to watch video clips without experiencing a long delay while the clip downloads. One important feature of certain approaches to streaming (e.g., that used by VIVO) is the tailoring of video streams to different levels of bandwidth, so a client with just a 28.8 kbps dial-up modem will access a video stream encoded at a lower frame rate (e.g., 5-7 frames per second, or FPS), while a client with more bandwidth available can access a stream with many more frames per second (e.g., 12-15).
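The selection among multiple encoded versions can be sketched as follows. The list of encodings and their data rates are hypothetical (only the 5-7 and 12-15 FPS ranges come from the text), and the fallback rule is an assumption rather than any vendor's documented behavior.

```python
# Hypothetical encodings of one clip: (required kbps, frames per second).
ENCODINGS = [(20, 5), (56, 10), (112, 15)]


def pick_encoding(available_kbps, encodings=ENCODINGS):
    """Return the highest-rate encoding that fits the client's bandwidth.

    Falls back to the lowest-rate version if nothing fits.
    """
    fitting = [e for e in encodings if e[0] <= available_kbps]
    return max(fitting) if fitting else min(encodings)


print(pick_encoding(28.8))  # dial-up modem -> (20, 5)
print(pick_encoding(128))   # ISDN          -> (112, 15)
```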
Figure 2: Video on the Internet
| Type | Product | Technical Approach | Selected Users |
|---|---|---|---|
| Streaming | VDOlive | Adaptive streaming system using 28.8 kbps data rate. Specialized advanced wavelet/compression encoding. Uses Internet Protocol datagrams. Live event, real-time encoding possible. Single data rate available. | CBS, PBS, CBC; NBC later in 1996 |
| Streaming | VIVO | Multiple encoded versions for varying bandwidths. Uses compression based on standard H.263 videoconferencing. Uses Internet Protocol reliable data stream (prone to stalls). Off-line encoding of video. Quality of multiple versions can be tuned for Intranets. | CNN, ABC, FirstTV, Intranets |
| Streaming | Xing StreamWorks | MPEG-1 based technology. Scalable data rate for multiple bandwidths. | PGA, Hootie and the Blowfish (few others) |
| Streaming | Multi-cast Backbone (M-Bone) | Streaming video server; requires reserving time. Higher quality because bandwidth is assigned for multicast. Few clients with software. Allows one-to-many, efficient distribution of streaming content. | Rolling Stones, other high-profile events, education. IETF (Internet Engineering Task Force) conferences |
| Streaming | CU-SeeMe | Hub and spoke topology using a "reflector." Cross platform, Mac and Windows. B&W version, free. Color version, commercial. | Education, group videoconferencing |
| Streaming | QuicktimeTV | Streaming using hub and spoke repeater feed. Color player free. Primarily for Apple computers, but Windows version expected in October 1996. | Primarily educational and general public. Videoconferencing. Webcasting musical concerts |
| Not streaming | Quicktime | MPEG1 technology, not streaming | The first widely used technology for video on the Internet, 1994-95 |
Additionally, although each of these technologies is ostensibly about delivering video via the Internet, the delivery of audio is perhaps more important. When video streams over a cybernetwork, the loss of a frame or two along the way has relatively little effect on the quality of the image (especially if the FPS is 10 or more). However, any loss of audio is readily detectable by even the average ear, and substantially degrades the perceived quality of service (QoS).
QoS of video on the Internet depends not only on frames per second and audio quality, which ranges from near AM to FM to stereo CD quality, but also on the size and resolution of the images, as well as the synchronization of the video and sound. Today's streaming technology varies widely along these dimensions, but the average user dialing up on a 28.8 kbps modem can expect little better than poor QoS on most dimensions (e.g., a 2-inch viewing screen, 30-72 pixels per inch resolution, 5-7 fps, near AM quality sound, and frequent loss of synchronization). All of this combines to make video on the Internet largely a novelty today. Tomorrow's technical and bandwidth advances (including the coming 56kbps or V.56 modem), however, will likely change this situation dramatically.
Most streaming technologies offer free players designed as plug-ins for the main World Wide Web browsers, Netscape Navigator and Microsoft Internet Explorer. The shake-out here will prove interesting in the coming months, as the David and Goliath struggle for control of the Internet plays itself out in the commercial marketplace, and Web inventor Tim Berners-Lee (now a research scientist at MIT) champions the notion of open platforms for the evolving third generation of the Web.
Notably, most applications of video on the Internet, and especially most content, are designed for computers running Windows, not the Apple Macintosh. This adds a slight edge to the Gates juggernaut.
Bandwidth allocation for video is largely managed through three alternative approaches. The first is the ATM (Asynchronous Transfer Mode) solution. ATM builds allocation directly into the protocol, but is not designed for the Internet, and is primarily used for point-to-point video transmission and conferencing, not broadcasting.
A second approach, used largely by streaming video applications on the Internet, is RSVP, the Resource ReSerVation Protocol. It is a way to provide end-to-end quality of service for isochronous (time-sensitive) traffic. It operates by reserving resources on all routers between the two communication end-points, allocating bandwidth accordingly. VIVO is one company looking to design its next generation product to the RSVP standard in order to better manage bit rate allocation during peak demand periods on the Internet.
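The core idea of reserving resources on every router along a path can be sketched as a simple all-or-nothing admission check. This toy model does not reproduce RSVP's actual message exchange; the router names, capacities, and function name are hypothetical.

```python
def reserve_path(path, capacity_kbps, reserved_kbps, request_kbps):
    """Reserve bandwidth on every router along the path, or on none.

    Returns True and updates reserved_kbps if all routers can admit the flow.
    """
    if any(reserved_kbps[r] + request_kbps > capacity_kbps[r] for r in path):
        return False  # one router lacks capacity, so the request is refused
    for r in path:
        reserved_kbps[r] += request_kbps
    return True


capacity = {"A": 1500, "B": 128, "C": 1500}  # hypothetical routers (kbps)
reserved = {name: 0 for name in capacity}

print(reserve_path(["A", "B", "C"], capacity, reserved, 100))  # True
print(reserve_path(["A", "B", "C"], capacity, reserved, 100))  # False
```

The second request fails because router B, the narrowest link, cannot carry two 100 kbps flows; no bandwidth is left reserved on A or C for the refused flow.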
An associated protocol is RTP, the Real-time Transport Protocol, which typically runs on top of UDP, the Internet's datagram service. It does not provide QoS guarantees or resource reservation. RTP is a very loosely specified protocol, intended mainly to provide a standard mechanism for exchanging real-time "payload" information between two parties, or among several parties using multicast.
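To show what such a standard mechanism for payload information looks like, the sketch below packs and parses the 12-byte fixed RTP header (version, marker bit, payload type, sequence number, timestamp, and source identifier) in front of a payload. It is a minimal illustration that assumes no padding, header extension, or contributing-source list; the function names and sample values are made up, and in practice the resulting bytes would be handed to a UDP socket.

```python
import struct


def build_rtp_packet(payload_type, seq, timestamp, ssrc, payload, marker=False):
    """Pack a 12-byte RTP fixed header (version 2) followed by the payload."""
    byte0 = 2 << 6  # version=2, padding=0, extension=0, CSRC count=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(
        "!BBHII", byte0, byte1, seq & 0xFFFF,
        timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF,
    )
    return header + payload


def parse_rtp_header(packet):
    """Unpack the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Sample values are arbitrary; payload type 31 is used here as an example.
packet = build_rtp_packet(31, seq=1, timestamp=90000, ssrc=0x1234,
                          payload=b"frame-data")
print(parse_rtp_header(packet))
```

The sequence number and timestamp let a receiver detect lost packets and schedule playback, but nothing in the header reserves bandwidth, which is why RTP is often paired with a mechanism such as RSVP.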
A third and newer approach is called tagging. Tagging has been developed by Cisco, the leading maker of Internet routers. Cisco's tagging approach is based on an assessment of certain performance characteristics, and bandwidth is allocated accordingly. Bandwidth allocation is especially critical as increasing amounts of video begin to clog the Internet backbone. The situation will only get worse as demand increases and content gets more demanding (i.e., video strains the system much more than text).
An important caveat should be added at this point. As network traffic increases, especially as video traffic grows, it is unclear whether the Internet will be able to handle the demand. Numerous reports have appeared in recent months warning of massive clogging of the Internet arteries (Murphy and Hofacker, 1996). Although it is possible that compression and other technological advances will alleviate the strain, it is equally possible that the network will collapse under the load.
Telecommunications expert John Carey (1996) notes an historical example to illustrate how this concern is not a new one.
"The wireless music box has no imaginable commercial value. Who would pay for a message sent to nobody in particular?"
Associates' response to David Sarnoff's
recommendation to invest in radio in the 1920s
What will be the funding mechanisms?
The economics of video on the Internet are even less settled than the technology. Four funding models are beginning to emerge, however. They are:
• Subscription
• Usage
• Advertising, and
• Transactional.
Subscription is the primary approach used in the related area of video-on-demand delivery in the financial services market, where companies such as Reuters, Dow Jones, Bloomberg and NBC Desktop Video are competing for a lucrative marketplace. These companies provide video to the desktop, primarily using dedicated T1 lines and satellite distribution, to many thousands of customers worldwide and especially on Wall Street, where financial traders need real-time data, including live video from press conferences such as Alan Greenspan making important Federal Reserve announcements.
These customers will pay top dollar for such information. Pam Snook of NBC Corporate Communications (Interview, September 18, 1996) reports that NBC Desktop Video has more than 27,000 customers each paying $75 a month for NBC Pro and $50 a month for NBC Private Financial Network. Some simple math reveals that this service is generating between $1.35 million and $2.025 million per month for just NBC Desktop, which is the oldest video service of the main competitors in this arena, and the only one to broadcast 24 hours a day. Snook reports that they are now expanding the market as a corporate research tool, and will provide software to capture video snippets off NBC Desktop to insert into Intranet presentations. They're working with Microsoft to deliver streaming video to the general market via the Internet for $9.95/month, and expect to have this service operational by the end of 1996. NBC's main web site will be integrating video, as well, Snook adds. Like the other networks, NBC has an enormous video archive, especially for the Olympics. They will soon be passing streaming video via the Internet to the network's 215 affiliates around the nation. They are experimenting with MCI on a system where editors can pull up clips, preview them before downloading, and even pre-edit them. In a test with chip manufacturer Intel, NBC is working on a new product called Intercast, which will allow a viewer to sit at a PC, watch the program in the upper left-hand corner of the screen, and receive HTML (Hypertext Markup Language, the primary tool for creating Web documents) pages giving access to additional information. The first test of this technology linking traditional television with the Internet will be NBC's program "Homicide: Life on the Street." News may be next.
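The "simple math" behind the monthly revenue range can be written out as a short calculation. It assumes, as a simplification, that the bounds correspond to all 27,000 customers paying the lower price or all paying the higher price; the actual mix of subscriptions is not given.

```python
SUBSCRIBERS = 27_000            # "more than 27,000 customers"
PRICE_LOW, PRICE_HIGH = 50, 75  # dollars per month for the two services

low = SUBSCRIBERS * PRICE_LOW    # every customer on the $50 service
high = SUBSCRIBERS * PRICE_HIGH  # every customer on the $75 service
print(f"${low:,} to ${high:,} per month")  # $1,350,000 to $2,025,000 per month
```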
Interestingly, like the 150-year-old fax, Intercast uses a technology that just won't die: the vertical blanking interval, which has a limited amount of largely unused bandwidth available for over-the-air or cable-delivered television services such as closed-captioning.
Usage-based funding offers the interesting possibility of charging based on the amount of video content accessed. The phone companies are particularly interested in this approach, because it's basically what they do best and how they make most of their money. Academics and others interested in the traditional open and democratic culture of the Internet find it particularly troubling, because it is likely to advantage those who can afford it most, and disadvantage those with fewer resources.
Of course, experimenting in the world of cybernetworks does not come without its financial risks, even to the major players. As Dr. Walter S. Baer, Deputy Vice President - Domestic Research for RAND and former head of videotext trials for Times Mirror, once observed: "The videotext trials taught many of us how to become millionaires. All we had to do was start with 40 million, and we were left with one." Of course, technological advances have made the barriers to entry much lower and less expensive, and the risks may not be so great, but the shake-out will still occur.
Advertising-supported video on the Internet is emerging as a basic approach to funding, although it is not likely to become a major force until Internet connectivity and video client diffusion are much greater (i.e., closer to 50% of the market). One exception may be highly targeted advertising, including advertisements that themselves use video to sell a product directly. This may eventually evolve into virtual reality applications where buyers may try out the product (e.g., test drive a car) online, and then order it directly. First Virtual Holdings, Inc., has already announced its plans to set the standard for this promising industry. As reported in the New York Times:
The First Virtual system, called virtual tags, embeds
the transaction process directly into multimedia
advertisements that can include animation, sound and
even video clips.
Consumers can view the sales pitch simply by waving the
mouse pointer over the ad. If they like what they see,
they can pay for merchandise on the spot, without
leaving the World Wide Web page they are viewing. (The
system is comparable to being able to touch an ad in a
newspaper and buy the leaf blower without having to stop
reading the paper and drive to Wal-Mart.)
Peter H. Lewis,
Advertising: Technology for the Cybermarketing Age,
The New York Times, September 18, 1996
This, of course, leads to the most likely avenue for the greatest revenues for video on cybernetworks: transactional services. This encompasses a broad range of applications, from buying adult videos via the Internet (probably the only consumer market for video on cybernetworks today that is really making money, just as adult video sales were a major economic engine in the early stages of the video industry in the 1980s), to testing products and ordering them directly via credit card purchase, to interacting with online therapists who charge for their cybernetwork services delivered interactively with two-way live, streaming video and audio. Virtual reality pioneer Jaron Lanier contends that one of the most promising funding models is "nano-transactional" (Lanier, 1996). Nano-transactional funding refers to placing extremely small charges on virtually all content on the Internet, whether video or otherwise. Nano-transactional funding offers the advantage of making video and other content affordable to virtually everyone, yet income-generating for all content providers.
Effect on traditional carriers
Emerging video applications on cybernetworks will likely have at least three major effects on traditional programming (i.e., video) carriers. They are:
• Content changes
• Changing economics
• Paradigmatic shifts for the communication industry
Content is already changing and will likely change in both anticipated and unanticipated ways. First, we're seeing the development of a much more integrated news and entertainment cycle throughout traditional media and cybernetworks. MSNBC is a prime example: one of its strategic goals is to keep viewers in the MS/NBC cycle anytime, anywhere, whether over the air, via cable, or through the Internet. Perhaps more importantly, we may finally see the emergence of truly effective interactive video content, for everything from news to entertainment to training. Rather than an add-on, interactivity may become a part of the design of new video content, where viewers may watch a program, interact with each other at remote locations, and even ask questions of real or automated subject experts or celebrities. They may also get on-demand video to follow up on stories they see on the nightly news that capture their interest.
The economics of traditional media will likely change, as well. Transactional models will emerge that are based on the dual abilities of cybernetworks integrated with traditional carriers to provide targeted, on-demand video. Customers will find it increasingly attractive and simple to go directly to sellers to see, test and buy the desired product.
Together, these changes represent a paradigmatic shift in the nature of mediated communication. Although it will likely take many years if not decades to evolve, cybernetworks and their ability to deliver switched video will make a new model of communication possible. Combining both the on-demand and targeted capability of switched telecommunications and the compelling nature of real-time video, this new media environment will offer great promise to those bold enough to enter it. The promise is a powerful communication system capable of delivering all forms of communication anytime, anywhere, and with potentially great profit. Illustrative of this future is First-TV, which uses VIVO as its platform for delivering video content via the Internet. First-TV is "The first Internet-only television network to feature original programming...using new technology that allows programs to be seen on a basic home computer" (Reuters, September 18, 1996). One important scenario for the future may be the development of "advertising" that the buyer generates rather than advertising broadcast by the seller. Using the interactive capabilities of the cybernetworks, buyers will be able to request advertising content and have it tailored to their situation, interests and needs, eliminating the need for sellers to send advertising messages where they may not be wanted. This cultural shift in the economics of cybernetworks may turn the world of traditional media, such as television, on its head.
References