Saturday, 23 April 2011

H.264/MPEG-4 AVC

From Wikipedia, the free encyclopedia

H.264, MPEG-4 Part 10, or AVC, for Advanced Video Coding, is a digital video codec standard which is noted for achieving very high data compression. It was written by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG) as the product of a collective partnership effort known as the Joint Video Team (JVT). The ITU-T H.264 standard and the ISO/IEC MPEG-4 Part 10 standard (formally, ISO/IEC 14496-10) are technically identical. The final drafting work on the first version of the standard was completed in May of 2003.
H.264 is a name related to the ITU-T line of H.26x video standards, while AVC relates to the ISO/IEC MPEG side of the partnership project that completed the work on the standard, after earlier development done in the ITU-T as a project called H.26L. It is common to refer to the standard as H.264/AVC (or AVC/H.264 or H.264/MPEG-4 AVC or MPEG-4/H.264 AVC) to emphasize the common heritage. The name H.26L, harking back to its ITU-T history, is far less common but still used. Occasionally, it has also been referred to as "the JVT codec", in reference to the JVT organization that developed it. (Such partnership and multiple naming is not unprecedented: the video codec standard known as MPEG-2 also arose from a partnership between MPEG and the ITU-T, and MPEG-2 video is also known in the ITU-T community as H.262.)
The intent of the H.264/AVC project was to create a standard that would be capable of providing good video quality at bit rates that are substantially lower (e.g., half or less) than what previous standards would need (e.g., relative to MPEG-2, H.263, or MPEG-4 Part 2), and to do so without so much of an increase in complexity as to make the design impractical (excessively expensive) to implement. An additional goal was to do this in a flexible way that would allow the standard to be applied to a very wide variety of applications (e.g., for both low and high bit rates, and low and high resolution video) and to work well on a very wide variety of networks and systems (e.g., for broadcast, DVD storage, RTP/IP packet networks, and ITU-T multimedia telephony systems).
The JVT recently completed the development of some extensions to the original standard that are known as the Fidelity Range Extensions (FRExt). These extensions support higher-fidelity video coding by supporting increased sample accuracy (including 10-bit and 12-bit coding) and higher-resolution color information (including sampling structures known as YUV 4:2:2 and YUV 4:4:4). Several other features are also included in the Fidelity Range Extensions project (such as adaptive switching between 4×4 and 8×8 integer transforms, encoder-specified perceptual-based quantization weighting matrices, efficient inter-picture lossless coding, support of additional color spaces, and a residual color transform). The design work on the Fidelity Range Extensions was completed in July of 2004, and the drafting was finished in September of 2004.
Since the completion of the original version of the standard in May of 2003, the JVT has also completed two generations of "corrigendum" errata corrections to the text of the standard.

Features

H.264/AVC/MPEG-4 Part 10 contains a number of new features that allow it to compress video much more effectively than older standards and to provide more flexibility for application to a wide variety of network environments. Key features include:
  • Multi-picture motion compensation using previously-encoded pictures as references in a much more flexible way than in past standards, thus allowing up to 32 reference pictures to be used in some cases (unlike in prior standards, where the limit was typically one or, in the case of conventional "B pictures", two). This particular feature usually allows modest improvements in bit rate and quality in most scenes. But in certain types of scenes, for example scenes with rapid repetitive flashing or back-and-forth scene cuts or uncovered background areas, it allows a very significant reduction in bit rate.
  • Variable block-size motion compensation (VBSMC) with block sizes as large as 16×16 and as small as 4×4, enabling very precise segmentation of moving regions.
  • Six-tap filtering for derivation of half-pel luma sample predictions, in order to lessen aliasing and ultimately provide sharper images.
  • Macroblock pair structure, allowing 16x16 macroblocks in field mode (vs. 16x8 in MPEG-2).
  • Quarter-pixel precision for motion compensation, enabling very precise description of the displacements of moving areas. For chroma, the resolution is typically halved both horizontally and vertically (see 4:2:0), so the motion compensation precision is one-eighth of a chroma sample.
  • Weighted prediction, allowing an encoder to specify the use of a scaling and offset when performing motion compensation, and providing a significant benefit in performance in special cases—such as fade-to-black, fade-in, and cross-fade transitions.
  • An in-loop deblocking filter which helps prevent the blocking artifacts common to other DCT-based image compression techniques.
  • An exact-match integer 4×4 spatial block transform (similar to the well-known DCT design), and in the case of the new FRExt "High" profiles, the ability for the encoder to adaptively select between a 4×4 and 8×8 transform block size for the integer transform operation.
  • A secondary Hadamard transform performed on "DC" coefficients of the primary spatial transform (for chroma DC coefficients and also luma in one special case) to obtain even more compression in smooth regions.
  • Spatial prediction from the edges of neighboring blocks for "intra" coding (rather than the "DC"-only prediction found in MPEG-2 Part 2 and the transform coefficient prediction found in H.263+ and MPEG-4 Part 2).
  • Context-adaptive binary arithmetic coding (CABAC), which is a clever technique to losslessly compress syntax elements in the video stream knowing the probabilities of syntax elements in a given context.
  • Context-adaptive variable-length coding (CAVLC), which is a lower-complexity alternative to CABAC for the coding of quantized transform coefficient values. Although lower complexity than CABAC, CAVLC is more elaborate and more efficient than the methods typically used to code coefficients in other prior designs.
  • A common, simple, and highly structured variable-length coding (VLC) technique for many of the syntax elements not coded by CABAC or CAVLC, referred to as Exponential-Golomb (Exp-Golomb) coding.
  • A network abstraction layer (NAL) definition allowing the same video syntax to be used in many network environments, including features such as sequence parameter sets (SPSs) and picture parameter sets (PPSs) that provide more robustness and flexibility than provided in prior designs.
  • Switching slices (called SP and SI slices), features that allow an encoder to direct a decoder to jump into an ongoing video stream for such purposes as video streaming bit rate switching and "trick mode" operation. When a decoder jumps into the middle of a video stream using the SP/SI feature, it can get an exact match to the decoded pictures at that location in the video stream despite using different pictures (or no pictures at all) as references prior to the switch.
  • Flexible macroblock ordering (FMO, also known as slice groups) and arbitrary slice ordering (ASO), which are techniques for restructuring the ordering of the representation of the fundamental regions (called macroblocks) in pictures. Typically considered an error/loss robustness feature, FMO and ASO can also be used for other purposes.
  • Data partitioning (DP), a feature providing the ability to separate more important and less important syntax elements into different packets of data, enabling the application of unequal error protection (UEP) and other types of improvement of error/loss robustness.
  • Redundant slices (RS), an error/loss robustness feature allowing an encoder to send an extra representation of a picture region (typically at lower fidelity) that can be used if the primary representation is corrupted or lost.
  • A simple automatic process for preventing the accidental emulation of start codes, which are special sequences of bits in the coded data that allow random access into the bitstream and recovery of byte alignment in systems that can lose byte synchronization.
  • Supplemental enhancement information (SEI) and video usability information (VUI), which are extra information that can be inserted into the bitstream to enhance the use of the video for a wide variety of purposes.
  • Auxiliary pictures, which can be used for such purposes as alpha compositing.
  • Frame numbering, a feature that allows the creation of "sub-sequences" (enabling temporal scalability by optional inclusion of extra pictures between other pictures), and the detection and concealment of losses of entire pictures (which can occur due to network packet losses or channel errors).
  • Picture order count, a feature that serves to keep the ordering of the pictures and the values of samples in the decoded pictures isolated from timing information (allowing timing information to be carried and controlled/changed separately by a system without affecting decoded picture content).
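As a rough illustration of the six-tap half-pel filtering named in the list above, the following Python sketch interpolates the half-sample positions along one row of luma samples. The tap values (1, -5, 20, 20, -5, 1) with division by 32 are those used by H.264 for half-sample luma positions. The function name, the one-dimensional setting, and the 8-bit sample range are simplifying assumptions made for this example.

```python
from typing import List, Sequence

# Six-tap filter coefficients for half-sample luma interpolation (sum = 32).
TAPS = (1, -5, 20, 20, -5, 1)


def half_pel_row(samples: Sequence[int]) -> List[int]:
    """Interpolate the half-sample value between samples[i] and samples[i + 1].

    Simplified 1-D sketch: positions outside the row are clamped to the
    nearest edge sample, and results are clipped to the 8-bit range.
    """
    n = len(samples)
    out = []
    for i in range(n - 1):
        acc = 0
        for k, tap in enumerate(TAPS):
            j = min(max(i - 2 + k, 0), n - 1)  # clamp index to the row
            acc += tap * samples[j]
        value = (acc + 16) >> 5                # round and divide by 32
        out.append(min(max(value, 0), 255))    # clip to 0..255
    return out


print(half_pel_row([0, 0, 0, 255, 255, 255]))
```

On a sharp edge the negative taps produce intermediate values that overshoot before clipping, which is part of what lets the filter keep edges sharper than simple bilinear averaging would.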
These techniques, along with several others, help H.264 to perform significantly better than any prior standard can, under a wide variety of circumstances in a wide variety of application environments. H.264 can often perform radically better than MPEG-2 video—typically obtaining the same quality at half of the bit rate or less.
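The Exp-Golomb coding mentioned in the feature list can be sketched in a few lines of Python. An unsigned value v is written as the binary form of v + 1, preceded by one fewer zero bits than that binary form has digits. Signed values are first mapped to unsigned ones (positive v to 2v - 1, non-positive v to -2v). Representing bits as a text string of "0" and "1" characters, and the function names, are assumptions made for readability; a real codec packs bits into bytes.

```python
from typing import Tuple


def ue_encode(value: int) -> str:
    """Encode a non-negative integer as an unsigned Exp-Golomb bit string."""
    if value < 0:
        raise ValueError("unsigned Exp-Golomb codes need value >= 0")
    bits = bin(value + 1)[2:]             # binary digits of value + 1
    return "0" * (len(bits) - 1) + bits   # zero prefix, then the digits


def ue_decode(bitstring: str, pos: int = 0) -> Tuple[int, int]:
    """Decode one unsigned code starting at pos; return (value, next_pos)."""
    leading_zeros = 0
    while bitstring[pos + leading_zeros] == "0":
        leading_zeros += 1
    end = pos + 2 * leading_zeros + 1
    code_num = int(bitstring[pos + leading_zeros:end], 2)
    return code_num - 1, end


def se_encode(value: int) -> str:
    """Encode a signed integer by mapping it onto the unsigned codes."""
    mapped = 2 * value - 1 if value > 0 else -2 * value
    return ue_encode(mapped)


for v in range(5):
    print(v, ue_encode(v))
```

Small values get short codes (0 is a single "1" bit), which suits syntax elements whose most likely values are near zero.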
Like other ISO/IEC MPEG video standards, H.264/AVC has a reference software implementation that can be freely downloaded. Its main purpose is to give examples of H.264/AVC features, rather than being a useful application per se. Some reference hardware design work is also under way in MPEG.
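The start-code emulation prevention process listed among the features can be sketched as follows. Whenever two consecutive zero bytes in the payload would be followed by a byte with value 0x03 or less, the encoder inserts an extra 0x03 byte first, so the byte patterns reserved for start codes never appear inside the coded data; the decoder drops those inserted bytes again. This is a simplified sketch: the function names are hypothetical, and the special handling of a payload that ends in a zero byte is omitted.

```python
def add_emulation_prevention(payload: bytes) -> bytes:
    """Insert 0x03 after any two zero bytes that precede a byte <= 0x03."""
    out = bytearray()
    zeros = 0                      # count of consecutive zero bytes emitted
    for b in payload:
        if zeros >= 2 and b <= 0x03:
            out.append(0x03)       # emulation prevention byte
            zeros = 0
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)


def remove_emulation_prevention(data: bytes) -> bytes:
    """Drop each 0x03 byte that directly follows two zero bytes."""
    out = bytearray()
    zeros = 0
    for b in data:
        if zeros >= 2 and b == 0x03:
            zeros = 0              # skip the inserted byte
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)


raw = bytes([0x00, 0x00, 0x01, 0x42])
print(add_emulation_prevention(raw).hex())
```

After escaping, the sequence 0x000001 cannot occur inside the payload, so a receiver that has lost synchronization can scan for it safely.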

Profiles

The standard includes the following seven sets of capabilities, which are referred to as profiles, targeting specific classes of applications:
  • Baseline Profile (BP): Primarily for lower-cost applications demanding less computing resources, this profile is used widely in videoconferencing and mobile applications.
  • Main Profile (MP): Originally intended as the mainstream consumer profile for broadcast and storage applications, this profile faded in importance when the High Profile was developed for those applications.
  • Extended Profile (XP): Intended as the streaming video profile, this profile has relatively high compression capability and some extra tricks for robustness to data losses and server stream switching.
  • High Profile (HiP): The primary profile for broadcast and disc storage applications, particularly for high-definition television applications (this is the profile adopted into HD DVD and Blu-ray Disc, for example).
  • High 10 Profile (Hi10P): Going beyond today's mainstream consumer product capabilities, this profile builds on top of the High Profile — adding support for up to 10 bits per sample of decoded picture precision.
  • High 4:2:2 Profile (Hi422P): Primarily targeting professional applications that use interlaced video, this profile builds on top of the High 10 Profile — adding support for the 4:2:2 chroma sampling format while using up to 10 bits per sample of decoded picture precision.
  • High 4:4:4 Profile (Hi444P) [deprecated]: This profile builds on top of the High 4:2:2 Profile — supporting up to 4:4:4 chroma sampling, up to 12 bits per sample, and additionally supporting efficient lossless region coding and an integer residual color transform for coding RGB video while avoiding color-space transformation error. Note: The High 4:4:4 Profile is being removed from the standard in favor of developing a new improved 4:4:4 profile.

Feature                            | Baseline | Extended | Main | High | High 10 | High 4:2:2 | High 4:4:4
-----------------------------------+----------+----------+------+------+---------+------------+-----------
I and P Slices                     | Yes      | Yes      | Yes  | Yes  | Yes     | Yes        | Yes
B Slices                           | No       | Yes      | Yes  | Yes  | Yes     | Yes        | Yes
SI and SP Slices                   | No       | Yes      | No   | No   | No      | No         | No
Multiple Reference Frames          | Yes      | Yes      | Yes  | Yes  | Yes     | Yes        | Yes
In-Loop Deblocking Filter          | Yes      | Yes      | Yes  | Yes  | Yes     | Yes        | Yes
CAVLC Entropy Coding               | Yes      | Yes      | Yes  | Yes  | Yes     | Yes        | Yes
CABAC Entropy Coding               | No       | No       | Yes  | Yes  | Yes     | Yes        | Yes
Flexible Macroblock Ordering (FMO) | Yes      | Yes      | No   | No   | No      | No         | No
Arbitrary Slice Ordering (ASO)     | Yes      | Yes      | No   | No   | No      | No         | No
Redundant Slices (RS)              | Yes      | Yes      | No   | No   | No      | No         | No
Data Partitioning                  | No       | Yes      | No   | No   | No      | No         | No
Interlaced Coding (PicAFF, MBAFF)  | No       | Yes      | Yes  | Yes  | Yes     | Yes        | Yes
4:2:0 Chroma Format                | Yes      | Yes      | Yes  | Yes  | Yes     | Yes        | Yes
4:2:2 Chroma Format                | No       | No       | No   | No   | No      | Yes        | Yes
4:4:4 Chroma Format                | No       | No       | No   | No   | No      | No         | Yes
8 Bit Sample Depth                 | Yes      | Yes      | Yes  | Yes  | Yes     | Yes        | Yes
9 and 10 Bit Sample Depth          | No       | No       | No   | No   | Yes     | Yes        | Yes
11 and 12 Bit Sample Depth         | No       | No       | No   | No   | No      | No         | Yes
8x8 vs. 4x4 Transform Adaptivity   | No       | No       | No   | Yes  | Yes     | Yes        | Yes
Quantization Scaling Matrices      | No       | No       | No   | Yes  | Yes     | Yes        | Yes
Separate Cb and Cr QP control      | No       | No       | No   | Yes  | Yes     | Yes        | Yes
Monochrome Video Format            | No       | No       | No   | Yes  | Yes     | Yes        | Yes
Residual Color Transform           | No       | No       | No   | No   | No      | No         | Yes
Predictive Lossless Coding         | No       | No       | No   | No   | No      | No         | Yes
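One way to use the profile table is to ask which profiles offer every coding tool a given stream needs. The sketch below encodes a few rows of the table as strings of Y/N flags, one flag per profile, and filters the profile list. The data structure, the shortened row names, and the function name are illustrative assumptions; only the Yes/No values come from the table.

```python
from typing import Iterable, List

PROFILES = ["Baseline", "Extended", "Main", "High",
            "High 10", "High 4:2:2", "High 4:4:4"]

# One flag per profile, in PROFILES order, transcribed from the profile table.
SUPPORT = {
    "B slices":          "NYYYYYY",
    "SI and SP slices":  "NYNNNNN",
    "CABAC":             "NNYYYYY",
    "FMO":               "YYNNNNN",
    "Data partitioning": "NYNNNNN",
    "Interlaced coding": "NYYYYYY",
    "4:2:2 chroma":      "NNNNNYY",
    "10-bit samples":    "NNNNYYY",
    "8x8 transform":     "NNNYYYY",
}


def profiles_supporting(features: Iterable[str]) -> List[str]:
    """Return the profiles that support every feature in the request."""
    wanted = list(features)
    return [name for i, name in enumerate(PROFILES)
            if all(SUPPORT[f][i] == "Y" for f in wanted)]


print(profiles_supporting(["CABAC", "8x8 transform"]))
print(profiles_supporting(["FMO", "CABAC"]))
```

The second query returns an empty list: according to the table, no single profile combines flexible macroblock ordering with CABAC.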

Levels

Bit rates are maximum video bit rates (VCL) for the named profiles. MB = macroblocks; BP, XP, MP = Baseline, Extended and Main Profile; Hi422P, Hi444P = High 4:2:2 and High 4:4:4 Profile. Examples give high resolution/frame rate combinations possible at each level.

Level | Max MB/s | Max frame (MB) | BP, XP, MP  | High          | High 10     | Hi422P, Hi444P | Examples
------+----------+----------------+-------------+---------------+-------------+----------------+---------------------------------------------
1     | 1485     | 99             | 64 kbit/s   | 80 kbit/s     | 192 kbit/s  | 256 kbit/s     | 128x96/30.9, 176x144/15.0
1b    | 1485     | 99             | 128 kbit/s  | 160 kbit/s    | 384 kbit/s  | 512 kbit/s     | 128x96/30.9, 176x144/15.0
1.1   | 3000     | 396            | 192 kbit/s  | 240 kbit/s    | 576 kbit/s  | 768 kbit/s     | 176x144/30.3, 320x240/10.0
1.2   | 6000     | 396            | 384 kbit/s  | 480 kbit/s    | 1152 kbit/s | 1536 kbit/s    | 176x144/60.6, 320x240/20.0, 352x288/15.2
1.3   | 11880    | 396            | 768 kbit/s  | 960 kbit/s    | 2304 kbit/s | 3072 kbit/s    | 352x288/30.0
2     | 11880    | 396            | 2 Mbit/s    | 2.5 Mbit/s    | 6 Mbit/s    | 8 Mbit/s       | 352x288/30.0
2.1   | 19800    | 792            | 4 Mbit/s    | 5 Mbit/s      | 12 Mbit/s   | 16 Mbit/s      | 352x480/30.0, 352x576/25.0
2.2   | 20250    | 1620           | 4 Mbit/s    | 5 Mbit/s      | 12 Mbit/s   | 16 Mbit/s      | 720x480/15.0, 352x576/25.6
3     | 40500    | 1620           | 10 Mbit/s   | 12.5 Mbit/s   | 30 Mbit/s   | 40 Mbit/s      | 720x480/30.0, 720x576/25.0
3.1   | 108000   | 3600           | 14 Mbit/s   | 17.5 Mbit/s   | 42 Mbit/s   | 56 Mbit/s      | 1280x720/30.0, 720x576/66.7
3.2   | 216000   | 5120           | 20 Mbit/s   | 25 Mbit/s     | 60 Mbit/s   | 80 Mbit/s      | 1280x720/60.0
4     | 245760   | 8192           | 20 Mbit/s   | 25 Mbit/s     | 60 Mbit/s   | 80 Mbit/s      | 1920x1088/30.1, 2048x1024/30.0
4.1   | 245760   | 8192           | 50 Mbit/s   | 62.5 Mbit/s   | 150 Mbit/s  | 200 Mbit/s     | 1920x1088/30.1, 2048x1024/30.0
4.2   | 522240   | 8704           | 50 Mbit/s   | 62.5 Mbit/s   | 150 Mbit/s  | 200 Mbit/s     | 1920x1088/64.0, 2048x1088/60.0
5     | 589824   | 22080          | 135 Mbit/s  | 168.75 Mbit/s | 405 Mbit/s  | 540 Mbit/s     | 1920x1088/72.3, 2560x1920/30.7
5.1   | 983040   | 36864          | 240 Mbit/s  | 300 Mbit/s    | 720 Mbit/s  | 960 Mbit/s     | 1920x1088/120.5, 4096x2048/30.0
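The example resolutions and frame rates in the level table follow from the two macroblock limits. A frame of W x H pixels occupies ceil(W/16) x ceil(H/16) macroblocks, which must not exceed the maximum frame size, and the highest frame rate is the macroblock rate limit divided by that count. The Python sketch below reproduces this arithmetic, together with the per-profile bit rate factors implied by the table's columns (1.25 for High, 3 for High 10, 4 for High 4:2:2 and High 4:4:4). The names are illustrative, 1 Mbit/s is taken as 1000 kbit/s, and the other constraints a level imposes are ignored.

```python
import math
from typing import Optional

# level: (max macroblocks per second, max frame size in macroblocks,
#         max bit rate in kbit/s for Baseline, Extended and Main Profile)
LEVEL_LIMITS = {
    "1": (1485, 99, 64), "1b": (1485, 99, 128),
    "1.1": (3000, 396, 192), "1.2": (6000, 396, 384),
    "1.3": (11880, 396, 768), "2": (11880, 396, 2000),
    "2.1": (19800, 792, 4000), "2.2": (20250, 1620, 4000),
    "3": (40500, 1620, 10000), "3.1": (108000, 3600, 14000),
    "3.2": (216000, 5120, 20000), "4": (245760, 8192, 20000),
    "4.1": (245760, 8192, 50000), "4.2": (522240, 8704, 50000),
    "5": (589824, 22080, 135000), "5.1": (983040, 36864, 240000),
}

# Ratio of each profile's bit rate column to the Baseline column.
BIT_RATE_FACTOR = {
    "Baseline": 1.0, "Extended": 1.0, "Main": 1.0, "High": 1.25,
    "High 10": 3.0, "High 4:2:2": 4.0, "High 4:4:4": 4.0,
}


def macroblocks_per_frame(width: int, height: int) -> int:
    """Number of 16x16 macroblocks needed to cover a frame."""
    return math.ceil(width / 16) * math.ceil(height / 16)


def max_frame_rate(level: str, width: int, height: int) -> Optional[float]:
    """Highest frame rate the level allows, or None if the frame is too big."""
    max_mbps, max_fs, _ = LEVEL_LIMITS[level]
    mbs = macroblocks_per_frame(width, height)
    if mbs > max_fs:
        return None
    return max_mbps / mbs


def max_bit_rate_kbps(level: str, profile: str) -> float:
    """Maximum video bit rate (VCL) in kbit/s for a level and profile."""
    return LEVEL_LIMITS[level][2] * BIT_RATE_FACTOR[profile]


print(round(max_frame_rate("4", 1920, 1088), 1))
print(max_frame_rate("1", 352, 288))
print(max_bit_rate_kbps("5", "High"))
```

For example, 1920x1088 is 120 x 68 = 8160 macroblocks, so Level 4 allows 245760 / 8160, or about 30.1 frames per second, matching the table entry.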

Patent licensing

As with MPEG-2 Parts 1 and 2 and MPEG-4 Part 2 amongst others, the vendors of H.264/AVC products and services are expected to pay patent licensing royalties for the patented technology that their products use. The primary source of licenses for patents applying to this standard is a private organization known as MPEG-LA, LLC (which is not affiliated in any way with the MPEG standardization organization, but which also administers patent pools for MPEG-2 Part 1 Systems, MPEG-2 Part 2 Video, MPEG-4 Part 2 Video, and other technologies). Via Licensing also operates an H.264 patent pool. Some patent holders may not join either of the two licensing pools. (Licensing pools generally do not indemnify against third-party patents and cannot force patent-holders to join their pools.) This situation has caused reluctance to embrace H.264 among some potential adopters and may result in adoptions of alternative codecs that are believed to have lower licensing fees and lawsuit risks.

Applications

Both of the major candidate next-generation DVD rival formats planned for product deployment in 2006 include the H.264/AVC High Profile as a mandatory player feature — specifically:
  • The HD DVD format of the DVD Forum
  • The Blu-ray Disc format of the Blu-ray Disc Association (BDA)
The Digital Video Broadcast (DVB) standards body in Europe approved the use of H.264/AVC for broadcast television in Europe in late 2004.
The prime minister of France, Jean-Pierre Raffarin, announced the selection of H.264/AVC as a requirement for receivers of HDTV and pay TV channels for digital terrestrial broadcast television services (referred to as "TNT") in France in late 2004.
The Advanced Television Systems Committee (ATSC) standards body in the United States is considering the possibility of specifying one or two advanced video codecs for its optional Enhanced-VSB (E-VSB) transmission mode for use in U.S. broadcast television. It has included H.264/AVC and VC-1 in Candidate Standards (CS/TSG-659r1 and CS/TSG-658) for this purpose.
The Digital Multimedia Broadcast (DMB) service in the Republic of Korea will use H.264/AVC.
Mobile-segment terrestrial broadcast services of ISDB-T in Japan will use the H.264/AVC codec.
Direct broadcast satellite TV services will use the new standard.
The 3rd Generation Partnership Project (3GPP) has approved the inclusion of H.264/AVC as an optional feature in release 6 of its mobile multimedia telephony services specifications.
The Motion Imagery Standards Board (MISB) of the United States Department of Defense (DoD) has adopted H.264/AVC as its preferred video codec for essentially all applications.
The North Atlantic Treaty Organisation (NATO) similarly adopted H.264/AVC for its international military use.
The Internet Engineering Task Force (IETF) has completed a payload packetization format (RFC 3984) for carrying H.264/AVC video using its Real-time Transport Protocol (RTP).
The Internet Streaming Media Alliance (ISMA) has adopted H.264/AVC for its new ISMA 2.0 specifications.
The Moving Picture Experts Group (MPEG) has fully integrated support of H.264/AVC into its system standards (e.g., MPEG-2 and MPEG-4 systems) and its ISO media file format specification.
The International Telecommunication Union Telecommunication Standardization Sector (ITU-T) has adopted H.264/AVC in its H.32x suite of multimedia telephony systems specifications. Based on the ITU-T standards, H.264/AVC is already widely used for videoconferencing, including support in products of both of the dominant companies in that market (Polycom and Tandberg) and those of a number of other companies as well. Essentially all new videoconferencing products now include support for H.264/AVC.
H.264 will probably be used by various video-on-demand services on the Internet to provide films and television shows directly to computers, and may eventually replace the current H.262/MPEG-2 encoding in DVB terrestrial and satellite broadcasting.

Products and Implementations

Software implementations

  • x264 is a GPL-licensed H.264 encoder that is used in the free VideoLAN and MEncoder transcoding applications and, as of December 2005, remains the only reasonably complete open source and free software implementation of the standard, with support for Main Profile and High Profile except interlaced video. [1] A Video for Windows frontend is also available, but it has compatibility problems, as Video for Windows cannot correctly support certain features of the AVC standard. x264 is not likely to be incorporated into commercial products because of its license and the patent issues surrounding the standard itself. x264 won an independent video codec comparison organized by Doom9.org in December 2005. [2] A program pack called Gordian Knot uses x264 to encode ripped DVD video material.
  • The LGPL-licensed libavcodec includes an H.264 decoder. It can decode both Main Profile and High Profile video, except interlaced video. It is used in many programs, such as the free VLC media player and MPlayer multimedia players, and in the ffdshow and FFmpeg decoder projects.
  • Nero Digital, co-developed by Nero AG and Ateme, includes an H.264 encoder and decoder (as of September 2005, corresponding to Main Profile, except interlaced video support), along with other MPEG-4 compatible technologies. A version that also supports High Profile and interlaced video is to be released in early 2006.
  • Apple Computer has integrated H.264 into Mac OS X version 10.4 (Tiger), as well as QuickTime version 7, which was released on April 29, 2005 with Tiger. The encoder is claimed to be Main Profile but is actually Baseline Profile plus support for one B-frame; the decoder supports Baseline, Extended, and most of Main Profile [3]. QuickTime 7 is also now available for Microsoft's Windows operating system. Apple uses H.264 in the system for video playback and in iChat video conferences. In April 2005, Apple Computer updated its version of DVD Studio Pro to support authoring HD content, which employs AVC. DVD Studio Pro allows for the burning of HD DVD content to both standard DVDs and HD DVD media (despite the hardware being extremely scarce and the standard not yet being finalised). For playing back HD DVDs burnt onto a standard DVD, Apple requires a PowerPC G5 or Intel Core Duo, Apple DVD Player v4.6, and Mac OS X v10.4 or later.

Hardware implementations

Several companies are producing custom chips capable of decoding H.264/AVC video, including chips capable of real-time decoding at high-definition picture resolutions.
Such chips will allow widespread deployment of low-cost devices capable of playing H.264/AVC video at standard-definition and high-definition television resolutions.
Other hardware product offerings for H.264/AVC include these:
  • ASTRI IC Designs Group is producing a programmable media processor core, capable of real-time encoding and decoding of video with H.264/AVC and China standards in CIF resolution concurrently, for low-power portable multimedia and mobile applications.
  • NeoMagic also has a chip product called the MiMagic 6, which is targeted for low-power applications (and is thus not HD-capable).
  • C&S Technology has a Neptune chip product targeted for the Korean T-DMB market.
  • ATI Technologies' newest graphics processing unit (GPU), the Radeon X1000 series, features hardware acceleration of H.264 decoding starting in the Catalyst 5.13 drivers. H.264 decoding is one component of the ATI "AVIVO" multimedia technology. [4] [5] [6]
  • The Sony PSP 2.0 (PlayStation Portable with firmware 2.0+ and some custom firmware) handheld console features hardware decoding of video files in a proprietary even newer version of the H.264 format, the MPEG4-SP (Small Profile). It will not play standard H.264 files. The PSP also restricts playback of non-UMD video larger than roughly 320x240 (4:3) or 368x208 (16:9), despite having a screen capable of higher resolution.
  • Apple added H.264 video playback to their 5th Generation iPod on October 12, 2005. The new product uses this format, as well as MPEG-4 Part 2, for video playback. The video-enabled iPod uses the H.264 Baseline Profile with support of bit rates up to 768 kbit/s, image resolutions up to 320x240, and frame rates up to 30 frames per second.
  • WorldGate sells the Ojo videophone (formerly distributed by Motorola), which uses H.264 Baseline Profile at QCIF (176x144) image resolution with bit rates of 80 to 500 kbit/s, at a fixed frame rate of 30 frames per second.
  • Envivio, Inc. is shipping broadcast H.264 encoders for standard definition live encoding and off-line encoders for High Definition (720p, 1080i, 1080p). Envivio also supplies H.264 decoders for Windows, Linux and Macintosh as well as H.264 Video Servers and Authoring tools.
  • Modulus Video is shipping broadcast-quality H.264 standard definition real-time encoders to broadcasters (including telephone companies) and has announced its high definition real-time encoder (the ME6000) for shipment in mid-2005. The Modulus Video HD encoder technology was demonstrated at NAB in April 2004, where it won a "Pick Hit" award. The Modulus design uses technology from LSI Logic.

Send by Intramail Of RF Link Staff RCTI
ext : 2679