MP4Box can natively import many multimedia formats (AVI, MP3, ...) into ISO files. For non-natively supported media formats, MP4Box defines two generic multiplexing languages: a binary format NHNT and a more powerful XML language called NHML.




NHNT Format

The NHNT format was developed during the MPEG-4 Systems implementation phase as a way to easily mux unknown media formats into an MP4 file or an MPEG-4 multiplex. The goal was to have the media encoder produce a description of the media time fragmentation (access units and timestamps) that could be reused by a media-unaware MPEG-4 multiplexer.

An NHNT source is composed of 2 or 3 parts:
  • the media file: This file contains all the media data as written by the encoder. The file extension must be .media.
  • the NHNT (meta) file: This file contains all the information needed by the MPEG-4 multiplexer to use the media data. The file extension must be .nhnt.
  • the decoder initialization file: If the media format requires decoder configuration data (MPEG-4 Visual, AAC, AVC/H264, ...), the binary data is put in this third file so that the MPEG-4 multiplexer can correctly signal the decoder configuration. This is necessary because in MPEG-4 Systems, configuration data is never sent in-band in the media stream, but through the object descriptor stream. The file extension must be .info.
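
As a small illustration of the naming rules above, the following Python sketch derives the expected companion file names from an .nhnt path. The helper name `nhnt_companions` is hypothetical, and the sketch assumes that the three files share the same base name, which the text above does not state explicitly.

```python
from pathlib import Path


def nhnt_companions(nhnt_path):
    """Return the (.media, .info) paths expected next to an .nhnt file.

    Assumption: all parts share one base name. The .info file is optional,
    so the second path may not exist on disk.
    """
    path = Path(nhnt_path)
    if path.suffix != ".nhnt":
        raise ValueError("expected a file with the .nhnt extension")
    return path.with_suffix(".media"), path.with_suffix(".info")
```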

The NHNT file format

An NHNT file is made of a header followed by a set of access unit descriptors. All integers are written in network byte order (big-endian).

Header Syntax

char Signature[4];
bit(8) version;
bit(8) streamType;
bit(8) objectTypeIndication;
bit(16) reserved = 0;
bit(24) bufferSizeDB;
bit(32) avgBitRate;
bit(32) maxBitRate;
bit(32) timeStampResolution;

Semantics

  • Signature : identifies the file as an NHNT file. The signature must be 'NHnt', or 'NHnl' for large files (using 64-bit offsets and timestamps).
  • version : identifies the NHNT version used to produce the file. Default version is 0.
  • streamType : identifies the media streamType as specified in MPEG-4 (0x04: Visual, 0x05: audio, ...). Officially supported stream types are listed here.
  • objectTypeIndication : identifies the media type as specified in MPEG-4. For example, 0x40 for MPEG-4 AAC. Officially supported object types are listed here.
  • bufferSizeDB : indicates the size of the decoding buffer for this stream in bytes.
  • avgBitRate : indicates the average bitrate in bits per second of this elementary stream. For streams with variable bitrate this value shall be set to zero.
  • maxBitRate : indicates the maximum bitrate in bits per second of this elementary stream in any time window of one second duration.
  • timeStampResolution : indicates the unit in which the media timestamps are expressed in the file (timeStampResolution ticks = 1 second).
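
The header layout above adds up to 24 bytes (4 + 1 + 1 + 1 + 2 + 3 + 4 + 4 + 4). A minimal Python sketch of reading it could look as follows; the names `NhntHeader` and `parse_nhnt_header` are illustrative only and are not part of GPAC.

```python
import struct
from typing import NamedTuple

HEADER_SIZE = 24  # total size in bytes of the fields listed in the header syntax


class NhntHeader(NamedTuple):
    signature: bytes
    version: int
    stream_type: int
    object_type_indication: int
    buffer_size_db: int
    avg_bit_rate: int
    max_bit_rate: int
    time_stamp_resolution: int


def parse_nhnt_header(data: bytes) -> NhntHeader:
    """Decode the fixed-size NHNT header from the first bytes of a file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("NHNT header is truncated")
    # ">" selects network byte order without padding; the 24-bit
    # bufferSizeDB field is read as 3 raw bytes and converted below.
    sig, version, stream_type, oti, _reserved, buf = struct.unpack(
        ">4sBBBH3s", data[:12]
    )
    avg, peak, resolution = struct.unpack(">III", data[12:HEADER_SIZE])
    if sig not in (b"NHnt", b"NHnl"):
        raise ValueError("not an NHNT file")
    return NhntHeader(
        sig, version, stream_type, oti,
        int.from_bytes(buf, "big"), avg, peak, resolution,
    )
```

A caller would typically check whether `signature` is `b"NHnl"` to decide which sample header layout follows.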

After the header, the file is just a succession of access unit (sample) info until the end of the file.

Sample Header Syntax for 'NHnt' files

bit(24) data_size;
bit(1) random_access_point;
bit(1) au_start_flag;
bit(1) au_end_flag;
bit(3) reserved = 0;
bit(2) frame_type;
bit(32) file_offset;
bit(32) compositionTimeStamp;
bit(32) decodingTimeStamp;

Sample Header Syntax for 'NHnl' files

bit(24) data_size;
bit(1) random_access_point;
bit(1) au_start_flag;
bit(1) au_end_flag;
bit(3) reserved = 0;
bit(2) frame_type;
bit(64) file_offset;
bit(64) compositionTimeStamp;
bit(64) decodingTimeStamp;

Semantics

  • data_size : indicates the amount of data to fetch from the source file for this access unit.
  • random_access_point : indicates if the access unit is a random access point.
  • au_start_flag : indicates if this is the start of an access unit or not.
  • au_end_flag : indicates if this is the end of an access unit or not.
  • frame_type : Used for bidirectional video coding sources only, 0 otherwise.
    • frame_type=2: access unit is a B-frame
    • frame_type=1: access unit is a P-frame
    • frame_type=0: access unit is an I-frame
  • file_offset : indicates the position in the source file of the first byte to fetch for this data chunk.
  • compositionTimeStamp : indicates the composition (presentation) time stamp of this access unit.
  • decodingTimeStamp : indicates the decoding time stamp of this access unit.

Note : Samples must be described in decoding order in the NHNT file when using sample fragmentation. Otherwise, samples may be described out of order.
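
The two sample header layouts differ only in the width of the last three fields, so one reader can handle both. The sketch below is a hedged illustration, not GPAC code: `NhntSample` and `iter_nhnt_samples` are hypothetical names, the input is assumed to be the bytes that follow the 24-byte file header, and the flag bits are assumed to be packed most significant bit first, as is usual for this kind of syntax description.

```python
import struct
from typing import Iterator, NamedTuple


class NhntSample(NamedTuple):
    data_size: int
    random_access_point: bool
    au_start: bool
    au_end: bool
    frame_type: int
    file_offset: int
    cts: int
    dts: int


def iter_nhnt_samples(records: bytes, large: bool = False) -> Iterator[NhntSample]:
    """Yield sample descriptors; set large=True for 'NHnl' files."""
    tail = ">QQQ" if large else ">III"  # 64-bit or 32-bit offset, CTS, DTS
    record_size = 4 + struct.calcsize(tail)  # 16 bytes ('NHnt') or 28 ('NHnl')
    if len(records) % record_size:
        raise ValueError("trailing partial sample header")
    for pos in range(0, len(records), record_size):
        data_size = int.from_bytes(records[pos:pos + 3], "big")
        flags = records[pos + 3]  # assumed MSB-first: RAP, start, end, 3 reserved, 2 frame_type
        offset, cts, dts = struct.unpack_from(tail, records, pos + 4)
        yield NhntSample(
            data_size,
            bool(flags & 0x80),
            bool(flags & 0x40),
            bool(flags & 0x20),
            flags & 0x03,
            offset, cts, dts,
        )
```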

NHML Format

The NHNT format is a very useful tool for multiplexing data, but it is not user-friendly when dealing with complex cases such as multi-source media files or NHNT authoring (timing modification, data removal or insertion).

The NHML format was therefore developed at ENST to provide more control over the imported data source and to give the user the tools to easily modify the multiplexing process.

The NHML format is an XML-based description of a media file, just like NHNT, with some major enhancements. This format has been supported since GPAC 0.4.2.

To obtain sample NHML files, simply run MP4Box -nhml trackID srcFile

The NHML file format

Just like any XML file, the file must begin with the usual XML declaration. The file encoding SHALL BE UTF-8.

The root element of an NHML file is the NHNTStream element.

Syntax

<NHNTStream baseMediaFile="..." specificInfoFile="..." trackID="..." inRootOD="..." DTS_increment="..." timeScale="..." streamType="..." objectTypeIndication="..." mediaType="..." mediaSubType="..." width="..." height="..." parNum="..." parDen="..." sampleRate="..." numChannels="..." bitsPerSample="..." compressorName="..." codecVersion="..." codecRevision="..." codecVendor="..." temporalQuality="..." spatialQuality="..." horizontalResolution="..." verticalResolution="..." bitDepth="..." >
<NHNTSample />
...
<NHNTSample />
</NHNTStream>

Semantics

  • baseMediaFile : indicates the default location of the stream data. If not set, the file with the same name and extension .media is assumed to be the source.
  • specificInfoFile : indicates the location of the decoder configuration data if any.
  • trackID : indicates a desired trackID for this media when importing to IsoMedia. Value type: unsigned integer. Default Value: 0.
  • inRootOD : indicates if the imported stream is present in the InitialObjectDescriptor. Value type: "yes", "no". Default Value: "no".
  • DTS_increment : indicates a default time increment between two consecutive samples. Value type: unsigned integer. Default Value: 0.
  • timeScale : indicates the time scale in which the time stamps are given. Value type: unsigned integer. Default Value: 1000 or sample rate if specified.
  • streamType : identifies the media streamType as specified in MPEG-4 (0x04: Visual, 0x05: audio, ...). Officially supported stream types are listed here.
  • objectTypeIndication : identifies the media type as specified in MPEG-4. For example, 0x40 for MPEG-4 AAC. Officially supported object types are listed here.
  • mediaType : indicates the 4CC media type (handler) as used in IsoMedia. Not needed if streamType is specified. Value Type: 4 byte string. Officially supported handler types are listed here.
  • mediaSubType : indicates the 4CC media subtype (codec) to use in IsoMedia. This subtype will identify the sample description used (stsd table). Not needed if streamType is specified. Value Type: 4 byte string. Officially supported codec types are listed here.
  • width, height : indicate the dimensions of a visual media. Ignored if the media is not video (streamType 0x04 or mediaType "vide"). Value Type: unsigned integer.
  • parNum, parDen : indicate the pixel aspect ratio of a visual media. Ignored if the media is not video (streamType 0x04 or mediaType "vide"). Value Type: unsigned integer.
  • sampleRate : indicates the sample rate of an audio media. Ignored if the media is not audio (streamType 0x05 or mediaType "soun"). Value Type: unsigned integer.
  • numChannels : indicates the number of channels of an audio media. Ignored if the media is not audio (streamType 0x05 or mediaType "soun"). Value Type: unsigned integer.
  • bitsPerSample : indicates the number of bits per audio sample for an audio media. Ignored if the media is not audio (streamType 0x05 or mediaType "soun"). Value Type: unsigned integer.

All other parameters are used when creating a custom sample description in IsoMedia (i.e., not using MPEG-4 streamType and objectTypeIndication). Their semantics are given in the QT (and IsoMedia) file format specification.
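
To make the attribute list more concrete, here is a minimal Python sketch that builds an NHNTStream root element for an audio stream and serializes it as UTF-8 with an XML declaration. The helper `build_nhml_root`, the file name and the attribute values are illustrative assumptions. Numeric values are written in decimal here, since the text above does not say whether hexadecimal notation is accepted.

```python
import xml.etree.ElementTree as ET


def build_nhml_root(base_media_file, **attrs):
    """Create an NHNTStream element carrying the given attributes."""
    root = ET.Element("NHNTStream", baseMediaFile=base_media_file)
    for name, value in attrs.items():
        root.set(name, str(value))  # XML attribute values are always strings
    return root


# Hypothetical AAC audio stream: streamType 0x05, objectTypeIndication 0x40.
root = build_nhml_root(
    "audio.media",
    streamType=0x05,
    objectTypeIndication=0x40,
    sampleRate=44100,
    numChannels=2,
    timeScale=44100,
)
# xml_declaration requires Python 3.8 or later.
xml_bytes = ET.tostring(root, encoding="utf-8", xml_declaration=True)
```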

Each access unit is then described with an NHNTSample element.

Syntax

<NHNTSample DTS="..." CTSOffset="..." isRAP="..." isSyncShadow="..." mediaOffset="..." dataLength="..." mediaFile="..." xmlFrom="..." xmlTo="..." />

Semantics

  • DTS : decoding time stamp of the sample. If not set, the previous sample DTS (or 0) plus the specified DTS_increment is used. Value type: unsigned integer. Default Value: 0.
  • CTSOffset : offset between the decoding and the composition time stamp of the sample. Value type: unsigned integer. Default Value: 0.
  • isRAP : indicates if the sample is a random access point or not. Value type: "yes", "no". Default Value: "no".
  • isSyncShadow : indicates if the sample is a sync shadow sample (IsoMedia storage only). Value type: "yes", "no". Default Value: "no".
  • mediaOffset : indicates the position of the first byte of this sample in the media source file. Value type: unsigned integer. Default Value: 0.
  • dataLength : indicates the size of this sample. Value type: unsigned integer. Default Value: 0.
  • mediaFile : indicates the media source file to use. If not set, the baseMediaFile is used.
  • xmlFrom : if the source file is XML data, indicates the location of the first element to copy from the XML document. The location can be "doc.start", "elt_id.start" or "elt_id.end". Elements are identified through their "id", "xml:id" or "DEF" attributes.
  • xmlTo : if the source file is XML data, indicates the location of the last element to copy from the XML document. The location can be "doc.end", "elt_id.start" or "elt_id.end". Elements are identified through their "id", "xml:id" or "DEF" attributes.
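
The timing rules for DTS, DTS_increment and CTSOffset can be sketched in Python as follows. This is an illustration under stated assumptions rather than the importer's actual logic: the document content and the function name `resolve_timestamps` are made up, and a first sample without a DTS attribute is assumed to start at 0 (the rule above could also be read as 0 plus DTS_increment).

```python
import xml.etree.ElementTree as ET

# Hypothetical NHML document: 25 fps video with a 1000-tick time scale.
NHML_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<NHNTStream baseMediaFile="video.media" timeScale="1000" DTS_increment="40">
<NHNTSample DTS="0" isRAP="yes" dataLength="1200" />
<NHNTSample CTSOffset="80" mediaOffset="1200" dataLength="300" />
<NHNTSample DTS="200" mediaOffset="1500" dataLength="250" />
<NHNTSample mediaOffset="1750" dataLength="260" />
</NHNTStream>"""


def resolve_timestamps(nhml: bytes):
    """Return one (DTS, CTS, is_rap) tuple per NHNTSample element."""
    root = ET.fromstring(nhml)
    increment = int(root.get("DTS_increment", "0"))
    previous = None
    resolved = []
    for sample in root.iter("NHNTSample"):
        explicit = sample.get("DTS")
        if explicit is not None:
            dts = int(explicit)
        elif previous is None:
            dts = 0  # assumption: a first sample without DTS starts at 0
        else:
            dts = previous + increment
        cts = dts + int(sample.get("CTSOffset", "0"))
        resolved.append((dts, cts, sample.get("isRAP", "no") == "yes"))
        previous = dts
    return resolved


timeline = resolve_timestamps(NHML_DOC)
```

In this example the second and fourth samples carry no DTS, so their decoding times are derived from the preceding sample plus DTS_increment.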

(C) 2000-05 JLF / (C) 2005-0X ENST - $Date: 2007/08/30 13:19:19 $ - Webmaster