Recommendation for DVD-Video Karaoke
The application format for Karaoke usage, which is defined in "DVD Specifications for Read-Only Disc Part3: VIDEO SPECIFICATIONS", is specified so as to realize many functions specific to the Karaoke application of DVD-Video. WG1 of the DVD Forum introduces the following guidelines as a recommendation for DVD-Video Karaoke (hereinafter, "the Recommendation") so that manufacturers can make full use of the Karaoke application.
The Recommendation is useful and effective only when a DVD-Video Karaoke Disc and a DVD-Video Karaoke Player that both comply with the Recommendation are used in combination. We ask you to consider adopting it to encourage the prosperity of DVD-Video Karaoke. For details of the DVD-Video format, please refer to "DVD Specifications for Read-Only Disc Part3: VIDEO SPECIFICATIONS" by the DVD Forum.
The Recommendation is intended for AC-3 and/or Linear PCM. It may be applied to other coding modes; however, further recommendations are required for actual use. The DVD Forum is willing to prepare them in response to requests from the market.
Index
Chapter 1: Terminology
Chapter 2: Recommendation on the structure of DVD Karaoke ID Disc
2.1 Disc ID
2.2 VTS
2.3 Title structure
2.4 Audio stream
2.5 Attribute and Multi-channel extension
2.6 Cell type
2.7 Disc Menu and Navigation Command
2.8 Karaoke mode for AC-3 bit stream
Chapter 3: Recommendation on functions of DVD Karaoke ID Player
3.1 DVD-Video Player Function
3.2 DVD Karaoke Function
3.3 Karaoke ID mode
Chapter 1: Terminology
The Recommendation defines the following terms for the DVD-Video Karaoke Disc, the DVD-Video Karaoke Player, and the Player's functions.
- DVD Karaoke ID Disc
[This term means a DVD-Video Karaoke disc in compliance with the Recommendation.]
- DVD Karaoke ID Player
[This term means a DVD-Video Karaoke player in compliance with the Recommendation. In the Recommendation, a DVD-Video player without the Karaoke function is called a DVD-Video Player.]
- Karaoke ID mode
[This term means the reservation/playback function for Karaoke songs. This function is required to be implemented on the DVD Karaoke ID Player and to be valid only for the presentation of Karaoke songs on the DVD Karaoke ID Disc.]
In addition, we request that a Karaoke disc and a Karaoke player complying with the Recommendation state their compliance on the jacket/booklet or in the operation manual.
The following terms relate to a Karaoke song and are referred to by the Recommendation.
- Guide melody
[This term means the part of the music in which the Guide melody component of a Karaoke song is performed with musical instruments. The Guide melody guides the singer through the song.]
- Guide vocal
[This term means the part of the music in which the vocal part of a Karaoke song is sung as a practical model. When only one Guide vocal exists as an independent channel separated from L/R in a solo Karaoke song composed of multiple channels, it is called Guide vocal 1 (GV1). When two Guide vocals exist as independent channels separated from L/R in a duet Karaoke song, they are called Guide vocal 1 (GV1) and Guide vocal 2 (GV2), respectively.]
Chapter 2: Recommendation on the structure of DVD Karaoke ID Disc
The following shows the recommendation on the DVD Karaoke ID Disc.
2.1 Disc ID
- The Disc ID is required to be described contiguously from the MSB as shown in the following table. No "space" is allowed.
| Recording area | Recording field | Flag name | Size of Flag | Contents |
| VMGI | VMGI_MAT | PVR_ID | 32 bytes | DVDKARAOKEIDDISC-V1.0 |
[A disc on which the Disc ID is described is a disc in compliance with the Recommendation and is hereinafter called a "DVD Karaoke ID Disc".]
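As an illustration of the Disc ID rule, the following minimal sketch checks whether a 32-byte PVR_ID field carries the Disc ID. The function name and the way the field is obtained are hypothetical. The sketch also assumes that "contiguously from MSB" means the ID starts at the first byte of the field, and it does not check the remaining bytes, because the Recommendation text here does not say how they are filled.

```python
DISC_ID = b"DVDKARAOKEIDDISC-V1.0"  # Contents column of the Disc ID table
PVR_ID_SIZE = 32                    # size of the PVR_ID field in bytes


def is_karaoke_id_disc(pvr_id: bytes) -> bool:
    """Return True if a raw PVR_ID field starts with the Disc ID.

    Assumption: the ID occupies the leading bytes of the field; the
    bytes after the ID are not inspected.
    """
    if len(pvr_id) != PVR_ID_SIZE:
        return False
    return pvr_id.startswith(DISC_ID)
```

A player implementation could run such a check once, when the disc is loaded, before enabling Karaoke-specific behaviour.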
2.2 VTS
- The Application type of each VTS containing Karaoke songs is required to be set to 0001b.
| Recording area | Recording field | Flag name | Size of Flag | Contents |
| VTSI | VTS_CAT | Application type | 4 bits | 0001b |
[In the Recommendation, a VTS whose Application type is set to 0001b is called a "Karaoke VTS".]
- A DVD Karaoke ID Disc is required to include one or more Karaoke VTSs.
- VTS numbers for Karaoke VTSs are required to be assigned consecutive natural numbers starting with 1.
- A DVD Karaoke ID Disc may contain VTSs other than Karaoke VTSs. Their VTS numbers are also required to be assigned consecutive natural numbers, starting with the last Karaoke VTS number plus one.
The following table shows, as an example, the structure of a DVD Karaoke ID Disc containing four Karaoke VTSs and one other VTS.
| VTS No. | Application type | Contents |
| #1 | 0001b | Karaoke VTS |
| #2 | 0001b | Karaoke VTS |
| #3 | 0001b | Karaoke VTS |
| #4 | 0001b | Karaoke VTS |
| #5 | 0000b | Other VTS (non-Karaoke VTS) |
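The VTS numbering rules above can be sketched as a small check. This is only an illustration: the function name and the list-based input are assumptions, with element i of the list holding the Application type of VTS #(i+1).

```python
KARAOKE_APP_TYPE = 0b0001  # Application type of a Karaoke VTS


def check_vts_layout(app_types: list) -> bool:
    """Return True if the VTS order follows the numbering rules.

    app_types[i] is the Application type of VTS #(i+1).
    """
    karaoke_count = sum(1 for t in app_types if t == KARAOKE_APP_TYPE)
    if karaoke_count == 0:
        return False  # at least one Karaoke VTS is required
    # Karaoke VTSs must occupy numbers 1..karaoke_count, so every
    # other VTS is numbered after the last Karaoke VTS.
    return all(t == KARAOKE_APP_TYPE for t in app_types[:karaoke_count])
```

For the example table, `check_vts_layout([0b0001, 0b0001, 0b0001, 0b0001, 0b0000])` returns `True`.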
2.3 Title structure
- One Karaoke song is required to be a One_Sequential_PGC Title.
- Title numbers for Karaoke songs are required to be assigned consecutive natural numbers starting with 1.
[A DVD Karaoke ID Disc may contain at most 99 Karaoke songs in a Volume. In the Recommendation, each Karaoke song stored on a DVD Karaoke ID Disc is called a "Karaoke Title".]
- A DVD Karaoke ID Disc may also contain Titles other than Karaoke Titles. Their Title numbers are required to be assigned consecutive natural numbers starting with the last Karaoke Title number plus one.
The following table shows, as an example, the Title structure of a DVD Karaoke ID Disc containing seven Karaoke Titles and one other Title. Each Karaoke Title belongs to one of five Karaoke VTSs, and the other Title belongs to another VTS (non-Karaoke VTS).
| Title No. | Contents | PGC Structure | VTS No. | Application type |
| #1 | Karaoke song #1 | One_Sequential_PGC | #1 | 0001b |
| #2 | Karaoke song #2 | One_Sequential_PGC | #2 | 0001b |
| #3 | Karaoke song #3 | One_Sequential_PGC | #2 | 0001b |
| #4 | Karaoke song #4 | One_Sequential_PGC | #3 | 0001b |
| #5 | Karaoke song #5 | One_Sequential_PGC | #4 | 0001b |
| #6 | Karaoke song #6 | One_Sequential_PGC | #5 | 0001b |
| #7 | Karaoke song #7 | One_Sequential_PGC | #1 | 0001b |
| #8 | Non-Karaoke song | No restriction | #6 | 0000b |
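The Title rules can be checked in a similar way. The sketch below is illustrative only; the `Title` record and its field names are hypothetical and simply mirror the columns of the example table.

```python
from dataclasses import dataclass

MAX_KARAOKE_TITLES = 99  # maximum number of Karaoke songs in a Volume


@dataclass
class Title:
    is_karaoke: bool          # True for a Karaoke Title
    one_sequential_pgc: bool  # True if the Title is a One_Sequential_PGC Title


def check_titles(titles: list) -> list:
    """Return a list of rule violations; titles[i] is Title #(i+1)."""
    problems = []
    karaoke = [t for t in titles if t.is_karaoke]
    if len(karaoke) > MAX_KARAOKE_TITLES:
        problems.append("more than 99 Karaoke Titles")
    # Karaoke Titles must occupy Title numbers 1..N.
    if any(not t.is_karaoke for t in titles[:len(karaoke)]):
        problems.append("Karaoke Titles are not numbered ahead of other Titles")
    if any(not t.one_sequential_pgc for t in karaoke):
        problems.append("a Karaoke Title is not a One_Sequential_PGC Title")
    return problems
```

An empty result means that no violation of these three rules was found; it does not verify anything else about the disc.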
2.4 Audio stream
A Karaoke Title is required to have either of the following two types of audio stream, which enable the Guide vocal mixing function (Guide vocal mixing ON/OFF), or a combination of them.
- Channel-mixing type:
An audio stream of this type is composed of three or more channels, in which the Guide vocal is recorded as an independent channel separated from the L/R channels. This type realizes the Guide vocal mixing function by mixing the L/R channels with the independent channel.
- Stream-changing type:
An audio stream of this type is composed of two channels (stereo). Two kinds of audio stream are defined in this type. One is an audio stream without Guide vocal on its L/R channels (hereinafter, an "audio stream of Stream-changing type without Guide vocal"). The other is an audio stream with Guide vocal pre-mixed on its L/R channels (hereinafter, an "audio stream of Stream-changing type with Guide vocal").
The following are some recommended formations (patterns) for the audio streams of a Karaoke Title. One Karaoke Title may contain at most 8 streams, but it may not contain audio streams other than the above two types. Audio stream numbers for Karaoke Titles are always required to be assigned consecutive natural numbers starting with 1.
(Note) The following pattern numbers and their structures are used throughout the Recommendation.
Pattern 1: Channel-mixing type
A Karaoke Title contains at least one audio stream of Channel-mixing type as follows:
- The audio stream is required to be composed of three or more channels, in which the Guide vocal is recorded as an independent channel separated from the L/R channels, and the audio stream number is required to be set to 1. Each audio stream may contain at most five channels.
| Audio stream No. | Type of audio stream | Structure |
| #1 | Channel-mixing | Multi-channel composed of 3 or more channels. Guide vocal is recorded as an independent channel separated from L/R. |
- When a Karaoke Title contains plural audio streams of this type, their audio stream numbers are required to be assigned consecutive natural numbers starting with 1.
[The terms Guide vocal 1 (GV1)/Guide vocal 2 (GV2) defined in Chapter 1 apply to the corresponding channels in an audio stream of this type. Figure 1 shows a generic case composed of 5 channels for an AC-3 audio stream. For reference, the figure also covers how the audio stream is presented by a DVD Karaoke ID Player or by a DVD-Video Player.]
Fig. 1: An example of the audio stream for Channel-mixing type: 5ch audio stream coded by AC-3, and Player output
Note: For each mixing-coefficient value in the AC-3 bit stream, please refer to 2.8.
Pattern 2: Stream-changing type
A Karaoke Title contains at least two kinds of audio stream composed of 2ch (stereo) as follows:
- The audio stream number of Stream-changing type without Guide vocal is required to be set to 1, and that of Stream-changing type with Guide vocal is required to be set to 2.
| Audio stream No. | Type of audio stream | Structure |
| #1 | Stream-changing | 2ch stereo. Guide vocal is not recorded on L/R ch in the stream. |
| #2 | Stream-changing | 2ch stereo. Guide vocal is pre-mixed on L/R ch in the stream. |
[When a Karaoke Title contains three or more audio streams of Stream-changing type, the audio stream numbers of Stream-changing type without Guide vocal are required to be assigned consecutive natural numbers starting with 1. Those of Stream-changing type with Guide vocal are also required to be assigned consecutive natural numbers, starting with the last audio stream number of Stream-changing type without Guide vocal plus one.]
Pattern 3: Combination of Channel-mixing type and Stream-changing type
A Karaoke Title contains at least both an audio stream of Channel-mixing type and an audio stream of Stream-changing type with Guide vocal as follows:
- The audio stream number of Channel-mixing type is required to be set to 1 and that of Stream-changing type is required to be set to 2.
| Audio stream No. | Type of audio stream | Structure |
| #1 | Channel-mixing | Multi-channel composed of 3 or more channels. Guide vocal is recorded as an independent channel separated from L/R. |
| #2 | Stream-changing | 2ch stereo. Guide vocal is pre-mixed on L/R ch in the stream. |
[This structure composed of the above two streams realizes the following two ways of Guide vocal reproduction: one is mixing L/R with the Guide vocal while audio stream #1 is reproduced, and the other is changing the audio stream to be reproduced from #1 to #2.]
- When a Karaoke Title contains three or more audio streams of both Channel-mixing type and Stream-changing type, the audio stream numbers of Channel-mixing type are required to be assigned consecutive natural numbers starting with 1. Those of Stream-changing type are also required to be assigned consecutive natural numbers, starting with the last audio stream number of Channel-mixing type plus one.
[Pattern 3 does not require audio streams of Stream-changing type without Guide vocal. The following table shows the structure of this pattern as an example. When this pattern includes audio streams of Stream-changing type without Guide vocal, audio stream numbers are required to be assigned consecutive natural numbers in the following order: first, audio streams of Channel-mixing type; second, those of Stream-changing type without Guide vocal; and finally, those of Stream-changing type with Guide vocal.]
| Audio stream No. | Type of audio stream | Structure |
| #1 | Channel-mixing | Multi-channel composed of 3 or more channels. Guide vocal is recorded as an independent channel separated from L/R. |
| #2 | Channel-mixing | Multi-channel composed of 3 or more channels. Guide vocal is recorded as an independent channel separated from L/R. Note: the musical performance is different from that of #1, for example. |
| #3 | Stream-changing | 2ch stereo. Guide vocal is pre-mixed on L/R ch in the stream. |
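The stream-numbering rules of the three patterns can be summarised in a short sketch. The names `StreamType` and `classify_pattern` are hypothetical. The sketch assumes that Pattern 1 means a Title holding only Channel-mixing streams, and it returns `None` for any combination that the patterns above do not describe.

```python
from enum import IntEnum
from typing import Optional

MAX_AUDIO_STREAMS = 8  # maximum number of audio streams in a Karaoke Title


class StreamType(IntEnum):
    # The values encode the required numbering order.
    CHANNEL_MIXING = 0
    STREAM_CHANGING_WITHOUT_GV = 1
    STREAM_CHANGING_WITH_GV = 2


def classify_pattern(streams: list) -> Optional[int]:
    """Return the pattern number (1, 2 or 3), or None if no pattern fits.

    streams[i] is the type of audio stream #(i+1).
    """
    if not 1 <= len(streams) <= MAX_AUDIO_STREAMS:
        return None
    if list(streams) != sorted(streams):
        return None  # stream types are not numbered in the required order
    kinds = set(streams)
    has_mix = StreamType.CHANNEL_MIXING in kinds
    has_without = StreamType.STREAM_CHANGING_WITHOUT_GV in kinds
    has_with = StreamType.STREAM_CHANGING_WITH_GV in kinds
    if has_mix and has_with:
        return 3
    if has_mix and not has_without:
        return 1  # assumption: Pattern 1 holds Channel-mixing streams only
    if not has_mix and has_without and has_with:
        return 2
    return None
```

For the last example table (two Channel-mixing streams followed by one Stream-changing stream with Guide vocal), the function returns 3.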
2.5 Attribute and Multi-channel extension
2.5.1 Audio stream attribute
A Karaoke Title on a DVD Karaoke ID Disc may contain at most 8 Audio streams, the same as a DVD-Video Disc. The area describing the structure and specification of an Audio stream is VTS_AST_ATR. VTS_AST_ATR is required to be described for every Audio stream, and the flags in VTS_AST_ATR specific to the DVD-Video Karaoke application are required to be described as follows:
| Recording area | Recording field | Flag name | Size of Flag (bit) |
| VTSI | VTS_AST_ATR | Audio coding mode | 3 |
| | | Multi channel extension | 1 |
| | | Audio type | 2 |
| | | Audio application mode | 2 |
| | | Number of Audio channels | 3 |
| | | Application Information: Channel assignment mode | 3 |
| | | Application Information: Version number | 2 |
| | | Application Information: MC intro. | 1 |
| | | Application Information: Solo/Duet | 1 |
- Audio coding mode: Means the Audio coding system applied to the audio stream. It is required to be set to 000b when AC-3 is applied, or to 100b when LPCM is applied.
[Linear PCM may be applied only to a Karaoke Title whose audio stream type is Stream-changing.]
- Multi channel extension: Means whether or not the Multi channel extension information area, which describes the channel structure of each audio stream, exists. It is required to be set to 1b for an audio stream of Channel-mixing type and for one of Stream-changing type without Guide vocal, and to 0b for an audio stream of Stream-changing type with Guide vocal.
- Audio type: Means whether or not a Language code is applied to the audio stream. It is required to be set to 00b when a Language code is not applied.
- Audio application mode: Means the usage of the audio stream. It is required to be set to 01b for audio streams of Channel-mixing type and of Stream-changing type without Guide vocal, and to 00b for audio streams of Stream-changing type with Guide vocal.
The following show examples of the description of Multi channel extension and Audio application mode for each pattern.
Pattern 1:
| Audio stream No. | Type of audio stream | Structure |
| #1 | Channel-mixing | Multi-channel composed of 3 or more channels. Guide vocal is recorded as an independent channel separated from L/R. |
- #1: Audio application mode is required to be set to 01b and Multi channel extension is required to be set to 1b.
Pattern 2:
| Audio stream No. | Type of audio stream | Structure |
| #1 | Stream-changing | 2ch stereo. Guide vocal is not recorded on L/R ch in the stream. |
| #2 | Stream-changing | 2ch stereo. Guide vocal is pre-mixed on L/R ch in the stream. |
- #1: Audio application mode is required to be set to 01b and Multi channel extension is required to be set to 1b.
- #2: Audio application mode is required to be set to 00b and Multi channel extension is required to be set to 0b.
Pattern 3:
| Audio stream No. | Type of audio stream | Structure |
| #1 | Channel-mixing | Multi-channel composed of 3 or more channels. Guide vocal is recorded as an independent channel separated from L/R. |
| #2 | Stream-changing | 2ch stereo. Guide vocal is pre-mixed on L/R ch in the stream. |
- #1: Audio application mode is required to be set to 01b and Multi channel extension is required to be set to 1b.
- #2: Audio application mode is required to be set to 00b and Multi channel extension is required to be set to 0b.
| Audio stream No. | Type of audio stream | Structure |
| #1 | Channel-mixing | Multi-channel composed of 3 or more channels. Guide vocal is recorded as an independent channel separated from L/R. |
| #2 | Channel-mixing | Multi-channel composed of 3 or more channels. Guide vocal is recorded as an independent channel separated from L/R. Note: the musical performance is different from that of #1, for example. |
| #3 | Stream-changing | 2ch stereo. Guide vocal is pre-mixed on L/R ch in the stream. |
- #1: Audio application mode is required to be set to 01b and Multi channel extension is required to be set to 1b.
- #2: Audio application mode is required to be set to 01b and Multi channel extension is required to be set to 1b.
- #3: Audio application mode is required to be set to 00b and Multi channel extension is required to be set to 0b.
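The flag settings in the examples above depend only on the stream type and the coding system, which the following sketch makes explicit. The enum and function names are hypothetical, and the codec strings "AC-3" and "LPCM" are chosen here purely for illustration.

```python
from enum import Enum


class StreamType(Enum):
    CHANNEL_MIXING = "channel-mixing"
    STREAM_CHANGING_WITHOUT_GV = "stream-changing without Guide vocal"
    STREAM_CHANGING_WITH_GV = "stream-changing with Guide vocal"


def karaoke_flags(stream_type: StreamType) -> tuple:
    """Return (Audio application mode, Multi channel extension)."""
    if stream_type is StreamType.STREAM_CHANGING_WITH_GV:
        return 0b00, 0b0
    # Channel-mixing, and Stream-changing without Guide vocal
    return 0b01, 0b1


def audio_coding_mode(codec: str, stream_type: StreamType) -> int:
    """Return the Audio coding mode value for "AC-3" or "LPCM"."""
    if codec == "AC-3":
        return 0b000
    if codec == "LPCM":
        # Linear PCM is allowed only for Stream-changing streams.
        if stream_type is StreamType.CHANNEL_MIXING:
            raise ValueError("LPCM may not be used for Channel-mixing streams")
        return 0b100
    raise ValueError("coding mode not covered by the Recommendation: " + codec)
```

For instance, `karaoke_flags(StreamType.STREAM_CHANGING_WITH_GV)` gives `(0, 0)`, matching stream #2 of Pattern 2.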
- Number of Audio channels: Indicates the number of channels in the audio stream. It is required to be set according to the following table:
| Number of Audio channels | Content |
| 2ch | 001b |
| 3ch | 010b |
| 4ch | 011b |
| 5ch | 100b |
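In the table above, each Content value equals the channel count minus one, written as a 3-bit number. The sketch below relies on that observation for the four listed cases only; the function name is hypothetical, and values outside 2 to 5 channels are rejected because the table does not list them.

```python
def encode_channel_count(channels: int) -> str:
    """Return the 'Number of Audio channels' content, e.g. 3 -> '010b'."""
    if not 2 <= channels <= 5:
        raise ValueError("only 2 to 5 channels are listed in the table")
    # The listed values are (channels - 1) in 3-bit binary.
    return format(channels - 1, "03b") + "b"
```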
- Application Information: Describes the following information related to the audio stream for Karaoke usage.
- Channel assignment mode: Indicates the channel structure of the audio stream. When AC-3 is applied to the stream, it is required to be described according to the AC-3 ATSC Standard: ANNEX-C AC-3 KARAOKE MODE.
[M, V1, and V2 in the following table are notations based on ANNEX-C AC-3 KARAOKE MODE. M means the channel mainly for Guide melody, and it corresponds to the Center channel in a normal AC-3 stream. V1/V2 mean the channels mainly for Guide vocal, and they correspond to the Rear L/R channels in a normal AC-3 stream, respectively. The following table shows the relation between Channel assignment mode and M/V1/V2, and also shows, for reference, how M/V1/V2 are reproduced by a DVD-Video Player.]
| | | Channel structure | | | | | Output of DVD-Video Player | |
| | Channel assignment mode | [ACH0] 1st ch | [ACH1] 2nd ch | [ACH2] 3rd ch | [ACH3] 4th ch | [ACH4] 5th ch | Output of L ch | Output of R ch |
| 1 | 010b | L | R | | | | L | R |
| 2 | 011b | L | R | M | | | L+aM | R+aM |
| 3 | 100b | L | R | V1 | | | L | R |
| 4 | 101b | L | R | M | V1 | | L+aM | R+aM |
| 5 | 110b | L | R | V1 | V2 | | L | R |
| 6 | 111b | L | R | M | V1 | V2 | L+aM | R+aM |
Note: The coefficient a corresponds to "Cmixlev" defined in the AC-3 standard.
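The Channel assignment mode table can be read as a lookup from mode to channel layout, together with a simple mix rule for the DVD-Video Player output. The sketch below restates the table in that form; the names are hypothetical, samples are plain floating-point values for illustration, and `a` stands for the coefficient that the note relates to "Cmixlev".

```python
# Channel layout per Channel assignment mode, in ACH0..ACH4 order.
CHANNEL_ASSIGNMENT = {
    0b010: ("L", "R"),
    0b011: ("L", "R", "M"),
    0b100: ("L", "R", "V1"),
    0b101: ("L", "R", "M", "V1"),
    0b110: ("L", "R", "V1", "V2"),
    0b111: ("L", "R", "M", "V1", "V2"),
}


def dvd_video_player_output(mode: int, samples: list, a: float) -> tuple:
    """Return the (L, R) output of a DVD-Video Player for one sample frame.

    samples holds one value per channel in ACH order. Per the table,
    M is mixed into L and R with coefficient a, while V1 and V2 do not
    appear in the output.
    """
    layout = CHANNEL_ASSIGNMENT[mode]
    if len(samples) != len(layout):
        raise ValueError("sample count does not match the channel layout")
    channel = dict(zip(layout, samples))
    m = channel.get("M", 0.0)
    return channel["L"] + a * m, channel["R"] + a * m
```

How a DVD Karaoke ID Player mixes V1 and V2 is not covered by this table, so the sketch leaves it out.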
- Version number: Is normally required to be set to 00b (Version number is 1).
[Version number may be set up to 11b when a Karaoke Title has plural Audio streams whose musical performances differ from each other. In this case, Version numbers are also required to be assigned consecutive numbers starting with 0 (00b). Version number may therefore be used to distinguish at most four types of musical performance in a Karaoke Title.]
- MC intro.: Is normally required to be set to 0b (Intro-part is not included). Only when the audio stream includes an Intro-part that introduces the Karaoke Title with narration is this flag required to be set to 1b.
- Solo/Duet: Is required to be set to 0b for solo Karaoke Titles and to 1b for duet Karaoke Titles. When the content of a Karaoke Title is a duet, this flag is required to be set to 1b regardless of the existence of GV1/GV2 in the stream.
2.5.2 Multi channel extension
The DVD-Video format has a structure that can store many kinds of information related to the Karaoke application for each channel of a Karaoke audio stream (that is, one whose Audio application mode is set to 01b), and it defines that VTS_MU_AST_ATR may describe what content is recorded in each channel. The following are the application rules for VTS_MU_AST_ATR defined in the Recommendation.
- GM: Indicates whether or not Guide melody is pre-mixed on the L/R channels. However, it is always required to be set to 0b regardless of the existence of Guide melody on the L/R channels.
- GM1, GM2: Indicate whether or not Guide melody is recorded as the M channel (refer to Channel assignment mode), respectively. However, when Guide melody is recorded as the M channel, GM1 is always required to be set to 1b and GM2 to 0b, regardless of whether the song is a duet or a solo.
- GV1, GV2: Indicate whether or not Guide vocal is recorded as the V1/V2 channel (refer to Channel assignment mode), respectively. When Guide vocal is recorded as the V1 channel, GV1 is required to be set to 1b and GV2 to 0b. When Guide vocal is recorded as the V2 channel, GV1 is required to be set to 0b and GV2 to 1b.
- GMA, GMB: Indicate whether or not Guide melody is pre-mixed on the V1/V2 channel, respectively. However, GMA/GMB are always required to be set to 0b regardless of the existence of Guide vocal on the V1/V2 channel.
- SEA, SEB: Indicate whether or not a special component (effect sound, etc.) is pre-mixed on the V1/V2 channel, respectively. However, SEA/SEB are always required to be set to 0b regardless of the existence of that component on the V1/V2 channel.
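Applied to a channel layout, these rules leave only three flags that can ever be 1b: GM1 for the M channel, GV1 for the V1 channel, and GV2 for the V2 channel. The sketch below turns a layout into per-channel 4-bit contents. It assumes the bit order used in the examples of 2.5.3, where the leftmost flag name is the most significant bit. The value for V2 is derived from the GV2 rule and that bit order rather than copied from an example, and the names `ROLE_BITS` and `channel_contents` are hypothetical.

```python
# 4-bit content per channel role, most significant bit first.
ROLE_BITS = {
    "L": 0b0000,   # 0, 0, 0, GM        with GM always 0b
    "R": 0b0000,   # 0, 0, 0, GM        with GM always 0b
    "M": 0b0010,   # GV1, GV2, GM1, GM2 with GM1 = 1b, GM2 = 0b
    "V1": 0b1000,  # GV1 = 1b, GV2 = 0b, remaining flags 0b
    "V2": 0b0100,  # GV1 = 0b, GV2 = 1b (derived, see the lead-in)
}


def channel_contents(layout: tuple) -> list:
    """Return the contents for ACH0.. in order, e.g. ['0000b', ...]."""
    return [format(ROLE_BITS[role], "04b") + "b" for role in layout]
```

For the layout L, R, M, V1 this yields 0000b, 0000b, 0010b and 1000b, which agrees with the 4ch example for Channel assignment mode 101b.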
2.5.3 Examples of description for VTS_AST_ATR and VTS_MU_AST_ATR
The following show practical examples of the description of VTS_AST_ATR and VTS_MU_AST_ATR. In the following examples, AC-3 is applied as the audio coding mode.
Pattern 1:
- 3ch: Channel assignment mode is set to 100b: L, R, V1
| Recording area | Recording field | Flag name | Size of Flag (bit) | Content |
| VTSI | VTS_AST_ATR | Audio coding mode | 3 | 000b |
| | | Multi channel extension | 1 | 1b |
| | | Audio type | 2 | 00b |
| | | Audio application mode | 2 | 01b |
| | | Number of Audio channels | 3 | 010b |
| | | Application Information: Channel assignment mode | 3 | 100b |
| | | Application Information: Version number | 2 | 00b |
| | | Application Information: MC intro. | 1 | 0b |
| | | Application Information: Solo/Duet | 1 | 0b |
| | VTS_MU_AST_ATR | Audio channel contents | | |
| | | ACH0: 0, 0, 0, GM | 4 | 0000b |
| | | ACH1: 0, 0, 0, GM | 4 | 0000b |
| | | ACH2: GV1, GV2, GMA, SEA | 4 | 1000b |
| | | ACH3 | 4 | |
| | | ACH4 | 4 | |
- 4ch: Channel assignment mode is set to 101b: L, R, M, V1
| Recording area | Recording field | Flag name | Size of Flag (bit) | Content |
| VTSI | VTS_AST_ATR | Audio coding mode | 3 | 000b |
| | | Multi channel extension | 1 | 1b |
| | | Audio type | 2 | 00b |
| | | Audio application mode | 2 | 01b |
| | | Number of Audio channels | 3 | 011b |
| | | Application Information: Channel assignment mode | 3 | 101b |
| | | Application Information: Version number | 2 | 00b |
| | | Application Information: MC intro. | 1 | 0b |
| | | Application Information: Solo/Duet | 1 | 0b |
| | VTS_MU_AST_ATR | Audio channel contents | | |
| | | ACH0: 0, 0, 0, GM | 4 | 0000b |
| | | ACH1: 0, 0, 0, GM | 4 | 0000b |
| | | ACH2: GV1, GV2, GM1, GM2 | 4 | 0010b |
| | | ACH3: GV1, GV2, GMA, SEA | 4 | 1000b |
| | | ACH4 | 4 | |
- 4ch: Channel assignment mode is set to 110b: L, R, V1, V2
| Recording area | Recording field | Flag name | Size of Flag (bit) | Content |
| VTSI | VTS_AST_ATR | Audio coding mode | 3 | 000b |
| | | Multi channel extension | 1 | 1b |
| | | Audio type | 2 | 00b |
| | | Audio application mode | 2 | 01b |
| | | Number of Audio channels | 3 | 011b |
| | | Application Information: Channel assignment mode | 3 | 110b |
| | | Application Information: Version number | 2 | 00b |
| | | Application Information: MC intro. | 1 | 0b |
| | | Application Information: Solo/Duet | 1 | 1b |
| | VTS_MU_AST_ATR | Audio channel contents | | |
| | | ACH0: 0, 0, 0, GM | 4 | 0000b |
| | | ACH1: 0, 0, 0, GM | 4 | 0000b |
| ACH2
|
GV1
|
GV2
|
GMA
|
| |