KR20050090398A

KR20050090398A - Method and apparatus for selectable rate playback without speech distortion

Info

Publication number: KR20050090398A
Application number: KR1020057010993A
Authority: KR
Inventors: Srinivas Gutta
Original assignee: Koninklijke Philips Electronics N.V.
Priority date: 2002-12-16
Filing date: 2003-12-12
Publication date: 2005-09-13
Also published as: AU2003303005A8; WO2004056086A3; WO2004056086A2; AU2003303005A1; EP1576803A2; JP2006510304A; CN1726707A

Abstract

A method and an apparatus for selectable rate playback of a selected first portion of a separately stored synchronized video and audio content, without distortion of speech due to the selectable rate playback of the playback content and without loss of synchronization of the selected first portion of a separately stored synchronized video and audio content.

Description

METHOD AND APPARATUS FOR SELECTABLE RATE PLAYBACK WITHOUT SPEECH DISTORTION

The present invention relates generally to the field of television. More particularly, the present invention relates to an apparatus and method for playing back television programs at a selectable rate without distorting the audio portion of the program.

It is known to play back video content at a selectable rate from various storage media, such as a video cassette recorder (VCR). The audio portion of the playback content may be suppressed during selectable rate playback to avoid distortion of the audio portion. There is a need to play back the audio portion of the playback content without distortion during selectable rate playback. Hereinafter, "distortion" of the audio portion of the playback content means a loss of reception or reproduction fidelity caused by a change in the playback rate of the audio portion relative to the rate at which the audio portion was stored.

FIG. 1 depicts the function and logic of an apparatus for starting selectable rate playback of playback content, or for starting normal viewing, in accordance with embodiments of the present invention.

FIG. 2 depicts the function and logic of an apparatus for selectable rate playback of playback content, in accordance with embodiments of the present invention.

FIG. 3 depicts a playlist for selecting a first portion of separately stored synchronized video and audio playback content.

FIG. 4 depicts a graphical user interface (GUI) for selecting a first portion of separately stored synchronized video and audio playback content.

FIG. 5 depicts a method for selectable rate playback of playback content, in accordance with embodiments of the present invention.

The present invention provides a method for selectable rate playback of playback content, comprising:

selecting a first portion of separately stored video and audio playback content, wherein the playback content has been stored at a storage rate, the video and audio are synchronized as stored, and the separately stored synchronized video and audio content is retrievable for synchronized playback;

selecting a playback rate for the playback content from the selectable rates, wherein the selected playback rate differs from the storage rate;

tagging speech in the selected first portion of the playback content;

recognizing at least one phrase within the tagged speech; and

playing the first portion of the playback content at the playback rate, wherein the playing simultaneously retrieves the tagged speech, playing at the playback rate does not distort speech in the playback content even though the playback rate differs from the storage rate, and the video and audio remain synchronized at the playback rate during the playing.

A second embodiment of the present invention discloses an apparatus for selectable rate playback of playback content, comprising:

separately stored video and audio playback content, wherein the playback content has been stored at a storage rate;

a selected first portion of the separately stored video and audio playback content in a storage medium, wherein the selected first portion of the video and audio content is synchronized and a speech portion of the audio content is tagged;

a speech recognition device for tagging the speech portion of the audio content;

a phrase recognition device for determining valid words for a phrase from the tagged speech, wherein the valid words are linked within the phrase; and

a playback device for playing back the selected first portion of the playback content at a rate selected from the selectable rates, wherein the selected rate differs from the storage rate, playback at the selected rate simultaneously retrieves the tagged speech portion of the audio content, playback at the selected rate does not distort speech in the playback content even though the selected rate differs from the storage rate, and the video and audio content remain synchronized at the selected rate during the playback.

The present invention provides for playing back the audio portion of playback content without distortion during selectable rate playback.

Although certain preferred embodiments of the present invention are illustrated and described in detail, it should be understood that various changes and modifications may be made without departing from the scope of the appended claims. The scope of the present invention is therefore in no way limited to the number of constituent components, their materials, their shapes, their relative arrangement, and so on, which are disclosed simply as an example of the preferred embodiment. The features and advantages of the present invention are illustrated in detail in the accompanying drawings, in which like reference numerals refer to like elements throughout. Although the drawings are intended to illustrate the present invention, they are not necessarily drawn to scale.

The present invention relates generally to the field of television. More particularly, the present invention relates to an apparatus and method for selectable rate playback of selected video and audio playback content without distortion of speech caused by the selectable rate playback.

FIG. 1 is a flow diagram illustrating the functional and logical description of an apparatus 10 for selectable rate playback of playback content, in accordance with embodiments of the present invention and with the method for selectable rate playback shown as flowchart 70 in FIG. 5 and described herein. FIG. 1 illustrates that a user may continue normal viewing 61, such as viewing independently of the apparatus 10, or may "start" selectable rate playback at step 65. "Starting" 65 selectable rate playback of the playback content depends on three inputs: a "stop" 64 input for the selectable rate playback, a "pause" 67 input for the selectable rate playback, and a "selected rate" 49 input. The user may choose to provide these inputs 64, 67, 49 from a programmable logic controller (PLC) having appropriate software or, alternatively, from a central processing unit (CPU).

In one embodiment, when decision step 55 determines that playback has not been paused and decision step 50 determines that playback has not been stopped, the user may start selectable rate playback at step 65 by providing the "selected rate" 49 input. The "selected rate" 49 input may be slower or faster than the rate used for storing the playback content. In one embodiment, the "selected rate" 49 ranges from about 50% to about 150% of the rate used for storing the playback content or used for any other reason. However, the user may select any appropriate "selected rate" 49 that makes playback of the selected separately stored synchronized video and audio playback content 1 clearer or more understandable to a viewer or listener. Hereinafter, "selectable speed" or "selectable rate" means increasing or decreasing the playback speed or rate of the selected separately stored synchronized video and audio playback content 1, relative to the speed or rate at which that content was stored, without distortion of speech in the playback content, as depicted in FIG. 2 and described below.

Playback may be paused by providing the "pause" input 67 to decision step 55, and may be stopped by providing the "stop" input 64 to decision step 50. When playback is shown on an audio and video device such as a television, normal viewing 61 becomes available on that device when playback has been paused by the "pause" input 67 for longer than "x" minutes, or when playback has been stopped by the "stop" input 64. Hereinafter, normal viewing 61 means, for example, television operation or operation of any appropriate audio and video viewing device independently of the apparatus or method for selectable rate playback of the present invention.

When playback has been paused for longer than "x" minutes, the "pause" input 67 is provided to decision step 53, which allows normal viewing 61. Alternatively, if playback has not yet been paused for longer than "x" minutes, the "pause" input 67 returns to decision step 55 and then again to decision step 53, repeating until the "pause" input 67 is removed. When the pause input 67 is removed, the apparatus 10 proceeds to step 65 to "start" selectable rate playback. In one embodiment, "x" is less than two (2) minutes. Alternatively, "x" may be a period of less than five (5) minutes. The value of "x" may be any positive real number representing the number of minutes the user wishes to wait, after the "pause" input 67 has been provided to the apparatus 10, before automatically returning to normal viewing 61.

In one embodiment, decision step 50 determines whether the "stop" input 64 has been provided, which would allow normal viewing 61. If the answer at decision step 50 is yes, normal viewing 61 is available. Alternatively, if the "stop" 64 input has not been provided to decision step 50, the apparatus 10 moves to step 65 to "start" selectable rate playback.
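The decision logic of FIG. 1 described above can be summarized as a small state function. The following is a minimal sketch only, assuming boolean stop and pause inputs and a pause timer measured in minutes; the names `Mode`, `next_mode`, `paused_minutes`, and `x_minutes` are hypothetical and do not appear in the specification.

```python
from enum import Enum


class Mode(Enum):
    NORMAL_VIEWING = "normal viewing 61"
    PAUSED = "paused, waiting for the pause input to be removed"
    SELECTABLE_RATE = "selectable rate playback 65"


def next_mode(stop: bool, pause: bool, paused_minutes: float,
              x_minutes: float = 2.0) -> Mode:
    """Return the mode implied by the stop/pause inputs (hypothetical helper).

    stop           -- "stop" input 64 (checked at decision step 50)
    pause          -- "pause" input 67 (checked at decision step 55)
    paused_minutes -- how long playback has been paused so far
    x_minutes      -- the timeout "x"; 2.0 is only an illustrative default
    """
    if x_minutes <= 0:
        raise ValueError("x must be a positive number of minutes")
    if stop:                            # decision step 50: stopped
        return Mode.NORMAL_VIEWING
    if pause:                           # decision step 55: paused
        if paused_minutes > x_minutes:  # decision step 53: paused too long
            return Mode.NORMAL_VIEWING
        return Mode.PAUSED              # loop back through steps 55 and 53
    return Mode.SELECTABLE_RATE         # neither stopped nor paused: step 65


print(next_mode(stop=False, pause=True, paused_minutes=3.0))
```

Modelling the loop between steps 55 and 53 as a `PAUSED` state is a design choice of this sketch; the specification describes it only as the pause input cycling between the two decision steps.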

FIG. 2 depicts an extension of the apparatus 10 of FIG. 1 that adds a selection and tagging portion 9, a phrase and token recognition portion 2, and a selectable rate playback portion 4, in accordance with embodiments of the present invention and with the method for selectable rate playback shown as flowchart 70 in FIG. 5 and described below.

The selection and tagging portion 9 includes a selection engine 13, to which the "start" of selectable rate playback (step 65 of FIG. 1) is provided, in accordance with embodiments of the present invention, including steps 75 and 90 of flowchart 70 illustrated in FIG. 5 and described below. In addition to receiving the "start" 65 input, the selection engine 13 may receive input from the separately stored synchronized video and audio content 1, a playlist 109, and a graphical user interface 16.

During retrieval, the selection engine 13 provides the audio content synchronized with the video content to a speech recognition and tagging system 12, which tags the portions of the content 1 that are speech and the portions that are noise and provides them to a tagged speech 7 storage medium and a tagged noise 23 storage medium, respectively. The speech recognition and tagging system 12 also enters individual words or tokens into the tagged speech 7. Hereinafter, a "token" is a contiguous group of non-delimiter characters that appears in a string before a delimiter (or at the start of the string), where a delimiter may be, for example, a form of punctuation such as a comma, or a space between words. Hereinafter, "synchronization" of video content with spoken or written words or phrases means that the words are spoken or written together with the corresponding video content when that video content is displayed. The audio content synchronized with the video content is available because the synchronized video and audio content 1 is stored separately and the separately stored synchronized video and audio content 1 can be retrieved for synchronized playback.
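The definition of a "token" above can be illustrated with a short tokenizer. This is a minimal sketch, assuming the only delimiters are the two named in the text (the comma and the space); the function name `tokenize` and the constant `DELIMITERS` are hypothetical.

```python
import re

# Delimiters named in the text: a comma and the space between words.
DELIMITERS = " ,"


def tokenize(text: str, delimiters: str = DELIMITERS) -> list[str]:
    """Split text into contiguous groups of non-delimiter characters."""
    # Build a character class such as "[\ ,]+" from the delimiter set.
    pattern = "[" + re.escape(delimiters) + "]+"
    # re.split yields empty strings at leading/trailing delimiters; drop them.
    return [token for token in re.split(pattern, text) if token]


print(tokenize("good morning, everyone"))
```

A real speech tagging system would work on recognized speech rather than on a plain string; the sketch only shows how delimiters bound the tokens.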

Referring to FIG. 2, the phrase and token recognition portion 2 of the apparatus 10 includes a decision step 29 that determines Valid Words for a Phrase, where the determination is based on a Test Acceptable Words for Validity 21 input and a phrase database 42 input. Hereinafter, "word" or "speech" means written or spoken English or any other language. The decision step 29 provides output to a Link Words in Phrase step 31. The Test Acceptable Words for Validity 21 may receive pronunciation rules 39 as input, and may use the pronunciation rules so that valid words are pronounced correctly on playback. Hereinafter, "pronouncing correctly" means correcting speech for pronunciation errors caused by accent or mispronunciation.

Consecutive valid words and the phrase database 42 are input to decision step 29 to determine whether the consecutive valid words are valid words for a phrase. If the answer is yes, the consecutive valid words for the phrase are input to the Link Words in Phrase step 31. If the answer is no, the consecutive valid words are input to the stored playback content buffer 37 as words that are not valid for a phrase. Decision step 29 may apply a process that includes comparing the consecutive valid words with the phrase database 42. Valid words that exist in the phrase database 42 as a phrase may be linked in the Link Words in Phrase step 31. Dictionaries and lexicons are examples of the phrase database 42. Examples of phrases include "good morning", whose constituent words are often used together. Because the words of a phrase need to be spoken together, they are spoken together when the corresponding video content of the separately stored synchronized video and audio content 1 is played back. The user may also have the option of entering additional words or rules into the Test Acceptable Words for Validity 21, so that other words that are not part of an established language may also be linked together in a phrase at step 31.
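The comparison of consecutive valid words against the phrase database 42 can be sketched as a greedy longest-match lookup. This is an illustrative sketch under stated assumptions: the phrase database is modelled as a plain set of lower-case strings, phrases are at most `max_len` words long, and the names `link_phrases`, `phrase_db`, and `max_len` are hypothetical. The specification does not state which matching strategy is used.

```python
def link_phrases(words: list[str], phrase_db: set[str],
                 max_len: int = 4) -> list[list[str]]:
    """Group consecutive words into phrases found in phrase_db.

    Words that form a known phrase are linked (returned as one group);
    every other word is passed through on its own, as a one-word group.
    """
    groups: list[list[str]] = []
    i = 0
    while i < len(words):
        # Try the longest candidate first, down to two-word phrases.
        for n in range(min(max_len, len(words) - i), 1, -1):
            candidate = " ".join(words[i:i + n]).lower()
            if candidate in phrase_db:
                groups.append(words[i:i + n])  # valid words for a phrase
                i += n
                break
        else:
            groups.append([words[i]])          # not part of any phrase
            i += 1
    return groups


print(link_phrases(["good", "morning", "to", "you"], {"good morning"}))
```

In this sketch the one-word groups correspond to words sent directly to the stored playback content buffer 37, and the multi-word groups correspond to words linked at step 31.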

Referring to FIG. 2, the selectable rate playback portion 4 includes the stored playback content buffer 37, a selectable rate playback engine 67, and selectable rate playback viewing 73. Phrases may be passed from the Link Words in Phrase step 31 into the stored playback content buffer 37. Alternatively, valid words may be provided to the stored playback content buffer 37 when decision step 29 determines that they are not valid words for a phrase. Alternatively, noise 23 may be passed to the stored playback content buffer 37. In one embodiment, the stored playback content buffer 37 provides its content to the selectable rate playback engine 67. The selectable rate playback engine 67 provides input to the selectable rate playback viewing step 73 for viewing the selected separately stored synchronized video and audio playback content 1 at the selectable rate.

One purpose of selectable rate playback viewing 73 relates to cases in which the scene content of a video program is unclear or the user has not understood the spoken words. In the example in which the user has not clearly understood the spoken words, the Test Acceptable Words for Validity 21 may use a pronunciator device that, after the pronunciation rules 39 have been input, pronounces the words or phrases correctly. Words pronounced incorrectly by an actor may thus be pronounced correctly by the pronunciator device. The user may be given the option of having valid words pronounced by the pronunciator device or, for example, spoken as the actor spoke them in the video program.

FIG. 3 depicts an example of a list 110 of playback content from the playlist 109. The playlist includes a "y" minutes playlist item 120, where y represents the time since the separately stored synchronized video and audio content 1 (see FIG. 2) was stored. The time since the separately stored synchronized video and audio content 1 was stored depends on the storage capacity of the stored playback content buffer 37, as depicted in FIG. 2 and described herein. The storage capacity of the stored playback content buffer 37 may be any appropriate capacity needed to accommodate the separately stored synchronized video and audio content 1. In one embodiment, the storage capacity of the stored playback content buffer 37 is less than 2 minutes. Alternatively, the storage capacity may be less than 5 minutes. Alternatively, the storage capacity may be the capacity needed to store the separately stored synchronized video and audio content 1 of a movie or a video program, where the video program may be a television program.

The playlist 109 includes a keyword or phrase list item 130, which the user may generate from keywords or phrases that the user remembers from listening to or watching a program or movie contained in the separately stored synchronized video and audio content 1.

The playlist 109 includes a key frame list item 140, where each entry of the key frame list item 140 may be selected by subtracting the intensities "z" of two consecutive frames: if the difference "Δz" in intensity "z" between the consecutive frames is greater than a threshold "t", the frame having the higher intensity is selected as a key frame. The user may select list item 120, 130, or 140 manually or through a remote selection device. Selecting list item 120, 130, or 140 may provide input to the selection engine 13.
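The key frame rule above (compare the intensity "z" of consecutive frames and keep the brighter frame when "Δz" exceeds the threshold "t") can be sketched as follows. This is a minimal sketch, assuming each frame has already been reduced to a single intensity value and that "Δz" is taken as an absolute difference; both are assumptions, and the name `select_key_frames` is hypothetical.

```python
def select_key_frames(intensities: list[float], threshold: float) -> list[int]:
    """Return indices of key frames chosen from consecutive-frame differences.

    intensities -- one intensity value "z" per frame, in display order
    threshold   -- the threshold "t" on the intensity difference
    """
    key_frames: list[int] = []
    for i in range(len(intensities) - 1):
        z_a, z_b = intensities[i], intensities[i + 1]
        delta_z = abs(z_a - z_b)              # assumed: absolute difference
        if delta_z > threshold:
            # Keep whichever of the two frames has the higher intensity.
            index = i if z_a > z_b else i + 1
            if index not in key_frames:       # avoid listing a frame twice
                key_frames.append(index)
    return key_frames


print(select_key_frames([10, 12, 40, 38, 5], threshold=20))
```

With the sample values, frame 2 is selected from the jump between frames 1 and 2, and frame 3 from the drop between frames 3 and 4.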

FIG. 4 depicts a list of playback content from the graphical user interface (GUI) 16, where the list includes a "y" minutes playlist item 160, a keyword or phrase list item 170, and a key frame list item 180, generated in the same manner as the corresponding list items 120, 130, and 140 depicted in FIG. 3 and described above. The list of playback content from the GUI 16 includes a scroll bar 190 that may be used to scroll to list item 160, 170, or 180. The user may select list item 160, 170, or 180 manually or through a remote selection device. Selecting list item 160, 170, or 180 may provide input from the GUI 16 to the selection engine 13 (see FIG. 2). The graphical user interface 16 may be provided with a list of key video frames using key frame extraction. Hereinafter, "key frame extraction" means that key frames having an intensity higher than a threshold intensity are selected into the list of playback content from the GUI 16.

FIG. 5 depicts a method 70 for selectable rate playback of playback content, comprising steps 75, 85, 90, 95, and 97. In one embodiment, a television program or, alternatively, a movie may be stored on any suitable storage medium, such as a personal video cassette recorder, a DVD, or optical or magneto-optical media. The program or movie must be separately stored synchronized video and audio content 1 (see FIG. 2), in which the video and audio are synchronized as stored and which is retrievable for synchronized playback. While playing back the separately stored synchronized video and audio content 1, the user may encounter a portion of the program that the user cannot satisfactorily understand, for example because the video portion is unclear or the audio portion is unintelligible. The user then first stops playback.

In step 75, the user selects a first portion 44 of the separately stored synchronized video and audio playback content 1 for playback at the "selected rate" 49, where the selected first portion 44 corresponds to list item 120, 130, or 140 from the playlist 109 of FIG. 3, or to list item 160, 170, or 180 from the GUI 16 of FIG. 4. The playback content 1 has been stored at a storage rate, where the storage rate may be any recording rate for a commercial personal video cassette recorder, a DVD, or any suitable storage medium such as optical or magneto-optical media, and the storage rate may differ from the "selected rate" 49. The "selected rate" 49 may be slower or faster than the storage rate of the playback content 1 without distortion of the speech portion of the audio content of the playback content 1.

In step 85, the speech contained in the selected first portion 44 (see FIG. 2) of the separately stored synchronized video and audio playback content 1, corresponding to the list item selected from the playback content of the graphical user interface 16 or the playlist 109, is tagged by the speech recognition and tagging system 12. In step 90, acceptable words 7 are recognized by the speech recognition and tagging system 12 (see FIG. 2).

In step 95, at least one phrase within the tagged speech 7 is recognized by the phrase and token recognition portion 2 (see FIG. 2) of the apparatus 10.

In step 97, because the video and audio content are synchronized and separately stored, the selected first portion 44 (see FIG. 2) of the separately stored synchronized video and audio content 1 can be retrieved for synchronized playback by the selection and tagging engine 65 (see FIG. 1). The tagged speech 7 and the corresponding video are provided continuously, so that selecting the first portion 44 (see FIG. 2) of the separately stored synchronized video and audio content 1 for play also selects the corresponding tagged speech 7 for play.

The speech may be tagged by the speech recognition and tagging system 12, as shown in FIG. 2 and described in the associated text above. At least one phrase within the tagged speech 7 may be recognized, for example, using the speech recognition and tagging system 12, as shown in FIG. 2 and described in the associated text above. The speech recognition and tagging system 12 may use stemming to remove morphological and inflexional endings from English words in the playback content 1. Hereinafter, "stemming" may be performed by a Porter stemming apparatus (or "Porter stemmer"), which carries out a process for removing the commoner morphological and inflexional endings from English words. Its main use is as part of a term normalization process that is usually performed when setting up information retrieval systems. Hereinafter, "morphological" endings of English words are verb tenses, such as past, present, or future, and "inflexional" endings of English words are noun or verb endings such as "s", "es", or "ing", or endings such as "er", "ier", or "iest" for the comparative and superlative of adjectives.
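The kind of ending removal described above can be illustrated with a minimal Python sketch. It is a simplified suffix stripper and not the full Porter algorithm, which applies several ordered rule phases with additional conditions. The names `SUFFIXES`, `strip_ending`, `normalize`, and the `min_stem` parameter are hypothetical and chosen only for this illustration.

```python
# Simplified suffix stripping for illustration only; this is NOT the full
# Porter stemming algorithm. All names here are hypothetical.

# Endings named in the description, ordered so longer endings are tried first.
SUFFIXES = ("iest", "ier", "ing", "es", "er", "s")


def strip_ending(word: str, min_stem: int = 3) -> str:
    """Lowercase the word and remove the first matching ending.

    The ending is removed only if at least `min_stem` characters remain,
    so short words such as "is" are left unchanged.
    """
    lower = word.lower()
    for suffix in SUFFIXES:
        if lower.endswith(suffix) and len(lower) - len(suffix) >= min_stem:
            return lower[: -len(suffix)]
    return lower


def normalize(words):
    """Apply strip_ending to every recognized word (term normalization)."""
    return [strip_ending(w) for w in words]
```

A naive stripper like this over-stems some words (for example, it would cut "frames" to "fram"), which is one reason a rule-based stemmer with extra conditions is normally preferred.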

The selected first portion 44 (see FIG. 2) of the separately stored synchronized video and audio playback content 1, corresponding to the list item selected from the graphical user interface 16 or the playlist 109, may be played at a selectable rate, the play synchronously retrieving the tagged speech 7, such as the acceptable words. Playing this selected first portion 44 at the selected rate causes no distortion of the speech in the playback content 1 (see FIG. 2). The video and audio will be synchronized at the selectable rate in accordance with the method shown by the flow chart 70 in FIG. 5 and described above, and in accordance with embodiments of the invention.
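The description does not give a timing formula for keeping video and tagged speech aligned at the selected rate. As a hedged sketch only, one way to picture it is to map every stored timestamp, for both tagged words and video frames, through the same scaling, so that items that coincide at the storage rate still coincide at the selected rate. The `TaggedWord` type, the functions `presentation_time` and `schedule`, and the convention that `rate` is the ratio of the selected rate to the storage rate are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class TaggedWord:
    """Hypothetical record for one tagged word of speech."""
    text: str
    start: float  # seconds from the start of the portion, at the storage rate
    end: float    # seconds from the start of the portion, at the storage rate


def presentation_time(stored_time: float, rate: float) -> float:
    """Map a stored timestamp to its presentation time.

    `rate` is assumed to be selected rate / storage rate, so 0.5 means
    half speed and 2.0 means double speed.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return stored_time / rate


def schedule(words, frame_times, rate):
    """Scale word boundaries and video frame times by the same factor."""
    scaled_words = [
        (w.text, presentation_time(w.start, rate), presentation_time(w.end, rate))
        for w in words
    ]
    scaled_frames = [presentation_time(t, rate) for t in frame_times]
    return scaled_words, scaled_frames
```

For instance, at `rate = 0.5` a word stored at 2.0 to 3.0 seconds and a frame stored at 2.0 seconds are both presented starting at 4.0 seconds, so they remain aligned. How the speech itself is rendered without distortion at the new rate is outside the scope of this sketch.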

As described above, the present invention is applicable to, for example, playing a television program at a selectable rate without distorting the audio portion of the program.

Claims

A method for playing playback content at a selectable rate, comprising:

selecting a first portion of separately stored video and audio playback content, wherein the playback content is stored at a storage rate, the video and audio are synchronized as stored, and the separately stored synchronized video and audio content is retrievable for synchronized playback;

selecting a playback rate for the playback content from the selectable rates, wherein the selected playback rate is different from the storage rate;

tagging speech in the selected first portion of the playback content;

recognizing at least one phrase within the tagged speech; and

playing the first portion of the playback content at the playback rate, wherein the play synchronously retrieves the tagged speech, wherein play at the playback rate causes no distortion of the speech in the playback content even though the playback rate is different from the storage rate, and wherein the video and audio are synchronized at the playback rate during the play.

The method of claim 1, wherein the first portion of the playback content is selected for playing from a playlist.

The method of claim 1, wherein the first portion of the playback content is selected for playing from a graphical user interface.

The method of claim 3, wherein the graphical user interface includes a list of key video frames provided by key frame extraction.

The method of claim 1, wherein tagging the speech further comprises recognizing a plurality of valid words for a phrase in the tagged speech.

The method of claim 1, wherein the playback rate is slower than the storage rate.

The method of claim 1, wherein recognizing at least one phrase within the tagged speech is performed by speech recognition.

The method of claim 1, further comprising removing the commoner morphological and inflexional endings from English words in the playback content by stemming.

The method of claim 9, wherein a key frame in the list of key video frames has an intensity higher than a threshold intensity.

The method of claim 1, wherein the tagged speech and its corresponding video are provided continuously while storing the playback content and while playing the first portion of the playback content at the playback rate.

The method of claim 1, wherein playing the first portion of the playback content at the playback rate is performed on an audio and video device such that normal viewing on the audio and video device occurs when the play is stopped by a stop input.

The method of claim 1, wherein playing the first portion of the playback content at the playback rate is performed on an audio and video device such that normal viewing on the audio and video device occurs when the play is stopped by a pause input for more than x minutes, wherein x is any positive real number.

An apparatus for playing playback content at a selectable rate, the apparatus comprising:

separately stored video and audio playback content, the playback content being stored at a storage rate;

a selected first portion of the separately stored video and audio playback content in a storage medium, wherein the video and audio content of the selected first portion are synchronized and a speech portion of the audio content is tagged;

a speech recognition device for tagging the speech portion of the audio content;

a phrase recognition device for determining valid words for phrases from the tagged speech, the valid words being joined within the phrases; and

a playback device for playing the selected first portion of the playback content at a rate selected from the selectable rates, wherein the selected rate is different from the storage rate, playback at the selected rate synchronously retrieves the tagged speech portion of the audio content, playback at the selected rate causes no distortion of the speech in the playback content even though the selected rate is different from the storage rate, and the video and audio content are synchronized at the selected rate during playback.

The apparatus of claim 13, wherein the playback device for playing the selected first portion of the playback content at the selected rate further comprises a playlist of the selected first portion of the separately stored synchronized video and audio playback content.

The apparatus of claim 13, wherein the playback device for playing the selected first portion of the playback content at the selected rate further comprises a graphical user interface of the selected first portion of the separately stored synchronized video and audio playback content.

The apparatus of claim 15, wherein the graphical user interface includes key frame list items, wherein each frame of the key frame list items has an intensity that differs from the intensity of contiguous frames by a value that exceeds a threshold.

The apparatus of claim 13, wherein the phrase recognition device for determining valid words for phrases from the tagged speech includes a step of joining words into phrases.

The apparatus of claim 13, wherein the phrase recognition device for determining valid words for phrases from the tagged speech comprises a pronunciation rule input for causing the valid words to be pronounced correctly upon playback.

The apparatus of claim 13, wherein the video content is within a video frame.

The apparatus of claim 13, wherein the selected rate is slower than the storage rate, and play at the selected rate causes no distortion of the speech in the playback content even though the selected rate is slower than the storage rate of the playback content.