WO2008035829A1 - Apparatus and method for playback speed altering with preservation of tone signal - Google Patents

Apparatus and method for playback speed altering with preservation of tone signal

Info

Publication number
WO2008035829A1
Authority
WO
WIPO (PCT)
Prior art keywords
playback speed
period
pcm data
audio
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/KR2006/003770
Other languages
French (fr)
Inventor
Seok Bong Kang
Hwa Jung Jun
Sung Hwan Choi
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
I-WARE Inc Ltd
Original Assignee
I-WARE Inc Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by I-WARE Inc Ltd filed Critical I-WARE Inc Ltd
Priority to PCT/KR2006/003770 priority Critical patent/WO2008035829A1/en
Publication of WO2008035829A1 publication Critical patent/WO2008035829A1/en
Anticipated expiration legal-status Critical
Ceased legal-status Critical Current

Classifications

    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B27/00 Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
    • G11B27/005 Reproducing at a different information rate from the information rate of recording
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/93 Discriminating between voiced and unvoiced parts of speech signals
    • G PHYSICS
    • G11 INFORMATION STORAGE
    • G11B INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B2220/00 Record carriers by type
    • G11B2220/60 Solid state media
    • G11B2220/61 Solid state media wherein solid state memory is used for storing A/V content

Definitions

  • the present invention relates to an apparatus and method for altering a playback speed while preserving a voice signal; more particularly, to an apparatus and method for dynamically altering a playback speed of a meaningful voice signal without deteriorating a tone color and voice quality of the voice signal using voice density variation.
  • a non-linear filter is used to compensate for the voice signal deterioration caused by the playback speed variation.
  • Fig. 1 shows an original sound reproducing block of an MP3 player.
  • an apparatus for altering a playback speed while preserving a speech signal including: a memory for storing compressed audio files; a file buffer for buffering the audio files stored in the memory; a playback speed controller for storing audio file information of the file buffer and generating playback speed information according to a playback speed request from a user; a decoder for restoring audio files transferred from the file buffer to PCM data; a data buffer for buffering PCM data from the decoder; a PCM data processor for finding a voiceless period from the PCM data from the data buffer and controlling a playback speed according to the playback speed information; and a CODEC for transforming the PCM data from the PCM data processor to audio analog signal.
  • the memory may store at least one of MPEG Audio Layer 3 (MP3) files, Windows Media Audio (WMA) files, and Ogg Vorbis (Ogg) files
  • the decoder is at least one of an MP3 decoder, a WMA decoder, and an Ogg decoder for decoding the audio files in the memory.
  • the playback speed controller may include: a header data analyzer for storing audio file information of the file buffer; a playback speed generator for setting up playback speed information of a currently processed audio file in the file buffer according to a playback speed request from a user; a PCM data controller for controlling the PCM data processor to reproduce the audio file according to the playback speed information; and an operation state controller for controlling the playback speed generator to setup playback speed information according to a sensed playback speed request from a user.
  • the playback speed controller may include an operation state display for displaying operation state information of an audio file in the file buffer.
  • the PCM data processor may include: a voice filter for attenuating a sound signal excepting voice from the PCM data transferred from the data buffer; a sound quantity measuring unit for setting up a reference to find a voiced period and a voiceless period from the PCM data, and generating voiceless period information according to the setup reference; a PCM data adder for adding voiceless interval to the voiceless period according to the playback speed information; and a PCM data attenuator for removing voiceless interval from the voiceless period according to the playback speed information.
  • a method of altering a playback speed while preserving a voice signal in reproducing audio data using audio data reproducing information separated from an audio file including the steps of: a) setting up a reference sound pressure defining a playback speed variable period of the audio file; b) setting up a period in a predetermined range from the reference sound pressure as a variable period; c) setting up playback speed control information based on a ratio of the variable period and other periods; d) receiving a playback speed from a user; and e) reproducing the audio file using the playback speed control information and the playback speed.
  • the reference sound pressure may be set after reproducing a predetermined part of the audio file, and attenuating audio signal except voice.
  • the reference sound pressure may be set by an average value of absolute sampled sound pressures obtained through sampling a predetermined portion of the audio file at least four times at a time period of about 24/1000 sec.
  • a period having a sound pressure less than 30% of the reference sound pressure may be set as a variable period.
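  • the audio file may be reproduced according to an equation: $X = \dfrac{\text{user's input speed} - \text{sound pressure occupying ratio out of variable period}}{\text{sound pressure occupying ratio in variable period}}$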
  • X denotes a ratio of adding or removing voiceless intervals to/from a voiceless period to control a playback speed.
  • An apparatus and method for altering a playback speed while preserving a voice signal control the playback speed by dynamically changing the playback speed of a less-meaningful voice period. Therefore, the playback speed can be controlled without deteriorating meaningful voice and without using a voice non-linear filter to compensate for voice deterioration.
  • FIG. 1 shows an original sound reproducing block of an MP3 player
  • FIG. 2 is a block diagram illustrating an apparatus for altering a playback speed while preserving a voice signal without deteriorating the speech signal according to an exemplary embodiment of the present invention
  • Fig. 3 is a block diagram illustrating a playback speed controller of a playback speed altering apparatus of Fig. 2
  • Fig. 4 is a block diagram illustrating a PCM data processor of an apparatus for controlling a playback speed for preserving tone signal of Fig. 2
  • Fig. 5 is a flowchart illustrating a method for altering a playback speed while preserving a voice signal according to an embodiment of the present invention.
  • Fig. 2 is a block diagram illustrating an apparatus for altering a playback speed while preserving a voice signal without deteriorating the speech signal according to an exemplary embodiment of the present invention.
  • the playback speed altering apparatus includes a memory 110, a file buffer 120, a decoder 130, a data buffer 140, a CODEC 150, a playback speed controller 160, and a PCM data processor 170.
  • the playback speed altering apparatus can be used in an original sound reproducing device, such as an MP3 player, as a functional block for changing the playback speed.
  • the playback speed altering apparatus can be embodied as a combination of software and hardware.
  • the memory 110 stores audio files. It is preferable that the audio files include MPEG Audio Layer 3 (MP3) files, Windows Media Audio (WMA) files, and Ogg Vorbis (Ogg) files.
  • the decoder 130 decodes the audio files stored in the memory 110 into PCM data.
  • the playback speed altering apparatus may include more than one decoder; it is preferable to have a dedicated decoder for each type of audio file.
  • the decoder 130 may include an MP3 decoder, a WMA decoder, and an Ogg decoder.
  • the CODEC 150 receives the PCM data from the decoder 130 through the PCM data processor 170, and transforms the PCM data into an analog audio signal.
  • the CODEC 150 can be connected to an earphone 180 that converts the analog audio signal to sound.
  • the file buffer 120 is interposed between the memory 110 and the decoder 130, and the data buffer 140 is interposed between the decoder 130 and the CODEC 150.
  • the playback speed controller 160 stores information about a file to be reproduced and provides a reproducing state of a file that is currently reproduced to a user. Also, the playback speed controller 160 creates playback speed information according to a request of controlling a playback speed from a user. Furthermore, the playback speed controller 160 controls and drives the PCM data processor 170.
  • the PCM data processor 170 is driven in response to the control of the playback speed controller 160.
  • the PCM data processor 170 finds a voiceless period from the PCM data from the data buffer 140 and attenuates sounds excepting voice therefrom. Furthermore, the PCM data processor 170 changes the playback speed of the voiceless period according to the playback speed information.
  • the playback speed controller 160 does not influence restoration of the compressed file, although the audio file information of the file buffer 120 is used, because the audio file is not modified.
  • the PCM data processor 170 transfers the PCM data stored in the data buffer 140 to the CODEC 150 without processing the PCM data. Therefore, the compressed file can be processed as in a conventional MP3 player.
  • Fig. 3 is a block diagram illustrating a playback speed controller of a playback speed altering apparatus of Fig. 2.
  • the playback speed controller 160 includes a header data analyzer 162, an operation state display 164, a playback speed generator 166, a PCM data controller 168, and an operation state controller 167.
  • the header data analyzer 162 stores file information to process.
  • the file information includes a file type, a version, a sample rate, the number of samples per channel, packed information, required bits, a free format, and so on.
  • the file information may further include additional information according to the file type.
  • the operation state display 164 stores information about the operation state of the files currently being processed, and displays the processing time of a file, or error states, as text or an icon.
  • the error states can be used as state information in the operation state controller 167.
  • the error state may include a broken frame, data overflow, an unsupported layer, a forbidden bit rate, a wrong MPEG build, and so on in the case of MP3.
  • the error state may include a bad asf header, a bad packet header, a bad weighting mode, a bad packet, and so on.
  • the playback speed generator 166 assigns a playback speed of a currently reproducing file according to a request of controlling a playback speed from a user.
  • the PCM data controller 168 controls the PCM data processor 170. That is, the PCM data controller 168 controls the voice filter 172 and the sound quantity measuring unit 174. Also, by controlling the PCM data processor 170, the PCM data controller 168 can add and reduce the voiceless interval required to control the playback speed while preserving the voice signal, according to the playback speed rate generated by the playback speed generator 166.
  • a playback speed altering method is a method of adding voiceless interval to a period with no voice.
  • the lengths of the words forming a sentence in the audio file differ from each other. That is, the voiceless period between one word and the next is not constant. Since the voiceless periods occur irregularly, it is preferable to add or reduce the voiceless interval according to the voice signal.
  • the operation state controller 167 drives constitutional elements of the playback speed controller 160 such as the header data analyzer 162, the operation state display 164, the playback speed generator 166, and the PCM data controller 168.
  • the operation state controller 167 is driven when a timer of an original sound reproducing apparatus generates an interruption, or when a management program of the original sound reproducing apparatus reads a new file.
  • the management program may be a program for reproducing music, displaying MP3 information, performing a menu function, and receiving input through a keyboard.
  • the operation state controller 167 checks whether the contents of the operation state display 164 have been modified when the timer generates the interrupt. If the contents of the operation state display 164 have been modified, the operation state controller 167 informs the operation state display 164 of the modified contents.
  • the operation state controller 167 controls the playback speed generator 166 to generate a playback speed according to the modified playback speed.
  • the operation state controller 167 informs the PCM data controller 168 of the modified contents, thereby driving the PCM data processor 170.
  • the operation state controller 167 terminates the related operation and returns the control to a process performed right before the interrupt is generated if the contents are not modified in the operation state display 164.
  • Fig. 4 is a block diagram illustrating a PCM data processor of an apparatus for controlling a playback speed for preserving tone signal of Fig. 2.
  • the PCM data processor 170 includes a voice filter 172, a sound quantity measuring unit 174, a PCM data adder 176, and a PCM data attenuator 177.
  • the voice filter 172 receives PCM data from the data buffer 140, and attenuates sound signals other than voice. This is because the voiceless period cannot be found if the PCM buffer 178 includes sound signals other than voice.
  • the PCM data processor 170 may include two or more voice filters 172 and it is preferable to use the voice filters dedicated for male and female, respectively. Since a sound quality differs according to male and female, the male can be distinguished from the female with only voice. Therefore, the attenuation accuracy can be improved by using the dedicated voice filters 172 according to the female and the male.
  • the sound quantity measuring unit 174 sets a reference to find a voiced period and a voiceless period in the PCM buffer 178, and creates information about a voiceless period according to the set reference.
  • the PCM data adder 176 performs a function of slowing down the playback speed while preserving a voice signal.
  • the PCM data adder 176 adds voiceless interval to a voiceless period at a predetermined interval assigned by the playback speed generator 166 using the information about voiceless period from the sound quantity measuring unit 174.
  • the PCM data attenuator 177 speeds up playback while preserving a voice signal.
  • the PCM data attenuator 177 removes voiceless interval from the voiceless period provided from the sound quantity measuring unit 174 at a predetermined time interval defined by the playback speed generator 166.
  • FIG. 5 is a flowchart illustrating a method for altering a playback speed while preserving a voice signal according to an embodiment of the present invention.
  • the method of altering a playback speed while preserving a voice signal includes a playback sound pressure threshold setup step S100, a variable period setup step S200, a playback speed setup step S300, a speed ratio receiving step S400, and a playback step S500.
  • a compressed digital audio signal is separated into sound components and sound playback data for reproducing the sound components; the sound components and the sound playback data are stored, and the sound components are reproduced using the sound playback data.
  • a sound quantity measuring unit 174 sets up a playback sound pressure threshold Tp which is a reference to set a period for controlling a sound source playback speed.
  • the playback speed generator 166 reproduces a sound at a level too low for the user to hear, or reproduces a sound recorded for a very short time, for example, shorter than 30/1000 second, and passes the reproduced sound through the voice filter 172.
  • the voice filter 172 extracts a voice signal only from the reproduced audio signal by attenuating other signals in the reproduced audio signal, and the sound quantity measuring unit 174 sets up the playback sound pressure threshold Tp to distinguish a voice sound from a voiceless sound using the extracted voice signal.
  • the playback sound pressure threshold may be set up using a sampled value obtained by sampling the recorded sound at least four times at a time period of 24/1000 sec.
  • an average value of sampled values can be set up as the playback sound pressure threshold. It is preferable to set up the playback sound pressure threshold as an average value of absolute sampled values to correct a sound pressure value generated from the comparative difference with the reference value.
  • the playback sound pressure threshold value can be set up by multiplying a predetermined weight to a predetermined signal band among the sampled signals and obtaining an average thereof.
  • the playback speed generator 166 sets a predetermined period as an active variable period using the playback sound pressure threshold.
  • a voiceless period is not a period that contains absolutely no voice, but a period with comparatively no voice based on a sound pressure threshold.
  • a period that does not much influence the user's hearing recognition, based on the sound pressure threshold, is set up as the variable period.
  • a variable period is extended or shortened in the present embodiment.
  • a sound pressure period less than about 30% from the set sound pressure threshold can be set as a variable period. It can be expressed as Eq. 1.
  • $\text{variable period} \le (T_p - 0.7\,T_p = 0.3\,T_p)$ (Eq. 1), where $T_p$ is the playback sound pressure threshold
  • the playback speed generator 166 sets the variable period defined at the variable period setting up step S200 as a playback speed setting up period. Also, the playback speed generator 166 sets playback speed control information based on a ratio of the variable period and other periods.
  • the playback speed control information is information for controlling a playback speed according to a requested playback speed from a user.
  • the operation state controller 167 receives a speed from a user to extend or shorten a sound.
  • the speed inputted from a user is information about a desired playback speed of reproducing a current audio file.
  • the format of the speed inputted from a user can be modified in various types and forms according to user interfaces. For example, a comparative speed ratio based on a current playback speed can be inputted.
  • the PCM data controller 168 reproduces a predetermined audio file based on sound reproducing information, playback speed control information, and a speed inputted from a user.
  • when the playback speed is slowed down, voiceless interval is added to a period having a predetermined value lower than the threshold value, according to the playback speed control information, by controlling the PCM data adder 176.
  • when the playback speed is increased, the voiceless interval is removed from a period having a predetermined value lower than the threshold, according to the playback speed control information, by controlling the PCM data attenuator 177.
  • a sound may be reproduced according to a speed ratio inputted from a user, as in Eq. 2, in a period with sound pressure exceeding the playback sound pressure threshold range, that is, a variable period.
  • X denotes a ratio of adding or removing a voiceless interval to/from a voiceless period for controlling a playback speed.
  • Eq. 2 denotes that the playback speed can be dynamically changed only in a period with less meaningful voice by adding or removing the voiceless interval as long as the voiceless period multiplied with X if the voiceless period is found.
  • the sound pressure occupying ratio in the variable period denotes the ratio of the variable periods to the entire period within one unit period that is stored and processed in a buffer while the audio signal is processed.
  • the sound pressure occupying ratio in the variable period is the ratio of the variable period, which has sound pressure less than 30% of the sound pressure reference, to the entire period.
  • the sound pressure occupying ratio out of the variable period is the ratio of the non-variable period to the entire period.
  • the playback speed altering method according to the present invention can reproduce original sound at a playback speed requested by a user without deteriorating the meaningful voices.
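
The roles of the PCM data adder 176 and the PCM data attenuator 177 described above can be illustrated with a minimal sketch. It assumes the PCM data has already been split into frames and that a boolean mask marks the frames belonging to a variable (voiceless) period. The function name `rescale_voiceless`, the frame representation, and the choice to lengthen a run by repeating its own low-level frames are assumptions made for illustration; the source does not specify how the added interval is generated.

```python
def rescale_voiceless(frames, is_variable, x):
    """Return a new frame list in which every run of variable (voiceless)
    frames is resampled to about x times its original length.
    Voiced frames are copied through untouched."""
    if len(frames) != len(is_variable):
        raise ValueError("frames and is_variable must have the same length")
    out = []
    i = 0
    while i < len(frames):
        if not is_variable[i]:
            out.append(frames[i])          # voiced frame: keep as is
            i += 1
            continue
        j = i
        while j < len(frames) and is_variable[j]:
            j += 1                         # find the end of the voiceless run
        run = frames[i:j]
        target = max(0, round(len(run) * x))
        # x > 1 repeats frames (adder role); x < 1 drops frames (attenuator role).
        for k in range(target):
            out.append(run[k * len(run) // target])
        i = j
    return out
```

With `x = 2.0` a two-frame voiceless run becomes four frames, and with `x = 0.5` it shrinks to one frame, while the voiced frames on either side are unchanged.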

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing For Digital Recording And Reproducing (AREA)

Abstract

Provided is an apparatus and method for altering a playback speed while preserving a voice signal. The apparatus includes: a memory for storing compressed audio files; a file buffer for buffering the audio files stored in the memory; a playback speed controller for storing audio file information of the file buffer and generating playback speed information according to a playback speed request from a user; a decoder for restoring audio files transferred from the file buffer to PCM data; a data buffer for buffering PCM data from the decoder; a PCM data processor for finding a voiceless period from the PCM data from the data buffer and controlling a playback speed according to the playback speed information; and a CODEC for transforming the PCM data from the PCM data processor to audio analog signal.

Description

Description
APPARATUS AND METHOD FOR PLAYBACK SPEED ALTERING WITH PRESERVATION OF TONE SIGNAL
Technical Field
[1] The present invention relates to an apparatus and method for altering a playback speed while preserving a voice signal; more particularly, to an apparatus and method for dynamically altering a playback speed of a meaningful voice signal without deteriorating a tone color and voice quality of the voice signal using voice density variation.
Background Art
[2] In general, if the playback speed of a speech signal is slowed down, the listener's brain has more time to recognize the meaning of the reproduced speech after hearing it, which improves speech recognition capability. However, if the playback speed is simply slowed down to improve the speech recognition capability, the voice signal may deteriorate, which may instead degrade the speech recognition capability.
[3] In a conventional method for altering the playback speed of a voice signal, a non-linear filter is used to compensate for the voice signal deterioration caused by the playback speed variation. Fig. 1 shows an original sound reproducing block of an MP3 player.
[4] Since the conventional method of altering a playback time uses the speech non-linear filter, the speech non-linear filter needs to be designed according to the frequency characteristics of the corresponding speech.
Disclosure of Invention
Technical Problem
[5] It is, therefore, an object of the present invention to provide a playback speed altering apparatus and method for altering a playback speed without deteriorating a meaningful voice signal by dynamically changing the playback speed of a non-meaningful voice period.
Technical Solution
[6] In accordance with one aspect of the present invention, there is an apparatus for altering a playback speed while preserving a speech signal, including: a memory for storing compressed audio files; a file buffer for buffering the audio files stored in the memory; a playback speed controller for storing audio file information of the file buffer and generating playback speed information according to a playback speed request from a user; a decoder for restoring audio files transferred from the file buffer to PCM data; a data buffer for buffering PCM data from the decoder; a PCM data processor for finding a voiceless period from the PCM data from the data buffer and controlling a playback speed according to the playback speed information; and a CODEC for transforming the PCM data from the PCM data processor to audio analog signal.
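The data path named in this paragraph (memory, file buffer, decoder, data buffer, PCM data processor, CODEC) can be pictured as a simple chain of stages. The sketch below is only an illustration of that ordering: `run_pipeline`, its parameters, and the callables passed in are hypothetical stand-ins, and a real decoder would not operate on arbitrary fixed-size chunks.

```python
def run_pipeline(memory_file, decode_chunk, process_block, speed_info, chunk_size=4096):
    """Push a compressed file through the stages in the order the text lists.

    memory_file   -- bytes of the compressed audio file (memory)
    decode_chunk  -- callable turning a compressed chunk into PCM data (decoder)
    process_block -- callable (pcm, speed_info) -> pcm (PCM data processor)
    speed_info    -- playback speed information from the playback speed controller
    Returns the list of PCM blocks that would be handed to the CODEC.
    """
    played = []
    for start in range(0, len(memory_file), chunk_size):
        chunk = memory_file[start:start + chunk_size]  # file buffer
        pcm = decode_chunk(chunk)                      # decoder output, held in the data buffer
        pcm = process_block(pcm, speed_info)           # PCM data processor
        played.append(pcm)                             # stand-in for the CODEC
    return played
```

Passing an identity function as `process_block` models playback with no speed change, in which the PCM data reaches the CODEC unmodified.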
[7] The memory may store at least one of MPEG Audio Layer 3 (MP3) files, Windows Media Audio (WMA) files, and Ogg Vorbis (Ogg) files, and the decoder is at least one of an MP3 decoder, a WMA decoder, and an Ogg decoder for decoding the audio files in the memory.
[8] The playback speed controller may include: a header data analyzer for storing audio file information of the file buffer; a playback speed generator for setting up playback speed information of a currently processed audio file in the file buffer according to a playback speed request from a user; a PCM data controller for controlling the PCM data processor to reproduce the audio file according to the playback speed information; and an operation state controller for controlling the playback speed generator to setup playback speed information according to a sensed playback speed request from a user.
[9] The playback speed controller may include an operation state display for displaying operation state information of an audio file in the file buffer.
[10] The PCM data processor may include: a voice filter for attenuating a sound signal excepting voice from the PCM data transferred from the data buffer; a sound quantity measuring unit for setting up a reference to find a voiced period and a voiceless period from the PCM data, and generating voiceless period information according to the setup reference; a PCM data adder for adding voiceless interval to the voiceless period according to the playback speed information; and a PCM data attenuator for removing voiceless interval from the voiceless period according to the playback speed information.
[11] In accordance with another aspect of the present invention, there is provided a method of altering a playback speed while preserving a voice signal in reproducing audio data using audio data reproducing information separated from an audio file, including the steps of: a) setting up a reference sound pressure defining a playback speed variable period of the audio file; b) setting up a period in a predetermined range from the reference sound pressure as a variable period; c) setting up playback speed control information based on a ratio of the variable period and other periods; d) receiving a playback speed from a user; and e) reproducing the audio file using the playback speed control information and the playback speed.
[12] In the step a), the reference sound pressure may be set after reproducing a predetermined part of the audio file, and attenuating audio signal except voice.
[13] In the step a), the reference sound pressure may be set by an average value of absolute sampled sound pressures obtained through sampling a predetermined portion of the audio file at least four times at a time period of about 24/1000 sec.
[14] In the step b), a period having a sound pressure less than 30% of the reference sound pressure may be set as a variable period.
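Steps a) and b) can be sketched as follows. The sketch assumes the PCM samples are grouped into fixed-length frames (for example, 24 ms worth of samples), that the reference is the mean of the first four frame levels, and that a frame counts as variable when its level is below 30% of that reference. The function names and the frame-based reading of "sampling at least four times" are assumptions, not details confirmed by the source.

```python
def frame_levels(samples, frame_len):
    """Mean absolute amplitude of each consecutive frame of frame_len samples."""
    levels = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        levels.append(sum(abs(s) for s in frame) / len(frame))
    return levels


def reference_pressure(levels, count=4):
    """Average of the first `count` frame levels (step a)."""
    head = levels[:count]
    if len(head) < count:
        raise ValueError("not enough frames to set the reference")
    return sum(head) / count


def variable_frames(levels, reference, fraction=0.3):
    """True for frames whose level is below `fraction` of the reference (step b)."""
    limit = fraction * reference
    return [level < limit for level in levels]
```

For a signal whose third frame is much quieter than the others, only that frame is marked as belonging to a variable period.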
[15] In the step e), the audio file may be reproduced according to an equation:
[16] $$X = \frac{\text{user's input speed} - \text{sound pressure occupying ratio out of variable period}}{\text{sound pressure occupying ratio in variable period}}$$
[17] where X denotes a ratio of adding or removing voiceless intervals to/from a voiceless period to control a playback speed.
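A small worked example of the equation may help. The sketch assumes the two occupying ratios are fractions of the same unit period and therefore sum to 1, and it treats the user's input speed as a plain ratio; the source does not state its units, so the function name `voiceless_ratio_x` and this reading are assumptions.

```python
def voiceless_ratio_x(user_speed, ratio_in_variable):
    """Evaluate X = (user_speed - ratio_out) / ratio_in,
    with ratio_out taken as 1 - ratio_in (assumption: the ratios sum to 1)."""
    if not 0.0 < ratio_in_variable <= 1.0:
        raise ValueError("ratio_in_variable must be in (0, 1]")
    ratio_out = 1.0 - ratio_in_variable
    return (user_speed - ratio_out) / ratio_in_variable


# Example: variable periods occupy 25% of the buffer.
x_same = voiceless_ratio_x(1.0, 0.25)   # input ratio 1.0 gives X = 1.0
x_more = voiceless_ratio_x(1.5, 0.25)   # input ratio 1.5 gives X = 3.0
```

Rearranging the equation gives ratio_out + X × ratio_in = user's input speed, so under this reading X is the factor that the variable period alone must absorb for the whole unit period to reach the requested ratio.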
Advantageous Effects
[18] An apparatus and method for altering a playback speed while preserving a voice signal according to the present invention control the playback speed by dynamically changing the playback speed of a less-meaningful voice period. Therefore, the playback speed can be controlled without deteriorating meaningful voice and without using a voice non-linear filter to compensate for voice deterioration.
Brief Description of the Drawings
[19] The above and other objects and features of the present invention will become apparent from the following description of the preferred embodiments given in conjunction with the accompanying drawings, in which:
[20] Fig. 1 shows an original sound reproducing block of an MP3 player;
[21] Fig. 2 is a block diagram illustrating an apparatus for altering a playback speed while preserving a voice signal without deteriorating the speech signal according to an exemplary embodiment of the present invention;
[22] Fig. 3 is a block diagram illustrating a playback speed controller of a playback speed altering apparatus of Fig. 2;
[23] Fig. 4 is a block diagram illustrating a PCM data processor of an apparatus for controlling a playback speed for preserving tone signal of Fig. 2; and
[24] Fig. 5 is a flowchart illustrating a method for altering a playback speed while preserving a voice signal according to an embodiment of the present invention.
Best Mode for Carrying Out the Invention
[25] Other objects and aspects of the invention will become apparent from the following description of the embodiments with reference to the accompanying drawings, which is set forth hereinafter.
[26] Fig. 2 is a block diagram illustrating an apparatus for altering a playback speed while preserving a voice signal without deteriorating the speech signal according to an exemplary embodiment of the present invention.
[27] As shown in Fig. 2, the playback speed altering apparatus according to the present embodiment includes a memory 110, a file buffer 120, a decoder 130, a data buffer 140, a CODEC 150, a playback speed controller 160, and a PCM data processor 170.
[28] The playback speed altering apparatus according to the present embodiment can be used for an original sound reproducing device as a functional block of changing a playback speed, such as a MP3 player. The playback speed altering apparatus can be embodied as combination of software and hardware.
[29] The memory 110 stores audio files. Preferably, the audio files include MPEG Audio Layer 3 (MP3) files, Windows Media Audio (WMA) files, and Ogg Vorbis (Ogg) files.
[30] The decoder 130 decodes the audio files stored in the memory 110 into PCM data. The playback speed altering apparatus according to the present embodiment may include more than one decoder; preferably, it has a dedicated decoder for each type of audio file. For example, the decoder 130 may include an MP3 decoder, a WMA decoder, and an Ogg decoder.
[31] The CODEC 150 receives the PCM data from the decoder 130 through the PCM data processor 170 and transforms the PCM data into an analog audio signal. The CODEC 150 can be connected to an earphone 180 that converts the analog audio signal into sound.
[32] It is preferable that the file buffer 120 is interposed between the memory 110 and the decoder 130, and the data buffer 140 is interposed between the decoder 130 and the CODEC 150.
[33] The playback speed controller 160 stores information about a file to be reproduced and provides a reproducing state of a file that is currently reproduced to a user. Also, the playback speed controller 160 creates playback speed information according to a request of controlling a playback speed from a user. Furthermore, the playback speed controller 160 controls and drives the PCM data processor 170.
[34] The PCM data processor 170 is driven under the control of the playback speed controller 160. The PCM data processor 170 finds a voiceless period in the PCM data from the data buffer 140 and attenuates sounds other than voice. Furthermore, the PCM data processor 170 changes the playback speed of the voiceless period according to the playback speed information.
[35] In the present embodiment, although the playback speed controller 160 uses the audio file information of the file buffer 120, it does not influence restoration of the compressed file, because the audio file itself is not modified.
[36] When the playback speed is not controlled, the PCM data processor 170 transfers the PCM data stored in the data buffer 140 to the CODEC 150 without processing the PCM data. Therefore, the compressed file can be processed as in a conventional MP3 player.
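The pass-through behaviour of the PCM data processor 170 can be illustrated with a minimal Python sketch. The names `pcm_data_processor`, `PcmBlock`, `speed_info`, and `alter` are hypothetical and do not appear in the specification; representing "no speed request" as `None` is likewise an assumption made only for illustration.

```python
from typing import Callable, Iterable, Iterator, List, Optional

# Hypothetical representation: one buffer of decoded PCM samples.
PcmBlock = List[int]


def pcm_data_processor(
    blocks: Iterable[PcmBlock],
    speed_info: Optional[float],
    alter: Callable[[PcmBlock, float], PcmBlock],
) -> Iterator[PcmBlock]:
    """Forward PCM blocks from the data buffer toward the CODEC.

    speed_info is None when the user has not requested speed control
    (an assumed convention); in that case blocks pass through untouched.
    """
    for block in blocks:
        if speed_info is None:
            # No speed control requested: behave like a conventional player.
            yield block
        else:
            # Speed control requested: delegate to the caller-supplied routine.
            yield alter(block, speed_info)
```

With `speed_info=None` the generator yields every block unchanged, which mirrors the statement that the compressed file is then handled as in a conventional player.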
[37] Fig. 3 is a block diagram illustrating a playback speed controller of a playback speed altering apparatus of Fig. 2.
[38] As shown in Fig. 3, the playback speed controller 160 includes a header data analyzer 162, an operation state display 164, a playback speed generator 166, a PCM data controller 168, and an operation state controller 167.
[39] The header data analyzer 162 stores information about the file to be processed. Herein, the file information includes a file type, a version, a sample rate, samples per channel, packed information, required bits, a free format, etc. The file information may further include additional information according to the file type.
[40] The operation state display 164 stores information about the operation state of the files that are currently being processed, and displays the file processing time or error states as text or an icon.
[41] Herein, the error states can be used as state information in the operation state controller 167. For example, in the case of MP3, the error states may include a broken frame, data overflow, an unsupported layer, a forbidden bit rate, a wrong MPEG build, etc. In the case of WMA, the error states may include a bad ASF header, a bad packet header, a bad weighting mode, a bad packet, etc.
[42] The playback speed generator 166 assigns a playback speed of a currently reproducing file according to a request of controlling a playback speed from a user.
[43] The PCM data controller 168 controls the PCM data processor 170. That is, the PCM data controller 168 controls the voice filter 172 and the sound quantity measuring unit 174. Also, by controlling the PCM data processor 170, the PCM data controller 168 can add or remove the voiceless intervals required to control the playback speed while preserving the voice signal, according to the playback speed rate generated by the playback speed generator 166.
[44] A playback speed altering method according to the present embodiment is a method of adding a voiceless interval to a period with no voice. The words forming a sentence in the audio file differ in length. That is, the voiceless period between one word and the next is not constant. Since the voiceless periods occur irregularly, it is preferable to add or remove the voiceless interval according to the voice signal.
[45] The operation state controller 167 drives constitutional elements of the playback speed controller 160 such as the header data analyzer 162, the operation state display 164, the playback speed generator 166, and the PCM data controller 168.
[46] The operation state controller 167 is driven when a timer of an original sound reproducing apparatus generates an interrupt, or when a management program of the original sound reproducing apparatus reads a new file. The management program may be a program for reproducing music, displaying MP3 information, performing a menu function, and receiving input through a keyboard.
[47] When the timer generates the interrupt, the operation state controller 167 checks whether the contents of the operation state display 164 have been modified. If the contents of the operation state display 164 have been modified, the operation state controller 167 informs the operation state display 164 of the modified contents.
[48] If the modified contents are about the change of a playback speed, the operation state controller 167 controls the playback speed generator 166 to generate a playback speed according to the modified playback speed. The operation state controller 167 informs the PCM data controller 168 of the modified contents, thereby driving the PCM data processor 170.
[49] The operation state controller 167 terminates the related operation and returns the control to a process performed right before the interrupt is generated if the contents are not modified in the operation state display 164.
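The interrupt handling described in paragraphs [47] to [49] can be sketched as follows. This is only an illustrative outline: the `OperationState` class, its fields, and the function `on_timer_interrupt` are hypothetical names, and modelling a "modified content" as a pending speed request is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OperationState:
    # Hypothetical state: a pending user request and the speed in effect.
    requested_speed: Optional[float] = None
    current_speed: float = 1.0


def on_timer_interrupt(state: OperationState) -> bool:
    """Return True if a speed change was applied, False if nothing changed."""
    unchanged = (
        state.requested_speed is None
        or state.requested_speed == state.current_speed
    )
    if unchanged:
        # Nothing was modified: return control to the interrupted process.
        return False
    # Role of the playback speed generator: adopt the newly requested speed.
    state.current_speed = state.requested_speed
    state.requested_speed = None
    # The caller would now notify the PCM data controller of the change.
    return True
```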
[50] Fig. 4 is a block diagram illustrating a PCM data processor of an apparatus for controlling a playback speed for preserving tone signal of Fig. 2.
[51] As shown in Fig. 4, the PCM data processor 170 includes a voice filter 172, a sound quantity measuring unit 174, a PCM data adder 176, and a PCM data attenuator 177.
[52] The voice filter 172 receives PCM data from the data buffer 140 and attenuates sound signals other than voice. This is because the voiceless period cannot be found if the PCM buffer 178 contains sound signals other than voice.
[53] The PCM data processor 170 according to the present embodiment may include two or more voice filters 172, and it is preferable to use voice filters dedicated to male and female voices, respectively. Since the sound quality of male and female voices differs, a male voice can be distinguished from a female voice by the voice alone. Therefore, the attenuation accuracy can be improved by using the voice filters 172 dedicated to female and male voices.
[54] The sound quantity measuring unit 174 sets a reference to find a voiced period and a voiceless period in the PCM buffer 178, and creates information about a voiceless period according to the set reference.
[55] The PCM data adder 176 performs a function of slowing down the playback speed while preserving a voice signal. The PCM data adder 176 adds voiceless interval to a voiceless period at a predetermined interval assigned by the playback speed generator 166 using the information about voiceless period from the sound quantity measuring unit 174.
[56] The PCM data attenuator 177 speeds up playback while preserving a voice signal. The PCM data attenuator 177 removes a voiceless interval from the voiceless period provided by the sound quantity measuring unit 174 at a predetermined time interval defined by the playback speed generator 166.
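The roles of the PCM data adder 176 and the PCM data attenuator 177 can be illustrated with a short Python sketch. The function name `stretch_voiceless`, the use of zero-valued samples as the added voiceless interval, and the convention that a positive `ratio` lengthens and a negative `ratio` shortens each voiceless period are all assumptions made for illustration; the specification does not fix these details.

```python
from typing import List, Sequence, Tuple


def stretch_voiceless(
    samples: Sequence[int],
    periods: Sequence[Tuple[int, int]],
    ratio: float,
) -> List[int]:
    """Lengthen or shorten only the voiceless periods of a PCM buffer.

    periods: sorted, non-overlapping (start, end) index pairs, end exclusive.
    ratio > 0 adds silence (slower playback); ratio < 0 removes samples
    from the voiceless period (faster playback). Voiced samples are copied
    unchanged in both cases.
    """
    out: List[int] = []
    pos = 0
    for start, end in periods:
        out.extend(samples[pos:start])          # voiced part, untouched
        length = end - start
        delta = round(abs(ratio) * length)      # samples to add or remove
        if ratio >= 0:
            out.extend(samples[start:end])
            out.extend([0] * delta)             # assumed: silence is zeros
        else:
            keep = max(length - delta, 0)
            out.extend(samples[start:start + keep])
        pos = end
    out.extend(samples[pos:])                   # trailing voiced part
    return out
```

For example, calling it on `[5, 5, 0, 0, 0, 0, 7, 7]` with the single period `(2, 6)` and `ratio=0.5` inserts two extra silent samples, while `ratio=-0.5` drops two of the four silent samples; the voiced samples `5, 5` and `7, 7` are preserved either way.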
[57] Fig. 5 is a flowchart illustrating a method for altering a playback speed while preserving a voice signal according to an embodiment of the present invention.
[58] Referring to Fig. 5, the method of altering a playback speed while preserving a voice signal according to the present invention includes a playback sound pressure threshold setup step S100, a variable period setup step S200, a playback speed setup step S300, a speed ratio receiving step S400, and a playback step S500.
[59] In the method of altering a playback speed while preserving a voice signal according to an embodiment of the present invention, a compressed digital audio signal is separated to sound components and sound playback data for reproducing the sound components, the sound component and the sound playback data are stored, and the sound component are reproduced using the sound playback data.
[60] At the playback sound pressure threshold setup step S100, the sound quantity measuring unit 174 sets up a playback sound pressure threshold Tp, which is a reference for setting a period in which the sound source playback speed is controlled.
[61] At the step S100, the playback speed generator 166 reproduces a sound at a level so low that a user cannot hear it, or reproduces a sound recorded for a very short time, for example, shorter than 30/1000 second, and passes the reproduced sound through the voice filter 172. The voice filter 172 extracts only the voice signal from the reproduced audio signal by attenuating the other signals in it, and the sound quantity measuring unit 174 uses the extracted voice signal to set up the playback sound pressure threshold Tp, which distinguishes a voice sound from a voiceless sound.
[62] For example, the playback sound pressure threshold may be set up using sampled values obtained by sampling the recorded sound at least four times at a time period of 24/1000 sec.
[63] That is, an average value of sampled values can be set up as the playback sound pressure threshold. It is preferable to set up the playback sound pressure threshold as an average value of absolute sampled values to correct a sound pressure value generated from the comparative difference with the reference value.
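The averaging of absolute sampled values can be written as a small Python sketch. The function name `sound_pressure_threshold` is hypothetical, and the check for at least four samples simply reflects the example given in the description rather than a stated requirement of the method.

```python
from typing import Sequence


def sound_pressure_threshold(sampled_values: Sequence[float]) -> float:
    """Average of absolute sampled values, used as the threshold Tp."""
    if len(sampled_values) < 4:
        # The description's example samples the sound at least four times.
        raise ValueError("at least four sampled values are expected")
    return sum(abs(v) for v in sampled_values) / len(sampled_values)
```

Taking absolute values prevents positive and negative sound pressure samples from cancelling, so the samples -2, 2, -4, 4 give a threshold of 3 rather than 0.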
[64] In order to enhance a predetermined sound-band, the playback sound pressure threshold value can be set up by multiplying a predetermined weight to a predetermined signal band among the sampled signals and obtaining an average thereof.
[65] It is obvious to the skilled in the art that various methods of defining the playback sound pressure threshold using statistical characteristics of the sampled values based on a playback environment, a characteristic of audio file to reproduce, and user's selection can be used.
[66] At the variable period setting step S200, the playback speed generator 166 sets a predetermined period as an active variable period using the playback sound pressure threshold.
[67] In the present embodiment, a voiceless period is not a period that contains absolutely no voice, but a period with comparatively little voice relative to the sound pressure threshold.
[68] That is, in the present embodiment, a period that does not much influence the hearing recognition of a user, judged by the sound pressure threshold, is set up as the variable period. Such a variable period is extended or shortened in the present embodiment.
[69] For example, a sound pressure period less than about 30% from the set sound pressure threshold can be set as a variable period. It can be expressed as Eq. 1.
[70] Eq. 1
[71] $\text{variable period} < (T_p - 0.7\,T_p = 0.3\,T_p)$
[72] It is also possible to set any period with a sound pressure lower than the sound pressure threshold as a variable period. By setting a period with a sound pressure less than 30% of the sound pressure threshold as the variable period, as shown in Eq. 1, the influence on hearing recognition can be further reduced even though the period is extended or shortened.
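The selection of variable periods according to Eq. 1 can be sketched in Python as follows. The function name `find_variable_periods` and the parameters `fraction` and `min_len` are hypothetical; in particular, applying the 30% test to each sample individually and grouping consecutive quiet samples into runs is an assumption, since the description does not state the granularity of the comparison.

```python
from typing import List, Sequence, Tuple


def find_variable_periods(
    samples: Sequence[float],
    threshold: float,
    fraction: float = 0.3,
    min_len: int = 1,
) -> List[Tuple[int, int]]:
    """Return (start, end) runs, end exclusive, where |sample| < fraction * Tp."""
    limit = fraction * threshold
    periods: List[Tuple[int, int]] = []
    start = None
    for i, s in enumerate(samples):
        quiet = abs(s) < limit
        if quiet and start is None:
            start = i                       # a quiet run begins
        elif not quiet and start is not None:
            if i - start >= min_len:
                periods.append((start, i))  # a quiet run ends
            start = None
    if start is not None and len(samples) - start >= min_len:
        periods.append((start, len(samples)))
    return periods
```

The `min_len` parameter is included only so that very short dips below the limit can be ignored if desired; with the default of 1 every quiet run is reported.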
[73] At the step S300 for setting up the playback speed, the playback speed generator 166 sets the variable period defined at the variable period setting up step S200 as a playback speed setting up period. Also, the playback speed generator 166 sets playback speed control information based on a ratio of the variable period and other periods.
[74] The playback speed control information is information for controlling a playback speed according to a requested playback speed from a user.
[75] At the step S400 for receiving the speed ratio, the operation state controller 167 receives a speed from a user to extend or shorten a sound.
[76] The speed inputted from a user is information about a desired playback speed of reproducing a current audio file. The format of the speed inputted from a user can be modified in various types and forms according to user interfaces. For example, a comparative speed ratio based on a current playback speed can be inputted.
[77] At the step S500 for reproducing audio file according to the speed ratio, the PCM data controller 168 reproduces a predetermined audio file based on sound reproducing information, playback speed control information, and a speed inputted from a user.
[78] At the step S500, voiceless interval is added to a period having a predetermined value lower than a threshold value according to the playback speed control information through controlling the PCM data adder 176 in case of slowing down the playback speed. In case of increasing the playback speed, the voiceless interval is removed from a period having a predetermined value lower than the threshold according to the playback speed control information through controlling the PCM data attenuator.
[79] For example, a sound may be reproduced according to a speed ratio inputted from a user, as in Eq. 2, in a period with sound pressure exceeding the playback sound pressure threshold range, that is, a variable period.
[80] Eq. 2
$X = \dfrac{\text{user's input speed} - \text{sound pressure occupying ratio out of variable period}}{\text{sound pressure occupying ratio in variable period}}$
[81] In Eq. 2, X denotes a ratio of adding or removing a voiceless interval to/from a voiceless period for controlling a playback speed. Eq. 2 denotes that the playback speed can be dynamically changed only in a period with less meaningful voice by adding or removing the voiceless interval as long as the voiceless period multiplied with X if the voiceless period is found.
[82] Also, the sound pressure occupying ratio in the variable period denotes a ratio of variable periods in entire periods in one unit period of storing and processing in a buffer while processing audio signal.
[83] That is, if a period having a sound pressure less than 30% of the sound pressure threshold is set as a variable period, the sound pressure occupying ratio is a ratio of the variable period having sound pressure less than 30% of the sound pressure reference in entire periods.
[84] The sound pressure ratio out of the variable period is a ratio of not variable period in entire periods.
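The calculation of X can be illustrated with a brief Python sketch. The function name `interval_ratio` is hypothetical, and treating the user's input speed as a plain number on the same scale as the occupying ratios is an assumption, because the description leaves the input format open. The ratio outside the variable period is taken as one minus the ratio inside it, following paragraphs [82] to [84].

```python
def interval_ratio(user_speed: float, ratio_in_variable: float) -> float:
    """Compute X = (user speed - ratio out of variable period) / ratio in it."""
    if not 0.0 < ratio_in_variable <= 1.0:
        # X is undefined when the buffer contains no variable period.
        raise ValueError("ratio_in_variable must be in (0, 1]")
    ratio_out = 1.0 - ratio_in_variable
    return (user_speed - ratio_out) / ratio_in_variable
```

Under these assumptions, an input speed of 1.0 yields X = 1 for any occupying ratio, and the smaller the share of variable periods in the buffer, the more strongly X reacts to a given speed request, since the whole adjustment must be absorbed by those periods.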
[85] Therefore, the playback speed altering method according to the present invention can reproduce original sound at a playback speed requested by a user without deteriorating the meaningful voices.
[86] While the present invention has been described with respect to certain preferred embodiments, it will be apparent to those skilled in the art that various changes and modifications may be made without departing from the scope of the invention as defined in the following claims.

Claims
[1] An apparatus for altering a playback speed while preserving a voice signal, comprising: a memory for storing compressed audio files; a file buffer for buffering the audio files stored in the memory; a playback speed controller for storing audio file information of the file buffer and generating playback speed information according to a playback speed request from a user; a decoder for restoring audio files transferred from the file buffer to PCM data; a data buffer for buffering PCM data from the decoder; a PCM data processor for finding a voiceless period from the PCM data from the data buffer and controlling a playback speed according to the playback speed information; and a CODEC for transforming the PCM data from the PCM data processor to audio analog signal.
[2] The apparatus of claim 1, wherein the memory stores at least one of an MPEG Audio Layer 3 (MP3) file, a Windows Media Audio (WMA) file, and an Ogg Vorbis (Ogg) file, and the decoder is at least one of an MP3 decoder, a WMA decoder, and an Ogg decoder for decoding the audio files in the memory.
[3] The apparatus of claim 1, wherein the playback speed controller includes: a header data analyzer for storing audio file information of the file buffer; a playback speed generator for setting up playback speed information of a currently processed audio file in the file buffer according to a playback speed request from a user; a PCM data controller for controlling the PCM data processor to reproduce the audio file according to the playback speed information; and an operation state controller for controlling the playback speed generator to setup playback speed information according to a sensed playback speed request from a user.
[4] The apparatus of claim 1, wherein the playback speed controller includes an operation state display for displaying operation state information of an audio file in the file buffer.
[5] The apparatus of claim 1, wherein the PCM data processor includes: a voice filter for attenuating a sound signal excepting voice from the PCM data transferred from the data buffer; a sound quantity measuring unit for setting up a reference to find a voiced period and a voiceless period from the PCM data, and generating voiceless period information according to the setup reference; a PCM data adder for adding voiceless interval to the voiceless period according to the playback speed information; and a PCM data attenuator for removing voiceless interval from the voiceless period according to the playback speed information.
[6] A method of altering a playback speed while preserving a voice signal in reproducing audio data using audio data reproducing information separated from an audio file, comprising the steps of: a) setting up a reference sound pressure defining a playback speed variable period of the audio file; b) setting up a period in a predetermined range from the reference sound pressure as a variable period; c) setting up playback speed control information based on a ratio of the variable period and other periods; d) receiving a playback speed from a user; and e) reproducing the audio file using the playback speed control information and the playback speed.
[7] The method of claim 6, wherein in the step a), the reference sound pressure is set after reproducing a predetermined part of the audio file, and attenuating audio signal except voice.
[8] The method of claim 7, wherein in the step a), the reference sound pressure is set by an average value of absolute sampled sound pressures obtained through sampling a predetermined portion of the audio file at least four times at a time period of about 24/1000 sec.
[9] The method of claim 8, wherein in the step b), a period having a sound pressure less than 30% of the reference sound pressure is set as a variable period.
[10] The method of claim 6, wherein in the step e), the audio file is reproduced according to an equation:
$X = \dfrac{\text{user's input speed} - \text{sound pressure occupying ratio out of variable period}}{\text{sound pressure occupying ratio in variable period}}$
, where X denotes a ratio of adding or removing voiceless intervals to/from a voiceless period to control a playback speed.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/KR2006/003770 WO2008035829A1 (en) 2006-09-22 2006-09-22 Apparatus and method for playback speed altering with preservation of tone signal


Publications (1)

Publication Number Publication Date
WO2008035829A1 true WO2008035829A1 (en) 2008-03-27

Family

ID=39200642

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2006/003770 Ceased WO2008035829A1 (en) 2006-09-22 2006-09-22 Apparatus and method for playback speed altering with preservation of tone signal

Country Status (1)

Country Link
WO (1) WO2008035829A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5781696A (en) * 1994-09-28 1998-07-14 Samsung Electronics Co., Ltd. Speed-variable audio play-back apparatus
JPH10224464A (en) * 1997-01-31 1998-08-21 Sanyo Electric Co Ltd Voice communication device
KR20010051961A (en) * 1999-11-26 2001-06-25 이데이 노부유끼 Recording and/or reproducing apparatus and recording and/or reproducing method
JP2005032369A (en) * 2003-07-09 2005-02-03 Matsushita Electric Ind Co Ltd Optical disc playback apparatus and playback method
KR20060128212A (en) * 2005-06-09 2006-12-14 주식회사 아이웨어 Variable playback speed apparatus and method for preserving audio signals


Similar Documents

Publication Publication Date Title
KR101275467B1 (en) Apparatus and method for controlling automatic equalizer of audio reproducing apparatus
US6055502A (en) Adaptive audio signal compression computer system and method
US8457322B2 (en) Information processing apparatus, information processing method, and program
JP2008191659A (en) Speech enhancement method and speech reproduction system
CA2452022C (en) Apparatus and method for changing the playback rate of recorded speech
KR101334366B1 (en) Method and apparatus for varying audio playback speed
CN104966524B (en) Audio-frequency processing method and audio frequency processing system
JP2010283605A (en) Video processing apparatus and method
WO2007007523A1 (en) Vehicle-mounted sound control system
WO2005057550A1 (en) Audio compression/decompression device
KR100677950B1 (en) Variable playback speed apparatus and method for preserving audio signals
WO2008035829A1 (en) Apparatus and method for playback speed altering with preservation of tone signal
JP2008197199A (en) Audio encoding apparatus and audio decoding apparatus
KR970017457A (en) Voice signal shift playback method
JP4587916B2 (en) Audio signal discrimination device, sound quality adjustment device, content display device, program, and recording medium
JP4580297B2 (en) Audio reproduction device, audio recording / reproduction device, and method, recording medium, and integrated circuit
JPH07191695A (en) Speaking speed conversion device
US8195317B2 (en) Data reproduction apparatus and data reproduction method
JP4275055B2 (en) SOUND QUALITY ADJUSTMENT DEVICE, BROADCAST RECEIVER, PROGRAM, AND RECORDING MEDIUM
JPH08147874A (en) Speech speed conversion device
JPH0854895A (en) Playback device
WO1997009713A1 (en) A method of processing audio signal for fidelity varying-speed replaying
KR100372576B1 (en) Method of Processing Audio Signal
JP2007183410A (en) Information reproduction apparatus and method
JP4275054B2 (en) Audio signal discrimination device, sound quality adjustment device, broadcast receiver, program, and recording medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 06798853

Country of ref document: EP

Kind code of ref document: A1

DPE2 Request for preliminary examination filed before expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06798853

Country of ref document: EP

Kind code of ref document: A1