Developer's Guide--Part III

Speech Synthesis

 

There are three methods for implementing PSOLA (pitch-synchronous overlap-add): TD-PSOLA, LPC-PSOLA, and FD-PSOLA. We use the TD-PSOLA method.

Speech synthesis with PSOLA consists of three main steps.
(1).Pitch synchronous analysis:
Taking each pitch-synchronous mark of the speech synthesis unit as a center, choose a suitable window length and window the unit. This gives a sequence of short-time analysis signals
$$x_m(n) = h_m(t_m - n)\, x(n)$$
where $x(n)$ is the speech unit, $t_m$ is the $m$-th pitch mark, and $h_m(n)$ is the analysis window, which is a Hamming window by default. The window length is usually chosen to be 2 to 4 times the pitch period.
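The windowing step above can be sketched as follows. This is a minimal illustration in Python, not the toolbox's own code. The function names `hamming` and `analysis_segments` are hypothetical. The sketch assumes that the pitch marks are given as sample indices (at least two of them) and that the local pitch period can be estimated from the spacing of neighbouring marks.

```python
import math


def hamming(length):
    """Symmetric Hamming window with the given number of samples."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (length - 1))
            for i in range(length)]


def analysis_segments(x, pitch_marks, periods_per_window=2):
    """Window the signal x around every pitch mark.

    Returns a list of (start_index, samples) pairs, one per mark.
    Assumption: the local pitch period is the distance to the next mark
    (or to the previous mark for the last one).
    """
    segments = []
    for m, t in enumerate(pitch_marks):
        if m + 1 < len(pitch_marks):
            period = pitch_marks[m + 1] - t
        else:
            period = t - pitch_marks[m - 1]
        half = periods_per_window * period // 2
        window = hamming(2 * half + 1)  # odd length, centred on the mark
        start = t - half
        samples = []
        for i, w in enumerate(window):
            n = start + i
            # Samples outside the signal are treated as zero.
            samples.append(w * x[n] if 0 <= n < len(x) else 0.0)
        segments.append((start, samples))
    return segments


# Usage: a constant signal with pitch marks every 20 samples.
signal = [1.0] * 100
segments = analysis_segments(signal, [20, 40, 60, 80])
```

With `periods_per_window=2` each segment spans about two pitch periods, the lower end of the 2 to 4 range mentioned above.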
(2).Pitch synchronous modification:
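In the TD-PSOLA method, each short-time synthesis signal is a copy of a short-time analysis signal, moved from its analysis pitch mark to a synthesis pitch mark. If the short-time analysis signal is $x_m(n)$, centered at the analysis mark $t_m$, then the short-time synthesis signal centered at the synthesis mark $\tilde{t}_q$ is
$$\tilde{x}_q(n) = x_m(n - \tilde{t}_q + t_m)$$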
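The copy operation in this step can be sketched in Python as follows. This is only an illustration under stated assumptions: the names `map_synthesis_to_analysis` and `shift_segment` are hypothetical, each synthesis mark is assumed to reuse the analysis segment whose mark lies closest on the original time axis, and `time_scale` is an assumed duration factor that does not appear in the text above.

```python
def map_synthesis_to_analysis(analysis_marks, synthesis_marks, time_scale=1.0):
    """For each synthesis mark, return the index of the analysis mark that
    is closest to the corresponding position in the original speech."""
    mapping = []
    for t_syn in synthesis_marks:
        t_orig = t_syn / time_scale  # position on the original time axis
        best = min(range(len(analysis_marks)),
                   key=lambda m: abs(analysis_marks[m] - t_orig))
        mapping.append(best)
    return mapping


def shift_segment(segment, analysis_mark, synthesis_mark):
    """Copy a (start_index, samples) segment and move its centre from the
    analysis mark to the synthesis mark. The samples are not modified."""
    start, samples = segment
    return (start + (synthesis_mark - analysis_mark), list(samples))


# Usage: doubling the duration reuses the analysis segments in order.
indices = map_synthesis_to_analysis([20, 40, 60, 80], [40, 80, 120, 160],
                                    time_scale=2.0)
```

Because the samples are copied unchanged, only the positions of the segments differ between the analysis and the synthesis side.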
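(3).Pitch synchronous synthesis:
Pitch-synchronous synthesis is implemented by superposing (overlap-adding) the short-time synthesis signals. Several overlap-add schemes exist; we use the least-squares overlap-add scheme. The final synthesized signal is
$$\tilde{x}(n) = \frac{\sum_q \alpha_q\, \tilde{x}_q(n)\, \tilde{h}_q(\tilde{t}_q - n)}{\sum_q \tilde{h}_q^2(\tilde{t}_q - n)}$$
where $\tilde{x}_q(n)$ is the $q$-th short-time synthesis signal, $\tilde{t}_q$ is its synthesis pitch mark, $\tilde{h}_q(n)$ is the synthesis window, and $\alpha_q$ is a gain factor. This can be simplified to
$$\tilde{x}(n) = \frac{\sum_q \alpha_q\, \tilde{x}_q(n)}{\sum_q \tilde{h}_q(\tilde{t}_q - n)} \qquad (3.1)$$
If we set $\alpha_q = 1$ and the denominator is approximately constant, this reduces, up to a constant factor, to
$$\tilde{x}(n) = \sum_q \tilde{x}_q(n) \qquad (3.2)$$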
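The overlap-add step can be sketched in Python as follows. This is a rough illustration, not the toolbox's implementation. It assumes that (3.1) divides the sum of the short-time synthesis signals by the sum of the synthesis windows, with all gain factors equal to 1, and that (3.2) is the plain sum without normalization. The function name `overlap_add` and the `(start_index, samples)` layout are hypothetical.

```python
def overlap_add(segments, windows, length, normalize=True):
    """Overlap-add short-time synthesis signals.

    segments: list of (start_index, samples), the windowed signals
    windows:  list of (start_index, samples), the windows that produced them
    length:   number of samples in the output signal
    """
    numerator = [0.0] * length
    denominator = [0.0] * length
    for (start, samples), (_, win) in zip(segments, windows):
        for i, (s, w) in enumerate(zip(samples, win)):
            n = start + i
            if 0 <= n < length:
                numerator[n] += s
                denominator[n] += w
    if not normalize:
        return numerator  # plain sum, as assumed for (3.2)
    eps = 1e-8  # guard against division by zero where no window overlaps
    return [num / den if den > eps else 0.0
            for num, den in zip(numerator, denominator)]


# Usage: two overlapping windows applied to a constant signal of value 2.
windows = [(0, [0.5, 1.0, 0.5]), (2, [0.5, 1.0, 0.5])]
segments = [(start, [2.0 * w for w in win]) for start, win in windows]
output = overlap_add(segments, windows, length=5)
```

The normalized form compensates for uneven window overlap, which matters when the synthesis marks are not evenly spaced.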
Using equations (3.1) and (3.2), we can stretch or compress the relative distances between the pitch-synchronous marks $t_m$ of the original speech, which yields a new set of pitch-synchronous marks $\tilde{t}_q$ for the synthesized speech. The PSOLA method is illustrated in Figure 3.3.
Figure 3.3: Illustration of the PSOLA method.

(a) The pitch frequency is decreased.
(b) The speech is lengthened while the pitch frequency remains the same.
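The two cases in the figure can be reproduced by how the synthesis pitch marks are placed. The Python sketch below is only one possible way to generate them and is not taken from the toolbox. The function name `synthesis_marks` and the parameters `pitch_scale` and `time_scale` are hypothetical, and the local period is assumed to be the spacing of the nearest pair of analysis marks.

```python
def synthesis_marks(analysis_marks, pitch_scale=1.0, time_scale=1.0):
    """Generate synthesis pitch marks from analysis pitch marks.

    pitch_scale < 1 lowers the pitch (marks move further apart);
    time_scale > 1 lengthens the speech (more marks, same spacing).
    Requires at least two analysis marks.
    """
    marks = []
    end = analysis_marks[-1] * time_scale
    t = analysis_marks[0] * time_scale
    while t <= end:
        marks.append(int(round(t)))
        # Local analysis period at the matching position in the original.
        t_orig = t / time_scale
        m = min(range(len(analysis_marks) - 1),
                key=lambda i: abs(analysis_marks[i] - t_orig))
        period = analysis_marks[m + 1] - analysis_marks[m]
        t += period / pitch_scale
    return marks


analysis = [0, 20, 40, 60, 80]
# Case (a): halve the pitch frequency, so the marks are twice as far apart.
lower_pitch = synthesis_marks(analysis, pitch_scale=0.5)
# Case (b): double the duration while keeping the mark spacing.
longer = synthesis_marks(analysis, time_scale=2.0)
```

In case (a) fewer segments cover the same duration, while in case (b) segments are repeated to fill the longer time axis.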

 
 
 
 
Copyright (C) 2006-2007 Scilab group of Xiamen University, China