Developer's Guide--Part III

Speech Synthesis

There are three methods of implementing PSOLA: TD-PSOLA, LPC-PSOLA, and FD-PSOLA. We use the TD-PSOLA method.

Speech synthesis by PSOLA involves three main steps.
(1). Pitch synchronous analysis:
Taking each pitch-synchronous mark of the synthesis unit as a center, choose a suitable window length and window the unit. This yields a sequence of short-time signals
$$x_m(n) = h_m(t_m - n)\,x(n)$$
where $t_m$ is the $m$-th pitch mark and $h_m(n)$ is the window function, by default a Hamming window. The window length is usually chosen to be 2 to 4 times the pitch period.
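(2). Pitch synchronous modification:
According to the TD-PSOLA method, each short-time synthesis signal is a copy of its corresponding short-time analysis signal, moved from the analysis pitch mark to the synthesis pitch mark. If the short-time analysis signal centered on the analysis mark $t_m$ is $x_m(n)$, then the short-time synthesis signal centered on the synthesis mark $\tilde{t}_q$ is
$$\tilde{x}_q(n) = x_m(n - \tilde{t}_q + t_m)$$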
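(3). Pitch synchronous synthesis:
Pitch synchronous synthesis is implemented by overlap-adding the short-time synthesis signals $\tilde{x}_q(n)$. Of the several overlap-add schemes available, we use the least-squares overlap-add scheme. With $\tilde{t}_q$ the synthesis pitch marks, $\tilde{h}_q(n)$ the synthesis windows and $\alpha_q$ a gain factor, the final synthesized signal is:
$$\tilde{x}(n) = \frac{\sum_q \alpha_q\,\tilde{x}_q(n)\,\tilde{h}_q(\tilde{t}_q - n)}{\sum_q \tilde{h}_q^2(\tilde{t}_q - n)}$$
It can also be expressed as:
$$\tilde{x}(n) = \frac{\sum_q \alpha_q\,\tilde{x}_q(n)}{\sum_q \tilde{h}_q(\tilde{t}_q - n)} \tag{3.1}$$
If we set $\alpha_q = 1$ and treat the normalizing denominator as a constant, then we have:
$$\tilde{x}(n) = \sum_q \tilde{x}_q(n) \tag{3.2}$$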
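The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the toolbox's own implementation: the pitch marks are assumed to be already available as sample indices, every window uses the same fixed half-width, and the function names (`hamming`, `analyze`, `overlap_add`) are hypothetical. The synthesis step uses the plain, unnormalized sum of Equation (3.2).

```python
import math


def hamming(length):
    """Symmetric Hamming window of the given length."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (length - 1))
            for i in range(length)]


def analyze(x, marks, half_width):
    """Step (1): cut one windowed short-time signal around each pitch mark.

    Each segment has 2*half_width + 1 samples and is centered on its mark.
    Samples that fall outside x are treated as zero.
    """
    win = hamming(2 * half_width + 1)
    segments = []
    for t in marks:
        seg = []
        for k in range(-half_width, half_width + 1):
            n = t + k
            sample = x[n] if 0 <= n < len(x) else 0.0
            seg.append(win[k + half_width] * sample)
        segments.append(seg)
    return segments


def overlap_add(segments, synth_marks, out_len):
    """Steps (2) and (3): copy each segment to its synthesis mark and sum.

    segments[q] is placed so that its center lands on synth_marks[q];
    overlapping samples are simply added (no normalization).
    """
    half_width = len(segments[0]) // 2
    y = [0.0] * out_len
    for seg, t in zip(segments, synth_marks):
        for k, v in enumerate(seg):
            n = t + k - half_width
            if 0 <= n < out_len:
                y[n] += v
    return y
```

For example, `overlap_add(analyze(x, marks, half_width), new_marks, out_len)` re-synthesizes `x` at the positions in `new_marks`. With a window two pitch periods long and marks one period apart, overlapping Hamming windows sum to about 1.08 rather than 1, so the output is slightly louder than the input; dividing by the sum of the windows would remove that gain.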
Using Equations (3.1) and (3.2), we can stretch or compress the relative spacing between the pitch-synchronous marks ($t_m$) of the original speech, and so obtain a new set of pitch-synchronous marks ($\tilde{t}_q$) for the synthesized speech. The PSOLA method is illustrated in Figure 3.3.

Figure 3.3: Illustration of the PSOLA method
(a) Pitch frequency was decreased
(b) Speech was extended but the pitch frequency remains the same
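
The two cases in the figure differ only in how the synthesis pitch marks are placed and which analysis mark each one copies from. The sketch below shows one simple way to generate them; it is an assumption made for illustration, not necessarily the scheme used in this toolbox. The function name `synthesis_marks` and the parameters `stretch` (duration factor) and `pitch` (pitch-frequency factor) are hypothetical, and choosing the nearest analysis mark is just one possible mapping rule.

```python
def synthesis_marks(marks, stretch=1.0, pitch=1.0):
    """Place synthesis pitch marks and map each to an analysis mark.

    marks   : analysis pitch marks (sample indices, increasing)
    stretch : duration factor (2.0 makes the speech twice as long)
    pitch   : pitch-frequency factor (0.5 halves the pitch frequency)

    Returns (synth_marks, mapping), where mapping[q] is the index of the
    analysis mark whose short-time signal is copied to synth_marks[q].
    """
    if len(marks) < 2:
        raise ValueError("need at least two pitch marks")
    end = marks[0] + (marks[-1] - marks[0]) * stretch
    synth, mapping = [], []
    t = float(marks[0])
    while t <= end:
        # Position on the original time axis that corresponds to t.
        src = marks[0] + (t - marks[0]) / stretch
        # Nearest analysis mark (ties resolve to the earlier mark).
        m = min(range(len(marks)), key=lambda i: abs(marks[i] - src))
        synth.append(int(round(t)))
        mapping.append(m)
        # Local pitch period around analysis mark m.
        if m + 1 < len(marks):
            period = marks[m + 1] - marks[m]
        else:
            period = marks[m] - marks[m - 1]
        # A lower pitch factor spaces the synthesis marks further apart.
        t += period / pitch
    return synth, mapping


marks = [0, 100, 200, 300, 400]

# Case (a): pitch frequency halved, duration unchanged.
lower_pitch = synthesis_marks(marks, stretch=1.0, pitch=0.5)

# Case (b): duration doubled, pitch period unchanged.
longer = synthesis_marks(marks, stretch=2.0, pitch=1.0)
```

In case (a) the synthesis marks are twice as far apart as the analysis marks, so every other short-time signal is skipped. In case (b) the spacing is unchanged but each short-time signal is reused, which lengthens the speech without altering the pitch period.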
