Speech-to-speech translation using deep learning

dc.contributor.author: Bredmar, Fredrik
dc.contributor.department: Göteborgs universitet/Institutionen för data- och informationsteknik [swe]
dc.contributor.department: University of Gothenburg/Department of Computer Science and Engineering [eng]
dc.date.accessioned: 2017-03-17T13:06:45Z
dc.date.available: 2017-03-17T13:06:45Z
dc.date.issued: 2017-03-17
dc.description.abstract: Current state-of-the-art speech-to-speech translation systems rely heavily on a text representation for the translation. By transcoding speech to text, we lose important information about the characteristics of the voice, such as emotion, pitch, and accent. This thesis examines the possibility of using an LSTM neural network model to translate speech to speech without the need for a text representation, that is, by translating the raw audio data directly in order to preserve the characteristics of the voice that are otherwise lost in the text-transcoding step of the translation process. As part of this research, we create a data set of phrases suitable for speech-to-speech translation tasks. The thesis results in a proof-of-concept system; the underlying deep neural network needs to be scaled up for it to work better. [sv]
dc.identifier.uri: http://hdl.handle.net/2077/51978
dc.language.iso: eng [sv]
dc.setspec.uppsok: Technology
dc.subject: Neural Networks [sv]
dc.subject: Deep Learning [sv]
dc.subject: LSTM [sv]
dc.subject: RNN [sv]
dc.subject: Speech-to-speech translation [sv]
dc.title: Speech-to-speech translation using deep learning [sv]
dc.type: text
dc.type.degree: Student essay
dc.type.uppsok: H2

Files

Original bundle

Name: gupea_2077_51978_1.pdf
Size: 649.17 KB
Format: Adobe Portable Document Format
