{"id":1710,"date":"2019-03-09T17:14:27","date_gmt":"2019-03-09T16:14:27","guid":{"rendered":"https:\/\/rosetta.vn\/short\/?p=1710"},"modified":"2019-03-25T12:09:32","modified_gmt":"2019-03-25T11:09:32","slug":"speech-to-text-with-google-cloud","status":"publish","type":"post","link":"https:\/\/rosetta.vn\/short\/2019\/03\/09\/speech-to-text-with-google-cloud\/","title":{"rendered":"Speech to text with Google Cloud"},"content":{"rendered":"<p>Speech to text with Google Cloud, step by step:<\/p>\n<p>Convert WAV to FLAC by running sox in a terminal:<\/p>\n<blockquote><p>sox ZOOM0005_Tr1.wav ZOOM0005_Tr1.flac<\/p><\/blockquote>\n<p>or use ffmpeg to convert MP3 to FLAC:<\/p>\n<blockquote><p>ffmpeg -i input.mp3 output.flac<\/p><\/blockquote>\n<p>Upload the file to a Google Cloud Storage bucket (if you run gcloud on a file from your computer instead, the file must be smaller than 10MB): <a class=\"https\" title=\"https:\/\/console.cloud.google.com\/storage\/browser\/dang_gstorage?project=steel-flare-233717&amp;authuser=1\" href=\"https:\/\/console.cloud.google.com\/storage\/browser\/dang_gstorage?project=steel-flare-233717\">https:\/\/console.cloud.google.com\/storage\/browser\/dang_gstorage?project=steel-flare-233717<\/a><\/p>\n<p>or use gsutil in a terminal, e.g.:<\/p>\n<blockquote><p>gsutil cp 1996-03-17-TNiemHoangXuanHan.flac gs:\/\/dang_gstorage\/speech\/<\/p><\/blockquote>\n<p>Run gcloud in a terminal:<br \/>\ndang@KubuntuElite:~\/google_cloud\/google-cloud-sdk\/bin$ .\/gcloud ml speech recognize-long-running 'gs:\/\/dang_gstorage\/ZOOM0005_Tr1.flac' --language-code='vi-VN' --async<br \/>\nOutput:<br \/>\nCheck operation [607280323405008030] for status.<br \/>\n{<br \/>\n&quot;name&quot;: &quot;607280323405008030&quot;<br \/>\n}<\/p>\n<p>Wait for the operation to finish:<br \/>\ndang@KubuntuElite:~\/google_cloud\/google-cloud-sdk\/bin$ .\/gcloud 
ml speech operations wait 607280323405008030<\/p>\n<p>Save the result once the operation is done:<\/p>\n<blockquote><p>dang@KubuntuElite:~\/google_cloud$ gcloud ml speech operations describe 607280323405008030 &gt; test_transcribe_vn.json<\/p><\/blockquote>\n<p>Run this Python code to print each transcript and its confidence:<\/p>\n<blockquote><p>import json<\/p>\n<p>def postprocess_json(google_json_file):<\/p>\n<div style=\"padding-left: 30pt\">with open(google_json_file, 'rt') as json_file:<\/div>\n<div style=\"padding-left: 60pt\">data = json.load(json_file)<\/div>\n<div style=\"padding-left: 30pt\">for result in data['response']['results']:<\/div>\n<div style=\"padding-left: 60pt\">print(result['alternatives'][0]['transcript'])<br \/>\nprint('Confidence: '+str(result['alternatives'][0]['confidence']))<\/div>\n<p>postprocess_json('test_transcribe_vn.json')<\/p><\/blockquote>\n<p>References:<br \/>\nGoogle Cloud Storage quickstart: <a class=\"https\" title=\"https:\/\/cloud.google.com\/storage\/docs\/quickstart-console\" href=\"https:\/\/cloud.google.com\/storage\/docs\/quickstart-console\">https:\/\/cloud.google.com\/storage\/docs\/quickstart-console<\/a><br \/>\nInstall the Google Cloud SDK for speech to text: <a class=\"https\" title=\"https:\/\/cloud.google.com\/speech-to-text\/docs\/quickstart-gcloud\" href=\"https:\/\/cloud.google.com\/speech-to-text\/docs\/quickstart-gcloud\">https:\/\/cloud.google.com\/speech-to-text\/docs\/quickstart-gcloud<\/a><br \/>\nTranscribing long audio files with the Cloud Speech-to-Text API: <a class=\"https\" title=\"https:\/\/cloud.google.com\/speech-to-text\/docs\/async-recognize\" href=\"https:\/\/cloud.google.com\/speech-to-text\/docs\/async-recognize\">https:\/\/cloud.google.com\/speech-to-text\/docs\/async-recognize<\/a><br \/>\nProcessing JSON files in Python: <a class=\"https\" title=\"https:\/\/stackabuse.com\/reading-and-writing-json-to-a-file-in-python\/\" 
href=\"https:\/\/stackabuse.com\/reading-and-writing-json-to-a-file-in-python\/\">https:\/\/stackabuse.com\/reading-and-writing-json-to-a-file-in-python\/<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Code speech to text with Google cloud: Convert WAV to FLAC, by running sox in terminal: sox ZOOM0005_Tr1.wav ZOOM0005_Tr1.flac or use FFMPEG to convert MP3 to FLAC in terminal: ffmpeg -i input.mp3 output.flac Upload file to Google Cloud Storage folder<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_mi_skip_tracking":false},"categories":[30,215],"tags":[1156,720,1155],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/p8jhJx-rA","_links":{"self":[{"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/posts\/1710"}],"collection":[{"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/comments?post=1710"}],"version-history":[{"count":6,"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/posts\/1710\/revisions"}],"predecessor-version":[{"id":1770,"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/posts\/1710\/revisions\/1770"}],"wp:attachment":[{"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/media?parent=1710"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/categories?post=1710"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/rosetta.vn\/short\/wp-json\/wp\/v2\/tags?post=1710"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}