Converting an m4a audio file to a video file from the command line

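Lately I have been recording piano and bass as audio files rather than video, and now and then I want to put those recordings on YouTube or similar sites. However, YouTube does not accept m4a audio files. I could hop between free web services to convert them, but it occurred to me that this can surely be done locally, so I decided to try it. A GUI tool would probably allow finer control, but I don't need that, so I want to do it from the command line.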

It looks like ffmpeg can do it.

FFmpeg

A complete, cross-platform solution to record, convert and stream audio and video.

First, install it with brew.

$ brew install ffmpeg

Depending on the environment, the installation can take quite a while, but it finished without trouble. Let's look at the help first.

$ ffmpeg --help
ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
  built with Apple clang version 14.0.3 (clang-1403.0.22.14.1)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/6.0_1 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-audiotoolbox
  libavutil      58.  2.100 / 58.  2.100
  libavcodec     60.  3.100 / 60.  3.100
  libavformat    60.  3.100 / 60.  3.100
  libavdevice    60.  1.100 / 60.  1.100
  libavfilter     9.  3.100 /  9.  3.100
  libswscale      7.  1.100 /  7.  1.100
  libswresample   4. 10.100 /  4. 10.100
  libpostproc    57.  1.100 / 57.  1.100
Hyper fast Audio and Video encoder
usage: ffmpeg [options] [[infile options] -i infile]... {[outfile options] outfile}...

Getting help:
    -h      -- print basic options
    -h long -- print more options
    -h full -- print all options (including all format and codec specific options, very long)
    -h type=name -- print all options for the named decoder/encoder/demuxer/muxer/filter/bsf/protocol
    See man ffmpeg for detailed description of the options.

Print help / information / capabilities:
-L                  show license
-h topic            show help
-? topic            show help
-help topic         show help
--help topic        show help
-version            show version
-buildconf          show build configuration
-formats            show available formats
-muxers             show available muxers
-demuxers           show available demuxers
-devices            show available devices
-codecs             show available codecs
-decoders           show available decoders
-encoders           show available encoders
-bsfs               show available bit stream filters
-protocols          show available protocols
-filters            show available filters
-pix_fmts           show available pixel formats
-layouts            show standard channel layouts
-sample_fmts        show available audio sample formats
-dispositions       show available stream dispositions
-colors             show available color names
-sources device     list sources of the input device
-sinks device       list sinks of the output device
-hwaccels           show available HW acceleration methods

Global options (affect whole program instead of just one file):
-loglevel loglevel  set logging level
-v loglevel         set logging level
-report             generate a report
-max_alloc bytes    set maximum size of a single allocated block
-y                  overwrite output files
-n                  never overwrite output files
-ignore_unknown     Ignore unknown stream types
-filter_threads     number of non-complex filter threads
-filter_complex_threads  number of threads for -filter_complex
-stats              print progress report during encoding
-max_error_rate maximum error rate  ratio of decoding errors (0.0: no errors, 1.0: 100% errors) above which ffmpeg returns an error instead of success.

Per-file main options:
-f fmt              force format
-c codec            codec name
-codec codec        codec name
-pre preset         preset name
-map_metadata outfile[,metadata]:infile[,metadata]  set metadata information of outfile from infile
-t duration         record or transcode "duration" seconds of audio/video
-to time_stop       record or transcode stop time
-fs limit_size      set the limit file size in bytes
-ss time_off        set the start time offset
-sseof time_off     set the start time offset relative to EOF
-seek_timestamp     enable/disable seeking by timestamp with -ss
-timestamp time     set the recording timestamp ('now' to set the current time)
-metadata string=string  add metadata
-program title=string:st=number...  add program with specified streams
-target type        specify target file type ("vcd", "svcd", "dvd", "dv" or "dv50" with optional prefixes "pal-", "ntsc-" or "film-")
-apad               audio pad
-frames number      set the number of frames to output
-filter filter_graph  set stream filtergraph
-filter_script filename  read stream filtergraph description from a file
-reinit_filter      reinit filtergraph on input parameter changes
-discard            discard
-disposition        disposition

Video options:
-vframes number     set the number of video frames to output
-r rate             set frame rate (Hz value, fraction or abbreviation)
-fpsmax rate        set max frame rate (Hz value, fraction or abbreviation)
-s size             set frame size (WxH or abbreviation)
-aspect aspect      set aspect ratio (4:3, 16:9 or 1.3333, 1.7777)
-display_rotation angle  set pure counter-clockwise rotation in degrees for stream(s)
-display_hflip      set display horizontal flip for stream(s) (overrides any display rotation if it is not set)
-display_vflip      set display vertical flip for stream(s) (overrides any display rotation if it is not set)
-vn                 disable video
-vcodec codec       force video codec ('copy' to copy stream)
-timecode hh:mm:ss[:;.]ff  set initial TimeCode value.
-pass n             select the pass number (1 to 3)
-vf filter_graph    set video filters
-b bitrate          video bitrate (please use -b:v)
-dn                 disable data

Audio options:
-aframes number     set the number of audio frames to output
-aq quality         set audio quality (codec-specific)
-ar rate            set audio sampling rate (in Hz)
-ac channels        set number of audio channels
-an                 disable audio
-acodec codec       force audio codec ('copy' to copy stream)
-ab bitrate         audio bitrate (please use -b:a)
-af filter_graph    set audio filters

Subtitle options:
-s size             set frame size (WxH or abbreviation)
-sn                 disable subtitle
-scodec codec       force subtitle codec ('copy' to copy stream)
-stag fourcc/tag    force subtitle tag/fourcc
-fix_sub_duration   fix subtitles duration
-canvas_size size   set canvas size (WxH or abbreviation)
-spre preset        set the subtitle options to the indicated preset
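
The full help is a lot to read. As a minimal sketch, the `-h type=name` form listed in the help output can be used to show only the options of one encoder, and `-hide_banner` drops the version and configuration banner. The function name `ffmpeg_encoder_help` is made up for illustration.

```shell
# Print the help for a single encoder without the version/configuration banner.
# -hide_banner suppresses the banner; "-h encoder=NAME" is the "-h type=name"
# form shown in the help output.
ffmpeg_encoder_help() {
  ffmpeg -hide_banner -h "encoder=$1"
}

# Example: ffmpeg_encoder_help libx264
```

This is handy later when checking which values an encoder such as libx264 accepts.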

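That is long. The version is 6.0. Part of the output (the version and configuration banner) goes to standard error, and there are evidently a lot of settings that can be configured. Nothing looks wrong, though, and I had read that the following command does the job, so I just ran it.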

$ ffmpeg -i ./hanon.m4a -c:a copy output.mp4
・
・
(finishes fairly quickly)

Oh, it worked. Really easy.

$ file output.mp4
output.mp4: ISO Media, MP4 Base Media v1 [ISO 14496-12:2003]

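However, uploading this file to YouTube fails with the error "Processing abandoned. The video could not be processed." It seems to be the kind of error that asks you to try again, but a second attempt failed as well. I half expected it not to work. I ran this command as a warm-up, and the article I was following actually used different options in the first place.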
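
A likely reason for the failure (an assumption; the YouTube error message does not confirm it) is that output.mp4 contains only an audio stream, because the command just copies the audio into an MP4 container. A minimal sketch for checking this with ffprobe, which Homebrew installs alongside ffmpeg. The function name `has_video_stream` is made up for illustration.

```shell
# Succeeds (exit status 0) if the file contains at least one video stream.
# -select_streams v limits the query to video streams; with -of csv=p=0,
# ffprobe prints one stream index per line and nothing if there are none.
has_video_stream() {
  [ -n "$(ffprobe -v error -select_streams v \
            -show_entries stream=index -of csv=p=0 "$1")" ]
}

# Example:
# if has_video_stream output.mp4; then echo "has video"; else echo "audio only"; fi
```

Note that an m4a with embedded cover art may report a video stream even though it has no real picture track.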

Time to run it properly. Maybe the output has to be a mov file?

$ ffmpeg -loop 1 -i ./piano.jpeg -i ./hanon.m4a -c:a copy -c:v libx264 -crf 51 -pix_fmt yuv420p -shortest output.mov
・
・
(takes a very long time, but it finishes)
・
・
$ file output.mov
output.mov: ISO Media, Apple QuickTime movie, Apple QuickTime (.MOV/QT)

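Oh, it worked. It takes quite a while, though. This surely depends a lot on the machine, but I think an audio file of just under two minutes took about ten minutes. Adding -preset ultrafast to the command is supposed to speed things up, and with it the run took maybe five minutes. The image does appear as the background, but it looks rough. Is it really rough, or is it because the image size does not match? (Since -crf 51 is the lowest quality setting, as described in the option notes at the end, that is the more likely cause.)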
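
The command with the speed-up applied can be wrapped in a small function. This is a sketch under stated assumptions, not the command from the original article: `-preset ultrafast` is the option mentioned above, while `-tune stillimage` (an x264 tuning for still pictures) and `-crf 23` (the libx264 default, in place of 51) are my own substitutions. The function name `still_image_video` is hypothetical.

```shell
# Usage: still_image_video IMAGE AUDIO OUTPUT
# Loops a still image as the video track and copies the audio unchanged.
still_image_video() {
  # Assumptions: -tune stillimage and -crf 23 are not from the original command.
  ffmpeg -loop 1 -i "$1" -i "$2" \
    -c:a copy \
    -c:v libx264 -preset ultrafast -tune stillimage -crf 23 \
    -pix_fmt yuv420p -shortest "$3"
}

# Example: still_image_video ./piano.jpeg ./hanon.m4a output.mov
```

A lower CRF gives a cleaner picture at the cost of a larger file, so the value is worth adjusting to taste.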

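Then I uploaded it to YouTube. The upload seems to go through, but YouTube shows "Processing up to 4K ... 60 minutes left". You must be kidding, it is a video of under two minutes. Still, it became ready to publish in about ten minutes.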

So, I guess I'll call that a success...

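I would like to understand the options, but I know nothing about video, so I am keeping the answers I got from ChatGPT here. Treat their reliability with caution. Come to think of it, we are getting to the point where I might as well have had it write the command in the first place.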

-loop 1

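This option applies only to the image input that follows it. It makes ffmpeg read the specified image (piano.jpeg) in a loop. The value is an on/off switch, not a repeat count: 1 enables looping, so the single image is repeated indefinitely as video frames, while 0 (the default) reads it once. This is what keeps the image on screen for the whole length of the audio.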

-i ./piano.jpeg -i ./hanon.m4a

Specifies the input files. There are two inputs here: ./piano.jpeg is an image file and ./hanon.m4a is an audio file.

-c:a copy

Selects the audio codec. Here it tells ffmpeg to copy the audio stream as is, without re-encoding it.

-c:v libx264

Selects the video codec. Here the video is encoded with libx264 (an H.264 encoder).

-crf 51

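Sets the encoding quality. With libx264, CRF (Constant Rate Factor) takes a value from 0 to 51; lower values mean higher quality and higher values mean lower quality. 51 is the lowest-quality setting, which makes the output file smaller.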

-pix_fmt yuv420p

Sets the pixel format. yuv420p stores the picture as planar YUV with 4:2:0 chroma subsampling. This format is compatible with many devices and players.
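
One practical consequence of 4:2:0 subsampling: libx264 with yuv420p requires an even width and height, so an image with an odd dimension makes the encode fail. A minimal sketch that rounds both dimensions down to the nearest even number with the scale filter. Adding this filter is my own suggestion rather than part of the original command, and the function name `even_size_video` is hypothetical.

```shell
# Usage: even_size_video IMAGE AUDIO OUTPUT
# trunc(iw/2)*2 and trunc(ih/2)*2 round the input width and height down to
# even values, which yuv420p encoding with libx264 needs.
even_size_video() {
  ffmpeg -loop 1 -i "$1" -i "$2" \
    -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" \
    -c:a copy -c:v libx264 -pix_fmt yuv420p -shortest "$3"
}

# Example: even_size_video ./piano.jpeg ./hanon.m4a output.mov
```

For an image whose dimensions are already even, the filter leaves the size unchanged.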

-shortest

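When there are multiple input streams, this tells ffmpeg to finish the output when the shortest stream ends. Here the looped image never ends on its own, so the video ends when the audio ends.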
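
To confirm that -shortest did its job, the durations of the audio file and the resulting video can be compared with ffprobe. A minimal sketch; the function name `media_duration` is made up for illustration.

```shell
# Print the container duration of a media file in seconds.
# noprint_wrappers=1:nokey=1 makes ffprobe print the bare number only.
media_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Example:
# media_duration ./hanon.m4a
# media_duration output.mov
```

The two values should be close; a small difference is normal because the video is made of whole frames.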