Item type | SIG Technical Reports(1)
Release date | 2022-03-07
Title | 視線情報を考慮した機械学習に基づく一人称視点映像の自動要約手法
Language | jpn
Title | An Automatic Summarization Method for First-Person-View Video Based on Machine Learning Considering Gaze Information
Language | en
Keyword |
Subject scheme | Other
Subject | システム開発 (system development)
Resource type |
Resource type identifier | http://purl.org/coar/resource_type/c_18gh
Resource type | technical report
Author affiliation | 関西学院大学
Author affiliation | 関西学院大学
Author affiliation (en) | Kwansei Gakuin University
Author affiliation (en) | Kwansei Gakuin University
Author name | 濱岡, 啓太; 河野, 恭之
Author name (en) | Hamaoka, Keita; Kono, Yasuyuki
Abstract |
Description type | Other
Description | This study proposes an automatic summarization method for first-person-view video based on machine learning that considers gaze information, aiming at high-speed browsing of long first-person-view videos. First-person-view video is one means of recording daily life, sports, and the like, and it has spread widely as wearable cameras have become smaller and more common. However, first-person-view video shot continuously for lifelogging and similar purposes runs for long periods. It therefore often contains scenes that are not useful to the user, and viewing the video takes a long time. This study focuses on gaze information and estimates the degree of interest in each scene by building a machine-learning prediction model based on the amount of gaze movement, pupil diameter, and degree of eyelid opening. The input video is summarized by playing scenes with high interest at normal speed and scenes with low interest at high speed. Evaluation experiments confirmed that summary videos generated by the proposed method reflect the user's interests and latent awareness. Summarizing first-person-view video with this system can be applied not only to user activity recognition and the creation of visual diaries but also to many other applications, such as support for patients with memory impairment.
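The preprocessing the abstract describes (smoothing and normalizing the per-frame gaze features before feeding them to the prediction model) can be sketched as below. This is a minimal illustration, not the authors' implementation: the trailing moving-average window and min-max normalization are assumptions, since the report's abstract does not fix the exact preprocessing parameters.

```python
from statistics import mean

def preprocess(samples, window=3):
    """Smooth one per-frame gaze feature (e.g. gaze movement amount,
    pupil diameter, or eyelid opening) with a trailing moving average,
    then min-max normalize the result to [0, 1].

    The window size and normalization scheme are illustrative
    assumptions, not values taken from the report.
    """
    smoothed = [mean(samples[max(0, i - window + 1):i + 1])
                for i in range(len(samples))]
    lo, hi = min(smoothed), max(smoothed)
    span = (hi - lo) or 1.0  # avoid division by zero on flat signals
    return [(v - lo) / span for v in smoothed]

# A flat-then-active pupil-diameter trace normalizes to [0, 1]:
print(preprocess([0.0, 0.0, 10.0, 10.0]))  # → [0.0, 0.0, 0.5, 1.0]
```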
Abstract (en) |
Description type | Other
Description | We propose an automatic summarization method for first-person-view video that uses interest estimation by machine learning based on gaze information. First-person-view video is a means of keeping records of daily life, sports, and so on, and it has become widespread as wearable cameras have become smaller. Since a wearable camera can shoot first-person-view video with both hands free, it can record the user's natural actions. However, most first-person-view videos are long and often include scenes that are not useful to the user, so viewing them takes a long time. Our proposal is a method for summarizing long first-person-view video. Many studies have focused on video summarization. For example, Higuchi et al. summarized first-person-view video based on four cues that correspond to the basic user actions of body movement, stillness, hand movement, and human interaction. The user sets the importance of the four cues, and scenes with high importance are kept in the summarized video. Because the cues are fixed at four, the contents of the input video are not taken into consideration. This research proposes a video summarization system employing gaze tracking and machine learning. Because gaze is useful for inferring the user's intention and interest, our approach reflects the user's interest and latent awareness in the video by summarizing it with this information. In our previous study, we showed first-person-view video to users on a head-mounted display with a gaze-measurement function and obtained the amount of change in each user's gaze direction vector, pupil diameter, and eyelid opening. While viewing the video, the user indicated by keystroke whether he or she was interested in each scene. We created a dataset from this interest level and the gaze information, and applied preprocessing such as smoothing, removal of missing values, and normalization. Based on the preprocessed dataset, we construct a prediction model that estimates the user's level of interest through machine learning. In the main process, the user shoots first-person-view video with a head-mounted display that has a gaze-measurement function, and we obtain the user's gaze information and camera images. We estimate the degree of interest in each scene with the prediction model constructed in the previous study. In the generated summary video, important scenes are played back at normal speed and other scenes at high speed. An evaluation experiment showed that our system is useful for fast viewing of videos and that the summarized videos reflect the user's interest. Our system is applicable to many applications, such as behavior recognition, creating visual diaries, and supporting patients with memory impairment.
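The final summarization step (normal-speed playback for high-interest scenes, fast-forward for the rest) can be illustrated with a short sketch. The 0.5 interest threshold and the 8x fast-forward rate here are assumptions chosen for the example; the abstract does not report the parameters the authors actually used.

```python
def playback_plan(scene_lengths, interest, threshold=0.5, fast=8.0):
    """Map each scene to a playback speed: 1x where the estimated
    interest exceeds `threshold`, `fast`x elsewhere. Returns the
    per-scene speeds and the resulting summary duration in seconds.

    The threshold and fast-forward rate are illustrative assumptions,
    not parameters reported in the abstract.
    """
    speeds = [1.0 if score > threshold else fast for score in interest]
    duration = sum(length / speed
                   for length, speed in zip(scene_lengths, speeds))
    return speeds, duration

# Three scenes (60 s, 120 s, 30 s); the middle one is low-interest:
speeds, duration = playback_plan([60.0, 120.0, 30.0], [0.9, 0.2, 0.8])
# speeds → [1.0, 8.0, 1.0]; duration → 105.0 (60 + 120/8 + 30)
```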
Bibliographic record ID |
Source identifier type | NCID
Source identifier | AA1221543X
Bibliographic information | IPSJ SIG Technical Report: Human-Computer Interaction (HCI), Vol. 2022-HCI-197, No. 46, pp. 1-8, issued 2022-03-07
ISSN |
Source identifier type | ISSN
Source identifier | 2188-8760
Notice | SIG Technical Reports are nonrefereed and hence may later appear in any journals, conferences, symposia, etc.
Publisher |
Language | ja
Publisher | 情報処理学会 (Information Processing Society of Japan)