VSUMM TOOLKIT

Visualising main themes in each video to compare the content of two collections of videos.

RQ: How are themes distributed in each YouTube video related to the Amazon fires?

VISUALIZATION

VIDEO: JE BALAYE LES BOBO PRAY-FOR-AMAZONIA - EXPLIQUEZ-MOI CETTE MERDE #13

o Desmatamento na AMAZÔNIA de um jEito SIMPLES

Amazon rainforest on fire Lungs of the world in flame Nightline

Amazon Forest fire | Odhygyan Scienze

Amazon Forest Fire / What it tell us about deforestation

Amazon forest / Drone footage reveals after math Amazon Fires

Why the amazon fires are such a big deal

India's Secret Amazon Fire | Tamil

The Amazon Rainforest is On Fire & Nobody Cares

Who is Responsible for Amazon Forest Fire? | Tamil | LMES

Flying above the Amazon fires: 'All you can see is death'

How to read

colour = thematic cluster

STEPS

1. SCRAPING
What's it for: Get a list of videos for each chosen query and selected time frame.
Tools: YouTube Data Tools ⟶ [Video List]
Details and materials: “Amazon Fires” - “Pray for Amazonia”

2. DATA EXPLORATION
What's it for: Open the downloaded YouTube Data Tools (YDT) CSV and explore the data.
Tools: Excel ⟶ [Import Data]
Details and materials: videoId, videoTitle, publishedAt, viewCount, position

3. DATA PREPARATION
What's it for: Sort the list by number of views and take the first 10 videos as the sample for each query.
Tools: Excel ⟶ [Filter-Descending]

4. URLS CREATION
What's it for: The YDT CSV contains only the video IDs, but you need the URLs to download the videos.
Tools: Excel ⟶ =CONCATENA(E2;F2) (CONCATENATE in English-language Excel)
Details and materials: E2 ⟶ http://www.youtube.com/watch?v= ; F2 ⟶ videoId

5. RENAMING THE NEW COLUMN
What's it for: Keep track of the new column that holds all the video URLs.
Tools: Excel
Details and materials: videoUrl

6. DOWNLOAD VIDEOS
What's it for: Download the video sample quickly and automatically.
Tools: Python 3 ⟶ [PyTube3]
Details and materials: LINK TO PYTHON3 DOCUMENTATION; LINK TO PYTUBE3 DOCUMENTATION; LINK TO REPOSITORY AND STEP-BY-STEP GUIDE

7. COLLECTING VIDEOS IN A NEW FOLDER
What's it for: It is important for the next script that the folder contains only the downloaded videos.
Tools: No tool needed
Details and materials: Rename the videos inside the folder like this: vid1, vid2, vid3, vid4

8. FRAME EXTRACTIONS BY CHANGE OF SCENE
What's it for: The script extracts three frames at every scene change.
Tools: Python 3 ⟶ [PySceneDetect]
Details and materials: LINK TO PYSCENEDETECT DOCUMENTATION; LINK TO REPOSITORY AND STEP-BY-STEP GUIDE

9. COLLECTING ALL FRAMES IN THE FOLDER OF THE VIDEO THEY BELONG TO
What's it for: The detect.py script creates a subfolder for each video, in which it places all the detected frames.
Tools: No tool needed
Details and materials: Rename the frames inside the folder like this: frame1, frame2, frame3, frame4, frame5

10. CREATION OF A VECTOR SPACE WITH ALL THE FRAMES OF A SINGLE VIDEO AT ONCE
What's it for: Launch Pixplot once for each frames folder, one folder at a time, to organise the frames by their visual similarity.
Tools: Anaconda + Python3 + Pixplot
Details and materials: LINK TO INSTALL ANACONDA; LINK TO PIXPLOT DOCUMENTATION; LINK TO REPOSITORY AND STEP-BY-STEP GUIDE

11. CHOOSE THE VIEW ON PIXPLOT
What's it for: For this method, choose the view: cluster images by UMAP dimensionality reduction on a grid.
Tools: Pixplot

12. EXPORT THE VISUALISATION
What's it for: Take a screenshot or use the "Save as" command to obtain a static image on which to make annotations.

13. ANNOTATE THE VISUALISATION
What's it for: Highlight the thematic clusters.
Tools: Figma
Details and materials: LINK TO DOWNLOAD FIGMA
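
The data preparation and URL creation steps (sorting by views, keeping the first 10 videos, joining the base URL with each videoId) can also be sketched in Python instead of Excel. This is only an illustration, not part of the toolkit: it assumes the YDT export is comma-separated and uses the column names listed above, and the function name top_video_urls and the sample rows are made up for the example.

```python
import csv
import io

# Base URL taken from the URLS CREATION step.
BASE_URL = "http://www.youtube.com/watch?v="


def top_video_urls(csv_text, n=10):
    """Return the n most-viewed rows, each with an added videoUrl field.

    Assumes comma-separated text with videoId and viewCount columns.
    Rows with an empty viewCount are treated as having 0 views.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda r: int(r["viewCount"] or 0), reverse=True)
    top = rows[:n]
    for row in top:
        # Same result as concatenating the base URL cell with the videoId cell.
        row["videoUrl"] = BASE_URL + row["videoId"]
    return top


# Hypothetical sample rows, not real YDT data.
SAMPLE = """videoId,videoTitle,viewCount
aaa111,First video,1500
bbb222,Second video,98000
ccc333,Third video,420
"""

for row in top_video_urls(SAMPLE, n=2):
    print(row["videoUrl"], row["viewCount"])
```

With a real export, read the file contents into a string and pass them to the function with n=10 to reproduce the sample used in this method.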

METHODOLOGY

aim

In this type of analysis, we move from an overview of the frames in the whole collection to a view of the frames of each video. The aim is to compare the contents of two collections of videos based on the thematic clusters found in each video. Frame extraction for this purpose is based on scene change detection, so that the images to be analysed are taken only once per scene and long scenes do not produce duplicates. All frames of a video are analysed with Pixplot and arranged in a UMAP grid based on their visual similarity. The different thematic clusters in each grid are then highlighted with coloured areas using Figma.
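
To see why extraction by scene change avoids duplicates due to scene length, the idea can be sketched as follows. This is an illustration under stated assumptions, not the logic of PySceneDetect or of the detect.py script: the scene boundaries are hypothetical (start, end) frame numbers, and the function frames_per_scene is made up for the example. Every scene contributes at most three frames, however long it lasts.

```python
def frames_per_scene(scenes, per_scene=3):
    """Pick evenly spaced frame numbers inside each scene.

    `scenes` is a list of (start_frame, end_frame) pairs, end exclusive.
    Returns one list of frame numbers per non-empty scene.
    """
    picked = []
    for start, end in scenes:
        length = end - start
        if length <= 0:
            continue  # skip empty or malformed scenes
        count = min(per_scene, length)
        # Take the centre of each of `count` equal slices of the scene.
        picked.append(
            [start + (2 * i + 1) * length // (2 * count) for i in range(count)]
        )
    return picked


# Hypothetical scene boundaries: a long scene, a shorter one, a two-frame one.
scenes = [(0, 90), (90, 120), (120, 122)]
print(frames_per_scene(scenes))
# → [[15, 45, 75], [95, 105, 115], [120, 121]]
```

A fixed-interval extraction would instead return many near-identical frames for the 90-frame scene, which would inflate that scene's cluster in the grid.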

output

The final visualisation consists of a series of matrices, one per video, in which the thematic components discussed in the video are represented with different colours. This type of visualisation allows a summary comparison of the thematic contents of the videos in two different collections.