YouTube

YouTube videos and comments are discovered and collected through YouTube channel subscriptions and generic keyword searches.

  • Average volumes - 885,833 new videos & 23,677,166 new comments.

 

Streaming JSON Schema Mapping

Videos field name

Field description

Data example

video

The parent element that contains information about an individual video

"video": {

id

An ID generated by the crawler to uniquely identify each video. 

yt.FauOVXgJ0zU

siteid

A unique text value that identifies the video site (e.g. youtube).

youtube

site_videoid

An identifier created by the video site to uniquely identify each video

FauOVXgJ0zU

title

The title of the video.

maaf jelek

description

The description of the video. Not always included.

 

thumb_url_code

The URL of the video's thumbnail image.

https://i.ytimg.com/vi/FauOVXgJ0zU/default.jpg

tags

Tags that describe the content of the video. Not always included.

[]

category

The category assigned to the video.

People and Blogs

published

The date/time the video was published (UTC)

2024-08-13T13:11:01Z

crawled

The date/time the video was harvested (UTC)

2024-08-16T14:30:11Z

duration

The length of the video in seconds

17

videourl

The URL where the video was found.

 

 

https://www.youtube.com/watch?v=FauOVXgJ0zU

author

The parent element that contains author information

"author": {

author/name

The name of the author who submitted the video

Prinsca gaming

author/site_authorid

An identifier created by the video site to uniquely identify each author

UCwmB2IO7xlu4fmTdH6Ul36A

author/authorurl

The URL to the author's profile page

http://youtube.com/channel/UCwmB2IO7xlu4fmTdH6Ul36A

lang

The language code that identifies the language used in the video.

id

langid

The Socialgist internal mapped ID for this language.

40

comment

The parent element that contains information about an individual comment

"comment": {

id

An ID generated by the crawler to uniquely identify each video or comment. 

yt.Ugxb8X5GgmyTo0M24FF4AaABAg

videoid

The ID of the Video that the comment is associated with. This is used by BoardReader as the Thread ID.

yt.9IHwqdz8Xhw

siteid

A unique text value that identifies the video site (e.g. youtube).

youtube

site_commentid

An identifier created by the video site to uniquely identify each comment

UgxPGSMDXNWXumpJw8d4AaABAg

title

Some video sites include a title for each comment. In most cases, the title is simply the first part of the comment text.

অতীত এর ভুল থেকে শিক্ষা নিবেন। এ দেশ হতে আ লিগ শব্দ টা চিরতরে

content

The text of the comment.

অতীত এর ভুল থেকে শিক্ষা নিবেন। এ দেশ হতে আ লিগ শব্দ টা চিরতরে কবর দিতে হবে জনগনের কাছে আইডল হতে হলে পরানতিক পর্যায়ে রিদয়ের ভিতর ডুকতে হলে জিয়াউর রহমানের আদর্শ উপলব্ধি করবেন। নির্বাচন নিয়ে মাথা…..

published

The date/time the comment was published (UTC)

2024-08-15T18:56:03Z

crawled

The date/time the comment was harvested (UTC)

2024-08-16T13:50:42Z

videourl

The URL where the video was found.

 

 

commentsurl

A URL that displays all the comments for the video. For YouTube this is limited to the most recent 1,000 comments.

 

 

author

The parent element that contains information about the author of the comment

"author": {

author/name

The comment author’s name or video account name

@kingtexzone5051

author/site_authorid

An identifier created by the video site to uniquely identify each author

UCUb5_lzoVTJ67OienqU5FLQ

author/authorurl

The URL of the comment author's profile page. In some cases this can also be the author of the video.

http://www.youtube.com/@kingtexzone5051

author/profile_picture

The author profile avatar or picture.

https://yt3.ggpht.com/ytc/AIdro_kvrMWUD2Ptl9SASYrYT9RvDS24j2RSUw7ieikxndaDiHo=s48-c-k-c0x00ffffff-no-rj

lang

The language code that identifies the language used in the comment.

bn

langid

The Socialgist internal mapped ID for this language.

8
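The published and crawled fields above are UTC timestamps of the form 2024-08-13T13:11:01Z. The following is a minimal sketch, not part of the product, of how a consumer might parse them and measure how long after publication a record was harvested. The helper names parse_utc and crawl_delay are hypothetical, and the sketch assumes every timestamp follows the format shown in the data examples.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(ts: str) -> datetime:
    """Parse a timestamp such as 2024-08-13T13:11:01Z into an aware UTC datetime."""
    # Assumption: timestamps always use second precision and a trailing "Z".
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def crawl_delay(record: dict) -> timedelta:
    """Time between publication on the site and harvesting by the crawler."""
    return parse_utc(record["crawled"]) - parse_utc(record["published"])


# Values taken from the video data examples in the table above.
video = {
    "published": "2024-08-13T13:11:01Z",
    "crawled": "2024-08-16T14:30:11Z",
    "duration": "17",
}

delay = crawl_delay(video)
# The sample video message carries duration as a string, so convert before doing arithmetic.
duration_seconds = int(video["duration"])
print(delay, duration_seconds)
```

The same helpers apply unchanged to comment records, since they carry the same published and crawled fields.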

 

 

 

SAMPLE MESSAGE TYPES

Video

{ "video": { "id": "yt.MK6nBKODkVE", "siteid": "youtube", "site_videoid": "MK6nBKODkVE", "title": "🟢 Thomas train exe vs Sonic the headgehog exe vs Siren Head vs Spider House Head 🌟 Who is best?", "description": "🟢 Thomas train exe vs Sonic the headgehog exe vs Siren Head vs Spider House Head 🌟 Who is best? Tiles Hop EDM\n\n🎥🔥 Thanks for watching! Don't forget to leave a comment and be sure to watch to the end! 🙌💬🌟\n\n\n🎶 Welcome to the mesmerizing world of Adventure TilesHop! 🌟\n\nMy content uses some images and songs from other creators and brands to create music gaming videos for relaxation and entertainment. I have used these images and songs based on the fair use guidelines of Section 107 of the Copyright Act..\nThis content is intended for recreational and entertainment purposes only.\nThis is a music game. Please use headphones for the best experience.\n\nLike 👍\nComment ✍️\nSubscribe ✅\nShare 🙏\n\nFor more gaming content subscribe to Adventure TilesHop\n\n#tileshop #tileshopeveryday #coffindance #choochoocharles #sirenhead #mcqueeneater #thomastrainexe", "thumb_url_code": "https://i.ytimg.com/vi/MK6nBKODkVE/default_live.jpg", "tags": [ "tiles hop", "tiles hop every day", "choo choo charles tiles hop", "lightning mcqueen tiles hop", "skibidi toilet tiles hop", "thomas train tiles hop", "spider thomas tiles hop", "coffin dance tiles hop", "tiles hop edm rush", "car eater tiles hop", "sonic tiles hop", "sonic exe tiles hop", "death sonic tiles hop", "siren head tiles hop", "eater tiles hop", "tiles hop song", "coffin dance", "sonic the hedgehog tiles hop", "sonic exe coffin dance", "siren head coffin dance", "house head tiles hop", "thomas train exe" ], "category": "Gaming", "published": "2024-08-16T11:28:45Z", "crawled": "2024-08-16T15:47:20Z", "duration": "3143", "videourl": "https://www.youtube.com/watch?v=MK6nBKODkVE", "author": { "name": "Adventure TilesHop", "site_authorid": "UCiW8EXSscOMq2pQ-lGsYMsw", "authorurl": 
"http://youtube.com/channel/UCiW8EXSscOMq2pQ-lGsYMsw" }, "lang": "en", "langid": "22" } }

 

Comment

{ "comment": { "id": "yt.UgxdiWsNRHBmpDlNggl4AaABAg.A79Qt_KSQPfA79h8Tq71DS", "videoid": "yt.s7BjsDqHsto", "siteid": "youtube", "site_commentid": "UgxdiWsNRHBmpDlNggl4AaABAg.A79Qt_KSQPfA79h8Tq71DS", "title": "How to tell I know nothing about motorcycle without saying I know nothing about motorcycle👆🏼", "content": "How to tell I know nothing about motorcycle without saying I know nothing about motorcycle👆🏼", "published": "2024-08-15T12:25:45Z", "crawled": "2024-08-16T15:47:51Z", "videourl": "https://www.youtube.com/watch?v=s7BjsDqHsto", "commentsurl": "https://www.youtube.com/watch?v=s7BjsDqHsto&lc=UgxdiWsNRHBmpDlNggl4AaABAg.A79Qt_KSQPfA79h8Tq71DS", "author": { "name": "@surendarmohan9878", "site_authorid": "UC-4A4p2rBFtcktdQfWLSgGg", "authorurl": "http://www.youtube.com/@surendarmohan9878", "profile_picture": "https://yt3.ggpht.com/ytc/AIdro_kx74ZYm1k0KoPfPQzXHltKaMnmZjeOLGI9rsIGnAgMQ0Ro=s48-c-k-c0x00ffffff-no-rj" }, "lang": "en", "langid": "22" } }
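Each streaming message above wraps its payload in a single top-level key, either "video" or "comment". As a rough sketch under that assumption, a consumer could route messages by that key and group comments with their video using the videoid field. The function names classify_message and thread_id are hypothetical, and the messages below are abbreviated versions of the samples above.

```python
import json


def classify_message(raw: str) -> tuple[str, dict]:
    """Return the message type ("video" or "comment") and its payload."""
    message = json.loads(raw)
    for kind in ("video", "comment"):
        if kind in message:
            return kind, message[kind]
    raise ValueError("message has neither a 'video' nor a 'comment' element")


def thread_id(kind: str, payload: dict) -> str:
    """ID that ties a video and its comments together.

    A video is identified by its own id; a comment points at its video through videoid.
    """
    return payload["id"] if kind == "video" else payload["videoid"]


# Abbreviated copies of the sample messages (most fields omitted for brevity).
raw_video = '{"video": {"id": "yt.MK6nBKODkVE", "siteid": "youtube"}}'
raw_comment = (
    '{"comment": {"id": "yt.UgxdiWsNRHBmpDlNggl4AaABAg.A79Qt_KSQPfA79h8Tq71DS", '
    '"videoid": "yt.s7BjsDqHsto", "siteid": "youtube"}}'
)

for raw in (raw_video, raw_comment):
    kind, payload = classify_message(raw)
    print(kind, thread_id(kind, payload))
```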

 

REST API JSON Schema Mapping

 

Element or Attribute Name

Description

Included in Response? (based on mode parameter)

Basic

Full

Id

Unique ID of the Video (or comment)

x

x

Title

The Video's title. For Video comments, the title is the same as the comment's text

 

x

Text

Video description or the text of the comment

x

x

Text/@truncated

When present, a value of 'true' indicates that the contents in the Text element have been abbreviated.

x

x

TitleHtml

The HTML representation of the Post's title (or Comment's text). This element is only returned when the body parameter is set to html or both.

 

x

TextHtml

The HTML representation of the Post. This element is only returned when the body parameter is set to html or both.

x

x

TextHtml/@truncated

When present, a value of 'true' indicates that the contents in the TextHtml element have been abbreviated.

x

x

ThreadId

A unique identifier of the Thread (i.e. a video post and its related comments)

x

x

VideoTitle

Same as Title. Deprecated and included only for backwards compatibility.

 

x

Published

The date/time the Post was published (GMT)

x

x

Inserted

The date/time the Post was inserted into the BoardReader database (GMT)

x

x

Crawled

The date/time the Post was harvested by our crawler (GMT).

 

x

AuthorInfo/Id

The unique identifier for the author

 

x

AuthorInfo/Url

A link to the author's profile

 

x

AuthorInfo/Name

The author's name

 

x

AuthorInfo/SiteAuthorId

The video site's unique identifier for the author

 

x

AuthorInfo/FirstName

The author's first name

 

x

AuthorInfo/LastName

The author's last name

 

x

Url

The URL where the Video or Comment was found (i.e. permalink page).

x

x

ThumbnailUrl

The URL of the image with Video thumbnail

 

x

Language

The language of the video or comment text

x

x

Category

Category of the Video as it appeared on the source site

 

x

Tags

A string containing the terms used by the video author to "tag" the post.

 

x

Duration

Duration of the video in seconds

 

x

ExtKey

A reference ID for the site. 

 

 

IsComment

A Boolean value that specifies whether a Post is a Comment. Values are '0' (false) and '1' (true).

 

x

SiteInfo/Id

The unique identifier for the Video site where the Post or Comment appeared

 

x

SiteInfo/Name

The code name of the Video site used by our harvesting system

 

x

SiteInfo/SiteUrl

The base URL of the Video site.

 

x

PostSize

The size of the Post in characters

x

x

VideoStats

The VideoStats object contains statistics about the video. This object is only returned for API keys that have Premium Video enabled.

 

x

VideoStats/FavouriteCount

The favorite count at the time the video’s stats were checked

 

VideoStats/NumLikes

The number of likes at the time the video’s stats were checked

 

VideoStats/NumDislikes

The number of dislikes at the time the video’s stats were checked

 

x

VideoStats/CommentCount

The number of comments as reported by the video site at the time the video’s stats were checked.

 

x

VideoStats/StatsUpdated

The date/time when the video's stats were last checked. The value is specified in ISO 8601 (YYYY-MM-DDThh:mm:ss.Z) format.

 

x

CommentsInThread

Obsolete

x

x
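The Basic and Full columns above determine which elements a response carries. The sketch below illustrates one way a client might check a basic-mode record against that table and interpret the IsComment flag. It assumes, purely for illustration, that each post arrives as a flat dictionary keyed by the element names in the table; the actual response layout is not shown in this document, and the names BASIC_FIELDS, missing_basic_fields, and is_comment are hypothetical.

```python
# Elements marked "x" in the Basic column of the table above.
# CommentsInThread is marked obsolete, so it is not checked here.
# TextHtml is handled separately because it depends on the body parameter.
BASIC_FIELDS = frozenset(
    {"Id", "Text", "ThreadId", "Published", "Inserted", "Url", "Language", "PostSize"}
)


def missing_basic_fields(post: dict, html_requested: bool = False) -> list[str]:
    """List the basic-mode elements that are absent from a post record."""
    expected = set(BASIC_FIELDS)
    if html_requested:
        # TextHtml is only present when body=html or both.
        expected.add("TextHtml")
    return sorted(expected - set(post))


def is_comment(post: dict) -> bool:
    """Interpret the IsComment flag, whose documented values are '0' and '1'."""
    # IsComment is only listed for full mode, so treat a missing flag as "not a comment".
    return str(post.get("IsComment", "0")) == "1"


# Illustrative record; the Language value is a placeholder.
post = {
    "Id": "71330637727320943",
    "Text": "Charli xcx - party 4 u (official video)",
    "ThreadId": "yt.agu22bqGHto",
    "Published": "2025-05-15 16:00:04",
    "Inserted": "2025-05-15 18:54:39",
    "Url": "https://www.youtube.com/watch?v=agu22bqGHto",
    "Language": "en",
}

print(missing_basic_fields(post))
print(is_comment(post))
```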

Mapping with example

Videos Field Name

Field Description

Data Example

Id

Unique ID used in API to identify video

71330637727320943

Title

Title of the video on YouTube

Charli xcx - party 4 u (official video)

Text

Description of video created by author

Charli xcx - party 4 u (official video)\nStream: https://charlixcx.lnk.to/hifn-5yearsAY\n\nDirector: Mitch Ryan\n \nManagement: Sam Pringle, Twiggy Rowley, Brandon Creed\n \nCreative Director: Imogene Strauss \n\nAtlantic Records \nVP, Creative: Andrew Reid\nEVP, Marketing: Marisa Aron\nCoordinator Creative: Caroline DeFranco\n \nProduction Company: Cadence Films\nEP/Co-Founder: Lorenzo Ragioneri\nEP/HOP: Jeff Sommar\nProducer: Sarah Park\n \nDirector of Photography: Ben Carey\n1st AC: Pancho Ortiz\n2nd AC: Fred Porras\nDIT: Gianennio Salucci\n \nProduction Designer: Miranda Lorenz\nArt Director: Matt Toth\nProp Master: Jillian Oliver\nSet Decorator: Piper Reilly\nLeadman: June Castillo\nArt PA: Henri Wuilloud \nConstruction: Superior Scenery\nSPFX Coord: Omar Torres\nSPFX: Arnold Peterson\nSPFX: Jeremy Hays\nSPFX: Cody Canel\n \nProduction Supervisor: Nathan Israel\nCoordinator: Denise Nuqui\n1st AD: Josh Montes\n2nd AD: David Henriquez\n \nGaffer: Isaac Marziali\nACLT: Em Shafer \nSLT: Matt Hall\nSLT: James \"Ryan\" Copeland\nSLT: Aron Trejo\nSLT: Harrison Segal\n \nKey Grip: Jake Reeder\nBBG: Charlie McGlinsky\nGrip: Joe Lepp\nGrip: Cale Nichols\nGrip / Driver: Nick Binnette\n \nLocation Manager: Travis Beck\nLocations: Matthew Mehl\nPA - Office: Connor Levett\nPA - Prod Truck: Jesus Arroyo\nPA - Camera Truck: Jeff Cerritos\nPA - AD: Aiden Delgado\nPA - Set: Rogelio Vargas\nPA - Set: Jack Lind\nPA - Set: India Whatley\nPA - Pass Van: Noah Vasquez\nPA - Pass Van: Mateo Caldas\nVTR: Gennadi Balitski\nPlayback: Ignacio Martinez\nCraft Service: Will Schwartz\nMedic: Mike Smith\nMotorhome - Talent Verde: Don Clark\nMotorhome - Production Bravo 35: Josh May\nWater Truck: Mike Bossen\n \nStylist: Chris Horan\nHair Stylist: Matt Benns\nMake Up: Yasmin\n \nEditing Company: Cabin Editing Company\nEditor: Dylan Edwards \nAssistant Editors: Ryan Andrus \nAssistant Editors: Diego Astorga\nEdit Producer: Ginna Schilling \nHead of Production: Michelle Dorsch \nExecutive Producer: Britt Carson \nDirector of Production: Liz Lydecker\nManaging Director: Kim Christensen\nManaging Partner: Carr Schilling \n \nColor House: Ethos Studio \nColorist: Dante Pasquinelli \nProducer: Nat Tereshchenko \nColor assists: Annie Cater \nColor assists: Alexandra Makarenko\nHead of Production: Natasha Sattler \nMD/EP: Eliana Carranza-Pitcher \nFounder/EP: James Drew\n\nText 1-310-861-2831\n\nhttps://charlixcx.com/\nhttps://www.tiktok.com/@charlixcx\nhttps://www.instagram.com/charli_xcx\nhttps://twitter.com/charli_xcx\nhttps://www.youtube.com/@officialcharlixcx\nhttps://facebook.com/charlixcxmusic\n\n#Charlixcx#party4u #officialvideo

ThreadId

Id used to distinguish comments related to this video

yt.agu22bqGHto

VideoTitle

Title of the video on YouTube

Charli xcx - party 4 u (official video)

Published

Timestamp of when video was published

2025-05-15 16:00:04

Inserted

Timestamp of when video was inserted to index

2025-05-15 18:54:39

Crawled

Timestamp of when video was crawled by system

2025-05-15 18:54:39

AuthorInfo

Parent element that contains author information

"AuthorInfo": {

Id