Blogs
Source Vitals
Source Attributes | Description |
---|---|
Data Collection Method | public RSS feeds |
Latency | Generally under 24 hours |
Geographic Coverage | global, over 170 languages |
Key Use Cases | product & market research; company reputation analysis; influencer identification |
Delivery Methods | full stream; filtered stream; search API |
Data Dictionary
Element | Description | Data Example |
---|---|---|
authoremail | The author's email address. | null |
authorname | Name of the author who submitted the post or comment. | Terra Trevor |
authorurl | The URL of the author's profile page. | |
bloghost | The platform that hosts the blog. | blogger |
bloghostid | For internal use only. | 1 |
blogid | An incrementing numeric ID that uniquely identifies a blog record. | 87896168 |
blogtitle | The title of the blog. | Terra Trevor |
category | The post category value specified by the author of the post | Writing |
content | The content of the post or comment, including any HTML markup. | "<div style=\"text-align: right;\"><div style=\"text-align: left;\"><a href=\"http://terratrevor.blogspot.com/\"><span style=\"font-family: \"georgia\" , \"times new roman\" , serif;\">Read Terra's Blog</span></a></div></div><div style=\"text-align: right;\"><div style=\"text-align: left;\"><span style=\"font-family: \"georgia\" , \"times new roman\" , serif;\"><br /></span></div></div><div style=\"text-align: left;\"><div style=\"text-align: right;\"><div style=\"text-align: left;\"><b><i><span style=\"font-family: \"georgia\" , \"times new roman\" , serif;\"><a href=\"http://terratrevor.blogspot.com/\">Writing, Reading and Living</a></span></i></b></div></div></div><div style=\"text-align: left;\"><span style=\"font-family: \"georgia\" , \"times new roman\" , serif;\">For me, writing is a way of reaching out to others, to people I don't know. I sit alone, in silence, but all that time I’m out there, connecting with whoever reads my words.</span></div>" |
country | The source country, as a two-letter ISO 3166-1 alpha-2 code. | us |
generator | The generator field from the feed, which describes the software that powers the blog. | |
guid | The MD5 hash of the post's permanent URL. | 52d0f0359b2f6ec49c69d499fa8c40b5 |
lang | The detected language of the post or comment. | en |
Link/href | The link to the feed | |
Link/rel | Always set to "alternate" | alternate |
Link/type | Indicates the feed type, given as a media type. | application/atom+xml |
mainurl | The URL of the blog homepage. | |
parseddate | The date and time when the post or comment was collected. | 2020-08-08T20:01:28 |
post | A parent element that contains a single post. | "post": { |
postid | A GUID generated on the fly when the post is extracted from the source. This value is not tracked internally (for example, it is not the same as a post ID retrieved via API). | db047f80-c858-465b-a01e-b54430d99478 |
postlink | The permanent URL of the post. | http://www.terratrevorauthor.com/2020/08/read-terras-blog-writing-reading-and.html |
providerid | A numeric ID for the blog content provider. | 0 |
pubdate | The publication date of the post, in UTC. | 2020-08-06T09:00:00 |
source | Name of the channel that the item came from. | null |
title | The title of the post or comment. | Reading, Writing and Living |
updated | The date the post was last updated, if it was updated (applies only to blogs with Atom feeds). | 2020-08-06T15:24:03 |
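The `guid` element is described above as the MD5 hash of the post's permanent URL. A minimal Python sketch of that computation follows. The function name `permalink_guid` is hypothetical, and hashing the UTF-8 bytes of the URL exactly as it appears in `postlink` is an assumption: any normalization applied to the URL before hashing is not documented here, so the result may not match a delivered `guid`.

```python
import hashlib


def permalink_guid(postlink: str) -> str:
    """Return the 32-character lowercase hex MD5 digest of a permalink.

    Assumption: the URL is hashed as UTF-8 bytes with no normalization.
    """
    return hashlib.md5(postlink.encode("utf-8")).hexdigest()


example = permalink_guid(
    "http://www.terratrevorauthor.com/2020/08/read-terras-blog-writing-reading-and.html"
)
print(len(example))  # 32, the same length as the guid examples in the table
```

A digest like this is mainly useful as a stable deduplication key when the same post is seen more than once.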
Sample Post Message
{
  "post": {
    "Link": {
      "rel": "alternate",
      "type": "application/atom+xml",
      "href": "http://differentpenproductions.blogspot.com/feeds/posts/default"
    },
    "bloghostid": "1",
    "providerid": "0",
    "bloghost": "blogger",
    "country": null,
    "generator": "Blogger v7.00 (http://www.blogger.com/)",
    "postid": "8570602b-2cc1-f482-73cd-6b7b0afda54d",
    "blogid": "68738433",
    "sourceguid": "7335a9866927c2d334a3e5a3965421e3",
    "title": "the five decades signpost",
    "blogtitle": "Different Pen",
    "mainurl": "http://differentpenproductions.blogspot.com/",
    "postlink": "http://differentpenproductions.blogspot.com/2024/08/the-five-decades-signpost.html",
    "content": "<p>I'm loved by God. I don't need additional love.</p><p>I carry faith (trust instead of worry), hope (joy and beauty) and love. </p><p>If I lose faith and even hope, love will still be there.</p><p>I don't have to live up to any norms and expectations, explain myself or submit to shame.</p><p>I will never marry. I'm set apart for something higher.</p><p>The kingdom of God is near and I'm bringing friends.</p>",
    "authorname": "Different Pen",
    "authoremail": null,
    "authorurl": "http://www.blogger.com/profile/02713516046183679147",
    "category": "de profundis,dreams,poet facts",
    "guid": "6eb91ce066a3963a9dfde608c1e06513",
    "source": null,
    "pubdate": "2024-08-09T12:30:00",
    "updated": "2024-08-09T12:30:41",
    "parseddate": "2024-08-10T01:57:14",
    "lang": "en"
  }
}
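As a rough illustration of consuming a post message, the Python sketch below reads a trimmed copy of the sample, splits `category` into individual values, and measures the gap between `pubdate` and `parseddate`. The helper names `parse_utc` and `collection_delay` are hypothetical. Treating `category` as a comma-separated list is inferred from the sample only, and treating `parseddate` as UTC is an assumption (the data dictionary states UTC only for `pubdate`).

```python
import json
from datetime import datetime, timedelta, timezone

# Trimmed copy of the sample post message (only the fields used here).
SAMPLE = """
{
  "post": {
    "title": "the five decades signpost",
    "category": "de profundis,dreams,poet facts",
    "pubdate": "2024-08-09T12:30:00",
    "parseddate": "2024-08-10T01:57:14",
    "lang": "en"
  }
}
"""


def parse_utc(value: str) -> datetime:
    """Parse a 'YYYY-MM-DDTHH:MM:SS' timestamp, assuming it is UTC."""
    return datetime.fromisoformat(value).replace(tzinfo=timezone.utc)


def collection_delay(post: dict) -> timedelta:
    """Time between publication (pubdate) and collection (parseddate)."""
    return parse_utc(post["parseddate"]) - parse_utc(post["pubdate"])


post = json.loads(SAMPLE)["post"]

# category may be null; fall back to an empty list in that case.
categories = [c.strip() for c in post["category"].split(",")] if post["category"] else []
delay = collection_delay(post)

print(categories)  # ['de profundis', 'dreams', 'poet facts']
print(delay)       # 13:27:14
```

For this sample the delay is a little over 13 hours, which is consistent with the stated latency of generally under 24 hours.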
Sample Comment Message
{
  "comment": {
    "bloghostid": "1",
    "bloghost": "blogger",
    "providerid": "0",
    "provider": "",
    "discoveryMethod": "2",
    "generator": "Blogger v7.00 (http://www.blogger.com/)",
    "sourcetype": "blogs",
    "sourceid": "88769412",
    "sourceurl": "http://lion-muthucomics.blogspot.com/",
    "sourceguid": "cb5272ad834374baa5e9d67b1ccecaa5",
    "sourcecrawled": "2020-02-02T00:44:30",
    "sourcelanguage": "ta",
    "sourcetitle": "Lion-Muthu Comics",
    "country": null,
    "postlink": "http://lion-muthucomics.blogspot.com/2020/02/blog-post.html",
    "postguid": "c1127db48f7b5e477405c068b4834686",
    "postpublished": "2020-02-01T18:46:00.000Z",
    "posttitle": "அன்போடு அண்ணாத்தே !!",
    "commentid": "08a6b353-540a-6ea1-2f13-8d3d16a7606e",
    "commentlink": "http://lion-muthucomics.blogspot.com/2020/02/blog-post.html?showComment=1580739049577#c3189066246105663132",
    "title": "மாண்ட்ரேக்கை வழிமொழின்றேன்",
    "content": "மாண்ட்ரேக்கை வழிமொழின்றேன்",
    "authorname": "R.வெங்கடேசன்",
    "authorurl": "https://www.blogger.com/profile/10499829746774793680",
    "authoremail": null,
    "pubdate": "2020-02-03T14:10:49",
    "parseddate": "2024-08-09T01:17:27",
    "lang": "ta"
  }
}
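The two samples differ in their top-level element (`post` versus `comment`) and in timestamp shape (`postpublished` carries milliseconds and a trailing `Z`, while `pubdate` does not). The Python sketch below shows one way a consumer might handle both. The helper names `unwrap` and `parse_timestamp` are hypothetical, and treating timestamps without an explicit offset as UTC is an assumption.

```python
import json
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse either timestamp shape seen in the samples; None stays None.

    Assumption: values without an explicit offset are UTC.
    """
    if value is None:
        return None
    # Drop a trailing "Z" so datetime.fromisoformat also works before Python 3.11.
    if value.endswith("Z"):
        value = value[:-1]
    return datetime.fromisoformat(value).replace(tzinfo=timezone.utc)


def unwrap(message: dict):
    """Return (kind, body), where kind is 'post' or 'comment'."""
    for kind in ("post", "comment"):
        if kind in message:
            return kind, message[kind]
    raise ValueError("message has neither a 'post' nor a 'comment' element")


# Trimmed copy of the sample comment message (only the fields used here).
raw = """
{"comment": {"postpublished": "2020-02-01T18:46:00.000Z",
             "pubdate": "2020-02-03T14:10:49",
             "lang": "ta"}}
"""

kind, body = unwrap(json.loads(raw))
comment_age = parse_timestamp(body["pubdate"]) - parse_timestamp(body["postpublished"])

print(kind)         # comment
print(comment_age)  # time between the parent post and the comment
```

Normalizing every timestamp to a timezone-aware value up front avoids errors when subtracting or comparing fields that arrive in different shapes.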