WordPress
Source Vitals
Source Attributes | Description |
---|---|
Data Collection Method | crawling feed sources |
Geographic Coverage | global; WordPress publishers and visitors produce thousands of new posts and comments every hour across millions of web sites. The source includes several million messages per day but not all web sites and content are available without direct feed access. |
Key Use Cases | brand reputation & product research |
Delivery Methods | The crawled feed is available as a standalone data stream. Also available as a filtered stream or via the Blogs/Search API. |
Understanding Targets & Objects
Depending on the "verb" (action type) of the message, the "target" and "object" JSON blocks are populated differently. When the message represents a new post, its parent in the WordPress data structure, the blog site, is represented in the "target" block and the post itself is the "object". When the message is a new comment, the parent post is represented in the "target" block and the comment itself is the "object". A comment is never represented in the "target" block, only as an "object" or child entity.
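The verb rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `describe_roles` is hypothetical, the message is assumed to be already parsed into a dict, and any verb other than "post" or "comment" is treated as unexpected.

```python
import json


def describe_roles(message: dict) -> dict:
    """Return what the "target" and "object" blocks represent for a message.

    Hypothetical helper: it only encodes the verb rule described above.
    """
    verb = message.get("verb")
    if verb == "post":
        # New post: the parent blog site is the target, the post is the object.
        return {"target": "blog", "object": "post"}
    if verb == "comment":
        # New comment: the parent post is the target, the comment is the object.
        return {"target": "post", "object": "comment"}
    raise ValueError(f"unexpected verb: {verb!r}")


# A trimmed, illustrative message containing only the fields used here.
raw = '{"verb": "comment", "target": {"objectType": "article"}, "object": {"objectType": "comment"}}'
roles = describe_roles(json.loads(raw))
print(roles)
```

Branching on "verb" first keeps later field lookups simple, because fields such as "target - feed" or "object - wpCommentId" are only present for one of the two message types.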
Data Dictionary
Field | Description | Example | Data Type |
---|---|---|---|
verb | The type of action the message represents: "post" or "comment" | "post" | string |
published | Timestamp when the action was taken, in UTC | "2024-08-26T18:08:21Z" | string |
displayName | Depending on the verb type, the displayName field in the top-level json object may contain a blog post title or comment text. Note that displayName also appears in the "target" and "object" blocks. | "Cooking ticket marathi" | string |
content | Provided in the top-level json object when the message is a comment, and contains the comment text | "<p>Από πότε τα εθνόσημα μπαίνουν σε προσκλήσεις? Επίσης θα μας ενδιέφερε να μάθουμε μιας και βλέπουμε από αριστερά την περιφέρεια και δεξιά τον δήμο και όλα είναι σε πλήρη διαφάνεια, έχει χρηματοδοτηθεί η συγκεκριμένη εκδήλωση? Αν ναι από ποιόν και με τι ποσό. Αν όχι μια αρνητική απάντηση τότε.</p>" | string |
object - id | A post or comment ID | 592654 | int |
object - objectType | "article" when the message is a post; "comment" when the message is a comment; "blog" when the message indicates the deletion of a blog site. Blog site creation events are not sent. | "article" | string |
object - published | Published timestamp pertaining to the object, in UTC | "2024-08-26T18:08:21Z" | string |
object - displayName | Contains the title of a post when the message represents a post. When the message is a comment, contains the title of the post preceded by the text "Comment on" and followed by the display name of the commenter. | "Instant Pot Teacher" | string |
object - wpcom:post_id | ID for a post | 592654 | int |
object - wpcom:post_type | Can be "post" or "page". Messages which appear in a blog format are "posts"; "pages" are content pages on a Wordpress site which are outside the blog format and typically more static. | "post" | string |
object - summary | A short summary of a post, where provided by the author. | "नारळीभात #नारळाच्यावड्या #cookingticketmarathi #naralibhat Cooking ticket marathi Special Masale …" | string |
object - updated | The timestamp when the message was updated, in UTC | "2024-08-26T18:08:21Z" | string |
object - content | The content of a post in HTML markup | "Instant Pot Pressure Cooker How To Easy Recipe Videos" | string |
object - permalinkUrl | A permalink pointing to a post | | string |
object - url | Provided when the message is a comment and contains a link to the comment | | string |
object - wpCommentId | An ID for a comment | "686375" | string |
target - wpcom:blog_id | An ID for a blog | "168801090" | string |
object - tags - url | The URL for a page on wordpress.com where posts with a particular tag across different blogs are linked | "http://en.wordpress.com/tag/review" | string |
object - tags - objectType | Either "category" or "tag". A "category" is a broad classification of posts. A "tag" represents a more detailed topic discussed within one or more posts. | "tag" | string |
object - tags - displayName | The display name of a tag on a post. Provided when the object is a post, not a comment. | "review" | string |
object - inReplyTo - url | When the message represents a comment that is made in reply to another comment, this field contains the URL of the parent comment. We are no longer able to retrieve this field. | null | string |
provider - url | A URL which indicates whether the message is a post or a comment, and whether it comes from wordpress.com (ending in /posts.json or /comments.json) or wordpress.org, i.e. a Wordpress-powered site on a domain other than wordpress.com (ending in /posts.org.json or /comments.org.json). | "http://xmpp.wordpress.com:8008/posts.org.json" | string |
generator - url | always "http://www.wordpress.com" | "http://www.wordpress.com" | string |
generator - objectType | always "service" | "service" | string |
generator - displayName | always "WordPress" | "WordPress" | string |
socialgist_metadata - language | The language in which a post was written, as detected by Socialgist | "en" | string |
actor - objectType | always "person" | "person" | string |
actor - displayName | The display name for a Wordpress user, i.e. author | "Cooking ticket marathi" | string |
actor - url | The URL representing the user or site that created the content is no longer accessible. The field is populated with the author name instead. | "Cooking ticket marathi" | string |
actor - wpEmailMd5 | Md5 hash of the user's email address, where provided. | null | string |
actor - id | A user id is not available. | null | int |
target - id | Provided for wordpress.com content. When the message is a post, contains the blog ID prefaced with "object:wordpress.com:". In the example comment below, where the target is a post, the post ID follows the blog ID, separated by a colon. | "object:wordpress.com:168801090" | string |
target - url | Contains the URL of the blog site when the message is a post; contains the URL of the parent post when the message is a comment | | string |
target - lang | The language in which a post was written, as detected by Wordpress | "en" | string |
target - objectType | "article" when the message is a comment, indicating a parent post/article; "blog" when the message is a post/article, indicating the parent blog site | "blog" | string |
target - wpcom:blog_id | Numeric ID for a blog site. | "168801090" | string |
target - summary | When the message is a post, contains a summary of the blog site's purpose and content provided by the author where available. When the message is a comment, contains a summary of the parent post where available. | "Instant Pot Pressure Cooker How To Easy Recipe Videos" | string |
target - displayName | When the message is a post, contains the title of the blog site. When the message is a comment, contains the title of the parent post. | "Instant Pot Teacher" | string |
target - feed | Contains an RSS feed URL for a blog site. Provided on posts, not comments. | | string |
target - ads_enabled | A Boolean indicating whether a blog site accepts advertisements. Provided when the message is a post. | FALSE | string |
target - wpcom:post_id | Numeric ID for a post | 592654 | int |
target - wpcom:post_type | A content type indicator, typically "post" but also sometimes "page" (a static web page outside of the blog format), "question", "product" indicating a product page, or "video" | "post" | string |
target - author - id | Numeric ID of a post author | 168801090 | int |
target - author - url | Profile URL for an author, typically a Gravatar URL | | string |
target - author - objectType | always "person" | "person" | string |
target - author - wpEmailMd5 | Md5 hash of the user's email address, where provided | null | string |
target - author - displayName | Display name of the author of a post | "Cooking ticket marathi" | string |
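As an illustration of the "provider - url" row, the sketch below maps the four documented path endings to a message kind and hosting type. The function name `classify_provider` and the returned labels are hypothetical, and the code assumes the URL always ends in one of the four endings listed in the table.

```python
from urllib.parse import urlparse

# The four path endings described in the "provider - url" row.
PROVIDER_ENDINGS = {
    "/posts.json": ("post", "wordpress.com"),
    "/comments.json": ("comment", "wordpress.com"),
    "/posts.org.json": ("post", "wordpress.org"),
    "/comments.org.json": ("comment", "wordpress.org"),
}


def classify_provider(url: str) -> tuple[str, str]:
    """Return (message kind, hosting) for a provider URL.

    Hypothetical helper; raises ValueError for any other path ending.
    """
    path = urlparse(url).path
    for ending, result in PROVIDER_ENDINGS.items():
        if path.endswith(ending):
            return result
    raise ValueError(f"unrecognised provider URL: {url!r}")


kind, hosting = classify_provider("http://xmpp.wordpress.com:8008/posts.org.json")
print(kind, hosting)
```

Note that fields described as "provided for wordpress.com content", such as "target - id", may be absent when the hosting type is wordpress.org, so a check like this can guard those lookups.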
Output Examples
An example post in the Wordpress crawled stream:
{
  "published": "2024-08-26T18:08:21Z",
  "displayName": "Cooking ticket marathi",
  "verb": "post",
  "generator": {
    "objectType": "service",
    "url": "http://www.wordpress.com",
    "displayName": "Wordpress"
  },
  "provider": {
    "url": "http://xmpp.wordpress.com:8008/posts.json"
  },
  "actor": {
    "objectType": "person",
    "url": null,
    "id": null,
    "displayName": "Cooking ticket marathi",
    "wpEmailMd5": null
  },
  "target": {
    "objectType": "blog",
    "displayName": "Instant Pot Teacher",
    "summary": "Instant Pot Pressure Cooker How To Easy Recipe Videos",
    "url": "https://instantpotteacher.com/",
    "feed": "https://instantpotteacher.com/feed/",
    "id": "object:wordpress.com:168801090",
    "wpcom:blog_id": "168801090",
    "lang": "en",
    "ads_enabled": "true"
  },
  "object": {
    "objectType": "article",
    "published": "2024-08-26T18:08:21Z",
    "summary": "नारळीभात #नारळाच्यावड्या #cookingticketmarathi #naralibhat Cooking ticket marathi Special Masale …",
    "updated": "2024-08-26T18:08:21Z",
    "permalinkUrl": "https://instantpotteacher.com/%e0%a4%a8%e0%a4%be%e0%a4%b0%e0%a4%b3%e0%a5%80-%e0%a4%ad%e0%a4%be%e0%a4%a4-%e0%a4%ac%e0%a4%a8%e0%a4%b5%e0%a4%a3%e0%a5%8d%e0%a4%af%e0%a4%be%e0%a4%9a%e0%a5%80-%e0%a4%b8%e0%a4%b0%e0%a5%8d%e0%a4%b5/",
    "id": 592654,
    "displayName": "Instant Pot Teacher",
    "content": "Instant Pot Pressure Cooker How To Easy Recipe Videos",
    "wpcom:post_id": 592654,
    "wpcom:post_type": "post",
    "tags": null
  },
  "wpPostRelatedLinks": [],
  "socialgist_metadata": {
    "language": "en"
  }
}
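A post message like the one above could be flattened into a simple record as sketched below. The helper names `parse_utc` and `flatten_post` are hypothetical. The sketch assumes timestamps always use the `YYYY-MM-DDTHH:MM:SSZ` form shown in the example, and that "tags", when not null, is a list of objects with a "displayName" key, which the data dictionary suggests but the example (where "tags" is null) does not confirm.

```python
import json
from datetime import datetime, timezone


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as "2024-08-26T18:08:21Z" as timezone-aware UTC."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def flatten_post(message: dict) -> dict:
    """Pull a few commonly used fields out of a post message (hypothetical helper)."""
    target = message["target"]  # the parent blog site
    obj = message["object"]     # the post itself
    return {
        "blog_id": target["wpcom:blog_id"],
        "blog_title": target.get("displayName"),
        "post_id": obj["wpcom:post_id"],
        "permalink": obj.get("permalinkUrl"),
        "published": parse_utc(obj["published"]),
        # "tags" may be null; the list-of-objects shape is an assumption.
        "tags": [tag["displayName"] for tag in (obj.get("tags") or [])],
    }


# Trimmed copy of the example post, keeping only the fields used here.
raw = """
{
  "verb": "post",
  "target": {"displayName": "Instant Pot Teacher", "wpcom:blog_id": "168801090"},
  "object": {"published": "2024-08-26T18:08:21Z", "wpcom:post_id": 592654, "tags": null}
}
"""
record = flatten_post(json.loads(raw))
print(record["blog_id"], record["post_id"], record["published"].isoformat())
```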
An example comment in the Wordpress comment stream:
{
  "content": "<p>Από πότε τα εθνόσημα μπαίνουν σε προσκλήσεις? Επίσης θα μας ενδιέφερε να μάθουμε μιας και βλέπουμε από αριστερά την περιφέρεια και δεξιά τον δήμο και όλα είναι σε πλήρη διαφάνεια, έχει χρηματοδοτηθεί η συγκεκριμένη εκδήλωση? Αν ναι από ποιόν και με τι ποσό. Αν όχι μια αρνητική απάντηση τότε.</p>",
  "published": "2024-08-26T09:28:21Z",
  "displayName": "Από: Ανώνυμος",
  "verb": "comment",
  "generator": {
    "objectType": "service",
    "url": "http://www.wordpress.com",
    "displayName": "Wordpress"
  },
  "provider": {
    "url": "http://xmpp.wordpress.com:8008/posts.json"
  },
  "actor": {
    "objectType": "person",
    "displayName": "Ανώνυμος",
    "wpEmailMd5": null,
    "url": null,
    "id": null
  },
  "target": {
    "objectType": "article",
    "displayName": "Από: Ανώνυμος",
    "summary": "Από πότε τα εθνόσημα μπαίνουν σε προσκλήσεις? Επίσης θα μας ενδιέφερε να μάθουμε μιας και βλέπουμε από αριστερά την περιφέρεια και δεξιά τον δήμο και όλα είναι σε πλήρη διαφάνεια, έχει χρηματοδοτηθεί η συγκεκριμένη εκδήλωση? Αν ναι από ποιόν και με τι ποσό. Αν όχι μια αρνητική απάντηση τότε.",
    "author": {
      "objectType": "person",
      "url": "Ανώνυμος",
      "id": "Ανώνυμος",
      "displayName": "Ανώνυμος",
      "wpEmailMd5": null
    },
    "url": "https://kozan.gr/archives/571492#comment-686375",
    "id": "object:wordpress.com:46755246:571492",
    "wpcom:blog_id": "46755246",
    "lang": "el",
    "wpcom:post_id": "571492",
    "wpcom:post_type": "post",
    "wpCommentCount": 0
  },
  "object": {
    "objectType": "comment",
    "displayName": "Σχόλια σε: Πρόσκληση: 11η Γιορτή Πατάτας Καπνοχωρίου",
    "published": "2024-08-26T09:28:21Z",
    "url": "https://kozan.gr/archives/571492#comment-686375",
    "id": "686375",
    "wpCommentId": "686375",
    "inReplyTo": null
  },
  "socialgist_metadata": {
    "language": "el"
  }
}
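The "target - id" values in the two examples can be split into their numeric parts as sketched below. The helper name `parse_target_id` is hypothetical. Reading the second component as the post ID is an inference from the comment example, where it matches "wpcom:post_id", not a documented guarantee. IDs are kept as strings because the examples show the same kind of ID both as a number (592654) and as a string ("571492").

```python
PREFIX = "object:wordpress.com:"


def parse_target_id(target_id: str) -> dict:
    """Split a target id into blog_id and, when present, post_id (both as strings).

    Hypothetical helper; the blog_id:post_id layout is inferred from the examples.
    """
    if not target_id.startswith(PREFIX):
        raise ValueError(f"unexpected target id: {target_id!r}")
    parts = target_id[len(PREFIX):].split(":")
    result = {"blog_id": parts[0]}
    if len(parts) > 1:
        # Seen on comment messages, where the target is the parent post.
        result["post_id"] = parts[1]
    return result


comment_target = parse_target_id("object:wordpress.com:46755246:571492")
post_target = parse_target_id("object:wordpress.com:168801090")
print(comment_target)
print(post_target)
```

When the explicit "wpcom:blog_id" and "wpcom:post_id" fields are present, they are the more direct source; a parser like this is mainly useful as a cross-check or fallback.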