Source Attributes
Field Definitions
Field Name | Field Description | Data Example |
---|---|---|
Description | Description of the data source and its relevance. | Weibo, often referred to as the 'Chinese Twitter,' is a leading social media platform in China, launched by Sina Corporation in 2009. It provides insights into consumer behavior, sentiment, and trends within the Chinese market. |
Data Collection Method | The method by which data is collected. | public web scraping |
Key Use Cases | The primary applications or uses of the data. | understanding public attitudes about news events and issues |
Delivery Methods | The methods by which data is delivered to the customer. | Full stream; filtered stream |
Keyword Payload Fields
Weibo Posts
Field Name | Field Description | Data Example |
---|---|---|
id | The unique identifier associated with the Weibo post. | |
mblogid | The unique microblog identifier used in Weibo URLs. | |
text | The content of the Weibo post, which may include text, hashtags, and HTML links. | |
created_at | The date and time when the Weibo post was originally published (ISO 8601 format). | |
source | The device or platform used to post the Weibo message (e.g., iPhone, Android, Web). | |
favorited | Indicates whether the Weibo post has been favorited by the authenticated user (true or false). | |
reposts_count | The number of times the post has been reposted at the time of data collection. | |
comments_count | The number of comments on the post at the time of data collection. | |
attitudes_count | The number of likes or positive reactions to the post. | |
attitudes_status | Additional status information about likes, possibly a code representing the attitude state. | |
pic_ids | A list of IDs for images included in the post. | |
status | Indicates the type or category of the status; may be null. | |
status_city | The city associated with the post, if available. | |
status_country | The country associated with the post, if available. | |
status_province | The province associated with the post, if available. | |
visible | An object indicating the visibility level of the post (e.g., public, private). | |
textLength | The length of the text field. | |
ad_marked | Indicates if the post is a paid advertisement (true or false). | |
url | The URL linking directly to the Weibo post. | |
annotations | Additional metadata or annotations associated with the post. | |
sgMeta | Metadata related to data collection, including crawl date and keyword query. | |
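Because the text field may carry HTML links and hashtags, consumers usually want a plain-text version and a list of topics. The following is a minimal sketch, not part of the feed itself: the helper names `clean_text` and `extract_hashtags` are hypothetical, and the hashtag pattern assumes Weibo's paired `#topic#` style with no whitespace inside the topic.

```python
import html
import re
from typing import List

# Matches any HTML tag, such as the <a ...> wrapper around a hashtag link.
TAG_RE = re.compile(r"<[^>]+>")
# Assumption: a hashtag is text enclosed by a pair of '#' with no whitespace inside.
HASHTAG_RE = re.compile(r"#([^#\s]+)#")


def clean_text(raw: str) -> str:
    """Remove HTML tags and unescape entities, leaving readable text."""
    return html.unescape(TAG_RE.sub("", raw)).strip()


def extract_hashtags(raw: str) -> List[str]:
    """Return hashtag topics in order of first appearance, without duplicates."""
    topics: List[str] = []
    for topic in HASHTAG_RE.findall(clean_text(raw)):
        if topic not in topics:
            topics.append(topic)
    return topics
```

Tags are stripped before hashtags are matched so that link markup never contributes to a topic.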
User Information
Field Name | Field Description | Data Example |
---|---|---|
user.id | The unique identifier associated with the author's Weibo profile. | |
user.screen_name | The display name or nickname chosen by the author. | |
user.profile_image_url | The URL of the author's profile image (thumbnail size). | |
user.profile_url | The URL of the author's profile page. | |
user.close_blue_v | Indicates if the user has a closed blue verification badge (true or false). | |
user.description | The author's profile description or biography. | |
user.follow_me | Indicates whether the author follows the authenticated user (true or false). | |
user.following | Indicates whether the authenticated user is following the author (true or false). | |
user.follow_count | The number of users the author is following. | |
user.followers_count | The number of followers the author has. | |
user.cover_image_phone | The URL of the author's cover image optimized for mobile devices. | |
user.avatar_hd | The URL of the author's high-definition avatar image. | |
user.badge | An object containing badges or achievements associated with the user. | See the sample payload. |
user.statuses_count | The total number of posts (statuses) the author has made. | |
user.verified | Indicates whether the author is a verified user (true or false). | |
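The user fields are written in dotted notation (for example user.screen_name), which corresponds to nested objects in the JSON payload. As an illustration only, a small lookup helper could resolve such names; `get_field` is a hypothetical name, not something the feed provides.

```python
from typing import Any


def get_field(payload: dict, dotted_name: str, default: Any = None) -> Any:
    """Resolve a dotted field name such as 'user.screen_name' against a payload.

    Returns `default` when any step of the path is missing or is not an object.
    """
    current: Any = payload
    for key in dotted_name.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current
```

For a payload shaped like the samples in this document, `get_field(post, "user.badge.pc_new")` walks `post["user"]["badge"]["pc_new"]` and falls back to the default if the badge object is empty.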
Trending Payload Fields
Weibo Posts
Field Name | Field Description | Data Example |
---|---|---|
id | The unique identifier associated with the Weibo post. | |
mblogid | The unique microblog identifier used in Weibo URLs. | |
text | The content of the Weibo post, which may include text, hashtags, and HTML links. | |
created_at | The date and time when the Weibo post was originally published (ISO 8601 format). | |
source | The device or platform used to post the Weibo message. | |
favorited | Indicates whether the Weibo post has been favorited by the authenticated user (true or false). | |
reposts_count | The number of times the post has been reposted at the time of data collection. | |
comments_count | The number of comments on the post at the time of data collection. | |
attitudes_count | The number of likes or positive reactions to the post. | |
attitudes_status | Additional status information about likes, possibly a code representing the attitude state. | |
pic_ids | A list of IDs for images included in the post. | |
thumbnail_pic | URL of the thumbnail-sized image associated with the post. | |
bmiddle_pic | URL of the medium-sized image associated with the post. | |
original_pic | URL of the original-sized image associated with the post. | |
geo | Geographic information associated with the post. | |
status | Indicates the type or category of the status; may be null. | |
status_city | The city associated with the post, if available. | |
status_country | The country associated with the post, if available. | |
status_province | The province associated with the post, if available. | |
visible | An object indicating the visibility level of the post (e.g., public, private). | |
textLength | The length of the text field. | 851 |
ad_marked | Indicates if the post is a paid advertisement (true or false). | |
url | The URL linking directly to the Weibo post. | |
annotations | Additional metadata or annotations associated with the post. | |
sgMeta | Metadata related to data collection, including crawl date and keyword query. | |
User Information
Field Name | Field Description | Data Example |
---|---|---|
user.id | The unique identifier associated with the author's Weibo profile. | |
user.screen_name | The display name or nickname chosen by the author. | |
user.profile_image_url | The URL of the author's profile image (thumbnail size). | |
user.profile_url | The URL of the author's profile page. | |
user.close_blue_v | Indicates if the user has a closed blue verification badge (true or false). | |
user.description | The author's profile description or biography. | |
user.follow_me | Indicates whether the author follows the authenticated user (true or false). | |
user.following | Indicates whether the authenticated user is following the author (true or false). | |
user.follow_count | The number of users the author is following. | |
user.followers_count | The number of followers the author has. | |
user.cover_image_phone | The URL of the author's cover image optimized for mobile devices. | |
user.avatar_hd | The URL of the author's high-definition avatar image. | |
user.badge | An object containing badges or achievements associated with the user. | See the sample payload. |
user.statuses_count | The total number of posts (statuses) the author has made. | |
user.verified | Indicates whether the author is a verified user (true or false). | |
Additional Fields from Payloads
After reviewing the payloads, the following additional fields were identified and included:
For Posts
Field Name | Field Description | Data Example |
---|---|---|
thumbnail_pic | URL of the thumbnail-sized image associated with the post. | |
bmiddle_pic | URL of the medium-sized image associated with the post. | |
original_pic | URL of the original-sized image associated with the post. | |
geo | Geographic information associated with the post. | |
ad_marked | Indicates if the post is marked as an advertisement (true or false). | |
annotations.client_mblogid | Client microblog ID, possibly related to the device or session used to post. | |
annotations.source_text | Source text related to the post, possibly empty. | |
annotations.phone_id | Phone identifier, possibly related to the device used to post. | |
annotations.mapi_request | Indicates if the request was made via Weibo's API (true or false). | |
sgMeta.publishedDateUtc | The date and time when the post was published in UTC format. | |
sgMeta.clientId | Client identifier, possibly used internally. | |
sgMeta.feed | Indicates the feed from which the data was collected. | |
sgMeta.kafkaTopic | The Kafka topic to which the data was published. | |
Notes on Field Values
visible.type
Values
- 0: Public post.
- 1: Private post.
- 3: Group post.
- 4: Friend post.
- Type > 0: Not public; often can be ignored.
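The visibility codes above can be turned into a simple filter. This is a sketch under assumptions: the function names and the "unknown" fallback label are hypothetical, and a post with a missing or unrecognized visible.type is treated as not public.

```python
# Codes taken from the visible.type values listed above.
VISIBLE_TYPE_LABELS = {0: "public", 1: "private", 3: "group", 4: "friend"}


def visibility_label(post: dict) -> str:
    """Map a post's visible.type code to a label; 'unknown' if absent or unlisted."""
    visible = post.get("visible") or {}
    return VISIBLE_TYPE_LABELS.get(visible.get("type"), "unknown")


def is_public(post: dict) -> bool:
    """True only when visible.type is 0 (public post)."""
    return visibility_label(post) == "public"
```

A typical use is `public_posts = [p for p in posts if is_public(p)]`, which drops every post whose type is greater than 0.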
user.verified_type
Values
- -1: Ordinary user.
- 0: Personal verification.
- 1: Government.
- 2: Enterprise.
- 3: Media.
- 4: Campus.
- 5: Website.
- 6: Application.
- 7: Organization.
- 200: Beginner.
- 220: Intermediate and advanced experts.
- 400: Deceased verified user.
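One way to work with these codes is an enumeration. The sketch below is illustrative: the member names are shorthand invented here, and since the sample payloads in this document do not show a verified_type key, the parser assumes it sits on the user object and returns None when it is missing or unrecognized.

```python
from enum import IntEnum
from typing import Optional


class VerifiedType(IntEnum):
    """Codes from the user.verified_type value list above."""
    ORDINARY = -1
    PERSONAL = 0
    GOVERNMENT = 1
    ENTERPRISE = 2
    MEDIA = 3
    CAMPUS = 4
    WEBSITE = 5
    APPLICATION = 6
    ORGANIZATION = 7
    BEGINNER = 200
    EXPERT = 220  # "Intermediate and advanced experts"
    DECEASED = 400


def parse_verified_type(user: dict) -> Optional[VerifiedType]:
    """Return the enum member for user['verified_type'], or None if absent/unknown."""
    raw = user.get("verified_type")
    try:
        return VerifiedType(int(raw))
    except (TypeError, ValueError):
        return None
```

Accepting both integers and numeric strings is a defensive choice, since numeric fields in the sample payloads are not always typed consistently.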
Trending Post Payload
{
  "id": 5080430986659877,
  "mblogid": "OxKnorWxn",
  "text": "<a href=\"//s.weibo.com/weibo?q=%23泽连斯基称已准备好胜利计划%23\" target=\"_blank\">#泽连斯基称已准备好胜利计划#</a> 【泽连斯基:已准备好结束俄乌冲突的\"胜利计划\"】...",
  "created_at": "2024-09-19T13:30:00.000Z",
  "source": "《环球时报》社有限公司官方微博",
  "favorited": false,
  "reposts_count": 3,
  "comments_count": 20,
  "attitudes_count": 36,
  "attitudes_status": 0,
  "status": null,
  "status_city": null,
  "status_country": null,
  "status_province": null,
  "visible": {
    "type": 0,
    "list_id": 0
  },
  "textLength": 851,
  "ad_marked": null,
  "url": "https://m.weibo.cn/detail/5080430986659877",
  "annotations": [
    [],
    {
      "client_mblogid": null
    },
    {
      "source_text": null
    },
    {
      "phone_id": null
    },
    {
      "mapi_request": null
    }
  ],
  "user": {
    "id": 1974576991,
    "screen_name": "环球时报",
    "profile_image_url": "https://tvax2.sinaimg.cn/crop...jpg?...",
    "profile_url": "/u/1974576991",
    "close_blue_v": null,
    "description": null,
    "follow_me": false,
    "following": false,
    "follow_count": null,
    "followers_count": null,
    "cover_image_phone": null,
    "avatar_hd": "https://tvax2.sinaimg.cn/crop...jpg?...",
    "badge": {},
    "statuses_count": null,
    "verified": true
  },
  "sgMeta": {
    "crawledDateUtc": "Thu, 19 Sep 2024 14:48:01 GMT",
    "keywordQuery": [],
    "type": "status"
  }
}
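In this trending sample, annotations arrives as a list that mixes an empty list with single-key objects, whereas the keyword search sample carries it as a plain object. A consumer that wants one shape could merge the list form into a single dictionary. The helper below is a sketch; `flatten_annotations` is a hypothetical name and the merge rule (later keys overwrite earlier ones) is an assumption.

```python
from typing import Any


def flatten_annotations(annotations: Any) -> dict:
    """Return annotations as one flat dict, whichever shape the payload used.

    - dict input is copied as-is
    - list input has its dict items merged; non-dict items (such as []) are skipped
    - anything else yields an empty dict
    """
    if isinstance(annotations, dict):
        return dict(annotations)
    merged: dict = {}
    if isinstance(annotations, list):
        for item in annotations:
            if isinstance(item, dict):
                merged.update(item)
    return merged
```

After flattening, fields such as annotations.client_mblogid can be read the same way from either payload type.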
Keyword Search Post Payload
{
  "id": "5080078758707432",
  "mblogid": "OxBdhAxcs",
  "text": "今天在深圳逛了幾個商場,發現很多地方都開了或準備開與俄羅斯有關的商店,實在令人開心,因可以有更多的選擇[酷]。 大家請緊記,九運是旺北面的,中國北面是俄羅斯,所以與俄羅斯合作在俄羅斯中賺錢會是一個不錯的選擇[色]。 #俄罗斯# ",
  "created_at": "2024-09-18T14:10:21.000Z",
  "source": "华为畅享 50 Pro",
  "favorited": false,
  "reposts_count": 0,
  "comments_count": 1,
  "attitudes_count": 3,
  "attitudes_status": null,
  "pic_ids": [
    "0075LoZWgy1hts7imiwdaj30u0140n3i"
  ],
  "status": 0,
  "status_city": null,
  "status_country": "中国",
  "status_province": "香港",
  "visible": {
    "type": 0,
    "list_id": 0
  },
  "textLength": 239,
  "ad_marked": false,
  "url": "https://m.weibo.cn/detail/5080078758707432",
  "annotations": {},
  "user": {
    "id": 6498109016,
    "screen_name": "MaskTeller蒙面俠",
    "profile_image_url": "https://tvax4.sinaimg.cn/crop...jpg?...",
    "profile_url": "https://m.weibo.cn/u/6498109016?",
    "close_blue_v": false,
    "description": "微信﹕maskteller",
    "follow_me": false,
    "following": false,
    "follow_count": 59,
    "followers_count": "3505",
    "cover_image_phone": "https://tva1.sinaimg.cn/crop...jpg",
    "avatar_hd": "https://wx4.sinaimg.cn/orj480/...jpg",
    "badge": {
      "bind_taobao": 1,
      "china_2019": 1,
      "gaokao_2024": 1,
      "hongbao_2020": 1,
      "hongbaofeijika_2021": 1,
      "hongkong_2019": 1,
      "pc_experiment": 1,
      "pc_new": 7,
      "user_name_certificate": 1
    },
    "statuses_count": 1188,
    "verified": true
  },
  "sgMeta": {
    "crawledDateUtc": "2024-09-18T20:52:36.989Z",
    "keywordQuery": "俄罗斯",
    "type": "status"
  }
}
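The two sample payloads differ in several value types: id is a number in one and a string in the other, followers_count appears as the string "3505", crawledDateUtc uses an HTTP-style date in one and ISO 8601 in the other, and keywordQuery is either a list or a single string. The sketch below shows one possible normalization step. All function names are hypothetical, and the target types (string id, integer counts, timezone-aware UTC datetimes, list of keywords) are choices made for this example rather than guarantees of the feed.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Any, List, Optional


def to_int(value: Any) -> Optional[int]:
    """Convert numeric strings or numbers to int; None when not convertible."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


def parse_crawled_date(value: str) -> datetime:
    """Parse ISO 8601 ('...Z') or HTTP-style ('Thu, 19 Sep 2024 ... GMT') dates to UTC."""
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        parsed = parsedate_to_datetime(value)
    if parsed.tzinfo is None:
        # Assumption: a date without zone information is already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def normalize_keyword_query(value: Any) -> List[str]:
    """Return keywordQuery as a list of strings, whether it came as str or list."""
    if value is None:
        return []
    if isinstance(value, str):
        return [value] if value else []
    return [str(item) for item in value]


def normalize_post(post: dict) -> dict:
    """Return a copy of the post with consistent types for the fields above."""
    user = dict(post.get("user") or {})
    meta = dict(post.get("sgMeta") or {})
    user["id"] = to_int(user.get("id"))
    user["followers_count"] = to_int(user.get("followers_count"))
    if meta.get("crawledDateUtc"):
        meta["crawledDateUtc"] = parse_crawled_date(meta["crawledDateUtc"])
    meta["keywordQuery"] = normalize_keyword_query(meta.get("keywordQuery"))
    post_id = post.get("id")
    return {
        **post,
        "id": None if post_id is None else str(post_id),
        "user": user,
        "sgMeta": meta,
    }
```

Normalizing once at ingestion keeps downstream code from having to branch on payload type.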