Socialgist Streaming Services: Full & Filtered Feeds


Socialgist provides a "push" option for retrieving real-time data. With this option, the customer initiates the connection with an HTTPS GET request that remains open, and new data is pushed through as it is collected. There is no need to poll for data.

Retrieving Data

To begin retrieving data for a particular stream simply connect via HTTPS as follows:

Base URL: https://<customername>.<domain>.com/stream/<datasource>_<streamname>/subscription/<subscriptionname>/part/1/data.json

HTTP Request Method: GET

Authentication: HTTP Basic authentication is required. We will supply the username and password needed to make a connection.

Parameters: To be appended to the Base URL for additional functionality.

Key

Required

Possible Values

Notes

keepalivestream

No

true, false

This parameter keeps the client connection to the stream alive by sending a newline character (‘\n’) every 30 seconds. It is most useful when the volume on the stream is very low. The default value is false, meaning no keep-alive characters are sent.

rollback

No

positive integer

This parameter is used to re-read data messages that have been read before, provided they still fall within the 24-hour window. For example, to re-read the last 1000 data messages, set this parameter to 1000. Re-read messages are not counted against quota for Sina Weibo 2.0. The default value is 0, meaning reading resumes after the last acknowledged message.

usebuffer

No

true, false

This parameter controls whether messages collected in the buffer, if any, are delivered when the user reconnects after a disconnect. When usebuffer is false, the buffer is skipped and only new data that arrives at the endpoint after the connection is established is sent. When usebuffer is true (the default), all messages in the buffer are sent, starting after the last acknowledged message.
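As a minimal sketch of how these parameters might be appended to the Base URL, the helper below assumes they are passed as ordinary query-string parameters, which the text implies but does not state outright. The function name `build_stream_url` and the host, stream, and subscription values in the usage note are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_stream_url(host, datasource, streamname, subscription,
                     keepalivestream=None, rollback=None, usebuffer=None):
    """Return the data.json URL with any optional parameters appended.

    Assumption: parameters are sent as a standard query string.
    Parameters left as None are omitted so the server defaults apply.
    """
    base = (f"https://{host}/stream/{datasource}_{streamname}"
            f"/subscription/{subscription}/part/1/data.json")
    params = {}
    if keepalivestream is not None:
        params["keepalivestream"] = "true" if keepalivestream else "false"
    if rollback is not None:
        if rollback < 0:
            raise ValueError("rollback must be a non-negative integer")
        params["rollback"] = str(rollback)
    if usebuffer is not None:
        params["usebuffer"] = "true" if usebuffer else "false"
    return f"{base}?{urlencode(params)}" if params else base
```

For example, `build_stream_url("acme.example.com", "weibo", "main", "sub1", keepalivestream=True)` produces a URL ending in `data.json?keepalivestream=true`.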

For many reasons we do not detect a disconnect until there is a failed transmit (a stream write I/O exception). If message volumes are very low, this could take up to five minutes, at which point we will forcibly close the connection. The ‘keepalivestream’ parameter can therefore be used to reduce the time our streaming service takes to detect a broken connection. Customers who are doing development and testing should be aware of this to avoid getting 403 errors when trying to reconnect to their stream.

The 'rollback' parameter reduces the chance of losing messages when a disconnect occurs. It makes the most sense to use it conditionally. For example, keep track of when the last message was received and omit 'rollback' after a forced disconnect, since forced disconnects occur only when no messages have been available for five minutes.
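The conditional use of 'rollback' described above could be sketched roughly as follows. The function name, the epoch-second timestamps, and the default of 1000 messages (borrowed from the example in the parameter notes) are illustrative assumptions, not part of the service.

```python
IDLE_DISCONNECT_SECONDS = 5 * 60  # forced disconnect after five idle minutes


def rollback_for_reconnect(last_message_at, disconnected_at, rollback_count=1000):
    """Return the rollback value to use on the next connection.

    Both times are seconds since the epoch. If the stream had been silent
    for five minutes or more when the connection dropped, treat it as a
    forced idle disconnect and skip rollback; otherwise re-read recent
    messages in case some were lost.
    """
    idle_for = disconnected_at - last_message_at
    if idle_for >= IDLE_DISCONNECT_SECONDS:
        return 0
    return rollback_count
```

A client would record the arrival time of each message, call this when the connection drops, and pass the result as the rollback value on the next request.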

Client Implementation

The basic concept is to stay connected to the stream indefinitely (no time-out set on the client side), waiting for new lines in an infinite loop.
If the connection is broken (e.g. a socket exception is raised), you should attempt to reconnect after waiting some period (sixty seconds recommended).
If there is no activity for a five-minute period, usually because the keywords generate a very low volume (or there are no keywords), we will break the connection. In practice, high-volume streams (many messages per second) stay connected continuously for many hours or days at a time, while low-volume streams reconnect several times each day.

In summary, the key points are:

  1. Do not set a time-out on your connection.

  2. Wait at least sixty seconds before reconnecting on a disconnect.
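The two key points above can be sketched as a simple loop using only the Python standard library. This is an illustrative outline rather than a reference client: the function names are hypothetical, `max_connections` exists only so the loop can be bounded in tests, and a real client would likely stop retrying on a 401 instead of looping.

```python
import base64
import http.client
import time
import urllib.request

RECONNECT_DELAY_SECONDS = 60  # recommended wait before reconnecting


def open_stream(url, username, password):
    """Open the HTTPS GET with Basic auth and no client-side time-out."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"})
    # No timeout argument is passed, so reads block until data arrives.
    return urllib.request.urlopen(request)


def consume(connect, handle_line, sleep=time.sleep, max_connections=None):
    """Read lines until the stream drops, then wait and reconnect.

    `connect` is a zero-argument callable returning a context manager
    that yields byte lines, e.g. lambda: open_stream(url, user, pw).
    """
    connections = 0
    while max_connections is None or connections < max_connections:
        connections += 1
        try:
            with connect() as stream:
                for raw in stream:
                    line = raw.strip()
                    if line:  # blank lines are keep-alive newlines
                        handle_line(line)
        except (OSError, http.client.HTTPException):
            pass  # socket and urllib errors derive from OSError
        sleep(RECONNECT_DELAY_SECONDS)
```

Injecting `connect` and `sleep` keeps the loop testable without a network connection or real delays.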

HTTP API Responses

HTTP code

Response body

Notes

200

Stream of messages

Valid JSON messages.

503

Service Temporarily Unavailable

Internal error. Please submit a support case if you receive this error.

403

{"error_message":"You cannot query using the same URL"}

Multiple clients are trying to access the same streaming API URL at the same time. Once the first connected client starts consuming data, all subsequent clients receive this error response.

401

<html><head><title>Apache Tomcat/7.0.39 - Error report</title><style><!--H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;} H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;} H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;} BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;} B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;} P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}A {color : black;}A.name {color : black;}HR {color : #525D76;}--></style> </head><body><h1>HTTP Status 401 - </h1><HR size="1" noshade="noshade"><p><b>type</b> Status report</p><p><b>message</b> <u></u></p><p><b>description</b> <u>This request requires HTTP authentication.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/7.0.39</h3></body></html>

The request used the wrong credentials, or no credentials, to access this API.



200

{"error_message":"Stream is disabled"}

Returned when the stream is inactive. This applies only if the requested stream supports enabling and disabling data collection.

400

{"error_message":"Malformed URL"}

Returned when there is a typo in the stream or subscription name in the URL.
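One way a client might act on the responses listed above is sketched below. The function name and the returned action labels are hypothetical. Note that a 200 response can still carry the "Stream is disabled" error body, so the sketch inspects the first line of the body before treating the response as data.

```python
import json


def classify_response(status, first_line=b""):
    """Map an HTTP status (and, for 200, the first body line) to an action."""
    if status == 200:
        try:
            payload = json.loads(first_line)
        except ValueError:
            payload = None  # not JSON yet, e.g. a keep-alive newline
        if (isinstance(payload, dict)
                and payload.get("error_message") == "Stream is disabled"):
            return "stream_disabled"
        return "consume"
    if status == 400:
        return "fix_url"  # typo in the stream or subscription name
    if status == 401:
        return "fix_credentials"
    if status == 403:
        return "wait_and_retry"  # another client holds this URL
    if status == 503:
        return "retry_then_contact_support"
    return "unexpected"
```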

 

Disconnecting a Streaming API Client

To disconnect a consuming client from the streaming API, stop the client, which closes the underlying TCP socket connection. As mentioned above, a disconnected client is not detected until there is a failed transmit.

 

Best Practices

  • Avoid frequent disconnects and reconnects, as they tie up available connection resources and can cause you to miss data.

  • Delay reconnection for at least sixty seconds or 403 errors will be returned.

  • If you are unable to reconnect with continued 503 errors, submit a support ticket.

  • Use the connection history panel to monitor your connection status.

 
