Jon Palmisciano

How Yahoo Finance serves realtime data

If you take a look at a stock on Yahoo Finance during market hours, you will notice that the price and other key information update in realtime:

Live Data Demonstration

I was curious how this was done, so I started to investigate. The Chrome developer tools confirmed my suspicion that these realtime updates are achieved through the use of WebSockets.

Developer Tools

When the page is first loaded, the client opens a WebSocket connection and sends a subscribe request, which looks like the following:

{
  "subscribe":[
    "^GSPC",
    "^DJI",
    "^IXIC",
    "^RUT",
    "AAPL"
  ]
}
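As a rough sketch of how a client could produce that request, the snippet below builds the JSON text and sends it once a socket is open. The function names `buildSubscribeMessage` and `subscribe` are hypothetical, and the socket is passed in by the caller because the endpoint URL is not shown here; any object with the standard WebSocket interface should work.

```javascript
// Build the JSON text of a subscribe request for the given ticker symbols.
// (Hypothetical helper; the payload shape mirrors the request shown above.)
function buildSubscribeMessage(symbols) {
  if (!Array.isArray(symbols) || symbols.length === 0) {
    throw new TypeError("symbols must be a non-empty array of strings");
  }
  return JSON.stringify({ subscribe: symbols });
}

// Send the request right away if the socket is already open, otherwise
// wait for its "open" event. `socket` is assumed to follow the standard
// WebSocket interface (readyState, send, addEventListener).
function subscribe(socket, symbols) {
  const message = buildSubscribeMessage(symbols);
  const OPEN = 1; // same value as WebSocket.OPEN
  if (socket.readyState === OPEN) {
    socket.send(message);
  } else {
    socket.addEventListener("open", () => socket.send(message), { once: true });
  }
}

console.log(buildSubscribeMessage(["^GSPC", "^DJI", "^IXIC", "^RUT", "AAPL"]));
```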

As you can see, the client just specifies a list of stock symbols to receive updates for. The backend will then send new data to the client as it becomes available, roughly every second. Here is a response from the server:

CgRBQVBMFZqZ4UMYoI+hlvlcKgNOTVMwCDgBRXIUSr9IyOXZDmWAwmXA2AEE

The response is easy to recognize as Base64-encoded text, but decoding it yields binary data with no obvious structure. Further inspection of the Yahoo Finance site’s code reveals that these are actually Protobuf messages.
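To see this for yourself, a small sketch like the following decodes the sample response and prints it as hex and as printable ASCII. It assumes Node.js, where `Buffer` is a global; `hexDump` is a helper name made up for this example.

```javascript
const sample = "CgRBQVBMFZqZ4UMYoI+hlvlcKgNOTVMwCDgBRXIUSr9IyOXZDmWAwmXA2AEE";

// Decode the Base64 text into raw bytes (Buffer is a Node.js global).
const bytes = Buffer.from(sample, "base64");

// Render the bytes as space-separated hex and as ASCII, replacing every
// non-printable byte with a dot.
function hexDump(buf) {
  const hex = [...buf].map((b) => b.toString(16).padStart(2, "0")).join(" ");
  const ascii = [...buf]
    .map((b) => (b >= 0x20 && b < 0x7f ? String.fromCharCode(b) : "."))
    .join("");
  return { hex, ascii };
}

const dump = hexDump(bytes);
console.log(bytes.length, "bytes");
console.log(dump.hex);
console.log(dump.ascii);
```

The strings AAPL and NMS are readable in the dump while the bytes around them are not, which is what you would expect from a binary encoding with embedded strings.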

PricingData.decode = function decode(r, l) {
  if (!(r instanceof $Reader)) r = $Reader.create(r);
  var c = l === undefined ? r.len : r.pos + l,
    m = new $root.quotefeeder.PricingData();

  while (r.pos < c) {
    var t = r.uint32();
    switch (t >>> 3) {
      case 1:
        m.id = r.string();
        break;
      case 2:
        m.price = r.float();
        break;
      case 3:
        m.time = r.sint64();
        break;
      case 4:
        m.currency = r.string();
        break;

      // Additional fields omitted for brevity...
    }
  }

  return m;
};

Reverse engineering the decoding function (shown above) allows us to construct a valid Protobuf message definition for the server’s responses:

syntax = "proto3";

message PricingData {
    string id = 1;
    float price = 2;
    sint64 time = 3;
    string currency = 4;
    string exchange = 5;
    int32 quote_type = 6;
    int32 market_hours = 7;
    float change_percent = 8;
    sint64 day_volume = 9;
    float day_high = 10;
    float day_low = 11;
    float change = 12;
    string short_name = 13;
    sint64 expire_date = 14;
    float open_price = 15;
    float previous_close = 16;
    float strike_price = 17;
    string underlying_symbol = 18;
    sint64 open_interest = 19;
    sint64 options_type = 20;
    sint64 mini_option = 21;
    sint64 last_size = 22;
    float bid = 23;
    sint64 bid_size = 24;
    float ask = 25;
    sint64 ask_size = 26;
    sint64 price_hint = 27;
    sint64 vol_24hr = 28;
    sint64 vol_all_currencies = 29;
    string from_currency = 30;
    string last_market = 31;
    double circulating_supply = 32;
    double market_cap = 33;
}
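As a sanity check on the definition, here is a minimal sketch that decodes the sample response from earlier without any Protobuf library. It assumes Node.js (for `Buffer`). The `FIELDS` table, the camelCase property names, and the helpers `readVarint`, `zigzagDecode`, and `decodePricingData` are illustrative names rather than anything from Yahoo's code, and only the fields present in the sample are mapped. For real use, a decoder generated from the definition above is likely the more robust route.

```javascript
// Field number -> [property name, Protobuf type], following the
// PricingData definition. Only fields present in the sample are listed.
const FIELDS = {
  1: ["id", "string"],
  2: ["price", "float"],
  3: ["time", "sint64"],
  5: ["exchange", "string"],
  6: ["quoteType", "int32"],
  7: ["marketHours", "int32"],
  8: ["changePercent", "float"],
  9: ["dayVolume", "sint64"],
  12: ["change", "float"],
  27: ["priceHint", "sint64"],
};

// Read a base-128 varint starting at `pos`; returns [BigInt value, nextPos].
function readVarint(buf, pos) {
  let value = 0n;
  let shift = 0n;
  while (true) {
    if (pos >= buf.length) throw new RangeError("truncated varint");
    const byte = buf[pos++];
    value |= BigInt(byte & 0x7f) << shift;
    if ((byte & 0x80) === 0) return [value, pos];
    shift += 7n;
  }
}

// Undo the ZigZag mapping that sint64 uses for signed integers.
function zigzagDecode(n) {
  return (n >> 1n) ^ -(n & 1n);
}

function decodePricingData(base64) {
  const buf = Buffer.from(base64, "base64");
  const message = {};
  let pos = 0;
  while (pos < buf.length) {
    // Each field starts with a tag: (field number << 3) | wire type.
    const [tag, afterTag] = readVarint(buf, pos);
    pos = afterTag;
    const fieldNumber = Number(tag >> 3n);
    const wireType = Number(tag & 7n);
    let value;
    if (wireType === 0) {          // varint
      [value, pos] = readVarint(buf, pos);
    } else if (wireType === 1) {   // 64-bit, used by double
      value = buf.readDoubleLE(pos);
      pos += 8;
    } else if (wireType === 2) {   // length-delimited, used by string
      const [length, afterLength] = readVarint(buf, pos);
      const end = afterLength + Number(length);
      value = buf.toString("utf8", afterLength, end);
      pos = end;
    } else if (wireType === 5) {   // 32-bit, used by float
      value = buf.readFloatLE(pos);
      pos += 4;
    } else {
      throw new Error(`unsupported wire type ${wireType}`);
    }
    const field = FIELDS[fieldNumber];
    if (!field) continue;          // unmapped field: bytes consumed, value dropped
    const [name, type] = field;
    // Number() loses precision above 2^53, which is fine for this sketch.
    if (type === "sint64") value = Number(zigzagDecode(value));
    else if (type === "int32") value = Number(BigInt.asIntN(32, value));
    message[name] = value;
  }
  return message;
}

const sample = "CgRBQVBMFZqZ4UMYoI+hlvlcKgNOTVMwCDgBRXIUSr9IyOXZDmWAwmXA2AEE";
console.log(decodePricingData(sample));
```

For the sample message this yields the id AAPL, the exchange NMS, a price of about 451.20, and a time of 1596811650000, which reads as milliseconds since the Unix epoch (2020-08-07T14:47:30Z).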

It is also worth noting that no authentication is required by the client to connect and receive realtime data. This means you can (theoretically) use the API in your own programs, as long as your language of choice is supported by Protobuf and has a WebSockets library.

I do not believe this API is intended for public use, so I would exercise caution when using it for your own purposes. Furthermore, despite the large number of fields defined, only a subset of them is regularly streamed, specifically fields 1-3, 5-9, 12 and 27. You will need to fetch the remaining data from a different source.
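Because each update carries only some fields, one possible way to handle this on the client is to keep the last known values per symbol and overlay each partial update on top of them. This is only a sketch: `quotes` and `applyUpdate` are hypothetical names, and the update objects are assumed to be plain objects keyed by field name, with the symbol stored under `id`.

```javascript
// Latest known values per symbol, keyed by the message's `id` field.
const quotes = new Map();

// Merge a decoded (partial) update into the stored quote for its symbol.
// Fields missing from the update keep their previous values.
function applyUpdate(update) {
  const previous = quotes.get(update.id) ?? {};
  const merged = { ...previous };
  for (const [key, value] of Object.entries(update)) {
    if (value !== undefined && value !== null) merged[key] = value;
  }
  quotes.set(update.id, merged);
  return merged;
}
```

Data fetched from another source can be seeded through the same function, so that later streamed updates only overwrite the fields they actually contain.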

I hope you’ve learned something from this post. If you have any questions, feel free to reach out. Thanks for reading!

— JP