Social data for retail demand forecasting


How social data can help retailers meet customer demand

Both online and offline retailers try to increase sales and improve demand forecasting by better understanding their customers. To do so, they need to know what their customers prefer now and what they are likely to prefer in the near future. It is not enough to know what customers currently buy; retailers must also understand what future demand will look like. This comes down to two major problems: predicting which goods customers will want to buy and forecasting the quantities of those products.

“Customer behavior is highly uncertain and depends on many uncontrollable factors.”

The problems mentioned above are very complex, as customer behavior is highly uncertain and depends on many uncontrollable factors, which makes demand forecasting one of the key issues for retailers to address. Solving these problems calls for a statistical approach, and such a solution requires a lot of data. The data may come from internal sources, such as the retailer’s database of sales history, and from external sources: demographic, economic and social data. Here, we are mostly interested in social data, which includes customer feedback, comments, recorded telephone conversations, search statistics, and text messages from social networks such as Facebook, Twitter and others. Social data is the most diverse and unstructured type of data, which makes it very hard to extract meaningful information about a particular topic. Nevertheless, social data may help in solving important problems in retailing, such as measuring customer satisfaction with the quality of goods. Let’s look in more detail at how social data may be transformed into usable market insights.

“Social data may help in solving important problems in retailing, such as measuring customer satisfaction with the quality of goods.”

For example, Hal Varian, Google’s chief economist, found that the peaks and troughs in the volume of Google searches for certain products, such as cars and holidays, preceded fluctuations in sales of those products (Varian, 2011). He used free tools such as Google Correlate and Google Insights for Search to predict online and offline spending patterns from search and ad-click data. Similarly, Johan Bollen showed that measurements of collective mood states derived from large-scale Twitter feeds are correlated with the value of the Dow Jones Industrial Average (DJIA) over time (Bollen et al., 2010). Using Twitter text data, he measured mood along 6 dimensions (Calm, Alert, Sure, Vital, Kind, and Happy), and then used Granger causality analysis and a Self-Organizing Fuzzy Neural Network to predict changes in DJIA closing values. These examples demonstrate how beneficial social data analysis might be.
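The idea that a social signal can lead sales by some number of periods can be illustrated with a simple lagged correlation. The sketch below is only an illustration under stated assumptions: the weekly numbers are made up so that sales echo search volume two weeks later, and a plain Pearson correlation at several lags stands in for the more rigorous methods used in the cited studies (it is not a Granger causality test and not Varian's actual procedure).

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_correlations(signal, sales, max_lag):
    """Correlate signal[t] with sales[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        leading = signal[: len(signal) - lag]  # drop the last `lag` points
        following = sales[lag:]                # drop the first `lag` points
        result[lag] = pearson(leading, following)
    return result


# Hypothetical weekly data (assumption, not real measurements):
# from week 2 onward, sales = 30 + 2 * search volume two weeks earlier.
search_volume = [10, 12, 30, 28, 15, 11, 25, 40, 22, 13, 12, 18]
weekly_sales = [55, 58, 50, 54, 90, 86, 60, 52, 80, 110, 74, 56]

correlations = lagged_correlations(search_volume, weekly_sales, max_lag=4)
best_lag = max(correlations, key=correlations.get)

for lag, value in correlations.items():
    print(f"lag {lag}: correlation {value:+.2f}")
print("strongest relationship at lag", best_lag)
```

With real data the peak is rarely this clean, and a high lagged correlation alone does not establish that the signal has predictive value; it is only a first screening step before a proper statistical test.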

Guiding principles to build a demand forecast

“Demand forecasting is one of the most challenging fields of predictive analytics.”

This can be explained by an evolving environment of shifts in consumer behavior, cyclical economic fluctuations and other factors that make it difficult to identify critical trends. Predictive analytics products and solutions are available to help address these issues. What these products cannot resolve as readily are the internal dynamics of a business, especially where models reflect inconsistent data because the assumptions and drivers of one department or operational unit are not aligned with those of the others. Whether that is a function of the organizational culture and a politically charged environment, of legitimate but conflicting visions, or of poorly communicated agendas and goals, the resulting forecast mismatch can be expensive. To overcome these challenges, demand forecasting solutions should follow certain guiding principles:

  1. Objectivity – forecasting models should serve as a tool for reaching objective and unbiased results that can be defended with concrete data.
  2. Consistency – a similar approach to forecasting should be used across brands, markets and categories whenever possible.
  3. Transparent assumptions – key assumptions on market and execution drivers (economy, inflation, price, distribution, marketing, account growth, etc.) should be stated explicitly.
  4. Engagement – business users should actively support a forecasting solution at all levels of its implementation.
  5. Regularity – forecast analysis needs to be embedded in regular forecasting processes.

To sum up, business leaders across the whole enterprise should stay aware of current changes in the market, while data scientists should be responsible for the actual use and application of the forecasting solution and should therefore adjust it in time.

Basic workflow for retail analytics on social data

Having discussed the general issues that businesses and data analysts face when predicting product demand, let’s go through the possible steps of a retail demand forecasting process that uses social data. For the sake of simplicity, consider the case of Twitter feed data. Big Data analytics is now being applied at every stage of the retail process, including the following steps.

  1. Predicting trends – working out what products will be popular next week, month, year, and so on.
  2. Forecasting demand – predicting the amount of demand for those products.
  3. Price optimization – determining whether the retailer should increase or decrease the prices of those products.
  4. Identifying the customers – deciding which customers are likely to be interested in a particular product, and the best way to go about putting it in front of them.

Let’s consider an example of how Twitter data can be used to work through all the steps mentioned above. First, we analyze a large historical dataset of users’ posts sampled from the Twitter streaming API according to a filter condition. For example, we select posts that include indicator words such as “buy”, “product”, “good” and “shop”, together with the names of some popular brands. By identifying which of these terms are present in the data and calculating their frequencies and other statistics, we can make predictions about what customers are likely to buy in the near future. Then we combine internal data from the retailer’s sales history database with the numerical external data from the previous step and use machine learning techniques to build a model that predicts demand for the products of interest.

Next, we can predict product prices by building econometric models whose factors are economic indicators, such as the consumer price index and currency exchange rates, that might influence those prices. Finally, by classifying Twitter users according to their interest in particular products and their income, we can identify potential customers and what they are likely to want. At this stage a retailer can already use the results of the analysis to build a strategy, which includes planning logistics and sending personalized ads to the identified users.
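The first two steps of this walkthrough, counting purchase-related mentions and relating them to sales history, can be sketched in a few lines. Everything specific here is an assumption made for illustration: the brand names "acme" and "globex", the sample posts, the sales figures, and the choice to require both an indicator word and the brand name in a post. A one-variable least-squares line stands in for the machine learning model, which in practice would use many more features and far more data.

```python
import re
from collections import Counter

INDICATOR_WORDS = {"buy", "product", "good", "shop"}  # terms named in the text


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def weekly_brand_mentions(posts, brand):
    """Count posts per week that mention `brand` together with an indicator word."""
    counts = Counter()
    for week, text in posts:
        tokens = set(tokenize(text))
        if brand in tokens and tokens & INDICATOR_WORDS:
            counts[week] += 1
    return counts


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


# Hypothetical (week number, post text) pairs standing in for sampled tweets.
posts = [
    (1, "Going to buy new Acme sneakers this weekend"),
    (1, "Lovely weather today"),
    (2, "Where can I shop for Acme shoes?"),
    (2, "Acme is a good brand, want to buy"),
    (3, "Need to buy Acme boots"),
    (3, "Is Acme a good product?"),
    (3, "Best shop for Acme gear"),
    (3, "Globex phone is a good product"),
]

mentions = weekly_brand_mentions(posts, "acme")

# Hypothetical internal data: units sold in the week AFTER weeks 1, 2 and 3.
weeks = [1, 2, 3]
units_sold_next_week = [110, 130, 150]

intercept, slope = fit_line([mentions[w] for w in weeks], units_sold_next_week)

# Forecast for a future week in which 4 qualifying mentions were observed.
forecast = intercept + slope * 4
print(dict(mentions), round(forecast))
```

Pairing each week's mention count with the following week's sales reflects the assumption that social signals lead purchases; the right lag, if any, has to be estimated from real data.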


We hope that the solution described here can improve online recommendation systems for retail and, as a result, improve business outcomes.

by Alexandr Novopoltsev for InData Labs

Using machine learning, AI and Big Data technologies InData Labs helps tech startups and enterprises explore new ways of leveraging data, implement highly complex and innovative projects, and build breakthrough AI products. Our core services include Data Strategy Consulting, Big Data Engineering, Data Science Consulting.


  1. H. Varian (2011) Predicting the Present.
  2. J. Bollen, H. Mao, X.J. Zeng (2010) Twitter mood predicts the stock market.