Monthly Archives: February 2014

Predictive Analytics and Social Media: Predicting the Unpredictable

University researchers have discovered a new way to predict what topics on Twitter will be popular hours before they are identified as trending topics, offering a novel method to analyze information that changes over time.

MIT professor Devavrat Shah and his student Stanislav Nikolov have developed a new algorithm that they say can, with 95% accuracy, predict the Twitter topics that trend, or suddenly explode in volume, reflecting their popularity.

Twitter determines its trending topics with its own algorithm, which considers both the number of Tweets about a topic and how quickly that number has recently grown, according to an MIT report on the research.

Shah notes that his research differs from the standard approach to machine learning, in which researchers posit a general hypothesis about a pattern and then infer the specifics of that pattern from the data.

“You’d say, ‘Series of trending things . . . remain small for some time and then there is a step,’” Shah says in the MIT article. “This is a very simplistic model. Now, based on the data, you try to train for when the jump happens, and how much of a jump happens. The problem with this is, I don’t know that things that trend have a step function. There are a thousand things that could happen.”

With the method that he’s developed, the data decides, he adds.

Shah and Nikolov compare changes over time in the number of Tweets about new topics to a sample set of data. Sample data where the statistics are similar to those of the new topic are given more weight to predict whether the topic will become a trend or fade away.

In essence, the comparison to the sample data set allows the sample set to “vote” as to the likelihood that the topic will trend on Twitter. The method can be applied to any sequence of measurements that’s performed at regular intervals such as ticket sales for movies or stock prices, according to MIT.
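The voting idea can be sketched in a few lines of Python. This is only an illustration of the general approach described above, not Shah and Nikolov's actual algorithm: the Euclidean distance, the exponential weighting with its `gamma` parameter, the function names, and the toy tweet counts are all assumptions made for this example.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def vote_weight(observed, example, gamma=1.0):
    """Weight of one example's vote; closer examples count exponentially more.
    gamma is an assumed tuning parameter, not a value from the research."""
    return math.exp(-gamma * distance(observed, example))

def predict_trending(observed, trending_examples, non_trending_examples, gamma=1.0):
    """Return True if the weighted vote of the trending examples wins."""
    trend_votes = sum(vote_weight(observed, e, gamma) for e in trending_examples)
    other_votes = sum(vote_weight(observed, e, gamma) for e in non_trending_examples)
    return trend_votes > other_votes

# Toy sample set: tweet counts per time interval (made up for illustration).
trending = [[1, 2, 4, 8], [2, 3, 6, 12]]
non_trending = [[3, 3, 2, 3], [5, 4, 5, 4]]

print(predict_trending([1, 2, 5, 9], trending, non_trending))  # rising series
print(predict_trending([4, 4, 4, 4], trending, non_trending))  # flat series
```

Because no shape is assumed for what a trend looks like, the quality of the prediction in a sketch like this depends entirely on how representative the sample set is.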

This is not the first time researchers have used predictive analytics to tap social media data to predict seemingly unpredictable trends.

A professor at the University of California, Riverside (UCR) and other researchers have created a model that uses data from Twitter collected on a particular day to help predict how often a stock will be traded and at what price the following day.

A trading strategy based on the researchers’ model “outperformed other baseline strategies by between 1.4 percent and nearly 11 percent and also did better than the Dow Jones Industrial Average during a four-month simulation,” according to UCR Today.

“These findings have the potential to have a big impact on market investors,” says Vagelis Hristidis, an associate professor at the Bourns College of Engineering, who has helped to develop the new model. “With so much data available from social media, many investors are looking to sort it out and profit from it.”

The researchers have found that stock price correlates with the number of connected Tweets about a company – those Tweets about distinct topics that relate to one company.

Facebook has also been targeted by data scientists attempting to use predictive analytics to predict the fluctuating stock market. Arthur J. O’Connor, who has worked on Wall Street in risk management for a couple of decades, has developed a method that uses data analysis to determine whether likes on Facebook affect the stock prices of consumer brands.

“My theory was, you know, it’s like in high school,” he says in an NPR report. “Does being really popular help you win friends [or] help you enhance your performance? And it turns out that, yeah, popularity does seem to help brands.”

O’Connor spent a year tracking the Facebook likes of the 30 brands with the most followers on Facebook, while also tracking their daily share prices.

“So, 99.95 percent of the change could be explained by the change in fan counts,” he adds.

The admiration a company gets on social media seems to be a good predictor of its stock market performance.
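A note on the “99.95 percent” figure: a statement that one quantity “explains” a percentage of the change in another usually refers to the coefficient of determination (R²) of a fitted model. Assuming that is what is meant here, the sketch below shows how R² is computed for a simple straight-line fit. The function name and the numbers are hypothetical and are not O’Connor’s data.

```python
def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line fit of ys on xs.
    For a single predictor this equals the squared Pearson correlation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical numbers for illustration only.
fan_counts = [10.0, 11.0, 12.5, 13.0, 15.0]    # millions of fans
share_prices = [20.1, 21.9, 25.2, 25.8, 30.1]  # dollars

print(f"R^2 = {r_squared(fan_counts, share_prices):.4f}")
```

A value close to 1 says only that the two series move together in the sample; on its own it does not show that likes cause price changes.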

Map Reports in SSRS 2012: Part 1

This post continues our last post, in which we learned how to import shape files into SQL Server. In this post we will learn how to use such a shape file in SSRS reports.
From SSRS 2008 R2 onward, Microsoft introduced Map reports in the Data Visualization category. Map reports let us create maps or map layers so that we can visualize data against a geographic background.
The idea of this post is to demonstrate Map reports by using the SQL Server spatial query and Map Gallery options. For this demonstration we will connect to the AdventureWorks2012 data source and use the query below in the dataset:

-- Total order amount per US state, keyed by state code for matching against the map
SELECT SP.StateProvinceCode, SUM(SOH.SubTotal) AS Amount
FROM Sales.SalesOrderHeader SOH
INNER JOIN Person.Address A ON SOH.BillToAddressID = A.AddressID
INNER JOIN Person.StateProvince SP ON A.StateProvinceID = SP.StateProvinceID
-- Filter on SP directly: joining Sales.SalesTerritory on CountryRegionCode alone
-- matches every US territory and would multiply each state's total
WHERE SP.CountryRegionCode = 'US'
GROUP BY SP.StateProvinceCode

When you drag a Map onto the report surface, the first thing you will see is the map layer wizard.

You can take the spatial data from one of the following three sources:

  1. Map Gallery: The map gallery contains maps from reports that are located in the map gallery folder for the report authoring environment. Maps from the gallery provide a quick start to add a map to your report.
  2. ESRI Shapefile: Takes the spatial data from an ESRI Shapefile (.shp).
  3. SQL Server Spatial Query: Takes the spatial data from a SQL Server database.

First we will choose Map Gallery as the source of spatial data. From the map gallery choose “USA by state Inset” and click on Next.

The next screen is important because it lets you position and zoom the map as you like by using the arrow and zoom buttons. When you are done with this screen, click on Next. This opens the Map Visualization screen, where you can choose one of three default visualization options. In our case we will choose the Color Analytical Map and click on the Next button.

The following screen lets you choose the dataset that will supply the data shown on the map. Choose MapDataSet, which we created at the beginning of this post, and click on Next.

In the next window you create the relationship between the spatial and the analytical data. The best way to relate them is to inspect the data shown in the Spatial data and Analytical data sub-windows. In this example you will notice that we can relate STUSPS from the spatial data to StateProvinceCode from the analytical data. To create the relationship, tick the STUSPS check box and select StateProvinceCode from the dropdown that lists all available analytical fields. When you have done this, click on Next.
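Conceptually, the match field works like a join key between the two sets of rows. The Python sketch below illustrates the idea only; it is not how SSRS implements the match. STUSPS and StateProvinceCode are the field names from the wizard, while the function name, the NAME field, and the sample rows are hypothetical.

```python
def match_layers(spatial_rows, analytical_rows, spatial_key, analytical_key):
    """Pair each spatial row with the analytical row sharing its key, or None."""
    # strip() guards against padded codes, which fixed-width character
    # columns can produce (an assumption about the data, not a documented rule).
    lookup = {row[analytical_key].strip(): row for row in analytical_rows}
    return [(shape, lookup.get(shape[spatial_key].strip())) for shape in spatial_rows]

# Hypothetical sample rows; the amount is made up.
spatial = [
    {"STUSPS": "WA", "NAME": "Washington"},
    {"STUSPS": "OR", "NAME": "Oregon"},
]
analytical = [{"StateProvinceCode": "WA ", "Amount": 1000.0}]

for shape, data in match_layers(spatial, analytical, "STUSPS", "StateProvinceCode"):
    amount = data["Amount"] if data else None
    print(shape["NAME"], amount)
```

A state with no matching analytical row ends up with no value, which is why it is worth comparing the two sub-windows before choosing the match fields.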

In this window you choose the theme and the data shown on the map. For this example, leave the theme and color rule at their defaults, choose SUM(Amount) under Field to Visualize and #STUSPS under display labels, and click on Next.

In the next window we have to set one more property. Click on the report and then on the polygon, open its properties as shown in the screenshot below, enter the expression shown there, click OK twice, and finish.

You are now done with this report; check its preview.

Project Connection Managers in SSIS

The very first new thing you will notice in Solution Explorer when you create a new SSIS project is the “Connection Managers” node. This is a new feature in SSIS 2012 that allows connection managers to be shared across multiple packages.

To demonstrate this, create any type of source or destination connection and configure it accordingly. Then right-click the connection manager in the Connection Managers pane and click on “Convert to Project Connection”.

This makes your connection manager available throughout the project, so you can use it later in other packages. (Note the “(Project)” prefix just before the connection manager name.)

The same connection now appears under the Connection Managers node in Solution Explorer.

At any time you can convert this project connection back into a package connection by right-clicking the project connection and clicking on “Convert to Package Connection”.

You can download the PDF version of this post from Project Connection in SSIS

How to Import Shape Files in SQL Server

In this post we will discuss how to import a map (shape file) into SQL Server. For this we need to download some data and an executable. The first two steps below cover this.

  1. Download the world and India map files from here.
  2. Then download the software that will be used to import the shape file into the SQL Server database from this link.
  3. Unzip both of the folders you downloaded in the two steps above, then follow the instructions as shown in the image below.
  4. Click on Shape2Sql.exe.
  5. Browse to the india_state.shp file in the folder from step 1, then configure the database server and the database into which you want to import the shape file data.
  6. Enter the name of the table into which you want to import the data; in our case this name is India_State.
  7. Finally, click on Upload Database to create the table.
  8. When you run a SELECT statement on the India_State table, it will show you the result as in the two images.

In our next post we will create a Map report in SSRS using this data.
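Before importing, it can be useful to confirm that a downloaded .shp file is intact. The sketch below reads the fixed 100-byte header that the ESRI shapefile format places at the start of every .shp file. It is an optional sanity check and not part of Shape2Sql; the function name is hypothetical, and the file path in the usage comment assumes the file from step 1 is in the current folder.

```python
import struct

def read_shp_header(data: bytes) -> dict:
    """Parse the fixed 100-byte header at the start of a .shp file."""
    if len(data) < 100:
        raise ValueError("a .shp header is 100 bytes long")
    (file_code,) = struct.unpack(">i", data[0:4])            # big-endian
    if file_code != 9994:
        raise ValueError("not a shapefile (file code is not 9994)")
    (length_words,) = struct.unpack(">i", data[24:28])       # length in 16-bit words
    version, shape_type = struct.unpack("<2i", data[28:36])  # little-endian
    xmin, ymin, xmax, ymax = struct.unpack("<4d", data[36:68])
    return {
        "file_bytes": length_words * 2,
        "version": version,
        "shape_type": shape_type,  # for example, 5 means Polygon
        "bbox": (xmin, ymin, xmax, ymax),
    }

# Usage (the path is an assumption):
# with open("india_state.shp", "rb") as f:
#     print(read_shp_header(f.read(100)))
```

If the shape type or bounding box looks wrong here, the import is unlikely to produce a usable table.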



Online SPSS Training for Beginners: Session 1

What is in this workshop

  • SPSS interface: data view and variable view
  • How to enter data in SPSS
  • How to import external data into SPSS
  • How to clean and edit data
  • How to transform variables
  • How to sort and select cases
  • How to get descriptive statistics

Data used in the workshop

  • We use the 2009 Youth Risk Behavior Surveillance System (YRBSS, CDC) as an example.
  • YRBSS monitors priority health-risk behaviors and the prevalence of obesity and asthma among youth and young adults.
  • The target population is high school students.
  • Multiple health behaviors include drinking, smoking, exercise, eating habits, etc.

  • Data view
    • The place to enter data
    • Columns: variables
    • Rows: records
  • Variable view
    • The place to enter variables
    • List of all variables
    • Characteristics of all variables


Before data entry
  • You need a code book/scoring guide.
  • If you use a paper survey, give each case an ID number (NOT the real identification numbers of your subjects).
  • If you use an online survey, you need something to identify your cases.
  • You can also use Excel to do data entry.

Example of a code book


Enter data in SPSS 19.0 
Enter variables
Enter cases
Keep watching this page...