Articles

Using Visual Studio to develop R for SQL Server 2016

As Microsoft released SQL Server 2016 on June 1, a lot of people are starting to investigate how to write R which will run in SQL Server rather than using their local machine. People who have a background in R will automatically migrate to R Studio, the open source UI that has been around for years, but there may be a reason to switch. Visual Studio 2015 Community is also an open source application which can be used to write R code, which is definitely worth investigating.

Which R tool should I use: R Studio or Visual Studio?

For those people who haven’t made the decision as far as which tool to use, let me offer two compelling reasons to pick Visual Studio [VS] instead of R Studio: Intellisense and Improved Debugging Tools. R studio does not have intellisense and it is not possible to debug your code by stepping through it in the manner that many developers of VS are already quite familiar. You will need to configure VS to use R tools, which are detailed below.

Configuring Visual Studio to Run R

Only Visual Studio 2015 can be configured to use R and you must be using a 64 bit operating system to load R tools. If you have a different version of VS, download it here. The next step is to download VS R Tools and lastly download Microsoft R Open. There are two versions of Microsoft R open, one for R Server 2016, which is the one you want if you plan to integrate R with SQL Server 2016, and the standard version of Microsoft R Open, which does not include any of the R Server features. If you like, you can use either version Microsoft R Open in R Studio as well. The standard version is only available for 64 bit platforms, but does include versions for Windows and various flavors of Linux, including Red Hat, SUSE, and Ubuntu. The R open for Microsoft R Server 2016 can be found here.***Update***On June 6, 2016, Microsoft released a new tool called R client. Installing the version of R found in the client 8.0.3 is required to match the version of R released with SQL Server 2016. It is required to log into Visual Studio to be able to access this R client link.

After the tools have been installed, they appear in VS under R Tools, as shown on my screen below. The VS environment looks no different, with the exception of the new menu item for R Tools. This really isn’t an IDE set up for writing R, yet. Time to fix that.

Visual Studio R Tools

Click on RTools->Data Science Settings and the screen goes from the standard VS screen shown above to anR configured VS environment tailored to writing  R code as it has the specific panes used when writing R, such as R interactive and R Plot.  If you want to move these screens around, or close the start page,  feel free to organize the windows in VS in the same manner as one does  when using VS for other development tasks and languages.

If you have multiple R versions loaded, or you just want to see how it works, go to RTools->Options and look at the R engine entry. This code be pointing to C:\Program Files\R\R-3.3.0 for the open source version of R, C:\Program Files\Microsoft\MRO\r-3.2.4 for the Microsoft Open R. For R with SQL Server 2016, after installing the R Client, the R engine needs to point to C:\Program Files\Microsoft SQL Server\130\R_SERVER, assuming you have the developer edition of SQL Server 2016 loaded on your PC. If you change this entry, you will need to restart VS.

 LocationforRToolsinSQLServer2016

After you click ok, it might be a good idea to check the intellisense settings for R. that can be done by going to Go to R Tools-> Editor Options-> Advanced.

Running R in SQL Server 2016

Now that I am using Microsoft’s Version of R, I can use the libraries which allow me to run on the server, which this R code allows me to do. My server name is called MyServer\SQLServer 2016. Notice that I need to put two slashes in my code to be able to connect to the server to be able to get to the SQLServer2016 instance.  To connect can use either a SQL login, or integrated Windows authentication. For this example I am using a SQL Server ID to access the data, and yes I do need to put the password in readable text within my code if I use that option. For Windows authentication, and ODBC account would be needed to connect. The user also needs SQL Server rights granted in order to run R code from within SQL Server. The command rxSetComputeContext(runonServer) changes the location the code will be run from my local machine to SQL Server 2016

library(RevoScaleR)

# Define the SQL connection string
connStr <- "Driver=SQL Server;Server=MYSERVER\\SQLSERVER2016;Database=Review;Uid=ReadData;Pwd=P@$$word"

# Set ComputeContext.
sqlShareDir <- paste("C:\\AllShare\\", Sys.getenv("USERNAME"), sep = "")
sqlWait <- TRUE
sqlConsoleOutput <- FALSE
runonServer <-  RxInSqlServer(connectionString = connStr, shareDir = sqlShareDir,
                    wait = sqlWait, consoleOutput = sqlConsoleOutput)
rxSetComputeContext(runonServer)

As this post hardly scratches the surface of running R code on SQL Server, I intend to cover more in greater detail in a later post. Please subscribe to my blog to be notified when my later post with more information on the specific coding techniques unique to running R in SQL Server 2016.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Using R in SQL Server 2016

As SQL Server 2016 will be released soon and to follow up on the talk I gave at SQL Saturday Atlanta, this post will guide you through the steps needed to make R work. Like many features of SQL Server 2016, R is optionally installed.  To install R make sure that when you install R, the Option for R Services (In-Database) is checked, as shown below.

SQLServer2016_InstallR

Microsoft is incorporating the version of R sold by Revolution Analytics, which they called R Server. This version, while fully compatible with Open Source R, has some additional features which allow the R code to be run not only in memory, but use of a chunking technology to swap the data to disk so that the R code will not run out of memory. The commands to use this additional functionality all start with rx and are part of the proprietary ScaleR feature set. To use the R Server as a separate stand along product, instead of selecting the R Server in database option, select the R Server Standalone shared features. A R server could be useful if you want to perform large scale data analysis on a Hadoop Cluster, or other Non-SQL database like Teradata.

SQL Server Steps to Enable R

SQL Server 2016 is installed, especially if it was installed by others, you may be wondering if the R service really is installed. Take a look at the services running on the machine with SQL Server. If the SQL Server Launchpad service is running, as shown below, the R services are installed and running.  ServicesRunningR

The last thing needed to run R is to configure and restart the SQL Server Services. In a new query type the following command
sp_configure 'external scripts enabled', 1
GO
Reconfigure
GO

After restarting the SQL Server Service, SQL Server will now run R code. The following text can be run on any SQL Server 2016 instance to see if R has been configured correctly

EXEC sp_execute_external_script @language =N'R',
@script=N'OutputDataSet <-InputDataSet',
@input_data_1 =N'SELECT 1 as CheckToSeeIfRIsWorking'
WITH result sets (([CheckToSeeIfRIsWorking] int not null));
GO

The code is executed as an external script, specifying that the language used should be R. @script contains the R code, which is a simple command to take the mean of the data coming from the InputDataSet. @Input_Data_1 contains the location of the data to be processed. Of course the R code could of course be more complicated, but this code example is generic enough to test for everyone to ensure R is set up properly on SQL Server 2016.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Introduction to R – The Follow Up

gdiPhxI really enjoyed the opportunity to share R with GDI Phoenix. Since very few people in the room were familiar with the language, it was great to be able to show something new. The audience was great, and provided some wonderful feedback. With Microsoft’s purchase of R in 2015, many of the disadvantages of the open source version of R, including not being able to process huge data sets, speed of running the code and deployment have been resolved with R Server, which will be released with SQL Server 2016. The integration of R in SQL Server will only increase the demand for R skills, providing a great incentive for people to get started to learn the language now.

As promised I have provided links to the items I covered in the talk. Thanks so much for inviting me and I hope to have the opportunity to present again.

All the Links Need to Get Started Learning R

R Language

R Studio (UI)

Visual Studio Community R Tools

Microsoft R Open

Swirl – R Tutorial

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

2015: Year End Wrap up for Releases and More

As 2015 draws to a close, I started thinking back about everything that has happened this year. 2015 GraphicTechnically this has been a big year as a many new applications were released. Here are just some of them, with links included to provide more detail.

This short list could be a lot longer as it doesn’t count the number of updates released to Power BI, which occur several times a month, the CTP releases for SQL Server 2016, the new web version of BIML, or PowerShell. It’s really hard to keep up with everything that is changing. It’s a good thing that so many people are willing to help others learn how through speaking and blogs which make learning new things easier.

Community Involvement in 2015

Keeping up with all of these events is difficult, especially given the pace of releases.  I spend a lot of time reading various blogs, watching videos and going to hear people speak. I also have been able to talk about topics of particular interest, many Power BI and Machine Learning. This year I spoke a different times at a number of different events including: Speaker Idol, two different user groups, seven webinars, five SQL Saturdays and other Tech Events. I’ve got a number of engagements on the books for next year, including PASS BA Con and SQL Saturday #461 – Austin. 2016 is shaping up to be busy too and hopefully our paths will cross.  I list all of my speaking events on my Engagement Page and I hope that you might take a look at it from time to time if you are interested in catching up in person sometime. Next year I am hoping my list of speaking engagements changes somewhat as I plan on trying harder to get accepted to speak at events where I submitted and was turned down in 2015. On a more positive note, views of my blog are up 1000%, and the number of website subscribers has more than doubled. Thank you very much for continuing to read this site and I hope you find my thoughts helpful. I posted once a week this year, which I thought was pretty good until I talked to Ken Fischer b | t who blogs twice a week. I’ll have to try harder next year. If you think of a topic you think would make a good blog post, let me know as I am always interested in feedback.

Keeping Up the Pace in 2016

Next year there will be no slowdown in the things to learn as SQL Server 2016 is going to be released. Although the exact date has not been announced, my sources tell me to look for it around May-June. The next release of SQL Server is going to be huge as it will include new tools Microsoft added to integrate Big Data and open source platforms to SQL Server. PolyBase, JSON and R are all going to be part of with SQL Server. Personally, I find the R integration most Datazen and SSRS are going to be integrated in the next release too which should really increase the implementation of mobile reporting visualizations.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

R – Why Learn it?

I was talking to someone who is looking to change careers to pursue a technical field. She has been talking to people from various tech companies and has signed up for an intensive training bootcamp designed to teach enough programming to get a job after the class. She has been hearing all about various types of open source languages, but one which has never come up in previous conversations was R. That surprised me since R is on the list of the most popular list of languages (SQL didn’t make the list) as well as list for the programming languages in high demand. She asked me why I thought R would be a good language to learn? Since I had the same thought myself when I started to learn R, I thought I’d answer it. R helps bring meaning to data through its ability to combine data analysis and visualization. Data is important because nearly every application, from FitBit to various flashlight apps, are for better or worse all about data.

Combining Analysis with Visualization

To get started learning R, I took a MOOC class on it. While this appeared to be a good idea, after being in the class for five weeks, I had no idea why the language was considered useful. All we did was load arrays of data into memory and then write some code which approximated aggregation and selection which could be done in SQL. After five weeks, I dropped the class as I had other time commitments and it didn’t seem worth it. I was still interested in finding out what the big deal was. After playing around with R and watching other people play around with it who could make it do a lot more than I could, I got it.

A Picture Is Worth a Thousand Words

BubbleChartThe cliché about a picture is often repeated because it holds true. I have taught a number of Power BI classes, and during the class, we review a giant spreadsheet of data, and I ask the class tell me over time what the impact is over time. With a giant spreadsheet, you can’t readily determine what the answer is. Once we create a visualization of data, it’s easy to see the answer to that question. R has been working on providing graphical answers to data questions for years. A number of different companies are realizing the value of R as well. Microsoft bought Revolution Analytics, a leading R provider in 2014 and are rapidly incorporating R into other tools, like Power BI and SQL Server 2016. I’m convinced now and will be talking more about R in my blog in the future.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

 

 

Getting Started with R

RAbout two years ago, because I kept on reading that R was the language for analyzing data, I signed up for an online class in R. I took the class for four weeks, and then I got busy and couldn’t finish it. Now maybe they would have gotten around to the good parts if I had stayed longer, but in those four weeks I had no idea why you would want to use R. The classes had various projects to load data up into memory and analyze it, which I did thinking all the while “I can do in Excel, so why is this cool?”. I didn’t see anything that showed my why R was the tool for analyzing data. A little while later, I heard someone talking about visualizing data in R. Now I saw why people were so excited about it. This video from Revolution Analytics, which Microsoft bought last year, shows some of the cool visualizations you can do with R. Since Microsoft is including R support in SQL Server 2016, now might be a good time to start learning it.

Starting R from a SQL Perspective

Since R support in SQL Server 2016 will be available in the preview for CTP3, now might a good time to start learning it. When I wrote this post, CTP3 was not available, which is why all of my samples are created in the open source free tool, R Studio (go ahead click on the link and download it then come back). The application generally uses four quadrants. RscreenThe top left contains the code editor, below that is the console where you see the results of the code you run. Like SSMS, you can run what you highlight, which may or may not be everything in the code editor. R Studio sort of reminds me of PowerShell, as R has good help and doesn’t always tell you if your command contains an error or not. The top right contains the workspace, including data which has been loaded, and the bottom right contains tabs for showing your graphics or the help files. R includes you data to play with to get started, just type data() to see the data sets, so you don’t have to go fish for data sets on the internet.

Visualizations in R

If this was the first example I saw with R, I would have better understood why people are paying more for R developers than C# developers as it doesn’t take much to get started to chart your data. You do have to load the libraries for what you want to load. I chose to load a selection of BMI data from gapminder for my sample. I have a simple text file which contains Country, year and BMI. First I am going to reference the libraries beeswarm and ggplot2, then load the data and assign it to a variable. After that, I will call the beeswarm function and have it plot the data, then provide a legend to see which country is which. After I have it loaded I am going to call the boxplot function to overlay the data. Not much code and a lot more satisfying than Hello World.

 install.packages("beeswarm")
 install.packages("ggplot2")
 library(beeswarm)
 library(ggplot2)
 setwd("D:/files/code/r/WorkingDir")
 BMIFemale<-read.csv("d:/Files/code/r/BMIfemaleCountriesSelected.csv")
 #View(BMIFemale) 
 beeswarm(BMIFemale$BMI ~ BMIFemale$Year
 ,data=BMIFemale, pch=20
 ,ylab='BMI', xlab='Year',
 ,pwcol=(Country)
 ,labels=c(BMIFemale$Year)
 )
 legend('topleft',legend =levels(BMIFemale$Country), title='Country',
 pch=20, col=1:50
 )
 boxplot(BMIFemale$BMI~BMIFemale$Year, data=BMIFemale, add=T
 , col="#0000ff18"
 )

Why should I use R ?

I hope this post provided a quick way of getting started with R. I will be devoting more space on this blog to provide other reasons why you might want to use R, especially when it comes to Machine Learning and explaining further R in relation to what you can do with SQL or other tools like Excel so you can better understand it’s place in the data environment. If you are interested in reading my upcoming posts on this topic, feel free to subscribe to my blog to get the updates.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur