04/03/2018
by Marlon Ribunal
3 Comments

Creating Bar Graph In SQL Server 2017 Using R

Creating a bar graph with R in SQL Server 2017 is relatively easy. It’s just a matter of finding the correct functions, components, and properties of the package of your choice. For our example here, I am using ggplot2 which is probably the most popular plotting system for R.

Let’s create a quick colorful bar graph. This should demonstrate how easy it is to use ggplot2 in SQL Server 2017 with R. “Easy”, of course, is a relative term. Well, it’s not easy on the onset as you have to get yourself familiarized with all the components of the package. But once you’re doing this on regular basis, you’ll get the hang of it. Think of the time when you’re still learning how to write your first Select statement in SQL.

Cheesy Bar Graph

Let’s begin with the end in mind. Here’s our cheesy colorful bar graph:

Install Package

Let’s install the package if it hasn’t been installed yet. The easiest way to do that is to run RGUI.exe that came with your SQL Server 2017 In-Database Machine Learning installation. You can find it here:

C:\Program Files\MSSQL14.MSSQLSERVER\R_SERVICES\bin\x64

Take note that you need to run the executable as Administrator. Also, if you’ve installed the R engine prior to your SQL Server 2017 In-Database Machine Learning with R, you have to explicitly tell the R package installer where you want your package installed.

> install.packages("ggplot2", lib="C:\\Program Files\\Microsoft SQL Server\\MSSQL14.MSSQLSERVER\\R_SERVICES\\library", dep = TRUE)

dep = TRUE tells the installer to install dependencies. ggplot2 depends on a lot of other packages. You can check dependencies using MiniCRAN.

Dataset

Our dataset is based on the WorldWideImporters database. Here we’ve got the Sum of Line Profit per Stock Group:

USE WideWorldImporters;
GO

SELECT sg.StockGroupName,
SUM(il.LineProfit) AS LineProfit
FROM Sales.InvoiceLines il
JOIN Warehouse.StockItems si
ON il.StockItemID = si.StockItemID
JOIN Warehouse.StockItemStockGroups sis
ON si.StockItemID = sis.StockItemID
JOIN Warehouse.StockGroups sg
ON sis.StockGroupID = sg.StockGroupID
GROUP BY sg.StockGroupName;

Setup

Let’s create a directory for the graphics output –C:\Marlon_Test.

Here’s the script that we pass to @script parameter for sp_execute_external_script:

library(ggplot2);
chartStockGroupProfit <- "C:\\Marlon_test\\StockGroupProfit.png"
png(filename=chartStockGroupProfit, width=1000, height=800)
StockGroupProfit <- InputDataSet
StockGroupProfitChart <- ggplot(data=StockGroupProfit, aes(x=StockGroupName, y=LineProfit)) +
geom_bar(stat="identity")
print(StockGroupProfitChart)

Where:

chartStockGroupProfit - Filename

StockGroupProfit - Data Frame based on the InputDataSet (see Select Statement above)

StockGroupProfitChart - The actual chart

Create the Bar Graph

Let’s put that together. Note: Although you specified you wanted to install the dependencies for the ggplot2 package (dep = TRUE), the installer in some cases might not install all of them. If the ggplot2 fails with “missing package” error, just install the specified package using RGui.exe (see above).

EXEC sp_execute_external_script
@language = N'R',
@script = N'
library(ggplot2);
chartStockGroupProfit <- "C:\\Marlon_test\\StockGroupProfit.png"
png(filename=chartStockGroupProfit, width=1000, height=800)
StockGroupProfit <- InputDataSet
StockGroupProfitChart <- ggplot(data=StockGroupProfit, aes(x=StockGroupName, y=LineProfit)) +
geom_bar(stat="identity")
print(StockGroupProfitChart)',
@input_data_1 = N'
SELECT sg.StockGroupName,
SUM(il.LineProfit) AS LineProfit
FROM Sales.InvoiceLines il
JOIN Warehouse.StockItems si
ON il.StockItemID = si.StockItemID
JOIN Warehouse.StockItemStockGroups sis
ON si.StockItemID = sis.StockItemID
JOIN Warehouse.StockGroups sg
ON sis.StockGroupID = sg.StockGroupID
GROUP BY sg.StockGroupName;';

Bar Graph Format

You may argue that the above graph looks a lot better than our colorful graph at the top of this blog. But arguments aside, let’s put on some formatting.

You may notice that the Y-Axis (Line Profit) is not in Dollar ($) format. The Packaging Material group is more than $50,000,000.00.

Well, you can argue about the aesthetic of the graph, but you may agree that the exponential format of the Line Profit is not quite intuitive.

If we divide the LineProfit by the exponential value of 1e6 (1 times 10 to the power 6, which is 1,000,000), then we get a readable metrics:

aes(x=StockGroupName, y=LineProfit/1e6)

To format that value in Dollars ($), we need to add another library to the script – library(scales).

Then we can do:

scale_y_continuous(labels = dollar)

Rename Labels

You may notice that the Y-Axis Label says “LineProfit/1e+06”. That can easily be renamed by using the lab() function:

labs(x="Product Groups", y="Line Profit (Millions)")

To format the data frame labels, use the theme() function:

theme(text = element_text(size=20, color="blue"), axis.text.y = element_text(size=18, face="bold"), axis.text.x = element_text(size=18, face="bold", angle=80, vjust = 0.5))

Then, we should have something like this:

Reorder Bar Graph

Just to have the graph easy to read, let’s reorder the bar in descending order.

Plus, we don’t really like to look at bars that seem to resemble that of a dirty finger gesture (you’ll see it).

aes(x=reorder(StockGroupName,-LineProfit / 1e6), y=LineProfit / 1e6))

Rainbow Magic

To have that rainbow effect on the bars, modify the geom_bar() like this:

geom_bar(stat="identity",fill = rainbow(n=length(StockGroupProfit$LineProfit)))

Putting it all together

Here’s the complete R code:

EXEC sp_execute_external_script
@language = N'R',
@script = N'
# Use ggplot2/scales
library(ggplot2);
library(scales)
# Create chart file (folder must be existing)
chartStockGroupProfit <- "C:\\Marlon_test\\StockGroupProfit.png"
png(filename=chartStockGroupProfit, width=1000, height=800)
# Define data frame
StockGroupProfit <- InputDataSet
# Build chart
StockGroupProfitChart <- ggplot(data=StockGroupProfit, aes(x=reorder(StockGroupName,-LineProfit / 1e6), y=LineProfit / 1e6)) +
geom_bar(stat="identity",fill = rainbow(n=length(StockGroupProfit$LineProfit))) +
theme(text = element_text(size=20, color="blue"), axis.text.y = element_text(size=18, face="bold"), axis.text.x = element_text(size=18, face="bold", angle=80, vjust = 0.5)) +
labs(x="Product Groups", y="Line Profit (Millions)") +
scale_y_continuous(labels = dollar)
print(StockGroupProfitChart)',
@input_data_1 = N'
SELECT sg.StockGroupName,
SUM(il.LineProfit) AS LineProfit
FROM Sales.InvoiceLines il
JOIN Warehouse.StockItems si
ON il.StockItemID = si.StockItemID
JOIN Warehouse.StockItemStockGroups sis
ON si.StockItemID = sis.StockItemID
JOIN Warehouse.StockGroups sg
ON sis.StockGroupID = sg.StockGroupID
GROUP BY sg.StockGroupName;'

There you go. Happy plotting!

 

03/27/2018
by Marlon Ribunal
Comments Off on Getting Things Done With SSMS

Getting Things Done With SSMS

As a SQL Server pro, I spend most of my productive time in the SQL Server Management Studio (SSMS). I am currently using the latest version as of this writing, which is v17.6.

I set it up in different configurations according to my taste at any given time. After so many iterations of my preferences which slowly adapted to my working style, I think I found the “perfect” setup that suits me and the way I work.

I’d like to say that my setup is minimal and practical. It only contains settings and windows that I normally need in a typical day.

SSMS Environment Setting

For the general setting, I preferred the Light color theme. It looks clean and, well, light in a sense. It’s easy to focus on the working space where my eyeballs need to zoom in. There are so many interruptions around the working space either in the office or at home (I work from home 2 days a week). I want tools that are minimal and simple.

For the theme of the workspace, I prefer Aaron Bertrand’s (b | tDark Theme with Normal Fonts. I am wearing prescription glasses so I want a theme that’s easy on the eyes.

Registered Servers

Do you find it annoying logging into different servers and forgetting passwords most of the time? I do. I jump from one server to another all throughout the day. And remembering passwords is hard. That’s why I find the Registered Servers windows to be useful.

If I need to run a quick query against any of the servers, I’d just do a right-click on the server list and select Query which opens up a new Query Editor Window that is already connected to the server. No need to enter username and password to connect. That is because the credential is saved when I registered the server.

The added benefits of the Registered Servers Window is that you can backup the setting or share the configuration. I recently upgraded from SSMS v16 to v17x. There is no direct upgrade to v17.x so I have to install the newer version separately. I didn’t have to rebuild all the server one by one, I simply exported the setting from the older version to the newer version.

Object Explorer

The perfect pair for Registered Servers window is the Object Explorer. From the Registered Servers, I just do a right-click on the server and select Object Explorer. Again, there is no need to enter the username and password. I guess this is a common window. I placed it just below the Registered Server window just so I can quickly navigate between servers conveniently. These two windows are placed on the left-hand side of the editor.

Solution Explorer and Template Browser

On the right-hand side are the Solution Explorer and Template Browser windows which are docked on top of each other. The Solution Explorer contains scripts that I either wrote or harvested from the interwebs – scripts that I use day in, day out. The Template Browser is useful for when you forget the syntax of an object.

What are your setting preferences to maximize your productivity in SSMS? Share them in the comment below!

03/20/2018
by Marlon Ribunal
1 Comment

Identify R Package Dependencies Using miniCRAN

Installing R packages in SQL Server 2017 is a breeze with the stored procedure sp_execute_external_script. But so far, it is limited to installing one package at a time. If you have an R Package with dependencies, you have to install those in a separately.

Here’s an easy way to check for package dependencies using the package miniCRAN.

Using RGUI.exe

Find RGUI.exe in C:\Program Files\Microsoft SQL Server\MSSQL14.MSSQLSERVER\R_SERVICES\bin\x64. Run the executable as an administrator (Right-Click & Select “Run as administrator”).

If you want to add the new package to the default library, you have to tell the installer exactly that by using the lib argument. I previously installed R and its environment in my laptop. Without specifying a library, R would have installed the package in the old default library. See the list of my library paths below (.libPaths())

install.packages("miniCRAN", lib="C:\\Program Files\\Microsoft SQL Server\\MSSQL14.MSSQLSERVER\\R_SERVICES\\library")

 

Then you should get a message like this one:

trying URL 'https://mran.microsoft.com/snapshot/2017-05-01/bin/windows/contrib/3.3/miniCRAN_0.2.7.zip'
Content type 'application/zip' length 1025529 bytes (1001 KB)
downloaded 1001 KB

package ‘miniCRAN’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
C:\Users\Marlon.Ribunal\AppData\Local\Temp\Rtmp4UPMso\downloaded_packages

You can find all your library paths with the .libPaths() function.

> .libPaths() [1] "C:/Users/Marlon.Ribunal/Documents/R/win-library/3.3" [2] "C:/Program Files/Microsoft SQL Server/MSSQL14.MSSQLSERVER/R_SERVICES/library"

Then we can double check real quick if the package actually made it:

sp_execute_external_script @language = N'R',
 @script = N'OutputDataSet <- data.frame(installed.packages())';

 

Then

> library("miniCRAN")
> pkg <- "ggplot2"
> pkgDep(pkg)

Results:

 

You may not need all the listed packages. Check the web page for the package you’re installing for the info.