03/13/2018
by Marlon Ribunal
2 Comments

Getting Started With R In SQL Server 2017

Prior to SQL Server 2016, running R on Microsoft platform required an installation of the R Language and its environment and an external IDE to write R codes. Then came the Microsoft R Server and its own flavor of R, the Microsoft R Open. With SQL Server 2016, and now 2017, Microsoft has brought R (and Python) closer to the database engine – SQL Server in this case.

I have a SQL Server 2016 instance that I upgraded to SQL Server 2017 with In-Database Machine Learning. I was not going to reinvent the wheel here, hence the upgrade.

Note: If you’re upgrading SQL Server 2016 Reporting Services to the 2017 version, the SQL Server 2017 RTM installer does include SSRS. It actually uninstalls the service. There is now a separate installer for SSRS 2017. So, you have to plan your SSRS migration prior to the upgrade.

Here are the general steps to install or upgrade an older version of SQL Server, 2016 in this case, to the 2017 version.

Let’s start with the SQL Server Installation Center. Select Upgrade from a previous version of SQL Server because that is what we’re doing here – upgrade from 2016 to 2017.

After  entering the product key, accepting the license terms, and  selecting the instance to upgrade, the installer asks you to confirm the separate migration of the SQL Server Reporting Services

Also, please take note that the SQL Server 2017 RTM installer does not allow changing features. So, it installs both the In-Database Machine Learning Services and the Standalone Machine Learning Server which you can access via Microsoft R Client.

The installer includes Microsoft R Open which is a good thing because you do not have to install and configure it separately.

Configure the instance, Service Accounts, and Full-Text. Check the Summary, then hit Upgrade

 

Before you can run R in SQL Server 2017, you need to enable external script to allow the sp_execute_external_script system stored procedure to run. This is not on by default.

EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE WITH OVERRIDE;

Once that is set, let’s execute the obligatory Hello, World! in R + TSQL on SQL Operations Studio.

EXEC sp_execute_external_script

@language =N'R',

@script=N'OutputDataSet<-InputDataSet',

@input_data_1 =N'SELECT ''Hello R From TSQL'''

WITH RESULT SETS (([HelloR] VARCHAR(100) notnull));

GO

Enjoy! I hope to write more about R and Machine Learning in the future posts.

03/06/2018
by Marlon Ribunal
1 Comment

Book Review: SQL Server 2017 Machine Learning Services With R

Disclaimer: I’m one of the Technical Reviewers for this book.

With the rising demand for cloud computing and wide adaptation of cloud-based analytics platforms such as Data Lake Analytics and the likes, the chasm between the traditional Data Professionals and their need for analytics literacy is becoming wider and wider. There seems to be an insurmountable gap between the analytics discipline and the in-house SQL Server database platform pros who are looking to transition to Analytics and Data Science. If you’re a SQL Server pro and want to go after a data science career path, you have to “go out” of the SQL Server path and learn, for example, R or Python.

SQL Server 2016 has changed the data science landscape for the SQL Server platform. SQL Server 2016 R Services provided the data pro a direct path to Analytics and Data Science. It facilitated an easier way to and eliminated the barrier to entry for Data Science. The next version of SQL Server pushed the envelope further. SQL Server 2017 has basically become a database with built-in Artificial Intelligence with its In-database Machine Learning feature. Aside from R, you have another option to use Python. There is much, much more about SQL Server 2017 to be excited about!

If you are looking for a reading material for Machine Learning With R, there is this book published by Packt titled SQL Server 2017 Machine Learning with R: Data Exploration, Modeling and Advanced Analytics (Amazon) written by Tomaž Kaštrun (T | B) and Julie Koesmarno (T | B); and reviewed by Dave Wentzel (B) and yours truly. This book provides essential knowledge to get started with Machine Learning. “It gives you the foundational knowledge and insights to help you understand SQL Server 2017 Machine Learning with R,” the back cover says.

One of the strong features of R is its Visualization. Chapter 4 shows you around Data Exploration and Data Munging. You’ll learn about data frames and dplyr, how to use R with TSQL, integrating R in SSRS and Power BI.

Some statistics is covered in Chapter 5 as well as dataset subsetting and merging. Also covered in the chapter are statistical tests and sampling. Predictive Modeling is covered in Chapter 6.

In Chapter 7, Operationalizing R Code, you can see how R and TSQL work together to train and save a model, then operationalize the model using real-time scoring and native scoring. So, TSQL still plays a big part in Machine Learning; and, you’ll see that all throughout the book.

Chapter 8 covers information about deploying, managing, and monitoring database solutions containing R codes. If you are a SQL Server DBA, the whole Chapter 9, Machine Learning Services with R for DBAs, is for you. It talks about how you can use R to do DBA tasks such as prediction of disk usage. This is an example that you can build upon for more complicated DBA metrics or monitoring tasks. Chapter 10 deals with the extended built-in capabilities of JSON. It also discusses Columnstore and in-memory OLTP.

The book is an easy-to-follow step-by-step guide that will bring you up to speed with Machine Learning with R in SQL Server 2017.

 

02/27/2018
by Marlon Ribunal
Comments Off on Productivity For The Data Pro

Productivity For The Data Pro

Productivity is hard to measure when your work has no definite structure like a typical office job. You may operate within the traditional 9-5 office job but the nature of working around databases requires flexibility; and, the focus is shifting constantly minute by minute. One moment you’re looking at something so mundane; the next moment, you’re on another task trying to put out a fire on an incident you’ve never experienced before.

The job of the data pro revolves around what Paul Graham calls Maker’s Schedule, in which the block of time is defined in half-day increments. This is the opposite of Manager’s Schedule in which a day is divided into 1-hour chunks.

The time required to accomplish one thing on any given day varies from one task to another. But as a maker, the data pro can only find productivity success if he or she can manage to get through the day with intense focus despite the usual interruption.

The following productivity elements can help you get things done amidst the chaos:

The List To Live By

The old, trusty List might be, well, old school but you need it to survive life as a data pro. You live and die by what you put in or not put in on your list. Listing your task by priority can go a long way. It is a guide that keeps you on track with the day’s objective – that is to finish as much work as possible.

The list can give you a thousand feet view of your day. It will also give you an idea of how would you want to tackle your tasks. The list is a great tool for planning your day.

The Productive Emotions

There are the must-do tasks that go on top of your list. Any showstoppers would go on top. How do you prioritize the remaining tasks? Some would do the snowball trick – do the smallest task first to gain momentum. Or, some tend to eat their frog first – tasks that they are most likely to procrastinate on.

Ask yourself these questions when prioritizing your tasks: Which task, when done, can give me that instant rush of a sense of accomplishment? Which task is most likely to give me that momentum that I need? Which task simply makes me happy (for whatever reason) to be doing? Which task am I trying to avoid but will be happy to get rid of in my todo list?

Simple questions like these will determine the order of your priority.

Choose Your Poison

Getting Things Done (GTD) is one of the most popular productivity systems. As a system, it is a guide to maximizing output and minimizing input. The core of this system is its 5-stage Workflow: CollectProcessOrganizeReview, and Do. The List is part of the first stage, Collect. GTD could be overwhelming. Here’s how Doug Purnell (t | b) has adopted the system into his daily routines.

Kanban is another popular system. It gives you a visual representation of your tasks and their statuses. Your Kanban board can give you a quick view of the progress status of all the tasks on your plate. There’s nothing more simple than marking/moving your task from Do, Doing, and Done statuses. You always aim to move all your tasks under the Done column at the end of each day.

Manage Time Management

The doing part can be tricky. Some interruptions are beyond our control. There is also the issue of procrastination. An effective tool to manage interruption and procrastination is the Pomodoro Technique. Pomodoro is a time-boxing productivity technique that forces you to remain focus on the task at hand for 25 minutes with a 5-minute break at the end of that 25 minutes. These count as one (1) Pomodoro. You can take 15-30 minutes break after your fourth Pomodoro. A task should not exceed 7 Pomodoros. If a task requires more than that, then break the task into sub-tasks.

The Pomodoro Technique is an implementation of both Time-Boxing and Batching concepts. You can adapt these concepts as separate systems and apply them to your routine.

What’s your productivity system? Do you have productivity tips and tricks? Share them in the comment below.