Introduction to R – The Follow Up

gdiPhxI really enjoyed the opportunity to share R with GDI Phoenix. Since very few people in the room were familiar with the language, it was great to be able to show something new. The audience was great, and provided some wonderful feedback. With Microsoft’s purchase of R in 2015, many of the disadvantages of the open source version of R, including not being able to process huge data sets, speed of running the code and deployment have been resolved with R Server, which will be released with SQL Server 2016. The integration of R in SQL Server will only increase the demand for R skills, providing a great incentive for people to get started to learn the language now.

As promised I have provided links to the items I covered in the talk. Thanks so much for inviting me and I hope to have the opportunity to present again.

All the Links Need to Get Started Learning R

R Language

R Studio (UI)

Visual Studio Community R Tools

Microsoft R Open

Swirl – R Tutorial

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Is a Data Model Needed in Power BI?

Power BI does of course need a data model, but often times there is no reason to create a new one for Power BI. After all the data model could always be contained within the data PowerBIDataModelsource Power BI is using. Figuring out if a data model is required is directly related to the data source(s) being used. In turn the selection of the data source also determines when to us Power BI online client and/or Power BI Desktop. Although they look the same, there are a couple of key differences which govern which to use.

When Should I Use the Power BI Online Client?

When comparing the features of the Online Client with the Desktop version of Power BI, there is one very obvious difference, there is no way to create a data model in Power BI online. It is not possible to create a data model using the online client. The online client is designed to connect to an existing online source such as Sales Force or Azure DB. If you are using an existing model, there is no need to create one. When using the enterprise gateway, which uses an on-premises database such as a SQL Server, SSAS or Hana, the data model is contained within the database exposed via the enterprise gateway, so again no reason exists to create a data model. Report creation can occur either using the online client or desktop as there is compelling technical reason that I am aware of which would determine where the report is created.

When Should I use the Power BI Desktop?

If the reports need to use data mashups, a data model is required to join the disparate data together. For example, if I need to create a report consisting of tables from two different SQL Server databases and an Excel Spreadsheet, relationships linking those tables need to be created. I need a data model, and since it is not possible to create a data model in Power BI Online, I will create my data model in Power BI Desktop. If later on the report data source needs to be changed, this is possible if you create it in Power BI Desktop. Again you have your choice of either creating the reports in Power BI Desktop or in the online client. Creating a data model in Power BI Desktop does not mean that you must to create the reports there too.

Refreshing the Power BI Report Data for the Online Client

The data model used in Power BI dictates how the report is refreshed. If you are using only a cloud based source, you don’t need a gateway of any kind. The data refresh can be automatic–meaning Power BI will do it for you, scheduled, or use a live dataset depending upon the source used. SQL Azure Database, SQL Azure Data Warehouse and Spark on HDInsight connect live. Changes in made in the source will be reflected in Power BI. The data source connected with an enterprise gateway operates the same way. When the source data is updated, Power BI will use the new data in the reports. Please note that the Analysis Services Connector has been deprecated. If you want to directly connect to a Tabular model located on-premises, use the enterprise gateway to make the connection. As I discussed in a previous post, the enterprise gateway uses local Active Directory information to connect, so users will need to have an AD Account with database access to be able to access data in an enterprise gateway. Unlike many web applications, you must set up each user to have access. All users cannot use one account to connect with the enterprise gateway, but you can do that with a personal gateway.

Personal Gateway – How to Refresh the data from Power BI Desktop

If you have created a data model in Power BI Desktop and want to refresh the data, install the personal gateway on a machine within your network, and set the schedule to perform the data refresh for each source within the model. The personal gateway uses the credentials which are entered when it is configured, which means that users who do not have database permissions can look at the data within Power BI updated by the personal gateway.

I hope this clears up some of the confusion surrounding the data models, the Power BI online client and Power BI Desktop. If you have any questions or are interested in other Power BI topics, please ping me on twitter @DesertIsleSql or post a comment as I am always interested in your feedback.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

SSIS – Resolving “Failed to Deploy the Project” Messages

Have you seen this screen when trying to deploy?
SSISFailDeploy

Needless to say, this indicates the SSIS package didn’t load to the server. This is especially frustrating as this particular package worked fine in Visual Studio. Believe it or not this screen actually contains information which can be used to resolve the error.

Finding a Useful Error Message

How do you go about determining what to do? Click on the word Failed. This will bring up some useful information, although it may not appear that way at first. Here’s the error message I received

SSISErrMessage

The message Failed to deploy project isn’t very useful, but the rest of the message is. The operation_messages view lives in SSISDB, and the operation identifier number is how to determine what the error is. Run this query, using the number provided in the error message, which in this case is 173

Select * from catalog.operation_messages where operation_id = 173

Here are the results from that query.

operation_message_id operation_id message_time message_type message_source_type message extended_info_id
50719 173 2016-02-29 15:02:08.2478928 -07:00 120 20 Failed to deploy the project. Fix the problems and try again later.:SqlDateTime overflow. Must be between 1/1/1753 12:00:00 AM and 12/31/9999 11:59:59 PM. NULL

 

Now this message is quite useful as it provides information that I can use to fix the issue. This SSIS Project contains a date parameter BackDate, which I had not set. Here’s the parameter.

SSISVSParameters

I set this parameter to a date between 1/1/1753 and 12/31/9999 and deployed the project again. This time, no error.

I hope that you have found this post helpful, especially if you haven’t deployed a package to SSIS in SQL Server 2012 or later.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Help! The Power BI Enterprise Gateway is not Working

Given some of the questions I’ve been getting recently there seems to be a lot of confusion regarding how the Power BI Enterprise Gateway works. The most common scenario seems to be that one person, probably the one who created the gateway, can use it and create reports, but when the reports are shared to others, there is no data, just an error message. Here’s the TL;DR verison. When using the Power BI Enterprise Gateway the Power BI user needs to use the security model of the on premises data source. Sounds simple enough, but in practice things can and do go wrong. Before you bother to read any further, if you can’t for a number of reasons change any Active Directory or Power BI User information, you can stop reading now and just use the Personal Gateway. All of the data access issues described here are specific to the Enterprise Gateway.

What is Required to get a Power BI user to use data from an Enterprise Gateway?

Still reading? Ok, so here’s another description on how the Enterprise Gateway works. When the Enterprise KnockingonDoorPowerBIGateway is installed, it puts in a secure door into the firewall. Power BI is the little man knocking on the door. The gateway asks two questions: Who are you? And What is your Password? The answers to those questions come from Power BI, in the form of your email address the and the password used to login to the online application. If those don’t match what is found in Active Directory, entry is denied. The reports will blank. In other words, AD is saying, I Hear You Knocking, but You can’t Come In, which is also a great song lyric.

Resolving Enterprise Gateway Access Issues

The first thing that you need to do is ensure that the user has an ID on the network containing the data source which has permissions to access the database in the Enterprise Gateway. Second, the password on both the network account and the Power BI account must be the same. If they are not, change one. Lastly you need to check the UPN [User Principal Name] to make sure that the Power BI email address matches what the UPN has in Active Directory. A UPN is sort of a replacement for an account name and does not need to be the same as the account name used to log into the network. As a side note, while you can run a SQL which will show that you are getting an error, it won’t show what the userid and password being sent by Power BI are. The email address in the UPN does not have to match the email set up for the account, and chances are if it is not working, they are not the same. To review the settings, you are going to need access to Active Direct Directory and check the UPN for the user who cannot access the data exposed by Enterprise Gateway. Once these issues have been resolved, the firewall will be opened and users will have the same access to data on Power BI on the internet as they have on the On premises server.

Let me know if you’ve found this post helpful or would like to hear more on this topic.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Best Practices for the care and use of the SSISDB

Deploying SSIS packages since the release of SQL Server 2012 and beyond, has moved to the SSISDB database. If you are using SQL Server 2012 and beyond and are not using the SSISDB for deployment, my sincere condolences as maintaining and deploying packages any other way is a maintenance headache which thankfully has been resolved. As the SSISDB has not been used for very long, the appropriate care and feeding of this database is not well known. SSISDB is part of Integration Services Catalogs. It is not possible to create an SSISDB without first creating an SSISDB catalog, which does not happen when SQL Server is installed. Follow Microsoft’s instructions for creating a SSISDB Catalog, which creates the SSIS catalog and the SSISDB database. You cannot rename the database, as SSISDB is the name of the database that Integration Services uses internally.

Backup the SSISDB

A client asked me recently why he should back up the SSISDB database. While you can recreate everything inside of the SSISDB, it will take time and you will have to remember exactly how all of your variables were set. Restoring the backup decreases this issue and having a backup allows a server to be redeployed quickly. When you do back up the database, make sure that you remember to backup the database certificate, which is created when the SSISDB is created as well, as you will need this to do a restore. By default. the recovery model of the SSISDB is set to Full. If the packages in SSISDB are changing minute by minute, full would make sense, but given that an SSISDB contains packages which are run on a scheduled basis, most likely the changes made are infrequent. Change the recovery model to simple.

Managing SSISDB Growth Over Time

SSISCatalogSettingsSSISDB contains all of the data used for the reports created when SSIS packages are run. Right click on the SSISDB icon underneath the Integration Services Catalog and take a look at the settings. The default settings are listed here, and to decrease the size of the SSISB over time, you may which to change them. The Retention Period is set to 365 days. Many environments don’t look at reports greater than 90 days, as information prior to that timeframe isn’t very meaningful. If that is the case, change the retention period to the number of days someone is actually going to look at the report, which will decrease the amount of data stored in the database. To get rid of the logs, the setting Clean Logs Periodically needs to be set to True, so don’t change it.

The Server-wide Default Logging Level is by default set to Basic. The information provided at this level is generally what is needed to troubleshoot any issues. Don’t set the logging to None just to save space. If the SSIS code ever crashes, the person doing this may be cursed. If you have simple jobs though, you may find the logging level of Performance may be adequate. Check out Microsoft’s documentation on Logging to better understand the differences between levels.

Lastly if using version control in another application, such as TFS, there may not be a need to set the Maximum Number of Versions per Product to 10. Generally speaking most people don’t look past the last 3 versions. The number should reflect what is practically, which is nearly always less than 10.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

 

 

Who Do You Work For?

Who do you work for?” seems like an obvious question, after all you work for Company X or yourself, but is that really the answer? I recently read an interesting blog post from Mike Fal b | t who recently started a new job and talked about the things he finds important when selecting a position. After reading his post, I thought about a comment I heard about working which has stayed in my head ever since.

You work for your Immediate Manager

bossA few jobs ago, I was working for a company which was purchased by another company. Changes were coming, but they hadn’t happened, yet. I was working for Tyler, who was soon not going to be my manager. He knew I was not going to be working for him soon, and at that time I didn’t know I would be getting a new manager. We had a conversation about the upcoming changes where Tyler told me  you really don’t work for Company X, you work for your immediate boss. He’s right. After all working for a company is one thing, but where the rubber meets the road is when someone directs what you do during the day. The ability to make your life miserable or make you happy to come to work comes from your supervisor, not from the company. One person’s input is a lot smaller picture than Company X, more immediate, and more intense. When Tyler was not my manager, I realized how right he was. I didn’t think much of the new manager and left.

How do you Determine Where to Work?

Because people are such a large part of the working environment, a change in management is a big deal in determining if you want to stay or not. It also explains why two people who work for Company X may have two different perspectives, especially if it is a large company. A friend of mine quit Company Z, which is a large company that continually has very high marks for what a great company it is. Employee surveys continually rank it near the top of several Best Company’s for Employees to Work lists. He quit because he didn’t like his manager. He thought a number of people we knew in common were great, but that couldn’t overcome his bad manager.

Weighing the Criteria

When management is not a consideration, then the criteria change from people to tasks. Quality of work, ability to learn and apply new skills, career advancement, monetary compensation, working environment, scheduling are important considerations. Since rarely is one able to really determine the management question prior to being in a position, these tangible criteria are the only thing one can use to make a decision on where to work. Many times though, this information isn’t enough, and you only find out after you make a decision if it was the right one.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Why the Power Pivot Tab may be Missing in Excel 2016

If you have recently upgraded to Excel 2016 and don’t have a Power Pivot tab available to you, there are a couple things you can do. You might want to go to the Developer tab and click on the COM Add-ins icon, which will display a list of the Add-ins available. If Power Pivot is available, selecting this options will allow you to use Power Pivot. But what if Power Pivot isn’t there? Well unlike Excel 2013, there is nothing for you to go download from Microsoft. Chances are if you don’t have the option listed in the COM Add-ins window, you are going to have to part with some additional cash to get the Power Pivot tab.

Business Analytics Features are no longer included in all Versions of Excel

Power Pivot is considered a Business Analytics feature, but What-if Analysis and Forecast Sheet are not. Seems to be an interesting definition of Analytics Features. If you want Power Pivot, you are going to have to pay for it. Here’s

Extracted from https://blogs.office.com/2015/09/18/new-ways-to-get-the-excel-business-analytics-features-you-need/

Extracted from https://blogs.office.com/2015/09/18/new-ways-to-get-the-excel-business-analytics-features-you-need/

a clip from Microsoft’s website intending to clear up what versions include Power Pivot. Looking at this graphic, this is no way lists all of the versions of Excel which Microsoft sells. What about Office 365 Enterprise E1? Surely you would get Power Pivot functionality with that right? No. How much more money is Power Pivot going to cost you? Well, if you have Office 365, you are paying $8 a month for the Office software, including Excel. There is no guarantee that spending more money will provide Power Pivot though. Office 365 ProPlus, which has Power Pivot, will run you $12 a month. If you have Office Small Business Premium, which runs $12.50 a month you won’t get Power Pivot. Check the version of Excel 2016 by going to File->Account then look at what is listed. If the version isn’t Office 365 Pro Plus or one of the other versions listed in the graphic, there will be no way to make Power Pivot appear.

Power BI: The tool for Desktop Data Analytics

Excel 2016 is the first version to be released after Power BI moved to it’s own application. While Excel received the visualizations of Power BI, Excel did not inherit all of the data modeling capabilities of Power BI. The difference in Power Pivot is the diagram now shows the directionality of the relationship through arrows, a far cry from Power BI’s data modeling capability. Even when it is available, Power View is turned off in Excel 2016, and the reason given for this is “The interactive visual experience provided by Power View is now available in Power BI Desktop”. This sounds like a clear drive to separate the two products. I understand the desire to separate the two products. Changing the licensing model where people don’t know if the version of Excel they have will do what they need it to do, I don’t think is a good way to get people to use Power BI.
Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

2016 Releases of SQL Server and Excel

It’s been 2016 for over a month now, so hopefully you are still not changing 2016your 5 into a 6 still. Chances are you haven’t changed your software to reflect the new year yet. Office 2016 came out in September of 2015 and SQL Server 2016 hasn’t been released yet. It’s hard to keep up with all of the version changes that have come out, especially when you throw in Power BI which is has something new every month. If you know you are going to be upgrading to one or both of these versions, or want to learn more about SQL Server or Excel so you can decide if  is worth the upgrade effort, this week I might be able to help out. I am going to be talking about the 2016 version of SQL Server on Wednesday, February 10 at the Arizona SQL Server User Group meeting and then talk about the 2016 version of Excel on Thursday at the Excel BI SQL Pass Virtual Chapter. If you are not in Arizona right now, you are missing out as we are having Department of Tourism weather of 80 degrees. You can get back to me in August when I am melting in the 115 degree heat.

Polybase in SQL Server 2016

Since there are many new features to talk about in SQL Server 2016, I picked Polybase. As big data matures many places are looking to keep their structured data right where it is and create an HDFS cluster to store other data. Polybase allows SQL Server 2016 users to look at both all in one place.

Excel 2016

It’s been a while since September 22, 2015, the date Office 2016 was released , but I still know very few people who have upgraded. I’ve been to a few clients that hope to upgrade to Excel 2012 this year. In this session, I will show where things got moved and renamed, what’s new and what is on the deprecated list. If you don’t have 2016 installed yet, or if you do and wonder where Power Query went, please join me to hear all about it. Generally speaking, the Virtual Chapters are posted on Youtube, and when they are I will have a link available. Unfortunately for those who attended my last Excel BI Virtual Chapter Meeting, due to technical difficulties that recording is not available, but hopefully this time everything will work.  When the recording is available I will make sure a link it is available on my blog for those who can’t make it.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

DIY Guide to Content Packs

SQL Saturday Austin - 461 2016

As I was rehearsing my Using Power BI when Implementing Data Analytics Management Practices presentation for SQL Saturday Austin, I realized that I wasn’t going to have time to cover everything I wanted to cover. One of the important methods for implementing data management practices in Power BI is using content packs. Content packs are a method of sharing reports and data throughout your organization so everyone doesn’t need to create data model and those people who do can share them with everyone else. For this reason I highly recommend using content packs in Power BI. This guide will walk you through the steps needed to create content packs. If you don’t have Power BI Pro, you can stop reading now, as content packs are a feature only available in the Power BI Pro version. There are a number of steps which will need to be completed to use a content pack within Power BI, and I’ve listed them all in order below. Depending on where you are in the process you might want to skip to the Creating a Content Pack section, but I thought it was important to include everything that should be completed first.

Create a Data Model in Power BI Desktop

For this example, you will need to create a data model in Power BI Desktop, and for this demo, create one report too. I’m not going to review how to do that here, but Microsoft has a video guide to creating Power BI Desktop models here. Save the .pbix file. After saving the file, go to PowerBI.com and login.

Recommendation: Use a Group Workspace

The next step is optional, but if you work with other people, I recommend it. If you don’t have one already, I highly recommend that you create a group workspace. That way the ownership is shared, meaning that if you take a day off, someone else has the ability to access the information. Click on the My Workspace item on the left hand side of the Power BI page. Click on the + (plus) adjacent to the Group Workspaces item. On the right hand side of the screen a form will pop up with fields for Group Name, Privacy, and Group Members. Make sure that you complete the form and save it. Double click on the new group workspace item to open it. If you happen to have an Office 365 Exchange license, creating a group workspace will also create a Group One Drive. This is a great place to put data so that you can all share it and see the file from within Power BI. Now that I have this one drive location created, I am going to copy my newly created Power BI file to it.

Using a Desktop file on Power BI.com

To use the Power Desktop file within PowerBI.com, the next step is to upload the Power BI Desktop file to the web as a dataset. Either clicking on the + (plus) button next to the words Datasets, or click on the Get Data button on the bottom of the screen. Both options will get you to the Get Data screen. We want to Import the Power BI Desktop file, so click on the get button in the Files box. The screen will change to the file location section. Select Local file and upload the Power BI file.

Data Refresh

PersonalGatewayPowerBIConfigurationScreenEnsuring that the data set refreshed, which allows everyone to have current data, requires updating the data with a gateway. For this example I am going to use the Power BI Personal Gateway because I plan to include multiple data sources instead of just SQL Server  and Power BI web application to schedule the data refresh. Assuming I have already installed the Power BI Gateway, Click on the (ellipse) next to the Power BI Desktop file just loaded to the data set, and a box will pop up with a list of features on the bottom of the popup box. Select Schedule Refresh, which will bring up the screen shown.

Assuming the Personal Gateway is online and the Data Source Credentials are ok, change the Schedule Refresh from the default Off to On. Set the Refresh Frequency to one of the available options. If you want to update the data more than once a day, click on the option Add another time. When you have finished adding times, click on the Apply button to save the contents.

Creating a Content Pack

To create a content pack, ideally you want to share a data model which has working appropriately scheduled updates. That way anyone who wants to create a report doesn’t have to worry about having valid working data. Instead they can work on providing meaningful visualizations to business problems.

If you want to create a content pack or use one, the step is the same. Click on the yellow Get Data button on the bottom left corner of the screen. That will change the active window to the Get Data window. On the left hand side of the screen under the words Content Pack Library, there are two options. Click on the Get button from the one on the left, My Organization. Click on the button labeled Create Content Pack. The following screen will be displayed.CreateContentPacks

There are a number of options on the Create content pack screen, starting with the Choose who will have access to this content Pack button. I have selected the option My entire organization. You may want to create different content packs for different groups of users. If you have exchange groups set up, such as Accounting@desertislesql.com which would send an email to everyone listed in the email group, you can enter that email group. If you just want to add a list of emails for people within your organization, you can do that as well.

In the sample Create Content Pack screen shown, I have filled in the blanks, selected my Power BI Desktop file I just added and uploaded a company logo. Once you click on the Publish button, the screen will close and you will get a success window which briefly appears on the top of the screen. The content pack is now ready to use. Click on the Get Data button again, and the new content pack is available to use. When I select the newly created content pack AWDW, I am provided a new window with a big Connect button in it. Click on the Connect button. The data set and any reports connected to it will have yellow stars next to them.

Every one who uses this new data set can be guaranteed a data set which updated on the same schedule, and different people can now create visualizations with one shared dat aset which can be used many different times.

Data Management within an Organization

Having helped a number of organizations implement Power BI, one of the big issues I have seen are not related to the product but related to the processes within the organization which are used to support the data needs of a variety of different users. These processes tend to be the reason a self-service business intelligent process is successful or not. Using Content packs can be a part of that solution which is why I look forward to sharing what I’ve learned at SQL Saturday Austin – 461. I hope to see you Deep in the Heart of Texas!

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

I’m on TV!

Old-fashioned four legged TV set isolatedI’m on TV! Not just any TV, Excel.TV, which unless you stream the internet on your TV set like I do, probably is on a computer monitor, but in my world it still counts. Please check out Episode 40 if you want to see a recording of the show where I talk about Power BI.

There is kind of an interesting story as to how I ended up being asked to be on the show. I was teaching a Power BI class, and Jordan Goldmeier b | t was talking it. Jordan started following me on twitter, and tweeted that he was in my class. I was really surprised and intimidated that an Excel MVP who has written a number of books on Excel was taking a class from me. My class was in the pre-Power BI desktop days, so I was going over the 4 Powers in Excel, Power Pivot, Power Query, Power View and Power Map. Jordan later told me that he learned something in my class. I was relieved. We’ve kept in contact since then, mainly via twitter, which is how I got asked to be on the show.

I was really impressed by Excel.TV and everything that Rick Grantham b | t , Szilvia Juhasz b | t and Jordan do to make the very professional, with graphics and sound effects. They asked me on to talk about Power BI. While on the ExcelTVshow, Rick asked me about the variety of things I have on my blog, and I got to thinking about it. Whether I use Power BI, Excel, Machine Learning, SSIS, SSAS or R, I am trying to do the same thing, make sense of the data and use the data to provide answers. You can call that data science or business analysis or business intelligence, but whatever the label or the tool, I think that really covers what I like to do.

I really enjoyed being interviewed, and I look forward to catching up with Rick and Jordan at the PASS Business Analytics Conference in May where we will all be speaking May 3-4. Maybe I’ll see you there too?

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur