Changes to the Power BI Free Version Include No Sharing

NoSharingIncluded in the recent list of announcements Microsoft made about Power BI Local and Power BI Premium are a series of changes to the Power BI Free version which will go into effect on June 1. The free edition of Power BI will no longer be able to share reports. Currently free users could create reports and share them with others, which will be discontinued.  Only Power BI Pro Editions will be able to share reports.  Currently Power BI Pro users can create reports which can be shared with Free versions as long as no Pro features are used.  This means that if a Power BI report is set to automatically refresh the data, that report cannot be shared as Free versions do not have the ability to create reports which have data refreshed automatically. If the report was recreated to remove the automatic updates and instead refreshed manually, then the report could be shared with Free versions.  Starting June 1, the sharing feature will be removed. No longer can Power BI Pro users share anything to Power BI Free users.  If you have a Power BI Free account, there is no way to share information in the service. The Power BI Desktop will continue to be free but since you cannot print the content within it and sharing a PBIX file means that you will always be sharing the entire data model, this is of limited value.

Future Releases of the Free Version

Microsoft does plan on continuing the free version and improving it.  In the future, it will include features previous included only in the Pro version.  While previously the data sets which the Free version was able to connect with were limited, they will soon match all of the data sets included in the Pro version. Data refresh will be supported, as will streaming and higher data storage rates. Other than sharing and workgroups, which are pretty big features, Pro and Free will have the same feature set.

How Power BI Free Accounts Can Share for One Year

If you have a Free Power BI account and have logged into the account prior to May 2, you have a year to use a Pro license. It does not matter if you have previously used a Power BI Pro Trial.  This trial is a new one, and is available to anyone with a free account. After that, shared reports will not be accessible, unless the account starts paying for the Pro license.

There are a lot of conversations regarding the changes to the free account, and the other recent Power BI announcements.  In my next post, I will be discussing the Power BI Premier option.  To be notified of my latest posts, please subscribe to this blog using the link on this page.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Running Power BI Locally with the Power BI Report Server

Power BI Now Available on your Local Server

Power BI: Now available without being on the cloud

Microsoft had an lot of announcements about Power BI this week, so many that it was easy to miss some of the finer details, including those which are going to be important in making decisions going forward.  Since the announcements are changes which will be effective soon, in the case of the free tier of Power BI on June 1, and released “… generally available late in the second quarter of 2017” this will give Power BI users time to adjust to the changes. In a nutshell, Microsoft has announced they are adding a cloud service called Power BI Premium which will allow people to create capacity instead of per-user licenses, the free edition will no longer to be able to share files, Power BI Embedded is going to be migrated to the Power BI Service from Azure, and finally, at long last, it will be possible to run Power BI reports locally and without needing anything in the cloud.

Running Power BI without a Cloud

It is not possible to run Power BI reports locally right now, but sometime before the 1st of July 2017,  users who have SQL Server 2016 Enterprise Edition per-core and active Software Assurance [SA] can deploy Power BI Report Server.  This means that no one is going to have to wait for SQL Server 2017 for Power BI on premise as it will be available sometime in June.  The functionality in SQL Server 2017 SQL Server Reporting Server [SSRS]. Community Technology Preview edition is going to be available in Power BI Report Server, with the addition of the ability to include custom visuals and many data sources, which the CTP version did not do. The Power BI Server includes all of the functionality of SSRS This means that users will not need an SSRS Server and a Power BI Server, as the Power BI Server will be able to do both.  If you want to migrate all of the reports created in SSRS from 2008 R2, and SSRS Mobile Reports, you can migrate these reports to the new Power BI Report Server. You can use Power BI Reporting Server for reports created on earlier versions, as long as you have a version of SQL Server 2016 Enterprise per-core edition with SA. The Power BI Report Server will be a separate install with separate release schedules, which currently are planned about once a quarter. Power BI Report Server will also be able to publish reports to mobile devices as well. If the reports uses data in the cloud, you can employ a Data Gateway as the Power BI Reporting Server can use the gateway to access cloud data. Of course if all of the data in the report is located on-premises, no gateway will be required.

Power BI Pro Licenses for On-Premise Reporting

While there is going to be no additional cost for running reports locally, or looking at them, creating and sharing reports for the Power BI Report will require a Power BI Pro License.  The Power BI Desktop is going to be free, and there is still going to be a free version of Power BI. There will also be a  new desktop version of Power BI for Reporting Services which will be on the same version as the Server, which will have fewer updates. This means if you support Power BI Service Reports and Power BI Report Server Reports you will have two versions of the Desktop, the Reporting Services Power BI Desktop and the Power BI Service Desktop.  Both are designed to run on the same machine. So far I have not had any problems having both other than remembering which is which as the icons are the same.  You have to load the software to see that the top line has (Report Server).

Starting June 1, free Power BI license holders will no longer be able to share reports.  Reports created with a free license can be viewed only by the person with the free account.

Power BI Desktop does not have Dashboards, and neither will Power BI

When it is released, Power BI Report Server will be displaying reports created from the Power BI Desktop.  Dashboards are not created in the Power BI Desktop application, meaning that there will be no Power BI Dashboards in the Power BI Report Server.  While this may change in a later release, it is not available in the first release, which also does not support R or custom visuals either.  To display and distribute dashboards, use the Power BI service.

I am sure there will be more announcements about this and other upcoming Power BI features. Many will most likely be announced at Microsoft’s Data Summit Conference in June, which I will fortunately have the opportunity to attend.  If you are going to be there as well, drop me a line or ping me on twitter at @desertislesql and perhaps we can meet in person.
 ***Update I have a post which covers the released version of Power BI Report Server.  Click here to find what was changed since this post was written.
Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

 

IoT Security Concerns

When looking at IoT implementations, the topic of security always comes up. Many people remember October 21, 2016 as the day IoT devices broke the internet. After the investigation event was complete, it turns out the that it the outtages were not exactly caused by IoT devices. A majority of the denial of service attacks came from things like home routers, which most people would not classify as an IoT devices. When looking at all of the different ways that IoT devices can be modified to do bad things a few different ideas come to mind in terms of risk. How easy is it for a non-authorized user to gain access to a given device and what kind of device is it? If the device is a network router, that is a big problem. If the device is a water sensor and you need a lot of networking equipment to modify it, then the risk can be classified as a low risk. How an IoT device is modified is also a problem. If the IoT device is hacked in such a way that it becomes unusable, because the code ran out the the battery power that is a bigger deal than an IoT device which can be fixed by power cyling the device, which returns it to the factory configuration. Many times the code used to take over the device prevents any remote control access. This means a person needs to physically go power cycle the infected device. This can be a problem as some of these devices are inaccessible and are designed to be replaced not maintained. Power cycling stops the immediate problem but it does not prevent the same exploitation from happening again. New firmware or patches need to be applied to prevent the problem, requiring maintenance of the device again.

Knowing something about the people involved and the process used to gain access can help assessing risk. Unfortunately a number of devices have been released which have no security as they leave ports open which can be readily hacked, instead of say implementing SSL. Telnet, HTTP, DNS, Port 80, Port 22, and RDP all of these are ready targets for attack. There are also broadly speaking three classifications of users who work on accessing things which they don’t have access. Knowing about these three kinds of users and how they go about the process of gaining access.

Terminology changes over time, and to understand the risk, one needs to understand the language used by the people who access networks outside of the way the software was designed. For starters, the word Hacker is not used much. The word is overused and has come to mean people who have ill-intent, instead of the original version of people who were looking for flaws, not exploiting them. The people who access networks are known as Penetration testers, which is commonly abbreviated as pen testers. They not only figure out how things are broken but more importantly how to fix problems which they have found before someone takes advantage of the flaw.

Nation State Attacks

Many governments employ legions of developers to access or destroy. With nearly unlimited resources at their disposal, they are very successful when they want to target software for attack, as was seen when the Iranian centrifuges exploded. They also have the ability to completely mask where the unauthorized code came from, as that is standard operating procedure. If code is ascribed to be from the area near I-95 exit 41, you can pretty much guarantee that it came from somewhere else. It is virtually impossible to trace the origins of Nation State attacks as they ensure nothing is ever what it seems. If there are clues as to the origins of the software, they are added as intentional misdirection to throw people off the trail or to affix blame someone else. They have dedicated hardware designed to break access codes in a matter of seconds. There is no stopping this type of attack. They have little interest in most things so the risk is fortunately low.

White Hats

These penetration testers generally make their living hacking by permission. Firms hire them to see how vulnerable their networks or IoT Devices are. It is likely that white hats would advise these firms on incorporating effective security solutions (such as Mobile security in Intune) in order to prevent office networks from getting hacked. The white hats contact the vendors and provide warnings about things like the ability to access all of the memory on a single VM server if one knows how to overload a specific buffer, which is getting to be more common as servers are virtualized both on-site and in the cloud, which is after all someone else’s server. These people are trying to help make the code better. Of course, all pen testers are not all white hats, there are grey and black hats too. These people know enough to cause or prevent real damage.

scriptKiddieScript Kiddies

Like everything else there are varying levels of skills involved in pen testing. The lowest skill level is known as a Script Kiddies. People who penetrate networks are not the evil geniuses portrayed in Hollywood, they just know where to download Kali Linux which comes with a tool called Metaspolit. This tool contains a database of libraries which can be run against networks. Knowing how to run a program is not a great skill but then again given how poorly so much of the software is written, you don’t have to be an evil genius to say spy on a TrendNet webcam, as they did not fix their software until the FTC made them do it. The IoT hack that broke the internet? That capability was readily available as a canned exploit. Fortunately these common types of unauthorized access are the easiest to defeat. IoT systems need to be designed to avoid simple and well known intrusions, which is something that I will be talking about in my presentation for 24 Hours of Pass. Hopefully you will get a chance to attend live or watch the upcoming recording.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Writing R code in RTVS and Data Analytics with SQL Server

sqlbitslogoIn preparation for my session at SQLBits on Data Analytics with SQL Server, I reviewed all of my instructions for configuring computer to run R Client for SQL Server.  The instructions have changed with the release of R tools 1.0 for Visual Studio 2015 [RTVS].  Unfortunately, there are no R Tools for Visual Studio 2017. In the documentation for RTVS, they state that a version for VS 2107 will be released “soon”.  This new version makes it easier than ever to set up R for SQL Server as it contains all of the links needed for R Client, and invalidates most of the documentation for RTVS for changing the version.

Configuring a R Environment to use R on SQL Server

In addition to having an SQL Server 2016 instance with R Server installed, the following components are needed on a client machine

The Comprehensive R Archive Network

RStudio (optional)

Visual Studio 2015 R Tools

This list is a change from the previous list I have provided as RTVS contains an installation of R Client, there is no need to download that as well. You do not need to download Microsoft R Open if you are using R Server either.  Once RTVS is installed, there is a menu option on the R Tools window. Selecting Install R Client from the menu will handle the information. Unfortunately, there is no change to the menu option once R Client is installed, it always looks like you should install it.  To find out if R Client has been installed, look in the Workspaces window.

Selecting the Right Version of R within Visual Studio

RTVSWorkspacesPrior to RTVS 1.0, the version of R running was selected in the R Tools->Options. This has moved to the Workspaces window, which if you have the default version of RTVS, is the second tab in the bottom right corner of the screen.  This window will show the version of R that are installed.  In order to use R Client functions, you will need to select Microsoft R Client, as shown in the Workspaces tab.  The version selected will have a green check next to it as shown in the picture. To change the selection, click on the blue arrow near the gear, where you will be prompted with a message asking you if you are sure that you want to switch.

Changes to R Client make RevoScaleR Libraries Not Work

The latest version of the R client tools changes more than where to find the version of R running. The new client tools remove the need to install the RevoScaleR library.  With R Client 3.3.2, the library is no longer compatible and you will get an error if you try to install it.  The libraries are no longer needed as the functionality is included in R Client.  This means no additional libraries are need to for the rx commands like rxSetComputeContext(“local”). The functionality is included in R Client. If when trying to use R Client the error You are running version 9.0.1 of Microsoft R client on your computer, which is incompatible with the Microsoft R server version 8.0.3 appears, then you need to update SQL Server to the latest version, which is SP1 CU2, which you can get here.  If you haven’t installed SP1 for SQL Server, you will need to do that first.

Due to the changes in the R Client, a lot of documentation is no longer accurate, which is why if you are looking for information on R Client, make sure to check the date of the information to ensure what you are looking at is pretty current as things change a lot, which provides continual information for my blog.  I am looking forward to meeting more people who read it here at SQLBits 2017.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Security Updates to Power BI

Office 365 Admin Screen for granting Power BI Admin rights

Office 365 Admin Screen for granting Power BI Admin rights

In the past month, Microsoft has made a number of security changes to Power BI. The first one, is not really a feature update, but a PowerShell replacement. No longer do you need to use PowerShell to become a Power BI Admin. Any Office 365 Admin can grant Power BI Admin permissions via this screen in the Admin Center. The Power BI Admin role was first created in October, but the screen was not complete, which was just fixed in February.

Power BI Security Changed from Tenant Only

People who have been granted Power BI administrator rights will also notice a modification to the Admin screen. The March 2017 update to Power BI provides a major change to the security model in Power BI. Previously all the security settings were set at the Tenant Level, meaning that all the privileges were granted to all users. If I wanted to allow one group within the organization to be able to publish reports to the web, but I did not want to allow everyone to publish reports to the web, there was no way that this could be accomplished. All that has changed. It is now possible to include or exclude groups of users from having rights in Power BI. Users can be classified into security groups in Azure Active Directory, either through the Office 365 Admin Center or via the Azure AD Admin Center. Once created the security groups can be used in Power BI. Security Groups are not the same thing as the groups created in Power BI when a new work group is created.

Using Security Groups in Power BI Admin

PowerBINonTenantAdmin

Power BI Admin Portal

The new Power BI Screen looks different. It now lists which rights can be specified to different groups of users. Share content to external users, Export Data, Export reports as PowerPoint presentations, Printing dashboards and reports, Content pack publishing, and Use Analyze in Excel with on-premises datasets now have the ability to be assigned to security groups so that the rights do not have to be the same throughout the entire tenant.

Unfortunately, some of the permissions are still tenant based. For example, the setting Publish to web, which is one permission I would definitely like to turn on only for some users, is still only available as a tenant level option.  These security changes are a welcome improvement to the product as they provide more options for administrators to grant rights to Power BI.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

SQL Server R Services and the 20 User IDs

What is the Reason why installing R creates 20 User IDs?

If you have installed SQL, you may have noticed that it creates twenty user ids as part of the installation process. To many people the automatic creation of a number for SQL Server User IDs is alarming, and they want to know what these IDs are for and who will be using them. The answer for who could be using those IDs, is just about anyone running R and they are going to be using the resources of the server to do it. If you are a DBA and want to figure out how to stop this, keep reading as I promise to tell you, after I provide some context about SQL Server and R internals.

R Server and Launchpad

SQLServerManagementConsoleWhen R Server is installed as part of SQL Server, one way you can check to see if it is installed is to look to see if the Launchpad service is running. When R code is running it does not run within SQL Server OS. It is by definition an external process and the Launchpad exe serves as a conduit between SQL Server and the space where R is running. If you want to know more about R and SQL Server Internals, this article I wrote for SQL Mag will provide a lot more details. Microsoft designed the Launchpad service so that other languages might someday also run as R does on SQL Server. It also supports a feature of R Server which I wrote about, context switching. Context Switching provides the ability for users to utilize Server memory instead of the memory on their computers for running R, and access is granted through the use of one of the twenty ids created when R is installed.

Launchpad Settings – Where the External Users are Referenced

launchpadThere are many reasons why a DBA might want to not allow clients to access server memory as that will tax the server. Turning it off is relatively simple. Go to the SQL Server Management Console and select SQL Server Launchpad for the instance of SQL Server running R Server.

In the picture of the screen, the instance of SQL Server I have running R Services is in SS2016. Right click on the server and select Properties, then click on the Advanced tab. When looking at the number of external users allowed by default, the number might look familiar. The reason there are twenty User IDs created for R Server is because Launchpad allocates by default external twenty users to connect from SQL Server to run R. If you don’t want to allow external users to run on a server, you will need to prevent the users from connecting by not enabling them to run R. To run R, users need to have db_rrerole permissions. If they do not have that, they cannot run R. On the production server, it is probably best that this permission not be granted to non-system users.

Since the External Users created are used by SQL Server when running R, it is not possible to set the number of external users to 0 as the Launchpad Service will not run, and no R Code can be executed anywhere. If the number of external users is modified, Configuration Manager provides a prompt window as a restart is required. If the number of External Users is set to 0, the Launchpad Service will not start. When the Launchpad Service tries to start, it will generate Error 1053: The Service Did Not Start in a Timely Fashion. The number of users has to be at least 1 for the service to be able to communicate with the external R components. If you add or reduce the number of External Users, the IDs will be either created or deleted to match the number listed.

Let me know if you found this information regarding SQL Server R Service information by commenting or messaging me on twitter. If you are interested in finding out more regarding the internals of SQL Server and R, you might be interested in reading this Article about the topic. I would also like to thank Bob Ward b | t of the Microsoft for helping me better understand the SQL Server R internals, and for patiently answering my questions on the topic.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Calculated Tables and Role-Playing Dimensions

http://michaeljswart.com/2016/06/t-sql-tuesday-079-its-2016/comment-page-1/#comment-186750Working with role playing dimensions, which are found when you have say multiple dates in a table and you want to relate them back to a single date table, have always been problematic in SQL Server Analysis Services Tabular. Tabular models only allow one active relationship to a single column at a time. The picture on the left shows how tabular models represent a role playing dimension, and the model on the right is the recommended method for how to model the relationships in Analysis Services Tabular as then users can filter the data on a number of different date tables.

TabularRolePlaying dimension Modeling

 

RolePlayingDimension

The big downside to this is one has to import the date table into the model multiple times, meaning the same data is imported again and again. At least that was the case until SQL Server 2016 was released. This weeks TSQL topic Fixing Old Problems with Shiny New Toys is really good reason to describe a better way of handling this problem.

 

Calculated Columns: The solution for Role Playing Dimensions

SQL Server 2016 provides a new method of solving the role playing dimension problem, using a calculated column. Instead of copying in the source from the date table, instead create a formula to get a copy. First switch to the data view, of the model. Then select Table->New Calculated Table. ThSSASScreenCalcTablee screen will change to the new table screen and the cursor will be pointed to the formula.

In my model I have one table called date. I am going to add a calculated table called order date. The DAX is couldn’t be simpler. Just select the table named ‘Date’ which is shown in the picture below. Rename the table to something more meaningful, like Order Date and that is it. The modeling required is the same, but now the model size does not increase to accommodate all of the date tables needed, as there is only one copy of the date table referenced multiple times. If you are using Power BI this same concept can be used for handling role playing dimensions as well.

SQL Server 2016 had a lot of great new features, and in addition to the flashy ones like R there are a lot of great enhancements to the Tabular model that are worth investigating as well.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Context Switching in R Server

R code tends to be very memory intensive as R processes primarily in memory. If you want R to perform well, you want as much memory as you can get your hands on to run your code, especially with larger datasets. This is a problem as many individual laptops have pitifully low memory capacity, and unless you have a computer with say as much memory as you can put in this one, if you are analyzing large datasets you may run out of memory. If a new computer is not in the budget, why not develop on the server? You may be thinking that there is no way the administrator of the box is going to provide you the means to be able to use the server memory. Well, if you have a SQL Server 2016 with R Server installed, chances are you can use the memory capacity of the server by connecting your R process to run on it from your computer, without the need to install anything on the server.

Microsoft’s R Server contains some specialized functions which are not part of the standard CRAN R installation. One of the ScaleR functions, RxInSqlServer will allow code to be processed on the server from the client. To make this work, you must have R Server and R Client installed. If you are doing a test on a local machine, you will need both R Client and R Server installed on that computer.

How to use the Server Memory not Local memory for Running R

If you are developing R in your IDE of choice, either R Studio or Visual Studio with R Tools, here is the code you need to make that work, which includes code running on the server

#First you will need to install ('RevoScaleR') if not there already as context switching is included in that library
if (!require("RevoScaleR")) {
install.packages("RevoScaleR")
}
#Load the library
library(RevoScaleR)
#Create a connection to your SQL Server 2016 server instance. Note the double slashes which I needed to identify the instance name
#
sqlConnString <- "Driver=SQL Server;Server=DevSQLServer\\SS2016;Database=TestR;Uid=ReadDataID;Pwd=readd@t@!!!"
#Set the variable containing RxInSQLServer. Note All specific R Server libraries start with Rx
#
serverside #Set the Compute context to SQL server. After this the code will run using Server Memory, not local memory
#
rxSetComputeContext(serverside)
#Check to see what the compute context is. Not this is for informational purposes. You do not need to do this to make anything work.
#
rxGetComputeContext()
#If you want to change the compute context back to your computer run this command
#rxSetComputeContext("local")
#Until the context is switched back, I am now running on the server, not locally.
#Here I am going to take a look at a table in my TestR database called AirlineDemoSmall
#
sqlsampleTable <- "AirlineDemoSmall"
#
sqlPlaneDS<- RxSqlServerData(connectionString = sqlConnString, verbose = 1,table = sqlsampleTable )
#To take a look at the content of the data, I am going to take a look at 30 rows in table in my TestR database called AirlineDemoSmall
#
rxGetInfo(data = sqlPlaneDS, getVarInfo = TRUE, numRows = 10)
#To visually investigate the data, this command will plot a histogram displaying the frequencies of values in #one of the columns, CRSDepTime
#
rxHistogram(~CRSDepTime, data = sqlPlaneDS)

Here’s the output I get back in the R interactive Window.

Data Source: SQLSERVER
Number of variables: 3
Variable information:
Var 1: ArrDelay, Type: character
Var 2: CRSDepTime, Type: numeric
Var 3: DayOfWeek, Type: character
Data (10 rows starting with row 1):
ArrDelay CRSDepTime DayOfWeek
1       -14 16.283333   Monday
2       -1   6.166667   Monday
3       -2   7.000000   Monday
4         0 10.266666   Monday
5         0 13.483334   Monday
6       -10 16.833334   Monday
7       -10 19.949999   Monday
8       350 14.650001   Monday
9       292   9.416667   Monday
10       M   6.000000   Monday

RxHistogram

RxHistogram

Let me know if you found this post helpful, by posting a comment. Thanks also to Mario, who asked me about context switching which gave me the idea to answer his questions on this site. If you are interested in seeing more information about SQL Server and R, please subscribe as I tend to answer more of the questions I receive here.

 

 

 

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

Using Data Analysis to Pick Super Bowl Winners

I know that there is no way to compete with the major sports networks in the compilation of statistics about the two teams footballplaying in the super bowl. Instead I am going to focus on one feature, self-interest. Like many people, I have money in the stock market and I want my investments to make money next year. For this reason, I am an unqualified supporter of the Atlanta Falcons in the 2017 super bowl. The single data point I am using for my analysis is the fact that the falcons are an NFC Team, and when the NFC wins the stock market goes up. Go Falcons!

Correlation without Causation

Correlation does not imply causation is a common term in statistics and data analysis. It means that just because two variables move in relation to one another one does not mean that there is a cause and effect relationship between the two, even though it may seem like it. Just because when I washed my car it rained does not mean that I can control the precipitation patterns in the desert based on my propensity to visit the car wash. You may be thinking that having an NFC team win the super bowl and the stock market is an example of correlation without causation. After all the NFL does not control the world wide financial markets. If you look at the data though, 80% of the time the markets go up when the NFC wins. That is 50 years of data that supports that the winner does impact the market. Why might that be? Perhaps it follows Quantum Mechanics.

Observer Effect of Quantum Mechanics

When studying physics, specifically quantum mechanics researchers noticed that the observation changed the results. This is QuantumMechanicssomething commonly looked at when creating forecasts. Are the forecasts correct because the models are correct or because people believe them enough to make it happen. The superbowl winner impact on the stockmarket is well known. Perhaps it is for this reason that it becomes a self-fulfilling prophesy. This is the entire belief of many self-help ideas. If you believe it will happen, work to make it happen, it will happen. For whatever the reason, one cannot ignore 50 years of data.

Perhaps Patriots fans may think that I am pulling a lot of esoteric facts out of the air because I want the Patriots to lose. In all seriousness though, it is all about the data, and the observable effects of data knowledge. If you are watching the game and your team did not make the playoffs, and you are wondering who to root for because you do not care about the winner, perhaps this post helped you to decide.

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur

 

Appending Data – It is OK to be different

 

One of the more powerful features of Power BI Desktop is the query feature, which was called Power Query back when Power BI was part of Excel. Using the Query feature, if the data which you want to use is bad, has unneeded columns or contains data formatted differently than desired all of that can be readily fixed. The best thing about the query feature is that it uses the M language and records each step. Mess up a step? No problem just delete it and keep on going.

Appending in Power BI

AppendQueriesRecently I worked on a Power BI project where I needed to merge data provided in spreadsheets. The spreadsheets came from different vendors and while they contained mostly the same data, the columns were not in the same order. I wanted all of the data to reside in one table. In Query, that means that I wanted to Append the data. The files which I were merging were very wide, and I missed the fact until after I was done that some of the columns were in different order. Power BI is smart enough to figure out the order on its own. I didn’t need to change the order of the columns at all, as long as they have the same column names. Here’s an example using three different files.

 

File 1

File 1

 

Notice each of these files is a little different

 

 

 

 

 

 

 

File 2

File 2

File 3

File 3

 

 

 

 

 

 

 

 

 

 

I want to Append these files together so that all of the columns containing the like information will be in the same column. To do this the columns do not need to be reordered. As long as the column names are the same the contents will merge. I am going to need to modify File 3 to have the same file names, so I will rename Date to Expected Duration in Minutes and Location to Plant.  Since I know that File 3 came from Slingback Central, I am going to want to add that column to File 3 as well.  Othewise I will get a null value in the Maintenance ProvidAdd Custom Columner Column.  I do not need to place the column in any specific location as long as the name is the same. Renaming the columns is pretty easy.  All one needs to do is right click the column and select rename and type in the correct column name.  To add a new column, in Query select the tab Add Column and click on the Custom Column option. As you can see in the window pictured below, the text name Slingback Central has double quotes around it. If you forget to do that, you will get a syntax error.

Putting it All Together

Now that all of my queries have the same file names, I am ready to append them together.  To do that I select one of the queries and from the Home tab click on the icon on the far right side to Append Queries.  Since I want to paste three files together, I select the option for three or more files, and select all of them so that they appear on the right in the Tables to Append section of the screen.

After appending the data together, it merges all of the like columns together regardless of the order of the original files as shown below.

append

Not having to reorder columns is a great feature as it saved me a lot of time and I hope this post can do the same for someone else.

 

Yours Always

Ginger Grant

Data aficionado et SQL Raconteur