class: center, middle, inverse, title-slide # R in Power BI ## SQL Saturday #759 ### Taras Kaduk ### 2018/05/05 --- # Agenda - What is R - R inside Power BI - Why do you even need it? # Slides and stuff ## [https://github.com/taraskaduk/r-in-power-bi](https://github.com/taraskaduk/r-in-power-bi) --- # ~~What is love?~~ What is R? ## More importantly, what can it do? - data transform - stat analysis - machine learning / predictive modeling - web apps - reports, presentations - websites & books --- # What can R do? ## Data transformation ```r head(iris) ``` ``` ## Sepal.Length Sepal.Width Petal.Length Petal.Width Species ## 1 5.1 3.5 1.4 0.2 setosa ## 2 4.9 3.0 1.4 0.2 setosa ## 3 4.7 3.2 1.3 0.2 setosa ## 4 4.6 3.1 1.5 0.2 setosa ## 5 5.0 3.6 1.4 0.2 setosa ## 6 5.4 3.9 1.7 0.4 setosa ``` --- # What can R do? ## Data transformation ```r iris_gather <- iris %>% gather("key", "size", -Species) head(iris_gather) ``` ``` ## Species key size ## 1 setosa Sepal.Length 5.1 ## 2 setosa Sepal.Length 4.9 ## 3 setosa Sepal.Length 4.7 ## 4 setosa Sepal.Length 4.6 ## 5 setosa Sepal.Length 5.0 ## 6 setosa Sepal.Length 5.4 ``` --- # What can R do? ## Data transformation ```r iris_gather2 <- iris_gather %>% separate(col = key, into = c('part', 'dimension')) head(iris_gather2) ``` ``` ## Species part dimension size ## 1 setosa Sepal Length 5.1 ## 2 setosa Sepal Length 4.9 ## 3 setosa Sepal Length 4.7 ## 4 setosa Sepal Length 4.6 ## 5 setosa Sepal Length 5.0 ## 6 setosa Sepal Length 5.4 ``` --- # What can R do? ## Data transformation ```r iris_summary <- iris_gather2 %>% group_by(Species) %>% summarise(size_avg = mean(size), size_sd = sd(size), size_med = median(size)) head(iris_summary) ``` ``` ## # A tibble: 3 x 4 ## Species size_avg size_sd size_med ## <fct> <dbl> <dbl> <dbl> ## 1 setosa 2.54 1.85 2.10 ## 2 versicolor 3.57 1.76 3.30 ## 3 virginica 4.28 1.92 4.15 ``` --- ## **What can R do?** Data visualization ![](https://d33wubrfki0l68.cloudfront.net/c7361de6666276165f0e6a1cccf88a9aff64ab90/19a4b/post/2017-09-mpaa/mpaa_files/figure-html/good_will_hunting-1.png) [https://taraskaduk.com/2018/01/10/mpaa-damon/](https://taraskaduk.com/2018/01/10/mpaa-damon/) --- # What can R do? ## Data visualization ![](https://d33wubrfki0l68.cloudfront.net/cd18743b0d0cb9270a884f2df772e64595ac99bb/2dfe8/post/2017-11-pixel-maps/pixel-maps_files/figure-html/unnamed-chunk-5-1.png) [https://taraskaduk.com/2017/11/26/pixel-maps/](https://taraskaduk.com/2017/11/26/pixel-maps/) --- ### **What can R do?** Web apps (A.k.a Shiny) [https://taraskaduk.shinyapps.io/rate/](https://taraskaduk.shinyapps.io/rate/) <iframe width="800" height = "500" src="https://taraskaduk.shinyapps.io/rate/" frameborder="0"></iframe> --- ## What can R do? ### So much more! - reports - presentations (including this one!) - websites - books --- ## **What can R do?** Bread and butter > *"The best thing about R is that it was written by statisticians. The worst thing about R is that it was written by statisticians."* > **~ Bob Cowgill** (probably) <sup>1</sup> .footnote[ [1 - Why has R, despite quirks, been so successful? - Revolution Analytics](http://blog.revolutionanalytics.com/2015/06/why-has-r-been-so-successful.html) ] --- ## What can R do? ### Statistical analysis and Machine Learning - perform statistical analysis - create, train and test models on your laptop (`caret`, `modelr` packages, etc) - reach to Keras and TensorFlow from within R - et cetera et cetera --- ## R in Power BI ### R in Microsoft What can you do with R inside of MS products? - Run R on SQL Server. - Run R code in Azure ML - Use Microsoft R Server for paralel processing --- [https://microsoft.github.io/sql-ml-tutorials/R/rentalprediction/step/3.html](https://microsoft.github.io/sql-ml-tutorials/R/rentalprediction/step/3.html) ```sql -- Stored procedure that trains and generates an R model using the rental_data and a decision tree algorithm DROP PROCEDURE IF EXISTS generate_rental_rx_model; go CREATE PROCEDURE generate_rental_rx_model (@trained_model varbinary(max) OUTPUT) AS BEGIN EXECUTE sp_execute_external_script @language = N'R' , @script = N' require("RevoScaleR"); rental_train_data$Holiday = factor(rental_train_data$Holiday); rental_train_data$Snow = factor(rental_train_data$Snow); rental_train_data$WeekDay = factor(rental_train_data$WeekDay); #Create a dtree model and train it using the training data set model_dtree <- rxDTree(RentalCount ~ Month + Day + WeekDay + Snow + Holiday, data = rental_train_data); #Before saving the model to the DB table, we need to serialize it trained_model <- as.raw(serialize(model_dtree, connection=NULL));' , @input_data_1 = N'select "RentalCount", "Year", "Month", "Day", "WeekDay", "Snow", "Holiday" from dbo.rental_data where Year < 2015' , @input_data_1_name = N'rental_train_data' , @params = N'@trained_model varbinary(max) OUTPUT' , @trained_model = @trained_model OUTPUT; END; GO ``` --- class: center, middle ![](https://docs.microsoft.com/en-us/azure/machine-learning/studio/media/r-quickstart/fig1.png) --- class: center, middle ![](https://sdtimes.com/wp-content/uploads/2016/01/0113.sdt-news.png) --- class: center, middle ![](https://image.slidesharecdn.com/1-lab209-160901013524/95/microsoft-r-server-for-distributed-computing-23-1024.jpg?cb=1472693770) --- ## R in Power BI That's cool, but what can I do with R in Power BI? - custom R visuals - scripts - visuals --- class: middle, center # Custom R visuals --- ![](http://gqbi.files.wordpress.com/2016/08/082516_0434_binsightpow3.png) http://community.powerbi.com/t5/R-Script-Showcase/bd-p/RVisuals --- # Custom R visuals https://docs.microsoft.com/en-us/power-bi/desktop-r-powered-custom-visuals ![](https://docs.microsoft.com/en-us/power-bi/media/desktop-r-powered-custom-visuals/powerbi-r-powered-custom-viz_3a.png) --- # Custom R visuals ![](https://docs.microsoft.com/en-us/power-bi/media/desktop-r-powered-custom-visuals/powerbi-r-powered-custom-viz_10.png) --- # Custom R Visuals [https://github.com/Microsoft/PowerBI-visuals-forcasting-exp](https://github.com/Microsoft/PowerBI-visuals-forcasting-exp) ```r ets_params = list(Automatic="Z",Multiplicative="M",Additive="A",None="N") if(frequency(timeSeries) == 1) seasonType = "None" deModel = paste(ets_params[[errorType]],ets_params[[trendType]],ets_params[[seasonType]],sep="") if(sum(deModel==c("ANM","ZMA","MMA","AZM","AMZ","AMM","AMA","AMN","AAM")))# Forbidden model combination deModel = "ZZZ" fit = ets(timeSeries, model=deModel,damped=damped) ``` --- # Custom R Visuals [https://github.com/Microsoft/PowerBI-visuals-forcasting-exp](https://github.com/Microsoft/PowerBI-visuals-forcasting-exp) ```r p1a <- p1a + labs (title = pbiInfo, caption = NULL) + theme_bw() p1a <- p1a + xlab(labTime) + ylab(labValue) p1a <- p1a + scale_x_continuous(breaks = seq(1,length(prediction$x) + length(prediction$mean)-1, length.out = numTicks), labels = x_with_forcast_formatted) p1a <- p1a + theme(axis.text.x = element_text(angle = getAngleXlabels(x_with_forcast_formatted), hjust=1, size = sizeTicks, colour = "gray60"), axis.text.y = element_text(vjust = 0.5, size = sizeTicks, colour = "gray60"), plot.title = element_text(hjust = 0.5, size = sizeWarn), axis.title=element_text(size = sizeLabel), axis.text=element_text(size = sizeTicks), panel.border = element_blank()) ``` --- class: middle, center # Custom R visuals in Power BI are well thought-out and overall legit. Use them --- class: middle, center # R Scripts --- # R scripts <br><br> ### - When Power Query can't do the job ### - When Power Query is slow and clunky --- # R scripts This ``` let Source = Web.Page( Web.Contents( "https://github.com/jennybc/gapminder/blob/master/inst/extdata/gapminder.tsv" )), Data = Source{0}[Data], col_types = Table.TransformColumnTypes(Data,{ {"", type text}, {"country", type text}, {"continent", type text}, {"year", Int64.Type}, {"lifeExp", type number}, {"pop", Int64.Type}, {"gdpPercap", type number} }), filter = Table.SelectRows(col_types, each ([year] = 2007)), mutate = Table.AddColumn(filter, "lifeExpMonths", each [lifeExp] * 12, type number), arrange = Table.Sort(mutate,{{"lifeExpMonths", Order.Descending}}) in arrange ``` --- # R scripts VS this ```r library(gapminder) library(dplyr) gapminder %>% filter(year == '2007') %>% mutate(lifeExpMonths = 12 * lifeExp) %>% arrange(desc(lifeExpMonths)) ``` --- # R scripts ## *What If* demo - what_if.R - what_if.pbix - https://app.powerbi.com/view?r=eyJrIjoiNmNjZGE3MzEtMmMxYy00NDRmLWEyZmEtOTAyM2FmNTQzMzc2IiwidCI6IjQ4NTlkNTVmLTkyMjUtNDNmYy1iMWJiLWUwM2I0MmE4NWM0ZCIsImMiOjJ9 --- class:middle, center <img src="https://raw.githubusercontent.com/taraskaduk/r-in-power-bi/master/presentation/3.PNG"> --- class:middle, center <img src="https://raw.githubusercontent.com/taraskaduk/r-in-power-bi/master/presentation/4.PNG"> --- class:middle, center <img src="https://raw.githubusercontent.com/taraskaduk/r-in-power-bi/master/presentation/5.PNG"> --- <iframe width="800" height="600" src="https://app.powerbi.com/view?r=eyJrIjoiNmNjZGE3MzEtMmMxYy00NDRmLWEyZmEtOTAyM2FmNTQzMzc2IiwidCI6IjQ4NTlkNTVmLTkyMjUtNDNmYy1iMWJiLWUwM2I0MmE4NWM0ZCIsImMiOjJ9" frameborder="0" allowFullScreen="true"></iframe> --- # R scripts ## Forecast demo - financial_forecast_demo.pbix - financial_forecast_demo.r - https://app.powerbi.com/view?r=eyJrIjoiZWU0ZjUzZGUtZjk5Yi00ZGU1LThhZTgtYWJiNGEzYzFlZjgyIiwidCI6IjQ4NTlkNTVmLTkyMjUtNDNmYy1iMWJiLWUwM2I0MmE4NWM0ZCIsImMiOjJ9 --- class:middle, center <img src="https://raw.githubusercontent.com/taraskaduk/r-in-power-bi/master/presentation/1.PNG"> --- class:middle, center <img src="https://raw.githubusercontent.com/taraskaduk/r-in-power-bi/master/presentation/2.PNG"> --- <iframe width="800" height="600" src="https://app.powerbi.com/view?r=eyJrIjoiZWU0ZjUzZGUtZjk5Yi00ZGU1LThhZTgtYWJiNGEzYzFlZjgyIiwidCI6IjQ4NTlkNTVmLTkyMjUtNDNmYy1iMWJiLWUwM2I0MmE4NWM0ZCIsImMiOjJ9" frameborder="0" allowFullScreen="true"></iframe> --- # When to use R in Power BI ## Do - performance - machine learning, statistical analysis - custom visualizations ## Don't - when no one else knows R on your team - when you can get by with Power BI - when you can get by with SQL --- # The end ## Follow me: - [https://taraskaduk.com](https://taraskaduk.com) - [https://www.linkedin.com/in/taraskaduk/](https://www.linkedin.com/in/taraskaduk/) - [https://github.com/taraskaduk/](https://github.com/taraskaduk/) - [https://twitter.com/taraskaduk](https://twitter.com/taraskaduk) ## This presentation is here ### https://github.com/taraskaduk/r-in-power-bi ### https://taraskaduk.github.io/rpowerbi.html