jump to navigation

Power BI – Incremental Refresh – Extra 9 June 2019

Posted by David Wilson in General.
add a comment

#PowerBI #Power BI #IncrementalRefresh #PowerQuery

In a previous post I’ve discussed my experiences with Incremental Refresh:https://dgwilson.wordpress.com/2019/05/19/power-bi-incremental-refresh-and-query-folding/

My implementation is not as clean like the Microsoft examples and Ive encountered some challenges to get it working as per the “label on the box”.

In an Oracle database I have two tables, Financial year (FY) 18 data in one table (55,000 rows) and FY19 onwards data in a second table (18 million rows and growing at 1.5 million rows per month). The FY18 data contains a subset of columns that are in the FY19 table. The tables are separate as the source of the old FY18 data is different and only contains data at a high summary level. The idea is that these two tables are combined in Power Query to present as one table.

The image below shows the Power Query steps for the import of the FY18 data. Note also the “Filtered Rows” step that applies the date range filter for the parameters RangeStart and RangeEnd required for Incremental Refresh. These are the same RangeStart and RangeEnd parameters that are used on the FY19 data table even though the data in this table will never be added to.

The image below shows the Power Query steps for the import of the FY19 data and importantly the Append Query step with brings in the FY18 data. Note also the “Filtered Rows” step that applies the date range filter for the parameters RangeStart and RangeEnd required for Incremental Refresh.

The RangeStart and RangeEnd parameters are showing in the image below. The RangeStart is set to 1/07/2017 so that it covers the FY18 date range. Note that the Financial Year for this data is 1 July to 30 June.

Finally after Power Query is closed and you’re back in Power BI Desktop, set the refresh parameters on both tables.

Save, publish the .pbix file to a Premium Workspace in the cloud and initiate the refresh!

When I was *NOT* applying the Incremental Refresh parameters to the FY18 data the cloud refresh would somehow cause the number of rows in the FY18 data to be duplicated approximately 60 times, so financial totals blew out by a factor of 60 for FY18 for some spectacular results…. not really what was intended. The image below, read right to left, shows how each month of FY18 data has approximately 4,500 rows of data and that after the cloud incremental refresh there us approximately 281,000 rows of data per month for FY18 – which is incremental Refresh going wild.

The trick here is to apply the Incremental Refresh changes to the FY18 data as well as the FY19 data. Even though the FY18 data will never change. You need to apply the change in Power Query and specify the Incremental Refresh parameters in Power BI Desktop.


– David


DAX RANKX – unexpected results 9 June 2019

Posted by David Wilson in General.
add a comment

#PowerBI #Power BI #DAX #RANKX

I’ve been trying list the top 10 clients by Year-To-Date Revenue in Power BI. The background to this is that I have almost one year of corporate account financial data in 19 million rows of data.
My attempts to get RANKX to show me anything other than a Rank of 1 or a Rank of 1, 2, 1, 2, 1, 2 clearly were not giving me my expected results especially when producing a simple table of of customers, their Year-To-Date Revenue sorted Highest to Lowest visually shows me exactly what want.

In researching this I found some excellent articles on the web by Phil Seamark. These articles provide some great insight to how the RANKX function works along with a handy trick on how to debug the results. Reference:

While essentially I had my DAX formula correct, there were some challenges with the table where I was trying to visualise the results. It’s all about the data.
Within my data Customers have a HEAD PARTY NO, and a HEAD PARTY NAME. In the RANKX function I had “ALL(EVO_DMT_BUS_CUST_PROFIT_VW[HEAD_PARTY_NO])” and in the table visualisation I was listing the clients by EVO_DMT_BUS_CUST_PROFIT_VW[HEAD_PARTY_NAME].

The “Problem” is explained here: The measure was removing the filter context from the HEAD_PARTY_NO column, however the visual was still applying a filter over the HEAD_PARTY_NAME column. The filter over the HEAD_PARTY_NAME meant the measure was being restricted to only 1 or 2 rows per line.

The “Solution” was to add the column used in the visual to the RANKX measure – doing this allowed RANKX to see (and rank) more rows.

__RankCustomersByRevenue =
— Loop the rows in this table
— run this expression for each loop

Special thank you to Phil Seamark for the direct focus and quick resolution.

– David

David Wilson

TrafficCamNZ – quick build on macOS 10.15 9 June 2019

Posted by David Wilson in General.
add a comment

With minimal work which was mostly commenting out code like Chromecast support I was able to complete a build on macOS 10.15 of TrafficCamNZ and run it. Here is what it looked like when it was running.
I was able to page the images left and right – I’ll now watch some WWDC videos to learn more.

Power BI Incremental Refresh and Query Folding 19 May 2019

Posted by David Wilson in General.
1 comment so far

I have a 16 Million row dataset that represents about 10 months of financial transactions. Based on its size so far it will be 40 Million rows with about two years of data. The 16 Million rows of data are stored in an Oracle database with the load into Power BI Desktop taking about 5 hours. Impressively the .pbix file is 208MB in size (that includes other tables etc). When the dataset is refreshed in the Power BI Service it takes about 30 minutes through the Power BI Gateway which is again very impressive.

This is the link to the Microsoft article with details on how to setup Incremental Refresh

It is good and does contain the information needed to get Incremental Refresh working. Critically there are some points not to miss:

1. Query Folding
2. Parameter setup
3. Testing

Query Folding
Query Folding out of Power BI is about getting as much work performed Server Side by the SQL query as possible. Let me explain:
Let’s say the dataset on the server had 100 Million rows and that I only want 10,000 rows. If query folding isn’t working then 100 Million rows are downloaded into Power BI Desktop before it discards 99,990,000 of them. That is a waste of compute resource and time.

Critically for me I never had Query Folding out of the box because I never used it and I didn’t know that I’d never used it.
Query Folding does not work when you write your own SQL Queries. Devin Knight explains this very well in his below post linked here:

“When you establish a connection in Power BI to a database (let’s say SQL Server for this discussion) you’re provided an option, under Advanced Options, to write your own SQL Statement to define how the data will be imported. Once you choose this option Query Folding is no longer available on this query.”

Another reference on Query Folding is Adam Saxon from GuyInACube – this short 5 minute video shows what Query Folding is.

Back to my challenge – of course I had used the “advanced” option and entered my own SQL Select statement. Building on Devin’s article – what should I have done? – Not specified any SQL statement and done all of the work to remove columns etc in Power Query. Power Query also lets you check along the “Native Query” is till working… and in my experience you may just have to wait a bit… if you right click on an “APPLIED STEP” and View Native Query is greyed out, wait a few moments and try again.

Parameter setup
When setting up the RangeStart and RangeEnd parameters I just set them up _ALMOST_ exactly like Microsoft had them. Unfortunately I left the “Required” switch ticked. It is NOT ticked in the Microsoft article.

You can test on a dataset of millions of rows. Perform your initial testing by setting the RangeStart and RangeEnd parameters so that it will select a significantly smaller number of rows like 10,000. This way when you click Close and Apply out of Power Query it will only retrieve 10,000 rows rather than spending a lot of resource working through Millions – that’s the specialist job of your database server and Query Folding.

When I tested with 10,000 rows, and published the .pbix file to the Premium tenant, connected to the gateway and performed the refresh on the dataset… the refresh took approximately 30 minutes. Way better than the 5 hour load to the desktop client. And because it is an incremental refresh subsequent updates should be very fast.


– David

DAX YTD Measures 19 May 2019

Posted by David Wilson in General.
add a comment

I’ve recently built some YTD Measures for some financial reporting. These have been enhanced because while the built in DAX Functions work well (and I use them), they can impact your presentation later when todays real date moves across the financial year boundary. I’ll illustrate what I mean. Below is a DAX Measure that calculates the YTD revenue up to the end of the financial year:

_Revenue YTD = CALCULATE(SUM(CUSTOMER[_Revenue]), DATESYTD(Calendar[Date], “06/30”))

There’s two things wrong with this:
1. If today is 5 July – you’re only going to see transaction data from 1 July to 5 July – this isn’t so good if you want to see Year to Date figures for last year
2. If you want to see whole months of data, not part months then this DAX needs some enhancement to not include part of a month – i.e. todays date is 15 May and you only want the YTD measure to show you data to the end of April.

Before I show you the new _Revenue YTD measure we need a little bit of “secret sauce”. In my case I need to know the month for which we are reporting on. Here’s the DAX Measure that determines that.

Report_Month =
IF(MONTH(last_date) = MONTH(TODAY()),
EDATE(last_date, -1),

Now that I know the report month – I can bring out my YTD calculations.

_Revenue YTD =
VAR reportMonth = [Report_Month]
IF([Fiscal Year LY] <> [Fiscal Year LY Reporting],
CALCULATE(SUM(CUSTOMER[_Revenue]), DATESYTD(Calendar[Date], “06/30”), CUSTOMER[RPT_MONTH] <= reportMonth)

The DAX below shows the Revenue Last Month, which directly uses the Report_Month measure determined above.

_Revenue LM =
VAR fiscalYear = [Fiscal Year TY]
VAR reportMonth = [Report_Month]

From here determining the Revenue Last year and the Revenue for the equivalent period YTD last year is straight forward. I have just a little bit of work to do so that the hard coding of 2018 is removed.

_Revenue LY = CALCULATE(TOTALYTD(SUM(CUSTOMER[_Revenue]),’Calendar'[Date], “30 June”), ‘Calendar'[Fiscal Year]=2018, ALL(CUSTOMER[RPT_MONTH]))

I have the same set of DAX measures for Cost and Margin.

#Power BI

– David

Installing pfsense 2.4 on 64bit AMD architecture 28 April 2019

Posted by David Wilson in General.
add a comment

I have two HP Small Form Factor PC’s, specifically models 8100 and 8200.The model 8100 is running pfsense 2.3 and there is no obvious way to upgrade to version 2.4.

Version 2.4 of pfsense requires a 64bit AMD architecture to install and run. So the first task is to confirm that these machines support the 64bit AMD architecture.

The processor chip the 8200 is an Intel(R) Core(TM) i5-2400 CPU running at 3.10 GHz
Intel chip specifications are confirmed here: https://ark.intel.com/content/www/us/en/ark/products/52207/intel-core-i5-2400-processor-6m-cache-up-to-3-40-ghz.html
About half way down the page there’s an entry that says Intel 64 = YES.

The processor chip the 8100 is an Intel(R) Core(TM) i5-650 CPU running at 3.20 GHz
Intel chip specifications are confirmed here: https://ark.intel.com/content/www/us/en/ark/products/43546/intel-core-i5-650-processor-4m-cache-3-20-ghz.html
About half way down the page there’s an entry that says Intel 64 = YES.

Given the 8100 is already running as my main firewall/router – I decided to install pfsense on the new 8200 machine. So I downloaded the latest “memstick” version of pfsense and copied it to a 2GB USB key.
—> gzcat pfSense-CE-memstick-2.4.4-RELEASE-p1-amd64.img.gz | sudo dd of=/dev/rdisk3 bs=1m
I installed the USB key in one of the front USB ports… powered on the machine and went into the BIOS to ensure that the USB key would boot (if present). Once done I booted to the USB key – this worked and pfsense installer began to boot… … … until it came to “masks 0x00ff0000, 0x0000f00, 0x000000ff, 0xff000000” where upon there’s a brief pause and the machine automatically reboots.

I found this reference… but no solution. It was a great clue though.

The Solution! In the BIOS – DISABLE EFI Boot Sources. There is a catch however…
The USB key boot no longer works. I was lucky and had attempted this previously with an installation by CD/DVD. That CD/DVD was still in the CD/DVD drive… it booted and installation proceeded successfully.

Checking the BIOS after the install the EFI Boot status has changed… I’ve left it like this.

I subsequently backed up the config.xml file and restored it to the new firewall… after installing a quad network card (so all network operations are on the same card) and reconfiguring the interface assignments… I swapped out the 8100 and swapped in the 8200. Firewall change over complete.

Focusing on the 8100 – I them performed the USB key install – it successfully booted and the i install proceeded… however did not complete and claimed it was missing files… yea… who knows. So I did the install of CD/DVD… so I now have a backup machine ready to go.

Screen image below for reference.

– David

New iOS releases TrafficCamNZ and Missile Control NZ 14 January 2019

Posted by David Wilson in General.
1 comment so far

New releases of the iOS applications TrafficCamNZ and Missile Control NZ have been released for sale in the iOS AppStore. Check for updates on your iOS device.

Missile Control NZ
– this is an iOS compatibility release to allow the application to run on the latest Operating System versions.

– Fixes an error on iPad when using the Chooser to add a camera
– Fixes a crash that sometimes occurs when data is synced across devices through iCloud

– David

Power Query (Power BI) and SharePoint Lists 2 January 2019

Posted by David Wilson in General.
add a comment

via Power Query (Power BI) and SharePoint Lists

Power Query (Power BI) and SharePoint Lists 2 January 2019

Posted by David Wilson in General.
1 comment so far

Time off work and some quiet time to experiment with SharePoint Lists. I’ve made extensive use of them as a data store which is working well although somewhat hostile for users at times… So the idea with this experimentation is to demonstrate how to expand data in Power Query (for Power BI) particularly with attributes on User type records… and as I have discovered for some reason it has caused me problems at the starting point which may be related to [ApiVersion].

Let’s take a look. Here is my List (you can see I’ve populated it with three rows of data), followed by the list definition.

Next step is to import the SharePoint List into Power BI and take a look at it in Power Query.

You can see the Logo field is of type record. Clicking in the icon to the top right of the field gives you an option to expand the record into Description and URL fields. Doing so will replace the Logo field with Logo.Description and Logo.Url fields. You can retain the original column by duplicating it if you need.

One of my defined fields is “Who”. It’s of type Person or Group and can be populated with Active Directory users – this is really great as you get some data quality control.

You should be able to click in the cell where it says “Table” and have details of those rows explored in a split screen below. However doing so I get the following error… and it might be a bug?

DataSource.Error: SharePoint: Request failed: The remote server returned an error: (500) Internal Server Error. (Specified method is not supported.)

Error Message:
Argument should not be zero.
Parameter name: width

Google searches have not been helpful. Until I found this article:
Where it refers to ApiVersion… and as I’m only testing there’s no harm in seeing what happens…

Here’s my M code before:
Source = SharePoint.Tables(“https://wgtnpowerbi.sharepoint.com/sites/WellingtonPowerBi/“, [ApiVersion = 15]),
#”0cec4993-754c-4330-a963-00bba4fa654a” = Source{[Id=”0cec4993-754c-4330-a963-00bba4fa654a”]}[Items]

Here’s my M code after, where the only thing I have changed is the ApiVersion:
Source = SharePoint.Tables(“https://wgtnpowerbi.sharepoint.com/sites/WellingtonPowerBi/“, [ApiVersion = 14]),
#”0cec4993-754c-4330-a963-00bba4fa654a” = Source{[Id=”0cec4993-754c-4330-a963-00bba4fa654a”]}[Items]

After making this change I did have to delete all of the applied Steps, go back to Source and choose my list.

And now when I click on the first cell in the Who field I’m getting what I expect along with the ability to expand the Table into it’s attributes:

When you expand the table additional rows are created in the main table to handle the multiple records in the table within the field.

That’s just a little bit of experimentation.

I suspect the error reported above in ApiVersion = 15 is actually a bug. To confirm I am using Windows 10 and Power BI December 2018.
For the record I have reported the issue via the Report Issue button as captured in the image above.

– David

TrafficCamNZ – News 19 September 2018

Posted by David Wilson in General.
add a comment

An iOS 12 compatibility release has been submitted to the AppStore for approval. All going well it will be available for distribution in a day or so.

In addition, there is a datafile update for TrafficCamNZ – this file adds 39 cameras for New Zealand c/- NZTA (Auckland, Christchurch, Wellington).

And… In further testing I’ve been working on a couple of issues that can cause the program to crash under rare circumstances (for one of them you have to be using it on an iPad). I need to do some more testing with the fix and will hopefully submit another release for approval in a few days time.

UPDATE: Approved and Released – 20 Sep 2018 2030 NZDT

– David