Author

Stian Ulriksen

Business intelligence dashboards are becoming increasingly prevalent in businesses. Building an effective dashboard according to best practices takes you through a comprehensive BI process. In this post, we will cover four of the most important things to keep in mind when assembling your dashboard. Good dashboard design simplifies large amounts of data to answer important questions raised by the business. To answer these questions, the dashboard needs to tell a clear, well-defined story and express the meaning of the data in clear visualizations, while allowing the viewer to dig into the details if necessary.

Bad Example

A quick Google search turns up a plethora of terribly designed dashboards. Here is one example:

(Image: Terrible dashboard design)

There is simply too much going on in such a small space, all at once. It is cluttered and distracting.

How to Create Beautiful Dashboards?

So how can you avoid…

Logistic regression is a model that is appropriate when the dependent variable is binary, e.g. 0 or 1, True or False, Yes or No. It belongs to the regression analysis family and can therefore be used as a predictive analytics model.
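To make the idea of a binary dependent variable concrete, here is a minimal sketch in plain Python. It is not taken from the post itself: the study-hours data, the function names, and the choice of plain gradient descent are all assumptions made for illustration. In practice you would more likely reach for a library implementation.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Fit a one-feature logistic regression with batch gradient descent.

    Returns the weight and intercept of the fitted model.
    """
    weight, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors for the current parameters.
        errors = [sigmoid(weight * x + intercept) - y for x, y in zip(xs, ys)]
        # Gradient of the average log-loss with respect to each parameter.
        grad_weight = sum(e * x for e, x in zip(errors, xs)) / n
        grad_intercept = sum(errors) / n
        weight -= learning_rate * grad_weight
        intercept -= learning_rate * grad_intercept
    return weight, intercept


def predict(x, weight, intercept, threshold=0.5):
    """Turn the predicted probability into a 0/1 class label."""
    return 1 if sigmoid(weight * x + intercept) >= threshold else 0


# Hypothetical data: hours studied (feature) and pass/fail (binary outcome).
hours = [0.5, 1.0, 1.5, 2.0, 3.5, 4.0, 4.5, 5.0]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

weight, intercept = fit_logistic(hours, passed)
print(predict(1.0, weight, intercept), predict(4.5, weight, intercept))
```

The model outputs a probability, and the threshold turns that probability into one of the two classes. This is what makes it suitable for Yes/No style outcomes.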

Pandas is an awesome Python library for wrangling your data, but it can only get you so far. When you start moving into the Big Data space, PySpark is much more effective at accomplishing what you want. This post aims to help you migrate what you know about Pandas to PySpark. If you are new to Spark, check out this post about Databricks, and go spin up a cluster to play around.

Apache Spark and PySpark

Before we get going, let’s take a step back and talk about Apache Spark. Spark is a fast and general engine for large-scale data processing. Spark uses distributed computing to achieve higher speeds on large datasets. When you submit a request to Spark, the driver node distributes the workload to a number of worker nodes, which process parts of the request in parallel. Think of it as an improvement to original…

With the rise of cloud computing and big data, columnar databases have increased in popularity. One of the main reasons is their efficiency for analytical queries, and therefore for business intelligence tools. This post aims to identify the key differences between columnar and row-oriented databases and point you in the right direction for your future data warehouse.
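As a rough illustration of why a columnar layout suits analytical queries, the sketch below stores the same records row by row and column by column in plain Python. The table, its field names, and the helper functions are hypothetical, and real databases add compression, indexing, and on-disk formats that this toy example ignores.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"order_id": 1, "region": "north", "amount": 120.0},
    {"order_id": 2, "region": "south", "amount": 80.5},
    {"order_id": 3, "region": "north", "amount": 200.0},
    {"order_id": 4, "region": "east", "amount": 40.25},
]

# Column-oriented layout: each column is stored together.
columns = {key: [row[key] for row in rows] for key in rows[0]}


def total_amount_row_store(rows):
    """Aggregate by visiting every record, even though only one field is needed."""
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    """Aggregate by reading only the single column the query needs."""
    return sum(columns["amount"])


print(total_amount_row_store(rows), total_amount_column_store(columns))
```

Both functions return the same total, but the columnar version only reads the `amount` values. On wide tables with many columns, that difference is the main reason aggregations tend to be cheaper in a columnar store.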