Discover the biggest data issues in online real estate

6 April 2021 | 3 min read

Throughout our years of experience working with data from multiple real estate portals and websites, we have identified the four biggest data struggles:

● Limited user tracking
● Indirect user preferences
● Poor listing data quality
● Cold-start problem

Let’s look at each of these more closely.

“40% of business objectives fail due to inaccurate data”

Limited user tracking

Most real estate portals and websites use, or have used, user tracking that is either too limited or too general. Of course, most companies use Google Analytics or even tools like Hotjar to track users for marketing or UX purposes, and everybody claims to be capturing all user data. But usually, the in-depth raw data necessary for personalization isn’t available. Portals heavily underutilize the user tracking techniques that form the basis of personalization.

So, what is effective user tracking? Ideally, it means tracking every user interaction necessary for your business case in real time. In most cases, this translates into contact requests, search queries, clicks, viewed pages, and scrolling actions, whether the user is anonymous or logged in. It also means collecting data on which elements users interacted with (for example, photos) and for how long.
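As a rough illustration, the interactions listed above could be captured in a single event record. This is a minimal sketch, not a description of any particular tracking tool: the class name `TrackingEvent`, its fields, and the event type names are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical event types, mirroring the interactions named in the text.
EVENT_TYPES = {"contact_request", "search_query", "click", "page_view", "scroll"}


@dataclass
class TrackingEvent:
    visitor_id: str                # anonymous cookie ID or account ID (assumed)
    event_type: str                # must be one of EVENT_TYPES
    logged_in: bool = False        # anonymous visitors are tracked too
    element: Optional[str] = None  # e.g. which photo was interacted with
    duration_seconds: float = 0.0  # how long the interaction lasted
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self) -> None:
        # Reject event types that were never defined for the business case.
        if self.event_type not in EVENT_TYPES:
            raise ValueError(f"unknown event type: {self.event_type}")


event = TrackingEvent("anon-123", "click", element="photo_3", duration_seconds=4.2)
```

Storing raw events in one uniform shape like this keeps them usable for profiling later, whichever tool collects them.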

By utilizing this data, you can start profiling your users and eventually (hyper-)personalize your portal. You can still do this in a GDPR-compliant way. Read how we do it in our previous blog post.

To start your personalization journey, depending on the size of your dataset, you’ll need approximately 3-6 months of listings and user data. There are different tools you can use to do this:
● Google Analytics
● Google Analytics 360
● …

We won’t be covering all the differences between these tools here, but we have an extended blog post about Google Analytics vs. Google Analytics 360 vs. Snowplow.

User data, and its practical utilization, is one of the primary reasons behind the success of Netflix, Amazon, and the other tech giants. Data leads to powerful outcomes for both consumers and companies.

If you haven’t already started harnessing the power in your user data, then now is the time to get started. Your user data not only impacts the user experience but is also of great value to your advertisers.

You can give real estate advertisers lead qualification and insights into the profiles of the leads reaching out, make sure their listings are shown to all relevant users, and provide them with market insights to improve their business.

Indirect user preferences

Going hand in hand with incomplete user tracking is the fact that website visitors seldom state their preferences directly. Users don’t want to fill in a complete form to express their preferences. In other words, their choices need to be deduced from their past and current behavior on the website.

One could argue that when performing a search on a portal, users fill in exactly what they want. This is only partially true: we noticed that some search fields are barely used. In the best case, the user fills in the location, price, and listing type. On average, location is the most-used search field, filled in by 81% of users. 42% of users search on maximum price and only 7% on minimum price.
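Usage figures like these can be derived from raw search logs. The sketch below uses made-up sample data and a hypothetical log format (one dictionary per search, containing only the fields that were filled in). It counts usage per search rather than per user, which is a simplification.

```python
from collections import Counter

# Hypothetical search log: each entry holds only the fields the user filled in.
searches = [
    {"location": "Ghent", "max_price": 350000},
    {"location": "Antwerp"},
    {"location": "Ghent", "listing_type": "house", "min_price": 200000},
    {"max_price": 500000},
]


def field_usage(search_log):
    """Return, per search field, the share of searches that used it."""
    counts = Counter(name for search in search_log for name in search)
    return {name: count / len(search_log) for name, count in counts.items()}


usage = field_usage(searches)
# In this sample data, location is used in 3 of the 4 searches.
```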

Furthermore, users usually set their ranges broader than needed, because they don’t yet know what they want or fear missing out on the house of their dreams.

The only way to gain more insight into users’ needs and desires is to deduce their preferences from their behavior on your website.
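As one simple illustration of deducing a preference from behavior, a visitor’s price interest can be estimated from the listings they viewed, weighted by how long they looked at each one. This is only a sketch under assumed inputs: the `(price, seconds_viewed)` pairs and the weighting scheme are choices made for the example, not a description of a production profiling model.

```python
def inferred_price(views):
    """Dwell-time-weighted average price of the listings a visitor viewed.

    `views` is a list of (price, seconds_viewed) pairs (assumed format).
    Returns None when there is no viewing time to learn from.
    """
    total_time = sum(seconds for _, seconds in views)
    if total_time == 0:
        return None
    return sum(price * seconds for price, seconds in views) / total_time


# The visitor glanced at an expensive listing but lingered on two cheaper ones.
views = [(300_000, 60), (320_000, 120), (600_000, 5)]
estimate = inferred_price(views)
```

Because the 600,000 listing was viewed for only a few seconds, the estimate stays close to the two listings the visitor actually spent time on.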

Poor listing data quality

Before you start implementing data-driven or AI solutions, you should conduct an Exploratory Data Analysis (EDA) on both your listing data and your user data. We’ll cover EDA in more depth later on. By doing this, you can determine the data quality and see what is present in the data and what is missing. This process will uncover where your data is lacking and which steps are necessary to utilize your data fully.

Across all the EDAs we have done over the years, the biggest challenge we saw was that most portals have many blank fields in their listing data. Structured data fields such as surface area and number of bedrooms are often missing. More ‘advanced’ structured fields such as elevator, swimming pool, and garage were available for only a limited share of listings.

Bear in mind that a missing value can also carry meaning: the feature may simply not be present, or we may have no information about it. But if a field is not filled in for, e.g., 87% of listings, the variable’s usability will be quite poor. This is usually due to real estate agents and sellers who forget or ignore most optional fields.
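A first EDA step along these lines is to measure how often each field is left empty. The following is a minimal sketch with made-up listings. The field names and the rule that `None`, an empty string, or an absent key counts as missing are assumptions for the example.

```python
def missing_rate(listings, fields):
    """Return, per field, the share of listings where the field is missing."""
    if not listings:
        return {}
    rates = {}
    for name in fields:
        # A field counts as missing when it is absent, None, or an empty string.
        missing = sum(1 for listing in listings if listing.get(name) in (None, ""))
        rates[name] = missing / len(listings)
    return rates


# Hypothetical listings with the kind of gaps described above.
listings = [
    {"surface": 120, "bedrooms": 3, "elevator": None},
    {"surface": None, "bedrooms": 2},
    {"surface": 85, "bedrooms": "", "elevator": True},
    {"surface": 60, "bedrooms": 1},
]

rates = missing_rate(listings, ["surface", "bedrooms", "elevator"])
```

Fields with a very high missing rate are the ones to either enrich from other sources or leave out of search filters.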

This results in a poor user experience because when visitors use the search engine, listings with missing values won’t show. We solved this problem for multiple real estate websites by looking at the listing descriptions and images. We found that these were packed full of valuable information that wasn’t available in the searchable data. So we extracted it with the help of our AI engines.
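To make the idea concrete, here is a deliberately simple stand-in that fills empty fields from the description text using regular expressions and keyword matching. It is not the AI approach mentioned above, only a sketch of the principle. The field names, patterns, and keywords are assumptions, and naive keyword matching would wrongly flag phrases such as “no garage”.

```python
import re

# Hypothetical patterns for numeric fields that often hide in descriptions.
PATTERNS = {
    "bedrooms": re.compile(r"(\d+)[\s-]*bed(?:room)?s?\b", re.IGNORECASE),
    "surface_m2": re.compile(r"(\d+)\s*(?:m2|m²|sqm)", re.IGNORECASE),
}
# Hypothetical keywords for yes/no features.
KEYWORDS = {"elevator": "elevator", "swimming_pool": "swimming pool", "garage": "garage"}


def fill_missing(listing):
    """Return a copy of the listing with empty fields filled from its description."""
    text = listing.get("description", "")
    enriched = dict(listing)
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if enriched.get(name) is None and match:
            enriched[name] = int(match.group(1))
    for name, keyword in KEYWORDS.items():
        # Naive check: does not handle negations such as "no garage".
        if enriched.get(name) is None and keyword in text.lower():
            enriched[name] = True
    return enriched


listing = {
    "description": "Bright 3-bedroom apartment of 95 m2 with garage and elevator.",
    "bedrooms": None,
}
enriched = fill_missing(listing)
```

Values that were already filled in are left untouched, so agent-provided data always takes precedence over extracted data.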

With one of our most recent clients, we improved the data quality of their searchable features by over 30%. More on this in the next section, ‘3: extra use cases on data & personalization for real estate’.

Cold-start problem

The term derives from cars. When it’s really cold, the engine has problems starting up, but it will run smoothly once it reaches its optimal operating temperature.

In our context, it refers to the situation where the amount of available data is limited, so recommendations are poor or fail to cover the full range of possible recommendation combinations.

In terms of recommendations, this means that a recommendation engine meets a new visitor for the first time. Because there is no history for this user, the system doesn’t know their personal preferences. Getting to know your visitors is crucial in creating a great user experience for them.

There are numerous ways and algorithms to counter the cold-start problem, but one must be aware that this issue is present.
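One common way to soften the cold start, shown here purely as an illustration, is to fall back to overall popularity when a visitor has no history. In this sketch the function name, the data structures, and the placeholder used in place of a real personalized model are all assumptions.

```python
from collections import Counter


def recommend(visitor_id, history, all_views, k=3):
    """Return up to k listing IDs: popularity-based for new visitors."""
    seen = history.get(visitor_id, [])
    # Rank listings by how often they were viewed across all visitors.
    popular = [listing for listing, _ in Counter(all_views).most_common()]
    if not seen:
        # Cold start: no history yet, so show what most visitors look at.
        return popular[:k]
    # Placeholder for a real personalized model: popular listings not yet seen.
    return [listing for listing in popular if listing not in seen][:k]


all_views = ["a", "b", "a", "c", "a", "b", "d"]
history = {"u1": ["a"]}

new_visitor_picks = recommend("new-visitor", history, all_views, k=2)
returning_picks = recommend("u1", history, all_views, k=2)
```

As soon as the visitor generates some history, the fallback branch is no longer taken and a personalized model can take over.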

Solving data issues

Now that we have gone over the most dominant data issues for real estate, it is important to know that all these problems have fairly straightforward solutions. However, even though the answers are straightforward, it is still crucial to take the time to understand the reasoning behind them. That is why we will provide an in-depth analysis of how to solve these data issues in our next article.

Don’t want to wait until next week? You can request a free copy of our latest e-book here, where you will find everything about the potential of data & personalization for real estate websites.

Check out our computer vision solutions
