Maximize Your Lead-to-SQL Conversion with Regression-Backed Lead Scoring

Traditionally, most marketers have taken a “finger in the wind” approach to behavior scoring. That is, they score and weight lead behaviors by the time a lead has to invest to engage with a particular marketing tactic. A webinar recording that takes 40 minutes to watch would typically warrant a higher behavior score than a whitepaper that takes 10 minutes to read.

This is directionally sound and certainly much better than no behavioral scoring at all, but it is fairly subjective. It is not grounded in which content or assets leads actually consume before they “hand-raise” or become customers.

AI, or more specifically machine learning, is more capable and accessible than ever, so it is not surprising to see it making its way into many of the latest martech solutions. Behavior scoring is an area in which machine learning can be incredibly useful if you understand how to harness its power.

Martech to the Rescue… Diving In Head First

There are several popular solutions specializing in machine learning or data science-backed behavior scoring. The clear benefits of these solutions include:

  • Data. Many of these vendors also double as data enrichment solutions, which is convenient, as more (and more accurate) data points generally make for a more statistically sound model. This enables them to enrich your 1st-party dataset with 3rd-party data to build a more complete, and ostensibly, more accurate lead profile and predictive model.
  • Specialization. Specializing in this category means they have gone through a similar exercise with multiple clients, and are more likely to know what types of behaviors to look for, and what levels of aggregation and profiling typically work well.
  • Extension. Some vendors offer true machine learning solutions, dynamically adjusting the scoring model as more behavioral data is collected.

Dipping Your Toes in the Water

As with many martech solutions, the cost of implementation can be significant, with most tools easily starting at five figures and very quickly climbing to the six-figure range. Moreover, these vendors are selling a solution and are (understandably) tight-lipped about their methodology. The result is essentially a black-box scoring output that provides very little context as to how the scores were derived. This doesn’t exactly help you position your budget request for an unproven martech investment.

Luckily, for those of you who are not quite ready to take the plunge, or are simply looking for more budget-friendly alternatives to data science-backed lead scoring, there are methods you can test and employ using just your 1st-party data.

Outputs and Inputs

As with any analytical exercise, the most important first step is to establish a clear objective. 

  • What is the definition of success?
  • What KPI are you looking to optimize for?

For most B2B marketing organizations, maximizing Lead-to-SQL conversion is usually a good place to start. Depending on your specific business process and needs, you may want to adjust your “north star” either up or down the funnel (e.g. SAL or closed-won opportunities).

The next priority is data preparation. The goal of this exercise is to leverage prospect engagement behavior as an indicator of lead interest and quality, so at a high level, we want to take as many behavioral data points as are available and aggregate them up to the lead level. This implicitly requires that you have access to your lead behavioral data. For a marketing automation platform like Marketo, this ideally means access to your activity logs. By the time you are done building your lead profiles, they should look something like the following table.

The example above is about the simplest version of a lead profile you can build with your activity data. In pretty much every lead scoring exercise, you will want to build on this simplistic view with additional meaningful slices of your engagement data. The lifetime value concepts of recency, frequency, and monetary value, overlaid with key context from your business, can provide a good framework for building a meaningful lead profile. Some examples of applying these lenses may include looking at the following data points over 30, 60, 90, or 180-day increments:

  • Form fills
  • Webinar registrations (form fills)
  • White paper form fills
  • Email links clicked
  • Product email links clicked
  • Web pages visited
  • Product web pages visited

It is not unusual to start with a lead profile consisting of 100 to 200 individual engagement metrics representing each lead. That said, by the time you run your regression analysis, you will typically find that many of these metrics will not have any meaningful correlation to your KPIs, allowing you to whittle down your list pretty quickly.
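To make the aggregation step concrete, here is a minimal sketch in Python of rolling an activity log up to the lead level over trailing 30, 60, 90, and 180-day windows. The log layout, the activity type names, and the `build_lead_profiles` helper are all assumptions for illustration. A real Marketo activity export will have different fields.

```python
from collections import defaultdict
from datetime import date

# Hypothetical activity log rows: (lead_id, activity_type, activity_date).
ACTIVITY_LOG = [
    ("lead-1", "form_fill", date(2024, 5, 20)),
    ("lead-1", "email_click", date(2024, 3, 1)),
    ("lead-1", "web_visit", date(2024, 1, 15)),
    ("lead-2", "web_visit", date(2024, 5, 30)),
    ("lead-2", "web_visit", date(2024, 5, 2)),
]

# Trailing windows (in days) taken from the increments mentioned above.
WINDOWS_DAYS = (30, 60, 90, 180)


def build_lead_profiles(log, as_of, windows=WINDOWS_DAYS):
    """Count each activity type per lead within each trailing window."""
    profiles = defaultdict(lambda: defaultdict(int))
    for lead_id, activity_type, activity_date in log:
        age_days = (as_of - activity_date).days
        if age_days < 0:
            continue  # ignore activity dated after the snapshot date
        for window in windows:
            if age_days <= window:
                profiles[lead_id][f"{activity_type}_{window}d"] += 1
    # Convert nested defaultdicts to plain dicts for easier export.
    return {lead: dict(metrics) for lead, metrics in profiles.items()}


profiles = build_lead_profiles(ACTIVITY_LOG, as_of=date(2024, 6, 1))
for lead_id, metrics in sorted(profiles.items()):
    print(lead_id, metrics)
```

Each lead ends up with one column per activity type and window, which is how a handful of activity types quickly grows into a wide profile. Metrics with no activity are simply absent here, so you would want to fill them with zeros before exporting the profile to a spreadsheet.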

100-Level Statistics in Excel

Once your dataset is fully prepared, it is time for some Excel magic. To run the Regression in Excel, you will need to enable the included Analysis ToolPak. Select the Regression feature and run it against your Lead Profile dataset. The “Y” input range will be your intended output. In the case of the example dataset above, it is the Conversion from MQL > SQL. The “X” input range will be all the behavioral profile metrics you built around your lead.

Running the Regression will result in a set of summary data in Excel. The output can be a bit daunting at first, but there are a few key metrics to focus on. Ideally, you are looking for an R Square value that is 0.95 or higher and a Significance F that is 0.05 or lower. That said, for a lead scoring exercise, you can probably relax a bit on the statistical rigor and call an R Square of 0.85 or above good enough. If either or both of these values are well outside of those ranges, you may want to remove some of the less correlated behavioral attributes (those with P-values above 0.05) and rerun the Regression Analysis.
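If you would like to see what the Regression tool is doing under the hood, the sketch below fits the same kind of ordinary least squares model in plain Python and reports the Coefficients and R Square. The sample rows, column meanings, and function names are hypothetical. It does not compute P-values or Significance F, which you would still read from Excel's output or a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the system is singular, for example
    when two behavioral metrics are perfectly collinear.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(x_rows, y):
    """Return (coefficients with the intercept first, R Square)."""
    design = [[1.0] + list(row) for row in x_rows]  # prepend intercept column
    k = len(design[0])
    # Normal equations: (X'X) * coefficients = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    coefs = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coefs, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coefs, 1 - ss_res / ss_tot


# Hypothetical lead profiles ("X" range): form fills and product page
# visits per lead. The "Y" range is 1 if the lead converted to SQL, else 0.
X = [[0, 1], [1, 0], [2, 3], [3, 2], [0, 0], [4, 5]]
Y = [0, 0, 1, 1, 0, 1]

coefficients, r_square = fit_ols(X, Y)
print("Intercept and coefficients:", coefficients)
print("R Square:", r_square)
```

Six rows are far too few for a real analysis. The point is only to show how the “Y” and “X” ranges map onto the model.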


If both your R Square and your Significance F look reasonably good, you can start focusing on the individual behavioral attributes that are most likely to be correlated with your business outcome of MQL to SQL conversions. For this, there are two numbers to focus on: the P-value and the Coefficients. P-values at or below 0.05 generally indicate that the behavioral metric is likely to be correlated with your business outcome. Meanwhile, the Coefficients tell you how much impact each additional unit of that metric likely has on your business outcome. The larger the magnitude of the number, the greater the impact, and a negative Coefficient suggests the behavior is associated with a lower likelihood of conversion.

One way to use these two numbers together is to simply take all the metrics with P-values of 0.05 or lower and use the Coefficients to weight your Lead Score for that behavior. This gives you a much more statistically sound method of scoring your leads than the “finger in the wind” approach.

Working through whether or not this exercise yields usable data, and determining whether or not that data is moving and changing over time, should give you a good idea of the value you may get from a predictive analytics platform.
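The weighting approach described above (keep the metrics with P-values of 0.05 or lower, then weight by Coefficient) could be sketched like this. The metric names and regression numbers are made up for illustration, and two choices are assumptions of this sketch rather than part of the method: scaling the strongest behavior to 20 points, and dropping negative Coefficients instead of turning them into score deductions.

```python
# Hypothetical regression output: metric -> (coefficient, p_value).
REGRESSION_OUTPUT = {
    "webinar_registrations_90d": (0.21, 0.003),
    "white_paper_form_fills_90d": (0.07, 0.040),
    "email_links_clicked_30d": (0.02, 0.410),
    "product_pages_visited_30d": (0.14, 0.012),
}

P_VALUE_CUTOFF = 0.05
MAX_POINTS = 20  # assumed point value for the strongest behavior


def coefficients_to_weights(output, cutoff=P_VALUE_CUTOFF, max_points=MAX_POINTS):
    """Keep significant, positive metrics and scale them to score points."""
    significant = {
        metric: coef
        for metric, (coef, p_value) in output.items()
        if p_value <= cutoff and coef > 0
    }
    if not significant:
        return {}
    largest = max(significant.values())
    # The largest coefficient maps to max_points; the rest scale in proportion.
    return {
        metric: round(max_points * coef / largest)
        for metric, coef in significant.items()
    }


weights = coefficients_to_weights(REGRESSION_OUTPUT)
for metric, points in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{metric}: {points} points")
```

In this made-up example, email link clicks fall out of the model entirely because their P-value is well above the cutoff, no matter how much effort a click “feels” like it should be worth.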

This introduction to the powerful world of machine learning is your first step toward increasing the effectiveness of your lead scoring program. You will find that Regression analysis helps you eliminate the “finger-in-the-wind” guesswork and reveals which behavioral metrics are statistically significant for your organization to measure against its KPIs.

Implementation and Extension

Once you have completed your analysis, implement the tweaks to your behavioral scoring model that the Regression suggested. For example, if you’ve found a statistically significant overlap between hand-raisers and a particular piece of thought leadership, you can adjust your behavioral scoring model so that piece of content (or all of the pieces like it) gets more weighting than, say, a webinar that showed very little correlation with actual buying signals.
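As a rough illustration of what the updated model does once it is in place, the sketch below applies a set of regression-derived point values to a single lead profile. The weights, metric names, and `behavior_score` helper are hypothetical. In practice this logic would live in your marketing automation platform's scoring rules rather than in a script.

```python
# Hypothetical point values per behavior, derived from a regression run.
SCORE_WEIGHTS = {
    "webinar_registrations_90d": 20,
    "product_pages_visited_30d": 13,
    "white_paper_form_fills_90d": 7,
}


def behavior_score(profile, weights=SCORE_WEIGHTS):
    """Sum points for each scored behavior; unscored behaviors add nothing."""
    return sum(points * profile.get(metric, 0) for metric, points in weights.items())


# Hypothetical lead: one webinar registration, three product page visits,
# and five email clicks (which carry no weight in this model).
lead_profile = {
    "webinar_registrations_90d": 1,
    "product_pages_visited_30d": 3,
    "email_links_clicked_30d": 5,
}

score = behavior_score(lead_profile)
print("Behavior score:", score)
```

Behaviors that showed little correlation with conversion simply do not appear in the weights, so they no longer inflate the score.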

Once you’ve refined your model, let your scoring run for another 2-3 months before doing the Regression analysis again. Are you closer to understanding what content pieces are driving interest? If so, and you think you want that kind of continual analysis and feedback, AND you are investing a large amount in content marketing, then investing in predictive scoring from a 3rd party tool may be in your company’s best interest (plus you now have the evidence to make the budget request).

If you still feel tentative about that level of investment for a black-box model, it may be in your best interest to create a workflow for continually improving and revamping your scoring models using this process, and re-run the analysis once per quarter or so (depending on lead volume).

If you want to try the exercise above and are struggling with how to get started, get in touch.

Get in Touch with Us

At Etumos, we love what we do and we love to share what we know. Call us, email us, or set up a meeting and let's chat!
