How to Classify Thousands Of Unnatural Links (Really Quickly)
One of URL Profiler’s strengths is its speed and accuracy at auditing links. Our customers use it every day to recover quickly from unnatural link penalties and Penguin-related downgrades. We’ve even got some who have used it to diagnose negative SEO attacks. Yeah, they do exist!
This post will show you how to classify thousands of links very quickly, so you can figure out which ones are unnatural. We have this information in video form, if you are more visually inclined:
Importing Links
An important part of link cleanup work is finding as many links as you possibly can. When carrying out link audits we typically use 5 different link sources, to make sure we have caught as many links as possible.
Rather than forcing you to combine all the links and dedupe them yourself, URL Profiler will do this for you. So just import the raw data files and URL Profiler will combine them all, across a wide range of supported reporting tools:
- Google Webmaster Tools
- Bing Webmaster Tools
- ahrefs
- Majestic SEO
- Open Site Explorer
Within URL Profiler, right-click in the white box and choose ‘Import from File’. You can then import multiple CSV files in one go and let the tool dedupe the links for you.
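If you are curious what combining and deduping involves, here is a minimal Python sketch of the idea. It is not URL Profiler's own code: the `merge_link_exports` function, the `URL` column name and the trailing-slash normalisation are all assumptions for illustration, and real exports from each tool use their own column headings.

```python
import csv
import io

def merge_link_exports(exports, url_column="URL"):
    """Combine several CSV exports into one de-duplicated list of URLs.

    `url_column` is a hypothetical column name; each tool names it differently.
    """
    seen = set()
    merged = []
    for handle in exports:
        for row in csv.DictReader(handle):
            url = (row.get(url_column) or "").strip()
            # Crude normalisation (an assumption): ignore a trailing slash.
            key = url.rstrip("/")
            if url and key not in seen:
                seen.add(key)
                merged.append(url)
    return merged

# Two tiny in-memory "exports" standing in for real files.
export_a = io.StringIO("URL\nhttp://example.com/page\nhttp://example.org/post\n")
export_b = io.StringIO("URL\nhttp://example.com/page/\nhttp://example.net/links\n")

links = merge_link_exports([export_a, export_b])
print(links)
```

The first copy of each URL wins, so the order of the import files decides which variant you keep.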
Setting Up The Tool
URL Profiler is a bit of a Swiss Army knife, in that it can be used for many different tasks. To get maximum benefit out of each use, you need to adjust some of the settings specific to the task. URL Profiler will save these settings for the next use, so if you are doing lots of similar tasks (e.g. link prospecting) you can just leave the settings alone after the first run.
Checking Your Links
We need URL Profiler to go and look at the links to your site, to determine things like:
- Does the link still exist?
- What anchor text is used on the link?
- Where is the link positioned on the page?
- What type of site is the link on?
In order to check the links still point at your site, we need to tell URL Profiler what your site is, so enter your domain name in the box ‘Domain to Check’ in the Link Analysis section at the bottom. You don’t need to worry about the ‘www’ bit unless you are specifically checking links to a subdomain.
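To make those checks a bit more concrete, here is a rough Python sketch of how you might look for links to a given domain in a page's HTML and pull out the anchor text and `rel` attribute. It is only an illustration under my own assumptions, not how URL Profiler does it: `LinkFinder` and `find_links` are hypothetical names, the page is a hard-coded string rather than a live fetch, and the `www.` prefix is simply stripped before comparing.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkFinder(HTMLParser):
    """Collect href, anchor text and rel for links pointing at one domain."""

    def __init__(self, domain):
        super().__init__()
        self.domain = domain.lower()
        self.found = []
        self._current = None  # the matching link currently being read, if any

    def _matches(self, href):
        host = (urlparse(href).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]  # assumption: treat www and non-www as the same site
        return host == self.domain

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attributes = dict(attrs)
            href = attributes.get("href") or ""
            if self._matches(href):
                self._current = {
                    "href": href,
                    "anchor": "",
                    "rel": attributes.get("rel") or "",
                }

    def handle_data(self, data):
        if self._current is not None:
            self._current["anchor"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["anchor"] = self._current["anchor"].strip()
            self.found.append(self._current)
            self._current = None

def find_links(html, domain):
    finder = LinkFinder(domain)
    finder.feed(html)
    return finder.found

page = (
    '<p>Try <a href="http://www.example.com/" rel="nofollow">cheap seo links</a> '
    'or <a href="http://other.test/">this</a>.</p>'
)
result = find_links(page, "example.com")
print(result)
```

An empty result for a page would mean the link no longer exists there, which is the first question in the list above.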
I’ve written another post about why our link checker is awesome – and why the following settings are recommended – so I’ll not repeat myself here (you can read that post here).
Choose the Settings option top left and make sure ‘Connection Timeout’ is set above 40 seconds:
Then navigate to the ‘Link Analysis’ tab, and push the maximum retries up to 5 (or at least 3). Again, read my other post if you want to know why this will give you the most accurate link analysis possible.
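The reasoning behind a long timeout and several retries is easy to sketch in Python: slow or flaky sites often respond on a later attempt, so giving up after one try would wrongly report a link as missing. The snippet below is a hypothetical illustration, not the tool's implementation. `fetch_with_retries` is a made-up name, the 45 second timeout is just one value above 40, and the `opener` parameter exists only so the demo can run without touching the network.

```python
import io
import time
import urllib.request

def fetch_with_retries(url, timeout=45, max_retries=5,
                       opener=urllib.request.urlopen, pause=0.0):
    """Fetch a page, retrying after a failure up to max_retries times."""
    last_error = None
    for attempt in range(1 + max_retries):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as error:  # URLError and socket timeouts are OSError subclasses
            last_error = error
            if attempt < max_retries:
                time.sleep(pause)
    raise last_error

# Demo with a fake opener that fails twice before succeeding.
attempts = []

def flaky_opener(url, timeout):
    attempts.append(url)
    if len(attempts) < 3:
        raise OSError("simulated timeout")
    return io.BytesIO(b"<html>ok</html>")

page = fetch_with_retries("http://example.com/", opener=flaky_opener)
print(len(attempts), page)
```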
Defining Anchor Text
URL Profiler has a refined unnatural links classification system, which isn’t based on some fancy ‘machine learning algorithm’, but is instead based on how SEOs perform link classification.
When SEOs carry out a link classification, they are looking for patterns. We all know that SEOs of days gone by looked to scale any tactic that could get them links quickly and easily. By identifying patterns, you can identify link building footprints that will make your link classification much quicker.
For example, you might notice a few articles with heavily optimised anchors, and then look to see if there were many other links from article sites – a classic link building footprint. I wrote a lot more about this over on Search Engine Journal, and a lot of the same principles have been built into URL Profiler’s link classification system.
In order for it to work properly, you need to comprehensively define anchor text. So, after entering your domain to check, click the ‘Anchors’ button underneath:
You will see 3 tabs – Branded, Commercial and Generic. So first define your branded anchor text, which is typically the name of the site, the web address and variations thereof:
Similarly, go to the Commercial tab and enter any anchor text that might signify manipulative intent.
URL Profiler uses phrase matching, so you don’t need to enter every single exact variation here. In my screenshot I have entered ‘seo’ – this would pick up both ‘seo links’ and ‘seo backlinks’ as commercial anchor text without me having to enter anything else – plus any other anchor text variations that contain ‘seo’.
The Generic anchor text is pretty self-explanatory, so just fill in a few options here:
I can’t emphasise enough the importance of entering your anchor variations properly (particularly Commercial) – it will save you a ton of time when you come to analyse the results.
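To show why a single short phrase covers so many variations, here is a small Python sketch of phrase matching. Treat it as my reading of the idea rather than URL Profiler's actual logic: I am assuming a simple case-insensitive substring match, the `classify_anchor` function and the example phrase lists are made up, and checking Commercial before Branded and Generic is my own choice.

```python
def classify_anchor(anchor, branded, commercial, generic):
    """Label an anchor by the first phrase list containing a matching phrase."""
    text = anchor.lower()
    # Assumed precedence: commercial first, then branded, then generic.
    for label, phrases in (
        ("commercial", commercial),
        ("branded", branded),
        ("generic", generic),
    ):
        if any(phrase.lower() in text for phrase in phrases):
            return label
    return "other"

# Hypothetical phrase lists, in the spirit of the three tabs.
branded = ["url profiler", "urlprofiler.com"]
commercial = ["seo"]
generic = ["click here", "visit website"]

labels = [
    classify_anchor(anchor, branded, commercial, generic)
    for anchor in ["SEO backlinks", "seo links", "URL Profiler", "click here", "great article"]
]
print(labels)
```

One side effect of plain substring matching is that very short phrases can match inside unrelated words (‘seo’ would also match ‘Seoul’), so it pays to scan the results for false positives.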
Selecting Domain Metrics
We deliberately built the link scoring system so that it doesn’t use any link metrics (such as Moz Domain Authority or PageRank). This is because an unnatural link isn’t an unnatural link because ‘Trust Flow is low’ – it’s an unnatural link because it was placed in order to manipulate the Google algorithm. If you somehow placed a highly optimised link onto Google.com itself, it would still be unnatural – never mind the PageRank.
That said – I do often look at some metrics to give me more information about my links (typically homepage PageRank, homepage index status, URL index status, Majestic Citation Flow & Trust Flow). However, we didn’t want the link scorer to be reliant on metrics, and we wanted it to be API independent. As such you can get very accurate link scores by selecting only the following:
Only ‘Site Type’ and ‘IP Address’. That’s it.
Site Type will identify whether the site is an article repository, blog, link directory, forum etc., and the IP Address will allow you to identify links that all come from the same IP or subnet.
Once these settings are done, paste in your links and then run the Profiler. Armed with the site type and the anchor text, URL Profiler can find and classify all your links.
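If you want to do the ‘same IP or subnet’ check yourself on exported data, something like this Python sketch would do it. The `group_by_subnet` function, the sample URLs and the choice of a /24 subnet are all assumptions of mine for illustration; the IP addresses come from ranges reserved for documentation.

```python
import ipaddress
from collections import defaultdict

def group_by_subnet(link_ips, prefix=24):
    """Group (url, ip) pairs by the subnet their IP address falls in."""
    groups = defaultdict(list)
    for url, ip in link_ips:
        # strict=False lets us pass a host address and get its network back.
        network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        groups[str(network)].append(url)
    return dict(groups)

sample = [
    ("http://articles-one.test/post", "203.0.113.10"),
    ("http://articles-two.test/post", "203.0.113.77"),
    ("http://blog.test/review", "198.51.100.5"),
]

# Keep only subnets hosting more than one linking page.
shared = {net: urls for net, urls in group_by_subnet(sample).items() if len(urls) > 1}
print(shared)
```

Several supposedly unrelated sites sitting in one subnet is a classic footprint, though shared hosting means it is a hint rather than proof.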
Analysing The Data
Once you open the results, you will be presented with a spreadsheet separated into different worksheets. The first sheet is a summary report.
This is simply an overview of the results – a bird’s eye view of the data. You can see how many links were found, how many were not, and how many were classed as suspect, unnatural and so on.
Many of our users will run this report before taking on new clients, almost like a risk analysis report!
It’s also really useful for getting a quick picture of how much work an unnatural link cleanup might take.
Link Scoring
The second sheet, entitled All, contains all the URLs and their data regardless of whether a link was found or not. The other worksheets split out the links based on their link score.
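The same split is easy to reproduce if you export the All sheet and want to slice it yourself. This Python sketch is only an illustration: `split_by_score` is a hypothetical helper, the rows are invented, and I am assuming the score lives in a ‘Link Score’ column with labels such as ‘Unnatural’ and ‘Suspect’.

```python
from collections import defaultdict

def split_by_score(rows, score_column="Link Score"):
    """Group rows into one list per link score, like one worksheet per score."""
    sheets = defaultdict(list)
    for row in rows:
        sheets[row.get(score_column) or "Unscored"].append(row)
    return dict(sheets)

# Invented rows standing in for an exported 'All' sheet.
rows = [
    {"URL": "http://a.test/", "Link Score": "Unnatural"},
    {"URL": "http://b.test/", "Link Score": "Suspect"},
    {"URL": "http://c.test/", "Link Score": "Unnatural"},
]

sheets = split_by_score(rows)
summary = {score: len(items) for score, items in sheets.items()}
print(summary)
```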
If we zoom in on some of the data you can see how this works:
This is a selection of links which we have deemed to be unnatural, as shown in the ‘Link Score’ column.
Notice that in the ‘reason’ column we are completely transparent with the data. There are no annoying codes or obscured reasoning. It is all perfectly clear, so you can see exactly why we have come to this conclusion.
This represents a very important principle for URL Profiler – we believe that the data is YOUR data. We can tell you what we think, but at the end of the day you are the SEO expert, and where you add value is in your interpretation of the data. URL Profiler is a tool built by SEOs for SEOs, so we will always try to make our data as useful as possible.
At this stage, you will have an abundance of data about your links. You will know which links to ignore (nofollow links etc…) and which ones to be most concerned about. You will still need to spend some time going through and verifying the data (we can’t do everything for you!) but we are sure we will have saved you a hell of a lot of time.