What happened?
When you see a traffic drop from Google, it does not necessarily mean it is due to a Penguin demotion. There might be other causes, such as:
Tracking issues
On some occasions, the actual reason for a drop in Google organic traffic might be that the Google Analytics tracking pixel is missing from one or more pages.
Actions
Scan Site – Crawl the whole website to identify which pages are missing the Google Analytics tag (this can be done with either Screaming Frog or gachecker.com)
Referrals – Make sure your site is not showing as a referral traffic source (GA > Acquisition > All Traffic > Referrals)
Top pages – Check the top pages in traffic to determine if any pages have stopped working. (GA > Behaviour > Landing Pages)
Hostnames – Identify if any other website is using your analytics code (caches, translations, and other websites).
Alerts – Set up custom intelligence alerts to get notified when there are significant traffic drops (read more here: https://support.google.com/analytics/answer/103302…)
Content issues
The site’s content might be of poor quality, or there might be duplicate content issues that have resulted in a Panda or manual penalty.
Duplicate Content
Your site might be suffering from duplicate content. Here is what you need to do to identify any issues:
- Check with Google Webmaster Tools and Siteliner, and also search Google for the following to find internal duplicate content issues (duplicate title tags, meta descriptions):
- www.site.com and site.com
- http:// and https://
- dir and dir/
- / and /index.php
- /cat/dir/ and /dir/cat/
- /cat/dir/id/ and /cat/id/
- ?param_1=12&t_1=34 and /cat_12/dir_34/
- site.com and test.site.com
- test.site.com and site.com/test/
- /?you_id=334
- /session_id=344333
- Use CopyScape or PlagSpotter for external duplicate content issues.
- Make sure all parameters are blocked from the search engines so that these pages do not get indexed.
- Check for partial duplicates.
- Check for inconsistent internal linking (Screaming Frog, Deepcrawl).
- Look for any sub-domains.
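If you already have a crawl export (e.g. from Screaming Frog) with each URL’s title tag, a few lines of Python can surface internal duplicates. This is a minimal sketch; the `duplicate_titles` function and its input format are my own illustration, not part of any tool mentioned above.

```python
from collections import defaultdict

def duplicate_titles(titles):
    """titles maps URL -> <title> text; returns any title shared
    by more than one URL, pointing at the offending URLs."""
    groups = defaultdict(list)
    for url, title in titles.items():
        groups[title.strip().lower()].append(url)  # normalise before comparing
    return {t: urls for t, urls in groups.items() if len(urls) > 1}

pages = {
    "http://site.com/": "Home",
    "http://www.site.com/": "Home",
    "http://site.com/about": "About Us",
}
print(duplicate_titles(pages))  # {'home': ['http://site.com/', 'http://www.site.com/']}
```

The same pattern works for meta descriptions: just feed it a URL-to-description mapping instead.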
Technical Issues
This is very common after a site migration, due to a disallow directive in robots.txt, wrong implementation of rel=”canonical”, severe site performance issues, etc. Check the following:
- Proper use of 301s.
- Any “Bad” redirects.
- Redirect chains.
- Has the canonical version of the site been specified in Google Webmaster Tools?
- Has the canonical version of the page appropriately been implemented across the site?
- Does the site use absolute instead of relative URLs?
- Has the canonical version of the site been established through 301s?
- Is the robots.txt file blocking any pages?
- Has the no-index tag been implemented by mistake on one or more pages?
- Check for indexed PDF versions of your content (site:mydomain filetype:pdf).
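For the robots.txt check in particular, Python’s standard library can verify whether a given set of rules blocks a URL before you deploy it. A quick sketch; the rules below are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content you are about to deploy
robots_txt = """\
User-agent: *
Disallow: /private/
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Confirm which URLs a crawler obeying these rules could fetch
print(parser.can_fetch("*", "https://example.com/private/page.html"))  # False
print(parser.can_fetch("*", "https://example.com/blog/post.html"))     # True
```

Running this against your live rules after a migration is a cheap way to catch a stray disallow directive before it de-indexes important sections.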
Outbound linking issues – Many times a site links to spam sites or to websites operating in untrustworthy niches. Check for:
- Links to low-trust websites
- Paid Links
Negative SEO – If you experience a sudden traffic drop, you might have been a victim of negative SEO. Negative SEO usually refers to a competitor buying low-quality links and pointing them to your website to hurt your organic traffic.
Hacking – On several occasions, your website could have been hosting spam, malware, or viruses due to being hacked.
Google Updates
To better understand how the various Google updates affected your organic traffic, it is also recommended to identify the dates on which any updates took place (official and unofficial).
Google Algorithm Updates Sources
- Algoroo
- Google Algorithm Change Index
Google Traffic
In order to isolate Google Traffic you will need to create the following segment:
https://www.google.com/analytics/web/template?uid=CsPptfU_QE-X-Yngg00dVQ
Google Algorithm Updates Tools
There are two online tools that I highly recommend to speed up the process:
- Sistrix Updates Tool
- Barracuda Penguin Tool
What is Google Penguin?
Google Penguin is an algorithmic update, first launched by Google in April 2012, that aims to improve the value of search results by targeting various forms of search spam (also known as spamdexing or black-hat SEO), such as:
- Keyword Stuffing
- Link spamming
- Invisible text
- Duplication of content from high-ranking websites.
Key facts about Penguin
- Penguin is an algorithmic update, which means that it is not possible to instantly recover from it.
- You can only partially recover from Penguin before Google does a refresh or an update.
- Penguin seems to affect rankings at the keyword level more than site-wide.
- You DO NOT receive a notification in Google Webmaster Tools if a Penguin update has hit you.
- You can only submit a reconsideration request after receiving a manual penalty.
- The key date is the 24th of April 2012, so a traffic drop shortly after this date (or after any subsequent refresh) suggests you may have been hit by the Google Penguin algorithmic update.
How to find out if Penguin hit you
As Penguin is mainly related to backlinks, it is necessary to examine the following:
- Over-optimized anchor text (externally and internally).
- Over-optimized anchor text on low-quality websites.
- The dates that your website traffic was affected.
- Whether you have received any notification in Google Webmaster Tools.
- Is it a site-wide drop, or does it seem to be keyword-specific?
Steps to Recovery
Step 1 – Match updates to Google analytics organic traffic
Google Analytics is a very useful tool as it can help you identify if there was any traffic drop after each Penguin update.
April 24, 2012: Penguin 1
May 25, 2012: Penguin 1.2
October 5, 2012: Penguin 1.3
May 22, 2013: Penguin 2.0
October 4, 2013: Penguin 2.1
October 18, 2014: Penguin 3.0
Step 2 – Compare organic traffic before and after the update
Now, compare the organic traffic in the two weeks before each Penguin update with the two weeks after it. A drop of more than 50% is usually a clear indication that the site has been penalised.
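The comparison itself is simple arithmetic. A short sketch, assuming you have already exported the session totals for the two windows; the 50% threshold is the rule of thumb above, not an official figure:

```python
def traffic_drop_pct(before, after):
    """Percentage drop in organic sessions between two equal windows."""
    if before == 0:
        return 0.0
    return (before - after) / before * 100

def likely_penalised(before, after, threshold=50.0):
    """Flag a drop larger than the chosen threshold."""
    return traffic_drop_pct(before, after) > threshold

# e.g. 12,000 sessions in the two weeks before the update, 4,800 after:
print(traffic_drop_pct(12000, 4800))  # 60.0
print(likely_penalised(12000, 4800))  # True
```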
Step 3 – Investigating what dropped
Now that you have a clear understanding of which updates affected the website’s organic traffic, you also need to find out what dropped.
Step 4 – Which keywords dropped?
Penguin seems to affect websites at the keyword level rather than site-wide. For the same period you used for the traffic comparison, check the top keywords you are optimising the website for to see whether any of them were severely affected.
Step 5 – Check keywords visibility
Once you have found which keywords dropped, also log in to Google Webmaster Tools to check each keyword’s visibility (this works for one update only). To do this for all the updates affecting a website, you will need SEMrush or whichever other tool you were using to track keywords.
Step 6 – Gather all links
You have now reached the point where you need to gather all the links and start the analysis. For this process, you will need your backlink profile from the following tools:
- Ahrefs
- Majestic SEO
- Open Site Explorer
- Google Webmaster Tools
- Backlink Profiler (BLP)
After you have exported all the data and removed the duplicates in Excel, start the analysis of the anchor text. Initially, you need to count the instances of each anchor text using the following functions:
COUNTIF
Microsoft Excel Definition: Counts the number of cells within a range that meet the given criteria.
Syntax: COUNTIF(range, criteria)
COUNTIF is your go-to function for getting a count of the number of instances of a particular string.
IFERROR
Microsoft Excel Definition: Returns a value that you specify if a formula evaluates to an error; otherwise, it returns the result of the formula. Use IFERROR to trap and handle errors in a formula.
Syntax: IFERROR(value, value_if_error)
IFERROR is really simple and will become an important piece of our formulas as things get more complex. IFERROR is your method of turning those pesky #N/A, #VALUE! or #DIV/0! messages into something a bit more presentable.
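If you prefer to script this instead of (or alongside) Excel, Python’s `collections.Counter` gives you the COUNTIF-style tally in a couple of lines. The sample anchors are made up:

```python
from collections import Counter

anchors = [
    "cheap toys", "brand name", "cheap toys",
    "click here", "brand name", "cheap toys",
]

# Normalise, then count the instances of each anchor text
counts = Counter(a.strip().lower() for a in anchors)

print(counts["cheap toys"])   # 3  (what COUNTIF would return for that string)
print(counts.most_common(2))  # [('cheap toys', 3), ('brand name', 2)]
```

A missing key simply returns 0, so there is no IFERROR-style error trapping to worry about.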
Step 7 – Combine data
Now you need to pull data from Google Analytics for each update (15 days before vs 15 days after) for the top anchor texts, to discover whether there was a drop in organic traffic for the keywords used as link text pointing back to your website (the top anchors). Here is what you need to do, step by step:
- Combine all link resources in Excel.
- Keep only i) Anchor Text, ii) Linking Domains, and iii) Links Containing Anchor Text.
- De-duplicate data.
- Use COUNTIF and IFERROR to find anchor text instances.
- Extract data from Google Analytics (pre and post-update).
- Find the percentage of the traffic drop using the formula (before – after) / before, selecting the columns and cells that represent the data for each date range.
- Create a pivot table combining the following information:
- The drop.
- The # of LRDs (linking root domains).
If you are unfamiliar with Excel and pivot tables, I recommend downloading the following spreadsheet and using it as a guide, as it can help you save a lot of time.
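The steps above can also be scripted. Here is a sketch of the pivot in plain Python, with made-up rows; each row is (anchor, linking root domain, sessions before, sessions after), and LRDs are counted as unique linking root domains per anchor:

```python
from collections import defaultdict

# (anchor_text, linking_root_domain, sessions_before, sessions_after)
rows = [
    ("cheap toys", "spamblog.example", 500, 100),
    ("cheap toys", "linkfarm.example", 500, 100),
    ("brand name", "news.example", 800, 760),
]

domains = defaultdict(set)
traffic = {}
for anchor, domain, before, after in rows:
    domains[anchor].add(domain)        # unique linking root domains
    traffic[anchor] = (before, after)  # per-anchor totals (repeated on each row)

pivot = {}
for anchor, (before, after) in traffic.items():
    drop = (before - after) / before * 100
    pivot[anchor] = {"drop_pct": round(drop, 1), "lrds": len(domains[anchor])}

print(pivot["cheap toys"])  # {'drop_pct': 80.0, 'lrds': 2}
print(pivot["brand name"])  # {'drop_pct': 5.0, 'lrds': 1}
```

A high drop combined with a high LRD count for the same money anchor is exactly the Penguin footprint you are looking for.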
Step 8 – Check links using Link Detox
Link Detox is a very powerful tool if used correctly, as it combines data from multiple resources. Here is what you need to do:
- Create an account here https://www.linkresearchtools.com/.
- Go to Link Detox https://www.linkresearchtools.com/toolkit/dtox.php.
- Enter Domain to analyze.
- Analyse links going to Root Domain.
- Activate the NOFOLLOW evaluation.
- Select the theme of the domain from the dropdown.
- Select whether Google has sent you a Manual Spam Warning (Yes, No, Do Not Know).
- Upload any links you already have (Ahrefs, Open Site Explorer, Majestic, Google Webmaster Tools).
- Upload Disavowed links (if you have disavowed any).
- Hit Run Link Detox and wait until the report is ready.
- Before you start auditing your links, classify all your anchor texts into:
- Money
- Brand
- Compound (brand + money example: Debenhams toys collection)
- Other
- Download the report in CSV format and open it with Excel.
- Keep only the following columns:
- From URL – The page URL linking to your website.
- To URL – The page on your website that the external site links to.
- Anchor Text – The keyword or phrase used as the link text.
- Link Status – Whether the link passes link juice to search engines (follow or nofollow).
- Link Loc – The link’s location on the page (paragraph, footer, widget, etc.). Very useful when you need to remove it.
- HTTP-Code – These codes help identify the type of error when a page is not loading or responding.
- Link Audit Priority – The higher the priority, the more urgent it is to examine the link.
- DTOXRISK – How toxic each link is (how harmful it is to your website in organic search).
- Sitewide links – A site-wide link appears on most or all of a website’s pages (blogroll, footer, etc.).
- Disavow – Whether Google has been notified through the Disavow Tool that this link should be ignored.
- Power Trust – A metric showing how powerful and trusted a page or domain is in the eyes of Google.
- Power Trust Domain – The Power Trust metric applied at the domain level.
- Rules – Spam link classification (banned domain, link network, etc.).
Step 9 – Create additional columns in Excel
Before you start the analysis in Excel based on the Link Detox report, you will need to create the following columns:
- Contact Email
- Contact Page URL
- Removed (Yes, No)
- Page Trust (Majestic)
- Domain Trust (Majestic)
- Niche (use majestic for this)
- Page Indexed (double check)
- Date of 1st Contact
- Date of 2nd Contact
- Date of 3rd Contact
- Notes
The following are supplementary:
- Edu Domains (Majestic)
- Domain Toxic Links (OpenLinkProfiler)
- Governmental Domains (Majestic)
- Page Facebook Shares
- Page Facebook Likes
- Page Twitter Shares
- Page Google+ +1s
Step 10 – Keep only one URL per domain
Create an additional column after the domain column and paste the following formula into its first cell: =IF(B1=B2,"duplicate","unique") (this assumes the domains are in column B and sorted, so that each row is compared with its neighbour). Copy it down the whole column, then use the filter control you have applied to view only the unique values.
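The same de-duplication can be done outside Excel. A minimal Python sketch using only the standard library (the function name is mine; it assumes the list is already sorted in priority order, since only the first URL per host is kept):

```python
from urllib.parse import urlparse

def one_per_domain(urls):
    """Keep only the first URL seen for each host, treating
    www.example.com and example.com as the same domain."""
    seen, kept = set(), []
    for url in urls:
        host = urlparse(url).netloc.lower().removeprefix("www.")
        if host not in seen:
            seen.add(host)
            kept.append(url)
    return kept

links = [
    "http://www.site.com/page-1",
    "http://site.com/page-2",   # same domain -> dropped
    "http://other.com/page-3",
]
print(one_per_domain(links))  # ['http://www.site.com/page-1', 'http://other.com/page-3']
```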
Step 11 – Exclude non-verified links
Simply use the filter on the anchor text column to exclude the unverified links – links that no longer exist.
Step 12 – Exclude disavowed links
If you have done a link audit before and have uploaded disavow files, it is also worth excluding the links included in them, as this will save precious time. You can review these links separately later.
Step 13 – Banned domains
- Now it is time to start reviewing your links. Follow backlinks are always a higher priority, as they violate the Google Webmaster Guidelines directly.
- Apply a filter to all cells.
- Then, apply a filter to view links with TOX1 (banned links).
- Use the tag columns to mark whether each domain needs to be removed, using a descriptive tag.
- Mark any URL or domain that needs to be disavowed so that you can create a file at the end very easily.
- Be careful while reviewing, as in some instances, several domains might not be indexed for reasons other than being penalized (robots.txt, no-index tag).
- Also, several domains might be authoritative and trustworthy, and there is nothing wrong with them linking to you.
Step 14 – Domains infected with viruses
- Apply a filter to all cells.
- Then, apply a filter to view links with TOX2 (virus-infected).
- If you find a good one, do not remove it; simply contact the webmaster.
- Remove only the bad ones.
- Double-check the domains with one of the following tools:
- https://sitecheck.sucuri.net/
- http://safeweb.norton.com/
- https://www.mywot.com/ (extension)
Step 15 – Audit TOX3 Domains
All these links are classified as highly toxic by the Link Detox Genesis algorithm, so you will need to check them very carefully and remove them only if you agree with Link Detox’s suggestions.
Step 16 – Double-check Google Webmaster Tools backlinks
Pay particular attention to links imported from Google Webmaster Tools during your reviews, as according to John Mueller from Google, this should be the primary source of backlinks used to audit your link profile.
How to judge the value of a link
Before taking any action on links that might be toxic, and that could therefore result in your website receiving a Penguin penalty from Google, you need to devote time to understanding all the data you have pulled into the spreadsheet, such as:
Domain Trust Flow (Majestic): How respected the domain is on the web. A high Trust Flow is generally an indication that Google values the domain; domains with a Trust Flow below 10 usually deserve closer scrutiny.
Page Trust Flow (Majestic): This metric is similar to Domain Trust Flow but applied at a page level.
Domain Power Trust (Cemper): This metric determines the quality of a website according to its strength and trustworthiness.
There are four types of links:
- High Trust and Low Power – Links from highly trusted domains such as Universities or governmental institutions. These links are usually very difficult to get and have a very positive impact on your website’s credibility.
- Low Trust and High Power – These links require further research as they are not always good.
- Low Trust and Low Power – These links do not help much in general as they may come from newly established websites that might have even been penalised. Review any of these sites carefully before you build any links from them.
- High Trust and High Power – This is ideally the kind of links you want to earn.
DTOXRISK: The risk for each link, based on how harmful it might be for your website according to Link Detox’s calculations (client feedback, observations, linking domains, neighbourhood, internal and external SEO experts, known Google publications, etc.). For a full explanation, please see the following page: http://www.linkdetox.com/faq
Link Audit Priority: The higher the priority the more important it is to review each link.
Link Status: Whether a link is follow or no-follow.
Link Location: Where the link is located on the page (header, footer, navigation, etc.).
Niche: The niche that the domain falls under (finance, property, computers etc.)
HTTP code: These codes help identify the cause of a problem based on the response sent by the server (for detailed information, please see: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10….)
Page indexed: Whether or not the page is indexed by Google. Double-check this, and also use http://indexchecking.com/
On top of all these metrics, you will need to consider how search engines judge the value of each link.
Which links to remove
Prioritize the following types of links, as they violate the Google Webmaster Guidelines and could therefore result in a penalty:
- Link networks.
- Article submissions.
- Directory submissions.
- Duplicate content links (e.g. guest blog duplicated over 100s of domains).
- Spammy bookmarking sites.
- Forum profiles (if done for backlinks).
- Malware/hacked sites.
- Gambling/Adult sites (if your site is not in the same niche).
- Comment links with over-optimized anchor text.
- Blog roll links.
- Footer links.
- Site-wide links (in most cases).
- Scraper sites.
- Any auto-generated links (xRumer forum posts, etc.).
How to remove links
When it comes to removing links, there are several options available.
Contact webmasters
- Draft a document containing a complete list of links to be removed. Send it to the webmaster in a single, well-drafted, short email.
- Stick to one communication channel (email/LinkedIn/Facebook). Switch channels only when the previous one gets no response.
- Be polite to webmasters. They are helping to solve your problem! Keep a human touch in your communication: an email referring to the webmaster by his or her name is more likely to get a response and to build a strong professional relationship.
- Email from the affected domain, as webmasters are then more likely to answer and remove the links.
- Be polite because you are asking for a favor.
- Be polite when talking to spammers, because they can blackmail you for money if you act rudely.
If you spammed somebody’s site, be polite, admit it, and apologize. Make sure you don’t repeat it (the webmaster will check).
The disavow tool
- If you can’t remove all toxic links, use the disavow tool.
- Use the disavow tool only as a last resort, and mostly when you have been hit by an algorithmic update; in the case of a manual penalty, you will need to show Google proof that you have done everything possible to remove the bad links.
- Make a spreadsheet of the links removed without using the disavow tool; sort it and list un-removed links, removed links, and removal methods.
- Focus on un-removed links. Try to sort data by domain.
- You can either disavow an entire domain or just one link from a domain. Choose the method wisely.
- Manage separate domains and links.
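Once you have decided which whole domains and which individual links to disavow, the file itself is plain text: `domain:` lines for entire domains, one URL per line otherwise, and `#` lines for comments. A small sketch that assembles it (the function name and sample data are mine):

```python
def build_disavow(domains, urls):
    """Build the text of a disavow file in the format expected by
    Google's disavow links tool."""
    lines = ["# Disavow file generated after the link audit"]
    lines += [f"domain:{d}" for d in sorted(set(domains))]  # whole domains
    lines += sorted(set(urls))                              # individual links
    return "\n".join(lines) + "\n"

text = build_disavow(
    domains=["linkfarm.example", "spamblog.example"],
    urls=["http://old-directory.example/listing?id=42"],
)
print(text)
```

Save the result as a UTF-8 .txt file before uploading it through the disavow tool.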
404 the pages
In several cases, if you cannot get deep links removed, you can also change the URL of the page so that all these links point to a 404 page. Personally, I redirect them to another site that I create specifically for this reason, as I do not like to increase the errors on any website I am working on.
A common sense approach
Based on everything that has been said by John Mueller and Matt Cutts from Google, by industry experts, and on my personal experience, I would suggest the following actions:
- Review all links very carefully.
- Clean up as many links as you can, in particular the ones you created yourself (directories, forums, mini-sites, profiles, press releases on poor-quality sites).
- Disavow all toxic links that you could not remove.
- Ensure around 60% of your anchor text is branded and only 40% focuses on money keywords (20% exact match, 20% miscellaneous).
- Review carefully your niche to understand your top-ranking competitors’ link profiles (branded vs. non-branded percentage, link types).
- Build quality-only links to restore link equity and also build trust with Google.
- Grow your brand.
- Get media coverage.
- Wait until Google reruns the Penguin algorithm and reassesses your site.
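To sanity-check the 60/20/20 split suggested above, you can measure your current anchor-text mix once the anchors have been classified (as in Step 8). A small sketch; the labels and sample data are illustrative:

```python
def anchor_mix(classified):
    """classified maps anchor text -> class ('brand', 'money exact',
    'money misc', 'other'); returns the percentage share of each class."""
    total = len(classified)
    mix = {}
    for label in classified.values():
        mix[label] = mix.get(label, 0) + 1
    return {label: round(n / total * 100, 1) for label, n in mix.items()}

sample = {
    "Debenhams": "brand",
    "debenhams.com": "brand",
    "Debenhams toys": "brand",
    "cheap toys": "money exact",
    "click here": "other",
}
print(anchor_mix(sample))  # {'brand': 60.0, 'money exact': 20.0, 'other': 20.0}
```

Weighting each anchor by its number of linking root domains, rather than counting each anchor once, gives a more realistic picture for large profiles.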
I try to remove as many links as possible to recover a site from Penguin, even when it is not strictly necessary, simply because I do not want the sites I work with to be associated with any spammy or low-quality sites. Furthermore, I am not 100% convinced that the disavow tool works without actually removing any links. Another reason is that I might occasionally miss several bad links if I choose to disavow individual links instead of whole domains (rarely).
Depending on your time and resources, you will have to decide whether to clean up the site’s entire backlink profile or to focus only on recovering from Penguin. Link removal campaigns generally have a 5–20% success rate, so they are inefficient for algorithmic updates, but you should always discuss this option with your clients.