If you’ve been wandering around your Google Analytics of late, you may have noticed an unusual spike in your overview, most readily seen as a sharp rise in your bounce rate and traffic statistics.
What is Ghost Spam?
The recent deluge of Ghost Spam sites may be new to some of you; for others it is a long-standing nuisance that keeps cropping up. Either way, here is a brief overview of the why and where-from of Ghost Spam:
Ghost Spam never hits your site: it doesn’t crawl it, ‘fiddle’ with it or anything like that. There is no need to worry about whether the spam ghosts will damage your rankings in the SERPs. This is simply a nuisance, and one which we would like to get under control.
What happens goes something like this: the spammer uses Google’s Measurement Protocol to send data directly to Google Analytics’ servers. Armed with tracking IDs (the UA codes) that they have generated at random, they are free to send fake hits to any number of GA properties without ever actually visiting the sites.
This in turn drives major traffic to their own websites, the ones whose URLs (‘duckduckgo.com’, for example) pop up in your referral panel in GA. As you investigate, and naturally check out the referral sites you suspect have been visiting yours, you are upping their site visits and making the ‘ghostly’ advertised companies very happy indeed.
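To make that a little more concrete, here is a rough sketch of what a ghost hit looks like once it arrives. It only parses a made-up payload; the tracking ID and referrer are hypothetical, and the field names (tid, t, dh, dr) are the Universal Analytics Measurement Protocol parameters as we understand them. The point to notice is that the hit is nothing more than a handful of key-value pairs, and that the hostname field can simply be missing.

```python
from urllib.parse import parse_qs

# Made-up payload for illustration only. The tracking ID and the referrer
# are hypothetical; note that no hostname ("dh") is supplied at all.
payload = (
    "v=1&tid=UA-000000-1&cid=555&t=pageview"
    "&dp=%2F&dr=http%3A%2F%2Fspam-referrer.example"
)


def describe_hit(raw):
    """Pull out the fields of a hit that matter when spotting ghost spam."""
    parsed = parse_qs(raw, keep_blank_values=True)
    fields = {key: values[0] for key, values in parsed.items()}
    return {
        "tracking_id": fields.get("tid"),  # which GA property is targeted
        "hit_type": fields.get("t"),       # e.g. a pageview
        "hostname": fields.get("dh"),      # None when the sender left it out
        "referrer": fields.get("dr"),      # the site the spammer is promoting
    }


hit = describe_hit(payload)
print(hit)
```

Nothing in the payload proves that a browser ever loaded one of your pages, which is why these hits can be produced without a visit.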
Filtering ghost traffic has become a major issue for site owners lately, especially webmasters as their security is constantly under threat. Fake traffic, or ghost traffic, always manages to find its way into your analytics, and it can be tricky finding out exactly where it’s come from.
Who is sending me Ghost Spam?
Ghost spam is normally designed with the intention of getting you to visit another website by disguising itself as referral or organic traffic in your analytics reports. This type of spam normally shows up in your reports for a few days and then disappears, but it is best to keep it out of your reports in the first place, because it skews your GA data massively!
We recently noticed large increases in traffic and bounce rate on our own site and decided to pinpoint the perpetrators. The result is a list of sites that have been cropping up in our GA referral traffic, sending ‘ghost’ traffic to the site and causing the alarming spike in our bounce rate. The list below is by no means complete; there are tons of different ones, and these are simply the most recurrent across our clients. So keep your eyes peeled!
Keep an eye on your GA referrals for the following sites:
How can I stop Ghost Spam?
Unfortunately there is no way to block these sites from sending traffic to your domain, but there is a way to filter these sites from appearing in your GA reporting:
Actual website traffic will always have a valid hostname. Since ghost spam operates by targeting random tracking IDs in Google Analytics, its hits will have either a fake hostname or none at all.
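As a minimal sketch of that idea, the check below treats a hit as a ghost when its hostname is missing or is not one of your own. The hostnames in the set are placeholders, and treating "(not set)" as a missing hostname is an assumption based on how GA usually labels empty values in its reports.

```python
# Hypothetical hostnames that genuinely carry your tracking code.
VALID_HOSTNAMES = {"example.com", "www.example.com", "blog.example.com"}

# Assumption: GA labels a missing hostname as "(not set)" in its reports.
MISSING_MARKERS = {"", "(not set)"}


def is_ghost_hit(hostname):
    """Return True when the hostname is missing or is not one of ours."""
    if hostname is None:
        return True
    cleaned = hostname.strip().lower()
    if cleaned in MISSING_MARKERS:
        return True
    return cleaned not in VALID_HOSTNAMES


for name in ["www.example.com", "(not set)", "spam-host.invalid"]:
    print(name, is_ghost_hit(name))
```

This exact-match version is stricter than the filter built in the steps that follow, which matches on the main domain so that subdomains are covered automatically.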
Step 1: You must first access the hostnames report.
To do this, go to your Google Analytics Reporting tab. In the left-hand panel, locate and click on the Audience option.
After this select Technology, then Network and at the top of the report click on Hostname.
Step 2: List the valid hostnames that appear in the report you have just opened.
Most websites will have a main domain, plus a couple of subdomains in their list.
Step 3: Create a regular expression which includes your main domain.
A regular expression is a pattern, built from ordinary and special characters, that matches portions of a field. Because the filter matches anywhere within the hostname, your main domain on its own will also match all of its subdomains, so it is the only one of those you need to include in the expression.
Once you have written down all the valid hostnames, create a single expression that matches every one of them.
You can follow these tips to configure it:
- Separate each hostname with a bar (|), and add a backslash before every dot and hyphen in the hostnames so that they are matched literally.
- Put all the valid hostnames together in one expression.
- Make sure you do not leave any spaces.
- The filter pattern is limited to 255 characters, so keep the expression as short as you can. This matters because you should only have one include filter for hostnames, which means every valid hostname has to fit into that single expression.
- Do not add the bar (|) at the beginning or the end of the expression.
It is very important that you include all of your valid hostnames in the expression, or else you may lose some valuable data.
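If you would rather not assemble the expression by hand, the tips above can be sketched in a few lines of Python. This is only an illustration: the function name and the hostnames are made up, and you should still check the length of the result against the character limit of the Filter Pattern box before pasting it in.

```python
import re


def build_hostname_regex(hostnames):
    """Join valid hostnames into a single pattern for an include filter."""
    parts = []
    for name in hostnames:
        name = name.strip().lower()
        if not name:
            continue  # skipping blanks avoids a stray bar in the result
        if " " in name:
            raise ValueError("hostnames must not contain spaces: %r" % name)
        # Escape dots and hyphens so they are matched literally.
        escaped = name.replace(".", r"\.").replace("-", r"\-")
        if escaped not in parts:
            parts.append(escaped)
    if not parts:
        raise ValueError("at least one valid hostname is required")
    pattern = "|".join(parts)  # bars only ever sit between hostnames
    re.compile(pattern)        # sanity check that the pattern is well formed
    return pattern


# Hypothetical hostnames: a main domain plus a third-party shop.
pattern = build_hostname_regex(["example.com", "my-shop.example.net"])
print(pattern, len(pattern))
```

Only the main domain is listed here because a pattern that matches anywhere in the hostname also covers www.example.com and any other subdomain.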
Step 4: Create a custom filter which will exclude ghost spam from your traffic report.
To do this, go to Admin and select Filters, then add a new custom filter. Select Include, then choose Hostname as the Filter Field. Paste the regular expression you created in the previous step into the Filter Pattern box.
Step 5: Verify and then save the filter.
Once a filter is saved, its effect on incoming data is permanent: hits that it excludes are never recorded in that view and cannot be restored later. Verifying the filter lets you see what its effect would be before you save it.
This filter works by blocking any ghost spam with an invalid hostname. Once it has been set up, it requires little maintenance: the only thing you need to do is add the new hostname to the expression whenever you attach your tracking code to another service.
This of course won’t stop the sites from firing hits at your tracking ID, but it will keep them from racking up the bounce rate in your reports, and save you a few hours when you come to putting together your client reports!
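Alongside GA’s own verification, you can run a quick dry run of the expression against the hostnames from your report before saving anything, and again whenever you add a new service. The sketch below is only that: the hostnames and session counts are invented, the function name is ours, and matching without regard to case is an assumption about how the filter is configured.

```python
import re


def preview_include_filter(pattern, hostname_sessions):
    """Split hostnames into those the include filter keeps and those it drops."""
    # Assumption: the filter is not set to be case sensitive.
    regex = re.compile(pattern, re.IGNORECASE)
    kept, dropped = {}, {}
    for hostname, sessions in hostname_sessions.items():
        # Like the filter, match anywhere within the hostname.
        target = kept if regex.search(hostname) else dropped
        target[hostname] = sessions
    return kept, dropped


# Invented figures standing in for rows of the Hostname report.
report = {
    "www.example.com": 120,
    "Shop.Example.com": 12,
    "(not set)": 45,
    "spam-host.invalid": 30,
}

kept, dropped = preview_include_filter(r"example\.com", report)
print("kept:", kept)
print("dropped:", dropped)
```

If a hostname you recognise ends up in the dropped group, the expression is missing something and needs updating before the filter is saved.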
So there we have it! The nuisance that is, and that will probably continue to be, Ghost Spam.
Unfortunately, as yet there is no way to completely stop these ghostly hits skewing your GA data, but if we follow the steps above, we can at least put up a pretty decent defence and keep these spammy blighters from making our reporting truly hellish!