How to stop Google Analytics Language SPAM?

A few days ago I noticed a big rise in my Google Analytics stats coming from some very suspicious sources in Russia. The strange thing about these hits was the browser language.

Here is a sample of my analytics report:

There are some more curious language codes in the report:

Sometimes these are real HTTP requests to your website, probably made with cURL or something similar. Other times a hacked legitimate website has a hidden link to your website. And other times there is no referrer at all: it is just a ghost visit. This means the hit is sent directly to Google Analytics using your tracking code, and there is no way to block the visit on your server because the spammer never makes a request to your website.

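cURL, for example, makes it really simple to fake the language header: its -H option sets any header to any value, so a spammer only has to pass -H "Accept-Language: any text they like" along with the target URL.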
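The same trick takes only a few lines in most languages. As a rough Python sketch (the target URL is a placeholder, the spam text is one of the strings from my report, and the request is only built, never sent):

```python
from urllib.request import Request

# Placeholder target; the spam text is one of the strings seen in the report.
TARGET_URL = "http://example.com/"
SPAM_LANGUAGE = "o-o-8-o-o.com search shell is much better than google!"


def build_spoofed_request(url: str, language: str) -> Request:
    """Build (but do not send) a GET request with an arbitrary language header."""
    request = Request(url)
    # Nothing validates this value: any text can be placed in the header.
    request.add_header("Accept-Language", language)
    return request


request = build_spoofed_request(TARGET_URL, SPAM_LANGUAGE)
# urllib stores header names capitalized as "Accept-language".
print(request.get_header("Accept-language"))
```

The point is only that the language value is free text chosen by the client, so nothing stops a spammer from putting an advertisement in it.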

I’m sure that with a little research you will find tons of other ways to simulate a browser and use code to SPAM the language header.

So how can you defend against this new type of Language SPAM?

Block users by language in .htaccess

The .htaccess file is used to customize the behavior of the Apache web server. You can use this file to write rules that block access to your website when a specific IP, user agent, country, language or similar attribute is detected.

You can also block users by referer, if that applies to your case.
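If your site runs as a Python WSGI application and you would rather do the blocking there than in .htaccess, the same idea can be sketched as a small middleware. This is not an Apache rule, only an illustration of the logic. The patterns, the referer domain and the `hello_app` stand-in are all assumptions made up for the example.

```python
import re

# Assumed patterns. A bare "\." is not used on the raw header because real
# browsers send values like "en-US,en;q=0.9", which contain a dot.
BLOCKED_LANGUAGE = re.compile(r"[a-z0-9-]+\.[a-z]{2,}|google|trump", re.IGNORECASE)
# Placeholder referer domain, not a known spammer.
BLOCKED_REFERER = re.compile(r"spam-referrer\.example", re.IGNORECASE)


def block_spam(app):
    """Wrap a WSGI app and answer 403 to requests with spammy headers."""
    def middleware(environ, start_response):
        language = environ.get("HTTP_ACCEPT_LANGUAGE", "")
        referer = environ.get("HTTP_REFERER", "")
        if BLOCKED_LANGUAGE.search(language) or BLOCKED_REFERER.search(referer):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden"]
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Tiny stand-in application used only for the demonstration."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello"]


application = block_spam(hello_app)
```

Keep in mind that any server-side block like this only stops hits that actually reach your website. Ghost visits never do.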

Block ghost visits at the Google Analytics level

To block these hits in your Google stats you will have to create a filtered view. It is recommended to clone your view first, just in case you screw things up :-)

So go to your property and you will find the filter menu for the selected view.

The most important thing to remember is that you have to set up a custom filter. Set it to exclude by language and paste the language string into the Filter Pattern field.

You can verify the filter by hitting Verify just before the save button.

This is very effective but also very time-consuming, since spammers change the SPAM message quite often and each time you will have to create a new filter. Remember that you will have to create a filter for every language code that was spammed.

You can also use a regular expression and assume that all SPAM in your Google Analytics language report will contain a domain name, as in these examples:

  • Secret.ɢoogle.com You are invited! Enter only with this ticket URL. Copy it. Vote for Trump!
  • o-o-8-o-o.com search shell is much better than google!
  • Google officially recommends o-o-8-o-o.com search shell!

So filtering all hits with a dot in the language header should be OK (at least until spammers find a workaround).

The filter pattern you have to put in place is a backslash followed by a period ( \. )

 

UPDATE: Found some new ones:

  • Vitaly rules google ☆*:。゜゚・*ヽ(^ᴗ^)ノ*・゜゚。:*☆ ¯\_(ツ)_/¯(ಠ益ಠ)(ಥ‿ಥ)(ʘ‿ʘ)ლ(ಠ_ಠლ)( ͡° ͜ʖ ͡°)ヽ(゚Д゚)ノʕ•̫͡•ʔᶘ ᵒᴥᵒᶅ(=^ ^=)oO
  • Congratulations to Trump and all americans

Note that “Vitaly rules google” does not contain a dot, so you might want to create a separate filter for that. A good and more general approach would be to create a filter with the string ‘google’.

A good and more general approach to block “Congratulations to Trump” SPAM would be to create a filter with the string ‘Trump’.
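Before saving a filter it can help to try the patterns on a few values offline. The sketch below combines the dot, ‘google’ and ‘Trump’ rules into one regular expression and runs it over sample language values. The combined pattern, the case-insensitive matching and the shortened “Vitaly” string are my assumptions for the example, and Python's regex engine may not behave exactly like the one in Analytics.

```python
import re

# Assumed combined pattern: a literal dot, or the words "google" / "Trump".
# Case-insensitive matching is an assumption about the filter settings.
SPAM_PATTERN = re.compile(r"\.|google|trump", re.IGNORECASE)

languages = [
    "en-us",
    "pt-br",
    "o-o-8-o-o.com search shell is much better than google!",
    "Google officially recommends o-o-8-o-o.com search shell!",
    "Vitaly rules google",  # shortened version of the spam string
    "Congratulations to Trump and all americans",
]


def is_spam(language: str) -> bool:
    """Return True when the language value matches the exclude pattern."""
    return SPAM_PATTERN.search(language) is not None


kept = [value for value in languages if not is_spam(value)]
print(kept)  # ['en-us', 'pt-br']
```

Genuine language codes such as en-us contain no dot and neither keyword, so they pass through untouched.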

Remember to test and re-test before going into production with this, and always create a parallel view just in case you miss something.

Clean up the historical data in analytics

You may also be interested in cleaning up the historical data in your Analytics account. For that you will have to create a segment and add the same rules you used in your filters. This will filter out all ghost visits and SPAM in your data.
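To get a feel for what such a segment does to the totals, here is a minimal sketch that splits report rows into clean and spam sessions using the same kind of rules. The rows and their session counts are invented purely to show the arithmetic, and the pattern is the assumed dot / ‘google’ / ‘Trump’ combination.

```python
import re

# Assumed rules, matching the dot / "google" / "Trump" filters.
SPAM_PATTERN = re.compile(r"\.|google|trump", re.IGNORECASE)

# Hypothetical (language, sessions) rows; the numbers are invented and are
# not taken from any real report.
rows = [
    ("en-us", 120),
    ("de", 30),
    ("o-o-8-o-o.com search shell is much better than google!", 45),
    ("Congratulations to Trump and all americans", 25),
]


def split_sessions(report):
    """Return (clean_sessions, spam_sessions) for a list of report rows."""
    clean = sum(n for language, n in report if not SPAM_PATTERN.search(language))
    spam = sum(n for language, n in report if SPAM_PATTERN.search(language))
    return clean, spam


clean, spam = split_sessions(rows)
print(f"clean={clean} spam={spam}")  # clean=150 spam=70
```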

Here you can see the comparison with the new segment data. You can easily identify the spammer attacks.

Let me know if you find other patterns or other kinds of SPAM in your Analytics stats.
