8 Lessons Learned from Network Bots

“Our systems have detected unusual traffic from your computer network”

At Securly, we have gained a lot of insight into bot behavior based on our own experience seeing this message. Google’s security team has been tremendously helpful in understanding this issue.

We are sharing our findings below hoping they will be helpful for anyone experiencing similar issues. Note that we are intentionally not sharing any insights that Google wouldn’t want bot writers to get access to.

Here are 8 lessons learned from our experience with network bots:

1) Google search generally doesn’t like “onion routing”

Onion routing can happen unintentionally on many networks when the school has a network or web filter with more than one egress point to the public Internet.

If Google detects that the same user is accessing its service from different source IP addresses, it flags the behavior as bad. Tor and HOLA VPN route traffic this way as well.

We haven’t dealt with HOLA specifically, but it is possible that some students may have installed HOLA VPN to get around web-filtering, leading to the entire district/school IP(s) getting flagged by Google.
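One way to check for unintentional onion routing is to look for internal clients whose traffic leaves through more than one public IP. The sketch below assumes you can export (internal client, egress IP) pairs from your firewall or NAT logs; the record shape, the function name, and the sample addresses are all hypothetical.

```python
from collections import defaultdict


def clients_with_multiple_egress(records):
    """Return clients whose traffic left via more than one public IP.

    records: iterable of (internal_client, egress_ip) pairs. This shape is an
    assumption; adapt it to whatever your firewall or NAT logs provide.
    """
    egress_by_client = defaultdict(set)
    for client, egress_ip in records:
        egress_by_client[client].add(egress_ip)
    # Keep only clients seen behind two or more distinct egress addresses.
    return {
        client: sorted(ips)
        for client, ips in egress_by_client.items()
        if len(ips) > 1
    }


# Made-up sample data using documentation address ranges.
sample = [
    ("10.0.0.5", "203.0.113.10"),
    ("10.0.0.5", "198.51.100.7"),
    ("10.0.0.9", "203.0.113.10"),
]
print(clients_with_multiple_egress(sample))
```

Any client this returns is a candidate for the multi-egress behavior described above.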

2) Google attempts to flag traffic at the user-level first before it bans the entire IP address

To do this, it naturally has to rely on the IP addresses present in the traffic it sees.

In our experience, a school will trigger Google bot alerts if its network uses proxy-based web filters that can neither prevent onion routing nor add X-Forwarded-For headers to Google traffic.
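As a rough illustration of what adding the header involves, the sketch below appends a client address to X-Forwarded-For using the common comma-separated convention. The function name and the plain-dict representation of headers are assumptions for illustration; a real web filter does this inside its own request pipeline.

```python
def add_x_forwarded_for(headers, client_ip):
    """Return a copy of headers with client_ip recorded in X-Forwarded-For.

    headers: dict of header name -> value (a simplification for this sketch).
    If the header already exists, the address is appended to the list so
    that earlier hops are preserved.
    """
    updated = dict(headers)
    # Header names are case-insensitive, so look for any existing spelling.
    existing_key = next(
        (name for name in updated if name.lower() == "x-forwarded-for"),
        None,
    )
    if existing_key is None:
        updated["X-Forwarded-For"] = client_ip
    else:
        updated[existing_key] = f"{updated[existing_key]}, {client_ip}"
    return updated


request_headers = {"Host": "www.google.com"}
print(add_x_forwarded_for(request_headers, "10.0.0.5"))
```

Whether and how Google weighs this header is not something this sketch can show; it only gives the receiving side more information to work with.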

3) Repeated Google queries can get the IP flagged

Even when onion routing is not involved and the IP addresses are fixed, we have found that bots performing repeated Google queries can get the IP flagged.
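A simple way to look for this pattern in your own logs is to count how often each source sends the same query. This is a minimal sketch, assuming you can extract (source IP, query) pairs from proxy logs; the threshold is an arbitrary illustrative value, not a number Google publishes.

```python
from collections import Counter


def repeated_queries(events, threshold):
    """Return (source_ip, query) pairs seen at least `threshold` times.

    events: iterable of (source_ip, query) tuples; this shape is hypothetical.
    """
    counts = Counter(events)
    return {key: n for key, n in counts.items() if n >= threshold}


# Made-up sample: one machine repeats the same query many times.
events = [("10.0.0.7", "cheap tickets")] * 50 + [("10.0.0.8", "photosynthesis")]
print(repeated_queries(events, threshold=20))
```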

4) If you use hosted or cloud-based proxies, ensure that these are not open proxies

In our experience, restricting traffic to registered school IPs helps a lot.
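The allowlist idea can be sketched with Python's standard `ipaddress` module. The network ranges below are placeholder documentation ranges, and `is_registered` is a hypothetical helper name; in practice this check would live in the proxy's access-control configuration.

```python
import ipaddress

# Placeholder ranges standing in for the school's registered public networks.
REGISTERED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def is_registered(source_ip, networks=REGISTERED_NETWORKS):
    """Return True if source_ip falls inside any registered network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in networks)


for candidate in ("203.0.113.45", "192.0.2.1"):
    action = "allow" if is_registered(candidate) else "reject"
    print(candidate, action)
```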

If your school needs to keep proxy access open even to unregistered source IPs (e.g. to support take-home 1:1 iPads being proxied through an on-premise web-filtering proxy), then you must ensure that X-Forwarded-For headers are added.

This is only possible if your web filters are capable of MITM HTTPS handling, because the header has to be inserted into HTTPS traffic. As explained above, without it the bots would appear to Google to come from the IP address of the school web filter, causing bot alerts to show up for all users behind that IP.

5) Unsolvable Google captchas can be caused by having multiple public IP addresses

If you are seeing a Google captcha that you are unable to solve, in our experience that is a sign that Google-related traffic is exiting your network from more than one public-facing IP address. This is common in medium-sized and large districts.

In this case, the captcha gets served because Google sees “onion routing”. Once it is served, solving it does not help: Google does not associate a captcha solved from one IP address with the offending IP address for which it was served.

For example, the google.com/search request may have come from IP address A, while the captcha itself was served from ipv4.google.com, which is accessed via IP address B. In our experience, ensuring that google.com and ipv4.google.com are accessed from the same source IP gets rid of the unsolvable captcha.
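To confirm whether the two hostnames really leave through the same address, you could compare the egress IPs recorded for each. The sketch below assumes (hostname, egress IP) pairs pulled from firewall logs; the record shape, the helper names, and the use of `www.google.com` as the search hostname are assumptions.

```python
# Hostnames to compare; "www.google.com" stands in for the search host.
GOOGLE_HOSTS = ("www.google.com", "ipv4.google.com")


def egress_ips_by_host(records, hosts=GOOGLE_HOSTS):
    """Map each watched hostname to the set of egress IPs used to reach it."""
    seen = {host: set() for host in hosts}
    for host, egress_ip in records:
        if host in seen:
            seen[host].add(egress_ip)
    return seen


def uses_single_egress(records, hosts=GOOGLE_HOSTS):
    """Return True only if all watched hosts were reached via one egress IP."""
    all_ips = set().union(*egress_ips_by_host(records, hosts).values())
    return len(all_ips) == 1


# Made-up sample reproducing the "IP address A vs. IP address B" situation.
split_records = [
    ("www.google.com", "203.0.113.10"),
    ("ipv4.google.com", "198.51.100.7"),
]
print(egress_ips_by_host(split_records))
print(uses_single_egress(split_records))
```

If the check fails, the fix itself belongs in routing or NAT policy, which depends on your equipment and is not shown here.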

6) Bots are smart!

Bots are extremely smart. We have even seen cases where bots discovered that we were an open proxy only over the CONNECT + HTTP combination (we had regular HTTP, regular HTTPS, and CONNECT HTTPS covered).

7) A good way to detect bots on your network is to look at night-time traffic

Bots generally do not go to sleep the way humans do, so you will typically see Google search traffic from them even at 2 AM local time.

Firewall logs can quickly point to the source of the traffic on your school network.
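A minimal sketch of that night-time check, assuming firewall log entries can be parsed into (local timestamp, source IP, destination host) tuples. The tuple shape, the midnight to 5 AM window, and the function name are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime


def night_time_sources(records, start_hour=0, end_hour=5):
    """Count Google requests per source IP during the night-time window.

    records: iterable of (timestamp, source_ip, host) tuples, where timestamp
    is a datetime in local time. Returns (source_ip, count) pairs, busiest
    first.
    """
    counts = Counter()
    for timestamp, source_ip, host in records:
        is_google = host == "google.com" or host.endswith(".google.com")
        if is_google and start_hour <= timestamp.hour < end_hour:
            counts[source_ip] += 1
    return counts.most_common()


# Made-up sample: one machine searching at 2 AM, another at midday.
log = [
    (datetime(2024, 3, 4, 2, 0), "10.0.0.7", "www.google.com"),
    (datetime(2024, 3, 4, 2, 1), "10.0.0.7", "www.google.com"),
    (datetime(2024, 3, 4, 12, 30), "10.0.0.8", "www.google.com"),
]
print(night_time_sources(log))
```

Sources that top this list night after night are the first machines to inspect.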

8) A single infected machine can impact your whole network

Yes, a single infected machine on your network can bring down access to Google search for the entire district.

This information is not endorsed by Google in any way, and is provided here on an as-is basis in the hope of helping the school admin community.

2 thoughts on “8 Lessons Learned from Network Bots”

  1. Exactly how confident are you that “Man In The Middle”, https interception to insert the X-Forwarded-For value into the header would be trusted by Google? That simply means that Google is going to believe someone that intercepted the https connection, and used an unauthorized self-signed certificate to impersonate Google is somehow trustworthy enough to not really be a bot. Has this theory actually been tested in the last 6 months? Google switched all their services from http to https several years ago. Since that change, there is simply no way to change the header to include the X-Forwarded-For value without performing a MITM https interception. I have found no one that has actually demonstrated that implementing MITM https interception would reduce the Google Captcha.

    • Keith – good point, and we are actually not sure if implementing MITM guarantees a reduction in Google captchas. However, not implementing MITM guarantees that Google has no way of differentiating users behind the proxy, and this is not good. Also, we are not expecting Google to “trust” the XFF IP address blindly, or trust the proxy IP blindly, we are just allowing Google’s algorithms to use more information in making its decision. In our case, implementing all of these mechanisms (not a single one) drastically reduced and has almost eliminated captchas for us, and we serve 1M+ users across a small pool of proxy IP addresses.
