1. Background#
The company's network has been very unstable recently. Some mornings I arrived at work and could neither clock in nor browse the web, and the network also stalled occasionally during the day. Speed tests showed the download speed cut in half and the upload speed close to zero.
At first, I put it down to telecom line faults and network congestion in the park during peak hours. However, the problems lasted far longer and occurred far more often than that would explain. After numerous complaints from management and HR, I started to take the issue seriously.
Based on my years of experience, I came up with several possible reasons and started to eliminate them one by one:
- Telecommunications line faults or speed restrictions in the park
- Too many connected devices and concurrent connections during peak hours, exceeding the capacity of the equipment in the computer room
- Overload of the AC (wireless access controller) and APs as the number of employees increased
- Rising temperatures in spring and summer, resulting in high temperatures in the computer room ++(Some devices rely on passive cooling, so there is a real risk of overheating and thermal throttling)++
2. Attempted Solutions#
1. Changing the line#
The company originally had a telecom "200M" public broadband connection with an upload speed of only "30M". Moreover, most of the park uses the same telecom carrier, and speeds do fluctuate a lot during peak hours. So, after a brief discussion with management, we switched to a "China Unicom 1000M" broadband connection, which brought the upload speed up to "100M".
One complaint here: the China Unicom staff really know how to pick their timing. I waited for them all morning, and they only showed up just before noon, when I was about to break for lunch. I ended up working through the whole lunch break with them without a warm meal, and I was starving.
Changing the line helped somewhat, but not much: the stalls reappeared the very next morning.
2. Monitoring the computer room equipment#
Most of the equipment in the computer room is indeed not very powerful, but what puzzled me was that the company had not hired many new people recently, so there had been no significant increase in concurrency. After monitoring the equipment for more than two days, I ruled out this factor: hardware utilization never reached even half of capacity, and overheating turned out not to be a problem either.
3. Replacing AC and AP#
The company's AC has always had one problem: it ++consumes almost half of the network bandwidth++. Because the AC had many network ports, I had originally used it as a Layer 3 switch, so I assumed its capacity was insufficient and that this was what halved the outbound bandwidth. However, even after I removed all unnecessary patch cords, the AC could still only reach half the expected speed on the local area network. So, I requested a budget from the company to replace the AC and AP devices.
The AC selected was the "H3C Xiao Bei You Xuan RT-UR7208-P-E", and the AP selected was the "H3C EWP-UAP673". The maximum bandwidth is 300, which is sufficient for the company's use. After the replacement, the local area network speed did improve, and the AC no longer ate half of the local area network bandwidth. However, the network stalls still occurred.
==So, where is the problem exactly?==
3. Clash Port Exposure#
After the hardware investigation, I turned to the software side. I checked the company's outbound traffic on the bypass router and found that upstream traffic had increased abnormally, by "20%~30%" compared with previous months. (++I had actually asked the telecom technician to check the upstream and downstream traffic in the carrier's backend earlier, but he told me there was no problem, so I overlooked this point at the beginning.++)
Having spotted the abnormal upstream traffic, I went through every IP on the local area network looking for anything unusual. At first, I suspected that someone inside the company was using the company's broadband for "P2P" or "PCDN". However, a thorough check showed no local area network IP with abnormal upstream usage. Instead, I found several public IPs in the Clash connection list. A reverse lookup from IP to domain name showed that most of them belonged to a TV company in Russia. Searching with the keywords "Clash," "abnormal upstream traffic," and "Russia," I found a help post on GitHub. Here is the link to the original post:
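The month-over-month comparison described above could be automated with something like the following minimal sketch. The function names, the 20% threshold, and the sample figures are all hypothetical and only illustrate the arithmetic; they are not the actual readings from the bypass router.

```python
from statistics import mean


def upstream_increase(previous_months, current_month):
    """Return the fractional change of current_month relative to the
    average of previous_months (0.25 means a 25% increase)."""
    baseline = mean(previous_months)
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current_month - baseline) / baseline


def is_abnormal(previous_months, current_month, threshold=0.20):
    """Flag the current month when upstream traffic grew by at least
    `threshold` (an assumed cut-off, not a value from the article)."""
    return upstream_increase(previous_months, current_month) >= threshold


# Hypothetical monthly upstream totals in GB, for illustration only.
history_gb = [400.0, 410.0, 390.0]
this_month_gb = 500.0

print(f"{upstream_increase(history_gb, this_month_gb):.0%}")  # 25%
print(is_abnormal(history_gb, this_month_gb))                 # True
```

Comparing against an average of several earlier months, rather than only the previous one, makes a single unusually busy month less likely to hide a sustained increase.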
https://github.com/vernesong/OpenClash/issues/2629
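The triage described above (separating local area network addresses from public ones in a connection list, then reverse-resolving the public ones) might be sketched as follows. This is an illustration under assumptions, not the procedure actually used: the helper names are hypothetical, the local area network is assumed to use RFC 1918 ranges, and the sample addresses come from documentation-reserved ranges rather than from the real Clash list.

```python
import ipaddress
import socket

# Assumption: the LAN uses the standard RFC 1918 private ranges.
LAN_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def split_lan_and_public(addresses):
    """Split IP address strings into (lan, public) lists."""
    lan, public = [], []
    for text in addresses:
        ip = ipaddress.ip_address(text)
        if any(ip in net for net in LAN_NETWORKS):
            lan.append(text)
        else:
            public.append(text)
    return lan, public


def reverse_lookup(address, resolver=socket.gethostbyaddr):
    """Return the PTR hostname for an address, or None if it has none.

    `resolver` is injectable so the function can be exercised offline.
    """
    try:
        hostname, _aliases, _addresses = resolver(address)
        return hostname
    except OSError:  # covers socket.herror and socket.gaierror
        return None


# Placeholder addresses, not the ones seen in the incident.
seen = ["192.168.1.23", "10.0.0.5", "203.0.113.7", "198.51.100.20"]
lan, public = split_lan_and_public(seen)
print(lan)     # ['192.168.1.23', '10.0.0.5']
print(public)  # ['203.0.113.7', '198.51.100.20']
```

Calling `reverse_lookup(ip)` for each entry in `public` then gives the domain names to investigate. Many addresses have no PTR record, in which case a WHOIS query is the usual fallback.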
Seeing this, I suddenly realized that the company's bypass router was configured as the DMZ host, so many of its ports were exposed to the public network. Most likely, a brute-force scanning program probing for open ports found the exposed proxy port, and the bypass router was then abused as a jump host or accelerator, which produced the abnormal upstream traffic. So, I closed port 7890 to the outside and set firewall rules so that only local area network devices can access it. Afterwards, the upstream traffic gradually dropped from 8-9M/s to 200Kb/s~2M/s. At the same time, speed test websites showed that upstream and downstream bandwidth had returned to normal, and the unidentified Russian IPs disappeared from Clash.
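To confirm that a fix like this holds, one option is to probe the port from outside the network and to express the "local area network only" rule as a simple membership check. The sketch below is an illustration, not the firewall configuration actually applied: the function names are hypothetical and the `192.168.0.0/16` subnet is an assumed example.

```python
import ipaddress
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False


def is_allowed_source(source_ip, lan_cidr="192.168.0.0/16"):
    """Mirror of the firewall intent: accept only sources inside the LAN.

    `lan_cidr` is an assumed example subnet, not the company's real one.
    """
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(lan_cidr)


print(is_allowed_source("192.168.1.10"))  # True
print(is_allowed_source("203.0.113.7"))   # False
```

Running `port_is_open(public_ip, 7890)` from a machine outside the company network, where `public_ip` stands for the router's public address, should return `False` once the rule is in place. The same call from inside the local area network against the router's internal address should still return `True`.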
==So, this incident can be considered resolved for now.==
https://www.xiaohanwu.com/thinking/66276cf3176d45931ddc9e21
4. Conclusion#
This incident was not a virus infection and can only be considered a network vulnerability, but it is surprising that the party ultimately exploiting it turned out to be a large-scale TV service provider in Russia. It is hard not to suspect TV service providers of colluding with hackers to obtain large amounts of illegal CDN acceleration capacity.
In China, too, there are many obscure "high-defense CDNs" that advertise themselves as "filing-free". Because they are cheap and require no filing, many webmasters think highly of them. When money was tight, I nearly used one of these obscure CDNs myself. However, this incident has given me a whole new perspective on these small CDNs.
This article is synchronized and updated to xLog by Mix Space.
The original link is https://www.xiaohanwu.com/posts/IT/3