BattleGrounds Mod 2 / Rate caps in the source engine
The BattleGrounds Mod team approached me a while back about an issue they are having with their mod. It’s an HL2DM mod which, because of its open-plan maps and long-distance views, quite often generates more than the 30,000 bytes-per-second rate cap that is built into the core buffer limitations in Source. I’m not a modder, so I’m not familiar with methods of culling data in mods, but I am a telecommunications developer, so I could offer some advice from that angle.
After chatting with the guys for a while, I suggested a few things they could do to cut down on this excessive bandwidth, and also suggested they contact Alfred Reynolds directly, which they did. Alfred was suitably prompt, and this is what he said:
30,000 is 240kbps symmetric, our recent survey suggests that over half our userbase won’t be able to play your game if you try to even go near that cap. We have several mechanisms in the engine to help mods manage bandwidth effectively (send/recv proxies, various hammer entities to help cull LOS on large open maps), has your team investigated using them?
- Alfred
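The conversion in Alfred’s reply is easy to check. A minimal sketch, assuming the cap is expressed in bytes per second and that 1 kbit means 1000 bits (the function name is mine, not anything from the engine):

```python
BITS_PER_BYTE = 8


def bytes_per_sec_to_kbps(rate_bytes_per_sec: int) -> float:
    """Convert a byte rate into kilobits per second (1 kbit = 1000 bits)."""
    return rate_bytes_per_sec * BITS_PER_BYTE / 1000


# The 30,000 figure is the rate cap quoted in the post.
print(bytes_per_sec_to_kbps(30_000))  # 240.0
```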
Here’s the interesting part:
If you go to the Valve Survey Summary and look at the results, you’ll see that in fact fewer than 5% of respondents report a connection speed under 240kbps (about 5.9% if you count only those with a known speed). If you were to add the unknown section to that, then sure, it seems many are under the required 240kbps, but this is totally unrealistic. Unknown data should be treated as unknown: it likely has a very similar distribution to the data you do have, in which case it has no effect on the percentiles.
Here are the numbers (Statistics from Valve: 7:22pm PST (03:22 GMT), March 01 2007):
Line Speed         # Users      %ge      Subtotal      %ge   Bracket
=============   ==========  =======  ============  =======  ==========
33.6 Kbps            5,024    0.43%
56.0 Kbps           30,150    2.59%
112.0 Kbps          21,630    1.86%        56,804    4.88%  < 240kbps
256.0 Kbps         174,378   14.97%
768.0 Kbps         184,829   15.86%
1,024.0 Kbps        72,737    6.24%
2,048.0 Kbps       358,001   30.72%
10,000.0 Kbps      122,137   10.48%       912,082   78.27%  > 240kbps
Unspecified        196,295   16.85%       196,295   16.85%  Unknown
Totals Sanity Check:         100.00%                100.00%
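For anyone who wants to re-run the numbers, here is a small sketch that recomputes the share of users under the cap in three ways: against all respondents, against known speeds only, and in the worst case where every "Unspecified" user is assumed to be under 240kbps. The user counts are copied from the survey figures above; the function and variable names are my own.

```python
# User counts per reported line speed (Kbps), copied from the survey figures.
KNOWN = {
    33.6: 5_024,
    56.0: 30_150,
    112.0: 21_630,
    256.0: 174_378,
    768.0: 184_829,
    1024.0: 72_737,
    2048.0: 358_001,
    10000.0: 122_137,
}
UNSPECIFIED = 196_295
CAP_KBPS = 240


def users_below(cap_kbps: float) -> int:
    """Number of users whose reported speed is under the cap."""
    return sum(count for speed, count in KNOWN.items() if speed < cap_kbps)


def share_of_all(cap_kbps: float) -> float:
    """Percentage under the cap, with unknowns left in the total."""
    return 100 * users_below(cap_kbps) / (sum(KNOWN.values()) + UNSPECIFIED)


def share_of_known(cap_kbps: float) -> float:
    """Percentage under the cap, with unknowns dropped entirely."""
    return 100 * users_below(cap_kbps) / sum(KNOWN.values())


def worst_case_share(cap_kbps: float) -> float:
    """Percentage under the cap if every unknown user were below it."""
    total = sum(KNOWN.values()) + UNSPECIFIED
    return 100 * (users_below(cap_kbps) + UNSPECIFIED) / total


print(f"{share_of_all(CAP_KBPS):.2f}%")      # 4.88%
print(f"{share_of_known(CAP_KBPS):.2f}%")    # 5.86%
print(f"{worst_case_share(CAP_KBPS):.2f}%")  # 21.72%
```

Even the worst-case reading stays well short of half.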
Lol @ your statistics Alfred.
And just to clarify before someone challenges this: the numbers also say that over 60% of users can cope with 768kbps or more. Considering 512kbps ADSL is actually VERY popular in many parts of the world, a major bracket is possibly missing from this survey. Also, 20% having 256kbps or under is a long way from the “over half” suggestion.
Update:
I’ve added this from the first comment to keep anyone just browsing past informed. Alfred did also say the following:
Oh, and I should also add that sure, we will look at upping that limit (there are some fundamental limits due to memory buffering that will need to exist, but we should be able to make it a factor of 10 larger at a minimum I suspect). You will be shooting yourself in the foot if you use this larger number however.
- Alfred
When asked on a vague-ish timeframe, he said:
When its done (within a month or so if I had to guess) and yup, we would post a changelog point about it.
Personally, I think this is good news. I suspect that, particularly with larger servers, this could have a profound effect across all of the Source engine based games.
Another update, since someone has just raised the issue again, and I realised there was some useful justification on the ED Discussion of the topic:
TuF - I urge you to spend a minute thinking about the sub-second rate cap, not the ‘whole second’ rate cap. As I was describing to J3di in a post above, the cap is actually hit quite frequently. Remember that 30kB/s is a poor quantization for bandwidth measurement in a 100pps data flow. A better measurement is 2.4kbit (300 bytes) per packet, which is more closely related to the real buffers. I’m actually unsure exactly what the buffers look like (depth, frequency of flushing and so on) but I do know what the packet flow patterns look like. Realistically, the flow spikes a lot, but not high enough, hence the ‘exponential decays’ you see on net_graph, or on a packet logger.
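To put numbers on the per-packet view, here is a rough sketch. It assumes the cap is spread evenly across packets, which is a simplification of mine and not a description of the engine’s real buffering (which, as I said, I don’t know in detail). The 3,000 byte burst is a made-up figure for illustration.

```python
import math


def per_packet_budget(rate_bytes_per_sec: float, packets_per_sec: float):
    """Return the per-packet budget as (bytes, kilobits), assuming an even split."""
    budget_bytes = rate_bytes_per_sec / packets_per_sec
    return budget_bytes, budget_bytes * 8 / 1000


def packets_to_drain(burst_bytes: float, budget_bytes: float) -> int:
    """Packets needed to send a burst when each packet is held to the budget."""
    return math.ceil(burst_bytes / budget_bytes)


budget_bytes, budget_kbit = per_packet_budget(30_000, 100)
print(budget_bytes, budget_kbit)  # 300.0 2.4

# Hypothetical 3,000 byte burst (say, a busy moment on an open map).
print(packets_to_drain(3_000, budget_bytes))  # 10
```

Under this simplified model the burst is smeared over 10 packets, i.e. 100ms at 100pps, instead of arriving in one go.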
Clearly, for something which is relatively immediate (someone dying) there should not be an exponential decay of data flow from the spike when it occurred. Instead, in many cases it should be one spike, and then back to normal, with maybe some limited (and flat!!!) data carrying rag-doll and animation corrections.
Let’s also not forget that part of the reason for the load on a 100 tick server is that it’s having to do a lot of work to choke the data flow. Remove this work, and it should cope better. I don’t know too many GSPs that run under 10mbit lines to each box, and as we all know this is about spikes, not continual flow, so there’s no real cause for concern regarding server capabilities. Furthermore, costs for GSPs who pay for bandwidth at the 95th percentile (VERY COMMON!!!) will actually REDUCE, because the top 5% of the spikes essentially gets cut off. Another bonus.
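For anyone unfamiliar with 95th percentile billing, here is a minimal sketch of the usual idea: sort the bandwidth samples, throw away the top 5%, and bill at the highest sample left. The sample values and the two traffic profiles below are invented for illustration, and real providers differ in sampling interval and rounding, so treat this as an assumption-laden toy and not any particular GSP’s method.

```python
def billable_p95(samples_mbps):
    """Highest sample remaining once the top 5% of samples are discarded."""
    ordered = sorted(samples_mbps)
    n = len(ordered)
    # Integer form of ceil(0.95 * n) - 1, avoiding float rounding surprises.
    index = (95 * n + 99) // 100 - 1
    return ordered[index]


# Hypothetical profile A: short, tall spikes in 4 of 100 samples.
spiky = [2.0] * 96 + [9.0] * 4
# Hypothetical profile B: lower but longer elevation in 10 of 100 samples.
smeared = [2.0] * 90 + [4.0] * 10

print(billable_p95(spiky))    # 2.0
print(billable_p95(smeared))  # 4.0
```

In this toy case the tall spikes land entirely in the discarded 5%, while the longer, flatter elevation does not.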
I’d also point out that servers which are suffering (and are housed in a DC) often don’t have trouble with bandwidth, but with processing power and packet rate. These are not the same as bandwidth, and both could potentially be relieved by opening up the spikes, as there will be less of the choke management system running (less processing power), and it’s possible fewer packets will actually be required to carry all the relevant information (although I’m not sure how well the specific implementation allows for this one).
Yah, I’m one of the Team Leaders from BG2. Thanks a bunch to Raggi, and, just in case anyone was wondering, Mr. Reynolds also said:
“Oh, and I should also add that sure, we will look at upping that limit (there are some fundamental limits due to memory buffering that will need to exist, but we should be able to make it a factor of 10 larger at a minimum I suspect). You will be shooting yourself in the foot if you use this larger number however.
- Alfred”
When I asked on a vague-ish timeframe, he said:
“When its done (within a month or so if I had to guess) and yup, we would post a changelog point about it.”
what was it you always say raggi? 40% of statistics are useless? ;p
rate is both dl/upld. 240 kbps is regularly more than an average connection upload (eg. 512 adsl).
thus, while the dl wont be reached, the upload will be.
h3iki - the upload rate in source is almost always considerably less than the download rate for clients.
cl_rate should be re-introduced at this stage.
untrue. would be true for HL1, however, the source engine physics give a lot more upload. still less, true. but not considerably.
i believe its upload that alfred is talking about :]
Admin note: Anyone, and I do mean ANYONE, can simply load up the game and look at net_graph to see the difference between in and out bandwidth usage. This guy’s talking crap.
Clients don’t upload physics. Clients upload controls and interaction information. Please see the Source engine documentation in the wiki before making more assumptions.
Furthermore, net_graph, net_channels and rcon net_channels will give you some accurate numbers for this. Upload bandwidth for a single client is generally 50% or less of the average download rate.
I might be wrong, but I think this is what rags is referring to ..
“The client creates user commands from sampling input devices with the same tick rate that the server is running with. A user command is basically a snapshot of the current keyboard and mouse state”
That’s taken from the wiki. there’s more on the source engine and networking, but i felt that was the most relevant part for what he referred to. Interesting read, if you’ve got the time for it.
Do you actually know what symmetric means in this context raggi?
The statistics/statements provided by Valve are completely correct, clueless :\
Yes I do.
‘rate’ was made symmetric by Valve quite a while ago (presumably; what actually happened, and was documented, was that cl_rate was removed), but actual client upload usage is significantly lower, and doesn’t spike in the same way as server data.
The ‘rate’ cvar is still documented in game as meaning the maximum rate at which the host can receive data.
Furthermore it is not excessive upload that they require.
You can try and pick holes like that all you want, the fact is, it’s an unnecessary cap these days with regard to server to client data flow, and it needs to be lifted. Even 100 tick servers suffer issues with this value being this low. I can only presume this to be one of the reasons that Valve have not recommended 100 tick for a long time. I’m unsure if they’ve even ever retracted that.