Archive for March, 2007

BattleGrounds Mod 2 / Rate caps in the source engine

The BattleGrounds Mod team approached me a while back about an issue they are having with their mod. It’s an hl2dm mod which, because of its open-plan maps and long-distance views, sometimes (or even quite often) generates more traffic than the 30,000 bytes-per-second rate cap that is built into the core buffer limitations in Source. I’m not a modder, so I’m not familiar with methods of culling data in mods, but I am a telecommunications developer, so I could provide some advice from that angle.

After chatting with the guys for a while, I suggested a few things they could do to cut down on this excessive bandwidth, and also suggested they contact Alfred Reynolds directly, which they did. Alfred was suitably prompt, and this is what he said:

30,000 is 240kbps symmetric, our recent survey suggests that over half our userbase won’t be able to play your game if you try to even go near that cap. We have several mechanisms in the engine to help mods manage bandwidth effectively (send/recv proxies, various hammer entities to help cull LOS on large open maps), has your team investigated using them?

- Alfred

Here’s the interesting part…

If you go to the Valve Survey Summary and look at the results, you’ll see that fewer than 5% of respondents report a connection speed under 240kbps. If you were to add the unknown section to that, then sure, it looks like many more are under the required 240kbps (4.88% + 16.85% = 21.73%, which is still nowhere near half), but this is totally unrealistic. Unknown data should be treated as unknown: it likely has a very similar distribution to the data you do have, in which case it has essentially no effect on the proportions.

Here are the numbers (Statistics from Valve: 7:22pm PST (03:22 GMT), March 01 2007):

Line Speed        # Users     Share   Group users   Group %   Group
=============   =========   =======   ===========   =======   =========
33.6 Kbps           5,024     0.43%
56.0 Kbps          30,150     2.59%
112.0 Kbps         21,630     1.86%        56,804     4.88%   < 240kbps
256.0 Kbps        174,378    14.97%
768.0 Kbps        184,829    15.86%
1,024.0 Kbps       72,737     6.24%
2,048.0 Kbps      358,001    30.72%
10,000.0 Kbps     122,137    10.48%       912,082    78.27%   > 240kbps
Unspecified       196,295    16.85%       196,295    16.85%   Unknown
Totals sanity check:        100.00%                 100.00%
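For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the share of users under 240kbps from the counts in the table. The three ways of handling the "Unspecified" row (excluding it, counting it all as fast, counting it all as slow) and the function name are my own illustration, not anything Valve publishes.

```python
# Counts copied from the survey table above: line speed in kbps -> users.
SURVEY = {
    33.6: 5_024,
    56.0: 30_150,
    112.0: 21_630,
    256.0: 174_378,
    768.0: 184_829,
    1024.0: 72_737,
    2048.0: 358_001,
    10000.0: 122_137,
}
UNSPECIFIED = 196_295
THRESHOLD_KBPS = 240.0  # 30,000 bytes/s expressed in kbps


def share_below(threshold, survey, unspecified, mode):
    """Fraction of users below `threshold` kbps.

    mode (hypothetical names, for illustration only):
      "exclude"       - ignore unspecified users entirely
      "count_as_fast" - keep them in the total, assume none are slow
      "count_as_slow" - keep them in the total, assume all are slow
    """
    below = sum(n for speed, n in survey.items() if speed < threshold)
    known = sum(survey.values())
    if mode == "exclude":
        return below / known
    if mode == "count_as_fast":
        return below / (known + unspecified)
    if mode == "count_as_slow":
        return (below + unspecified) / (known + unspecified)
    raise ValueError(f"unknown mode: {mode}")


for mode in ("exclude", "count_as_fast", "count_as_slow"):
    pct = 100 * share_below(THRESHOLD_KBPS, SURVEY, UNSPECIFIED, mode)
    print(f"{mode:>14}: {pct:.2f}% under {THRESHOLD_KBPS:.0f}kbps")
```

Excluding the unspecified users gives about 5.86%, and even the worst case (every unspecified user is on a slow line) gives about 21.72%.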

Lol @ your statistics Alfred.

And just to clarify before someone challenges this: the table also says that over 60% of users will be able to cope with 768kbps or up. Considering 512kbps ADSL is actually VERY popular in many parts of the world, there is also possibly a major portion of this survey missing. Also, roughly 20% having 256kbps or under is a long way from the “over half” suggestion.

Update:

I’ve added this from the first comment to keep anyone browsing past informed. Alfred also said the following:

Oh, and I should also add that sure, we will look at upping that limit (there are some fundamental limits due to memory buffering that will need to exist, but we should be able to make it a factor of 10 larger at a minimum I suspect). You will be shooting yourself in the foot if you use this larger number however.

- Alfred

When asked for a vague-ish timeframe, he said:

When its done (within a month or so if I had to guess) and yup, we would post a changelog point about it.

Personally, I think this is good news. I suspect that, particularly with larger servers, this could have a profound effect across all of the Source engine based games.

Another update, since someone has just raised the issue again, and I realised there was some useful justification on the ED Discussion of the topic:

TuF - I urge you to spend a minute thinking about the sub-second rate cap, not the ‘whole second’ rate cap. As I was describing to J3di in a post above, hitting it actually happens quite frequently. Remember that 30,000 bytes/s is a poor quantization for bandwidth measurement in a 100pps data flow. A better measurement is 2.4kbit (300 bytes) per packet, which is more closely related to the real buffers. I’m actually unsure exactly what the buffers look like (depth, frequency of flushing and so on), but I do know what the packet flow patterns look like. Realistically, the flow spikes a lot, but the spikes are not allowed to go high enough, hence the ‘exponential decays’ you see on net_graph or on a packet logger.
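To put numbers on the per-packet point, here is a minimal sketch of the arithmetic. The 30,000 bytes/s cap and the 100pps flow come from the discussion above; the 1,500-byte burst is a made-up figure for illustration, and the assumption that a packet's whole budget is free for the burst is a simplification, since I don't know how the real buffers behave.

```python
import math

RATE_CAP_BYTES_PER_SEC = 30_000  # the rate cap discussed above
PACKETS_PER_SEC = 100            # a 100pps data flow


def cap_in_kbps(bytes_per_sec):
    """Convert a byte rate to kilobits per second."""
    return bytes_per_sec * 8 / 1000


def per_packet_budget_bytes(bytes_per_sec, packets_per_sec):
    """Bytes available per packet if the cap is spread evenly."""
    return bytes_per_sec / packets_per_sec


def packets_to_drain(burst_bytes, budget_bytes):
    """Packets needed to send a burst, assuming (hypothetically) that
    the whole per-packet budget is available for it."""
    return math.ceil(burst_bytes / budget_bytes)


budget = per_packet_budget_bytes(RATE_CAP_BYTES_PER_SEC, PACKETS_PER_SEC)
print(cap_in_kbps(RATE_CAP_BYTES_PER_SEC))  # 240.0 kbps
print(budget)                               # 300.0 bytes per packet
print(budget * 8 / 1000)                    # 2.4 kbit per packet
print(packets_to_drain(1_500, budget))      # 5 packets for a 1,500-byte burst
```

So under these assumptions a burst that could have gone out in one larger packet is smeared across several, which is the shape you end up seeing on the graph.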

Clearly, for something which is relatively immediate (someone dying), there should not be an exponential decay of data flow from the spike when it occurred. Instead, in many cases it should be one spike and then back to normal, with maybe some limited (and flat!!!) data carrying rag-doll and animation corrections.

Let’s also not forget that part of the reason for the load on a 100 tick server is that it’s having to do a lot of work to choke the data flow. Remove this work, and it should cope better. I don’t know too many GSPs that run under 10mbit lines to each box, and as we all know, this is an issue of spikes and not continual flow, so there’s no real cause for concern regarding server capabilities. Furthermore, costs for GSPs who pay for bandwidth on the 95th percentile (VERY COMMON!!!) will actually REDUCE, because the top 5% of the spikes essentially gets cut off, which is another bonus.
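For readers who haven't met 95th percentile billing, here is a minimal sketch of how such a figure can be computed, using the nearest-rank method. The sample values are invented for illustration, and real providers differ in their sampling interval (five-minute averages are common) and exact method, so how much of a sub-second spike ever shows up in a sample depends on the provider.

```python
def billed_rate_95th(samples_mbps):
    """Return the 95th-percentile sample (nearest-rank method):
    sort the samples, discard the top 5%, bill on the highest one left."""
    if not samples_mbps:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_mbps)
    n = len(ordered)
    rank = (95 * n + 99) // 100  # ceil(0.95 * n) in integer arithmetic, 1-based
    return ordered[rank - 1]


# Hypothetical sample sets, in Mbit/s (not measured data).
steady = [2.0] * 100                  # no spikes at all
few_spikes = [2.0] * 96 + [9.0] * 4   # spikes in 4% of the samples
many_spikes = [2.0] * 94 + [9.0] * 6  # spikes in 6% of the samples

print(billed_rate_95th(steady))       # 2.0
print(billed_rate_95th(few_spikes))   # 2.0 (the spikes fall in the discarded 5%)
print(billed_rate_95th(many_spikes))  # 9.0 (too many spiky samples to discard)
```

The point of the sketch is only that tall, rare spikes are free under this billing model, while the same data smeared over more samples may not be.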

I’d also point out that servers which are suffering (and which are housed in a DC) often don’t have trouble with bandwidth, but with processing power and packet rate. These are not the same thing as bandwidth, and both issues could potentially be relieved by opening up the spikes: there will be less of the choke management system running (less processing power), and it’s possible fewer packets will actually be required to carry all the relevant information (although I’m not sure how well the specific implementation allows for this one).