Why have any more than One Server?

UnnoticedShadow · 2019-01-17 00:51:35

As of now, the server system has been changed to stop nocturnal infertility from killing one server and leaving the others fine, to affecting all servers to a lesser extent, however, what I fail to understand is why have any more than one server? The system has, of course never worked like this, but has there ever been a reason for there to be more than one, or maybe two servers at any given time? Having one server would both lessen nocturnal infertility, and prevent server under-population, but could prevent single player servers, perhaps?!? Could someone explain why the servers need to work like that?

Greep · 2019-01-17 01:15:35

Lag

UnnoticedShadow · 2019-01-17 01:18:14

Couldn't you just have less, faster running servers? We most certainly don't need 15.

Greep · 2019-01-17 01:24:42

There's generally only 5. I'm assuming 15 exist only because linode is cheap and why not.

CrazyEddie · 2019-01-17 01:42:20

One server is not enough to handle all the players. Making that one server large enough to handle all the players may not be technically possible, as the load on a server (reputedly) increases non-linearly with the number of players.

In other words, putting twice as many players on one server may require much, much more than twice as much RAM or CPU. This would get expensive, and could become impossible at any cost.

(I'm not certain what the exact constraint is. My guess, actually, is that the number of disk seeks increases exponentially, in which case it wouldn't matter how much RAM or CPU the server had. Beyond a certain point the server would start responding too slowly to client requests, and the game would become unplayable, and there would be nothing anyone could do to make it go faster. But this is only a guess.)

Last edited by CrazyEddie (2019-01-17 01:46:30)

BladeWoods · 2019-01-17 06:14:17

There's a limit to a server's capacity. No massively multiplayer game runs on just one server. You can't just get one super fast server that's as good as 15. Such a thing doesn't exist.

pein · 2019-01-17 12:25:01

lag, apocalypse, using the free pass on old towns, playing with group of friends

they are way too big i think
when you cant even really go 5000 and have kids there, i don't see the point, also in each 500x500 you will find some decent places, and no one will ever find the other town 600 tile away or maybe just a few pros and that would be fine
i would love to meet other families on the map

Dodge · 2019-01-17 16:53:21

What is the actual maximum of a server btw?

It says 160 for all servers except server 1 at 120 but never seen this many on one server, at which point it starts to be too much for a server?

maybe just having less than 15 and better servers could do it, if it's not possible to be all on one server it's probably possible to be at least all on 2-3 servers

Would be nice to be at least 80 (half the max) on a server and if possible more

Maybe reducing tile culling to half the current time could reduce server load, finding really ancient village is rare enough that it doesnt matter that much especially if you compare it to having more players on a server, this plus having less but more performing servers, we could maybe have 100 players on a server which would be enough to make it really interesting

andrew_klassen · 2019-01-17 17:52:14

The main reason to have more than one server is for high availability.

CrazyEddie · 2019-01-17 18:13:45

pein wrote:

they are way too big i think

You're talking about the size of the map? That has no impact on server capacity or performance. The map tiles don't exist (and therefore don't consume resources) until someone goes there.

CrazyEddie · 2019-01-17 18:28:45

Dodge wrote:

What is the actual maximum of a server btw?
It says 160 for all servers except server 1 at 120 but never seen this many on one server, at which point it starts to be too much for a server?

Each server has its own defined maximum, which is what you see on the server report. However, in November Jason tweaked the load balancing algorithm to cap each server at half of its defined maximum, so now most servers will be maxed out at 80 and server1 will be maxed out at 60. He did this as a quick fix to solve reported performance problems post-Steam release.

I don't know how Jason determined what the maximum for each server should be. I know he expected it to be higher (200 per server) early on in the game's development, but reduced it sometime after the initial release. He's also done a number of things to improve server performance, like replacing the original database code to use a much faster algorithm.

we could maybe have 100 players on a server which would be enough to make it really interesting

The current problem is not that we can't put lots of players on a server. The current problem is that the population on each individual server is allowed to become low when the global population falls, rather than shutting down servers and consolidating the players onto fewer servers when the global population falls. But this is deliberate; the alternative is that servers are shut down when population falls, which kills all the lineages on the servers that are being shut down.

lionon · 2019-01-17 18:36:52

CrazyEddie wrote:

One server is not enough to handle all the players. Making that one server large enough to handle all the players may not be technically possible, as the load on a server (reputedly) increases non-linearly with the number of players.

If it is non-linearly, than it's due to suboptimal coding. I'm pretty confident it can be made to be close to linear. However when I skim over the code, yes there are lots of vectors that are looked through instead of optimized hash-tabling. So yes. It may be non-linear, but that's something that could be fixed.

However, for reasonable optimization, the first sensible thing would to create a profile while a server is under natural load. Often it are actually very inconspicuous parts of the code that turn out to use the most computation time.

ryanb · 2019-01-17 18:41:08

CrazyEddie wrote:

The current problem is not that we can't put lots of players on a server. The current problem is that the population on each individual server is allowed to become low when the global population falls, rather than shutting down servers and consolidating the players onto fewer servers when the global population falls. But this is deliberate; the alternative is that servers are shut down when population falls, which kills all the lineages on the servers that are being shut down.

I think we need a "partial shutdown" state where babies are allowed but not Eve spawns. At the moment server 5 never reaches the shutdown threshold because eves can still spawn. If we can disable eve spawning then lineages could die out naturally.

This would require some changes to the spawn code. If it tries to spawn an eve then it should send the client back to the reflector to choose a different server.

Gederian · 2019-01-17 18:44:09

CrazyEddie, lionon, ryanb

Not to steal a thread but: https://onehouronelife.com/forums/viewtopic.php?id=5032

lionon · 2019-01-17 18:45:29

ryanb wrote:

This would require some changes to the spawn code. If it tries to spawn an eve then it should send the client back to the reflector to choose a different server.

Or more general, someone (who didn't force server) should only become eve, when they have no eligible mother on all active servers.

This would require some more heavy updating status info about alive mothers and lineage bans between each server and the load distributer... not a trivial thing to do in a short hack.

Last edited by lionon (2019-01-17 18:46:01)

CrazyEddie · 2019-01-17 18:49:09

ryanb wrote:

I think we need a "partial shutdown" state where babies are allowed but not Eve spawns. At the moment server 5 never reaches the shutdown threshold because eves can still spawn. If we can disable eve spawning then lineages could die out naturally.

That's a great idea, but would be a pretty big change. The reflector doesn't know anything about spawning conditions on the servers; it only knows each server's max population and current population. The server only makes a decision about whether the player will spawn as an child or an Eve after the client has been sent there. And similarly, the server doesn't know anything about its state as seen by the reflector - the reflector knows which servers are "active" (i.e. will receive new connections from the reflector) but the server has no idea whether it is active or not. It just accepts any and all connections - whether from the reflector or from manual selection - until it reaches its max, and then it stops accepting new connections.

To implement your idea, the reflector and the servers would have to have some way of mutually knowing that any given server is being made inactive and should reject connections that would be Eve spawns.

But also, note that the server can't determine whether or not a new connection will spawn as an Eve until it knows who the player is! If it's in "no more Eve" shutting-down mode, it can accept some players but will have to reject other players (ones who are lineage banned from all current lineages on the server). Whereas today, a server either accepts all new connections or rejects all new connections without regard to which player is connecting.

So it would take a significant degree of changes. Maybe Jason can do it quickly and easily, I dunno. Obviously that's up to him.

ryanb · 2019-01-17 18:55:06

lionon wrote:

If it is non-linearly, than it's due to suboptimal coding. I'm pretty confident it can be made to be close to linear. However when I skim over the code, yes there are lots of vectors that are looked through instead of optimized hash-tabling. So yes. It may be non-linear, but that's something that could be fixed.

Looking back on the threads regarding performance, I'm not certain what the issue is at the moment. The performance problem earlier had to do with NULL lookups but that was fixed in this thread. Shortly after that he increased the cap to 200 but there were lag issues after the Steam release so he lowered the cap again. I don't know how much profiling he has done since that happened.

ryanb · 2019-01-17 19:07:15

CrazyEddie wrote:

To implement your idea, the reflector and the servers would have to have some way of mutually knowing that any given server is being made inactive and should reject connections that would be Eve spawns.

This is possible without the reflector and servers communicating. The state can be persisted on the client side. The server knows its own population and can be in charge of knowing if Eves can spawn. Here is a scenario.

1. Player connects to reflector which sends client to Server 5.
2. Server 5 has no eligible mothers so it attempts to spawn player as Eve.
3. Looking at its player count it is unable to spawn Eve and rejects the client.
4. The client remembers "I was rejected from Server 5"
5. The client reconnects to reflector passing a list of servers it has been rejected on
6. The reflector chooses a new server that doesn't match the rejected ones

While still a decent size change, it reduces the need for server communication.

Last edited by ryanb (2019-01-17 19:08:15)

Dodge · 2019-01-17 20:33:45

lionon wrote:

CrazyEddie wrote:
One server is not enough to handle all the players. Making that one server large enough to handle all the players may not be technically possible, as the load on a server (reputedly) increases non-linearly with the number of players.
If it is non-linearly, than it's due to suboptimal coding. I'm pretty confident it can be made to be close to linear. However when I skim over the code, yes there are lots of vectors that are looked through instead of optimized hash-tabling. So yes. It may be non-linear, but that's something that could be fixed.
However, for reasonable optimization, the first sensible thing would to create a profile while a server is under natural load. Often it are actually very inconspicuous parts of the code that turn out to use the most computation time.

It's more or less what im thinking, hard to get an estimate but i think we should be able to be 100 or at least 80 on a server without having any lag issues, there has to be some sort of memory leak or innefficiency in the coding somewhere that prevents us from being more than 60-70

What is the max players we had on one server in the recent days btw?

lionon · 2019-01-18 07:19:03

Dodge wrote:

It's more or less what im thinking, hard to get an estimate but i think we should be able to be 100 or at least 80 on a server without having any lag issues, there has to be some sort of memory leak or innefficiency in the coding somewhere that prevents us from being more than 60-70
What is the max players we had on one server in the recent days btw?

I can't say anything about what a max. could be. I just saying, anything but a (close to) linear relationship between number of players and load (on various bottlenecks) is IMO preventable.

PS: At least when the players are not standing all in the same map cell, but are distributed so they don't affect each other.

Last edited by lionon (2019-01-18 07:20:10)

Tarr · 2019-01-18 08:18:41

Dodge wrote:

It's more or less what im thinking, hard to get an estimate but i think we should be able to be 100 or at least 80 on a server without having any lag issues, there has to be some sort of memory leak or innefficiency in the coding somewhere that prevents us from being more than 60-70
What is the max players we had on one server in the recent days btw?

I'd assume the max was earlier tonight when S2 had 120 players on it. I personally didn't get any lag on my end but I think someone mentioned lag (but isn't there always complaining about lag anyways?)

jasonrohrer · 2019-01-19 00:51:09

All the profiling I've done (in the distant past) showed that the map database is the main bottleneck.

More people on the server means more chunks of the map need to be looked up and sent to them as they walk around.

Where the rubber hits the road, the bottleneck is actually the speed of disk seeks on the SSD (cache misses, essentially).

I've done a ton of optimizations on the map database, and probably gotten it as fast as it can be, with a minimal number of disk seeks---given the data structure. It is certainly way more efficient now than it used to be when I was running the "off the shelf" KISSDB.

However, there are still some aspects that simply cannot be made more efficient, given the way that the data is structured. Essentially fundamental flaws in the way that I designed the data model. Very deep problems. that would require a complete overhaul to fix.

To summarize, data entries are keyed by x,y coordinates. In other words, each cell on the map is a uniquely-hashed element, which will be stored in a random location in the hash table (file). Thus, looking up a given chunk of the map (which will contain sparse player changed cells over top of the non-disk proc-genned map) will involve a lot of random-access disk seeks.

Further compounding the problem is the way container slots are stored. Essentially, we key cells as (x,y,s), where s is the slot number in that cell. Thus, even looking up the slots in the cell requires separate random-access disk seeks. And sub containers are even worse. We're actually indexing as (x,y,s,b), were b tells us which sub container we're looking at slots in (baskets in a box, for example). So looking up just a single box full of full baskets will require.... 17 random disk seeks.

Now, if I had to do this all over again, I'm not sure how I would do it... we want to safely store things on disk, and they have unbounded lengths (there's no hard limit to how many slots a container can have). And those stored lists can change at any time, and get longer.

Long ago, someone suggested that we should at least be storing map cells in blocks together (like blocks of 100x100 x-y locations) so that lookups of map chunks can hit the same location in the file and then exploit cache locality. That makes sense for the base cell values, but I'm not sure how that would work for expandable containers. As containers grow, you can't just "make room" in the middle of the file, which sounds like you'd end up adding extra room at the end of the file. I.e., back to a random-access linked list. I.e., it's very hard to have the whole growing container in the same place in the file, and thus in the same cache block.

Another option: if the most frequent case in the read, why touch the disk at all during a read? Can't we just store the whole thing in RAM?

Maybe so, but not with the current $5 linodes. Server1 is currently using 415 MiB of ram just for the server process (and other processes running there take it up somewhere above 900 MiB, on a 1GiB linode).

Well, actually, quite a bit of that RAM is used for caching and meta structures for the current on-disk database. If the whole thing was in RAM for reads, then a lot of that stuff wouldn't be needed, maybe. I'm currently running some file-size analysis for the DBs to get an idea of how much space would be used in RAM to store the whole thing.

jasonrohrer · 2019-01-19 01:26:20

All db files together on server1 are currently 610 MiB.

So I guess that's doable, to hold the entire thing in RAM, but probably on the $10 linode.

Server 15's dbs only take 27 MiB.

jasonrohrer · 2019-01-19 01:30:11

Here are the sizes in KiB for all the db files on each server:

    Running command  on server1.onehouronelife.com
610744
   Running command  on server2.onehouronelife.com
651404
   Running command  on server3.onehouronelife.com
581816
   Running command  on server4.onehouronelife.com
393304
   Running command  on server5.onehouronelife.com
378676
   Running command  on server6.onehouronelife.com
14224
   Running command  on server7.onehouronelife.com
27504
   Running command  on server8.onehouronelife.com
8352
   Running command  on server9.onehouronelife.com
23056
   Running command  on server10.onehouronelife.com
23312
   Running command  on server11.onehouronelife.com
9868
   Running command  on server12.onehouronelife.com
50728
   Running command  on server13.onehouronelife.com
21944
   Running command  on server14.onehouronelife.com
14020
   Running command  on server15.onehouronelife.com
27668

Done running command on all servers.

jasonrohrer · 2019-01-19 01:33:05

In general, though, things are set the way that they are (in terms of player counts) to stay on the safe side of lag while I focus my work on other things.

For example, in the last 24 hours, server1 peaked at 80% cpu. I really don't want to let it get much higher than that.

Essentially, I want to avoid lag at all costs, because it's such a horrible player experience.

One Hour One Life Forums

#1 2019-01-17 00:51:35

Why have any more than One Server?

#2 2019-01-17 01:15:35

Re: Why have any more than One Server?

#3 2019-01-17 01:18:14

Re: Why have any more than One Server?

#4 2019-01-17 01:24:42

Re: Why have any more than One Server?

#5 2019-01-17 01:42:20

Re: Why have any more than One Server?

#6 2019-01-17 06:14:17

Re: Why have any more than One Server?

#7 2019-01-17 12:25:01

Re: Why have any more than One Server?

#8 2019-01-17 16:53:21

Re: Why have any more than One Server?

#9 2019-01-17 17:52:14

Re: Why have any more than One Server?

#10 2019-01-17 18:13:45

Re: Why have any more than One Server?

#11 2019-01-17 18:28:45

Re: Why have any more than One Server?

#12 2019-01-17 18:36:52

Re: Why have any more than One Server?

#13 2019-01-17 18:41:08

Re: Why have any more than One Server?

#14 2019-01-17 18:44:09

Re: Why have any more than One Server?

#15 2019-01-17 18:45:29

Re: Why have any more than One Server?

#16 2019-01-17 18:49:09

Re: Why have any more than One Server?

#17 2019-01-17 18:55:06

Re: Why have any more than One Server?

#18 2019-01-17 19:07:15

Re: Why have any more than One Server?

#19 2019-01-17 20:33:45

Re: Why have any more than One Server?

#20 2019-01-18 07:19:03

Re: Why have any more than One Server?

#21 2019-01-18 08:18:41

Re: Why have any more than One Server?

#22 2019-01-19 00:51:09

Re: Why have any more than One Server?

#23 2019-01-19 01:26:20

Re: Why have any more than One Server?

#24 2019-01-19 01:30:11

Re: Why have any more than One Server?

#25 2019-01-19 01:33:05

Re: Why have any more than One Server?

Board footer