Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cause minetest crash (serialization issue) #6

Open
jimy-byerley opened this issue Jul 1, 2017 · 32 comments
Open

Cause minetest crash (serialization issue) #6

jimy-byerley opened this issue Jul 1, 2017 · 32 comments
Labels

Comments

@jimy-byerley
Copy link

jimy-byerley commented Jul 1, 2017

I guy ! Your mod is a pure dream when it works !

Sometimes, when I use a ship (aroud 10m radius) it brings minetest to crash after few seconds during a flight. For now it always has happend during the first fly. After the first it seems to work always.

Here is the minetest output, (no problem before)

2017-07-01 17:22:56: ERROR[Server]: An unhandled exception occurred: String too long for serializeString
2017-07-01 17:22:56: ERROR[Server]: In thread 7f31f7d1b700:
2017-07-01 17:22:56: ERROR[Server]: /home/jimy/minetest-master/src/server.cpp:73: virtual void* ServerThread::run(): A fatal error occured: String too long for serializeString
2017-07-01 17:22:56: ERROR[Server]: Debug stacks:
2017-07-01 17:22:56: ERROR[Server]: DEBUG STACK FOR THREAD 7f31dc954700: 
2017-07-01 17:22:56: ERROR[Server]: #0  virtual void* UpdateThread::run()
2017-07-01 17:22:56: ERROR[Server]: DEBUG STACK FOR THREAD 7f31dd555700: 
2017-07-01 17:22:56: ERROR[Server]: #0  virtual void* EmergeThread::run()
2017-07-01 17:22:56: ERROR[Server]: DEBUG STACK FOR THREAD 7f31dffff700: 
2017-07-01 17:22:56: ERROR[Server]: #0  virtual void* UpdateThread::run()
2017-07-01 17:22:56: ERROR[Server]: DEBUG STACK FOR THREAD 7f31f7d1b700: 
2017-07-01 17:22:56: ERROR[Server]: #0  virtual void* ServerThread::run()
2017-07-01 17:22:56: ERROR[Server]: DEBUG STACK FOR THREAD 7f320a4087c0: 
2017-07-01 17:22:56: ERROR[Server]: #0  int main(int, char**)
2017-07-01 17:22:56: ERROR[Server]: #1  void ClientMap::renderMap(irr::video::IVideoDriver*, irr::s32)

I think it can be a error during serialization of node's metadata.

When I restart the server, The ship has disappeared ( :-<)

@stujones11
Copy link
Owner

I was not aware there was a size limit on string serialization, it would be interesting to know what that limit was then I could at least prevent a crash, though it could still result in loss of inventory items.

It may also be possible to compress the strings using zlib but I think a better long-term solution would be to move away from metadata storage to the new world storage api.

Thank you for the detailed report and sorry about your loss.

@stujones11 stujones11 added the bug label Jul 2, 2017
@stujones11
Copy link
Owner

stujones11 commented Jul 2, 2017

Please see if this helps aa542be

Sorry I realized that was overly simplistic, I will try to post a better solution asap.

Update: 2692918 should hopefully be better, it just depends whether the compression level is sufficient or not, I have not yet been able to reproduce this error locally.

@jimy-byerley
Copy link
Author

jimy-byerley commented Jul 3, 2017

I just tried it but the problem is still here, and it crashed after the first use.
It seems that itsn't due to your mod: after rebooting the server I get


2017-07-03 10:28:29: WARNING[Server]: meshnode: stray object removed
2017-07-03 10:28:29: ERROR[Server]: suspiciously large amount of objects detected: 197 in (15,3,61); removing all of them.

I think it crashes because of the too large number of "meshed" blocks (that are using one entity each ?)

@jimy-byerley
Copy link
Author

jimy-byerley commented Jul 3, 2017

Found: minetest/util/serialize.cpp
This function is used to serialize. It stores the length of strings on 2 bytes, and raise an error for each string over 2^(2*8) = 65536. This size is used by minetest from a while to store its maps, so it's not possible to modify it without lost all existing worlds ...

@stujones11
Copy link
Owner

They are indeed using one entity each as there is currently no way to create proper voxel area entities so you need to increase max_objects_per_block to a more sensible level to prevent that error.

Thanks for the info on serialization, at least by knowing the limit I can include a check to prevent crashing, however, a better solution would be to use tables rather than strings for inventory items but I keep getting invalid key to 'next' error when I do meta:from_table()

@HybridDog
Copy link

You could save an id number to meta and use modstorage to save the ships, nodemeta is sent to clients and causes unneeded information sending. To fix the serialization problem you could serialize manually.
https://github.com/minetest/minetest/blob/master/doc/lua_api.txt#L3114

@stujones11
Copy link
Owner

Yes, mod storage is definitely the way to go in future versions but for now I think f35f481 should at least prevent crashes and does compress strings where possible, books for example could be problematic, I'm not sure.

@jimy-byerley
Copy link
Author

jimy-byerley commented Jul 7, 2017

I got a crash with this one too. Same error with string length, but I didn't seen any warning.
(I probably put more blocks than the last time to get the crash)

@stujones11
Copy link
Owner

stujones11 commented Jul 7, 2017

Strange, I felt sure it must be a string formatted inventory item (like a book) that could possibly exceed that limit. It must be some specific node that causes this, are you using anything in particular as building materials that could possibly contain very long strings in their meta?

cdc3b1f might help to identify the culprit.

@jimy-byerley
Copy link
Author

jimy-byerley commented Jul 7, 2017

I tested it on a terrain piece (it's easier than building an entire ship), so dirt, stone, grass, including "plantlike" grass, leaves, tree, ...
For what I seen, that is more likely the large amount of blocks than the block type, that generates a too long string. It is strange with f35f481

How should I use cdc3b1f ?

@stujones11
Copy link
Owner

stujones11 commented Jul 8, 2017

Sorry, the last commit merely compresses and checks all string data, if the problem persists then we know the issue is not with node-meta serialization. It's beginning to look that way as the building materials you mention should not contain much (if any) metadata.

Yes, it looks like the controller entity is attempting to serialize the already serialized metadata when it statically saves. I will work on a fix asap, the nodemeta compression is likely not needed after all.

@stujones11
Copy link
Owner

stujones11 commented Jul 8, 2017

Hopefully e99b25a will fix the problem, or at least fail a little more gracefully if a string is too long.

@jimy-byerley
Copy link
Author

Let see that: no difference ...

@stujones11
Copy link
Owner

stujones11 commented Jul 10, 2017

As far as I can see, the only serialization (besides meta positions) is done by the controller node when statically saving. I have tried activating large volumes of landscape as you suggested but have still been unable to reproduce the serialization error, however, I have noticed some other problems with large volumes being removed along with the controller but that's another issue.

I have moved the string length check back to the serialized inventory itemstacks which is the only place I can think they could possibly exceed the limit. If none of these attempts at fixing has actually produced the log message then I wonder if the limit is actually less than 65535? I am running out of ideas now :-/

@jimy-byerley
Copy link
Author

The string size that trigger the alert message is defined here https://github.com/minetest/minetest/blob/stable-0.4/src/util/serialize.h#L58. It has exactly the same value as you use.
However the fact your can't reproduce this issue is strange: That wouldn't be due to the minetest api then.
So maybe it comes with some C++ code specific to linux, in the minetest source code.

@jimy-byerley
Copy link
Author

As there is never any warning about single string length, I can suppose that minetest tries to store a string bigger that the string in the table's fields. Maybe a concatenation of several lines. Maybe minetest is using serializestring for tables too.

@stujones11
Copy link
Owner

stujones11 commented Jul 10, 2017

I am using linux myself, unless you mean that you are using something else? I would be interested to know if my latest commit makes any difference, though I suspect it will not, you could try reducing the limit to see if you can at least produce the log error rather than crashing. I will continue to look into this, I could always roll my own serializer as @HybridDog suggested but this will be much slower done in pure Lua.

@jimy-byerley
Copy link
Author

jimy-byerley commented Jul 10, 2017

No, I'm using debian 8. The more I try each commit and search in the minetest source code, the less I understand how this bug can appear ...
Which version of minetest do you have ? Mine is

Minetest 0.4.16 (Linux)
Using Irrlicht 1.8.1
Build info: VER=0.4.16 BUILD_TYPE=Release RUN_IN_PLACE=0 USE_GETTEXT=0 USE_SOUND=1 USE_CURL=1 USE_FREETYPE=1 USE_LUAJIT=0 STATIC_SHAREDIR="/usr/local/share/minetest"

Downloaded from the stable 0.4 branch of the repo

@stujones11
Copy link
Owner

I am also using debian 8, but it seems strange as I just did a quick test and I do not even get that error when calling minetest.serialize on a table containing a string of twice the supposed limit. Here's the code I used.

local hex = "0123456789ABCDEF"
local str = ""
for i = 1, 8192 do
	str = str..hex
end
print(str:len())
local tab = {str}
local test = minetest.serialize(tab)

@jimy-byerley
Copy link
Author

Strange ... it doesn't crash for me too. And it doesn't crash when I serialize the string directly.

@jimy-byerley
Copy link
Author

jimy-byerley commented Jul 10, 2017

The only conclusion we can get is that the issue occurs because a minetest subroutine calls serializeString() instead of serializeLongString where the lua binding minetest.serialize() checks for the string size.

@stujones11
Copy link
Owner

stujones11 commented Jul 10, 2017

If this is caused by an internal call then it's really an engine bug IMO. What version are you using exactly? I am using 0.4.16 but am a bit behind current head, not sure if anything changed with serialization recently though.

EDIT: I see you are using 0.4.16 also, it doesn't look like the code has changed since february :?

@orbea
Copy link
Contributor

orbea commented Nov 2, 2017

This crash occurs for me too.

2017-11-02 21:23:13: ERROR[Server]: An unhandled exception occurred: String too long for serializeString
2017-11-02 21:23:13: ERROR[Server]: In thread 0x805c08000:                        
2017-11-02 21:23:13: ERROR[Server]: /home/orbea/src/minetest/src/server.cpp:73: virtual void *ServerThread::run(): A fatal error occured: String too long for serializeString
2017-11-02 21:23:13: ERROR[Server]: Debug stacks:                                 
2017-11-02 21:23:13: ERROR[Server]: DEBUG STACK FOR THREAD 0x805c06400:           
2017-11-02 21:23:13: ERROR[Server]: #0  int main(int, char **)                    
2017-11-02 21:23:13: ERROR[Server]: #1  Dedicated server branch                   
2017-11-02 21:23:13: ERROR[Server]: #2  void dedicated_server_loop(Server &, bool &)
2017-11-02 21:23:13: ERROR[Server]: DEBUG STACK FOR THREAD 0x805c08000:           
2017-11-02 21:23:13: ERROR[Server]: #0  virtual void *ServerThread::run()         
2017-11-02 21:23:13: ERROR[Server]: DEBUG STACK FOR THREAD 0x805c08400:           
2017-11-02 21:23:13: ERROR[Server]: #0  virtual void *EmergeThread::run()         
2017-11-02 21:23:13: ERROR[Server]: #1  MapBlock *ServerMap::loadBlock(v3s16)     
2017-11-02 21:23:13: ERROR[Server]: #2  void ServerMap::loadBlock(std::string *, v3s16, MapSector *, bool)
Abort trap (core dumped)

minetest-0.4.16
meshnode commit 5901137
OS: FreeBSD 10.3

This crash occurred while a player in our server was flying, he lost his ship when the server was restarted.

@midnightpizza
Copy link

I wonder if this is related to blocks count. for example too wide to too many blocks used.

@rubenwardy
Copy link

local test = minetest.serialize(tab)

This does not call serializeString - serialize is a Lua function defined which writes tables to a format similar to Lua syntax.

seralizeString is a function that writes a string to a binary buffer before sending or storing it.

I'd need to look into this mod though to see what the issue could be. I won't be able to do this for the next few weeks, but I can confirm that this certainly looks like an engine issue - whether or not that's caused by the stuff you're serializing being excessively long (in which case the crash should have a better error message anyway)

@orbea
Copy link
Contributor

orbea commented Nov 3, 2017

We have a test world with only the meshnode mod and the default game enabled which consistently reproduces this crash for at least two different Linux environments.

$ minetest 
2017-11-02 17:52:55: WARNING[Main]: Couldn't find a locale directory!
2017-11-02 17:52:55: [Main]: Automatically selecting world at [/home/orbea/.minetest/worlds/testthis]
Loaded texture: /usr/share/minetest/games/minetest_game/menu/header.png
Loaded texture: /usr/share/minetest/games/minetest_game/menu/icon.png
Loaded texture: /usr/share/minetest/games/minimal/menu/icon.png
2017-11-02 17:53:00: WARNING[Main]: NodeDefManager: Ignoring CONTENT_IGNORE redefinition
2017-11-02 17:53:00: ACTION[Main]:         .__               __                   __   
2017-11-02 17:53:00: ACTION[Main]:   _____ |__| ____   _____/  |_  ____   _______/  |_ 
2017-11-02 17:53:00: ACTION[Main]:  /     \|  |/    \_/ __ \   __\/ __ \ /  ___/\   __\n2017-11-02 17:53:00: ACTION[Main]: |  Y Y  \  |   |  \  ___/|  | \  ___/ \___ \  |  |  
2017-11-02 17:53:00: ACTION[Main]: |__|_|  /__|___|  /\___  >__|  \___  >____  > |__|  
2017-11-02 17:53:00: ACTION[Main]:       \/        \/     \/          \/     \/        
2017-11-02 17:53:00: ACTION[Main]: World at [/home/orbea/.minetest/worlds/testthis]
2017-11-02 17:53:00: ACTION[Main]: Server for gameid="minetest" listening on 0.0.0.0:58205.
2017-11-02 17:53:01: ACTION[Server]: singleplayer [127.0.0.1] joins game. 
2017-11-02 17:53:01: ACTION[Server]: singleplayer joins game. List of players: singleplayer
Loaded mesh: meshnode_plant.obj
Loaded mesh: meshnode_plant.obj
2017-11-02 17:53:09: ACTION[Server]: singleplayer right-clicks object 2: LuaEntitySAO at (10,30,10)
2017-11-02 17:53:43: ERROR[Server]: An unhandled exception occurred: String too long for serializeString
2017-11-02 17:53:43: ERROR[Server]: In thread 7f76d9ad2700:
2017-11-02 17:53:43: ERROR[Server]: /tmp/SBo/minetest-0.4.16/src/server.cpp:73: virtual void* ServerThread::run(): A fatal error occured: String too long for serializeString
2017-11-02 17:53:43: ERROR[Server]: Debug stacks:
2017-11-02 17:53:43: ERROR[Server]: DEBUG STACK FOR THREAD 7f76c1b6f700: 
2017-11-02 17:53:43: ERROR[Server]: #0  virtual void* EmergeThread::run()
2017-11-02 17:53:43: ERROR[Server]: DEBUG STACK FOR THREAD 7f76c2370700: 
2017-11-02 17:53:43: ERROR[Server]: #0  virtual void* UpdateThread::run()
2017-11-02 17:53:43: ERROR[Server]: DEBUG STACK FOR THREAD 7f76c2b71700: 
2017-11-02 17:53:43: ERROR[Server]: #0  virtual void* UpdateThread::run()
2017-11-02 17:53:43: ERROR[Server]: DEBUG STACK FOR THREAD 7f76d9ad2700: 
2017-11-02 17:53:43: ERROR[Server]: #0  virtual void* ServerThread::run()
2017-11-02 17:53:43: ERROR[Server]: DEBUG STACK FOR THREAD 7f76ecbc47c0: 
2017-11-02 17:53:43: ERROR[Server]: #0  int main(int, char**)
2017-11-02 17:53:43: ERROR[Server]: #1  void ClientMap::renderMap(irr::video::IVideoDriver*, irr::s32)
Aborted
$ minetest 
2017-11-02 17:53:48: WARNING[Main]: Couldn't find a locale directory!
2017-11-02 17:53:48: [Main]: Automatically selecting world at [/home/orbea/.minetest/worlds/testthis]
Loaded texture: /usr/share/minetest/games/minetest_game/menu/header.png
Loaded texture: /usr/share/minetest/games/minetest_game/menu/icon.png
Loaded texture: /usr/share/minetest/games/minimal/menu/icon.png
2017-11-02 17:53:50: WARNING[Main]: NodeDefManager: Ignoring CONTENT_IGNORE redefinition
2017-11-02 17:53:50: ACTION[Main]:         .__               __                   __   
2017-11-02 17:53:50: ACTION[Main]:   _____ |__| ____   _____/  |_  ____   _______/  |_ 
2017-11-02 17:53:50: ACTION[Main]:  /     \|  |/    \_/ __ \   __\/ __ \ /  ___/\   __\n2017-11-02 17:53:50: ACTION[Main]: |  Y Y  \  |   |  \  ___/|  | \  ___/ \___ \  |  |  
2017-11-02 17:53:50: ACTION[Main]: |__|_|  /__|___|  /\___  >__|  \___  >____  > |__|  
2017-11-02 17:53:50: ACTION[Main]:       \/        \/     \/          \/     \/        
2017-11-02 17:53:50: ACTION[Main]: World at [/home/orbea/.minetest/worlds/testthis]
2017-11-02 17:53:50: ACTION[Main]: Server for gameid="minetest" listening on 0.0.0.0:53448.
2017-11-02 17:53:51: ACTION[Server]: singleplayer [127.0.0.1] joins game. 
2017-11-02 17:53:51: ACTION[Server]: singleplayer joins game. List of players: singleplayer
2017-11-02 17:54:12: WARNING[Server]: meshnode: stray object removed

http://9skunks.tech/minetest/meshnode-only-test-world-testthis.tar.xz
http://9skunks.tech/minetest/meshnode-only-test-world-testthis.tar.xz.md5sum
http://9skunks.tech/minetest/readme.txt

@orbea
Copy link
Contributor

orbea commented Nov 3, 2017

I was able to reproduce this with a git version of minetest.
minetest-2017.10.31_2884196_master-x86_64-1_git
minetest_game-2017.10.27_49cc4c7_master-noarch-1_git
luanti-org/luanti@2884196
luanti-org/minetest_game@49cc4c7
(Btw that second commit is pretty cool to see!)

@orbea
Copy link
Contributor

orbea commented Nov 3, 2017

With a debug build.

$ minetest 
2017-11-02 18:54:57: [Main]: Automatically selecting world at [/home/orbea/.minetest/worlds/testthis]
Loaded texture: /usr/share/games/minetest/games/minetest_game/menu/header.png
Loaded texture: /usr/share/games/minetest/games/minetest_game/menu/icon.png
Loaded texture: /usr/share/games/minetest/games/minimal/menu/icon.png
2017-11-02 18:54:59: WARNING[Main]: NodeDefManager: Ignoring CONTENT_IGNORE redefinition
2017-11-02 18:54:59: ACTION[Main]:         .__               __                   __   
2017-11-02 18:54:59: ACTION[Main]:   _____ |__| ____   _____/  |_  ____   _______/  |_ 
2017-11-02 18:54:59: ACTION[Main]:  /     \|  |/    \_/ __ \   __\/ __ \ /  ___/\   __\n2017-11-02 18:54:59: ACTION[Main]: |  Y Y  \  |   |  \  ___/|  | \  ___/ \___ \  |  |  
2017-11-02 18:54:59: ACTION[Main]: |__|_|  /__|___|  /\___  >__|  \___  >____  > |__|  
2017-11-02 18:54:59: ACTION[Main]:       \/        \/     \/          \/     \/        
2017-11-02 18:54:59: ACTION[Main]: World at [/home/orbea/.minetest/worlds/testthis]
2017-11-02 18:54:59: ACTION[Main]: Server for gameid="minetest" listening on 0.0.0.0:62594.
2017-11-02 18:55:00: ACTION[Server]: singleplayer [127.0.0.1] joins game. 
2017-11-02 18:55:00: ACTION[Server]: singleplayer joins game. List of players: singleplayer
Loaded mesh: character.b3d
2017-11-02 18:55:07: ACTION[Server]: singleplayer right-clicks object 2: LuaEntitySAO at (10,30,10)
2017-11-02 18:55:42: ERROR[Server]: An unhandled exception occurred: String too long for serializeString
2017-11-02 18:55:42: ERROR[Server]: In thread 7fe211012700:
2017-11-02 18:55:42: ERROR[Server]: /tmp/SBo/minetest/src/server.cpp:74: virtual void* ServerThread::run(): A fatal error occured: String too long for serializeString
Aborted

Backtrace.

Thread 11 "Server" received signal SIGABRT, Aborted.
[Switching to Thread 0x7f1cda177700 (LWP 28713)]
0x00007f1ce7c11c9f in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00007f1ce7c11c9f in raise () from /lib64/libc.so.6
#1  0x00007f1ce7c13717 in abort () from /lib64/libc.so.6
#2  0x000000000075c63c in fatal_error_fn(char const*, char const*, unsigned int, char const*) ()
#3  0x00000000009a3453 in ServerThread::run() ()
#4  0x00000000006132af in Thread::threadProc(Thread*) ()
#5  0x00007f1ce85deb7f in ?? () from /usr/lib64/libstdc++.so.6
#6  0x00007f1ce9aa2565 in start_thread () from /lib64/libpthread.so.0
#7  0x00007f1ce7cf131f in clone () from /lib64/libc.so.6

Full GDB log. - https://pastebin.com/j1HUq5aT

@stujones11
Copy link
Owner

@orbea Thanks for the info, I will look into this as soon as I have time.

@rubenwardy Thank you for your interest in this issue, let's hope we can trace the problem :)

@orbea
Copy link
Contributor

orbea commented Nov 4, 2017

Thanks! I would really like to see this mod work well. :)

@stujones11
Copy link
Owner

stujones11 commented Nov 4, 2017

While it would be nice to get to the bottom of this for a number of reasons, I would not recommend anyone waste too much time on this mod as it will never be perfect. Better to work on true engine provided 'voxel area entities' instead, this is (or was) on the longer term roadmap for minetest, iirc.

@orbea
Copy link
Contributor

orbea commented Nov 4, 2017

Yea, a proper solution will always be more desirable. I guess to rephrase I would really like to see the ideas behind this mod work well. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants