Crashing with a big prompt when streaming. #14
Comments
Thanks for the info. I think I understand the problem and will put in handling for partial responses from the server. I'm assuming you were using the generate endpoint?
Yes, generate, and it only crashes when streaming is true and the prompt is large. (I'm not sure about the response.) I'm curious what context is used for. Can it actually be fed into another prompt?
Context is the list of tokens from the past conversation history, e.g. the prompt you provided and the model's response in vector form. If you feed this as an input to another prompt, the model will have the entire conversation history when producing a response. Most models only support a context window of a fixed size (usually 2048 - 8096 tokens), but this can be used for longer conversations or generations with longer inputs. The Ollama team has added a few things to the
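For illustration, reusing context might look like the sketch below, assuming ollama-hpp's generate overload that accepts a previous response as context; the model name is a placeholder.

```cpp
#include <iostream>
#include "ollama.hpp"

int main() {
    // First call: the returned response carries the context token vector.
    ollama::response first = ollama::generate("llama3.1:latest", "Tell me a short story.");

    // Second call: passing the prior response feeds its context tokens back in,
    // so the model sees the whole conversation so far when it answers.
    ollama::response second = ollama::generate("llama3.1:latest", "Continue the story.", first);

    std::cout << second << std::endl;
}
```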
I tried something really silly.
Message 1:
terminate called after throwing an instance of 'nlohmann::json_abi_v3_11_3::detail::type_error'
So yeah... it came in parts rather than as one whole message. That's pretty wild to me. That's pretty interesting about the use of context. What is the difference between chat and context then?
Streaming is useless until this is fixed. If you could resolve the handling of partial messages, that would be fantastic! :)
I merged in a solution testing this with the generate endpoint in #21. This looks good from my tests; let me know if this resolves your specific use case. It looks like any replies from the server are being limited to 4090 bytes; anything larger than that is fragmented into multiple messages over HTTP. This patch keeps an intermediate buffer that stores all received messages until they form a valid JSON message. If you can confirm this resolves things, I'll also apply the change to the chat and embedding endpoints.
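The general shape of such a fix, as a standalone sketch (not the actual patch from #21; the struct name here is made up), is to buffer fragments and defer parsing until they form valid JSON, which nlohmann::json can test without throwing via accept():

```cpp
#include <string>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Accumulates HTTP fragments until they form one complete JSON message.
// Assumes all fragments belong to a single message (e.g. the stream is
// split on newlines upstream, as with Ollama's newline-delimited replies).
struct partial_message_buffer {
    std::string pending;

    // Feed one received fragment. Returns true and fills `out` once the
    // buffered data parses as complete, valid JSON.
    bool feed(const std::string& fragment, json& out) {
        pending += fragment;
        if (!json::accept(pending))   // still truncated (or malformed)
            return false;
        out = json::parse(pending);
        pending.clear();
        return true;
    }
};
```

With replies capped at about 4090 bytes, a message carrying a large context array arrives in several pieces; feed() simply keeps returning false until the final piece closes the JSON object.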
It works! Yay!
I need to be able to send some large prompts, but it appears to crash.
It happens consistently for large prompts but not for small ones.
I tried to prevent it with a try/catch around the call, but that didn't work either.
However, it does not crash if I turn streaming off.
So, I attempted to find the issue and tried turning off allow_exceptions and turning on show_replies, since it only threw an Ollama exception without any more info. I discovered that the reply ends like this:
{"model":"llama3.1:latest","created_at":"2024-08-12T04:18:00.108913513Z","response":"","done":true,"done_reason":"stop","context":[128006,9125,128007,271,1687,2351,304,264,15861,1405,1274,527,41416,71017,28296,10269,2209,437,346,645,39131,449,11977,389,872,11314,13,11205,6548,527,4251,12960,449,6437,61072,26432,11,323,264,3254,4251,28029,11013,627,32,8762,39131,8965,617,264,2385,1933,315,1193,279,35044,7106,1933,1093,11,3816,11,22725,11,26541,11,7997,11,8828,6437,11,8868,13,2030,11,814,1101,617,10824,1093,5929,11,14198,11,14531,627,32,8954,39131,4869,11,617,279,10824,315,2579,323,6437,8146,11,7231,1124,3816,11,41489,11,27211,11,7023,16985,11,323,8868,13,2435,617,18460,11813,369,682,220,18,3585,11,20444,19960,527,17676,14618,627,2675,527,279,10877,13,1472,23846,279,3446,358,3371,499,1093,433,596,264,5818,13,128009,128006,9125,128007,271,1687,2351,304,264,15861,1405,1274,527,41416,71017,28296,10269,2209,437,346,645,39131,449,11977,389,872,11314,13,11205,6548,527,4251,12960,449,6437,61072,26432,11,323,264,3254,4251,28029,11013,627,32,8762,39131,8965,617,264,2385,1933,315,1193,279,35044,7106,1933,1093,11,3816,11,22725,11,26541,11,7997,11,8828,6437,11,8868,13,2030,11,814,1101,617,10824,1093,5929,11,14198,11,14531,627,32,8954,39131,4869,11,617,279,10824,315,2579,323,6437,8146,11,7231,1124,3816,11,41489,11,27211,11,7023,16985,11,323,8868,13,2435,617,18460,11813,369,682,220,18,3585,11,20444,19960,527,17676,14618,627,2675,527,279,10877,13,1472,23846,279,3446,358,3371,499,1093,433,596,264,5818,13,128009,128006,882,128007,1038,10224,25,622,638,70408,271,2170,279,7160,6137,311,743,389,279,13057,16763,1974,11,264,47766,26541,39131,7086,622,638,1903,813,1648,1203,311,813,13691,11,264,3254,1515,1776,2234,927,813,17308,627,16366,6548,15541,839,449,264,5647,315,24617,11,719,24923,430,11203,264,13310,315,87163,5849,323,264,19662,41328,922,279,1917,7953,279,35174,627,1548,71771,264,14351,11,323,1243,4780,311,9762,8356,505,813,13863,627,2170,568,15203,11,813,36496,41398,704,311,279,61838,11,653,50009,506,95519,315,279,16763,8329,11,264,5647,315,79422,323,31398,28786,927,1461,627,2409,813,14351,2539,11,622,638,6052,311,813,13691,11,5675,304,3463,439,568,11427,3093,279,64550,315,264,50999,628,40458,86188,627,1548,5900,311,1373,1063,315,813,7075,2363,11,568,1047,4101,315,2999,18025,345,1,11787,12639,17694,2744,1314,36818,72059,264,1314,22217,3041,499,264,58768,34536,1,52522,279,1314,39131,34536,1,96621,264,69934,1994,39131,994,499,1518,832,34536,1,5530,6296,13,1314,34536,1,96621,701,14469,1182,9135,627,1548,2503,304,813,10716,627,438,3940,5403,627,791,1828,1938,9522,41,638,4321,704,315,813,3838,3411,264,2697,12703,13,320,18433,568,6439,7636,29275,1548,2010,10411,311,813,1866,6246,1828,311,13952,627,2181,706,264,4686,12960,315,65442,2212,264,36670,304,279,6278,449,5496,3245,1093,37085,24269,304,433,627,1548,2503,389,813,23162,10716,627,23274,12703,627,1548,16008,264,3070,315,37085,627,1548,53914,627,98331,568,21423,68,3059,627,32,938,11975,80608,311,430,28485,7162,627,31082,11,622,638,374,86015,813,1515,449,813,1450,627,3112,568,53006,433,449,813,40902,1139,4027,6798,627,1548,5900,4871,323,923,1515,311,813,40511,627,1548,2800,304,813,10716,369,279,3814,14624,311,4027,17944,2785,627,2181,2543,6453,627,791,1828,6693,11,568,1427,704,279,3321,627,16366,3663,374,1437,41133,449,65875,627,32,1317,892,4227,994,568,574,264,1716,13,5414,18233,3309,1461,11,330,40,2846,1022,311,990,11,358,3358,387,1203,18396,10246,41,638,20592,11,330,6854,956,358,2586,449,499,48469,7189,6562,499,649,1210,568,6013,11,330,4071,11,358,2011,5944,
34504,2533,433,596,264,1317,11879,13,30070,499,649,11722,11,499,2011,4822,1618,10246,7009,65682,47555,323,568,3952,11213,14836,11226,627,1548,2646,6052,627,41,638,14980,1070,555,5678,627,3112,706,3596,2533,11,439,568,25735,64033,9522,1548,4321,927,311,813,2363,55050,3131,810,627,47,92122,704,264,3361,2363,627,10227,15457,315,264,48085,39131,702,1548,6227,813,18233,11890,1461,11,330,40,1390,499,311,6865,264,3446,922,4423,1633,3361,627,2181,574,264,28812,38490,889,12439,1690,1667,4227,13,9176,1274,3463,568,574,1120,264,2184
terminate called after throwing an instance of 'nlohmann::json_abi_v3_11_3::detail::type_error'
what(): [json.exception.type_error.305] cannot use operator[] with a string argument with null
It fails to receive the complete context and ends up missing some integral info that an operator[] call tried and expected to access, and thus crashes.
What we would expect after the context is the closing bracket and: ],"total_duration":1157301429,"load_duration":29868219,"prompt_eval_count":11,"prompt_eval_duration":27732000,"eval_count":51,"eval_duration":1049088000}
So, either it must be guaranteed that the information comes through in its entirety, or we must check whether a field exists before trying to access it.
Or, could it be in how it gathers the information?
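A sketch of the defensive-check option with nlohmann::json: parse without exceptions and test for the field before touching operator[] (the function name is made up):

```cpp
#include <optional>
#include <string>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

// Returns the "response" field only if `raw` is complete, valid JSON that
// actually contains it; std::nullopt otherwise.
std::optional<std::string> extract_response(const std::string& raw) {
    json reply = json::parse(raw, /*cb=*/nullptr, /*allow_exceptions=*/false);
    if (reply.is_discarded() || !reply.contains("response"))
        return std::nullopt;  // truncated or malformed: don't use operator[]
    return reply["response"].get<std::string>();
}
```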
Anyway, this needs to be fixed, and I'm not sure how I can provide you with something to replicate the error other than making a large prompt.
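For what it's worth, a minimal sketch of the kind of call that triggers it might look like this, assuming ollama-hpp's streaming generate overload that takes a response callback (model name and prompt are placeholders):

```cpp
#include <iostream>
#include <string>
#include "ollama.hpp"

int main() {
    // A very large prompt makes the final done-message (which echoes the whole
    // context token array) exceed the per-message size limit, so it arrives
    // fragmented and fails to parse.
    std::string huge_prompt(20000, 'a');

    ollama::generate("llama3.1:latest", huge_prompt,
        [](const ollama::response& part) { std::cout << part << std::flush; });
}
```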