Skip to content

Refactor BufferedJsonReader async paths to reduce GC pressure #3359

@gathogojr

Description

@gathogojr

Is your feature request related to a problem? Please describe.

  1. Per‑node allocation in the buffer
    Each buffered node is a fresh object; with large payloads or deep look‑ahead this drives allocation and GC pressure.

  2. Unconditional async state machines
    Many BufferedJsonReader methods are async, so the compiler emits state machines even when operations complete synchronously. This adds avoidable allocations and hurts throughput. (We expect additional gains once the inner JsonReader also adopts synchronous fast paths + ValueTask - Refactor JsonReader async paths to reduce GC pressure #3358.)

Describe the solution you'd like

  • Reduce allocations and GC pressure in hot paths without changing observable behavior.
  • Preserve correctness.
  • Enable the inner JsonReader refactor to deliver compounding benefits.

A) Buffered node pooling (bounded, safe)

What to change

  • Introduce a bounded pool for BufferedNode instances to avoid per‑node allocations.
  • Return nodes to the pool when they are removed from the buffer.

Key guardrails

  • Bound the pool to prevent unbounded memory retention under pathological cases.
  • Scrub references on return (clear value and links) to avoid retaining large graphs.
  • Defensive checks in Debug builds to catch double‑return or use‑after‑free.

NOTE: BufferedJsonReader is not thread‑safe today. A thread‑local (per‑thread) pool is acceptable, but note that async continuations may hop threads; returning a node to a different thread’s pool is harmless but reduces reuse. If we want cross‑thread reuse with bounded retention, a global ObjectPool<T> (e.g., Microsoft.Extensions.ObjectPool) with maximumRetained cap is a safer default.

Suggested implementation:

// Keep pools small; tune via benchmarks.
private const int MaxPooledNodesPerThread = 256;

[ThreadStatic]
private static Stack<BufferedNode>? bufferedNodePool    ;

private static BufferedNode RentNode(JsonNodeType nodeType, object? value)
{
    Stack<BufferedNode> pool = this.bufferedNodePool ??= new Stack<BufferedNode>(capacity: 128);
    if (pool.Count > 0)
    {
        BufferedNode node = pool.Pop();
        node.Reset(nodeType, value);
#if DEBUG
        node._pooled = false;
#endif
        return node;
    }

    return new BufferedNode(nodeType, value);
}

private static void ReturnNode(BufferedNode node)
{
    // Break links & clear payload to avoid retaining large graphs in the pool.
    node.Previous = null;
    node.Next = null;
    node.Value = null;

#if DEBUG
    if (node._pooled) throw new InvalidOperationException("double return");
    node._pooled = true;
#endif

    Stack<BufferedNode> pool = this.bufferedNodePool ??= new Stack<BufferedNode>(capacity: 128);
    if (pool.Count < MaxPooledNodesPerThread)
    {
        pool.Push(node);
        return;
    }

    // Drop on the floor beyond the cap; let GC reclaim.
}

Apply pooling in AddNewNodeToTheEndOfBufferedNodesList and RemoveFirstNodeInBuffer (and any other creation/removal points). We could consider wrapping removals in try/finally so nodes return to the pool even on exceptions.

B) Avoid redundant state machines in async methods

Pattern

  • Fast path: If the buffer can satisfy the request synchronously, return a completed ValueTask/ValueTask<T>; no state machine allocation.
  • Slow path: Allocate the state machine only when you need to await the inner reader.

Priority methods

Most heavily used (high priority):

  • CanStreamAsync
  • ProcessObjectValueAsync
  • ReadAsync - OK
  • ReadInternalAsync
  • ReadNextAndCheckForInStreamErrorAsync
  • SkipValueInternalAsync
  • StartBufferingAndTryToReadInStreamErrorPropertyValueAsync
  • StartBufferingAsync

Less heavily used (lower priority):

  • CreateReadStreamAsync
  • CreateTextReaderAsync
  • TryReadErrorDetailAsync
  • TryReadErrorDetailsPropertyValueAsync
  • TryReadErrorStringPropertyValueAsync
  • TryReadInnerErrorPropertyValueAsync
  • TryReadInStreamErrorPropertyValueAsync

Below is aggressively refactored ReadInternalAsync for illustration:

private static class BooleanTasks
{
    internal static readonly Task<bool> True = Task.FromResult(true);
    internal static readonly Task<bool> False = Task.FromResult(false);
}
        
protected Task<bool> ReadInternalAsync()
{
	this.AssertAsynchronous();

	if (this.removeOnNextRead)
	{
		Debug.Assert(this.bufferedNodesHead != null, "If removeOnNextRead is true we must have at least one node in the buffer.");

		this.RemoveFirstNodeInBuffer();
		this.removeOnNextRead = false;
	}

	// Fast path: non-buffering replay from buffer
	if (!this.isBuffering && this.bufferedNodesHead != null)
	{
		Debug.Assert(!this.parsingInStreamError, "!this.parsingInStreamError");

		// Non-buffering read from the buffer
		bool result = this.bufferedNodesHead.NodeType != JsonNodeType.EndOfInput;
		this.removeOnNextRead = true;

		return result ? BooleanTasks.True : BooleanTasks.False;
	}

	// Fast path: still walking inside buffer
	if (this.isBuffering)
	{
		Debug.Assert(this.currentBufferedNode != null, "this.currentBufferedNode != null");

		if (this.currentBufferedNode.Next != this.bufferedNodesHead)
		{
			this.currentBufferedNode = this.currentBufferedNode.Next;
			return BooleanTasks.True;
		}
	}

	return ReadInternalLocalAsync(this);

	static Task<bool> ReadInternalLocalAsync(BufferingJsonReader thisParam)
	{
		if (thisParam.isBuffering)
		{
			Debug.Assert(thisParam.currentBufferedNode != null, "thisParam.currentBufferedNode != null");

			if (thisParam.parsingInStreamError)
			{
				// Read more from the input stream and buffer it
				Task<bool> innerReadTask = thisParam.innerReader.ReadAsync();
				if (!innerReadTask.IsCompletedSuccessfully)
				{
					return AwaitInnerReaderAsync(thisParam, innerReadTask);
				}
				
				bool localResult = innerReadTask.Result;
				if (!localResult)
				{
					return BooleanTasks.False;
				}

				return BufferValueIfNeededAsync(thisParam);
			}
			else
			{
				// Read the next node from the input stream and check
				// whether it is an in-stream error
				Task<bool> checkForInStreamErrorTask = thisParam.ReadNextAndCheckForInStreamErrorAsync();
				if (checkForInStreamErrorTask.IsCompletedSuccessfully)
				{
					return checkForInStreamErrorTask.Result ? BooleanTasks.True : BooleanTasks.False;
				}
				
				return checkForInStreamErrorTask;
			}
		}
		
		// Non-buffering + no buffered nodes yet
		if(thisParam.bufferedNodesHead == null)
		{
			// If parsingInStreamError nothing in the buffer; read from the base,
			// else read the next node from the input stream and check
			// whether it is an in-stream error
			if (thisParam.parsingInStreamError)
			{
				Task<bool> readTask = thisParam.innerReader.ReadAsync();
				if (!readTask.IsCompletedSuccessfully)
				{
					return readTask;
				}
				
				return readTask.Result ? BooleanTasks.True : BooleanTasks.False;
			}
			else
			{
				Task<bool> checkForInStreamErrorTask = thisParam.ReadNextAndCheckForInStreamErrorAsync();
				
				if (checkForInStreamErrorTask.IsCompletedSuccessfully)
				{
					return checkForInStreamErrorTask.Result ? BooleanTasks.True : BooleanTasks.False;
				}
				
				return checkForInStreamErrorTask;
			}
		}
		
		// Should not reach here - buffered replay handled above
		bool result = thisParam.bufferedNodesHead.NodeType != JsonNodeType.EndOfInput;
		thisParam.removeOnNextRead = true;
		
		return result ? BooleanTasks.True : BooleanTasks.False;
	}
	
	static Task<bool> BufferValueIfNeededAsync(BufferingJsonReader thisParam)
	{
		object value = null;
		JsonNodeType localNodeType = thisParam.innerReader.NodeType;
		// GetValueAsync returns null for everything other than primitive value and property nameof
		// This check should help us avoid pointless calls
		// TODO: Verify correctness of this check
		if (localNodeType == JsonNodeType.PrimitiveValue || localNodeType == JsonNodeType.Property)
		{
			Task<object> getValueTask = thisParam.innerReader.GetValueAsync();
			if (!getValueTask.IsCompletedSuccessfully)
			{
				return AwaitGetValueAsync(thisParam, getValueTask);
			}
			
			value = getValueTask.Result;
		}
		
		thisParam.AddNewNodeToTheEndOfBufferedNodesList(thisParam.innerReader.NodeType, value);
		
		return BooleanTasks.True;
	}
	
	static async Task<bool> AwaitInnerReaderAsync(BufferingJsonReader thisParam, Task<bool> innerReadTask)
	{
		if (!await innerReadTask.ConfigureAwait(false))
		{
			return false;
		}
		
		object value = null;
		JsonNodeType localNodeType = thisParam.innerReader.NodeType;
		// GetValueAsync returns null for everything other than primitive value and property nameof
		// This check should help us avoid pointless calls
		// TODO: Verify correctness of this check
		if (localNodeType == JsonNodeType.PrimitiveValue || localNodeType == JsonNodeType.Property)
		{
			value = await thisParam.innerReader.GetValueAsync().ConfigureAwait(false);
		}
		
		// Add new node to the end
		thisParam.AddNewNodeToTheEndOfBufferedNodesList(thisParam.innerReader.NodeType, value);
		
		return true;
	}
	
	static async Task<bool> AwaitGetValueAsync(BufferingJsonReader thisParam, Task<object> getValueTask)
	{
		object value = await getValueTask.ConfigureAwait(false);
		// Add new node to the end
		thisParam.AddNewNodeToTheEndOfBufferedNodesList(thisParam.innerReader.NodeType, value);

		return true;				
	}
}

Be noted that ReadInternalAsync does the heavily lifting in the BufferedJsonReader class.

Metadata

Metadata

Assignees

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions