Skip to content
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions usermods/audioreactive/audio_reactive.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -224,8 +224,8 @@ void FFTcode(void * parameter)
DEBUGSR_PRINT("FFT started on core: "); DEBUGSR_PRINTLN(xPortGetCoreID());

// allocate FFT buffers on first call
if (vReal == nullptr) vReal = (float*) calloc(sizeof(float), samplesFFT);
if (vImag == nullptr) vImag = (float*) calloc(sizeof(float), samplesFFT);
if (vReal == nullptr) vReal = (float*) calloc(samplesFFT, sizeof(float));
if (vImag == nullptr) vImag = (float*) calloc(samplesFFT, sizeof(float));
if ((vReal == nullptr) || (vImag == nullptr)) {
// something went wrong
if (vReal) free(vReal); vReal = nullptr;
Expand Down
13 changes: 6 additions & 7 deletions wled00/FX.h
Original file line number Diff line number Diff line change
Expand Up @@ -88,18 +88,17 @@ extern byte realtimeMode; // used in getMappedPixelIndex()
#endif
#define FPS_CALC_SHIFT 7 // bit shift for fixed point math

/* each segment uses 82 bytes of SRAM memory, so if you're application fails because of
insufficient memory, decreasing MAX_NUM_SEGMENTS may help */
// heap memory limit for effects data, pixel buffers try to reserve it if PSRAM is available
#ifdef ESP8266
#define MAX_NUM_SEGMENTS 16
/* How much data bytes all segments combined may allocate */
#define MAX_SEGMENT_DATA 5120
#define MAX_SEGMENT_DATA (MAX_NUM_SEGMENTS*640) // 10k by default
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot (and I mean a lot) for ESP8266.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you can give me a scenario that is critical in ram usage, I can test.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 usermods (PIR, temperature, 4LD, AR) and 16x16 matrix

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works fine:
Free heap: 16384, Contiguous: 11344

Main:
Free heap: 15720, Contiguous: 7272

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you enable MQTT and use GPIO3?

Copy link
Collaborator Author

@DedeHai DedeHai Jul 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, I dont have the hardware to do that, you'd need to check with your setup.
edit:
since this is all about free heap : if it works in main, I see no reason it should not work with this PR

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the condition to delete and reallocate of the segmentdata should not be FAIR_DATA_PER_SEG but something a bit larger, might help with fragmentation on ESP8266

#elif defined(CONFIG_IDF_TARGET_ESP32S2)
#define MAX_NUM_SEGMENTS 20
#define MAX_SEGMENT_DATA (MAX_NUM_SEGMENTS*512) // 10k by default (S2 is short on free RAM)
#define MAX_SEGMENT_DATA (MAX_NUM_SEGMENTS*1024) // 20k by default (S2 is short on free RAM)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will not cut it on every build for S2. Mine is one example, having only about 20-30k of free heap after everything is loaded and operational.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its no longer used in the same way: it now tries to keep this much ram free for FX data if there is PSRAM, otherwise it will limit the use to this amount but if heap runs low, it will respect minheap. so even if you set this to 100k nothing bad happens. one exception (which is no different in current implementation): if a usermod or something else uses heap to deplete it.

#else
#define MAX_NUM_SEGMENTS 32 // warning: going beyond 32 may consume too much RAM for stable operation
#define MAX_SEGMENT_DATA (MAX_NUM_SEGMENTS*1280) // 40k by default
#define MAX_SEGMENT_DATA (MAX_NUM_SEGMENTS*1920) // 60k by default
#endif

/* How much data bytes each segment should max allocate to leave enough space for other segments,
Expand Down Expand Up @@ -600,8 +599,8 @@ class Segment {
, _t(nullptr)
{
DEBUGFX_PRINTF_P(PSTR("-- Creating segment: %p [%d,%d:%d,%d]\n"), this, (int)start, (int)stop, (int)startY, (int)stopY);
// allocate render buffer (always entire segment)
pixels = static_cast<uint32_t*>(d_calloc(sizeof(uint32_t), length())); // error handling is also done in isActive()
// allocate render buffer (always entire segment), prefer PSRAM if DRAM is running low. Note: impact on FPS with PSRAM buffer is low (<2% with QSPI PSRAM)
pixels = static_cast<uint32_t*>(allocate_buffer(length() * sizeof(uint32_t), BFRALLOC_PREFER_PSRAM | BFRALLOC_NOBYTEACCESS | BFRALLOC_CLEAR));
if (!pixels) {
DEBUGFX_PRINTLN(F("!!! Not enough RAM for pixel buffer !!!"));
extern byte errorFlag;
Expand Down
81 changes: 47 additions & 34 deletions wled00/FX_fcn.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -69,8 +69,9 @@ Segment::Segment(const Segment &orig) {
if (orig.name) { name = static_cast<char*>(d_malloc(strlen(orig.name)+1)); if (name) strcpy(name, orig.name); }
if (orig.data) { if (allocateData(orig._dataLen)) memcpy(data, orig.data, orig._dataLen); }
if (orig.pixels) {
pixels = static_cast<uint32_t*>(d_malloc(sizeof(uint32_t) * orig.length()));
if (pixels) memcpy(pixels, orig.pixels, sizeof(uint32_t) * orig.length());
// allocate pixel buffer: prefer PSRAM if DRAM is running low
pixels = static_cast<uint32_t*>(allocate_buffer(orig.length() * sizeof(uint32_t), BFRALLOC_PREFER_PSRAM | BFRALLOC_NOBYTEACCESS));
if (pixels) memcpy(pixels, orig.pixels, orig.length() * sizeof(uint32_t));
else {
DEBUG_PRINTLN(F("!!! Not enough RAM for pixel buffer !!!"));
errorFlag = ERR_NORAM_PX;
Expand Down Expand Up @@ -110,8 +111,9 @@ Segment& Segment::operator= (const Segment &orig) {
if (orig.name) { name = static_cast<char*>(d_malloc(strlen(orig.name)+1)); if (name) strcpy(name, orig.name); }
if (orig.data) { if (allocateData(orig._dataLen)) memcpy(data, orig.data, orig._dataLen); }
if (orig.pixels) {
pixels = static_cast<uint32_t*>(d_malloc(sizeof(uint32_t) * orig.length()));
if (pixels) memcpy(pixels, orig.pixels, sizeof(uint32_t) * orig.length());
// allocate pixel buffer: prefer PSRAM if DRAM is running low
pixels = static_cast<uint32_t*>(allocate_buffer(orig.length() * sizeof(uint32_t), BFRALLOC_PREFER_PSRAM | BFRALLOC_NOBYTEACCESS));
if (pixels) memcpy(pixels, orig.pixels, orig.length() * sizeof(uint32_t));
else {
DEBUG_PRINTLN(F("!!! Not enough RAM for pixel buffer !!!"));
errorFlag = ERR_NORAM_PX;
Expand Down Expand Up @@ -143,35 +145,41 @@ Segment& Segment::operator= (Segment &&orig) noexcept {

// allocates effect data buffer on heap and initialises (erases) it
bool Segment::allocateData(size_t len) {
if (len == 0) return false; // nothing to do
if (data && _dataLen >= len) { // already allocated enough (reduce fragmentation)
if (len == 0) return false; // nothing to do
if (data && _dataLen >= len) { // already allocated enough (reduce fragmentation)
if (call == 0) {
//DEBUG_PRINTF_P(PSTR("-- Clearing data (%d): %p\n"), len, this);
memset(data, 0, len); // erase buffer if called during effect initialisation
if (_dataLen < FAIR_DATA_PER_SEG) { // segment data is small
//DEBUG_PRINTF_P(PSTR("-- Clearing data (%d): %p\n"), len, this);
memset(data, 0, len); // erase buffer if called during effect initialisation
return true; // no need to reallocate
}
}
return true;
else
return true;
}
//DEBUG_PRINTF_P(PSTR("-- Allocating data (%d): %p\n"), len, this);
if (Segment::getUsedSegmentData() + len - _dataLen > MAX_SEGMENT_DATA) {
// not enough memory
DEBUG_PRINTF_P(PSTR("!!! Not enough RAM: %d/%d !!!\n"), len, Segment::getUsedSegmentData());
errorFlag = ERR_NORAM;
return false;
// limit to MAX_SEGMENT_DATA if there is no PSRAM, otherwise prefer functionality over speed
#if defined(ARDUINO_ARCH_ESP32)
if (!(psramFound() && psramSafe))
#endif
{
if (Segment::getUsedSegmentData() + len - _dataLen > MAX_SEGMENT_DATA) {
// not enough memory
DEBUG_PRINTF_P(PSTR("!!! Not enough RAM: %d/%d !!!\n"), len, Segment::getUsedSegmentData());
errorFlag = ERR_NORAM;
return false;
}
}
// prefer DRAM over SPI RAM on ESP32 since it is slow
// prefer DRAM over PSRAM for speed
if (data) {
data = (byte*)d_realloc_malloc(data, len); // realloc with malloc fallback
if (!data) {
data = nullptr;
Segment::addUsedSegmentData(-_dataLen); // subtract original buffer size
_dataLen = 0; // reset data length
}
d_free(data); // free data and try to allocate again (segment buffer may be blocking contiguous heap)
Segment::addUsedSegmentData(-_dataLen); // subtract buffer size
}
else data = (byte*)d_malloc(len);

data = static_cast<byte*>(allocate_buffer(len, BFRALLOC_PREFER_DRAM | BFRALLOC_CLEAR));

if (data) {
memset(data, 0, len); // erase buffer
Segment::addUsedSegmentData(len - _dataLen);
Segment::addUsedSegmentData(len);
_dataLen = len;
//DEBUG_PRINTF_P(PSTR("--- Allocated data (%p): %d/%d -> %p\n"), this, len, Segment::getUsedSegmentData(), data);
return true;
Expand Down Expand Up @@ -205,7 +213,11 @@ void Segment::deallocateData() {
void Segment::resetIfRequired() {
if (!reset || !isActive()) return;
//DEBUG_PRINTF_P(PSTR("-- Segment reset: %p\n"), this);
if (data && _dataLen > 0) memset(data, 0, _dataLen); // prevent heap fragmentation (just erase buffer instead of deallocateData())
if (data && _dataLen > 0) {
if (_dataLen > FAIR_DATA_PER_SEG) deallocateData(); // do not keep large allocations
else memset(data, 0, _dataLen); // can prevent heap fragmentation
DEBUG_PRINTF_P(PSTR("-- Segment %p reset, data cleared\n"), this);
}
if (pixels) for (size_t i = 0; i < length(); i++) pixels[i] = BLACK; // clear pixel buffer
next_time = 0; step = 0; call = 0; aux0 = 0; aux1 = 0;
reset = false;
Expand Down Expand Up @@ -454,16 +466,17 @@ void Segment::setGeometry(uint16_t i1, uint16_t i2, uint8_t grp, uint8_t spc, ui
stop = 0;
return;
}
// re-allocate FX render buffer
// allocate FX render buffer
if (length() != oldLength) {
if (pixels) d_free(pixels); // using realloc on large buffers can cause additional fragmentation instead of reducing it
pixels = static_cast<uint32_t*>(d_malloc(sizeof(uint32_t) * length()));
if (pixels) free(pixels); // note: using realloc can block larger heap segments
pixels = static_cast<uint32_t*>(allocate_buffer(length() * sizeof(uint32_t), BFRALLOC_PREFER_PSRAM | BFRALLOC_NOBYTEACCESS));
if (!pixels) {
DEBUG_PRINTLN(F("!!! Not enough RAM for pixel buffer !!!"));
errorFlag = ERR_NORAM_PX;
stop = 0;
return;
}

}
refreshLightCapabilities();
}
Expand Down Expand Up @@ -1198,7 +1211,7 @@ void WS2812FX::finalizeInit() {
bus->begin();
bus->setBrightness(bri);
}
DEBUG_PRINTF_P(PSTR("Heap after buses: %d\n"), ESP.getFreeHeap());
DEBUG_PRINTF_P(PSTR("Heap after buses: %d\n"), getFreeHeapSize());

Segment::maxWidth = _length;
Segment::maxHeight = 1;
Expand All @@ -1210,11 +1223,11 @@ void WS2812FX::finalizeInit() {
deserializeMap(); // (re)load default ledmap (will also setUpMatrix() if ledmap does not exist)

// allocate frame buffer after matrix has been set up (gaps!)
if (_pixels) d_free(_pixels); // using realloc on large buffers can cause additional fragmentation instead of reducing it
_pixels = static_cast<uint32_t*>(d_malloc(getLengthTotal() * sizeof(uint32_t)));
if (_pixels) d_free(_pixels);
// use PSRAM if available: there is no measurable perfomance impact between PSRAM and DRAM on S2/S3 with QSPI PSRAM for this buffer
_pixels = static_cast<uint32_t*>(allocate_buffer(getLengthTotal() * sizeof(uint32_t), BFRALLOC_ENFORCE_PSRAM | BFRALLOC_NOBYTEACCESS | BFRALLOC_CLEAR));
DEBUG_PRINTF_P(PSTR("strip buffer size: %uB\n"), getLengthTotal() * sizeof(uint32_t));

DEBUG_PRINTF_P(PSTR("Heap after strip init: %uB\n"), ESP.getFreeHeap());
DEBUG_PRINTF_P(PSTR("Heap after strip init: %uB\n"), getFreeHeapSize());
}

void WS2812FX::service() {
Expand Down Expand Up @@ -1258,7 +1271,7 @@ void WS2812FX::service() {
// if segment is in transition and no old segment exists we don't need to run the old mode
// (blendSegments() takes care of On/Off transitions and clipping)
Segment *segO = seg.getOldSegment();
if (segO && (seg.mode != segO->mode || blendingStyle != BLEND_STYLE_FADE)) {
if (segO && (seg.mode != segO->mode || blendingStyle != BLEND_STYLE_FADE) && segO->isActive()) {
Segment::modeBlend(true); // set semaphore for beginDraw() to blend colors and palette
segO->beginDraw(prog); // set up palette & colors (also sets draw dimensions), parent segment has transition progress
_currentSegment = segO; // set current segment
Expand Down
2 changes: 1 addition & 1 deletion wled00/cfg.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -201,7 +201,7 @@ bool deserializeConfig(JsonObject doc, bool fromFS) {
}
#endif

DEBUG_PRINTF_P(PSTR("Heap before buses: %d\n"), ESP.getFreeHeap());
DEBUG_PRINTF_P(PSTR("Heap before buses: %d\n"), getFreeHeapSize());
JsonArray ins = hw_led["ins"];
if (!ins.isNull()) {
int s = 0; // bus iterator
Expand Down
17 changes: 15 additions & 2 deletions wled00/const.h
Original file line number Diff line number Diff line change
Expand Up @@ -546,8 +546,21 @@ static_assert(WLED_MAX_BUSSES <= 32, "WLED_MAX_BUSSES exceeds hard limit");
#endif
#endif

// minimum heap size required to process web requests
#define MIN_HEAP_SIZE 8192
// minimum heap size required to process web requests: try to keep free heap above this value
#ifdef ESP8266
#define MIN_HEAP_SIZE (8*1024)
#else
#define MIN_HEAP_SIZE (12*1024)
#endif
// threshold for PSRAM use: if heap is running low, requests above PSRAM_THRESHOLD will be allocated in PSRAM
// if heap is plenty, requests below PSRAM_THRESHOLD will be allocated in DRAM for speed
#if defined(CONFIG_IDF_TARGET_ESP32S3)
#define PSRAM_THRESHOLD 5120
#elif defined(CONFIG_IDF_TARGET_ESP32)
#define PSRAM_THRESHOLD 4096
#else
#define PSRAM_THRESHOLD 1024 // S2 does not have a lot of RAM. C3 and ESP8266 do not support PSRAM: the value is not used
#endif

// Web server limits
#ifdef ESP8266
Expand Down
16 changes: 15 additions & 1 deletion wled00/fcn_declare.h
Original file line number Diff line number Diff line change
Expand Up @@ -550,7 +550,7 @@ inline uint8_t hw_random8() { return HW_RND_REGISTER; };
inline uint8_t hw_random8(uint32_t upperlimit) { return (hw_random8() * upperlimit) >> 8; }; // input range 0-255
inline uint8_t hw_random8(uint32_t lowerlimit, uint32_t upperlimit) { uint32_t range = upperlimit - lowerlimit; return lowerlimit + hw_random8(range); }; // input range 0-255

// PSRAM allocation wrappers
// memory allocation wrappers
#if !defined(ESP8266) && !defined(CONFIG_IDF_TARGET_ESP32C3)
extern "C" {
void *p_malloc(size_t); // prefer PSRAM over DRAM
Expand Down Expand Up @@ -579,6 +579,20 @@ extern "C" {
#define d_realloc_malloc realloc_malloc
#define d_free free
#endif
#ifndef ESP8266
inline size_t getFreeHeapSize() { return heap_caps_get_free_size(MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT); } // returns free heap (ESP.getFreeHeap() can include other memory types)
inline size_t getContiguousFreeHeap() { return heap_caps_get_largest_free_block(MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT); } // returns largest contiguous free block
#else
inline size_t getFreeHeapSize() { return ESP.getFreeHeap(); } // returns free heap
inline size_t getContiguousFreeHeap() { return ESP.getMaxFreeBlockSize(); } // returns largest contiguous free block
#endif
#define BFRALLOC_NOBYTEACCESS (1 << 0) // ESP32 has 32bit accessible DRAM (usually ~50kB free) that must not be byte-accessed
#define BFRALLOC_PREFER_DRAM (1 << 1) // prefer DRAM over PSRAM
#define BFRALLOC_ENFORCE_DRAM (1 << 2) // use DRAM only, no PSRAM
#define BFRALLOC_PREFER_PSRAM (1 << 3) // prefer PSRAM over DRAM
#define BFRALLOC_ENFORCE_PSRAM (1 << 4) // use PSRAM if available, otherwise fall back to DRAM
#define BFRALLOC_CLEAR (1 << 5) // clear allocated buffer after allocation
void *allocate_buffer(size_t size, uint32_t type);

// RAII guard class for the JSON Buffer lock
// Modeled after std::lock_guard
Expand Down
6 changes: 3 additions & 3 deletions wled00/json.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -812,7 +812,7 @@ void serializeInfo(JsonObject root)
root[F("clock")] = ESP.getCpuFreqMHz();
root[F("flash")] = (ESP.getFlashChipSize()/1024)/1024;
#ifdef WLED_DEBUG
root[F("maxalloc")] = ESP.getMaxAllocHeap();
root[F("maxalloc")] = getContiguousFreeHeap();
root[F("resetReason0")] = (int)rtc_get_reset_reason(0);
root[F("resetReason1")] = (int)rtc_get_reset_reason(1);
#endif
Expand All @@ -823,13 +823,13 @@ void serializeInfo(JsonObject root)
root[F("clock")] = ESP.getCpuFreqMHz();
root[F("flash")] = (ESP.getFlashChipSize()/1024)/1024;
#ifdef WLED_DEBUG
root[F("maxalloc")] = ESP.getMaxFreeBlockSize();
root[F("maxalloc")] = getContiguousFreeHeap();
root[F("resetReason")] = (int)ESP.getResetInfoPtr()->reason;
#endif
root[F("lwip")] = LWIP_VERSION_MAJOR;
#endif

root[F("freeheap")] = ESP.getFreeHeap();
root[F("freeheap")] = getFreeHeapSize();
#if defined(ARDUINO_ARCH_ESP32)
if (psramFound()) root[F("psram")] = ESP.getFreePsram();
#endif
Expand Down
Loading