mmap performance improvement #6

@gnif

Have you considered memory-mapping the file instead of reading it into a std::vector? On my Linux system, this cuts the read phase by 2.636ms (19497us to 16861us in the timings below).

diff --git a/parkerwords.cpp b/parkerwords.cpp
index 9d2a6f7..1a279ef 100644
--- a/parkerwords.cpp
+++ b/parkerwords.cpp
@@ -17,6 +17,11 @@
 #include <condition_variable>
 #include <mutex>
 
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <sys/mman.h>
+
 constexpr int MaxThreads = 16;
 
 // uncomment this line to write info to stdout, which takes away precious CPU time
@@ -65,20 +70,18 @@ uint getbits(std::string_view word)
 
 void readwords(const char* file)
 {
+  int fd = open(file, O_RDONLY);
+  struct stat fileStat;
+  fstat(fd, &fileStat);
+  char * buf = (char *)mmap(NULL, fileStat.st_size,
+      PROT_READ, MAP_PRIVATE, fd, 0);
+
        struct { int f, l; } freq[26] = { };
        for (int i = 0; i < 26; i++)
                freq[i].l = i;
 
-    // open file
-    std::vector<char> buf;
-    std::ifstream in(file);
-    in.seekg(0, std::ios::end);
-    buf.resize(in.tellg());
-    in.seekg(0, std::ios::beg);
-    in.read(&buf[0], buf.size());
-
     const char* str = &buf[0];
-       const char* strEnd = str + buf.size();
+       const char* strEnd = str + fileStat.st_size;
 
     // read words
     std::string_view word;
@@ -127,6 +130,9 @@ void readwords(const char* file)
 
         letterindex[min].push_back(w);
     }
+
+  munmap(buf, fileStat.st_size);
+  close(fd);
 }
 
 using WordArray = std::array<uint, 5>;
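
The diff above does not check the return values of open, fstat, or mmap, so a missing or empty word file would hand an invalid pointer to the parsing loop. A minimal sketch of the same approach with error handling follows. MappedFile is a hypothetical helper name, not something in parkerwords.cpp, and the sketch assumes a POSIX system.

```cpp
#include <cstddef>
#include <cstdio>
#include <string_view>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical RAII helper (not part of parkerwords.cpp): maps a file
// read-only in the constructor and unmaps it in the destructor.
class MappedFile {
public:
    explicit MappedFile(const char* path) {
        int fd = open(path, O_RDONLY);
        if (fd == -1) { perror("open"); return; }

        struct stat st;
        if (fstat(fd, &st) == -1) { perror("fstat"); close(fd); return; }

        if (st.st_size == 0) {
            // mmap rejects a zero length, so an empty file stays unmapped.
            ok_ = true;
        } else {
            void* p = mmap(nullptr, static_cast<size_t>(st.st_size),
                           PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) {
                perror("mmap");
            } else {
                data_ = static_cast<const char*>(p);
                size_ = static_cast<size_t>(st.st_size);
                ok_ = true;
            }
        }
        close(fd);  // the mapping stays valid after the descriptor is closed
    }

    ~MappedFile() {
        if (data_) munmap(const_cast<char*>(data_), size_);
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    bool ok() const { return ok_; }

    // View over the mapped bytes; empty if nothing is mapped.
    std::string_view view() const {
        return data_ ? std::string_view(data_, size_) : std::string_view();
    }

private:
    const char* data_ = nullptr;
    size_t size_ = 0;
    bool ok_ = false;
};
```

In readwords, this could be used as `MappedFile mf(file); if (!mf.ok()) return;`, with `str` set to `mf.view().data()` and `strEnd` to `str + mf.view().size()`. The mapping must outlive every std::string_view that points into it, which holds here as long as the object lives until the end of the function.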

Before:

538 solutions written to solutions.txt.
Total time: 91423us (0.091423s)
Read:       19497us
Process:    70792us
Write:       1134us

After:

538 solutions written to solutions.txt.
Total time: 84546us (0.084546s)
Read:       16861us
Process:    66477us
Write:       1208us
