notes.txt

resolution:
Approach 1. take bytecode (e.g. the optimized output of K2) and wrap
it in inline bpf assembly to produce a C source file that can then be
compiled to an object file.
** (currently prototyped for simple sockex program)
** for programs that use bpf maps, it is hard to recover the linkage
   from a bpf instruction to the map it refers to. Recreating the maps
   in source code will be harder still, e.g., maps pointing to other
   programs, etc.
** by looking at the text part of the bytecode, we don’t know which
   map it’s coming from. one hypothesis: the order in which the maps
   are instantiated.
** adding this info will make it similar to approach 3 (below). Some
   undocumented libbpf functions could provide some of this info, at
   least the simple parts like the size of the key/value. If the
   input is the actual struct definition itself, that will be harder
   to work with.
** advantage of having this prototyped: any program without maps
   (purely instructions) is amenable to throughput testing.
** try and use macros mainly for readability
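Sketch of the generation step for approach 1 (an illustration under
stated assumptions, not the prototype's actual code): each 8-byte
instruction is written out as a .byte line inside the inline asm block
of a naked function. struct insn, appendf and emit_c_source are
hypothetical names, the section and function names are caller-chosen
placeholders, and whether clang's bpf assembler accepts .byte lines in
this position is an assumption that still needs checking.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Same 8-byte layout as an eBPF instruction; redefined here (hypothetical
 * name) so the sketch builds without kernel headers. */
struct insn {
    uint8_t code;   /* opcode */
    uint8_t regs;   /* dst_reg in the low nibble, src_reg in the high nibble */
    int16_t off;
    int32_t imm;
};

/* Append formatted text to buf; returns -1 if it would not fit. */
static int appendf(char *buf, size_t cap, size_t *len, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf + *len, cap - *len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= cap - *len)
        return -1;
    *len += (size_t)n;
    return 0;
}

/* Write a C source file into buf that wraps prog[0..n) in inline asm.
 * Bytes are emitted little-endian. Returns the text length, or -1. */
int emit_c_source(char *buf, size_t cap, const struct insn *prog, size_t n,
                  const char *sec, const char *fn)
{
    size_t len = 0;
    assert(buf && prog && sec && fn);
    if (cap == 0)
        return -1;
    if (appendf(buf, cap, &len,
                "__attribute__((naked, section(\"%s\")))\n"
                "void %s(void)\n{\n    asm volatile(\n", sec, fn))
        return -1;
    for (size_t i = 0; i < n; i++) {
        unsigned off = (uint16_t)prog[i].off;
        unsigned imm = (uint32_t)prog[i].imm;
        if (appendf(buf, cap, &len,
                    "        \".byte 0x%02x,0x%02x,0x%02x,0x%02x,"
                    "0x%02x,0x%02x,0x%02x,0x%02x\\n\"\n",
                    (unsigned)prog[i].code, (unsigned)prog[i].regs,
                    off & 0xff, off >> 8,
                    imm & 0xff, (imm >> 8) & 0xff,
                    (imm >> 16) & 0xff, imm >> 24))
            return -1;
    }
    if (appendf(buf, cap, &len, "    );\n}\n"))
        return -1;
    return (int)len;
}
```

The generated text would then be saved as a .c file and compiled with
clang for the bpf target. Map references are deliberately not handled,
which matches the limitation noted above.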
Approach 2. work entirely through elf section modifications. Take
optimized instructions, change only the appropriate parts of the
object file. Similar to our earlier (~2021) bpf patching tool. But you
need to resolve the new jump offsets (done), as well as the linkages
to other elf sections, including maps and other sections outside of
the code itself.
** section header table (the elf “table of contents”): section
   offsets (the “page numbers”) change when a section is shrunk.
   libelf elf_update may already do this.
** ebpf specific sections : relos. We may have to update this by
   ourselves. BTF section / kernel data structures
** you need to know exactly what the elf sections are, how they are
   organized, and how to fix linkages (exist now or will come up in
   the future) to produce new elf object files from new instructions +
   old object file
** will need to inspect many object files using tools that can display
   the contents of those object files legibly (llvm-objdump doesn’t do
   a great job with any non-text sections)
** looking at clang bpf source code may be more challenging. (libbpf
   at least seems manageable)
** deconstructing the elf, in particular the ebpf-elf format
   (understanding relationships) seems harder than actually fixing
   it.
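Sketch of the offset bookkeeping for approach 2 (illustrative only; not
the code of the earlier patching tool): when instructions are removed,
jump offsets and the byte offsets stored in relocation entries both have
to be remapped through the same old-index -> new-index table. All names
(struct insn, build_index_map, fix_jumps_and_compact,
remap_reloc_offset) are hypothetical. Assumptions: a jump targets
pc + 1 + off in instruction units, a removed jump target falls through
to the next kept instruction, both halves of a 16-byte lddw are kept or
dropped together, and bpf-to-bpf calls (relative offset in imm) are not
handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same 8-byte layout as an eBPF instruction (hypothetical local name). */
struct insn {
    uint8_t code;
    uint8_t regs;
    int16_t off;
    int32_t imm;
};

/* Jump classes are BPF_JMP (0x05) and BPF_JMP32 (0x06); call (0x80) and
 * exit (0x90) live in the jump class but carry no jump offset. */
static int is_jump(uint8_t code)
{
    uint8_t cls = code & 0x07, op = code & 0xf0;
    return (cls == 0x05 || cls == 0x06) && op != 0x80 && op != 0x90;
}

/* new_idx must have n + 1 slots. new_idx[i] is the position old
 * instruction i has after compaction; new_idx[n] is the new length. */
void build_index_map(const uint8_t *keep, size_t n, size_t *new_idx)
{
    size_t next = 0;
    for (size_t i = 0; i < n; i++) {
        new_idx[i] = next;
        if (keep[i])
            next++;
    }
    new_idx[n] = next;
}

/* Rewrite jump offsets, then drop instructions with keep[i] == 0 in place.
 * Returns the new instruction count, or -1 on a bad jump target (prog may
 * be partially modified in that case). */
int fix_jumps_and_compact(struct insn *prog, const uint8_t *keep, size_t n,
                          const size_t *new_idx)
{
    assert(prog && keep && new_idx);
    for (size_t i = 0; i < n; i++) {
        if (!keep[i] || !is_jump(prog[i].code))
            continue;
        long target = (long)i + 1 + prog[i].off;
        if (target < 0 || target >= (long)n)
            return -1;
        if (new_idx[target] >= new_idx[n])
            return -1;      /* nothing left to land on */
        /* Removal only shrinks distances, so this still fits in 16 bits. */
        prog[i].off = (int16_t)((long)new_idx[target] - ((long)new_idx[i] + 1));
    }
    for (size_t i = 0; i < n; i++)
        if (keep[i])
            prog[new_idx[i]] = prog[i];   /* new_idx[i] <= i, safe in place */
    return (int)new_idx[n];
}

/* A relocation's r_offset is a byte offset into the code section, so it
 * moves with the instruction it points at (8 bytes per instruction). */
uint64_t remap_reloc_offset(uint64_t r_offset, const size_t *new_idx)
{
    return (uint64_t)new_idx[r_offset / 8] * 8;
}
```

The same table could also drive updates to any other section that
stores instruction offsets, though which sections those are is exactly
the open question in the bullets above.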
Approach 3. selective decompilation. take as input the source code + a
modified set of instructions, attempt to generate fresh source code
which when compiled would lead to that (optimized) set of
instructions. Use ebpf inline assembly to fill in the parts.
source code line s -> assembly instructions a1...aN
if only a3 is changed or removed, then we need to use inline assembly
for all of a1...aN (that is the “obvious” thing to do, there may be
other ways to get around this.)
** main advantage: only need to consider the text section. No need to
   consider linkages to other sections
** cons: need to consider the interplay between assembly and source
   code. E.g. in bpf_map_lookup(map, &key), the key argument is held
   in register R2, so we need to relate the variable “key” to this
   thing in R2.
   The mapping between variables and their registers is not known
   beforehand and can change through the instructions in the program.
** will loop iterations also affect this mapping between variable
   names (in source code) and registers (in assembly)?  “Loops should
   not” (Srinivas’s belief)
** can look at the relationship between source code lines and assembly
   instructions in the original object file (using appropriate flags
   to clang and llvm-objdump, e.g. -g and -S respectively)
** May want K2 to report the set of instructions that it changed
   (relationship between instruction x in the optimized program and
   which instruction it comes from in the original program)
** BTF sections will also be generated by running clang on the
   decompiled source code as long as we keep the information about
   kernel struct access in the source code.
   ** could an instruction that was generated from such a line of
   source code be optimized in a way that (we) cannot reconstruct this
   information?
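Sketch of the “obvious” rule from approach 3 (illustrative only): if any
instruction generated from a source line was changed or removed, the
whole line is emitted as inline assembly, and untouched lines stay as C.
The names (mark_asm_lines, insn_needs_asm) and the input arrays are
hypothetical. line_of[] is assumed to come from debug line info and
changed[] from a K2 report of which instructions it touched; neither is
available in this form today.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* line_of[i]  : source line that original instruction i was generated from
 * changed[i]  : 1 if the optimizer changed or removed instruction i
 * asm_line[l] : set to 1 if source line l must become inline assembly
 * Returns the number of lines marked, or -1 if a line number is out of
 * range for asm_line (which has n_lines slots). */
int mark_asm_lines(const unsigned *line_of, const uint8_t *changed,
                   size_t n, uint8_t *asm_line, size_t n_lines)
{
    int marked = 0;
    assert(line_of && changed && asm_line);
    memset(asm_line, 0, n_lines);
    for (size_t i = 0; i < n; i++) {
        if (line_of[i] >= n_lines)
            return -1;
        if (changed[i] && !asm_line[line_of[i]]) {
            asm_line[line_of[i]] = 1;
            marked++;
        }
    }
    return marked;
}

/* 1 if instruction i belongs to a line that has to be written as asm,
 * even when instruction i itself was not changed. */
int insn_needs_asm(const unsigned *line_of, const uint8_t *asm_line, size_t i)
{
    return asm_line[line_of[i]];
}
```

This does nothing about the variable <-> register mapping problem; it
only decides which source lines can be kept verbatim.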
Q: Are there more general approaches to compressing an eBPF object
file?
--> investigate the use of pahole to check data structure layout
Next steps: “consolidate wins”
1. with a given set of source programs, report the instruction-count
benefits of K2 (applied to the outputs of different clang versions)
vs. those clang versions alone. Use the appropriate clang flags that
optimize for instruction count (-Os? -O0? -O2?)
2. create programs that can be fully expressed using just text
sections (no maps, other sections...). Can we measure throughput or
latency for those programs?
- take sample programs and produce binaries with k2 and with clang
(pick a version).
- bpf_latency values:
https://developers.redhat.com/articles/2022/06/22/measuring-bpf-performance-tips-tricks-and-best-practices#who_traces_the_tracer_
https://developers.redhat.com/articles/2022/06/22/measuring-bpf-performance-tips-tricks-and-best-practices#profiling_bpf_programs
** use the 552 project setup (preferably on cloudlab) to measure the
latency difference between the k2- and clang- variants
possible programs to test:
* 552 project 3
* sample programs from linux (may need porting from other hooks to
  xdp)
3. document all the scripts, approaches, results.
4. whatever else you want to do :-)
   -> beautifying the inline assembly in approach 1
   -> ...